Automatically editing code
I really enjoy making automated changes to large codebases. When sed won’t
suffice and the task would take too much elbow grease to do by hand, my
favorite tool (for Python) is redbaron. In this post, I’ll walk through a
simple-sounding operation that would be tough without a tool like redbaron.
Let’s say your favorite library decides that an argument that previously had a
default value must now be stated explicitly. For me, this was when Django
started requiring on_delete to be set: omitting it was deprecated in Django 1.9
and became an error in Django 2.0.
Your code may sometimes be simple enough that a tool like sed can add this
new argument, but if the argument is already specified on a different line,
sed won’t be able to easily detect it and avoid inserting a duplicate. In the
code below, the publisher field is already compliant, but the author field
is missing the newly required argument.
class Book:
    author = fields.ForeignKey("bookstore.User")
    publisher = fields.ForeignKey("bookstore.Publisher",
        null=True,
        on_delete=models.CASCADE,
        editable=False,
    )
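To make the sed limitation concrete, here is a rough Python sketch of a line-by-line substitution, which is approximately what a sed one-liner would do. The regex and the naive_add_on_delete helper are made up for illustration. Because each line is examined in isolation, the already compliant publisher field ends up with a second on_delete.

```python
import re

SOURCE = '''\
class Book:
    author = fields.ForeignKey("bookstore.User")
    publisher = fields.ForeignKey("bookstore.Publisher",
        null=True,
        on_delete=models.CASCADE,
        editable=False,
    )
'''

# Hypothetical pattern: a ForeignKey( call followed by its first string argument.
FIRST_ARG = re.compile(r'(ForeignKey\("[^"]*")')


def naive_add_on_delete(source):
    """Insert on_delete after the first argument, one line at a time (like sed)."""
    out = []
    for line in source.splitlines(keepends=True):
        # Only the current line is visible, so an on_delete that sits on a
        # later line of the same call goes unnoticed.
        if "on_delete" not in line:
            line = FIRST_ARG.sub(r"\1, on_delete=models.CASCADE", line)
        out.append(line)
    return "".join(out)


result = naive_add_on_delete(SOURCE)
print(result)
```

The author line comes out correct, but the publisher call now carries on_delete twice, which Python rejects as a repeated keyword argument.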
Here’s where redbaron shines. You tell it to find anything that looks like
ForeignKey(...), scan the arguments for on_delete, and if none match, then
append the new argument. But redbaron doesn’t see the snippet exactly as you
might expect, so the first step is looking at the AST that it generates.
ClassNode()
  name='Book'
  parenthesis=False
  value ->
    * AssignmentNode()
        target ->
          NameNode()
            value='author'
        value ->
          AtomtrailersNode()
            value ->
              * NameNode()
                  value='fields'
              * NameNode()
                  value='ForeignKey'
              * CallNode()
                  value ->
                    * CallArgumentNode()
                        value ->
                          StringNode()
                            value='"bookstore.User"'
    ...
This might be a little more complex than you would expect if you’ve used ast
to parse files before. There’s nothing here that looks like “a call to
ForeignKey,” but there is a list of atoms, one of which is ForeignKey and
another of which is a CallNode.
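For comparison, here is a small sketch of how the standard library's ast module sees the same assignment. There the call is a single Call node wrapping an Attribute, rather than a flat list of atoms. The variable names are illustrative, and the snippet assumes Python 3.8 or later, where string literals parse to ast.Constant.

```python
import ast

tree = ast.parse('author = fields.ForeignKey("bookstore.User")')
call = tree.body[0].value  # right-hand side of the assignment

# ast nests the pieces: Call -> Attribute -> Name
print(type(call).__name__)       # Call
print(type(call.func).__name__)  # Attribute
print(call.func.attr)            # ForeignKey
print(call.func.value.id)        # fields
print(call.args[0].value)        # bookstore.User
```

The ast tree drops comments and formatting, so it is handy for inspecting code but awkward for rewriting it in place.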
Given this AST, you can add the argument with a script like:

import redbaron

# Parse the file into redbaron's tree
with open("models.py") as fp:
    code = redbaron.RedBaron(fp.read())

for namenode in code.find_all("name", value="ForeignKey"):
    callnode = namenode.parent.find("call")
    if not callnode:
        continue  # not called
    if callnode.find(
            "callargument",
            target=lambda t: t and t.value == "on_delete"):
        continue  # already has on_delete specified
    callnode.append("on_delete=models.CASCADE")

# Write the modified source back to the same file
with open("models.py", "w") as fp:
    fp.write(code.dumps())
The indentation produced by the above code may look bad. Fiddling with the indentation is possible with redbaron, but I’m fortunate to mostly run these scripts against auto-formatted codebases.
This example is deliberately concise but hopefully piques your interest in
redbaron. If you read the script carefully, I’m sure you can come up with
some tricky code that dodges the simple filters I’ve written, but I’ve found
tricky code is rare in professional codebases and scripts like this find almost
all instances of the patterns I look for.
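To see what a script like this missed, one option is to re-parse the rewritten file with the standard library and list any ForeignKey call that still lacks on_delete. The following is only a sketch and not part of redbaron. The missing_on_delete helper is hypothetical, and it matches purely on the name ForeignKey, just as the script above does.

```python
import ast


def missing_on_delete(source):
    """Return line numbers of ForeignKey(...) calls without an on_delete keyword."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both ForeignKey(...) and something.ForeignKey(...)
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "ForeignKey":
            continue
        # kw.arg is None for **kwargs, which this simple check treats as missing
        if not any(kw.arg == "on_delete" for kw in node.keywords):
            lines.append(node.lineno)
    return sorted(lines)


SOURCE = '''\
class Book:
    author = fields.ForeignKey("bookstore.User")
    publisher = fields.ForeignKey("bookstore.Publisher",
        null=True,
        on_delete=models.CASCADE,
        editable=False,
    )
'''

print(missing_on_delete(SOURCE))  # [2]
```

Aliased calls such as FK = fields.ForeignKey followed by FK("bookstore.User") slip past this check too, so it narrows the manual review rather than replacing it.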
If you’re not ready to write your own code-editing scripts just yet,
pyupgrade has a bunch of great rules like replacing yield within
a for loop with yield from where possible.
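For reference, the yield rewrite mentioned above swaps a loop that only re-yields each item for a single yield from. Here is a hand-written before and after. The function names are illustrative and this is not the tool's actual output.

```python
def chain_before(*iterables):
    for iterable in iterables:
        # The inner loop does nothing except pass each item along
        for item in iterable:
            yield item


def chain_after(*iterables):
    for iterable in iterables:
        # Same behavior, delegated to the inner iterable
        yield from iterable


print(list(chain_before("ab", [1, 2])))  # ['a', 'b', 1, 2]
print(list(chain_after("ab", [1, 2])))   # ['a', 'b', 1, 2]
```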