Automatically editing code

I really enjoy making automated changes to large codebases. When sed won’t suffice and the task would take too much elbow grease to do by hand, my favorite tool (for Python) is redbaron. In this post, I’ll walk through a simple-sounding operation that would be tough without a tool like redbaron.


Let’s say your favorite library decides that an argument that previously had a default value must now be stated explicitly. For me, this was Django’s on_delete argument for ForeignKey: omitting it was deprecated in Django 1.9, and it became required in Django 2.0.

Your code may sometimes be simple enough that a tool like sed can add this new argument, but if the argument is already specified on a different line, sed can’t easily detect it and avoid inserting a duplicate. In the code below, the publisher field is already compliant, but the author field is missing the newly required argument.

class Book:
  author = fields.ForeignKey("bookstore.User")
  publisher = fields.ForeignKey("bookstore.Publisher",
    null=True,
    on_delete=models.CASCADE,
    editable=False,
  )
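To make the sed problem concrete, here’s a rough sketch that uses Python’s re module as a stand-in for a line-by-line sed substitution. The pattern and the helper names (naive_add_on_delete, compiles) are made up for illustration, and it assumes the first argument is always a double-quoted string.

```python
import re

SOURCE = '''class Book:
  author = fields.ForeignKey("bookstore.User")
  publisher = fields.ForeignKey("bookstore.Publisher",
    null=True,
    on_delete=models.CASCADE,
    editable=False,
  )
'''

# Capture `ForeignKey("<model>"` so the new argument can go right after it.
FIRST_ARG = re.compile(r'(ForeignKey\("[^"]*")')


def naive_add_on_delete(line):
    # Like sed, this only sees one line at a time, so the best it can do
    # is skip lines that already mention on_delete themselves.
    if "on_delete" in line:
        return line
    return FIRST_ARG.sub(r"\1, on_delete=models.CASCADE", line)


def compiles(source):
    # True if Python accepts the source as syntactically valid.
    try:
        compile(source, "models.py", "exec")
        return True
    except SyntaxError:
        return False


rewritten = "\n".join(naive_add_on_delete(line) for line in SOURCE.splitlines())
print(rewritten)
print(compiles(SOURCE), compiles(rewritten))
```

The author line comes out fine, but the publisher call now has on_delete twice: once inserted on its first line and once on the line where it already was. Python rejects a repeated keyword argument, so the rewritten file no longer compiles.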

Here’s where redbaron shines. You tell it to find anything that looks like ForeignKey(...), scan the arguments for on_delete, and if none match, then append the new argument. But redbaron doesn’t see the snippet exactly as you might expect, so the first step is looking at the AST that it generates.

ClassNode()
  name='Book'
  parenthesis=False
  value ->
    * AssignmentNode()
        target ->
          NameNode()
            value='author'
        value ->
          AtomtrailersNode()
            value ->
              * NameNode()
                  value='fields'
              * NameNode()
                  value='ForeignKey'
              * CallNode()
                  value ->
                    * CallArgumentNode()
                        value ->
                          StringNode()
                            value='"bookstore.User"'
...

This might be a little more complex than you would expect if you’ve used ast to parse files before. There’s nothing here that looks like “a call to ForeignKey,” but there is a list of atoms, one of which is ForeignKey and another of which is a CallNode.

Given this AST, you can add the argument with a script like this:

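import redbaron

# Parse the file into redbaron's syntax tree.
with open("models.py") as fp:
  code = redbaron.RedBaron(fp.read())

for namenode in code.find_all("name", value="ForeignKey"):
  # find() returns None when there is no call next to the name.
  callnode = namenode.parent.find("call")
  if callnode is None:
    continue  # not called

  if callnode.find(
      "callargument",
      target=lambda t: t and t.value == "on_delete"):
    continue  # already has on_delete specified

  callnode.append("on_delete=models.CASCADE")

# Write the modified source back to the same file.
with open("models.py", "w") as fp:
  fp.write(code.dumps())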

The indentation produced by the above code may look bad. Fiddling with the indentation is possible with redbaron, but I’m fortunate to mostly run these scripts against auto-formatted codebases.

This example is deliberately concise but hopefully piques your interest in redbaron. If you read the script carefully, I’m sure you can come up with some tricky code that dodges the simple filters I’ve written, but I’ve found tricky code is rare in professional codebases and scripts like this find almost all instances of the patterns I look for.
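Since a script like this can miss a few instances, a quick check afterwards is handy. Here’s a minimal sketch using only the standard library’s ast module (not redbaron) that lists the lines where a ForeignKey call still lacks on_delete. The function name is hypothetical, and the check is just as simple-minded as the filters above: it matches on the name ForeignKey alone and doesn’t look inside **kwargs.

```python
import ast


def foreign_keys_missing_on_delete(source):
    """Return the line numbers of ForeignKey(...) calls without on_delete."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `ForeignKey(...)` and `something.ForeignKey(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "ForeignKey":
            continue
        if not any(kw.arg == "on_delete" for kw in node.keywords):
            missing.append(node.lineno)
    return sorted(missing)


SAMPLE = '''class Book:
  author = fields.ForeignKey("bookstore.User")
  publisher = fields.ForeignKey("bookstore.Publisher",
    null=True,
    on_delete=models.CASCADE,
    editable=False,
  )
'''

print(foreign_keys_missing_on_delete(SAMPLE))  # → [2]
```

Unlike redbaron’s tree, ast does have a single node for “a call to ForeignKey,” which makes checking easy. Writing the file back out from ast would lose comments and the original formatting, though, so I’d only use it for checking and leave the editing to redbaron.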

If you’re not ready to write your own code-editing scripts just yet, pyupgrade has a bunch of great rules, like replacing yield within a for loop with yield from where possible.
