How to add sugar to the Python syntax with main wrappers

Eugen Hotaj
Jul 17 · 6 min read

I love Python. It’s an extremely powerful language that’s very simple to write and understand. It also has an extensive ecosystem of libraries that allow you to do pretty much anything you can think of (even fly). Recently, it has become the lingua franca of Data Science and Machine Learning. I use it constantly in my day job, so I’d say I’m fairly proficient at it (I do have Python readability at Google after all 😏¹).

However, Python is not perfect. I’ve recently started learning Lisp in my free time.² While learning, I stumbled across what I think is a very cool (but minor) feature of Lisp. When defining functions, Lisp allows you to use previous arguments as default values.

For example, suppose you’re writing a make_rectangle function which takes as input the x, y coordinates of the top-left corner, width, and height and returns the x, y coordinates of the 4 corners of the rectangle. We can do this pretty easily in Python:

Function to make a rectangle in Python.
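
The original listing isn’t reproduced here, so the following is only a minimal sketch of what such a function might look like. The corner order (clockwise from the top-left, with y growing downward) is an assumption inferred from the sample output shown near the end of the article.

```python
def make_rectangle(x, y, width, height):
    """Return the corners of a rectangle, clockwise from the top-left."""
    return ((x, y),
            (x + width, y),
            (x + width, y + height),
            (x, y + height))


print("Rectangle:", make_rectangle(0, 0, 100, 200))
```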

Now suppose that we want the function to return squares if it’s only passed the width argument. If you’ve never programmed before, you might expect that a reasonable thing to do is to set height=width, like so:

A reasonable, but invalid, syntax for allowing `make_rectangle` to handle squares.

Unfortunately, it’s not valid Python: default values are evaluated when the def statement itself runs, before width exists as a name. If you try to run it, you’ll get an error:

NameError: name 'width' is not defined.
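
To reproduce the failure without crashing your own script, one option is to keep the attempted definition in a string and run it with exec. This is only an illustrative sketch: the function body is an assumption, and the exec harness is there purely so the block runs to completion.

```python
# The extended syntax, stored as text so this block can run to the end.
source = '''
def make_rectangle(x, y, width, height=width):
    return ((x, y),
            (x + width, y),
            (x + width, y + height),
            (x, y + height))
'''

try:
    # Executing the def statement evaluates the default `width` right away.
    exec(source, {})
except NameError as error:
    message = str(error)
    print("NameError:", message)
```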

Personally, I feel that this is a failing of the language. Sure, it’s pretty easy to work around the syntax and get a similar behavior:

The Pythonic way to allow `make_rectangle` to handle squares.
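
A sketch of that workaround, assuming the same corner order as before and using the argument = argument or prev_argument form described later in the article:

```python
def make_rectangle(x, y, width, height=None):
    """Return the corners of a rectangle, or of a square if height is omitted."""
    # Fall back to width when no height was given. Because 0 is falsy,
    # `or` also replaces an explicit height of 0.
    height = height or width
    return ((x, y),
            (x + width, y),
            (x + width, y + height),
            (x, y + height))


print("Square:", make_rectangle(0, 0, 100))
```

If a height of 0 has to be preserved, testing `height is None` instead of relying on `or` would be the stricter choice.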

But that pisses me off! The language should do what I want, not the other way around. Instead of bending to the will of Python, I’ve created a __main__ wrapper that lets me write the code I want.

As always, full working code is available on my GitHub.

Main Wrappers

A __main__ wrapper is a Python program which wraps the __main__ module (the top-level script) of another Python program. The wrapper can run whatever code it wants before calling through to the main program. It turns out that a lot of tools are written this way, such as pdb and profile. If you’re interested in all the details, David Beazley has an interesting talk on the topic [1].

Since the wrapper runs before the main program, it’s possible for it to modify the code of the main program before executing it. This means that we can use the __main__ wrapper to transform code like improved_make_rectangle.py into valid Python like valid_make_rectangle.py. This way we can write in the more pleasant syntax while still having the Python interpreter execute it correctly.

Discourse On Method

(A quick tangent on other possible approaches; feel free to skip.)

I thought a lot about how to add this functionality without the wrapper, perhaps through function decorators, context managers, etc. However, I came to realize that it’s likely impossible since improved_make_rectangle.py is not valid Python.

Another avenue I thought about exploring was to modify the Python interpreter itself. I didn’t go with this approach, primarily for two reasons:

  1. It sounds a lot more tedious to pull off since you’d have to modify the underlying C code.
  2. It’s less portable because it involves forking the Python interpreter. Anyone who might want to use the new syntax can only do so via the forked interpreter. Conversely, the __main__ wrapper is just regular Python.

If you know any better ways to implement this functionality, please let me know in the comments. I’m very interested in finding out!

Extending the Python Syntax

The first thing we need to do is create a wrapper which is able to run other programs. As demonstrated in [1], this is fairly simple:

A general template for writing `__main__` wrappers.
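
The original template isn’t reproduced here, so this is a rough sketch of the general shape such a wrapper could take, not the author’s listing (the line numbers cited in the surrounding text refer to that listing and won’t match). The names run_wrapped and the no-op preprocess are assumptions made for illustration.

```python
import os
import sys


def preprocess(source):
    """Hypothetical hook that rewrites the program text; a no-op here."""
    return source


def run_wrapped(argv):
    """Execute the script named by argv[0] as though it were __main__."""
    script = argv[0]
    with open(script) as f:
        source = f.read()
    # Rewrite the source, then compile it under the script's own filename
    # so that tracebacks still point at the right file.
    code = compile(preprocess(source), script, "exec")

    # Mimic a direct run: the script sees its own argv, and its directory
    # comes first on the import path.
    saved_argv, saved_path = sys.argv, sys.path
    sys.argv = list(argv)
    sys.path = [os.path.dirname(os.path.abspath(script))] + saved_path
    namespace = {"__name__": "__main__", "__file__": script}
    try:
        exec(code, namespace)
    finally:
        sys.argv, sys.path = saved_argv, saved_path
    return namespace
```

Calling run_wrapped(sys.argv[1:]) under the usual `if __name__ == "__main__":` guard would make a module like this usable with python3 -m.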

Most of the code just makes sure the Python environment is set up correctly when we execute the main program in line 17. The interesting part happens in lines 7–10: here we read the main program code, preprocess (i.e. rewrite) it to be valid Python, and compile it to be executed. We can now wrap any Python program by running it like this:

python3 -m main_wrapper some_random_program.py

Of course, the preprocess function is up to us to write. What we want to do is go through every line of code in the main program and find all function headers, i.e. all lines that start with def (ignoring any leading whitespace).³ Then, for each header we find, we want to see if any of its arguments reference previously defined ones. If so, we want to transform them into the valid_make_rectangle.py equivalent by setting them to None and, for each one, adding argument = argument or prev_argument to the top of the function body. Here’s how this looks in code:

The preprocessor which transforms our extended syntax into valid Python.
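
As a rough sketch of the rewriting step (not the author’s listing, so the line numbers cited in the surrounding text won’t match), the function below takes the header-parsing helper as a parameter. That parameter, and the stand-in lambda in the demo, are assumptions made so the block stands on its own. It relies on the same simplifications the article mentions: single-line headers, 4-space indentation, and PEP 8 spacing around defaults.

```python
def preprocess(source, build_arg_to_prev_arg):
    """Rewrite defaults that name an earlier argument into valid Python."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip()
        if not stripped.startswith("def "):
            out.append(line)
            continue
        # Body lines are assumed to sit 4 spaces deeper than the header.
        body_indent = " " * (len(line) - len(stripped) + 4)
        arg_to_prev = build_arg_to_prev_arg(line)
        for arg, prev in arg_to_prev.items():
            # Relies on PEP 8 spacing: "height=width", not "height = width".
            line = line.replace(f"{arg}={prev}", f"{arg}=None")
        out.append(line)
        # Restore the intended default as the first statements of the body.
        for arg, prev in arg_to_prev.items():
            out.append(f"{body_indent}{arg} = {arg} or {prev}")
    return "\n".join(out) + "\n"


source = (
    "def make_rectangle(x, y, width, height=width):\n"
    "    return ((x, y), (x + width, y),\n"
    "            (x + width, y + height), (x, y + height))\n"
)
# Stand-in for the real header parser: hard-codes the one mapping we need.
rewritten = preprocess(source, lambda header: {"height": "width"})
print(rewritten)
```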

We’re making a lot of implicit assumptions that would need to be ironed out in a more robust implementation. For example, we’re assuming that the main program’s code uses 4 white space characters for each indentation level (line 9) and that the overall code follows the PEP 8 style guide (otherwise line 12 would fail if kwargs had whitespace around the =, such as some_arg = 12). While we could relax these assumptions with some effort, what we have here is a good first step.

Finally, you may have noticed the cryptic call to the _build_arg_to_prev_arg function on line 10. This function is responsible for parsing a function header and extracting all arguments which reference previously defined arguments. To do this, it first removes everything that’s not a function argument, such as whitespace, parentheses, the def statement, etc. Then, for each argument, it checks if it references any previously defined argument, and if so adds it to a dictionary. Finally, the dictionary is returned. This translates into the following code:

A function which builds a dictionary from parameter to previously defined parameter.

We now have all the components in place to make our wrapper work. Here is the improved_make_rectangle.py code again with a simple __main__ block to sanity check the implementation for both rectangles and squares:

Putting it all together.

As before, running without the wrapper raises a NameError:

$ python3 wrapper_test.py
Traceback (most recent call last):
  File "wrapper_test.py", line 1, in <module>
    def make_rectangle(x, y, width, height=width):
NameError: name 'width' is not defined

However, when we run with the wrapper, the messages correctly print out in the terminal:

$ python3 -m wrap_main wrapper_test.py
Rectangle: ((0, 0), (100, 0), (100, 200), (0, 200))
Square: ((0, 0), (100, 0), (100, 100), (0, 100))

Conclusion

In this article, we added some extra syntactic sugar to Python by writing a __main__ wrapper. As it currently stands, the wrapper is pretty brittle and would fail spectacularly if you tried to use it in real-world situations. Also, the amount of code we had to write to get rid of a few extra lines is pretty large. So, all things considered, the wrapper is probably not worth it. However, __main__ wrappers themselves are a general metaprogramming technique that allows you to extend Python to do very cool and interesting things. If nothing else, they’re fun to write 🙂.

Thanks for reading!

Eugen Hotaj
July 15, 2019

P.S. For the sake of brevity, I’ve skipped over some minor things in this article. If you want the full details, please check out the full, working code on my GitHub.

If you liked this article, then follow me to get notified about new posts!

Footnotes

¹ Of all things to brag about, having Python readability at Google is probably the dumbest. In case it wasn’t obvious, I’m only kidding.

² If the gods wrought the universe in Lisp, I want to find out what all the fuss is about. If you’re also curious, Practical Common Lisp is a great, free book.

³ It’s worth pointing out that finding function definitions this way is very brittle. While this is OK for quick prototyping, something more robust would need to follow the Python reference.

References

[1] D. Beazley, Modules and Packages: Live and Let Die!, PyCon (2015).

Written by Eugen Hotaj, Research Engineer at Google. Published in Better Programming (advice for programmers).
