ADVANCED PYTHON PROGRAMMING

Functions at Last

This time, we discuss functions — all the different signatures they can have, and the different ways they can be invoked.

Dan Gittik
12 min read · Apr 8, 2020


Having talked about assignment, conditions and loops, we’re finally ready to tackle what computer science is really about: abstracting problems away in parametrized units of code.

In 1952, David Wheeler defined the closed subroutine: a self-contained part of a programme, which is capable of being used in different programmes. As obvious as it sounds to us, back then it was completely revolutionary—insomuch that the instruction that made this possible was dubbed the “Wheeler Jump”. We’ve come a long way since then: in Python, functions are infinitely more versatile and complex; and yet there’s a lot to say about notions as basic as arguments and invocations.

Default Arguments

As you probably know, Python functions support default arguments, so that if a parameter is omitted, it defaults to some predefined value, like so:
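A minimal sketch of the idea; the `greet` function and its strings are made up for illustration:

```python
def greet(name, greeting='Hello'):
    # greeting falls back to 'Hello' when the caller omits it
    return f'{greeting}, {name}!'


print(greet('Alice'))           # Hello, Alice!
print(greet('Alice', 'Howdy'))  # Howdy, Alice!
```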

Default Argument They Are a-Changin’

Default arguments are cool—until you use a mutable object for a default value. It’s a rookie mistake, but bear with me: there’s more to it than meets the eye. Let’s say we have this:
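Something along these lines, with a hypothetical `append_one` function whose default is an empty list:

```python
def append_one(xs=[]):
    # the very same list object is reused by every call that omits xs
    xs.append(1)
    return xs


print(append_one())  # [1]
print(append_one())  # [1, 1]
```

The second call doesn’t start from a fresh list: it sees whatever the first call left behind.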

The reason this happens is that default arguments are evaluated when the function is defined, so they’re assigned once and reused forever. A mutable value, like a list, can change over time (for example, have 1 appended to it), which makes the function unpredictable. In such cases, instead of setting the parameter to the default value we actually want, we’d set it to indicate that no value was passed, usually with None, and take care of that situation dynamically, in the function’s body:
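Here’s a first attempt, sketched with the same hypothetical `append_one` function:

```python
def append_one(xs=None):
    # allocate a fresh list whenever nothing useful was passed
    if not xs:
        xs = []
    xs.append(1)
    return xs


print(append_one())  # [1]
print(append_one())  # [1]
```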

This looks good, but it still has a bug—and a much more subtle one at that. See if you can spot it before you keep reading; in the meantime, let’s jog our memory about if statements: each object in Python has a boolean value, and it’s a better practice to test for false (like in not users) instead of testing for some particular implementation detail (like len(users) == 0). But sometimes, we are not testing for false—we are testing for absence, indicated by None. Have a look:
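A small sketch that exposes the problem, assuming a hypothetical `append_one` function that uses the `not xs` check:

```python
def append_one(xs=None):
    if not xs:  # true for None, but also for an empty list
        xs = []
    xs.append(1)
    return xs


users = []
append_one(users)
print(users)  # []
```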

Passing in an empty list, which evaluates to false, causes the function to discard its argument and allocate a new empty list instead, which is hardly the desirable behavior. Contrast testing for false to testing for absence:

And the bug is gone:
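A sketch of the corrected version, again with the hypothetical `append_one`; the only change is `is None` in place of `not`:

```python
def append_one(xs=None):
    if xs is None:  # only a missing argument triggers the default
        xs = []
    xs.append(1)
    return xs


users = []
append_one(users)
print(users)         # [1]
print(append_one())  # [1]
```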

However, this doesn’t always work—namely, if None is also a valid value, how can we indicate no value was passed? Consider this function:
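A sketch of what such a function might look like; the body is my own guess at a minimal version:

```python
def print_arg(arg=None):
    if not arg:
        print('no argument')
    else:
        print(f'argument: {arg}')


print_arg(1)  # argument: 1
print_arg()   # no argument
print_arg(0)  # no argument
```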

Of course, this is wrong: we’re testing for false instead of absence again, so print_arg(0) results in no argument instead of argument: 0. But even this:

Is not enough, because print_arg(None) results in no argument instead of argument: None. In this case, the simplest thing to do is define a plain object, useless for all intents and purposes—except for its unique identity, which we can test against.
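A minimal sketch of that sentinel approach; the name `_MISSING` is hypothetical, any unique object would do:

```python
_MISSING = object()  # useful only for its identity


def print_arg(arg=_MISSING):
    if arg is _MISSING:
        print('no argument')
    else:
        print(f'argument: {arg}')


print_arg()      # no argument
print_arg(None)  # argument: None
print_arg(0)     # argument: 0
```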

This one actually works! But to be honest, more often than not, I’d still use None: it’s Python’s way of indicating absence. Rather than get bogged down in such details, I prefer to change the function’s contract: instead of “prints its argument, or ‘no argument’ if no argument was passed”, it’d be “prints its argument, or ‘no argument’ if None was passed”.

Defaults Immutable

Surely, immutable default arguments are OK though, right? I mean, it’s more elegant to write:

Than:
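To make the comparison concrete, here’s a sketch of both styles side by side. The `connect` functions and the port number are made up for illustration; the first uses the direct default, the second the None-based one:

```python
def connect(host, port=8000):
    # the direct style: the default lives in the signature
    return f'{host}:{port}'


def connect_explicit(host, port=None):
    # the None style: the default is resolved in the body
    if port is None:
        port = 8000
    return f'{host}:{port}'


print(connect('localhost'))           # localhost:8000
print(connect_explicit('localhost'))  # localhost:8000
```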

Maybe, maybe not. Immutable default arguments are definitely OK in that they don’t change, so the function is deterministic; but if we take a quick detour to the world of software design, we’ll see that non-trivial default arguments, even if they are immutable, introduce unnecessary coupling.

This example is based on a true story: we had a component that was in charge of handling the filesystem, and provided a function along those lines:
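A rough reconstruction, assuming gzip for the compression (the details of the real component aren’t given here):

```python
import gzip


def write(path, data, compress=False):
    # gzip.open and open both accept binary mode, so they are interchangeable
    opener = gzip.open if compress else open
    with opener(path, 'wb') as writer:
        writer.write(data)
```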

Pretty straightforward: open either a compressed file or a regular one, and write the data there. However, over time we developed another component, which was in charge of handling serialization—that is, encoding objects to bytes so that they can be written to the filesystem. Here’s a piece of it:
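Sketched under the same assumptions (gzip for compression, and a `write` function repeated here so the snippet runs on its own):

```python
import gzip
import json


def write(path, data, compress=False):
    opener = gzip.open if compress else open
    with opener(path, 'wb') as writer:
        writer.write(data)


def write_json(path, value, compress=False):
    # serialize to a string, encode it to bytes, and delegate the actual I/O
    data = json.dumps(value).encode()
    write(path, data, compress)
```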

Simple enough: use JSON to dump the value to string, encode it to bytes, and delegate it to the original write function. The problem happened when at some point, we were having storage issues, and decided to change the compress parameter to be True by default. We did just that:

But, surprisingly, while some files were indeed compressed—other files weren’t. Presented this way, it’s quite obvious what we did wrong; but back then, with all the noise and complexity of the system, it took a while to figure out: write_json's default argument was still False, and it was being passed down to write as an actual argument, overriding its default value.

So, even though we tried to separate our components, we’ve unwittingly introduced a coupled hierarchy, where all the functions’ default arguments must stay synchronized in order for the system to work in a predictable manner. Of course, rather than just changing the defaults everywhere, we took care of the root cause—all functions passed None to indicate absence, and only in the last one, write, was it finally resolved to its default value:
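A sketch of that arrangement, with the same hypothetical gzip-based `write`:

```python
import gzip
import json


def write(path, data, compress=None):
    if compress is None:
        compress = True  # the one and only place where the default is decided
    opener = gzip.open if compress else open
    with opener(path, 'wb') as writer:
        writer.write(data)


def write_json(path, value, compress=None):
    # None travels through untouched, so write gets to make the call
    write(path, json.dumps(value).encode(), compress)
```

Changing the default now means touching a single line in `write`; callers that explicitly pass True or False are unaffected.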

Insisting on only ever having None default arguments might sound excessive, and like any extreme opinion—it’s probably wrong in some cases. But it’s important to be aware of the caveats and limitations, especially since many people feel that “immutable default arguments are OK” by omission, just because so much has been said about mutable ones.

Variadic Arguments

Another interesting feature in Python is variadic arguments—capturing an arbitrary number of values, or arbitrarily named values, by one function with a fixed signature.

Knowing Their Place

Imagine we’re writing a function that averages numbers—we can make a version for 2, 3, or n numbers, like so:
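Roughly like this; the bodies are my own straightforward take:

```python
def average_2(x, y):
    return (x + y) / 2


def average_3(x, y, z):
    return (x + y + z) / 3


def average_n(xs):
    # xs is a single argument: a list (or any sized iterable) of numbers
    return sum(xs) / len(xs)


print(average_2(1, 2))       # 1.5
print(average_3(1, 2, 3))    # 2.0
print(average_n([1, 2, 3]))  # 2.0
```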

This is fine, except calling average_n([1, 2, 3]) is a tad less pleasant than calling average_3(1, 2, 3)—but then, it’s impractical to have all the different functions for all the different cases. What if I told you we could do this:
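A minimal sketch of such a function:

```python
def average(*xs):
    # xs collects however many positional arguments were passed
    return sum(xs) / len(xs)


print(average(1, 2))        # 1.5
print(average(1, 2, 3))     # 2.0
print(average(1, 2, 3, 4))  # 2.5
```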

The * next to the xs lets it capture any arguments passed by position (that is, without specifying the parameter name explicitly) into a tuple:

More generally, it captures the rest of the positional arguments, making it useful in all sorts of signatures:
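For instance, in a throwaway function `f` (hypothetical, it just returns what it received) with two regular parameters followed by a starred one:

```python
def f(a, b, *rest):
    # a and b are bound first; rest gets whatever positional arguments remain
    return a, b, rest


print(f(1, 2))        # (1, 2, ())
print(f(1, 2, 3, 4))  # (1, 2, (3, 4))
```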

Call Them by Their Name

Then, we have arguments passed in by keyword, that is, by explicitly specifying the parameter name. Imagine a function that, given a tag, and optionally some text and some attributes, turns it into HTML:
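A bare-bones sketch of such a function; the parameter name `attrs` and the exact output format are assumptions, and there’s no escaping, for brevity:

```python
def html(tag, text=None, attrs=None):
    if attrs is None:
        attrs = {}
    # render each attribute as ' key="value"'
    attr_str = ''.join(f' {key}="{value}"' for key, value in attrs.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'
```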

If we want to wrap some text in a paragraph, we can do it like so:

And if we’d like to have an image with some attributes, we’d do:

That is—if it weren’t for keyword arguments! With it, we can skip that confusing None in the middle, and just do:
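Here are the three calls side by side, using a hypothetical `html(tag, text=None, attrs=None)` repeated here so the snippet runs on its own:

```python
def html(tag, text=None, attrs=None):
    if attrs is None:
        attrs = {}
    attr_str = ''.join(f' {key}="{value}"' for key, value in attrs.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'


# wrapping text in a paragraph
print(html('p', 'Hello, world!'))  # <p>Hello, world!</p>

# an image, with an awkward None standing in for the text
print(html('img', None, {'src': 'cat.png'}))  # <img src="cat.png" />

# the same image, naming the parameter instead
print(html('img', attrs={'src': 'cat.png'}))  # <img src="cat.png" />
```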

But what if I told you we can do even better?

At first, this seems improbable: there are hundreds of possible attributes, and we wouldn’t want our function to have hundreds of optional parameters. The truth is, just like * lets us capture the rest of the arguments passed by position, so does ** let us capture the rest of the arguments passed by keyword. All we have to do is this:
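A sketch of the reworked function; the name `attributes` and the output format are my assumptions:

```python
def html(tag, text=None, **attributes):
    # attributes is a dict of every keyword argument that isn't tag or text
    attr_str = ''.join(f' {key}="{value}"' for key, value in attributes.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'


print(html('img', src='cat.png', alt='A cat'))  # <img src="cat.png" alt="A cat" />
```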

Or, more generally:

A word of warning—playing with how arguments are bound to parameters can introduce mix-ups, so be on the lookout for errors such as these:
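Two typical mix-ups, sketched with a throwaway function `f` (hypothetical):

```python
def f(a, b=2, **kwargs):
    return a, b, kwargs


try:
    f(1, a=3)  # a is given twice: once by position, once by keyword
except TypeError as error:
    print('TypeError:', error)

try:
    f(b=3)  # a is required, and no keyword fills it in
except TypeError as error:
    print('TypeError:', error)

print(f(1, c=3))  # (1, 2, {'c': 3})
```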

One Signature to Rule Them All

Put together, this allows us to define what I call the Omnisignature™—one signature to rule them all, no matter how you invoke it:
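In its simplest form (the body here just returns what was captured, for demonstration):

```python
def f(*args, **kwargs):
    # args is a tuple of positional arguments, kwargs a dict of keyword ones
    return args, kwargs


print(f())          # ((), {})
print(f(1, 2))      # ((1, 2), {})
print(f(1, x=2))    # ((1,), {'x': 2})
print(f(x=1, y=2))  # ((), {'x': 1, 'y': 2})
```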

This signature is infinitely important for perfect forwarding and decorators—but before we get there, let’s talk about invocations.

Invocation Incantations

We’ve had great fun using *args and **kwargs, but there are two sides to every story: and while we simplified some use-cases, we’ve made others much worse—so let’s address the other side of signatures: invocations.

Exploding Arguments

Previously, if we had a list of numbers, we could just pass it to average_n; what are we supposed to do now?

Ugh. Luckily, Python uses the same * for the inverse operation: if in a signature it means “capture extra positional arguments into a tuple”, in an invocation it means “explode a sequence into separate arguments by position”. So…
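A short sketch, assuming the variadic `average` function from before:

```python
def average(*xs):
    return sum(xs) / len(xs)


numbers = [1, 2, 3, 4]
print(average(*numbers))  # 2.5
```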

…is similar to average(numbers[0], numbers[1], ..., numbers[N-1]) for a list of N items, however long the list. This works for regular functions, too, but then the number of parameters has to match the number of items, of course:
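For example, with a hypothetical two-parameter `power` function:

```python
def power(base, exponent):
    return base ** exponent


pair = (2, 10)
print(power(*pair))  # 1024

triple = (2, 10, 3)
try:
    power(*triple)  # three items, but only two parameters
except TypeError as error:
    print('TypeError:', error)
```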

Similarly, what if we have a dictionary of attributes, and we want to pass it to html? We’d have to do:

But that’s terrible! Luckily, ** is just as versatile: in a signature it means “capture extra keyword arguments into a dictionary”, and in an invocation it means “explode a dictionary into separate arguments by keyword”, which uses the key for the name and the value for the argument:
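A sketch, assuming the `html(tag, text=None, **attributes)` signature from before (repeated here so the snippet runs on its own):

```python
def html(tag, text=None, **attributes):
    attr_str = ''.join(f' {key}="{value}"' for key, value in attributes.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'


attrs = {'src': 'cat.png', 'alt': 'A cat'}

# spelled out by hand...
print(html('img', src=attrs['src'], alt=attrs['alt']))

# ...and exploded from the dictionary: same call, same result
print(html('img', **attrs))  # <img src="cat.png" alt="A cat" />
```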

Again, perfectly usable under normal circumstances:

Perfect Forwarding

The Omnisignature™, together with exploding arguments, lets us achieve the coveted “perfect forwarding” property, where one function can delegate its arguments to another precisely as they were passed:
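A minimal sketch; `forward` and `power` are hypothetical names:

```python
def forward(function, *args, **kwargs):
    # capture everything, then explode it back exactly as it came in
    return function(*args, **kwargs)


def power(base, exponent=2):
    return base ** exponent


print(forward(power, 3))              # 9
print(forward(power, 3, 3))           # 27
print(forward(power, 3, exponent=4))  # 81
```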

This is great for deferring invocations: passing them on to threads, processes, or queues, and having someone else make the actual call—but do so exactly as we specified. Moreover, this is what makes decorators possible—but we’ll talk about decorators next time. For now, we have one last thing to cover:

Only Parameters

Newer versions of Python support keyword-only arguments (for Python ≥ 3) and position-only arguments (for Python ≥ 3.8). Both can be quite useful—but are a bit of an “advanced” topic, so many people don’t bother using them, which is a shame: after everything we covered, it’s almost self-explanatory.

Say My Name

What if you have a function that receives an arbitrary number of arguments (that is, *args) but also wants to receive some parameters by keyword?

If we just go ahead and reimplement the print function ourselves, we’d get:
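A simplified sketch. It’s called `my_print` so as not to shadow the builtin, and it leaves out the real function’s `file` and `flush` parameters:

```python
import sys


def my_print(*args, sep=' ', end='\n'):
    # convert everything to strings, join with sep, and finish with end
    sys.stdout.write(sep.join(str(arg) for arg in args) + end)


my_print(1, 2, 3)            # 1 2 3
my_print(1, 2, 3, sep=', ')  # 1, 2, 3
```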

Like I said, pretty self-explanatory. The catch is that, by placing the parameters sep and end after *args, we’ve effectively made them keyword-only: *args will swallow any argument passed by position, so the only way to reach them would be to call them by their names. And that’s not a bad thing; in fact, sometimes it’s much more readable. Take this, for example:
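A sketch of what such a helper could look like; the default values and the choice to bind to all interfaces are assumptions on my part:

```python
import socket


def listen(port, backlog=1000, reuseaddr=True):
    server = socket.socket()
    if reuseaddr:
        # allow rebinding the port right after a restart
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(('0.0.0.0', port))  # all interfaces (an assumption)
    server.listen(backlog)
    return server
```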

This nice function abstracts away the hassle of creating a server: instantiating a socket, configuring it, binding it, etc. And it’s great! listen(8000) is a compact and readable way to define a server listening on port 8000.

But what about listen(8000, 5000, False)? What does it mean? If you’re an experienced developer, you’d know to avoid such unreadable code, and write listen(8000, backlog=5000, reuseaddr=False), using keyword arguments even though you don’t have to. But look here:
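The same hypothetical helper (defaults and bind address are still my assumptions), with one extra character in its signature:

```python
import socket


def listen(port, *, backlog=1000, reuseaddr=True):
    server = socket.socket()
    if reuseaddr:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(('0.0.0.0', port))
    server.listen(backlog)
    return server
```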

That weird * in the middle of the signature is our way of telling Python that the rest of the parameters are keyword-only. If you had *args in the signature, it wouldn’t be necessary; but without it, this “anonymous alternative” separates the keyword-only parameters without capturing extra positional arguments. This way, even inexperienced developers will have to adhere to your better standards:
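A quick demonstration. The body is stubbed out here, since only the signature matters, and the default values are assumptions:

```python
def listen(port, *, backlog=1000, reuseaddr=True):
    # stub: returns its settings instead of opening a socket
    return port, backlog, reuseaddr


print(listen(8000, backlog=5000, reuseaddr=False))  # (8000, 5000, False)

try:
    listen(8000, 5000, False)  # positional extras are rejected
except TypeError as error:
    print('TypeError:', error)
```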

Feature-OCD

If we have keyword-only parameters, shouldn’t we have position-only parameters, too? What would that even mean? This scenario is less obvious, which is why it took so long to make it into an official release. It creates a nice sense of symmetry:
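A sketch with a throwaway function `f` (Python 3.8 or later):

```python
def f(a, /, b, *, c):
    # a: position-only; b: either way; c: keyword-only
    return a, b, c


print(f(1, 2, c=3))    # (1, 2, 3)
print(f(1, b=2, c=3))  # (1, 2, 3)

try:
    f(a=1, b=2, c=3)  # a can't be passed by name
except TypeError as error:
    print('TypeError:', error)
```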

Anything before the / is position-only; anything after it is like before:

But OCD is not a good enough reason to add new syntax to a language; the truth is, this can actually be useful—especially when you have **kwargs in the signature, which can conflict with your other parameter names. Remember our HTML function from before? What if I’d want to create an image, with its text attribute set to Hello, world!?
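Here’s what goes wrong, assuming the `html(tag, text=None, **attributes)` signature from before:

```python
def html(tag, text=None, **attributes):
    attr_str = ''.join(f' {key}="{value}"' for key, value in attributes.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'


# we wanted <img text="Hello, world!" />, but got this instead:
print(html('img', text='Hello, world!'))  # <img>Hello, world!</img>
```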

This happens because text is actually a proper parameter, so Hello, world! is assigned to it rather than being passed to the **attributes catch-all dictionary. If only there was a way to make it so it can’t be passed by name…
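A sketch of the fix, for Python 3.8 or later, with the same hypothetical `html` function and a `/` added to its signature:

```python
def html(tag, text=None, /, **attributes):
    attr_str = ''.join(f' {key}="{value}"' for key, value in attributes.items())
    if text is None:
        return f'<{tag}{attr_str} />'
    return f'<{tag}{attr_str}>{text}</{tag}>'


# positional arguments still reach tag and text
print(html('p', 'Hello, world!'))  # <p>Hello, world!</p>

# text= is now just another attribute
print(html('img', text='Hello, world!'))  # <img text="Hello, world!" />
```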

Now, both tag and text are position-only parameters! You can still pass arguments to them, of course…

…You just have to do so positionally; any keywords will be interpreted as attribute names!

Cool, huh?

Conclusion

This was our first run at functions—namely, all the different ways we can communicate arguments into these parametrized units of code. Next time, we’ll discuss the units of code themselves: what are they? What do they know? Do they know things? Let’s find out! We’ll also talk about decorators—one of the most useful design patterns in Python—and really take some time to understand it inside out. Stay tuned!

The Advanced Python Programming series includes the following articles:

  1. A Value by Any Other Name
  2. To Be, or Not to Be
  3. Loopin’ Around
  4. Functions at Last
  5. To Functions, and Beyond!
  6. Function Internals 1
  7. Function Internals 2
  8. Next Generation
  9. Objects — Objects Everywhere
  10. Objects Incarnate
  11. Meddling with Primal Forces
  12. Descriptors Aplenty
  13. Death and Taxes
  14. Metaphysics
  15. The Ones that Got Away
  16. International Trade


Dan Gittik

Lecturer at Tel Aviv university. Having worked in Military Intelligence, Google and Magic Leap, I’m passionate about the intersection of theory and practice.