Underscores in Python
In this post we’ll go through the different uses of the _ character in Python. As with many things in Python, we’ll see that the different usages of _ are mostly (not always!) a matter of convention. Here are the six different cases we’ll cover:
- Single lone underscore (e.g. _)
- Single underscore before a name (e.g. _total)
- Single underscore after a name (e.g. total_)
- Single underscore in numeric literals (e.g. 100_000)
- Double underscore before a name (e.g. __total)
- Double underscore before and after a name (e.g. __init__)
1. Single lone underscore (_)
This is typically used in 3 cases:
In the interpreter
The _ name points to the result of the last expression evaluated in an interactive interpreter session. This was first done by the standard CPython interpreter, but others have followed as well.
As a name
This is somewhat related to the previous point. _ is used as a throw-away name. This will allow the next person reading your code to know that, by convention, a certain name is assigned but not intended to be used. For instance, you may not be interested in the actual value of a loop counter:
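A minimal sketch of the throw-away usage; the names greetings and extension are made up for illustration:

```python
# Repeat an action three times; the counter itself is never read.
greetings = []
for _ in range(3):
    greetings.append("hello")

# The same convention works when unpacking: keep one part, ignore the other.
_, extension = "notes.txt".split(".")
```

Nothing stops you from reading _ afterwards. It is an ordinary name, and the underscore only signals intent to the reader.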
i18n
You may also see _ being used as a function. In that case, it is often the name used for the function that does internationalisation and localisation string translation lookups. This seems to have originated from, and to follow, the corresponding C convention. For instance, as seen in the Django documentation for translation, you may have:
The second and third purposes can conflict, so you should avoid using _ as a throw-away name in any code block that also uses it for i18n lookup and translation.
2. Single underscore before a name (e.g. _total)
A single underscore before a name is used to specify that the name is to be treated as “private” by a programmer. It’s kind of* a convention so that the next person (likely yourself) reading your code knows that a name starting with _ is for internal use. As the Python documentation notes:
a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
*I say kind of a convention because it actually does mean something to the interpreter: if you write from <module/package> import *, none of the names that start with an _ will be imported unless the module’s/package’s __all__ list explicitly contains them. See ‘Importing * in Python’ for more on this.
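The star-import behaviour can be checked with a small sketch. It builds a throw-away module in memory rather than on disk; the module name underscore_demo and its attributes are hypothetical:

```python
import sys
import types

# Build a tiny module in memory with one public and one "private" name.
demo = types.ModuleType("underscore_demo")
demo.public_name = 1
demo._private_name = 2
sys.modules["underscore_demo"] = demo

# Without __all__, the star import skips names that start with an underscore.
namespace = {}
exec("from underscore_demo import *", namespace)
without_all = sorted(name for name in namespace if name != "__builtins__")

# With __all__, only the listed names are imported, underscore or not.
demo.__all__ = ["_private_name"]
namespace = {}
exec("from underscore_demo import *", namespace)
with_all = sorted(name for name in namespace if name != "__builtins__")
```

In this sketch without_all ends up containing only public_name, and with_all only _private_name.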
3. Single underscore after a name (e.g. total_)
A single underscore after a name is used to keep that name from shadowing another name. It’s certainly a convention. For example, if you want to name something format, then to avoid shadowing Python’s built-in format, you’d instead call it format_.
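As a small sketch, assuming a hypothetical helper called render that takes a format specification as a parameter:

```python
def render(value, format_=".2f"):
    # The parameter is named format_ so the built-in format() stays usable here.
    return format(value, format_)

price = render(3.14159)
percentage = render(0.256, ".1%")
```

Had the parameter been called format, the call to format(...) inside the function would have tried to call the string argument instead of the built-in.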
On a funny note, my preferred Twitter username is @s16h
, but that was already taken, so, my Twitter username is actually @s16h_ 🤣 follow me while you’re here!
4. Single underscore in numeric literals (e.g. 100_000)
PEP 515 proposed extending Python’s syntax so that underscores can be used as visual separators for digit grouping purposes in integral, floating-point and complex number literals. The rationale being:
This is a common feature of other modern languages, and can aid readability of long literals, or literals whose value should clearly separate into parts, such as bytes or words in hexadecimal notation.
So, you can do stuff like:
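A few illustrative literals (the variable names are made up for this sketch):

```python
one_hundred_thousand = 100_000   # decimal integer
flags = 0b_1111_0000             # binary, grouped into nibbles
address = 0xDEAD_BEEF            # hexadecimal, grouped into 16-bit words
avogadro = 6.022_140_76e23       # floating point

# The underscores are purely visual; the values are unchanged.
same_value = one_hundred_thousand == 100000

# The "_" format option produces the same grouping when printing.
grouped = f"{1000000:_}"
```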
5. Double underscore before a name (e.g. __total)
The use of a double underscore (__) in front of a name inside a class (most often a method name) is not a convention; it has a specific meaning to the interpreter. Python mangles these names in order to avoid clashes with names defined by subclasses. As the Python documentation notes, any identifier of the form __spam (at least two leading underscores, and at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped.
Take the following example:
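A minimal sketch of such an example, assuming a class named A (the name the text uses just below) with one method of each kind:

```python
class A:
    def _internal_use(self):
        pass

    def __method_name(self):
        pass


# dir() lists many inherited attributes too; keep only the two defined above.
names = [name for name in dir(A()) if name.endswith(("internal_use", "method_name"))]
print(names)  # ['_A__method_name', '_internal_use']
```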
As expected, _internal_use doesn’t change, but __method_name is mangled to _ClassName__method_name (for a class named A, that is _A__method_name). Now, if you create a subclass of A, say B (argh, bad, bad names!), then you can’t easily override A’s __method_name:
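A sketch of what happens, assuming A also has a hypothetical public method call that uses the mangled method internally:

```python
class A:
    def __method_name(self):
        return "A"

    def call(self):
        # Inside class A this is compiled as self._A__method_name().
        return self.__method_name()


class B(A):
    def __method_name(self):
        # Stored as _B__method_name, so A's version is not replaced.
        return "B"


result = B().call()
mangled = [name for name in dir(B()) if name.endswith("__method_name")]
```

Here result is "A", and both _A__method_name and _B__method_name exist side by side on the instance. You could still override the method by defining _A__method_name in B explicitly, which is why the text says you can’t easily override it rather than that you can’t at all.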
The intended behaviour here is almost equivalent to final methods in Java and normal (non-virtual) methods in C++.
6. Double underscore before and after a name (e.g. __init__)
These are special method names used by Python. As far as you’re concerned, this is just a convention, a way for the Python system to use names that won’t conflict with user-defined names. You then typically override these methods and define the desired behaviour for when Python calls them. For example, you often override the __init__ method when writing a class.
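As a short sketch, here is a hypothetical Playlist class that overrides two special methods; Python calls __init__ when the object is created and __len__ when len() is applied to it:

```python
class Playlist:
    def __init__(self, songs):
        # Called by Python when Playlist(...) is evaluated.
        self.songs = list(songs)

    def __len__(self):
        # Called by Python when len(playlist) is evaluated.
        return len(self.songs)


playlist = Playlist(["first", "second", "third"])
size = len(playlist)
```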
There is nothing to stop you from writing your own special-method-looking name (but, please don’t):
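For illustration only, a sketch with a made-up special-looking name, __mine__, on a hypothetical Greeter class:

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def __mine__(self):
        # Python never calls this on its own; it is an ordinary method.
        return f"hello from {self.name}"


# Because the name also ends in two underscores, it is not mangled
# and has to be called explicitly.
greeting = Greeter("me").__mine__()
```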
It’s best to stay away from this type of naming and let only Python-defined special names follow this convention.
That’s all! If you enjoyed reading this, follow me on Twitter. Otherwise, follow me on Twitter, and tell me how much you hated reading this.