Python 3, Tips and Best Practices to avoid common errors
Improve the quality of your code with these Must-Know Python Tips
In this article, you can find some tips to avoid common errors in Python 3 and some insights into the Pythonic way of writing programs. We are going to talk about function default arguments, the walrus operator, and formatting expressions.
1. Do not use dynamic default arguments
Sometimes you may need to use a non-static type as a default argument value. For example, let’s imagine we need to define a function that prints a given message, and we want to include information about when the function was called (a sort of log function).
from datetime import datetime
from time import sleep
def print_time(when: datetime = datetime.now()):
    print(f'This function was called at: {when}')

print_time()
sleep(0.5)
print_time()
>>>
This function was called at: 2021-01-26 10:22:00.443739
This function was called at: 2021-01-26 10:22:00.443739
It is not working as expected, is it? The timestamps are the same because datetime.now() is executed only once, when the function is defined. More precisely, default argument values are evaluated only when the def statement they belong to is executed. They are not evaluated again on each call.
You can achieve the desired output by assigning None as the default value and creating the real value inside the function body.
from typing import Optional

def print_time(when: Optional[datetime] = None):
    if when is None:
        when = datetime.now()
    print(f'This function was called at: {when}')

print_time()
sleep(0.5)
print_time()
>>>
This function was called at: 2021-01-26 10:22:03.311862
This function was called at: 2021-01-26 10:22:03.814828
Something similar happens if the default argument is mutable.
Keep in mind that objects of type int, float, str, bool, tuple, and bytes are immutable, while dict, list, and set objects are mutable.
Look at the following example:
from typing import List

def foo(element: float, data: List[float] = []) -> List:
    data.append(element)
    return data

foo(5)
>>>
[5]

foo(5)
>>>
[5, 5]

foo(5)
>>>
[5, 5, 5]
Multiple calls to the same function keep appending the element to the same list, which is never reinitialized. Hence, the output is a list with as many items as the number of times the function has been called.
The reason is that functions in Python are first-class objects. When a def statement is executed, Python creates a new function object, evaluates the default values, and stores them on that object.
In our example, when we call foo(5) we modify the default argument by appending 5 to that stored list. Python keeps the list in memory, with all its changes, until the same def statement is re-executed.
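The stored defaults can be inspected through the __defaults__ attribute of the function object. The sketch below reuses foo from the example above (type hints dropped for brevity) and shows that every call returns the very same list object.

```python
def foo(element, data=[]):
    # The default list is created once, when this def statement runs.
    data.append(element)
    return data

# The default value lives on the function object itself.
print(foo.__defaults__)              # ([],)

first = foo(5)
second = foo(5)

# Both calls returned the same object: the stored default list.
print(first is second)               # True
print(foo.__defaults__[0] is first)  # True
print(foo.__defaults__)              # ([5, 5],)
```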
Fortunately, there is a better way to handle this problem than re-executing the def statement every single time. As in the previous case, the solution is to replace the mutable object with a placeholder, for example None.
from typing import List, Optional

def foo(element: float, data: Optional[List] = None) -> List:
    if data is None:
        data = []
    data.append(element)
    return data

foo(5)
>>>
[5]

foo(5)
>>>
[5]
Everything seems to work. But what happens if we pass an existing list to the foo function? Look at the following example.
from typing import List, Optional

def foo(element: float, data: Optional[List] = None) -> List:
    if data is None:
        data = []
    data.append(element)
    return data

my_list = [1, 2, 3]

foo(5, my_list)
>>>
[1, 2, 3, 5]

print(my_list)
>>>
[1, 2, 3, 5]
The function output seems right, precisely what we were expecting. But wait! my_list has changed!
The problem is that we are using an in-place method (append()) and we are passing my_list as the value of data, so both names refer to the same list object. As a result, modifying data also changes my_list.
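The identity of the two names can be checked with the is operator. This is a minimal sketch; show_identity is a hypothetical helper introduced only for illustration.

```python
def show_identity(data):
    # Hypothetical helper: reports whether the parameter is the caller's list.
    return data is my_list

my_list = [1, 2, 3]

# Passing the list itself: the function receives the same object, not a copy.
print(show_identity(my_list))         # True

# Passing a copy: the function receives a distinct object.
print(show_identity(my_list.copy()))  # False
```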
To solve this problem, you can use copy(), [:] or list() to create a separate copy of the list.
from typing import List, Optional

def foo(element: float, data: Optional[List] = None) -> List:
    if data is None:
        new_data = []
    else:
        new_data = data.copy()  # or data[:] or list(data)
    new_data.append(element)
    return new_data

my_list = [1, 2, 3]

foo(5, my_list)
>>>
[1, 2, 3, 5]

print(my_list)
>>>
[1, 2, 3]
Alternatively, a more elegant way is to avoid the in-place method and use the + operator, which builds a new list, as follows.
from typing import List, Optional

def foo(element: float, data: Optional[List] = None) -> List:
    return (data if data is not None else []) + [element]

my_list = [1, 2, 3]

foo(5, my_list)
>>>
[1, 2, 3, 5]

print(my_list)
>>>
[1, 2, 3]
2. Don’t repeat yourself
One of the coding best practices is to follow the DRY (Don’t Repeat Yourself) principle. If you need to modify a piece of code that is repeated elsewhere, you can easily forget to make all the necessary changes and thus introduce bugs into your program. Furthermore, executing the same code several times can be needlessly inefficient.
A fairly new kind of expression, introduced in Python 3.8, can help you avoid code duplication. It is written as := (a colon followed by an equals sign), and it is known as the walrus operator because it looks like a pair of eyeballs and tusks.
In Python, we can assign the value of b to a by writing a = b. The walrus operator, also known as an assignment expression, is written as a := b.
The walrus operator is useful because it enables you to assign variables in places where assignment statements are disallowed, such as in the conditional expression of an if statement.
An assignment expression’s value evaluates to whatever was assigned to the identifier on the left side of the walrus operator.
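As a quick illustration of that rule, here is a minimal sketch in which the names numbers, n and message are placeholders chosen for the example.

```python
numbers = [4, 8, 15]

# (n := len(numbers)) assigns 3 to n and also evaluates to 3,
# so the result can be compared directly inside the condition.
if (n := len(numbers)) > 2:
    message = f'{n} numbers is more than 2'

print(message)  # 3 numbers is more than 2
print(n)        # 3
```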
For example, let’s assume we want to make a chocolate cake and we need at least 5 eggs and 150g of chocolate. We specify a dictionary with the ingredients we have bought.
ingredients = {'egg': 7, 'chocolate': 200}
Now, we should check if we have enough units to make a cake. One possibility is the following.
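A minimal sketch of such a check, assuming the thresholds stated above (5 eggs and 150g of chocolate). The function name check_ingredients and the constant names are hypothetical.

```python
ingredients = {'egg': 7, 'chocolate': 200}

# Thresholds taken from the text; the constant names are assumptions.
MIN_EGGS = 5
MIN_CHOCOLATE = 150

def check_ingredients(ingredients):
    # Look up each quantity first, then test it in a separate step.
    eggs = ingredients.get('egg', 0)
    chocolate = ingredients.get('chocolate', 0)
    if eggs >= MIN_EGGS and chocolate >= MIN_CHOCOLATE:
        return f'Enough for a cake: {eggs} eggs and {chocolate}g of chocolate'
    return 'Not enough ingredients'

print(check_ingredients(ingredients))
# Enough for a cake: 7 eggs and 200g of chocolate
```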
We can shorten the lines of code with the walrus operator as follows.
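A minimal sketch of the shortened check, assuming the same thresholds (5 eggs and 150g of chocolate). The function name check_ingredients_walrus is hypothetical.

```python
ingredients = {'egg': 7, 'chocolate': 200}

def check_ingredients_walrus(ingredients):
    # Each quantity is assigned and tested in the same expression.
    # Because `and` short-circuits, chocolate is assigned only when the
    # egg check passes, so it is used only inside the if branch.
    if ((eggs := ingredients.get('egg', 0)) >= 5
            and (chocolate := ingredients.get('chocolate', 0)) >= 150):
        return f'Enough for a cake: {eggs} eggs and {chocolate}g of chocolate'
    return 'Not enough ingredients'

print(check_ingredients_walrus(ingredients))
# Enough for a cake: 7 eggs and 200g of chocolate
```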
You can also use the walrus operator in list and dictionary comprehensions.
Assume we want to know, for each ingredient, how many groups of 10 units we have, keeping only the ingredients with at least 10 units.
We could complete this task as follows.
tenth_ingredients = {}
for i, cnt in ingredients.items():
    tenth = cnt//10
    if tenth>0:
        tenth_ingredients[i] = tenth
Using the walrus operator, we can perform the same task in a single line!
{i: tenth for i, cnt in ingredients.items() if (tenth := cnt//10)>0}
>>>
{'chocolate': 20}
On the other hand, in list comprehensions, the walrus operator can be useful if you want to keep track of the last item.
ingredients = ['apple', 'chocolate', 'sugar']
print([(last := i) for i in ingredients])
print(f'Last ingredient is {last}')
>>>
['apple', 'chocolate', 'sugar']
Last ingredient is sugar
The same is not true for the loop variable of a list comprehension, which does not leak into the enclosing scope.
print([i for i in ingredients])
print(i)
>>>
['apple', 'chocolate', 'sugar']
Traceback ...
NameError: name 'i' is not defined
3. Be aware of different formatting styles
Formatting is the process of combining predefined text with data values into a single human-readable message that is stored as a string.
Python has four different ways of formatting strings, all built into the language.
C-style Format string
Probably the most common way to format a string in Python comes from C’s printf function. It uses the % operator and format specifiers (like %d) as placeholders that are replaced by the values given on the right side of the formatting expression.
key = 'my_variable'
value = 0.1234
formatted = '%s = %.2f' % (key, value)
print(formatted)
>>>
my_variable = 0.12
The first problem with C-style formatting expressions is that the order of the tuple elements matters: if we swap key and value, we get an error.
formatted = '%s = %.2f' % (value, key)
print(formatted)
>>>
Traceback ...
TypeError: must be real number, not str
Furthermore, if you want to use the same value multiple times, you have to repeat it.
name = 'Giovanni'
formatted = '%s goes to school. %s studies math.' % (name, name)
print(formatted)
>>>
Giovanni goes to school. Giovanni studies math.
The third and last problem is that this type of formatting is difficult to read and difficult to modify, especially when a long list of values needs to be substituted.
C-style Format string with dictionary
To solve some of these problems, the % operator also accepts a dictionary instead of a tuple. The keys of the dictionary are matched with the format specifiers that carry the corresponding name.
key = 'my_variable'
value = 0.1234
formatted = '%(key)s = %(value).2f' % {'key': key, 'value': value}
print(formatted)
>>>
my_variable = 0.12
This formatting style solves the first problem since the order in which keys and values appear in the dictionary is not important anymore.
It also solves the second problem since it is no longer necessary to pass the same value twice.
name = 'Giovanni'
formatted = '%(name)s goes to school. %(name)s studies math.' % {'name': name}
print(formatted)
>>>
Giovanni goes to school. Giovanni studies math.
However, the third problem remains, since formatting expressions become longer and more difficult to read.
The str.format method
This method uses curly brackets {} instead of the percent sign % to specify placeholders. By default, placeholders are replaced by the values passed to the format method in the order in which they appear.
key = 'my_variable'
value = 0.1234
formatted = '{} = {:.2f}'.format(key, value)
print(formatted)
>>>
my_variable = 0.12
But you can also specify the desired order within the brackets.
key = 'my_variable'
value = 0.1234
formatted = '{1} = {0}'.format(key, value)
print(formatted)
>>>
0.1234 = my_variable
You can even use the same positional index multiple times.
name = 'Giovanni'
formatted = '{0} goes to school. {0} studies math.'.format(name)
print(formatted)
>>>
Giovanni goes to school. Giovanni studies math.
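The format method also accepts keyword arguments, so placeholders can be named instead of numbered. The sketch below reuses the variables from the previous examples. It removes the dependence on positional order, although every name still has to be spelled out again in the call.

```python
key = 'my_variable'
value = 0.1234

# Named placeholders are filled from the keyword arguments of format().
formatted = '{key} = {value:.2f}'.format(key=key, value=value)
print(formatted)  # my_variable = 0.12

name = 'Giovanni'
sentence = '{name} goes to school. {name} studies math.'.format(name=name)
print(sentence)   # Giovanni goes to school. Giovanni studies math.
```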
Unfortunately, problems with readability remain. The expressions are still noisy to read, so it is still easy to introduce bugs when small modifications are needed.
F-strings
Python 3.6 added interpolated format strings, also known as f-strings, to solve all these problems. This syntax requires an f as a string prefix. F-strings are similar to str.format, but they eliminate the redundancy of providing keys and values to be formatted. They are succinct yet powerful because they allow Python expressions to be embedded directly within the placeholders.
key = 'my_variable'
value = 0.1234
formatted = f'{key} = {value:.2f}'
print(formatted)
>>>
my_variable = 0.12
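To illustrate embedded expressions, here is a minimal sketch. The variable places is a hypothetical addition used to show that the format specifier itself can contain a nested placeholder.

```python
key = 'my_variable'
value = 0.1234
places = 3  # hypothetical variable controlling the precision

# Any Python expression can appear inside the braces.
as_percent = f'{key.upper()} = {value * 100:.1f}%'
print(as_percent)  # MY_VARIABLE = 12.3%

# The format specifier can itself contain a nested placeholder.
rounded = f'{key} = {value:.{places}f}'
print(rounded)     # my_variable = 0.123
```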
Let’s compare all four formatting expressions we have just seen with a new example.
fridge = {'tomatoes': 10.00, 'eggs': 5.00, 'avocados': 1.5}

# c-style
for i, (item, count) in enumerate(fridge.items()):
    print('#%d: %-10s = %d' % (i+1, item.title(), round(count)))

# c-style with dictionary
for i, (item, count) in enumerate(fridge.items()):
    print('#%(i)d: %(item)-10s = %(count)d' % {'i': i+1,
          'item': item.title(), 'count': round(count)})

# str.format
for i, (item, count) in enumerate(fridge.items()):
    print('#{:d}: {:10s} = {:d}'.format(i+1, item.title(), round(count)))

# f-string
for i, (item, count) in enumerate(fridge.items()):
    print(f'#{i+1}: {item.title():<10s} = {round(count)}')
In all of these cases the result is the same, but which version is the most expressive and the easiest to read for you?
>>>
#1: Tomatoes   = 10
#2: Eggs       = 5
#3: Avocados   = 2
I think that the combination of expressiveness, concision, and clarity makes the f-string formatting expression probably the best option to avoid bugs and improve readability.
Conclusions
That’s all! So far, we have discussed some of the best practices in Python that I think are not yet widely used.
From now on, remember the following tips while coding:
- Do not use dynamic or mutable values as default arguments for your Python functions;
- Do not modify input arguments inside your functions to avoid odd behaviours;
- Do not repeat yourself. Avoiding duplication improves code efficiency and prevents bugs;
- Prefer the f-string formatting expression to the others.
Did you already know all of them? What are the best practices that you are following?
If you have anything to add or to suggest, feel free to leave a comment!