Tips and Tricks for Efficient Python Programming

Sanjay G
Published in Subex AI Labs
Aug 10, 2021

Python, one of the fastest-growing programming languages, is best known and loved by developers for its simplicity, vast library support, and the productivity it brings.

If you’ve been coding in Python for a while, you will have come across several tips for writing efficient code, from using list comprehensions to relying on highly optimized built-in functions. This article looks at some more tips you can use right away in your development.

1. Use generator expressions as an iterator to feed a for loop

Consider a situation where you need the cartesian combination of all elements of two lists, or you need to create a huge number of elements and iterate over them in a for loop. In such cases, generator expressions are better since they save the expense of building the entire list and storing it in memory. Instead, they yield the items one by one on demand.

Generator expressions are written like list comprehensions, but they use parentheses “()” instead of square brackets. Consider the following illustrative example showing the use of a generator expression:

brands = ["ASUS", "DELL", "Lenovo", "HP", "Acer", "Microsoft Surface", "MSI", "Toshiba", "Samsung", "Sony"]
cores = ["i3", "i5", "i7"]
for laptop in (str(b)+" "+str(c) for b in brands for c in cores):
    print(laptop)  # Your code here

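Let’s compare the memory usage of a list and a generator expression. We’ll use the getsizeof() function in the sys module, which returns the size of an object in bytes. Note that for a list it reports the size of the list object itself (its array of references), not the size of the strings it refers to.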

import sys
my_list = [str(b)+" "+str(c) for b in brands for c in cores]
print("Size of my_list: ", sys.getsizeof(my_list))
gen = (str(b)+" "+str(c) for b in brands for c in cores)
print("Size of gen: ", sys.getsizeof(gen))
"""
Output
Size of my_list: 352
Size of gen: 128
"""

Let’s add another nested for loop to increase the number of elements.

my_list = [str(b)+" "+str(c)+" "+str(d) for b in brands for c in cores for d in brands]
print("Size of my_list: ", sys.getsizeof(my_list))
gen = (str(b)+" "+str(c)+" "+str(d) for b in brands for c in cores for d in brands)
print("Size of gen: ", sys.getsizeof(gen))
"""
Output
Size of my_list: 2544
Size of gen: 128
"""

Clearly, the list uses more memory than the generator. The list’s memory usage grows with the number of elements, whereas the generator object stays the same size because it produces items one at a time instead of storing them.
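To make the "on demand" behaviour concrete, here is a minimal sketch using a shortened version of the brands list. It also shows a trade-off worth keeping in mind: a generator can be iterated only once.

```python
brands = ["ASUS", "DELL", "Lenovo"]
cores = ["i3", "i5", "i7"]

gen = (f"{b} {c}" for b in brands for c in cores)

# Nothing has been computed yet; each next() call produces exactly one item.
first = next(gen)
second = next(gen)
print(first, "|", second)

# A for loop (or comprehension) continues from where next() stopped.
remaining = [laptop for laptop in gen]
print(len(remaining))

# The generator is now exhausted, so iterating again yields nothing.
leftover = list(gen)
print(leftover)
```

If the same items are needed more than once, either recreate the generator expression or fall back to a list.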

2. Use setdefault to add new keys to a dictionary

When we want to add a key to a dictionary only if it is not already present, and then append a value to it, we would typically write logic like this:

if key not in dict_1:              # key search 1
    dict_1[key] = []               # key search 2
dict_1[key].append(new_value)      # key search 3

The problem with the above logic is that it performs at least two key searches: two (searches 1 and 3) if the key is present, and three (searches 1, 2, and 3) if it is not.

It is possible to achieve the above goal with a single lookup using the setdefault method as shown below.

dict_1.setdefault(key, []).append(new_value)
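As a small illustration, the sketch below groups words by their first letter with setdefault. The word list is made up for the example. For comparison it also shows collections.defaultdict, an alternative that is not part of the tip above but achieves the same grouping without creating a fresh empty list on every call.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Group words by first letter: one dictionary lookup per word.
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)

# Alternative: defaultdict builds the empty list only when the key is missing.
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)

print(dict(by_letter_dd) == by_letter)
```

Note that setdefault evaluates its default argument on every call, even when the key already exists, which matters only if the default is expensive to build.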

3. Use _ and * while unpacking tuple with multiple elements

We may not be interested in all the values unpacked from a tuple. In such situations, we can use _ as a dummy variable to catch an unneeded value. This is especially useful when a function returns multiple values as a tuple and we need only some of them.

# Catch first tuple element with variable a.
a, _ = tuple([1, 2])
print("Value of a: ", a)
"""
Output
Value of a: 1
"""
# Catch second tuple element with variable b.
_, b, _ = tuple([3, 4, 5])
print("Value of b: ", b)
"""
Output
Value of b: 4
"""

We can use *<variable_name> to catch excess values as shown below:

a, b, *other_values = tuple([1, 2, 3, 4, 5])
print("a = ", a)
print("b = ", b)
print("other_values = ", other_values)
"""
Output
a = 1
b = 2
other_values = [3, 4, 5]
"""
a, *other_values, b = tuple([1, 2, 3, 4, 5])
print("a = ", a)
print("other_values = ", other_values)
print("b = ", b)
"""
Output
a = 1
other_values = [2, 3, 4]
b = 5
"""
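The same patterns apply to values returned by functions. The sketch below uses two standard-library functions that return tuples, divmod() and os.path.splitext(); the file name is made up for the example.

```python
import os.path

# divmod() returns (quotient, remainder); keep only the remainder.
_, remainder = divmod(17, 5)

# os.path.splitext() returns (root, extension); keep only the extension.
_, extension = os.path.splitext("report.final.csv")

# Combine * and _ to discard any number of values in the middle.
first, *_, last = "a,b,c,d,e".split(",")

print(remainder, extension, first, last)
```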

4. Use an ellipsis (...) in slicing notation to access all values in a multi-dimensional object

Consider a 2-D array ‘a’ (for example, a NumPy array). To select all columns in the 3rd row, we use the familiar syntax a[2, :]. But let’s say we wanted to apply a similar operation to a 5-dimensional array. In that case, instead of writing a[3, :, :, :, :], we can use an ellipsis and write a[3, ...]. This indicates that we want all values along the remaining dimensions. Note that the ellipsis is typed as three periods, not as the single “…” character.
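To show what the ellipsis stands for without depending on any array library, here is a pure-Python sketch. IndexProbe and expand_ellipsis are hypothetical helpers written for this illustration; they mimic how an array library could interpret the index and are not NumPy's actual implementation.

```python
class IndexProbe:
    """Returns whatever index it receives, so we can inspect it."""
    def __getitem__(self, index):
        return index


def expand_ellipsis(index, ndim):
    """Replace a single Ellipsis with enough full slices to cover ndim axes."""
    if not isinstance(index, tuple):
        index = (index,)
    if Ellipsis not in index:
        # Missing trailing axes are implicitly selected in full.
        return index + (slice(None),) * (ndim - len(index))
    pos = index.index(Ellipsis)
    missing = ndim - (len(index) - 1)
    return index[:pos] + (slice(None),) * missing + index[pos + 1:]


probe = IndexProbe()

# The literal ... is the built-in Ellipsis object.
raw = probe[3, ...]
print(raw)

# For a 5-dimensional object, a[3, ...] means the same as a[3, :, :, :, :].
full = expand_ellipsis(raw, ndim=5)
print(full == probe[3, :, :, :, :])

# The ellipsis can also appear first: a[..., 0] selects index 0 on the last axis.
print(expand_ellipsis(probe[..., 0], ndim=3))
```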

5. Use sets for containment checks

On many occasions, we need to do containment checks of the following form:

item in item_collections

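If item_collections holds a large number of values (possibly with repeats) and we need to check it many times, it is best to create a set out of it once and then do the containment checks against the set. Building the set takes one full pass over the data, so the conversion pays off only when it is reused for several checks.

set_item_collections = set(item_collections)
item in set_item_collections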

Let’s go ahead and compare the time taken to do these containment checks on a list and a set.

from time import perf_counter
my_list = [str(i) for i in range(1000000)]
t1 = perf_counter()
res = "45639" in my_list
list_time = perf_counter() - t1
print("Time taken on list: ", list_time, "s")
set_list = set(my_list)
t2 = perf_counter()
res = "45639" in set_list
set_time = perf_counter() - t2
print("Time taken on set: ", set_time, "s")
print("Ratio list_time/set_time: ", list_time/set_time)
"""
Output
Time taken on list: 0.0008049289972404949 s
Time taken on set: 4.453999281395227e-06 s
Ratio list_time/set_time: 180.72050451439426
"""

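Note the e-06 (order of microseconds) in the time taken on the set. A list is searched element by element, so the lookup time grows with the length of the list and depends on where the item sits, whereas a set uses hashing and finds the item in roughly constant time. In the above case, the set is about 180 times faster, although the exact ratio will vary from run to run. Also, when the set is built from an expression, prefer a set comprehension over passing a list comprehension to set(), since that avoids creating an intermediate list.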

6. Avoid augmented assignment with immutable datatypes

The use of += and *= is called augmented assignment and is an example of syntactic sugar. It may seem convenient, and we may be tempted to use it everywhere. But there is a catch! Let’s look at the following code example, which uses augmented assignment on a list and on a tuple:

>>> a = [1, 2, 3]
>>> print(id(a))
140658173664576 # id of a
>>> a += [7, 8, 9]
>>> print(id(a))
140658173664576 # id of a is the same. It is the same object.
>>> b = (4, 5, 6)
>>> print(id(b))
140658173448768 # id of b
>>> b *= 2
>>> print(id(b))
140658173561920 # id of b has changed! It is a new object.

Did you get the catch? If we have to repeatedly concatenate or add new data, it is better to use mutable types such as lists, since augmented assignment extends the same object in place. For immutable types such as tuples, on the other hand, it is inefficient: every addition forces the interpreter to copy the entire existing data into a newly created object.
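A rough way to see the cost is to build the same sequence both ways and time it. This is only a sketch: the function names are made up for the example, and the timings depend on the machine and the Python version, so no expected numbers are shown.

```python
import timeit


def build_with_tuple(n):
    result = ()
    for i in range(n):
        result += (i,)       # creates a new tuple on every iteration
    return result


def build_with_list(n):
    result = []
    for i in range(n):
        result += [i]        # extends the same list in place
    return tuple(result)     # convert once at the end if a tuple is needed


# Both approaches produce the same tuple.
print(build_with_tuple(5) == build_with_list(5))

tuple_time = timeit.timeit(lambda: build_with_tuple(2000), number=20)
list_time = timeit.timeit(lambda: build_with_list(2000), number=20)
print("tuple +=:", tuple_time, "s")
print("list  +=:", list_time, "s")
```

Accumulating in a list and converting to a tuple once at the end keeps the result immutable without paying for a copy on every step.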

7. Use bisect.insort() to add new elements to a sorted sequence

It is inefficient and expensive to re-sort a sequence every time we add a new element to it, yet simply appending the element to the end may break the sort order. The bisect.insort() function lets us insert elements while keeping the sequence sorted. Here’s an illustrative example of its usage:

>>> import bisect
>>> a = [9, 8, 7, 6]
>>> a_sort = sorted(a)
>>> print(a_sort)
[6, 7, 8, 9]
>>> bisect.insort(a_sort, 4)
>>> print(a_sort)
[4, 6, 7, 8, 9]
>>> bisect.insort(a_sort, 7.5)
>>> print(a_sort)
[4, 6, 7, 7.5, 8, 9]

8. Use sets and set operations wherever possible

By definition, a set is a collection of unique objects. When we want to quickly remove duplicates or count the unique items, we can pass our list to the set() constructor (keep in mind that a set does not preserve the original order of the list):

>>> a = [1, 3, 2, 1, 3, 5, 7, 6, 6]
>>> set_a = set(a)
>>> print(set_a)
{1, 2, 3, 5, 6, 7}
>>> print(list(set_a))
[1, 2, 3, 5, 6, 7]
>>> print(len(set_a))
6

Sets also support mathematical operations such as union, intersection, and set difference. For two sets a and b, these are written as a | b, a & b, and a - b, respectively.
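A quick sketch of the three operators on two small, made-up sets. sorted() is used only to print the results in a predictable order.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b          # elements in either set
intersection = a & b   # elements in both sets
difference = a - b     # elements in a but not in b

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```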

Adroit use of sets reduces loops and conditional checks, which ultimately results in more readable code and a shorter overall runtime. For example, consider a use case where we need to count how many of the instances in ‘items’ occur in a larger collection of all instances, ‘collections’. It’s natural for most of us to jump straight in and write logic that looks something like this:

count = 0
for item in items:
    if item in collections:
        count += 1

But sets and set intersection make it simpler, provided ‘items’ has no duplicates (the intersection counts each distinct item only once):

count = len(set(items) & set(collections))
# To actually find out the items that are present in the collections
items_in_collection = set(items) & set(collections)
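The loop and the intersection give the same count only when ‘items’ contains no repeated values. The sketch below, with made-up data, shows the difference and a variant that keeps the loop’s semantics while still benefiting from fast set lookups.

```python
items = ["a", "b", "a", "z"]
collections = ["a", "b", "c", "d"]

# Loop version: counts every occurrence, duplicates included.
loop_count = 0
for item in items:
    if item in collections:
        loop_count += 1

# Set intersection: counts each distinct item once.
unique_count = len(set(items) & set(collections))

# Same result as the loop, but each containment check hits a set.
lookup = set(collections)
fast_count = sum(1 for item in items if item in lookup)

print(loop_count, unique_count, fast_count)
```

Which count is right depends on whether repeated items should be counted once or every time they appear.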

So that’s all for this article! I hope you’ve learned something new and will start using these tips in your daily Python programming.

Share this article if you found this useful and learned something new.

Happy coding. Cheers!
