Why should you use generators in Python?

Rushiraj Gadhvi
3 min read · Feb 13, 2023



As a Python developer, you likely spend a lot of time working with large amounts of data. Whether you’re processing logs or working with large datasets, it’s important to make sure your code is as efficient as possible.

One way to reduce your code’s memory footprint, and often improve its performance, is to use generators. So, what are generators?

Generators are a special type of function in Python that produce values one at a time instead of producing all of them at once. They are called “generators” because they generate values on the fly, as you need them.

Take the following example:

Suppose we need to produce the squares of numbers to solve a certain problem, and we want the most memory-efficient way to do so because memory is limited.

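def square_it(n):
    # Build the complete list of squares in memory
    squares = []
    for i in range(n):
        squares.append(i**2)

    return squares

# alternate method to create the same list (n must be defined first)
n = 10
squares = [x**2 for x in range(n)]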

The above function square_it would correctly produce the needed output.

# Output of square_it for n = 10
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

As we can see, every element is computed up front and stored in a list. This works for small inputs, but the list grows with n, so when memory is restricted and the input is large, holding all the values at once is not the most efficient technique.

Now, let’s tackle the same problem with generators.

def square_gen(n):
    for i in range(n):
        yield i**2

Syntax-wise, there is not much difference between a normal function and a generator function: the generator uses the ‘yield’ keyword instead of building a list and using ‘return’. Additionally, note that calling a generator function returns a generator object, which must be iterated over to get the actual values.

gen = square_gen(10)

print(gen)
# output: <generator object square_gen at 0x7f2606471630>

print(next(gen))
# output: 0
print(next(gen))
# output: 1
print(next(gen))
# output: 4

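As mentioned earlier, the generator computes its values step by step. To get them, we can call the next() function repeatedly to walk through the produced values in order. Alternatively, you can loop over the generator object. Note that a generator remembers its position, so looping over gen after the three next() calls above continues from 9 rather than starting again at 0.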

for item in gen:
    print(item)
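To make the memory difference concrete, here is a rough sketch that compares the list version with the generator version. It is only an illustration: square_it is rewritten here as a list comprehension, the variable names are made up for this example, and sys.getsizeof reports the size of the container object only, so the exact numbers will vary between Python versions.

```python
import sys

def square_it(n):
    # List version: all n squares exist in memory at the same time
    return [i**2 for i in range(n)]

def square_gen(n):
    # Generator version: one square is produced per request
    for i in range(n):
        yield i**2

n = 1_000_000
squares_list = square_it(n)
squares_gen = square_gen(n)

# getsizeof counts only the container, not the integers the list refers to,
# so the real cost of the list is higher than the figure printed here.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(f"list:      {list_size} bytes")
print(f"generator: {gen_size} bytes")
```

The generator's size stays small and roughly constant no matter how large n is, while the list grows with n.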

Generators are a great tool to keep in your arsenal. Now let’s look more closely at how a generator works.

Calling a generator function does not run its body; it only returns a generator object. The body starts executing on the first call to next() (or the first iteration of a for loop). When execution reaches a yield, the generator hands back the value of the expression after yield and suspends, keeping its local variables intact. On the next call to next(), it resumes right after that yield and runs until it reaches another one.

This continues until the function body ends, at which point the generator raises a StopIteration exception to signal that there are no more values. A for loop handles this exception for you and simply stops.
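The pause-and-resume behaviour described above can be observed with a small sketch. The function two_step and the events list are hypothetical names invented for this illustration; the list simply records which parts of the body have run so far.

```python
events = []

def two_step():
    # Hypothetical generator that records its progress in `events`
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("finished")

gen = two_step()
print(events)      # still empty: creating the generator ran none of the body

print(next(gen))   # runs up to the first yield
print(events)

print(next(gen))   # resumes after the first yield, runs up to the second

try:
    next(gen)      # body runs to the end, so StopIteration is raised
except StopIteration:
    print("no more values")

print(events)
```

Nothing is recorded until the first next() call, and each later call picks up exactly where the previous one stopped.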

Benefits:

  1. Processing large datasets: One of the biggest benefits of generators is that they let you work through large datasets without storing all of the data in memory at once. This can greatly reduce the memory footprint of your code, and processing can begin as soon as the first value is available instead of waiting for the whole collection to be built.
  2. Log processing: If you’re processing logs, you can use a generator to read and process each log entry one at a time, instead of reading and storing all of the entries in memory at once.
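As a minimal sketch of the log-processing idea, the generator below filters log entries one line at a time. The function name error_lines, the "ERROR" marker, and the sample log lines are all assumptions made for this example; io.StringIO stands in for a real log file so that the snippet runs on its own.

```python
import io

def error_lines(lines):
    """Yield only the lines that contain 'ERROR', one at a time."""
    for line in lines:
        line = line.rstrip("\n")
        if "ERROR" in line:
            yield line

# Stand-in for a real log file. With a file on disk you would pass the
# object returned by open(...), which is also read line by line.
sample_log = io.StringIO(
    "2023-02-13 10:00:01 INFO service started\n"
    "2023-02-13 10:00:05 ERROR could not reach database\n"
    "2023-02-13 10:00:09 INFO retrying\n"
    "2023-02-13 10:00:12 ERROR retry failed\n"
)

error_count = 0
for entry in error_lines(sample_log):
    print(entry)
    error_count += 1

print(f"{error_count} error entries found")
```

Because the generator pulls one line at a time from its input, only the current line needs to be held in memory, however long the log is.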

Tips:

One-line generator expression: replace the square brackets [] of a list comprehension with parentheses () to get a generator object instead of a list.

gen = (x**2 for x in range(10))
# produces the same squares as square_gen(10), in one line
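A generator expression like this is handy when the values are consumed right away, for example by sum(). The short sketch below also shows a point worth keeping in mind: a generator can be consumed only once. The names total and leftover are illustrative.

```python
gen = (x**2 for x in range(10))

# sum() pulls values from the generator one at a time; no list is built
total = sum(gen)
print(total)      # 285

# The generator is now exhausted, so a second pass yields nothing
leftover = list(gen)
print(leftover)   # []
```

If you need to go through the values more than once, create a new generator or store the results in a list.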

Keep learning, keep coding!
