Understanding the ‘yield’ Keyword in Python
The yield keyword is used inside a function much like a return statement, but it turns that function into a generator function: calling it returns a generator instead of a single value. A generator is an iterator, a kind of iterable that you can iterate over only once. Generators do not store all their values in memory; they produce each value on demand, which makes them far more memory-efficient than a list when dealing with large datasets.
This is a recipe from PythonFleek.
CODE
Yield keyword in Python returns a generator
def fib(limit):
    a, b = 0, 1
    while a < limit:
        yield a
        a, b = b, a + b
EXPLANATION
This keyword is used by Python developers, especially those working with large datasets and needing to optimize memory usage.
— Definition: yield is a keyword used inside a function much like a return statement, but a function that contains it returns a generator when called.
— Generator: A generator is an iterator, a kind of iterable you can iterate over only once. Generators do not store all their values in memory; they produce them on the fly.
— Usage: yield is used when we want to iterate over a sequence but don’t want to store the entire sequence in memory.
— Function: When a function containing yield is called, it returns a generator object without executing any of the function body.
— Execution: When the __next__() method is called for the first time, the function runs until it reaches a yield statement. The yielded value is returned by the __next__() call.
— Pause & Continue: Execution is then paused and control returns to the caller. Local variables and their state are preserved between successive calls, so the next call resumes right after the yield.
— End: When the function body finishes, either by reaching a return statement or by running off the end, StopIteration is raised automatically, and again on every further call.
— Example: The code above defines a generator function that yields the Fibonacci numbers below limit. We can iterate over them with a for loop or manually with the next() function.
— Benefit: The main advantage of a generator over a list is that it takes much less memory, because the values are produced one at a time instead of being held all at once.
— Use Cases: Use yield when you want to iterate over a large sequence without holding the entire sequence in memory.
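The lifecycle described in the list above (no execution on call, run to the first yield, pause and resume, then StopIteration) can be observed directly. The following is a minimal sketch, not part of the original recipe; the trace list is a hypothetical helper added only to record when the function body actually runs.

```python
trace = []  # hypothetical log, used only to show when the body runs


def fib(limit):
    """Yield the Fibonacci numbers strictly below limit."""
    trace.append("body started")
    a, b = 0, 1
    while a < limit:
        yield a
        a, b = b, a + b
    trace.append("body finished")


gen = fib(3)                    # calling the function only builds the generator
started_on_call = bool(trace)   # False: the body has not run yet

first = next(gen)               # runs the body up to the first yield
started_on_next = bool(trace)   # True: the body started on the first next()

rest = list(gen)                # resumes after the yield and drains the generator

try:
    next(gen)                   # the body has finished, so this raises
    exhausted = False
except StopIteration:
    exhausted = True

print(started_on_call, started_on_next, first, rest, exhausted)
# prints: False True 0 [1, 1, 2] True
```

A for loop performs the same next() calls and handles StopIteration for you, which is why generators are rarely advanced by hand in everyday code.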
Why: Understanding the ‘yield’ keyword is crucial for Python developers as it allows for more efficient memory usage when dealing with large datasets. It is especially useful in scenarios where you need to iterate over a large sequence, but don’t want to store the entire sequence in memory.
Install: No installation is needed as 'yield' is a built-in keyword in Python.
Algorithm
‘yield’ is a keyword in Python that is used to define a generator function. This function returns an iterator that generates the values on the fly, making it more memory-efficient.
# Python program to demonstrate the use of 'yield' keyword
# A generator function for Fibonacci Numbers
def fib(limit):
    # Initialize first two Fibonacci Numbers
    a, b = 0, 1
    # One by one yield next Fibonacci Number
    while a < limit:
        yield a
        a, b = b, a + b
# Create a generator object
x = fib(5)
# Iterating over the generator object using next
# (fib(5) yields exactly five values: 0, 1, 1, 2, 3)
print(x.__next__())
print(x.__next__())
print(x.__next__())
print(x.__next__())
print(x.__next__())
# Iterating over the generator object using for
# in loop.
print('\nUsing for in loop')
for i in fib(5):
    print(i)
Demo 1
Yield is useful when iterating over large sequences
# This is a simple Python program to demonstrate the use of 'yield' keyword
# A generator function that yields 1 for the first time,
# 2 second time and 3 third time
def simpleGeneratorFun():
    yield 1
    yield 2
    yield 3
# Driver code to check above generator function
for value in simpleGeneratorFun():
    print(value)
Demo 2
Generators are memory-efficient for large datasets
# Python program to demonstrate the use of 'yield' keyword
# A generator function that yields numbers from 0 to 9
def numGenerator():
    n = 0
    while n < 10:
        yield n
        n += 1
# Using for loop
print('Using for loop:')
for num in numGenerator():
    print(num)
# Using next() function
print('\nUsing next() function:')
numbers = numGenerator()
print(next(numbers))
print(next(numbers))
print(next(numbers))
Case Study
Suppose we were tasked with creating a function that generates a large sequence of numbers in Python. We could use a list to store all the numbers, but this would consume a lot of memory, especially for large sequences. Instead, we could use a generator function with the ‘yield’ keyword. The function would look something like this:
def generate_numbers(n):
    for i in range(n):
        yield i
When we call this function, it returns a generator object:
gen = generate_numbers(10)
We can then iterate over this object using a for loop:
for number in gen:
    print(number)
This will print the numbers from 0 to 9, one at a time, without storing all the numbers in memory at once.
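To take the case study one step further, the sketch below feeds the same generate_numbers function into consumers that never need the whole sequence at once. The choice of sum() and itertools.islice is an assumption made for illustration; any code that accepts an iterable would work the same way.

```python
from itertools import islice


def generate_numbers(n):
    """Yield the integers 0 .. n-1 one at a time."""
    for i in range(n):
        yield i


# sum() pulls one value at a time; no million-element list is ever built
total = sum(generate_numbers(1_000_000))

# islice() takes only the first five values and leaves the rest ungenerated
first_five = list(islice(generate_numbers(1_000_000), 5))

print(total)       # prints: 499999500000
print(first_five)  # prints: [0, 1, 2, 3, 4]
```

Because islice stops after five items, the second generator never produces the remaining values, so the cost scales with what is consumed rather than with n.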
Pitfalls
— Understanding generators: The ‘yield’ keyword is closely tied to the concept of generators in Python. If you’re not familiar with generators, the ‘yield’ keyword can be confusing. It’s important to understand that a generator is a type of iterable that generates its values on the fly, rather than storing them all in memory.
— Function execution: When a function containing ‘yield’ is called, it doesn’t start executing immediately. Instead, it returns a generator object. The function body only starts executing when the ‘__next__()’ method is called on the generator object, whether directly, through the built-in next(), or implicitly by a for loop.
— Single-use: Generators can only be iterated over once. Once a generator’s values have been consumed, you can’t iterate over them again; call the generator function again to get a fresh generator.
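The single-use pitfall is easy to reproduce. This short sketch uses a hypothetical numbers() generator function to show that a second pass over the same generator object silently produces nothing, while calling the function again gives a fresh generator.

```python
def numbers():
    """A tiny generator function used only for this demonstration."""
    yield 1
    yield 2
    yield 3


gen = numbers()
first_pass = list(gen)        # consumes every value
second_pass = list(gen)       # the generator is exhausted, so this is empty
fresh_pass = list(numbers())  # a new call creates a new generator

print(first_pass, second_pass, fresh_pass)
# prints: [1, 2, 3] [] [1, 2, 3]
```

Note that the second pass raises no error, which is what makes this mistake hard to spot. If the values are needed more than once, either store them in a list or re-create the generator.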
Tips for Production
— Memory efficiency: The ‘yield’ keyword can be very useful when working with large datasets, as it allows you to create a sequence of values without storing them all in memory. This can significantly reduce the memory footprint of your program.
— Lazy evaluation: Generators are lazily evaluated, which means they only generate their values as needed. This can make your program more efficient, as it avoids unnecessary computation.
— Use with large sequences: If you’re working with a large sequence and you don’t need to access all the values at once, consider using a generator with the ‘yield’ keyword. This can make your code more efficient and easier to read.
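The memory and laziness claims in the tips above can be checked with a rough sketch like the following. It assumes sys.getsizeof is an acceptable proxy: it reports only the size of the container object itself, not of the integers it refers to, so the real gap is larger than the printed numbers suggest. The computed list and the squares() function are hypothetical helpers for the demonstration.

```python
import sys

n = 1_000_000

as_list = [i * i for i in range(n)]  # every value is built up front
as_gen = (i * i for i in range(n))   # nothing is computed yet

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n
print(list_bytes > gen_bytes)        # prints: True

computed = []  # hypothetical log of which inputs were actually processed


def squares(limit):
    """Yield squares lazily, recording each input as it is processed."""
    for i in range(limit):
        computed.append(i)
        yield i * i


# Stop at the first square above 50; later values are never computed
first_big = next(sq for sq in squares(n) if sq > 50)
print(first_big, len(computed))      # prints: 64 9
```

Exact byte counts vary between Python versions and platforms, so the sketch compares the two sizes rather than quoting specific figures.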
Encore
In conclusion, the ‘yield’ keyword in Python is a powerful tool that allows for more efficient memory usage, especially when dealing with large datasets. It is used in a function like a return statement, but instead of returning a value and terminating the function, it produces a value and suspends the function’s execution. The function can be resumed later from where it left off, so it produces a series of values over time rather than computing them all at once and returning them together as a list. This makes ‘yield’ particularly useful when working with datasets that can’t fit into memory. Using ‘yield’ where values are consumed one at a time can substantially reduce the memory footprint of your Python code and avoid computing values that are never used.
# Defining a generator function
# This function will yield numbers from 0 to 9
# We can iterate over these numbers using a for loop or manually using the 'next()' function
def num_generator():
    i = 0
    while i < 10:
        yield i
        i += 1
# Creating a generator object
nums = num_generator()
# Iterating over the generator object using next()
print(next(nums))
print(next(nums))
print(next(nums))
# Iterating over the remaining items (3 to 9) in the generator object using a for loop
for num in nums:
    print(num)
# The main advantage of a generator over a list is that it takes much less memory
# We should use 'yield' when we want to iterate over a large sequence, but don't want to store the entire sequence in memory
# For example, range() produces its values lazily, so no million-element container is built here
large_range = range(0, 1000000)
# Now, let's create a generator expression that yields items from this large range
large_range_generator = (i for i in large_range)
# We can now iterate over a million values while holding only one of them at a time
for i in large_range_generator:
    pass  # do something with i
# Note: wrapping an already-built list or NumPy array (such as np.arange(0, 1000000)) in a
# generator does not free its memory; the saving comes from never building the full container