Everything You Need to Know About Python Iterators — A Guide
Iterators are at the core of how the for loop works in Python. To loop over container objects such as lists, tuples, dictionaries, sets, strings, and files, those objects have to be iterable.
We know that everything in Python is an object. So when we create a list, a tuple, or any other container type, it is an object as well, and every built-in container type in Python is iterable.
Basically, there are two essential things when it comes to the for loop or iterating through objects: the iterable and the iterator. Let's see the definitions of both from the official Python docs:
An iterable object is an object that implements __iter__, which is expected to return an iterator object. An iterator object implements __next__, which is expected to return the next element of the iterable object that returned it and to raise a StopIteration exception when no more elements are available.
From the above definitions, we can be certain that every built-in container type in Python has an __iter__ implementation that returns an iterator object. The returned iterator has a __next__ method, which hands back the container's elements one at a time.
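To make the distinction concrete, here is a minimal sketch that checks both definitions against a list. The variable names are only illustrative; the abstract base classes Iterable and Iterator come from the standard collections.abc module.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]

# A list is iterable (it has __iter__) but it is not itself an iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# iter() calls numbers.__iter__() and hands back a separate iterator object.
numbers_iterator = iter(numbers)
iterator_is_iterator = isinstance(numbers_iterator, Iterator)

# next() calls numbers_iterator.__next__() to fetch one element at a time.
first = next(numbers_iterator)
second = next(numbers_iterator)

print(list_is_iterable, list_is_iterator, iterator_is_iterator)
print(first, second)
```

The list only knows how to produce an iterator; the iterator is the object that remembers the current position.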
Let's take a for loop and walk through how it works.
lst = [1, 2, 3, 4]
for i in lst:
    print(i)

dt = {1: "one", 2: "two"}
for d in dt:
    print(d)  # iterating over a dict yields its keys

# for files
with open("somefile", 'r') as some_file:
    for line in some_file:
        print(line)

# it goes on for all container types including tuple, set, strings
In the above code, when the for loop starts, it calls the __iter__ method of the list lst. The __iter__ method in turn returns an iterator object. The for loop then uses that iterator object and calls __next__ on it.
The __next__ method returns one element from the container on each iteration. Once we have iterated over all the elements and none are left, a StopIteration exception is raised, indicating the end of the iteration. Below is code that is approximately equivalent to what I've described above.
lst = [1, 2, 3, 4]
print(dir(lst))
# [..., '__getitem__', '__iter__', 'append', 'extend', ...]

iterator = lst.__iter__()
print(dir(iterator))
# [..., '__iter__', '__next__', ...]

while True:
    try:
        print(iterator.__next__())
    except StopIteration:
        break
It's the same for all the container objects like tuples, dictionaries, sets, strings, files, and so on.
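As a quick check of that claim, the sketch below asks a few other built-in containers for an iterator and pulls the first element from each. The containers dictionary and its contents are made up for illustration.

```python
# Hypothetical sample containers, one of each kind.
containers = {
    "tuple": (1, 2),
    "dict": {1: "one", 2: "two"},
    "set": {7},
    "str": "ab",
}

first_elements = {}
for name, container in containers.items():
    iterator = iter(container)             # calls container.__iter__()
    first_elements[name] = next(iterator)  # calls iterator.__next__()

print(first_elements)
```

Note that the dictionary's iterator yields keys and the string's iterator yields single characters.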
Creating a custom iterator:
We can make any Python object iterable by implementing an __iter__ method. Let's create a hypothetical Bag class that has items in it. If we want to iterate through the items just by giving the bag object to the for loop, we have to implement an __iter__ method that returns an iterator.
Then the for loop will basically iterate through the iterator returned by the __iter__ method.
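class Bag:
    def __init__(self, items=None):
        # Avoid a mutable default argument: a shared [] would leak between instances.
        self.items = items if items is not None else []

    def __iter__(self):
        return iter(self.items)


bag = Bag(items=["pen", "money"])
for item in bag:
    print(item)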
Below is another example of creating a custom iterable. Code is a class that holds raw code statements, and we can loop through the statements of the code object directly in a for loop because the class implements __iter__.
class Code:
    def __init__(self, code_block, lang, raw, line_separator='\n'):
        self.statements = [stmt.strip() for stmt in code_block.split(line_separator)
                           if stmt.strip()]
        self.lang = lang
        self.raw = raw

    def __iter__(self):
        return iter(self.statements)


code = Code("""
a = 5
b = 10
print(f"Sum of {a} & {b} is : {a+b}")
""", "Python", True)

for statement in code:
    if code.lang == "Python":
        # "single" mode compiles one statement at a time
        exec(compile(statement, "<string>", "single"))
    else:
        raise NotImplementedError
In the above examples, we used the built-in function iter to get an iterator from the underlying list, so the list's own iterator did all the work. iter works on any object that implements __iter__ (or __getitem__), not only on built-in types. When an object has to produce the elements itself, we implement a __next__ method that supplies each element for iteration (both in a for loop and when calling next).
Below is an even number generator example that implements both __iter__ and __next__.
class EvenNoGenerator:
    def __init__(self, limit=0):
        self.limit = limit  # 0 means no upper bound
        self.cur = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.cur = self.cur + 2
        if self.limit > 0 and self.cur > self.limit:
            raise StopIteration
        return self.cur


for i in EvenNoGenerator(15):
    print(i)
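One consequence of returning self from __iter__ is worth seeing in action: the object carries its own position, so it can be consumed only once. The sketch below uses a hypothetical CountUpTo class (not part of the examples above) built the same way as the even number generator.

```python
class CountUpTo:
    """Hypothetical iterator that yields 1, 2, ..., limit exactly once."""

    def __init__(self, limit):
        self.limit = limit
        self.cur = 0

    def __iter__(self):
        # An iterator returns itself, so every loop shares the same state.
        return self

    def __next__(self):
        if self.cur >= self.limit:
            raise StopIteration
        self.cur += 1
        return self.cur


counter = CountUpTo(3)
same_object = iter(counter) is counter  # iter() works on user-defined classes too
first_pass = list(counter)
second_pass = list(counter)  # already exhausted, so nothing is left

print(same_object)
print(first_pass)
print(second_pass)
```

If we need to loop over the same object several times, a class whose __iter__ returns a fresh iterator each time, as in the earlier Bag example, is usually the better fit.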
Oops, there is __getitem__ as well
To be honest, I wasn't entirely honest about how the built-in containers are iterable, in order to keep things simple. It's also possible for container objects to support iteration through another special method, __getitem__.
According to PEP 234,
The two (__iter__ & next) methods correspond to two distinct protocols:
1. An object can be iterated over with for if it implements __iter__() or __getitem__().
2. An object can function as an iterator if it implements next().
Container-like objects usually support protocol 1. Iterators are currently required to support both protocols. The semantics of iteration come only from protocol 2; protocol 1 is present to make iterators behave like sequences; in particular, so that code receiving an iterator can use a for-loop over the iterator.
(The PEP predates Python 3, where the next method was renamed to __next__.)
We can also use __getitem__ in our custom objects instead of __iter__ and __next__ if our use case is simply to iterate over a container or sequence object.
The Bag class example that we saw earlier can be rewritten as:
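class Bag:
    def __init__(self, items=None):
        # Avoid a mutable default argument: a shared [] would leak between instances.
        self.items = items if items is not None else []

    def __getitem__(self, index):
        return self.items[index]


bag = Bag(items=["pen", "money"])
for item in bag:
    print(item)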
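To see what the for loop does when only __getitem__ is available, here is a small sketch with a hypothetical RecordingBag class that logs every index Python asks for. The class name and the requested attribute are made up for this illustration.

```python
class RecordingBag:
    """Hypothetical Bag variant that records the indexes Python asks for."""

    def __init__(self, items=None):
        self.items = list(items) if items is not None else []
        self.requested = []

    def __getitem__(self, index):
        self.requested.append(index)
        return self.items[index]  # raises IndexError past the end


bag = RecordingBag(["pen", "money"])
seen = [item for item in bag]

print(seen)
print(bag.requested)
```

Python starts at index 0 and keeps incrementing until __getitem__ raises IndexError, which it treats as the end of the iteration. That is why the recorded indexes include one more entry than there are items.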
StopIteration & other Exceptions
Python relies on the StopIteration exception to determine the end of an iteration. So, if you are writing a custom iterator, you have to raise a StopIteration exception in your __next__ method once there are no more elements to return.
class Bag:
    def __init__(self, items=None):
        self.items = items if items is not None else []
        self.cur = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.cur = self.cur + 1
        if self.cur <= len(self.items):
            return self.items[self.cur - 1]
        else:
            raise StopIteration


bag = Bag(items=["pen", "money"])
for item in bag:
    print(item)
In the above example, after we have iterated through every element, we raise a StopIteration exception to let the for loop know that there are no more elements to iterate over.
We have to be careful about where we raise the StopIteration
. If we raise it prematurely, then our iteration ends there itself and introduces bugs in out code. Let me modify the above code to raise StopIteration
in if
instead of else
,
class Bag:
    def __init__(self, items=None):
        self.items = items if items is not None else []
        self.cur = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.cur = self.cur + 1
        if self.cur <= len(self.items):
            print("Not interested in iteration!")
            raise StopIteration  # raised on the very first call
        else:
            return self.items[self.cur - 1]


bag = Bag(items=["pen", "money"])
for item in bag:
    print(item)
Then the output will be just,
Not interested in iteration!
Also, StopIteration has to be handled by us when we iterate manually instead of using a for loop. Let's say we have a FileReader class that has both __iter__ and __next__.
class FileReader:
    def __init__(self, file, mode):
        self.file = open(file, mode)
        self.mode = mode

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the file object, which is itself an iterator over lines.
        return next(self.file)

    def close(self):
        self.file.close()
We can create an object from the class and loop through each line of the file using for.
for line in FileReader('testfile', 'r'):
    print(line)
But if we have a use case where we have to read lines at different places in the code instead of in a single for loop, we can use next to get the lines.
reader = FileReader('testfile', 'r')
line = next(reader)
print(line)
# do something with the line

line = next(reader)
print(line)
In such scenarios, once the file reader runs out of lines in the file, it will throw a StopIteration exception. We don't have the for loop or any other implicit mechanism to catch the StopIteration, so it will break the application.
Traceback (most recent call last):
File "/module.py", line 21, in <module>
line = next(reader)
^^^^^^^^^^^^
File "/module.py", line 10, in __next__
return next(self.file)
^^^^^^^^^^^^^^^
StopIteration
So, we have to make sure to handle StopIteration explicitly when calling next on iterators. Most of the time we silently swallow the StopIteration exception, and while doing so we need to ensure that exceptions other than StopIteration are propagated properly or as intended.
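Here is a minimal sketch of two ways to do that. To keep it self-contained, io.StringIO stands in for an open text file (an assumption made for this example; a real file object or the FileReader above would be used the same way).

```python
import io

# io.StringIO stands in for an open text file so the sketch is self-contained.
reader = io.StringIO("first line\nsecond line\n")

lines = []

# Option 1: catch StopIteration explicitly, and nothing else,
# so any other exception still propagates.
try:
    lines.append(next(reader))
    lines.append(next(reader))
    lines.append(next(reader))  # no third line: raises StopIteration
except StopIteration:
    exhausted = True
else:
    exhausted = False

# Option 2: give next() a default value so it never raises StopIteration.
leftover = next(reader, None)

print(lines)
print(exhausted, leftover)
```

Catching only StopIteration, rather than a bare except, keeps I/O errors and other failures visible. The default form of next is handy when a sentinel value such as None cannot be confused with a real element.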