Iterators in Python

M Kim
4 min read · Aug 7, 2019

Iterators are Python objects that produce data.

Like an assembly line, iterators produce data on-demand, in well-defined and highly customizable ways.

Iterators are factory lines for data

Elements from the iterator are dealt one at a time via the built-in next() function. Here’s an example of an iterator that will yield the numbers 1, 2, and 3, in that order.
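A minimal sketch of such an iterator, assuming it is built from a plain list with the built-in iter() (the variable names are illustrative):

```python
# Build an iterator from a list (the list itself is an assumption for illustration)
numbers = iter([1, 2, 3])

first = next(numbers)   # 1
second = next(numbers)  # 2
third = next(numbers)   # 3
print(first, second, third)
```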

Iterators remember their position within an iteration. In other words, they know where they are within their stream of data.
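One way to see this position-keeping in action, sketched with a hypothetical string of sample data: consume a couple of elements by hand, then let a loop take over the same iterator.

```python
letters = iter("abcde")  # hypothetical sample data

# Consume the first two elements by hand
head = [next(letters), next(letters)]

# Looping over the same iterator picks up where we left off
tail = [ch for ch in letters]

print(head)  # ['a', 'b']
print(tail)  # ['c', 'd', 'e']
```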

If an iterator is finite, like the examples above, it can be used until it raises a StopIteration exception. Once an iterator reaches its endpoint, it cannot be reused. To go through the data again, create a fresh iterator by calling iter() on the original iterable.
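A short sketch of what exhaustion looks like in practice, assuming a small list as the underlying data:

```python
data = [1, 2, 3]           # sample iterable, chosen for illustration
it = iter(data)

consumed = list(it)        # drains the iterator: [1, 2, 3]

try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True       # the iterator has nothing left to give

leftover = list(it)        # still empty: an exhausted iterator stays exhausted
fresh = list(iter(data))   # a new iterator over the same list starts over
print(exhausted, leftover, fresh)
```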

However, iterators may not have a logical endpoint. They can be infinite!

The itertools library has several functions for infinite iterators: cycle(), repeat(), count().
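As a rough illustration of these three functions, the sketch below uses itertools.islice() to take a finite slice from each infinite iterator so that the example terminates (the sample arguments are arbitrary):

```python
from itertools import count, cycle, islice, repeat

# islice() takes a finite slice so the examples terminate
evens = list(islice(count(0, 2), 5))     # count(start, step): 0, 2, 4, ...
colors = list(islice(cycle("RGB"), 7))   # cycle repeats the iterable forever
zeros = list(islice(repeat(0), 3))       # repeat yields the same object forever

print(evens)   # [0, 2, 4, 6, 8]
print(colors)  # ['R', 'G', 'B', 'R', 'G', 'B', 'R']
print(zeros)   # [0, 0, 0]
```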

This brings up an important feature of iterators. They are lazy. Iterators only produce data on demand, as needed.

Generators provide data on the fly

Therefore, they do not take up a lot of memory.

This is especially clear in the case of the infinite iterators. Imagine trying to populate a list with a billion or trillion elements.

Python is very unhappy when we try to store that many elements in a list, but with an iterator we can easily describe the next trillion values we’d like to use.
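To get a feel for the difference, here is a small sketch comparing a one-million-element list with a lazy counter. The sizes are only indicative: sys.getsizeof() reports the size of the object itself, not of everything it references.

```python
import sys
from itertools import count

# A list stores a slot for every element up front, so it grows with its length
million = list(range(1_000_000))
list_bytes = sys.getsizeof(million)

# A lazy counter only stores its current state, however far it may run
lazy = count(1)
lazy_bytes = sys.getsizeof(lazy)

print(list_bytes, lazy_bytes)

# Values are produced only when requested
first_three = [next(lazy) for _ in range(3)]  # [1, 2, 3]
```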

You can imagine how this could be useful.

Imagine we were hackers trying to crack a password by entering every possible combination of numbers and letters.

If we had to generate every possible password ahead of time and read them back from a list, we’d be even worse hackers than this hypothetical scenario suggests.

An iterator would speed up this brute-force method by generating each password on the fly, and get us to jail much sooner.
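A toy sketch of the idea, assuming a character set of lowercase letters and digits and a hypothetical helper named candidates(). It leans on itertools.product(), which produces combinations lazily:

```python
from itertools import islice, product
import string

# Assumed character set for illustration: 26 lowercase letters + 10 digits
ALPHABET = string.ascii_lowercase + string.digits

def candidates(length):
    """Lazily yield every string of the given length over ALPHABET."""
    for combo in product(ALPHABET, repeat=length):
        yield "".join(combo)

# 36**8 candidates exist for length 8, but none are built until requested
guesses = candidates(8)
first_few = list(islice(guesses, 3))
print(first_few)  # ['aaaaaaaa', 'aaaaaaab', 'aaaaaaac']
```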

Iterable Objects

From our examples above, it’s clear that certain objects are good blueprints for our iterator.

These objects are described as iterable, meaning they can feed iterators. More specifically, an iterable is any object that can be passed to the built-in iter() function, which returns an iterator over it. The iterable itself does not need to support next(); the iterator that iter() returns does.

Looking back at our examples, we can see that lists and strings are iterable. Others include dictionaries, tuples, ranges, sets, and file objects. Some of these are sequences (i.e. they have order) and others are not.
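A quick sketch of this, using a handful of made-up sample objects: each one can be handed to iter(), and next() is then called on the iterator that comes back.

```python
# Each of these is iterable: iter() accepts it and returns an iterator
samples = [
    [1, 2, 3],           # list
    "abc",               # string
    (1, 2, 3),           # tuple
    {"a": 1, "b": 2},    # dict (iterates over its keys)
    {1, 2, 3},           # set
    range(3),            # range
]

all_iterable = all(hasattr(iter(obj), "__next__") for obj in samples)
print(all_iterable)  # True

# next() works on the iterator returned by iter(), not on the list itself
list_has_next = hasattr([1, 2, 3], "__next__")
first_key = next(iter({"a": 1, "b": 2}))
print(list_has_next, first_key)  # False a
```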

Note: Iterators are always iterables, because you can pass them through iter() and it will just return the iterator itself.
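This can be checked directly. A minimal sketch, with arbitrary sample values:

```python
it = iter([10, 20, 30])

same_object = iter(it) is it   # True: calling iter() on an iterator returns it unchanged

# Because of this, an iterator can be used directly in a for-loop
total = 0
for value in it:
    total += value

print(same_object, total)  # True 60
```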

Containers

Containers are data structures that actively hold data in memory. A container can be asked whether it contains a certain element. In other words, you can test membership.

# Lists
assert 30 in [10, 20, 30]
assert 50 not in [10, 20, 30]
# Sets
assert 30 in {10, 20, 30}
assert 50 not in {10, 20, 30}
# Tuples
assert 30 in (10, 20, 30)
assert 50 not in (10, 20, 30)

Generators

A generator is a function that behaves as an iterator: calling it returns a generator object, which hands back data through yield statements. Generators offer an elegant alternative to the object-oriented construction of iterator objects.

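def squares(start, stop):
    # Yield the square of each integer in [start, stop)
    for i in range(start, stop):
        yield i * i

a, b = 1, 4  # example bounds, chosen for illustration
generator = squares(a, b)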
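To see that the generator really does behave as an iterator, here is a small usage sketch. The function is repeated so the snippet runs on its own, and the bounds 1 and 4 are arbitrary:

```python
def squares(start, stop):
    for i in range(start, stop):
        yield i * i

gen = squares(1, 4)                  # nothing in the body has run yet

is_own_iterator = iter(gen) is gen   # generator objects are iterators
first = next(gen)                    # runs the body up to the first yield
rest = list(gen)                     # resumes after that yield until the end

print(is_own_iterator, first, rest)  # True 1 [4, 9]
```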

The object oriented version of this would be:

class Squares:
    def __init__(self, start, stop):
        self.start = start
        self.stop = stop

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        # Signal the end of the stream once we reach stop
        if self.start >= self.stop:
            raise StopIteration
        current = self.start * self.start
        self.start += 1
        return current

a, b = 1, 4  # example bounds, chosen for illustration
iterator = Squares(a, b)

Summary of what we’ve just covered:

https://nvie.com/posts/iterators-vs-generators/

Examples of Iterators in Python and Data Science

  • For-Loops

For-loops create iterators implicitly from iterable objects, like lists, and deal out one value to each loop iteration. Because every Python for-loop is driven by an iterator, it tends to read more simply than the classic counter-based for-loop of languages such as JavaScript, C, C++, Java, and PHP.
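Roughly speaking, a for-loop calls iter() once, then calls next() repeatedly until StopIteration is raised. The sketch below spells that out by hand with some sample data. It is an approximation of the behavior, not the interpreter's actual implementation.

```python
items = ["a", "b", "c"]   # sample data

# What we normally write
via_for = []
for item in items:
    via_for.append(item)

# Roughly equivalent explicit form
via_while = []
iterator = iter(items)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break              # the for-loop handles this exception for us
    via_while.append(item)

print(via_for == via_while)  # True
```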

  • Loading Large Files

When reading a large file, the file object returned by open() is itself an iterator that hands out the contents one line at a time, instead of dumping it all into your local memory.
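A self-contained sketch of line-by-line reading. It writes a tiny throwaway file first, standing in for a large one; the file contents and variable names are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

line_count = 0
longest = ""
with open(path) as handle:
    file_is_iterator = iter(handle) is handle   # the file object is its own iterator
    for line in handle:                         # only one line is held at a time
        line_count += 1
        longest = max(longest, line.rstrip("\n"), key=len)

os.remove(path)
print(line_count, longest)  # 3 second
```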

There are many examples of iteration in machine learning. For example, in gradient methods, a model is updated sequentially, and each step uses the result of the previous one to guide the next. Each step may involve calculating a loss function or a model score, but the essential point is that the process is iterative and the number of steps is not fixed ahead of time.
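One way to express such a process is as a generator that keeps yielding estimates until the updates become negligible. The sketch below is a toy, not a real training loop: the function name gradient_descent, the learning rate, the tolerance, and the quadratic objective are all assumptions made for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, tolerance=1e-6):
    """Yield successive estimates until the update becomes tiny."""
    x = x0
    while True:
        step = learning_rate * grad(x)
        x = x - step               # each estimate depends on the previous one
        yield x
        if abs(step) < tolerance:
            return                 # ends the generator, so iteration stops

# Minimize f(x) = (x - 3)**2, whose gradient is 2 * (x - 3)
estimates = list(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))
print(len(estimates), estimates[-1])
```

The caller does not need to know in advance how many steps will be taken; it simply consumes estimates until the generator stops.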
