Track your loop using tqdm: 7 ways progress bars in Python make things easier

Harshit Gupta
6 min read · Sep 25, 2021

Track your Python loops with a real-time progress bar. The 4th one is interesting.

tqdm is a Python library for creating smart progress meters, or progress bars. The word tqdm has two interesting origins: in Arabic, taqaddum means ‘progress’, and in Spanish, te quiero demasiado means ‘I love you too much’. This article covers most of the common use cases of tqdm.

Photo by Timothy Mugayi on Medium

tqdm does not require any dependencies and works across multiple Python environments. It integrates effortlessly with loops, iterables, Pandas, and even machine learning libraries: just wrap any iterable with tqdm(iterable), and you're done!

Installing and Importing

tqdm can be installed with the pip command pip install -U tqdm or, if you are using Anaconda, with conda install -c conda-forge tqdm. It can then be imported using from tqdm import tqdm.

Progress bars make things easier in Python because:

  1. They look visually engaging.
  2. They provide information like Estimated Time, iterations per second and execution time.
  3. They do not print unnecessary iteration messages and make the output look clean.

Here are the 7 ways you can use tqdm in your current Python code:

1. Colorful progress bar to track a loop in Python

Suppose that you have a for loop that iterates 20 times and performs an operation each time (here I have taken time.sleep(0.1) as a dummy operation). Without printing, there is no way to know the real-time progress of the loop, the estimated time remaining, etc. With tqdm we can get all of that from a single-line progress bar. Here is the example:
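A minimal sketch of the loop described above, assuming tqdm is installed. The completed counter is only there for illustration and is not something tqdm needs.

```python
import time
from tqdm import tqdm

completed = 0  # illustrative counter, not required by tqdm

# Wrapping range(20) in tqdm() is the only change to the loop
for i in tqdm(range(20)):
    time.sleep(0.1)  # dummy operation standing in for real work
    completed += 1
```

The bar is drawn on a single line and refreshed in place as the loop advances.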

In a Jupyter notebook, a more colorful version of tqdm can be imported from tqdm.auto (from tqdm.auto import tqdm), which gives the progress bar a more elegant look.

We can wrap tqdm around any iterable, e.g.

for element in tqdm(list):
for key in tqdm(dictionary):
for x in tqdm(np.array):
for char in tqdm(text):
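The patterns above could look like this in a runnable form. This is only a sketch: the prices dictionary and the text string are hypothetical sample data, and the last line adds one case not listed above, a generator, which has no length and so needs total= if the bar is to show a percentage.

```python
from tqdm import tqdm

prices = {"apple": 3, "pear": 5, "plum": 2}  # hypothetical sample data
text = "tqdm"

# Iterating a dict yields its keys
keys_seen = [key for key in tqdm(prices)]

# A string is iterated character by character
chars_seen = [char for char in tqdm(text)]

# Any list works the same way
total = sum(value for value in tqdm(list(prices.values())))

# A generator has no len(), so total= is passed explicitly
squares = [n * n for n in tqdm((n for n in range(5)), total=5)]
```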

2. Nested progress bars for Nested loops in Python:

Suppose that instead of one loop you have two loops nested inside each other and you need a double progress bar. tqdm allows you to have a separate progress bar for each loop you want. We will use the nested progress bars feature of tqdm, and here is an example:
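A minimal sketch of two nested bars, assuming a 3 by 10 loop chosen purely for illustration:

```python
import time
from tqdm import tqdm

steps = 0  # illustrative counter
for i in tqdm(range(3)):        # outer bar
    for j in tqdm(range(10)):   # inner bar, created anew for each outer iteration
        time.sleep(0.01)        # dummy operation
        steps += 1
```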

We can add a description to each loop so that we know which loop is currently running. This can be done using the desc parameter:
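As a sketch, the desc parameter puts a label in front of each bar. The labels and loop sizes here are arbitrary choices for illustration.

```python
import time
from tqdm import tqdm

visited = []  # illustrative record of (outer, inner) positions
for epoch in tqdm(range(2), desc="outer loop"):
    for batch in tqdm(range(5), desc="inner loop"):
        time.sleep(0.01)  # dummy operation
        visited.append((epoch, batch))
```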

If we do not want the finished inner bars to pile up on screen, we can discard each inner bar once it completes. This is done with the leave=False parameter:
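A short sketch of the same nested loop with leave=False on the inner bar, again with arbitrary loop sizes:

```python
import time
from tqdm import tqdm

inner_runs = 0  # illustrative counter of completed inner loops
for i in tqdm(range(3), desc="outer loop"):
    # leave=False clears the inner bar once it finishes
    for j in tqdm(range(5), desc="inner loop", leave=False):
        time.sleep(0.01)  # dummy operation
    inner_runs += 1
```

Only the outer bar should remain on screen when the loops are done.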

3. Working with Pandas (progress_apply)

Have you ever needed to track progress while calling a function on each row of a dataframe, with no idea how long it will take to finish? tqdm's integration with pandas provides a solution for that. Instead of the .apply() method we can use .progress_apply(). It becomes available after importing tqdm and calling tqdm.pandas(). The result is a colorful and informative progress bar, showing iterations per second and remaining time, that updates in real time.
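A minimal sketch of progress_apply, assuming pandas is installed. The dataframe and the row_sum function are made up for illustration and stand in for a genuinely slow per-row computation.

```python
import pandas as pd
from tqdm import tqdm

# Registers progress_apply on pandas objects; desc is optional
tqdm.pandas(desc="rows")

# Hypothetical data: two integer columns of 100 rows each
df = pd.DataFrame({"a": range(100), "b": range(100, 200)})

def row_sum(row):
    # Stand-in for a slow per-row function
    return row["a"] + row["b"]

# Same call shape as df.apply(row_sum, axis=1), but with a progress bar
df["total"] = df.progress_apply(row_sum, axis=1)
```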

4. Working with a while loop and unknown increments

Instead of using tqdm as a wrapper, we can create it outside the loop and update it inside the loop on each iteration. This makes tqdm more flexible for loops with unknown length or unknown increments.

How do you use tqdm if your loop increments by a different value every time? By creating pbar = tqdm(…) outside the loop and calling pbar.update(value) inside the loop.

Here is an example:
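A sketch of the manual-update pattern with a while loop. The target of 500 and the random step sizes are assumptions chosen to mimic work that arrives in uneven amounts.

```python
import random
from tqdm import tqdm

random.seed(0)  # fixed seed so the run is repeatable
target = 500    # hypothetical total amount of work
done = 0

pbar = tqdm(total=target)  # created outside the loop
while done < target:
    # Each iteration completes a different amount of work
    step = min(random.randint(1, 50), target - done)
    done += step
    pbar.update(step)  # advance the bar by this iteration's increment
pbar.close()
```

If the total is not known in advance, total can be left out; the bar then shows only the running count and the rate.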

In a Jupyter notebook, you can delay the first print of pbar by creating it with display=False and calling display(pbar.container) later. More parameters can be found using:

from tqdm import tqdm
help(tqdm)

5. Downloading and uploading files in Python with progress bars

If you have ever used the Python requests library to upload or download files, you know there is an inherent waiting time. In this section, I will talk about how to use tqdm with requests to show a progress bar while downloading.

We use stream = True in requests.get to get the file in chunks. The response headers contain content-length which has the information about the total size of the file. Then we use resp.iter_content to iterate over chunks of data with a fixed chunk_size.

To set up tqdm we use unit, unit_scale and unit_divisor to convert iterations into B, KB, or MB so that the bar looks more meaningful. Inside the loop, we write each chunk to disk and update the tqdm pbar with the length of the chunk. Here is all the code in action:

import requests
from tqdm.auto import tqdm

url = 'https://wordnetcode.princeton.edu/2.1/WNsnsmap-2.1.tar.gz'
filename = url.split('/')[-1]
# stream=True fetches the body lazily, in chunks
resp = requests.get(url, stream=True)
# total comes from the content-length header (0 if the header is missing)
pbar = tqdm(desc=filename, total=int(resp.headers.get('content-length', 0)),
            unit='B', unit_scale=True, unit_divisor=1024)
with open(filename, 'wb') as f:
    for data in resp.iter_content(chunk_size=1024):
        f.write(data)
        pbar.update(len(data))  # advance by the number of bytes written
pbar.close()

6. tqdm with Multiprocessing and Threads (Linux and Windows)

Windows:

On Windows we use ThreadPoolExecutor. In Python 3 or above, tqdm.write is thread-safe, so we can see multiple progress bars at once. I have used a dummy function, worker, which runs a loop of 100 iterations and calls time.sleep for a duration based on thread_number. Now you can run your long-running jobs and track their progress.

import time
from tqdm.auto import tqdm
from concurrent.futures import ThreadPoolExecutor

def worker(thread_number):
    # each thread draws its own bar, labelled with its number
    for i in tqdm(range(100), desc=f'thread {thread_number}'):
        time.sleep(0.05 * thread_number)

if __name__ == '__main__':
    thread_list = list(range(1, 4))
    with ThreadPoolExecutor() as p:
        p.map(worker, thread_list)
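If one shared bar over all tasks is enough, rather than one bar per thread, tqdm also ships a thread_map helper in tqdm.contrib.concurrent. This is not part of the example above; the sketch below uses a hypothetical square function as the task and assumes a reasonably recent tqdm version.

```python
import time
from tqdm.contrib.concurrent import thread_map

def square(n):
    time.sleep(0.01)  # stand-in for slow work
    return n * n

# Runs square over the inputs in a thread pool with a single progress bar;
# results are returned as a list in input order
results = thread_map(square, range(20), max_workers=4)
```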

Linux:

On Linux, multiprocessing is easier to use than on Windows, and there are many ways to integrate tqdm with it. The approach shown for Windows also works on Linux. Here is a simple two-liner solution, reusing worker and thread_list from above:

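from multiprocessing import Pool
from tqdm.auto import tqdm
with Pool(workers) as pool:  # workers: number of processes, e.g. 3
    # total is passed to tqdm because pool.imap returns an iterator without a length
    results = list(tqdm(pool.imap(worker, thread_list), total=len(thread_list)))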

7. tqdm with Machine Learning libraries (Keras and Tensorflow)

ML or DL models take a long time to train, and it is useful to know the estimated time, the time per epoch, or the time taken for full data preprocessing. Although most libraries provide their own logging, here is a simple way to integrate tqdm with popular libraries to make things easier.

Keras:

We can import TqdmCallback from tqdm.keras and pass it to Keras model.fit through the callbacks argument:

from tqdm.keras import TqdmCallback
pbar = TqdmCallback(display=False)
pbar.display()
model.fit(…, callbacks=[pbar])

Tensorflow:

The TensorFlow Addons package (tensorflow_addons) also provides a tqdm-based progress bar callback, TQDMProgressBar. Below is the template:

import tensorflow_addons as tfa
pbar = tfa.callbacks.TQDMProgressBar()
model.fit(…,callbacks=[pbar])
# TQDMProgressBar() also works with evaluate()
model.evaluate(…,callbacks=[pbar])

Questions, queries, and suggestions are welcome in the comments. You can also contact me via LinkedIn: https://www.linkedin.com/in/harshit4084/
