Track your loop using tqdm: 7 ways progress bars in Python make things easier
Track your Python loops with a real-time progress bar. The fourth one is interesting.
tqdm is a Python library for creating smart progress meters, or progress bars. The name tqdm has two interesting origins: in Arabic, taqaddum means ‘progress’, and in Spanish, te quiero demasiado means ‘I love you too much’. This article covers most of the common use cases of tqdm.
tqdm does not require any dependencies and works across multiple Python environments. It integrates effortlessly with loops, iterables, Pandas, and even machine learning libraries: just wrap any iterable with tqdm(iterable), and you're done!
Installing and Importing
tqdm can be installed with pip using pip install -U tqdm or, if you are using Anaconda, with conda install -c conda-forge tqdm. It can then be imported using from tqdm import tqdm.
Progress bars make things easier in Python because:
- They look visually engaging.
- They provide information such as estimated time remaining, iterations per second, and elapsed time.
- They do not print unnecessary iteration messages and make the output look clean.
Here are the 7 ways you can use tqdm in your current Python code:
1. Colorful progress bar to track a loop in Python
Suppose that you have a for loop which iterates 20 times and performs an operation each time (here I have used time.sleep(0.1) as a dummy operation). Without printing, there is no way to know the real-time progress of the loop, the estimated time remaining, and so on. With tqdm we can get all of that by changing a single line. Here is the example:
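A minimal sketch of this loop, assuming the 20 iterations and the time.sleep(0.1) dummy operation described above, might look like this:

```python
import time
from tqdm import tqdm

# Wrapping range(20) in tqdm() is the only change needed to get a progress bar.
for i in tqdm(range(20)):
    time.sleep(0.1)  # dummy operation standing in for real work
```

The bar updates in place on a single line and shows the percentage, the iteration count, the elapsed and remaining time, and the iterations per second.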
For Jupyter notebooks, a more colorful version of tqdm can be imported from tqdm.auto, which gives a more elegant look to the progress bar.
We can wrap tqdm around any iterable, e.g.
for element in tqdm(list):
for key in tqdm(dictionary):
for x in tqdm(np.array):
for char in tqdm(text):
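To make the list above concrete, here is a small runnable sketch. The sample data (the prices dictionary and the string "progress") is made up for illustration, and tqdm.auto is used so that the same code picks the notebook-style bar in Jupyter and the plain console bar elsewhere:

```python
from tqdm.auto import tqdm

prices = {"apple": 3, "pear": 5, "plum": 2}  # hypothetical sample data

total = 0
for key in tqdm(prices):  # iterating over a dict yields its keys
    total += prices[key]

# A string is an iterable of characters, so it can be wrapped as well.
letters = [char for char in tqdm("progress")]
```

Because these objects have a known length, tqdm can work out the total on its own and show a percentage.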
2. Nested progress bars for nested loops in Python
Suppose instead of one loop you have two loops, one inside the other, and you need a double progress bar. tqdm allows you to have a separate progress bar for each loop you want. We will use the nested progress bars feature of tqdm, and here is an example:
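A minimal sketch of two nested loops, each wrapped in its own tqdm call, might look like this. The loop sizes and the time.sleep(0.01) dummy operation are assumptions chosen for illustration:

```python
import time
from tqdm import tqdm

steps = 0
for i in tqdm(range(3)):       # outer progress bar
    for j in tqdm(range(10)):  # a fresh inner bar for every outer iteration
        time.sleep(0.01)       # dummy operation
        steps += 1
```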
We can add a description to each loop so that we know which loop is currently running. This can be done using the desc parameter:
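As a rough illustration, nested loops can be labeled with desc like this. The label strings and loop sizes are arbitrary choices for this sketch:

```python
import time
from tqdm import tqdm

steps = 0
for i in tqdm(range(3), desc="outer loop"):
    for j in tqdm(range(10), desc="inner loop"):
        time.sleep(0.01)  # dummy operation
        steps += 1
```

Each bar is prefixed with its label, so it is easy to tell which loop a given bar belongs to.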
If we do not want the finished inner bars to pile up on screen, we can discard the inner progress bar each time it completes. This can be done using the leave=False parameter:
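A small sketch of this, assuming the same kind of dummy nested loop, passes leave=False to the inner bar only, so that it is cleared once it finishes while the outer bar stays on screen:

```python
import time
from tqdm import tqdm

steps = 0
for i in tqdm(range(3), desc="outer loop"):
    # leave=False removes this bar from the output when the inner loop ends
    for j in tqdm(range(10), desc="inner loop", leave=False):
        time.sleep(0.01)  # dummy operation
        steps += 1
```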
3. Working with Pandas (progress_apply)
Have you ever needed to track progress while calling a function on each row of a dataframe, with no idea how long it will take to finish? tqdm's integration with pandas provides a solution for that. Instead of using the .apply() method, we can use .progress_apply(). This can be used after importing tqdm and calling tqdm.pandas(). The result is a colorful and informative progress bar that shows iterations per second and remaining time, updated in real time.
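Here is a minimal sketch of the pandas integration. The dataframe, its column names, and the row function are made up for illustration; only tqdm.pandas() and progress_apply come from tqdm itself:

```python
import pandas as pd
from tqdm import tqdm

# Registers progress_apply on pandas objects; desc is an optional label.
tqdm.pandas(desc="summing rows")

# Hypothetical sample data
df = pd.DataFrame({"a": range(100), "b": range(100, 200)})

# Same call signature as df.apply, but with a progress bar.
df["total"] = df.progress_apply(lambda row: row["a"] + row["b"], axis=1)
```

With axis=1 the function is called once per row, so the bar counts rows.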
4. Working with a while loop and unknown increments
Instead of using tqdm as a wrapper, we can create it outside the loop and update it inside the loop on each iteration. This makes tqdm more flexible for loops with unknown length or unknown increments.
How do I use tqdm if my loop increments by a different value every time? By creating pbar = tqdm(…) outside the loop and calling pbar.update(value) inside the loop.
Here is an example:
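A minimal sketch, assuming a hypothetical job that handles a random number of items (between 1 and 50) on each pass until 1000 items are done:

```python
import random
import time
from tqdm import tqdm

random.seed(0)   # fixed seed so the run is repeatable
target = 1000    # hypothetical amount of work
processed = 0

pbar = tqdm(total=target)  # created outside the loop
while processed < target:
    # The increment differs on every pass; cap it so we never overshoot.
    step = min(random.randint(1, 50), target - processed)
    time.sleep(0.01)       # dummy operation
    processed += step
    pbar.update(step)      # advance the bar by this pass's increment
pbar.close()
```

Calling pbar.close() at the end makes sure the final state of the bar is printed.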
In a Jupyter notebook, you can create the bar with display=False and later call display(pbar.container) to delay the first print of pbar. More parameters can be found using:
from tqdm import tqdm
help(tqdm)
5. Downloading and uploading files in Python with progress bars
If you have ever worked with Python's requests library for uploading and downloading files, you know there is an inherent waiting time. In this section, I will show how to use tqdm with requests to get download progress bars.
We use stream=True in requests.get to get the file in chunks. The response headers contain content-length, which holds the total size of the file. Then we use resp.iter_content to iterate over chunks of data with a fixed chunk_size.
To set up tqdm, we use unit, unit_scale, and unit_divisor to convert iterations into B, KB, or MB so that the bar looks more meaningful. Inside the loop, we write each chunk to disk and update the tqdm pbar with the length of the chunk. Here is all the code in action:
import requests
from tqdm.auto import tqdm

url = 'https://wordnetcode.princeton.edu/2.1/WNsnsmap-2.1.tar.gz'
filename = url.split('/')[-1]
# stream=True fetches the body in chunks instead of loading it all into memory
resp = requests.get(url, stream=True)
# total comes from the content-length header (0 if the server does not send it)
pbar = tqdm(desc=filename, total=int(resp.headers.get('content-length', 0)),
            unit='B', unit_scale=True, unit_divisor=1024)
with open(filename, 'wb') as f:
    for data in resp.iter_content(chunk_size=1024):
        f.write(data)
        pbar.update(len(data))  # advance by the number of bytes written
pbar.close()
6. tqdm with Multiprocessing and Threads (Linux and Windows)
Windows:
On Windows we use ThreadPoolExecutor. In Python 3, tqdm.write is thread-safe, so we can see multiple progress bars at once. I have used a dummy function, worker, which runs a loop of 100 iterations and calls time.sleep with a delay based on thread_number. Now you can run your long-running jobs and track their progress.
import time
from tqdm.auto import tqdm
from concurrent.futures import ThreadPoolExecutor

def worker(thread_number):
    # each thread draws its own bar; higher thread numbers sleep longer
    for i in tqdm(range(100), desc=f'thread {thread_number}'):
        time.sleep(0.05 * thread_number)

if __name__ == '__main__':
    thread_list = list(range(1, 4))
    with ThreadPoolExecutor() as p:
        p.map(worker, thread_list)
Linux:
On Linux, multiprocessing is easier than on Windows, and there are many ways to integrate tqdm with multiprocessing. The approach shown for Windows will also work on Linux. Here is a simple two-liner solution:
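from multiprocessing import Pool
from tqdm.auto import tqdm
# workers is the number of processes; worker and thread_list come from the example above
with Pool(workers) as pool:
    results = list(tqdm(pool.imap(worker, thread_list), total=len(thread_list)))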
7. tqdm with Machine Learning libraries (Keras and Tensorflow)
ML or DL models take a long time to train, and it is useful to know the estimated time, the time per epoch, or the time taken by full data preprocessing. Although most libraries provide their own way of logging, here is a simple way to integrate tqdm with popular libraries to make things easier.
Keras:
We can import TqdmCallback from tqdm.keras, which is supported by Keras model.fit:
from tqdm.keras import TqdmCallback
pbar = TqdmCallback(display=False)
pbar.display()
model.fit(…, callbacks=[pbar])
Tensorflow:
More recently, the TensorFlow ecosystem also integrated tqdm progress bars through TensorFlow Addons. We can import TQDMProgressBar from tensorflow_addons. Below is the template:
import tensorflow_addons as tfa
pbar = tfa.callbacks.TQDMProgressBar()
model.fit(…, callbacks=[pbar])
# TQDMProgressBar() also works with evaluate()
model.evaluate(…, callbacks=[pbar])
Questions, queries, and suggestions are welcome in the comments. You can also contact me via LinkedIn: https://www.linkedin.com/in/harshit4084/