A quick introduction: parallelism is one of the less commonly used concepts among Python programmers, and this series is an attempt to build intuition for how much it can help your Python programs run more smoothly and quickly.
Parallelism is a broad concept that covers techniques such as multithreading and multiprocessing. This series focuses on the advantages, implementation, and disadvantages of threading in Python.
Part 1 covers the very basics of threading. It will help the reader understand why threading is important, and how even a simple use of threading can drastically reduce code execution time.
Non-technical definition of threads :-
Threads are a way for a computer program to split itself into two or more simultaneously running tasks.
Simply put, threads allow a computer program to execute its code in parallel by dividing the program into separate blocks of executable code, which the processor then runs in a way that is analogous to having multiple programs running at once.
For example :-
Consider a program that executes a block of code which takes 1 second to complete. Instead of executing it once, we execute it 30 times, one after another. That gives us a total execution time of about 30 seconds.
Run this way, the program takes about 30 seconds, as expected. Now, if we tweak our code with a little bit of threading, we can reduce the execution time by a lot.
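The sequential version described above can be sketched as follows. This is a minimal sketch, not the original listing: the function name `run_sequential` and its `repeats` and `delay` parameters are names chosen here for illustration, and `time.sleep` stands in for any block of code that takes about 1 second.

```python
import time


def run_sequential(repeats=30, delay=1.0):
    """Call time.sleep(delay) `repeats` times, one after another.

    Returns the elapsed wall-clock time in seconds.
    (Hypothetical helper; the name and parameters are assumptions.)
    """
    start = time.perf_counter()
    for _ in range(repeats):
        # Stand-in for a block of code that takes `delay` seconds.
        time.sleep(delay)
    return time.perf_counter() - start
```

Calling `run_sequential()` with the defaults should return a value close to 30, since each wait only starts after the previous one has finished.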
Code execution with 2 threads :-
Now if we add only two threads, we can get a 50% reduction in execution time.
Setting aside the details of how threads are implemented for now, the idea is to use two threads, t1 and t2, which together perform the same wait operation 30 times, with each thread performing it 15 times.
Now if we compare the two runs, the execution time drops from about 30 seconds to about 15 seconds.
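A two-thread version along these lines might look like the sketch below. It uses the standard `threading.Thread` class. The helper names `wait_repeatedly` and `run_two_threads` are assumptions made for this example, while `t1` and `t2` follow the naming in the text. The speed-up shows up here because `time.sleep` spends its time waiting rather than computing.

```python
import threading
import time


def wait_repeatedly(repeats, delay=1.0):
    """Perform the wait operation `repeats` times in the calling thread."""
    for _ in range(repeats):
        time.sleep(delay)


def run_two_threads(total_repeats=30, delay=1.0):
    """Split `total_repeats` waits between two threads and time the run.

    (Hypothetical helper; the name and parameters are assumptions.)
    """
    half = total_repeats // 2
    t1 = threading.Thread(target=wait_repeatedly, args=(half, delay))
    t2 = threading.Thread(target=wait_repeatedly,
                          args=(total_repeats - half, delay))

    start = time.perf_counter()
    t1.start()  # both threads begin waiting at roughly the same time
    t2.start()
    t1.join()   # block until each thread has finished its share
    t2.join()
    return time.perf_counter() - start
```

With the defaults, `run_two_threads()` should return a value close to 15, because the two sets of 15 waits overlap instead of running back to back.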
That’s a 50% reduction!
Code execution with more than 2 threads :-
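Similar to the previous code, we can add more threads. If the number of threads is a factor of 30, the work divides evenly and each thread gets an equal share of the 30 waits. The expected execution times are :-
For 3 threads :- about 10 seconds (10 waits per thread).
For 6 threads :- about 5 seconds (5 waits per thread).
That’s an 83% reduction in execution time!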
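The same idea generalises to any number of threads that divides the work evenly. The sketch below is one possible way to write it; `run_with_threads` and its parameters are hypothetical names introduced for illustration, not the original code.

```python
import threading
import time


def wait_repeatedly(repeats, delay=1.0):
    """Perform the wait operation `repeats` times in the calling thread."""
    for _ in range(repeats):
        time.sleep(delay)


def run_with_threads(num_threads, total_repeats=30, delay=1.0):
    """Divide `total_repeats` waits evenly across `num_threads` threads.

    Returns the elapsed wall-clock time in seconds.
    (Hypothetical helper; the name and parameters are assumptions.)
    """
    if total_repeats % num_threads != 0:
        raise ValueError("num_threads must divide total_repeats evenly")
    per_thread = total_repeats // num_threads

    threads = [
        threading.Thread(target=wait_repeatedly, args=(per_thread, delay))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish its share
    return time.perf_counter() - start
```

With the defaults, `run_with_threads(3)` and `run_with_threads(6)` should come out close to 10 and 5 seconds respectively, plus a small overhead for creating and joining the threads.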
Now that we have a brief idea of why threads matter, let's look at what threads actually are in the system.
Technical definition of threads :-
A thread is a path of execution within a process. A thread is also known as a lightweight process. The idea is to achieve parallelism by dividing a process into multiple threads.
Analogy of the above definition with our previous example :-
Consider our previous example of executing time.sleep(1) 30 times as a process. We then divided this task between two threads, each of which executed time.sleep(1) 15 times, in parallel. This splitting of our ‘process of executing 30 times’ into ‘threads which each execute 15 times’ is done to achieve parallelism.
Other examples of threading on a broader scale :-
Threading is used in almost every piece of professional software where execution time can make or break the deal. Here are some examples of how threading is used in popular software.
- Web Browsers: A web browser can download many files and web pages (multiple tabs) at the same time and still let you continue browsing. If a particular web page cannot be downloaded, that does not stop the browser from downloading other web pages.
- Web Servers: A threaded web server handles each request in its own thread. Often there is a thread pool, and every time a new request comes in, it is assigned to a thread from the pool.
- Computer Games: Playing the background music while the game itself keeps running is an example of multithreading.
- Text Editors: While you are typing in an editor, spell-checking, text formatting, and saving are done concurrently by multiple threads. The same applies to word processors.
- IDEs: IDEs like Android Studio run multiple threads at the same time. You can open multiple programs at the same time. The IDE also suggests completions for a command as you type, which runs in a separate thread.
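The thread-pool idea from the web server example above can be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`. This is only an illustration of the pattern, not how any particular web server is implemented: `handle_request`, `serve`, and the request ids are hypothetical, and `time.sleep` stands in for real I/O such as reading from disk or a database.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time


def handle_request(request_id, delay=0.1):
    """Hypothetical request handler: pretend to do some I/O, then reply."""
    time.sleep(delay)  # stand-in for real I/O work
    worker = threading.current_thread().name
    return f"response to request {request_id} from {worker}"


def serve(request_ids, pool_size=4, delay=0.1):
    """Assign each incoming request to a thread from a fixed-size pool."""
    with ThreadPoolExecutor(max_workers=pool_size,
                            thread_name_prefix="worker") as pool:
        # submit() hands the request to whichever pool thread is free.
        futures = [pool.submit(handle_request, rid, delay)
                   for rid in request_ids]
        # result() blocks until that particular request has been handled.
        return [future.result() for future in futures]
```

For example, `serve(range(8))` handles eight requests using at most four threads, so threads are reused rather than created once per request.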
Closing words :-
- Threading is typically used in programs to achieve parallelism. When parallelism is achieved, a programmer can execute multiple tasks without being constrained by a single flow of control.
- That is, even when the main program is busy with a task such as reading input from a user, the programmer can use threads to have a separate flow of control that performs other tasks without waiting unnecessarily.
- It can also be used to speed up the execution of a task, as we saw in our example.
- However, there are some disadvantages to using threads, and sometimes they may be overkill, as we will see in the subsequent parts of the series.
- Issues like the GIL (Global Interpreter Lock) and alternatives like multiprocessing haven’t been discussed here, but will be discussed in the subsequent parts. These topics are necessary for a good understanding of parallelism in Python.
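The point above about a separate flow of control can be illustrated with a small sketch. Here the main thread is kept busy with `time.sleep` as a stand-in for waiting on user input, while a second thread keeps working in the background. The names `background_task` and `main_with_background_work` are assumptions made for this example.

```python
import threading
import time


def background_task(results, steps=5, delay=0.1):
    """Do `steps` pieces of work, recording each one as it completes."""
    for step in range(steps):
        time.sleep(delay)     # stand-in for a unit of background work
        results.append(step)


def main_with_background_work(main_busy_for=0.35, steps=5, delay=0.1):
    """Keep the main thread busy while a worker thread makes progress."""
    results = []
    worker = threading.Thread(target=background_task,
                              args=(results, steps, delay))
    worker.start()

    # Stand-in for the main program being blocked, e.g. waiting for input.
    time.sleep(main_busy_for)
    progress_while_busy = len(results)  # work finished while main was busy

    worker.join()  # let the background thread finish the rest
    return progress_while_busy, results
```

The first value returned shows how many steps the background thread completed while the main thread was still occupied, which is work that would otherwise have had to wait.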
That’s it for Part 1. From Part 2 onwards, we will start implementing and understanding various concepts related to threading in Python. Please leave any doubts, suggestions, constructive criticism, and other feedback in the comments. Thank you.