Processes, Threads, Deadlock, Semaphores, and More

A very brief overview of OS and concurrent programming topics

Gabriel Hicks
Geek Culture
6 min read · Apr 6, 2021


Gopher reading a Go programming language book
https://medium.com/@trevor4e/learning-gos-concurrency-through-illustrations-8c4aff603b3

Why? The bigger picture

Recently, during an interview, I was asked questions about operating systems and concurrent programming. Although I knew these concepts were important, my non-tech background and focus on functional and frontend technologies let me justify not looking into them deeply.

My current favorite programming language, JavaScript, is single-threaded. Although I love JS, I can’t deny that concurrent (multi-threaded) programming is an essential part of modern programming and shouldn’t be overlooked in the name of perceived superiority or preference.

Concurrency, as it sounds, means multiple computations happening at the same time. Whether that’s handling multiple users on a website, several applications running on a computer, or processors adding cores to stay competitive, concurrent programming is here to stay.

Most readers may be familiar with programs. Programs are essentially instructions for a computer to perform a particular task. Modern programs are written on computers, for computers, in computer programming languages. In concurrent programming, and operating systems, there are some ideas I want to learn more about and hopefully I can generally lay them out.

These concepts are threads, processes, processes vs. threads, deadlock, semaphores, mutexes, and how they all relate.

Processes

A process is a program in execution. Code is written, compiled into binary, and as it is being read by the computer becomes a process. Processes are “active” where programs are “passive”. A program executed several times creates multiple processes (instances) of a program.

https://www.geeksforgeeks.org/introduction-of-process-management/

Processes have contexts: attributes unique to each process, stored in what is known as the process control block (PCB). For processes to run concurrently, their contexts need to be saved and restored, as illustrated in the diagram above.

There are other switches that can occur in processes, which I will investigate more thoroughly and follow up at some point in time.

What is important to know is that within a process there exist one or more threads.

A thread is a path of execution within a process. As stated above, there can be multiple threads within a process. Multi-threading is the idea of dividing a process into multiple threads with the goal of parallelism.
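The article doesn’t tie this to a particular language, so here is a minimal sketch using Python’s standard `threading` module; the names `worker` and `results` are illustrative. Each thread is one path of execution inside the same process. (In CPython, the global interpreter lock means these threads run concurrently rather than in true parallel, but the structure is the same.)

```python
# Minimal sketch: dividing one task across multiple threads.
import threading

results = [0] * 4

def worker(index: int) -> None:
    """Each thread computes one slice of the overall task."""
    results[index] = index * index  # stand-in for real work

# One Thread object per path of execution within this process
threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish

print(results)  # -> [0, 1, 4, 9]
```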

There are two types of threads: user-level threads and kernel-level threads. As one may infer, user threads are implemented by users, while kernel threads are implemented by the operating system. There are more differences, but for the sake of simplicity and overview I will spare these details for now.

Threads can be in one of three states: running, ready, and blocked.

One of the most significant differences between processes and threads is that threads share the memory space of their process, while processes run in separate memory spaces. However, threads are not independent of one another within a process. As a result, threads are generally more responsive, faster to context-switch, more effective on multi-processors, and they communicate more efficiently than processes.
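The shared-memory point can be made concrete with a small Python sketch (the names `seen` and `record` are illustrative): every thread sees and mutates the very same object, because all threads live in one address space.

```python
import threading

seen = []  # one list in the process's single address space

def record(value: int) -> None:
    seen.append(value)  # every thread mutates the same list object

threads = [threading.Thread(target=record, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen))  # -> [0, 1, 2]
# A child created with multiprocessing.Process would instead get its own
# copy of `seen` (a separate memory space); its appends would be
# invisible here.
```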

A semaphore is a kernel mechanism for signaling. Semaphores are non-negative variables shared between threads that help synchronize processes in a multi-processing environment. A thread waiting on a semaphore can be signaled by another thread. Semaphores have two basic operations for synchronization: wait and signal.

There are two types of semaphores:

A binary semaphore provides mutual exclusion; it can only hold two values, 0 and 1, and is typically initialized to 1. It is used to signal and coordinate between multiple processes.

A counting semaphore has no mutual exclusion and aids in controlling access to a resource which has multiple instances.
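A counting semaphore in action might look like the following Python sketch, where a pool of two identical resources is guarded by `threading.Semaphore(2)` (the names `use_resource`, `in_use`, and `peak` are illustrative):

```python
import threading
import time

pool_slots = threading.Semaphore(2)  # counting semaphore: 2 resource slots
state_lock = threading.Lock()        # protects the bookkeeping below
in_use = 0
peak = 0

def use_resource() -> None:
    global in_use, peak
    pool_slots.acquire()             # "wait": blocks once both slots are taken
    with state_lock:
        in_use += 1
        peak = max(peak, in_use)
    time.sleep(0.01)                 # stand-in for real work on the resource
    with state_lock:
        in_use -= 1
    pool_slots.release()             # "signal": wakes one waiting thread

threads = [threading.Thread(target=use_resource) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # at most 2: the semaphore never admits a third holder
```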

Through my research I found many different descriptions of a mutex. From what I have gathered, a mutex provides mutual exclusion and is used as a locking mechanism. Only one task can hold the mutex at a time, and only that owner can release the lock.

A semaphore can be thought of as a generalization of a mutex. Mutexes and binary semaphores can have similar implementations, but it is important to understand that the purposes and intentions behind their uses are different.
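The classic use of a mutex is protecting a read-modify-write so updates are never lost. A minimal Python sketch, with `threading.Lock` standing in for the mutex (the names `counter` and `increment` are illustrative):

```python
import threading

counter = 0
mutex = threading.Lock()  # the lock object acts as our mutex

def increment(times: int) -> None:
    global counter
    for _ in range(times):
        with mutex:        # acquire: only one thread may hold it at a time
            counter += 1   # the read-modify-write is now atomic
                           # release happens automatically on block exit

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 40000 (without the mutex, updates could be lost)
```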

Deadlock describes a situation in which a set of processes is blocked because each process is holding a resource while waiting for another resource held by a different process.

Deadlock occurs when the following conditions all hold simultaneously:

Mutual Exclusion: At least one resource is non-shareable (only one process can use it at a time).
Hold and Wait: A process holds at least one resource while waiting for additional resources held by other processes.
No Preemption: A resource cannot be taken from a process; the process must release it voluntarily.
Circular Wait: A set of processes wait for each other in a circular chain.
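Breaking any one of the four conditions prevents deadlock. A common tactic is to break circular wait by imposing a single global lock order, sketched here in Python (the function and lock names are illustrative):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def task() -> None:
    # Both threads acquire the locks in the same order: lock_a, then lock_b.
    # Danger: if one thread instead took lock_b first and lock_a second,
    # each thread could hold one lock while waiting on the other
    # (circular wait), and both would block forever.
    with lock_a:
        with lock_b:
            pass  # critical section using both resources

t1 = threading.Thread(target=task)
t2 = threading.Thread(target=task)
t1.start(); t2.start()
t1.join(); t2.join()   # both joins return: no cycle, no deadlock
done = True
print("finished without deadlock")
```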

Source:
https://www.geeksforgeeks.org/introduction-of-deadlock-in-operating-system/

To deal with deadlock, you can prevent or avoid the conditions that lead to it, detect deadlocks and recover once one has occurred, or ignore them altogether because they are rare and the system can be rebooted if one occurs. The third option is how Windows and UNIX operating systems handle deadlocks.

Conclusion

The performance trade-off between processes and threads for achieving parallelism is nuanced and varies from case to case. The general consensus seems to be that writing multi-threaded software is hard and that its complexity grows quickly. Multi-threaded code deals with race conditions, locking, and asynchronous calls, and can be very error prone if your understanding is not granular.

In a more relatable example: if you have one database, millions of users could be reading from it at the same time. You need to manage that concurrency by implementing good threading practices.

I hope to continue to learn more about these types of concepts and get real world experience as I grow and learn in my career. I do not expect these descriptions or explanations to be exactly accurate as I have only cursory knowledge myself. I do intend for this to be informational and give an insight into some of the less touched topics for entry level engineers, perhaps like me, from a non-tech background!


Software Engineer from Iowa, living in NYC. Interested in new technologies and exploring opportunities to grow. https://gabrielhicks.dev