Sigh!…duh!..wtf…are you serious?…why?…omg…!!!
With great power comes great responsibility - Spiderman Movie
But here, I want to talk about the baggage. Every great and magical thing comes with baggage that we take for granted, as if it were a given that good things have to be paid for in some way or other. I talked about how Python is good, and how being good also hurts Python in other respects, in one of my previous blog posts here.
Similarly, JS and Node have some baggage: unique problems of their own that we have to swallow whether we like it or not. So today I want to discuss that baggage. Don't read this as saying JS and Node are bad; rather, consider the things you may have to endure while using these magical potions.
“Asynchronous”: if you want me to sum up my praise for Node and JS in one word, this would be it.
Asynchronous is your Swiss Army knife to keep you from getting fired.
What is this “asynchronous” that everybody keeps talking about nowadays, almost everywhere?
“Asynchronous I/O, or non-blocking I/O is a form of input/output processing that permits other processing to continue before the transmission has finished.” — Wikipedia
“Asynchronous programming refers to a style of structuring a program whereby a call to some unit of functionality triggers an action that is allowed to continue outside of the ongoing flow of the program.” — Nodesource
Well, if that still isn't easy enough to understand, let me try in my own words. In a nutshell, asynchronous looks something like the image below.
Computers understand commands: instructions given to them to perform some operation. In our daily lives, we are all in the habit of doing things in sequence. We buy vegetables first, then we wash our dirty dishes, then we cut the vegetables, then we cook the food, and finally we eat. Sure, you can always order in or go out to a restaurant, but my point is that whatever we do, we do in sequence. Even eating out requires you to first leave the house, get into a vehicle (a car, maybe), drive to the restaurant, call the waiter, and give your order; then the chef cooks your food, the waiter brings it to your table, and only then do you start eating. You see, everything needs a sequence. You cannot buy vegetables and start eating simultaneously. The sequence is always there. Well, there are a few things in life that you can do simultaneously, but you get my point.
This habit of following sequences is built into our lives, so we find the concept of a sequence easy to understand. In programming, the code we write is basically a sequence of commands or instructions that we tell our computer system, specifically the CPU, to perform. And without realizing it, we almost always end up writing sequential code, which is another word for synchronous code, because it comes naturally to us. That is what I was trying to point out in the previous paragraph.
// Synchronous code
alert('i am anirban');
const a = 3;
const b = 5;
const sum = a + b;
console.log('sum : ', sum);
The code above is a sequence of steps that the CPU performs one by one. This is called synchronous programming; here, synchronous means sequential.
I/O — What is it?
Synchronous I/O is synchronous programming that also involves some kind of I/O. I/O means any operation that spends most of its time outside the CPU: the work happens in some other part of the computer system, such as a disk or network device and its driver, while the CPU mostly waits.
Sometimes just executing a sequence of instructions is not enough. Sometimes we need to store results or data somewhere and later read them back from where they are stored. These kinds of actions are called I/O. Network events are also I/O, so sending a network request or responding to an incoming one falls under I/O too.
Back to Asynchronous and Non-Blocking I/O
If we do any kind of I/O as just another step in our sequence of commands, it is synchronous I/O, or blocking I/O. Why blocking? Because I/O takes more time. I/O is not served by the CPU or by fast memory like cache or RAM. It has to go through layers of drivers and devices down to the real hardware to fetch data. With network events, the request (sent by the client) travels from one system to another, which may be quite far away, hopping from place to place across the network (which we call the internet) until it finally reaches its destination (the server). So this kind of I/O also takes a lot of time.
If we write I/O commands the same way as ordinary sequential commands, they become blocking I/O: every following step has to wait until the I/O returns or finishes.
Non-blocking I/O, on the other hand, means the I/O does not block the regular sequence of instructions. The rest of the sequence keeps running after the I/O starts and before it finishes or returns, as if the I/O had not happened at all.
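To make the difference concrete, here is a small sketch in Node that reads the same file both ways. The temp file, its contents, and the names `order` and `pending` are made up for this illustration; only the `fs`, `os`, and `path` calls are real Node APIs.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo file, created only so the example is self-contained.
const file = path.join(os.tmpdir(), `async-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello from disk');

// Records the order in which things happen.
const order = [];

// Blocking: the next line does not run until the whole file has been read.
const text = fs.readFileSync(file, 'utf8');
order.push('sync read done');

// Non-blocking: readFile returns a promise immediately. The callback runs
// later, after the read finishes and the current code has run to completion.
const pending = fs.promises.readFile(file, 'utf8').then((data) => {
  order.push('async read done');
  console.log(order);
  return data;
});

// This line runs before the asynchronous read has delivered its result.
order.push('code after async read');
```

The array is printed with 'code after async read' ahead of 'async read done': the sequence kept going while the read was still in flight.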
We can do non-blocking I/O by using threads and letting the I/O instructions run on different threads. Threads help because a blocked thread only blocks itself: while one thread waits on I/O, the others keep running, and given multiple CPU cores they can run truly in parallel.
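Node itself can show the thread approach through its `worker_threads` module. This is only a rough sketch of the idea, not how Node normally does I/O: the helper `readInWorker`, the temp file, and the `order` array are invented for the example.

```javascript
const { Worker } = require('worker_threads');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo file so the example is self-contained.
const file = path.join(os.tmpdir(), `worker-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'read inside a worker');

// Code that runs on the worker thread. The read is blocking,
// but it only blocks the worker, not the main thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const fs = require('fs');
  parentPort.postMessage(fs.readFileSync(workerData, 'utf8'));
`;

// Hypothetical helper: starts a worker and resolves with what it read.
function readInWorker(filePath) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: filePath });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const order = [];
const result = readInWorker(file).then((text) => {
  order.push('worker finished');
  return text;
});

// The main thread is free to continue while the worker does the blocking read.
order.push('main thread keeps going');
```

Starting a thread per I/O operation has a real cost, which is one reason the polling approach described next is attractive.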
Another way to do non-blocking I/O is by using non-blocking sockets. At the end of the day, any network request deals with sockets in the background. With non-blocking sockets, the request and response data sit in each socket's own queues, and we can use some kind of polling mechanism to check those queues and see whether a socket is ready to read, write, send, or receive. Polling mechanisms such as epoll (Linux) or kqueue (BSD and Mac OS X), if they are available, or else a fallback to select(), are ways to do non-blocking I/O.
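In Node you never call epoll, kqueue, or select() yourself. The `net` module hands sockets to libuv, which uses the platform's polling mechanism and calls your callbacks when a socket has something for you. Here is a minimal sketch of a tiny echo server and client on one thread; `echoOnce` is a made-up helper name, and it assumes a short message that arrives in a single chunk.

```javascript
const net = require('net');

// Hypothetical helper: starts a throwaway echo server on a free local port,
// sends one message to it, and resolves with whatever comes back.
function echoOnce(message) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((socket) => {
      // Called only when data is ready on the socket; echo it and close.
      socket.on('data', (chunk) => socket.end(chunk));
    });
    server.on('error', reject);

    // Port 0 asks the operating system for any free port.
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const client = net.connect(port, '127.0.0.1', () => client.write(message));
      let reply = '';
      client.setEncoding('utf8');
      client.on('data', (chunk) => { reply += chunk; });
      client.on('end', () => {
        server.close();
        resolve(reply);
      });
      client.on('error', reject);
    });
  });
}

echoOnce('ping').then((reply) => console.log('reply:', reply));
```

Nothing in this code waits on a socket. Every step is a callback that runs once the socket is ready.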
Threads are how many compiled languages, like C/C++ and Java, handle non-blocking I/O. Some interpreted languages, like Python and Ruby, also use multi-threading or multi-processing to do asynchronous I/O. However, some libraries, like Tornado (in Python), don't use threading or multi-processing; in fact, Tornado uses a method very similar to what Node.js uses: an event-based I/O loop in single-threaded mode. The only difference is that Tornado is a library written in Python, so it has to do many tricks to achieve this asynchronous behaviour, whereas Node.js is built from the ground up with asynchronous in mind; there is no other way to use Node. Asynchronous behaviour is a first-class citizen in Node. Hence it has a far greater hold over event loops, and over performance too. (Well, that is always debatable, but most of the time it holds.)
Node was built from the ground up to understand asynchronous I/O, and being single-threaded, it makes asynchronous I/O a first-class citizen. Developers write their code keeping in mind that it should respect this asynchronous behaviour. If they do CPU-heavy work or blocking I/O, then, because Node is single-threaded, their entire process blocks. Being single-threaded also means Node can handle highly concurrent requests, since all I/O is non-blocking: once an I/O operation finishes, an event notifies the main thread, and the response is sent from that main thread. That's why it is called event-based programming, or making use of the event loop.
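You can watch the single thread get blocked with a few lines of code. This sketch asks for a timer callback "as soon as possible" and then keeps the CPU busy for about 50 ms; the names `order` and `done` and the 50 ms figure are arbitrary choices for the demo.

```javascript
const order = [];

// Ask the event loop to run this callback as soon as it can.
const done = new Promise((resolve) => {
  setTimeout(() => {
    order.push('timer fired');
    resolve(order);
  }, 0);
});

// CPU-heavy synchronous work: nothing else can run on this thread meanwhile,
// so the timer above has to wait even though its delay has long expired.
const start = Date.now();
while (Date.now() - start < 50) {
  // busy-wait for roughly 50 ms
}
order.push('busy loop finished');

done.then((finalOrder) => console.log(finalOrder));
```

The timer callback always runs after 'busy loop finished', because the event loop only gets a chance to dispatch events once the current synchronous code has returned.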
I think you have tired yourself out enough by reading this far. I appreciate your effort and your willingness to stay till the end. Let's take a break, and I will continue in Part 2 of this post.
Link to Part 2