While your cluster example does the job quickly, it exits the process early (right after reading is done) and potentially discards data. Did it actually produce the correct files? It should be producing 199.2 MB files (199,243,770 bytes per file). I’ve corrected your code below:
And here are the new timings:
Clustering is useful, and it’s a great tool for server applications, although it wastes some memory and CPU cycles on start-up (you pay the overhead of spinning up new sub-processes every time you start your application).
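To illustrate the early-exit problem mentioned at the top: the sketch below is not the corrected code itself, only a minimal example of the general idea, assuming the output goes through an `fs` write stream. The helper name `writeAll` is hypothetical. It resolves only after the stream emits `finish`, which is the point at which exiting the process should no longer cut the file short.

```javascript
const fs = require('fs');

// Hypothetical helper (not from the original code): writes every chunk to
// `file` and resolves with the resulting file size, but only after the
// write stream has emitted 'finish'.
function writeAll(file, chunks) {
  return new Promise((resolve, reject) => {
    const out = fs.createWriteStream(file);
    out.on('error', reject);
    // 'finish' fires once end() has been called and all buffered data has
    // been flushed to the underlying system. Exiting before this event can
    // leave a truncated file behind.
    out.on('finish', () => resolve(fs.statSync(file).size));
    // Backpressure is ignored here to keep the sketch short; pending chunks
    // are simply buffered in memory until they are written.
    for (const chunk of chunks) out.write(chunk);
    out.end();
  });
}

// Usage sketch: call process.exit() only once the promise has resolved, e.g.
// writeAll('out.bin', [someBuffer]).then(() => process.exit(0));
```

In a cluster worker the same rule would apply: exit when the write side has finished, not when the read side has.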
Aside from that, you can’t scale a real-time application using clustering. A game, for example, can’t just spawn separate processes for audio, graphics, and input. A real-time video encoding program that’s using pipes to communicate with sub-processes won’t even compete with one that’s using threads.
A website for live video streaming will most likely scale much better with multithreading than with a single-threaded model.
As I said before, my analogy for node is this: What good is the world’s fastest highway if there’s a crowded toll booth at the end of it?
The toll booth would be the single-threaded event loop model that node uses, which is the choke point, and the real limiter, of every node application.
I’m demolishing that booth.