Goodbye Transform-Streams, long live ES9 Async Generators
Clickbait title. Async generators are a powerful new feature, but you cannot throw transform streams in the trash.
What are transform streams?
The Node.js documentation says: “a transform stream is a duplex stream where the output is computed in some way from the input. Examples include zlib streams or crypto streams that compress, encrypt, or decrypt data”.
I’m not going to explain them here; you can find plenty of free resources online. They are a really powerful tool for building amazing things with Node.js, but the drawback is that they are not simple to implement, not even for easy tasks. Still, if you only need to handle the transformation logic of the chunks, odds are that this type of stream is the best choice.
Transform streams are independent units in our code: they can be piped directly into other streams, with all the advantages that brings. Backpressure is applied automatically, so we do not have to worry about the readable stream and the writable one working at different speeds. Node.js will optimize these processes, making them concurrent.
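For comparison, here is a minimal sketch of a transform stream that only handles the transformation logic of the chunks. The upperCase factory is a hypothetical name chosen for illustration, and Readable.from assumes Node.js 12.3 or later.

```javascript
const { Readable, Transform } = require('stream');

// Hypothetical example: returns a transform stream that upper-cases text chunks.
function upperCase() {
  return new Transform({
    transform(chunk, encoding, callback) {
      // First argument: an error (none here). Second argument: the output chunk.
      callback(null, chunk.toString().toUpperCase());
    },
  });
}

// pipe() wires the streams together and applies backpressure between them.
Readable.from(['hello ', 'streams\n'])
  .pipe(upperCase())
  .pipe(process.stdout);
```

Even for a one-line transformation you have to deal with the callback protocol, which is the kind of ceremony the rest of this post tries to avoid.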
How do async generators come into play?
Async generators are an ES9 (ES2018) feature that lets us easily implement the asynchronous iterator interface. Node.js readable streams already implement it, so we can consume them with a for-await-of loop.
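A minimal sketch of that loop, assuming Node.js 12.3 or later for Readable.from. The collect helper is a hypothetical name used only for this example.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: gathers every chunk of an async iterable into an array.
async function collect(iterable) {
  const chunks = [];
  for await (const chunk of iterable) {
    chunks.push(chunk);
  }
  return chunks;
}

// Readable.from builds an object-mode readable stream from an iterable.
const source = Readable.from(['a', 'b', 'c']);

collect(source).then((chunks) => console.log(chunks)); // [ 'a', 'b', 'c' ]
```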
So if a stream is readable it is also an asynchronous iterable.
This is not a tutorial on async stuff; if you want to know more about async iterables and async generators, check this out!
Async generators can be used to transform chunks, like transform streams, but their main purpose is to handle the data flow.
In short, async generators let you:
- transform chunks
- filter/compose chunks
- pipe chunk transformations
- query data from the source when you need it
Now timing is completely under your control. You do not have to query data from the source as soon as it is available. You do not have to write data into a writable stream as soon as you receive it from the source. Of course, you must await each time you query data and each time you write it, because reads and writes are asynchronous operations.
The point is that what you do between a read and a write is up to you, and it can be asynchronous. I would be lying if I told you that you cannot do something similar with transform streams. But with async generators things are easier, and you can let other entities in your code manage the asynchronous data flow. This is why we can consider async generators a flow control tool.
Let’s see an example where we pipe two generators to filter out all even numbers and increment the odd ones by one, before writing them to process.stdout. The numbers come asynchronously from a mysterious readable stream:
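A minimal sketch of such a pipeline follows. The numbers() generator is a hypothetical stand-in for the unknown source, onlyOdd and increment are illustrative names, and Readable.from assumes Node.js 12.3 or later.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for the source: emits 1..6, one number every 10 ms.
async function* numbers() {
  for (let n = 1; n <= 6; n++) {
    await new Promise((resolve) => setTimeout(resolve, 10));
    yield n;
  }
}

// First generator: filters out even numbers.
async function* onlyOdd(source) {
  for await (const n of source) {
    if (n % 2 !== 0) yield n;
  }
}

// Second generator: increments every number it receives by one.
async function* increment(source) {
  for await (const n of source) {
    yield n + 1;
  }
}

// IIAFE: the third for-await-of drains the piped generators.
(async () => {
  const readable = Readable.from(numbers());
  for await (const n of increment(onlyOdd(readable))) {
    process.stdout.write(`${n}\n`); // writes 2, 4 and 6, one per line
  }
})();
```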
In this example there is no explicit flow control, because each async generator simply consumes all chunks as soon as possible. Inside the IIAFE (immediately invoked async function expression) things don’t change, because of the third for-await-of. But the async iterable returned by piping the async generators could have been used differently, giving us the opportunity to let other entities manage the whole asynchronous iteration.
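As a sketch of what “used differently” could mean, a consumer can call next() on the async iterator itself and decide when to ask for each chunk. Both counter and pullSome are hypothetical names, and the pause only simulates a consumer that is busy with something else.

```javascript
// Hypothetical endless source: yields 0, 1, 2, ... only when asked.
async function* counter() {
  let n = 0;
  while (true) yield n++;
}

// Hypothetical consumer that decides when to pull: it asks for the next
// chunk, waits `pauseMs` milliseconds, and stops after `limit` chunks.
async function pullSome(iterable, limit, pauseMs) {
  const iterator = iterable[Symbol.asyncIterator]();
  const received = [];
  while (received.length < limit) {
    const { value, done } = await iterator.next();
    if (done) break;
    received.push(value);
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
  // Tell the source we are finished so it can clean up.
  if (typeof iterator.return === 'function') await iterator.return();
  return received;
}

pullSome(counter(), 3, 10).then((received) => console.log(received)); // [ 0, 1, 2 ]
```

The source never runs ahead of the consumer: nothing is produced until next() is called, which is why an endless generator is safe here.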
Are you not comfortable with generators?
Don’t worry. I’m working on a little module that lets you write functions, including async ones, and then transform them into async generators! Furthermore, I’ve coded a little helper function to easily pipe async generators together.
Let’s see how to use the
You can see that filter functions must return undefined only when a chunk has to be discarded. Normal transform functions must always return a value. Composition functions are not yet supported, but I’m working on them.
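To make the undefined rule concrete, here is a rough sketch of how a function could be wrapped into an async generator. This is not the code of my module: toAsyncGenerator, keepOdd and addOne are hypothetical names that only illustrate the idea.

```javascript
// Hypothetical stand-in, not the actual helper: wraps a (possibly async)
// function into an async generator function.
function toAsyncGenerator(fn) {
  return async function* (source) {
    for await (const chunk of source) {
      const result = await fn(chunk);
      // A result of undefined means "discard this chunk".
      if (result !== undefined) yield result;
    }
  };
}

// Filter function: returns undefined only for the chunks to discard.
const keepOdd = toAsyncGenerator((n) => (n % 2 !== 0 ? n : undefined));

// Transform function (async on purpose): always returns a value.
const addOne = toAsyncGenerator(async (n) => n + 1);
```

One consequence of this convention is that a transform function can never emit undefined as a legitimate chunk.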
Now let’s see how to pipe the generators:
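A pipe helper can be sketched in a few lines. Again, this is an assumption about the general shape, not the source of my helper: pipeGenerators, onlyOdd and increment are hypothetical names.

```javascript
// Hypothetical helper: feeds `source` through each async generator function,
// in order, and returns the resulting async iterable.
function pipeGenerators(source, ...generators) {
  return generators.reduce((iterable, generator) => generator(iterable), source);
}

// Filters out even numbers.
async function* onlyOdd(source) {
  for await (const n of source) {
    if (n % 2 !== 0) yield n;
  }
}

// Increments every number by one.
async function* increment(source) {
  for await (const n of source) {
    yield n + 1;
  }
}

(async () => {
  // A plain array works as a source too: for-await-of accepts sync iterables.
  for await (const n of pipeGenerators([1, 2, 3, 4, 5], onlyOdd, increment)) {
    process.stdout.write(`${n}\n`); // writes 2, 4 and 6, one per line
  }
})();
```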
It’s great, isn’t it?
Here you can find a complete gist.
Here you can see the source code of my helpers.
What about performance?
I’ve tested async generators vs transform streams with files. With small and medium-sized files you cannot see much difference: sometimes transform streams are better, sometimes async generators are better.
If you need to use my asyncGeneratorsFactory helper function, know that things will go a bit slower.
Anyway, with big files it seems that async generators are better even for transformation-only purposes. But more tests are needed, also with different types of streams. If you can, help me :)