Getting started with js-csp — part 2

Justin Calleja
4 min read · May 27, 2017


Previously, in part 1, we took a look at how to work with the basics of js-csp in the context of writing a small CLI app. After writing the thing and stepping back a bit, I realised that, although I had split the problem up into nice understandable parts, nothing was happening concurrently. This post addresses that issue. (By the way, if you’re interested in getting some practice, I’d suggest trying to do this yourself after reading the recap in the intro here.)

Let’s recap some of the main points we covered in part 1:

  1. yield csp.take(channel) can be used to suspend execution of a generator function and wait for a message to be available on channel before resuming. It can only be used in a generator function executed by csp.go.
  2. yield csp.put(channel, msg) can be used to suspend execution of a generator function and wait for a message to be consumed from channel before resuming the generator function. It can only be used in a generator function executed by csp.go.
  3. const channel = csp.go(function * () { … }) can be used to run a generator function (containing put / take ops). It returns a channel which will receive one message. This message will be whatever the generator function returns when it’s done.
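To make the "suspend and resume" wording above a bit more concrete, here is a minimal sketch that uses nothing but a plain generator and a hand-rolled driver. It is not js-csp, and it is not a claim about how js-csp is implemented: `fakeTake`, `adder` and `drive` are hypothetical names made up for this illustration. All it shows is that a generator pauses at each yield and carries on once whoever is driving it hands a value back in.

```javascript
// Hypothetical stand-in for an instruction such as csp.take(channel).
function fakeTake(name) {
  return { op: 'take', from: name }
}

function* adder() {
  // Execution pauses at each yield until the driver resumes the
  // generator with a value.
  const first = yield fakeTake('a')
  const second = yield fakeTake('b')
  // With csp.go, the returned value is what the returned channel receives.
  return first + second
}

// Toy driver (an assumption for illustration): answers every "take"
// from a lookup table of messages that are already available.
function drive(gen, messages) {
  let step = gen.next()
  while (!step.done) {
    step = gen.next(messages[step.value.from])
  }
  return step.value
}

const result = drive(adder(), { a: 1, b: 2 }) // 3
```

In the real library the driver role is played by csp.go, and the values come from channels rather than from a lookup table.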

Now, let me spell out why the code in part 1 has no concurrency in it. I left this out previously as an exercise for the reader (ok, I was also tired and didn’t feel like it, haha, but really, if you did figure it out by yourself then you certainly understood the code):

  • We first make a request for all contributors (this is an un-paginated result). We cannot do anything productive while we wait for this HTTP request as we need the contributors to continue.
  • Once we get the contributors, this is what I was doing:
const { json } = msg
for (let i = 0; i < json.length; i++) {
  const contributor = json[i]
  yield csp.put(
    channels.outputCh,
    AddContributor(contributor.login, contributor.html_url)
  )
  const doneCh = csp.go(
    followerProcess,
    [contributor.followers_url, contributor.login, csp.chan()]
  )
  yield csp.take(doneCh)
}
return yield csp.put(channels.outputCh, FlushMsg())

i.e. for every contributor, I was adding that contributor to the output and spawning another process to take care of fetching the contributor’s followers. I was then waiting for this process to finish before looping on to the next contributor. I was doing this because I needed to know when all followers were processed so that I could send the FlushMsg .

This, however, is the main thing that’s slowing everything down. We clearly don’t need to wait for one contributor’s followers to be fetched and added before requesting another’s. (One thing to note though: I do want the contributor to be added to the output before fetching any of its followers. The reason is that adding a follower depends on its contributor having been added first; otherwise there’s no key to append to in the output.)
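A rough back-of-the-envelope sketch of why this matters. The durations below are made-up numbers, not measurements, and `sequentialWait` / `concurrentWait` are hypothetical helpers. The model is also idealized: it ignores things like rate limits and the time spent adding contributors to the output.

```javascript
// Hypothetical time (in ms) each contributor's follower fetch takes.
const followerFetchMs = [300, 500, 200]

// Part 1 approach: wait for each process before starting the next,
// so the individual waits add up.
function sequentialWait(durations) {
  return durations.reduce((total, ms) => total + ms, 0)
}

// Start every process first, then wait: in this idealized model the
// total wait is roughly that of the slowest process.
function concurrentWait(durations) {
  return Math.max(0, ...durations)
}

const sequentialMs = sequentialWait(followerFetchMs) // 1000
const concurrentMs = concurrentWait(followerFetchMs) // 500
```

The more contributors there are, the bigger the gap between the sum and the maximum gets.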

So that’s the problem we’ll be solving in this part… and it’s almost a one liner.

One possible solution

We can just collect all the channels returned by spawning the followerProcesses and then wait for all of them to finish:

let doneChs = []
for (let i = 0; i < json.length; i++) {
  const contributor = json[i]
  yield csp.put(
    outputCh,
    AddContributor(contributor.login, contributor.html_url)
  )
  doneChs.push(
    csp.go(
      followerProcess,
      [contributor.followers_url, contributor.login, csp.chan()]
    )
  )
}
for (let doneCh of doneChs) {
  yield doneCh
}
// all followerProcesses have ended
return yield csp.put(outputCh, FlushMsg())

One thing I avoided mentioning in part 1 (but now I think you’ve got enough experience to start shortening things) is that yielding a channel is equivalent to taking from it, so the following are identical:

  • yield doneCh
  • yield csp.take(doneCh)
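If you find yourself waiting on a bunch of channels like this more than once, the loop could be pulled out into a small helper. This is just a sketch: `takeAll` is a hypothetical name (not part of js-csp), and it assumes that `yield*` delegation inside a generator run by csp.go behaves like ordinary generator delegation, so that each inner `yield ch` is treated as a take. The second half drives the helper by hand, with strings standing in for channels, only to show what it asks for and what it returns.

```javascript
// Hypothetical helper: take one message from each channel, in order.
// Intended to be used as `yield* takeAll(doneChs)` from inside a
// generator run by csp.go, relying on "yield ch" meaning "take from ch".
function* takeAll(channels) {
  const results = []
  for (const ch of channels) {
    results.push(yield ch)
  }
  return results
}

// Driving the generator by hand (no js-csp involved): strings stand in
// for channels, and the values passed to next() stand in for messages.
const gen = takeAll(['doneA', 'doneB'])
const asked = [gen.next().value]         // first channel it waits on
asked.push(gen.next('a finished').value) // second channel it waits on
const finished = gen.next('b finished')  // nothing left to wait for
```

Waiting on the channels in order does not serialize the work, since every process was already started before the first take.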

Why are you switching to `for of` now?

You might have noticed that this is the first time I’m using the for of construct for iteration in this 2 part post. The reason is that, while writing part 1, I was getting some errors with for of loops which I wasn’t getting with a plain for loop, so I decided to scrap them altogether for the time being.

In particular, the reason why I’m pointing this out here is to highlight something I forgot to mention in part 1. Basically, avoid using .forEach to iterate over your collections, or any other construct that requires you to pass a callback function. The reason is that any code you write in your callback will be outside the wrapping generator function, so you will no longer be able to csp.take / csp.put there.
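Here is a small plain JavaScript sketch of that limitation, with no js-csp involved. `parses` is a hypothetical helper that only checks whether a piece of source code is syntactically valid. A `yield` followed by an expression inside a regular callback is rejected by the parser, while the same `yield` in a for of loop is fine because it still sits directly in the generator body.

```javascript
// Hypothetical helper: true if `source` parses as a function body,
// false if the engine rejects it with a SyntaxError.
function parses(source) {
  try {
    new Function(source)
    return true
  } catch (err) {
    if (err instanceof SyntaxError) return false
    throw err
  }
}

const withForEach = `
  function* proc(items) {
    items.forEach(function (item) {
      yield item // inside a plain callback, not the generator
    })
  }
`

const withForOf = `
  function* proc(items) {
    for (const item of items) {
      yield item // still directly inside the generator body
    }
  }
`

const forEachParses = parses(withForEach) // false
const forOfParses = parses(withForOf)     // true
```

The same applies to .map, .filter and friends: the callback is its own (non-generator) function.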

Check out the refactored code

I’ve refactored the code (split in multiple files) and added the little tweak above in branch post2 here.

Time?

I ran against ubolonton/js-csp again and the total time came in at 38s… a marked improvement from the previous 1m 35s.

I noticed that the result wasn’t exactly the same though: the project now has one less contributor… I wasn’t aware that contributor counts can go down… That said, the total number of lines in the output is close enough.

In any case, that’s all I got for this time. Happy coding :)
