Recent developments in the browser automation & web scraping space

David Wickström
Aug 25, 2017 · 2 min read

Okay, so for a long time there wasn’t much action in this space. We had PhantomJS and NightmareJS, and to be fair, NightmareJS at least was all you could wish for.

Earlier this year Chromium, the open source project behind Chrome, the world’s most popular web browser, released technology that allows you to run the browser headlessly.

The release stirred up some proper action in the community because, I believe, it removes the dependency on Electron that tools such as NightmareJS have. And removing dependencies sounds like a good thing.
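
To make "headlessly" a bit more concrete, here is a minimal sketch of driving headless Chrome directly from Node, without any wrapper library. It relies on Chrome's `--headless`, `--disable-gpu` and `--dump-dom` command-line flags. The helper names `headlessArgs` and `dumpDom` are made up for this illustration, and the binary name is an assumption that differs between platforms.

```javascript
const { execFile } = require('child_process');

// Build the command-line flags for a one-off headless run that prints the
// serialized DOM of `url` to stdout.
function headlessArgs(url) {
  return ['--headless', '--disable-gpu', '--dump-dom', url];
}

// `chromeBinary` is an assumption: the executable is called something like
// `google-chrome` or `chromium-browser`, or is a full path, depending on
// the platform and how the browser was installed.
function dumpDom(chromeBinary, url) {
  return new Promise((resolve, reject) => {
    execFile(
      chromeBinary,
      headlessArgs(url),
      { maxBuffer: 10 * 1024 * 1024 }, // allow large pages on stdout
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
  });
}

// Usage (needs a Chrome or Chromium build with headless support installed):
// dumpDom('google-chrome', 'https://example.com').then(console.log);
```

The libraries below essentially hide this plumbing and talk to the browser over its remote debugging protocol instead of parsing stdout.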

A bunch of different projects have surfaced, each hoping to gain traction as the next hot thing. This blog post gives you a brief update on how that is going.

Google’s homegrown

Google decided to join the party and build their own high-level Node API on top of headless Chrome. It’s called Puppeteer.

https://github.com/GoogleChrome/puppeteer
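
To give a feel for what "high-level" means here, this is a minimal sketch that opens a page and reads its title. The function name `fetchTitle` is made up for this illustration, and the library is passed in as an argument rather than required at the top, purely to keep the snippet self-contained. The project is young, so treat the method names as a snapshot that may change.

```javascript
// Fetch the <title> of a page with Puppeteer.
// `puppeteer` is the module itself; normally you would pass
// require('puppeteer') after running `npm install puppeteer`.
async function fetchTitle(puppeteer, url) {
  const browser = await puppeteer.launch(); // starts headless Chromium
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.title();
  } finally {
    // Always shut the browser down, even if navigation fails.
    await browser.close();
  }
}

// Usage (requires a Node version with async/await support):
// fetchTitle(require('puppeteer'), 'https://example.com').then(console.log);
```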

It looks promising, but I have long had the feeling that Google does not like competition when it comes to data aggregation, which in turn leads me to believe that this might become a problem for the library in the long run. OK, tin foil hat off. Next.

Navalia

Judging by the growth rate of its GitHub stars, this project has so far gained the least traction of the ones mentioned here.

https://github.com/joelgriffith/navalia

Having said that, there is definitely good involvement from developers, and things are looking nice and smooth.

Chromeless

This one looks very promising and has gained almost as much traction as the Puppeteer project, judging by the number of GitHub stars. It runs either locally or on AWS Lambda, which seems very convenient and scalable. Very nice indeed.

Check this demo:

How cool is that?
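
For a rough idea of the chainable API, here is a minimal sketch that loads a page and takes a screenshot. The function name `screenshotPage` is made up for this illustration, and the `Chromeless` class is passed in as an argument to keep the snippet self-contained. As I understand the README, passing `{ remote: true }` as options is what switches execution from a local Chrome to the Lambda setup, which also needs its endpoint configured separately.

```javascript
// Take a screenshot of `url` with Chromeless.
// `Chromeless` is the class exported by the library; normally you would
// pass require('chromeless').Chromeless after `npm install chromeless`.
async function screenshotPage(Chromeless, url, options = {}) {
  const chromeless = new Chromeless(options);
  try {
    // Calls are chained and only run once the chain is awaited.
    // The result is where the screenshot was stored.
    return await chromeless.goto(url).screenshot();
  } finally {
    // Release the browser session whether or not the chain succeeded.
    await chromeless.end();
  }
}

// Usage, locally:
// screenshotPage(require('chromeless').Chromeless, 'https://example.com')
//   .then(console.log);
```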


Other projects

These are definitely also worth mentioning:

  • Chromy, which provides mobile emulation through its API.
  • GhostJS, which also runs on browsers other than Chrome.

A3J dev blog

Thoughts and reflections from a Stockholm based web tech agency

Written by David Wickström (Developer - JS, Python, PHP and more)
