Moving to fullstack end-to-end test automation with Node.js part 1 - Selenium Webdriver

Mek Stittri
11 min read · Jan 3, 2016


Back in November '15, I had the privilege of presenting at San Francisco’s Selenium meetup. The topic was the growing adoption of Node.js and how it will impact future implementations of Selenium automation frameworks. This first series of posts shares in more detail how this journey started at Airware and the lessons learnt along the way.

The slides and recording for the talk can be found on SlideShare and Sauce Labs’ YouTube channel.

Abstract

With the growing popularity of Node.js, more and more companies have embraced its adoption and gone full-stack. The next logical move is to have the test framework be on the same stack. Unfortunately, proven ways of implementing a Selenium framework in Javascript are very limited and this space is still fragmented.

Background

I joined Airware back in June 2015. My first task as the first automation engineer for the cloud was to build a full-stack functional test automation framework that fits well with the current Node.js stack. I was free to choose how this would be done, with one caveat: the framework had to be in Node.js.

Why Node.js

  • Both frontend and backend are in Node.js. Having functional tests written in Node.js as well helps with collaboration across the stack between developers and quality engineers.
  • Javascript is eating the world. Its growing adoption is undeniable, and it is making its way into the server side as well as native apps. Might I add that WordPress, originally a PHP shop, is also moving to Javascript.
  • The engineering stack has always been disconnected when it comes to functional and end-to-end test automation with Selenium. Most implementations are in Java, Python or Ruby, while UI developers do their work in Javascript. On top of that, there is fragmentation on the backend as well, with Java, C++, Python, etc. With the adoption of Node.js we have the opportunity to unite all facets of the product under one lingo, Javascript.

Other requirements

  • Must support visual screenshot diff. Either a well known 3rd party like Applitools or a ready-to-use library. (covered in part 2)
  • Must be able to drive the UI (Selenium) as well as REST APIs/HTTP request endpoints (covered in part 3).
  • Must be easy to write tests with an emphasis on code readability and understandability.

Problem statement

A typical engineering stack for an end-to-end test automation framework will look something like the first 3 columns.

We are going for the rightmost stack

We have our legacy Selenium framework built on Java (TestNG), Python and Ruby. Then you have frontend engineers who are coding in Javascript, and further fragmentation at the backend with Java, Node.js, Ruby and many more. As you can see, the whole engineering stack is disconnected: everybody is using a different language, and establishing collaboration across the stack gets challenging. What we are trying to achieve is the rightmost stack.

The disconnect can also be seen when hiring engineers

This is from actual job postings of a certain startup in the valley.

The UI engineer

  • Must haves: Javascript, Node.js, Bootstrap, Angular
  • Needs to interface with the backend in Java

The automation engineer

  • Must haves: Python or Java

Browser and UI automation requires a close collaboration between front-end and Selenium automation engineers. It does not make any sense to have our UI automation harness written in a heavy language like Java. Even more so when we are going full-stack with Node.js.

We were also not alone in this. Other companies like Netflix, PayPal and Yahoo have started to move their Selenium automation to Node.js as well.

Research

Almost all of the Selenium Webdriver examples online are in Java, Python or Ruby. To make things worse, googling “Selenium node.js” gives us tons of libraries to choose from. Not only is it hard to find good examples, but the space is very fragmented as well.

From this list we narrowed it down to the 3 most popular frameworks:

  1. NightwatchJS
  2. WebdriverIO
  3. Selenium-Webdriver (WebDriverJs)

It is also worth pointing out that WebdriverIO was previously named WebdriverJS. The evidence is here. If you google “WebDriverJs” hoping to find the official project, the results may point you to cached versions of WebdriverIO from when it was still named WebdriverJS. If you don’t know what you are looking for, you can go down the wrong rabbit hole pretty fast. Note that we skipped Protractor because we do not use AngularJS.

Challenges with Javascript

Asynchronous-ness

Javascript is asynchronous. Most UI developers are familiar with this, but I come from an automation background where I mostly worked with Python and Java, so the concept of non-blocking calls was new to me. I spent my first 2 weeks on Node.js learning this the hard way.

Take a look at the sample code below and its output. From looking at the code you would expect “Finished!!” to be printed last, after all of the contents of the array have been displayed. But that is not the case in Node.js: “Finished!!” is actually the second line to be printed out.

We simulated a non-blocking operation that takes 1000 ms to complete with the setTimeout() function, and within that context we do our array push operation. Asynchronous calls like this are non-blocking, which means that at runtime Javascript does not wait for the work to finish before executing the next line. From the output you can see that the contents of the array are printed first, then the string ‘Finished!!’, and the debug statements inside the loop come later.
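A minimal sketch of the behavior described above, reconstructed from the description rather than copied from the original listing. The names `DELAY_MS`, `printed`, `results` and `print` are assumptions made for illustration.

```javascript
const DELAY_MS = 1000;   // the simulated non-blocking work takes this long
const printed = [];      // records each printed line so the order can be inspected
const results = [];

function print(line) {
  printed.push(line);
  console.log(line);
}

[1, 2, 3].forEach(function (item) {
  // setTimeout() only schedules the callback and returns immediately
  setTimeout(function () {
    results.push(item);
    print('pushed ' + item);
  }, DELAY_MS);
});

// These two lines run before any of the scheduled callbacks fire.
print('results: ' + JSON.stringify(results));
print('Finished!!');

// Printed order:
//   results: []
//   Finished!!
//   pushed 1
//   pushed 2
//   pushed 3
```

The array is still empty when it is printed because none of the timers have fired yet; the pushes only happen about a second later.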

Getting familiar with asynchronous behavior is probably the biggest challenge for any automation engineer coming from a traditional synchronous programming language like Java, Python or Ruby.

Getting around asynchronous-ness

To tame asynchronous behavior, Javascript provides two popular mechanisms: callbacks and promises.

Note that we will only cover callbacks and promises. Nowadays there are also ES6 generators and async / await, but at the time we built our framework these were still new and not widely supported.

Callbacks

A callback is a function passed as the last parameter of another function, so that the next operation can be chained and fired off once the current one is done. This is the most primitive API structure in Javascript and things can get out of control pretty fast. See the example below of a callback pattern and the equivalent synchronous code.
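As a small illustration of the callback pattern, here is a sketch that uses a hypothetical `getTitle` helper (not part of any library) to simulate a non-blocking lookup. The synchronous equivalent is shown in comments.

```javascript
// Hypothetical helper: simulates a non-blocking lookup and hands the
// result to `callback` once it is ready.
function getTitle(callback) {
  setTimeout(function () {
    callback(null, 'Job list');   // Node.js convention: error first, result second
  }, 10);
}

// Callback style: the next step lives inside the function passed as the last argument.
getTitle(function (err, title) {
  if (err) throw err;
  console.log('title is ' + title);
});

// Equivalent synchronous style, as it would read in a blocking language:
//   var title = getTitle();
//   console.log('title is ' + title);
```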

Callbacks become hard to read and maintain once your code gets bigger. You will also fall into the pyramid of doom trap, or callback hell, pretty fast. See a real world example below.
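To show how quickly the pyramid forms, here is a sketch of a four-step login flow written with nested callbacks. The `step` and `runLogin` functions and the step names are hypothetical stand-ins for real UI actions.

```javascript
// Hypothetical step: finishes after a short delay, records its name, then calls back.
function step(name, log, callback) {
  setTimeout(function () {
    log.push(name);
    callback(null);
  }, 10);
}

function runLogin(done) {
  const log = [];
  step('open page', log, function (err) {
    if (err) return done(err);
    step('type username', log, function (err) {
      if (err) return done(err);
      step('type password', log, function (err) {
        if (err) return done(err);
        step('click login', log, function (err) {
          if (err) return done(err);
          done(null, log);   // four steps deep, and every new step indents further
        });
      });
    });
  });
}

runLogin(function (err, log) {
  if (err) throw err;
  console.log(log.join(' -> '));
});
```

Every additional step adds another level of indentation and another copy of the error check, which is what makes this style hard to maintain.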

Promises

Promises are a bit better: a promise object is returned, and a then() method can be chained for the next operation. The sequence is now somewhat manageable and not packed into one giant triangle. See the example below of a promise pattern and the equivalent synchronous code.

With promises, we still haven’t achieved the clarity of code written in synchronous languages, but we have made some progress. Below is an example.
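A sketch of a four-step login flow written as a promise chain. The `step` and `runLogin` functions and the step names are hypothetical stand-ins for real UI actions; only the standard `Promise` API is used.

```javascript
// Hypothetical step: returns a promise that resolves after a short delay.
function step(name, log) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      log.push(name);
      resolve(log);
    }, 10);
  });
}

function runLogin() {
  const log = [];
  // Each then() starts the next step only after the previous promise resolves.
  return step('open page', log)
    .then(function () { return step('type username', log); })
    .then(function () { return step('type password', log); })
    .then(function () { return step('click login', log); });
}

runLogin()
  .then(function (log) { console.log(log.join(' -> ')); })
  .catch(function (err) { console.error(err); });   // one handler covers every step

// Equivalent synchronous style:
//   openPage(); typeUsername(); typePassword(); clickLogin();
```

The flow stays flat and error handling collapses into a single catch(), but every step still has to be wrapped in a function and chained by hand.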

Deciding on libraries

After looking at how code composition is done in Javascript, let’s go back to our 3 finalists.

Both NightwatchJS and WebdriverIO have been very popular. They are both easy to set up and provide convenient chained APIs to get around asynchronous behaviour. Both implement Selenium’s JsonWireProtocol.

NightwatchJS

Pros

  • Convenient chain APIs, a workaround for dealing with callbacks
  • Some form of pageobject support
  • Some extendability with custom commands
  • Saucelabs / Browserstack integration out of the box
  • Good documentation

Cons

  • No visual testing capability
  • Locked into its own test runner

An example of a test written with NightwatchJS

WebdriverIO

Pros

  • Convenient chain APIs
  • Promises/A+ support
  • Excellent documentation
  • Saucelabs / Browserstack integration out of the box
  • Some form of visual testing capability based on WebdriverCSS
  • Works with other test runners like MochaJS

Cons

  • Hard to implement the pageobject pattern

An example of a test written with WebdriverIO

The downfall of chain-based APIs

While very easy to use from the start, the chain-based API structure turns out to be an Achilles heel.

Chain falls apart once we start to do complex things

  • A chain-based API is a particularly bad pattern when asynchronous calls are involved. As soon as we try to do something complex, like iterating, working with data structures, or doing things not supported by the given API (e.g. traversing an array of WebElements), we end up having to break the chain.
  • Once the chain is broken, we need to resort to callbacks / promises to frame the execution sequence. And it takes us right back to the pyramid again.

Below is an example of a NightwatchJS test grabbing an entry from a job list and asserting on the result.
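To make the shape of the problem concrete without depending on a real library, here is a toy chain-style client. This is not NightwatchJS code: `createClient`, `findJob` and every method on the client are hypothetical, and only mimic how a chain has to be broken once a test needs to iterate over results.

```javascript
// Toy stand-in for a chain-based client; every name here is hypothetical.
function createClient(rows) {
  const client = {
    log: [],
    open(url) { client.log.push('open ' + url); return client; },
    click(selector) { client.log.push('click ' + selector); return client; },
    // Reading data cannot both return the value and keep the chain going,
    // so the value is delivered to a callback instead.
    getRows(selector, callback) {
      setTimeout(function () { callback(rows); }, 10);
      return client;
    },
    getText(row, callback) {
      setTimeout(function () { callback(row.text); }, 10);
      return client;
    }
  };
  return client;
}

function findJob(client, wanted, done) {
  client
    .open('/jobs')
    .click('#refresh')
    .getRows('.job', function (rows) {          // the chain breaks here
      if (rows.length === 0) return done(false);
      let pending = rows.length;
      let found = false;
      rows.forEach(function (row) {
        client.getText(row, function (text) {   // second level of nesting
          if (text === wanted) found = true;
          pending -= 1;
          if (pending === 0) done(found);       // manual bookkeeping to detect completion
        });
      });
    });
}

const jobsClient = createClient([{ text: 'Survey A' }, { text: 'Survey B' }]);
findJob(jobsClient, 'Survey B', function (found) {
  console.log('job found: ' + found);
});
```

The simple actions chain nicely, but as soon as the test needs to look at each row, the code is back to nested callbacks plus hand-written completion tracking.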

Pageobject pattern and chain APIs don’t get along well

Most chain-based Selenium libraries give you just one main object (e.g. browser, client) and all interaction commands are tied to that object’s chain. The two libraries above provide the following UI context objects.

  • NightwatchJS : browser
  • WebdriverIO : client

It may be good for a small project starting out, but as you scale and start using the pageobject pattern, it’s not ideal. See the example below of a chain-based pageobject.

We can see a pattern forming: a chain within a chain. As a result, pageobjects are just methods that end up containing another chain. This does nothing to help with code composition and is still prone to the pyramid of doom once you do more complex operations.

A NightwatchJS pageobject method grabbing text fields from a row

We will be discussing pageobjects in Node.js in detail later, since this topic deserves its own post.

WebDriverJs

Then we looked at WebDriverJs, from the official Selenium project. WebDriverJs does not provide any chain APIs. The APIs are purely asynchronous. At a glance, it seems that tests written with WebDriverJs would be unreadable and hard to maintain, since we need to use callbacks or chain our own promises. Recall the first callback and promise examples above.

Callbacks

Promises

But after diving into more detail, we found that this library provides a promise manager that handles the scheduling and execution of all promises automatically. The promise manager is provided as a test wrapper for MochaJS. This makes the code very readable and synchronous-like. The WebDriver API is layered on top of this promise manager.

Code written with the promise manager

Our Javascript code actually looks very much like Java now. To achieve this, we replace Mocha’s describe and it methods with WebDriverJs’s promise-manager-wrapped versions. This is done by requiring the test object and using its describe and it methods.

// Use the promise-manager-wrapped describe/it instead of Mocha's globals
var test = require('selenium-webdriver/testing');
...
test.describe('', function() {
  test.it('', function() {
    ...
  });
});

A full example is below. There are no callbacks or manual promise chaining, and the code is very “synchronous”-like. Each line of code that returns a promise gets added to the queue, and everything runs top down just like in a synchronous language. Specifically, you don’t have to chain everything: each individual line of code can do just one UI action, followed by an assertion if necessary.
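As a rough illustration of the queueing idea only, here is a toy scheduler built on standard promises. It is not the WebDriverJs promise manager and does not use its API; `createFlow`, `execute` and `command` are hypothetical names. It shows how commands issued on separate lines can still run strictly in order.

```javascript
// Toy scheduler: every task is appended to one shared promise chain,
// so tasks run in the order they were issued.
function createFlow() {
  let tail = Promise.resolve();
  return {
    execute(task) {
      const result = tail.then(task);
      // Keep the queue usable if a task fails; the caller still sees the failure via `result`.
      tail = result.catch(function () {});
      return result;
    }
  };
}

// Hypothetical command: records its name after a delay, to stand in for a UI action.
function command(name, delayMs, log) {
  return function () {
    return new Promise(function (resolve) {
      setTimeout(function () {
        log.push(name);
        resolve(name);
      }, delayMs);
    });
  };
}

const flow = createFlow();
const log = [];

// Written top down, one action per line, with no chaining between the lines.
flow.execute(command('open page', 30, log));
flow.execute(command('type query', 10, log));
const done = flow.execute(command('click search', 1, log));

done.then(function () {
  console.log(log.join(' -> '));   // open page -> type query -> click search
});
```

Even though the later commands have shorter delays, they wait for the earlier ones, which is what lets test code read like a synchronous script.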

WebDriverJs is the winner for the following reasons.

  1. Code simplification and readability using the promise manager. I was not sold on how manageable the code will be with chained APIs.
  2. Since this is from the official Selenium project, the API is a close mirror of its Java, Python and Ruby counterparts. We get to work with WebElement and Driver objects, which are the de facto standard. This also makes things easier when we integrate with third party tools that expect these objects (e.g. Applitools visual validation, which will be covered in another post).
  3. Because of the above, on-boarding automation engineers who previously worked with legacy Selenium implementations should be easier. They just need to get up to speed with Javascript and its asynchronous nature. The rest should just be plug-and-play.

Quick comparison of all the libraries we evaluated

Lessons learnt

Looking back on how we started this journey, we probably spent a good month and a half doing prototypes and evaluating all the available Node.js Selenium libraries out there before finalizing on the decision. Here are some of the lessons learnt.

Easy to use does not mean maintainable or scalable

We were distracted by the ease of use of chain APIs and pretty documentation in the beginning. I find it really funny that the answer lay in the official Selenium project all along, and it was the last place we looked.

Ignore github stars and look at number of npm downloads

One of the mistakes I made early on was looking at how popular a project was on github to get a feel for the market. I have found the download count on npm to be a better indicator.

To effectively evaluate, implement a mini scale of the project

To get the level of confidence that we were on the right path, we implemented a smaller-scale version of the project, including all the use-cases to be fulfilled but in a basic, minimal form. We basically wrote a small UI automation framework with each of the three libraries: NightwatchJS, WebdriverIO and WebDriverJs. Then we simulated scaling different parts of the framework to get a big picture of what things would look like.

Forward looking

If you are looking to move to Node.js with your Selenium framework, look nowhere else; the official project is the way to go. As more companies adopt Node.js, I look forward to seeing more and more quality teams switch to Node.js for their Selenium automation framework. I truly believe that this is just the beginning of a technology shift that will impact how test automation is done. Stay tuned for the ride, it will definitely be a fun one.

Special thanks

SF Selenium Meetup Folks

  • Marcel Erz, Yahoo — Feedback on implementations of Webdriver-sync
  • Mary Ann May-Pumphrey — NightwatchJS feedback

Saucelabs
Initial feedback on selenium node.js

  • Neil Manvar
  • Kristian Meier
  • Adam Pilger
  • Christian Bromann — WebdriverIO


Mek Stittri

Engineering leadership at @airware, Automation & Continuous Delivery. Jitta seed investor, VMware & Tidemark alumni.