Applying Composition to Page Object Model

Filip Swiatczak
Published in Beargineer
4 min read, Dec 8, 2019

From locators to encapsulated web element objects that form a highly readable composition structure.

Hello! ✋

In this article I will show how applying composition addresses the lack of encapsulation and proper decoupling in a sample front-end automation project built with WebdriverIO and TypeScript.

The aim is to create custom classes that handle the types of UI element commonly encountered in your application under test (AUT), turning your Page Object into a readable, strongly typed, living document that describes the content of a given web page.

For clarity, I assume the reader is familiar with the Page Object model and has a basic understanding of JavaScript/TypeScript or another OOP language.

The accompanying Git repo illustrates the text below.

Problem:

A clean Page Object, one whose single responsibility is holding locators, offers a clear overview of the elements on a web page. However:

  1. It does not tell us much about what those locators represent.
  2. It offers no programmatic support for interacting with each element.

Example:

We can append the element type to the end of each declared name, but this reads poorly, makes it easy to pick the wrong element, and only hints at how to interact with it.
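To make the naming approach concrete, here is a minimal sketch of a locator-only Page Object. The page, the property names and the selectors are hypothetical, and `$` is a tiny stand-in for WebdriverIO's global `$` so the snippet runs without a browser.

```typescript
// Stand-in for WebdriverIO's global `$` so the sketch runs without a browser.
// It only records the selector; the real `$` returns a live element.
const $ = (selector: string) => ({ selector });

// A locator-only Page Object: the element type survives only as a name suffix.
class SearchPage {
  get searchInputField() { return $('input[name="q"]'); }
  get searchSubmitButton() { return $('button[type="submit"]'); }
  get sortOrderDropdown() { return $('select#sort'); }
}

const searchPage = new SearchPage();
// Nothing stops a test from treating the dropdown like a text field:
// the suffix is a hint to the reader, not a contract the compiler checks.
console.log(searchPage.sortOrderDropdown.selector);
```

All three getters return the same shape, so the compiler cannot tell a button from a dropdown. The suffix is the only type information there is.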

Long names do not scale well. A short Page Object with a handful of locators is crystal clear. One with 50+ locators, whose names spell out the stacked composition of the page, like this:

clientJuniorCardDetailsAccountNumberInputField = $('input[title="Account number"]');

is a nightmare to type and read. And this example represents only a moderate level of complexity once multiple accounts and configurations are under test.

The more important part of the problem: where do we put all the code that interacts with these elements?

Click, get text, tap and drag, wait for visible or invisible? Send value, or just add value? Not to mention various drop-downs and tables, with changing front end style libraries. That’s a considerable volume of code that we want encapsulated and not duplicated across the code base.

The aim is reducing maintenance cost!

Let’s briefly comment on common solutions:

  1. Put it in the Page Object. You bloat it with one or more methods for each declared locator, then proceed to copy-paste code forever after. No reusability.
  2. Put it in static methods, and you move away from object-oriented architecture. You lose visibility as complexity grows, and there is no method chaining.
  3. Use inheritance, e.g. from a Page class. Great at first, but it soon turns into Stockholm syndrome: you are held hostage, and any change to the parent object threatens to break the structure. There is no flexibility, so to keep up with change the parent ends up bloated and refactoring is a mess.

Solution:

Create a class for each commonly used type of web element and declare instances of those classes in the Page Object, replacing naked locators.

The single responsibility of each such class (e.g. Button, TextField, Dropdown) is to provide all the actions available on that element type and to hide implementation details specific to both the application and the automation framework.
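As a rough sketch of the idea, and not the code from the repo, two such classes might look like this. The method names (`type`, `read`, `isVisible`) are my own illustration. `FakeElement` and `$` are hand-written, synchronous stand-ins for WebdriverIO so the snippet runs without a browser; with WebdriverIO in async mode the calls would return promises and need `await`.

```typescript
// Hand-written stand-in for the element that WebdriverIO's `$` would return.
// Synchronous and in-memory, so the sketch runs without a browser.
interface FakeElement {
  click(): void;
  setValue(value: string): void;
  getValue(): string;
  isDisplayed(): boolean;
}

const clicks: string[] = [];
const values: Record<string, string | undefined> = {};

const $ = (selector: string): FakeElement => ({
  click: () => { clicks.push(selector); },
  setValue: (value: string) => { values[selector] = value; },
  getValue: () => values[selector] ?? '',
  isDisplayed: () => true,
});

// Each class exposes only the actions that make sense for its element type.
class Button {
  private readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
  click(): void { $(this.selector).click(); }
  isVisible(): boolean { return $(this.selector).isDisplayed(); }
}

class TextField {
  private readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
  type(text: string): void { $(this.selector).setValue(text); }
  read(): string { return $(this.selector).getValue(); }
}

const accountNumber = new TextField('input[title="Account number"]');
const save = new Button('#save');
accountNumber.type('12345678');
save.click();
console.log(accountNumber.read(), clicks);
```

Each class stores the selector and looks the element up when a method is called, which is one way to keep framework calls in a single place. If the framework or the front-end library changes, only these classes need to follow.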

Let’s see:

Sample Page Object with declarative component structure

The Page Object now tells us what element type each locator is! You can simply copy the web-elements folder from the Git repo to start using these classes.
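For readers without the repo at hand, here is a minimal sketch of what such a declarative Page Object could look like. The page, its selectors and the stripped-down `TextField`, `Button` and `Table` classes are hypothetical stand-ins, not the classes from the web-elements folder.

```typescript
// Hypothetical, stripped-down stand-ins for the web element classes.
// The `kind` literal keeps the three types distinct for the compiler.
class TextField {
  readonly kind = 'TextField';
  readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
}

class Button {
  readonly kind = 'Button';
  readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
}

class Table {
  readonly kind = 'Table';
  readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
}

// The Page Object only declares what is on the page and of which type.
class SearchPage {
  readonly field = {
    search: new TextField('input[name="q"]'),
  };
  readonly button = {
    submit: new Button('button[type="submit"]'),
  };
  readonly results = {
    table: new Table('#results'),
  };
}

const searchPage = new SearchPage();
console.log(
  searchPage.field.search.kind,
  searchPage.button.submit.kind,
  searchPage.results.table.kind,
);
```

Compared with suffixed names, the element type is now part of the declaration itself, so the editor and the compiler can both see it.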

Your step definition (business) layer now looks very readable:
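A small sketch of how that layer might read. The step is written as a plain function so the snippet stays runnable on its own; in a Cucumber project the same body would sit inside the framework's step callback. The page, the selectors and the recording `actions` array are hypothetical.

```typescript
// Records what the web elements were asked to do, instead of driving a browser.
const actions: string[] = [];

class TextField {
  private readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
  type(text: string): void { actions.push(`type "${text}" into ${this.selector}`); }
}

class Button {
  private readonly selector: string;
  constructor(selector: string) { this.selector = selector; }
  click(): void { actions.push(`click ${this.selector}`); }
}

// Page Object with grouped, typed elements (hypothetical page).
const searchPage = {
  field: { search: new TextField('input[name="q"]') },
  button: { submit: new Button('button[type="submit"]') },
};

// Stand-in for a step definition: it reads as page.group.element.action.
function userSearchesFor(term: string): void {
  searchPage.field.search.type(term);
  searchPage.button.submit.click();
}

userSearchesFor('page object model');
console.log(actions);
```

The step contains no selectors and no framework calls, only the names declared in the Page Object and the actions its element classes expose.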

Readability and auto-completion support!

Accessing an element now shows only the methods that are available for it and tailored to that part of your front end. You can add new classes, or add methods to existing web element classes, without breaking existing code. Decoupling, check.

Secondly, optional parameters can be passed as a JSON-style object into the constructor of each web element:

At a glance, a declared element now tells us what its type is, whether it requires extra time to execute properly, and more. This is a perfect place to pass extra handling logic for those tricky and pesky situations.

The redefined single responsibility of the Page Object is therefore to describe the interactions with each element, without any implementation details.
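Here is one way such options could work, as a sketch under assumptions. The option names `waitForVisibleMs` and `pauseAfterMs` are invented for illustration and are not taken from the repo. The `$` and `pause` functions are simplified stand-ins that only record calls, and their signatures are not WebdriverIO's exact API.

```typescript
// Hypothetical option names; the real project may use different ones.
interface ButtonOptions {
  waitForVisibleMs?: number; // wait up to this long for the element before acting
  pauseAfterMs?: number;     // extra settle time after the action
}

// Simplified stand-ins that record calls instead of driving a browser.
const calls: string[] = [];
const $ = (selector: string) => ({
  waitForDisplayed: (timeoutMs: number) => { calls.push(`wait ${selector} ${timeoutMs}`); },
  click: () => { calls.push(`click ${selector}`); },
});
const pause = (ms: number): void => { calls.push(`pause ${ms}`); };

class Button {
  private readonly selector: string;
  private readonly options: ButtonOptions;

  constructor(selector: string, options: ButtonOptions = {}) {
    this.selector = selector;
    this.options = options;
  }

  click(): void {
    const element = $(this.selector);
    // Extra handling only runs for elements that were declared as needing it.
    if (this.options.waitForVisibleMs !== undefined) {
      element.waitForDisplayed(this.options.waitForVisibleMs);
    }
    element.click();
    if (this.options.pauseAfterMs !== undefined) {
      pause(this.options.pauseAfterMs);
    }
  }
}

// Declared at Page Object level: the slow button carries its own handling.
const save = new Button('#save', { waitForVisibleMs: 5000 });
const cancel = new Button('#cancel');

save.click();
cancel.click();
console.log(calls);
```

The step definitions call `click()` in both cases. Whether an element needs a wait is a property of its declaration, not something each test has to remember.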

As a cherry on top, let’s wrap those new locators in grouping objects (for example field, button, results) and we have an auto-completion friendly structure that:

  1. is very descriptive, breaking long names into a dot-separated, chunked decision tree.
  2. separates the implementation layer (web elements) from the business layer (step definitions).
  3. is friendly to less technical or junior co-workers, allowing them to create new tests with less training and more confidence.

To finish off, let’s take a closer look at the Dropdown class:

Dropdown class contains three ways to select values from a dropdown

Each of the 8 exposed functions checks for the optional wait and visibility parameters, which are passed into the constructor at the Page Object level.

Three select functions allow targeting by display value, index position or a specified attribute value. This is the core interaction we expect to perform on this type of element. Implementing IContainer exposes four visibility functions as a convenience for additional checks while the application is in a transition state.
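The shape described above can be sketched roughly as follows. This is an illustration, not the class from the repo: the method names are guesses, the `IContainer` members are hypothetical, and the element is an in-memory fake. A real implementation would more likely delegate to WebdriverIO's `selectByVisibleText`, `selectByIndex` and `selectByAttribute`, and its wait functions would poll instead of failing immediately.

```typescript
// In-memory fake of a <select> element, so the sketch runs without a browser.
interface FakeOption { text: string; attrs: Record<string, string | undefined>; }
interface FakeSelect { options: FakeOption[]; selectedIndex: number; displayed: boolean; }

const fakeDom: Record<string, FakeSelect | undefined> = {
  '#sort': {
    options: [
      { text: 'Relevance', attrs: { value: 'rel' } },
      { text: 'Newest first', attrs: { value: 'new' } },
      { text: 'Price', attrs: { value: 'price' } },
    ],
    selectedIndex: 0,
    displayed: true,
  },
};

const $ = (selector: string): FakeSelect => {
  const found = fakeDom[selector];
  if (!found) throw new Error(`no element matches ${selector}`);
  return found;
};

// Hypothetical member names for the four visibility functions.
interface IContainer {
  isVisible(): boolean;
  isNotVisible(): boolean;
  waitForVisible(): void;
  waitForNotVisible(): void;
}

class Dropdown implements IContainer {
  private readonly selector: string;
  constructor(selector: string) { this.selector = selector; }

  // Three ways to select: display value, index position, attribute value.
  selectByText(text: string): void { this.pick((o) => o.text === text, `text "${text}"`); }
  selectByIndex(index: number): void { this.pick((_o, i) => i === index, `index ${index}`); }
  selectByAttribute(name: string, value: string): void {
    this.pick((o) => o.attrs[name] === value, `${name}="${value}"`);
  }
  selectedText(): string {
    const element = $(this.selector);
    return element.options[element.selectedIndex]?.text ?? '';
  }

  isVisible(): boolean { return $(this.selector).displayed; }
  isNotVisible(): boolean { return !$(this.selector).displayed; }
  // Simplification: these check once and throw; a real version would poll.
  waitForVisible(): void { if (!this.isVisible()) throw new Error(`${this.selector} is not visible`); }
  waitForNotVisible(): void { if (this.isVisible()) throw new Error(`${this.selector} is still visible`); }

  private pick(match: (option: FakeOption, index: number) => boolean, what: string): void {
    const element = $(this.selector);
    const index = element.options.findIndex(match);
    if (index < 0) throw new Error(`${this.selector}: no option with ${what}`);
    element.selectedIndex = index;
  }
}

const sortOrder = new Dropdown('#sort');
sortOrder.selectByText('Price');
console.log(sortOrder.selectedText());
```

All three select methods funnel into one private helper, so a change in how the application renders its dropdowns touches a single place.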

You can tailor this to match exactly what dropdowns are like in your application under test!

And when that application changes, you only need to update a single class.

Thank you for reading!

Filip Swiatczak
Beargineer

Developer, tinkerer, automation enthusiast. Designs test architecture at AJ Bell.