Take a step back in history with the archives of PragPub magazine. The Pragmatic Programmers hope you’ll find that learning about the past can help you make better decisions for the future.

FROM THE ARCHIVES OF PRAGPUB MAGAZINE AUGUST 2010

Page Objects in Python: Automating Page Checking without Brittleness

By Adam Goucher

Published in The Pragmatic Programmers, May 12, 2022

The Page Object pattern is the key to implementing smart automated checks. Here’s how Python programmers can make use of it.

Automating checks* at the UI level has earned a bad reputation in certain circles, primarily due to the associated maintenance costs. The bad reputation is deserved, in large part. Almost everyone who has done UI automation has done some variation of the familiar “the UI folks changed the username box to the loginname box and now everything has broken” dance. Elements on the page are always in a state of change — and should be. If they aren’t, then your product has stopped evolving. The challenge is to respond to those changes in a way that decreases the maintenance burden.

*Think of a check as a test done purely to confirm an assumption, which means that it is probably easy to do by machine — see this DevelopSense explanation.

As an automation practitioner and consultant, I find that the easiest way to handle these changes is to bundle up oft-used clusters of commands and operations into functions. Your script then becomes a mix of framework calls and application-specific functions that abstract operations.

Brittle or Better?

Compare these two snippets of Python from a Selenium script.

Brittle:

s.open("/login") 
s.type("username", "pragmatic") 
s.type("password", "magazine") 
s.click("login") 
s.wait_for_page_to_load("30000")

Better:

def login(s, username, password):
    s.open("/login")
    s.type("username", username)
    s.type("password", password)
    s.click("login")
    s.wait_for_page_to_load("30000")

login(s, "pragmatic", "magazine")

Helper functions should not be limited to just actions. They can also be used for interactions with list boxes, radio button groups, and so on.

def select_country_by_label(s, label): 
    s.select("country", "label=%s" % label)
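To see what such helpers buy you, they can be exercised without a browser at all. The sketch below assumes a hypothetical RecordingSelenium class (it is not part of Selenium) that simply records every command it receives; the two helpers are the ones shown above.

```python
class RecordingSelenium(object):
    """Hypothetical stand-in for a Selenium connection; it only records calls."""

    def __init__(self):
        self.commands = []

    def __getattr__(self, name):
        # Any command (open, type, click, select, ...) is recorded, not executed.
        def record(*args):
            self.commands.append((name,) + args)
        return record


def login(s, username, password):
    s.open("/login")
    s.type("username", username)
    s.type("password", password)
    s.click("login")
    s.wait_for_page_to_load("30000")


def select_country_by_label(s, label):
    s.select("country", "label=%s" % label)


s = RecordingSelenium()
login(s, "pragmatic", "magazine")
select_country_by_label(s, "Canada")
for command in s.commands:
    print(command)
```

If the username box is renamed, only the locator inside login changes; every script that calls login stays untouched.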

This style of script design comes naturally in scripting languages like Perl, Ruby, and Python, where free-standing functions are the norm, but less so in more strictly object-oriented languages like C# or Java.

Enter the Page Object.

The Page Object Pattern

The Page Object pattern comes out of the WebDriver project* as a way of encapsulating the intent of and interaction with an individual page in your application in a single object. The original implementation of the Page Object pattern was in Java, but recently reference implementations have appeared for Java, C#, and Ruby. But to the best of my knowledge, Python did not have such an example. Until now. The remainder of this article shows how I have successfully used the Page Object pattern in Python (using Selenium, but the approach is driver neutral).

*In 2009 WebDriver merged with Selenium to form the basis of Selenium 2.

The first thing we do is to distinguish between the page itself and the elements on that page. This difference in turn will determine our inheritance model.

Pages contain both actions implemented as methods (example: login_page.submit()) and web elements, which are themselves separate objects.
Web Elements are the parts of the page that are interacted with to either set up an action or to check the result of one — such as check boxes, text fields, radio buttons, and so on.

With these two branches determined, and throwing in the desire to do asserts on pages as well as to make use of as much encapsulation as possible, we arrive at this tree:

Object
  +-- unittest.TestCase
    +-- BasePageObject
  +-- BasePageElement

This is a good time to detour slightly around how things get organized on disk. When implementing Page Objects in Python, I have found it useful to make use of Packages to organize things. So with our two base classes, things look like this:

pageobjects/
    __init__.py
    basepageobject.py 
    basepageelement.py

Right now, the BasePageObject class need not do anything other than exist. What the BasePageElement class looks like will be cleared up shortly.

import unittest

class BasePageObject(unittest.TestCase):
    pass

This allows us to use it as a super class to our actual Page Objects and get all the asserts that come with a TestCase. Here is a Page Object that exists in a majority of systems: the login page.

from pageobjects import locators, selenium_server_connection
from pageobjects.basepageobject import BasePageObject
from pageobjects.basepageelement import BasePageElement


class UsernameElement(BasePageElement):

    def __init__(self):
        self.locator = locators["login.username"]

    def __set__(self, obj, val):
        # Assigning to the attribute types the value into the field.
        se = selenium_server_connection.connection
        se.type(self.locator, val)


class PasswordElement(BasePageElement):

    def __init__(self):
        self.locator = locators["login.password"]

    def __set__(self, obj, val):
        se = selenium_server_connection.connection
        se.type(self.locator, val)


class LoginPageObject(BasePageObject):

    username = UsernameElement()
    password = PasswordElement()

    def __init__(self, se):
        self.se = se
        self.se.open("/login")
        self.assertEqual("My Application - Login", self.se.get_title())

    def submit(self):
        # JavaScript condition that holds once the logout button is on the page.
        wait_for = ("selenium.browserbot.getCurrentWindow()"
                    ".document.getElementById('LogoutButton')")
        self.se.click(locators["login.submit"])
        self.se.wait_for_condition(wait_for, "30000")

Don’t write all the element classes for your page up front. Instead, add them incrementally as they are needed. Even though the login page might have a clear button and forgot-password functionality, neither is defined here, because no script needs them yet.

Working from bottom to top, the LoginPageObject has a submit action, which is defined as a method of the class. When called, it clicks a button and waits for another button to appear. Which button Selenium “clicks” is part of the reason for organizing Page Objects into packages.

Python packages give us a nice dotted notation to refer to modules within the package, but if you want something to be in the main namespace, you put it in the package’s __init__.py. One of the things that should be in there is a dictionary of all the locators throughout your application. Yes, it will get large, but it also means they are in a single spot when they need editing.

locators = {} 
locators["login.username"] = "username" 
locators["login.password"] = "password" 
locators["login.submit"] = "login"

Because there will be a lot of them, a good habit to get into is to use page.thing as the key. And make the key something that describes the intent of the field, as the key should not change once it is used. Its value will, however, change at the whims of the UI developers.

These three entries happen to use the Selenium identifier locator (the default), but they could just as well have been XPath or CSS locator strings.
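As a rough illustration, the same dictionary could mix strategies. The sketch below assumes the locator prefixes documented for Selenium RC (identifier, id, name, dom, xpath, link, css); the selector values and the strategy_of helper are invented for this example and are not part of Selenium.

```python
# Hypothetical locator values showing three Selenium RC locator strategies.
locators = {
    "login.username": "username",                      # identifier (the default)
    "login.password": "css=input[name='password']",    # CSS selector
    "login.submit": "xpath=//input[@type='submit']",   # XPath expression
}

KNOWN_STRATEGIES = ("identifier", "id", "name", "dom", "xpath", "link", "css")


def strategy_of(locator):
    """Report which strategy a locator string selects (illustration only)."""
    prefix, separator, _ = locator.partition("=")
    if separator and prefix in KNOWN_STRATEGIES:
        return prefix
    if locator.startswith("//"):
        return "xpath"       # a bare XPath expression is recognized by "//"
    if locator.startswith("document."):
        return "dom"
    return "identifier"      # anything else is matched by id, then by name


for key in sorted(locators):
    print(key, "->", strategy_of(locators[key]))
```

Whatever strategy a value uses, the key (login.username and so on) stays the same, so scripts and Page Objects never need to know which one is in play.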

The LoginPageObject also has its own custom constructor that takes a connection to the Selenium RC server (which was created in the original script) and stores it as an instance attribute. It also navigates to the login page. Another way to write this page navigation is to check whether you are already on the page and skip the navigation if you are.

def __init__(self, se):
    self.se = se
    try:
        # Already on the login page? Then there is nothing to do.
        self.assertEqual("My Application - Login", self.se.get_title())
    except AssertionError:
        self.se.open("/login")
        self.se.wait_for_page_to_load("30000")
        self.assertEqual("My Application - Login", self.se.get_title())

This second style of constructor can be more efficient for pages that are navigated to from within your application rather than pages you access directly.
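One caveat for readers trying these constructors on a current Python 3: there, the assert methods appear to rely on state that unittest.TestCase.__init__ sets up, so a Page Object with its own constructor should call the parent constructor first. The sketch below is a minimal illustration under that assumption; TitleCheckingPage and its title argument are hypothetical and stand in for a real page backed by Selenium.

```python
import unittest


class BasePageObject(unittest.TestCase):
    """Base class for Page Objects; inherits the TestCase assert methods."""

    def __init__(self):
        # Assumption: Python 3, where TestCase.__init__ prepares the state
        # that assertEqual and friends rely on.
        super().__init__()


class TitleCheckingPage(BasePageObject):
    """Hypothetical page that checks a title handed to it directly."""

    def __init__(self, title):
        super().__init__()
        self.assertEqual("My Application - Login", title)


page = TitleCheckingPage("My Application - Login")   # passes silently

try:
    TitleCheckingPage("Some Other Page")
except AssertionError as error:
    print("wrong page:", error)
```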

The Magic

The two lines that define the username and password attributes of our class are where things start to get really interesting. Note that we define them not as strings or integers but as descriptor objects. Simply put, a descriptor is an object whose class defines __get__, __set__, or __delete__, so that reading, assigning, or deleting the attribute it is bound to runs that code instead of the default behavior. This overriding is the magic of Page Objects in Python.
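The descriptor mechanism can be tried on its own, with no Selenium involved. In this minimal sketch, FieldElement, LoginForm, and the fake_page dictionary are hypothetical names; the dictionary plays the role of the browser page.

```python
class FieldElement(object):
    """A descriptor: reads and writes of the attribute are routed through here."""

    def __init__(self, locator):
        self.locator = locator

    def __get__(self, obj, cls=None):
        if obj is None:
            return self                    # accessed on the class itself
        return obj.fake_page.get(self.locator, "")

    def __set__(self, obj, val):
        # Instead of storing the value on the object, "type" it into the page.
        obj.fake_page[self.locator] = val


class LoginForm(object):
    username = FieldElement("username")    # defined on the class, not the instance

    def __init__(self):
        self.fake_page = {}


form = LoginForm()
form.username = "pragmatic"                # runs FieldElement.__set__
print(form.fake_page)                      # {'username': 'pragmatic'}
print(form.username)                       # runs FieldElement.__get__: pragmatic
print("username" in vars(form))            # False: nothing was stored on the instance
```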

Looking at the UsernameElement class, we see that it too has a custom constructor, whose purpose is to store the locator the object refers to. The __set__ method gets run any time the value of LoginPageObject.username is assigned, and rather than storing that value, it will type it in the field on the page. Because UsernameElement has BasePageElement as its super class, and there isn’t anything special about this particular element, we can rely on BasePageElement’s __get__ and __delete__ implementations.

from pageobjects import selenium_server_connection


class BasePageElement(object):

    def __get__(self, obj, cls=None):
        # Read the element's text from the page via the shared connection.
        return selenium_server_connection.connection.get_text(self.locator)

    def __delete__(self, obj):
        pass

In your BasePageElement class, have __get__ read whichever kind of field is most prominent in your application. For this purpose, we’ll pretend that most elements in this application are text fields.

❗ Warning: If you are trying to do a get operation on one of the descriptors and it is not a text field, you will have to define the get behavior in the specific element class itself.

Just as an element’s set behavior is to interact with the browser and insert text onto the page, the get behavior is to read it from the page.
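For an element that is not a text field, the override might look like the following sketch. RememberMeElement, its remember_me locator, and FakeConnection are hypothetical; the fake stands in for the shared Selenium connection and mimics the check, uncheck, and is_checked commands of the Selenium RC client.

```python
class FakeConnection(object):
    """Hypothetical stand-in for the Selenium connection, backed by a dict."""

    def __init__(self):
        self.checked = {}

    def get_text(self, locator):
        return ""

    def check(self, locator):
        self.checked[locator] = True

    def uncheck(self, locator):
        self.checked[locator] = False

    def is_checked(self, locator):
        return self.checked.get(locator, False)


connection = FakeConnection()


class BasePageElement(object):

    def __get__(self, obj, cls=None):
        return connection.get_text(self.locator)     # default: text fields

    def __delete__(self, obj):
        pass


class RememberMeElement(BasePageElement):
    """Hypothetical check box; overrides __get__ because it has no text."""

    def __init__(self):
        self.locator = "remember_me"

    def __get__(self, obj, cls=None):
        return connection.is_checked(self.locator)

    def __set__(self, obj, val):
        if val:
            connection.check(self.locator)
        else:
            connection.uncheck(self.locator)


class LoginPage(object):
    remember_me = RememberMeElement()


page = LoginPage()
page.remember_me = True
print(page.remember_me)
```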

The final bit of infrastructure needed by this implementation of the pattern is to be able to share the connection to the Selenium server between objects that might not normally have access to it. This is best illustrated by first showing a pyunit script that makes use of the Page Objects defined so far.

from pageobjects.login import LoginPageObject
from pageobjects import selenium_server_connection

import sys, unittest, re, time, os.path, logging


class PageObjectExample(unittest.TestCase):

    def setUp(self):
        self.log = logging.getLogger("pragmatic.pageobjectexample")
        self.verificationErrors = []
        # Connect through the package-level wrapper so Page Objects can share it.
        self.selenium = selenium_server_connection.connect(
            "localhost", 4444, "*chrome", "http://some.test.site")
        self.selenium.start()

    def testLogin(self):
        lpo = LoginPageObject(self.selenium)
        lpo.username = "adam@element34.ca"
        lpo.password = "password"
        lpo.submit()

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)


if __name__ == "__main__":
    unittest.main()

This is a pretty standard-looking Selenium/pyunit script, except that the way the Selenium server is accessed has been replaced with a wrapper that creates a singleton local to our package.

To achieve this there is code in two places. In pageobjects/__init__.py:

from pageobjects.seleniumwrapper import SeleniumWrapper
    #
selenium_server_connection = SeleniumWrapper()

and in pageobjects/seleniumwrapper.py, where the actual connection establishment is buried.

from selenium import selenium


class SeleniumWrapper(object):
    # singleton
    _instance = None

    def __new__(cls, *args, **kwargs):
        if not cls._instance:
            # object.__new__ takes only the class, so extra arguments are not forwarded.
            cls._instance = super(SeleniumWrapper, cls).__new__(cls)
        return cls._instance

    def connect(self, host, port, browser, server):
        self.connection = selenium(host, port, browser, server)
        return self.connection
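The singleton behavior is easy to check in isolation. In this sketch the wrapper is the same shape as the one above, except that connect stores a plain tuple as a stand-in for the real selenium(...) connection object, so it runs without a Selenium server.

```python
class SeleniumWrapper(object):
    # singleton
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super(SeleniumWrapper, cls).__new__(cls)
        return cls._instance

    def connect(self, host, port, browser, server):
        # Assumption for this sketch: a tuple stands in for the real connection.
        self.connection = (host, port, browser, server)
        return self.connection


first = SeleniumWrapper()
first.connect("localhost", 4444, "*chrome", "http://some.test.site")

second = SeleniumWrapper()        # for example, created in a different module
print(first is second)            # True: both names refer to the one instance
print(second.connection[0])       # localhost
```

Because every module gets the same instance, an element class can reach the connection that the test script established without having it passed in.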

With all this in place, let’s walk through the execution of this script as a way of tying all the code bits together.

Taking It Step by Step

The Python interpreter parses the file, checking for syntax errors, imports a bunch of stuff into scope and learns how to create a PageObjectExample class that has unittest.TestCase as a super class.
It also determines that what it is supposed to execute is unittest.main(), which is when the fun starts.
The pyunit framework scans the file for TestCase classes and when it finds them looks at whether they have any methods that begin with “test.”
Since PageObjectExample has testLogin, it then executes the setUp method, which does a few things, but most importantly uses our wrapper around the Selenium server connection to establish communication with the running Selenium RC server.
Then the actual test method is called, the first line of which creates a LoginPageObject. Its UsernameElement and PasswordElement descriptors are class attributes, so they were already created when the LoginPageObject class was defined at import time.
The next two lines trigger the __set__ methods of the UsernameElement and PasswordElement objects to enter the username and password into the page.
The page is then submitted.
If this were a real script, there would be some assertions and checks into the database to ensure things behaved as expected.
Now that the test method is complete, the tearDown method is executed.
And lastly, since there are no other methods prefixed with “test,” results are displayed to the user.
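The steps above can be followed end to end without a Selenium server by swapping in a fake. This is only a sketch, assuming Python 3: FakeSelenium, Holder, and TextElement are hypothetical names, Holder plays the role of the package-level selenium_server_connection, and the fake simply pretends that every login succeeds.

```python
import unittest


class FakeSelenium(object):
    """Hypothetical in-memory stand-in for the Selenium RC connection."""

    def __init__(self):
        self.fields = {}
        self.title = ""
        self.log = []

    def start(self):
        self.log.append("start")

    def stop(self):
        self.log.append("stop")

    def open(self, url):
        self.title = "My Application - Login"

    def get_title(self):
        return self.title

    def type(self, locator, value):
        self.fields[locator] = value

    def click(self, locator):
        self.title = "My Application - Home"    # pretend the login succeeded


class Holder(object):
    connection = None        # plays the role of selenium_server_connection


class TextElement(object):

    def __init__(self, locator):
        self.locator = locator

    def __set__(self, obj, val):
        Holder.connection.type(self.locator, val)


class LoginPageObject(unittest.TestCase):
    username = TextElement("username")
    password = TextElement("password")

    def __init__(self, se):
        super().__init__()   # Python 3: lets the assert methods work
        self.se = se
        self.se.open("/login")
        self.assertEqual("My Application - Login", self.se.get_title())

    def submit(self):
        self.se.click("login")


class PageObjectExample(unittest.TestCase):

    def setUp(self):
        Holder.connection = self.selenium = FakeSelenium()
        self.selenium.start()

    def testLogin(self):
        lpo = LoginPageObject(self.selenium)
        lpo.username = "adam@element34.ca"
        lpo.password = "password"
        lpo.submit()
        # The kind of check a real script would add after submitting.
        self.assertEqual("My Application - Home", self.selenium.get_title())

    def tearDown(self):
        self.selenium.stop()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PageObjectExample)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```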

Page Objects are definitely coming to the fore as a way of managing the maintenance costs of testing at the UI level, but there is a non-trivial technology cost associated with them. For people new to programming, I might still steer them towards just abstracting things into functions rather than objects. But for people with OO understanding and knowledge, or if you are automating an application that undergoes constant disruptive change, then Page Objects is absolutely a pattern worth considering.

About Adam Goucher

Adam Goucher has been testing professionally for over twenty years at a range of organizations from national banks to startups. A large part of that time has been spent augmenting his exploratory testing with automation. He is the maintainer of Selenium IDE and consults on automation through his company, Element 34.