Automation testing, scraping and more with Selenium and Python.
Selenium started as a browser extension and is now one of the most popular frameworks for testing web-based applications. In this article, you’ll learn what Selenium is, how to use it, and how to take your first steps with it.
Selenium Basics
Selenium is an open-source framework and consists of the following components:
1. Selenium WebDriver: the main component, which automatically performs the user’s actions in the browser.
2. Selenium Grid: the component that allows multiple instances to run simultaneously — on multiple devices, operating systems, and browsers. This makes it possible to run tests on as many platforms as possible, which is a prerequisite for successful automation testing.
3. Selenium IDE: records and replays the user’s actions in the browser.
The WebDriver communicates with the browser via a browser-specific driver, which executes the commands and sends HTTP responses back to the WebDriver in the opposite direction. This simple structure, coupled with an open architecture and cross-platform, cross-language support, attracts many users and developers and makes it easy to build extensions.
Let’s get started!
First of all, the setup: We need Selenium — which we can download and install in the terminal with "python -m pip install selenium". Next, we need the already mentioned webdriver. I use Google Chrome for this project and you can find the Chrome webdriver here. We start a new Python project, import Selenium with the first lines of code, select webdriver.Chrome, and pass the path where the chromedriver is located.
from selenium import webdriver
driver = webdriver.Chrome('/Users/NAME/Desktop/chromedriver')
Here you can already test if everything was imported correctly: Running the code should open a new Chrome window with the message: “Chrome is being controlled by automated test software”.
To get a first understanding of how Selenium works and to start learning the syntax — let’s start with a very simple example. To call a page, we use “get” on the “driver” we just defined to open the desired URL:
driver.get('https://www.google.com/xhtml')
To do an automated Google search, we have to find the elements needed to perform a search. To find out which elements we need, right-click on the page and choose “View Page Source” (in the Chrome browser) to see the HTML code. How HTML documents are structured and how to navigate them, I have described in a previous article using a scraping example with BeautifulSoup.
To use the elements we need in our code, we call the function "find_element_by_name", which selects elements on the page by name. Side info: We are using "google.com/xhtml" so we can read Google’s HTML code more easily. A short search shows us that the name of the search bar element is simply "q".
Now we select the element via "find_element_by_name" and enter the desired text with "send_keys":
search = driver.find_element_by_name('q')
search.send_keys('Our First Google Search')
To make our automation more “human” we add wait times between commands. For this we first have to import "time" and set a pause of e.g. 4 seconds after calling the web page by entering the following:
import time
time.sleep(4)
The last part we have to add to our code is submitting the search with:
search.submit()
This line simulates pressing the Enter key, which brings us to the search results.
To end a session (caution: the window closes immediately, so include a sleep time to see the result before the window disappears) we use the quit function:
driver.quit()
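Putting the steps above together, the whole flow looks roughly like this. This is a sketch using the Selenium 3 style API from this article; the chromedriver path is a placeholder you need to adjust, and the Selenium import sits inside the function so the sketch can be defined even before Selenium is installed:

```python
import time


def automated_google_search(query):
    """Sketch of the full flow: open Google, type a query, submit, quit."""
    # Imported here so this sketch can be defined without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome('/Users/NAME/Desktop/chromedriver')  # adjust path
    driver.get('https://www.google.com/xhtml')
    time.sleep(4)  # act a little more "human"

    search = driver.find_element_by_name('q')  # the search bar is named "q"
    search.send_keys(query)
    search.submit()  # simulates pressing Enter

    time.sleep(4)  # keep the results visible before the window closes
    driver.quit()
```

Calling automated_google_search('Our First Google Search') runs the whole sequence in one go.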
There are, of course, a variety of other ways to select the required element in the code — "find_element_by_name" is just one option. Other common possibilities are "find_element_by_id", "find_element_by_xpath", "find_element_by_class_name", and others. Detailed Selenium documentation can be found here.
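To illustrate, the same Google search bar can be located in several of these ways. The XPath and CSS selector below are illustrative assumptions based on the element’s name attribute; always check the page source yourself:

```python
def find_search_bar(driver):
    """Locate the same element with different strategies (Selenium 3 style).

    The selectors below all target an <input> with name="q", as found
    on Google's front page at the time of writing.
    """
    by_name = driver.find_element_by_name('q')
    by_xpath = driver.find_element_by_xpath("//input[@name='q']")
    by_css = driver.find_element_by_css_selector("input[name='q']")
    # All three return the same element; pick whichever is most stable.
    return by_name
```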
With these tools, any element can be found in the HTML code, incorporated into the Python code, and used flexibly to create an automated flow. In the same way, one can automatically log in to web pages by selecting and filling in the individual elements — take the Facebook website as an example.
A quick look at the front page source reveals that the email field is hidden under id="email" and the password field under the name "pass". All you have to do is "send_keys" to those elements and once again press Enter by submitting.
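A hedged sketch of such a login flow, assuming the id="email" and name="pass" fields described above (these attributes may change at any time, so verify them in the page source first):

```python
import time


def facebook_login(driver, email, password):
    """Sketch of an automated login using the element ids/names
    found on Facebook's front page: id="email", name="pass"."""
    driver.get('https://www.facebook.com')
    time.sleep(2)  # give the page a moment to load

    driver.find_element_by_id('email').send_keys(email)
    password_field = driver.find_element_by_name('pass')
    password_field.send_keys(password)
    password_field.submit()  # "press Enter", as in the Google example
```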
I would like to end this article with more useful functions that can be used in creative ways to customize your individual automation process (the names below are from the Python bindings; other language bindings use similar names such as "getCurrentUrl()" or "getText()"):
- "driver.current_url" — gets the current URL as a string, which can be useful when you’re traveling through websites and want to crawl them
- "element.text" — gets the inner text of a web element you pick — great for automated crawling of strings and characters
- "driver.implicitly_wait()" — sometimes the webpage needs some time to load and your driver tries to find an element that is not loaded yet. An implicit wait tells the driver to keep looking for the element for up to the number of seconds you pass in.
- "visibility_of_element_located()" paired with "WebDriverWait(...).until()" — pretty much does the same, but instead of waiting a fixed number of seconds it waits for a certain element to become visible. This comes in handy when the loading time is long and you don’t know how much waiting time to set.
- "driver.save_screenshot()" (or "get_screenshot_as_file()") — takes a screenshot at that moment, which can be used for checks later on. Don’t forget to pass a filename including the format extension.
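In the Python bindings, the explicit-wait helpers live in selenium.webdriver.support. A sketch of waiting for an element and reading its text could look like this; the element id passed in is whatever you found in the page source:

```python
def get_text_when_visible(driver, element_id, timeout=10):
    """Wait up to `timeout` seconds for the element with the given id
    to become visible, then return its inner text."""
    # Imported here so this sketch can be defined without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    wait = WebDriverWait(driver, timeout)
    element = wait.until(EC.visibility_of_element_located((By.ID, element_id)))
    return element.text
```

Unlike a fixed time.sleep(), this returns as soon as the element appears and raises a TimeoutException if it never does.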
With all these functions you can build a program that can: automatically navigate through web pages, check web-app-functions and extract and save information to detect errors and improve performance.
As always: Thanks for reading!