Get Started with Selenium WebDriver in Under 5 Minutes

Web testing your code to download files and more

Stephanie Bourdeau
The Startup
5 min read · Nov 19, 2019

Through Selenium, Python can be used to automate various web browsers

WebDriver is an open source framework from the Selenium project that launches a web browser through that browser’s driver server and controls it programmatically, from typing text into fields to clicking through a page during exploratory testing. Rather than injecting JavaScript into the page, Selenium WebDriver drives the browser through its native automation support, much as a user would.

In this quick tutorial, I will walk through how I used Python, XPath, and ChromeDriver, along with Selenium, to automate my browsing experience. Specifically, I will show how to download zip files of comma-separated values (CSV) data on water quality, filtered by user-specified parameters, from the Water Quality Portal sponsored by the United States Geological Survey (USGS). Lastly, I will build on the Selenium workflow and use ZipFile to extract the download into a readable format.

Requirements For Selenium Web Testing

The first requirement is installing Selenium through pip, Python’s standard package installer (pip install selenium), and importing the pieces the script needs. Selenium is the overall framework for testing web applications, so webdriver must be imported after installation to control the browser instance. The Keys class is imported to send keyboard keys such as Return, Escape, and Command. If you are using a Jupyter notebook, the kernel may need to be restarted before the newly installed package can be imported; the notebook will prompt you if so.

The next step is to decide which browser you intend to automate and to download the matching web driver directly from the browser vendor. For this tutorial, we will use ChromeDriver 79.0.3945.36, as it corresponds to my version of Google Chrome, version 79. After downloading the driver, I located its file path and passed that path to WebDriver when creating my browser variable, which initializes the browser instance. Running just that first line, the one that creates the browser variable, opens a Chrome window controlled by Selenium WebDriver.
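A minimal sketch of those two lines, wrapped in a function. It assumes Selenium 4, where the driver path is handed over through a Service object (Selenium 3 accepted the path directly). The function name, the chromedriver_path argument, and the portal address are my assumptions, not values taken from the original script.

```python
# Assumed address of the Water Quality Portal; adjust if the site has moved.
PORTAL_URL = "https://www.waterqualitydata.us/portal/"


def open_portal(chromedriver_path, url=PORTAL_URL):
    """Open a Selenium-controlled Chrome window and load the portal page."""
    # Imported inside the function so this module still loads where
    # Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Line 1: start Chrome through the downloaded ChromeDriver binary.
    browser = webdriver.Chrome(service=Service(executable_path=chromedriver_path))
    # Line 2: send the blank window to the portal.
    browser.get(url)
    return browser
```

Calling open_portal("/path/to/chromedriver") with your own (here hypothetical) driver path should leave you with a Chrome window sitting on the portal's query form.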

The second line of code is where the automation truly begins. Once the web-driven Chrome is told to ‘get’ the URL, the previously blank window loads the Water Quality Data portal, where the fun of Selenium begins. Using a reference list of XPath expressions, I was able to find the elements that corresponded to the fields I wanted to manipulate. In this case, text was typed into the Location and Sampling Parameters fields, and the type of data to download was then chosen with Selenium’s click feature.

Screenshot from USGS Water Quality Data Portal being controlled by Selenium

Wrapping these steps in a function was the easiest way to further automate the automation of Selenium WebDriver. With a function, the input fields can be changed to acquire water quality data for different states. For this tutorial, however, I will show how I manipulated each field individually.

Inserting The Desired Parameters

Getting the water quality information on where and when lead was found in Queens County, NY is just a few (Selenium) clicks away. The code for this step acts directly on the left-hand portion of the page pictured above, the Place segment of the portal.

Selenium offers several locator methods for finding elements in the page’s HTML. This is particularly useful when ids, classes, or names are used multiple times throughout the page. By changing the suffix of ‘find_element_by_’ to id, name, tag_name, class_name, xpath, or css_selector (to name a few), we can locate the exact field we want to control. By pluralizing element in the method name (‘find_elements_by_’), we receive a list of all elements that match the locator.
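To make the singular and plural forms concrete, here is a minimal sketch. It uses the find_element(By.XPATH, ...) spelling, which works in Selenium 3 and is the only form left in Selenium 4, where the find_element_by_* helpers were removed. The function name and both locator arguments are hypothetical; the portal's real XPaths and class names are not reproduced here.

```python
def find_place_fields(driver, country_xpath, option_class):
    """Return one element located by XPath and all elements sharing a class name."""
    # Imported inside the function so this module still loads where
    # Selenium is not installed.
    from selenium.webdriver.common.by import By

    # Singular form: the first matching element, or NoSuchElementException
    # if nothing on the page matches.
    country_box = driver.find_element(By.XPATH, country_xpath)
    # Plural form: a list of every match (empty if there are none).
    options = driver.find_elements(By.CLASS_NAME, option_class)
    return country_box, options
```

The plural form is handy when a class name repeats across the page: you can inspect the returned list and pick the element you actually want by position or by its text.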

After specifying the country, state, and county we wish to look at — and having Selenium hit the Return/Enter key — the last few steps include grabbing data only for lead samples (from the Sampling Parameters section of the page) before we download the corresponding zip file.
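A rough sketch of that typing step, written as a small function so the same code can be reused for other places. The function name and the dictionary of XPaths are hypothetical placeholders; you would fill them in after inspecting the portal's form.

```python
def enter_place(driver, field_xpaths, values):
    """Type each value into its field and press Return to confirm it."""
    # Imported inside the function so this module still loads where
    # Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    for name in ("country", "state", "county"):
        field = driver.find_element(By.XPATH, field_xpaths[name])
        field.send_keys(values[name])  # type the text, as a user would
        field.send_keys(Keys.RETURN)   # confirm the entry with Return/Enter
```

For the example in this tutorial, values would be something like {"country": "United States", "state": "New York", "county": "Queens"}. The same send_keys pattern applies to the Sampling Parameters field when filtering for lead.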

Selenium has built-in functions that allow it to click on elements of the page, which in this case prepare our file for download by filtering through the USGS database. While the query runs, a popup shows the status of the download until the file is ready to be saved to your local drive. By setting the WebDriver’s implicit wait to a fixed time, we can wait for the element that downloads the zip file to appear on the page. Selenium has both implicit and explicit waits, which address the same problem but are configured differently:

An implicit wait is set by passing a number of seconds to implicitly_wait. It does not pause the script outright; instead, whenever an element is not found immediately, WebDriver keeps retrying for up to that long before raising an error. Once set, the implicit wait applies to every element lookup for the rest of the session unless a new value is set. An explicit wait also takes a timeout, but uses the WebDriverWait and expected_conditions classes to pause until a specific condition is met, such as a particular element appearing on the page, before continuing to the next line of code. Given that the query typically takes less than 10 seconds and only the Continue button needs to be clicked, I opted to use an implicit wait in this case. The Selenium documentation on Read the Docs includes a snippet detailing the use of explicit waits.
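A sketch contrasting the two approaches, modeled on the explicit-wait pattern in the Selenium Python documentation rather than copied from it. Both function names and the xpath argument are hypothetical; the 10-second default mirrors the timing mentioned above.

```python
def click_when_ready(driver, xpath, timeout=10):
    """Explicit wait: poll until the element at `xpath` is clickable, then click it."""
    # Imported inside the function so this module still loads where
    # Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Raises TimeoutException if the condition is not met within `timeout` seconds.
    button = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, xpath))
    )
    button.click()
    return button


def click_with_implicit_wait(driver, xpath, timeout=10):
    """Implicit wait: element lookups retry for up to `timeout` seconds."""
    from selenium.webdriver.common.by import By

    # Stays in effect for the rest of the session unless set again.
    driver.implicitly_wait(timeout)
    button = driver.find_element(By.XPATH, xpath)
    button.click()
    return button
```

The Selenium documentation advises against mixing the two kinds of wait in one session, so it is worth picking one. The implicit version is shorter; the explicit version lets you wait on a condition such as "clickable" rather than mere presence.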

Getting the Download into Readable Format

After Selenium WebDriver successfully downloads the desired zip file, we must first unzip it using ZipFile to access the CSV. If working in a Jupyter notebook, the unzipped file will be saved, by default, in the same folder/path that you are currently working in.
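A minimal sketch of the unzip step using only the standard library. The function name is my own, and the archive name in the usage note is hypothetical; use whatever file name the portal actually saved.

```python
import zipfile
from pathlib import Path


def extract_csvs(zip_path, dest_dir="."):
    """Extract every CSV member of `zip_path` into `dest_dir` and return their paths."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # Keep only the CSV members; anything else in the archive is skipped.
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        archive.extractall(dest, members=csv_names)
    return [dest / name for name in csv_names]
```

With the default dest_dir, extract_csvs("result.zip") writes the CSV into the current working directory, which matches the notebook behavior described above. The returned paths can be passed straight to csv.reader or pandas.read_csv.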

All that is left is to enjoy the fruits of your labor as you begin building your Selenium expertise. Be sure to close your browser properly (the driver’s quit() method does this), or else you’ll have your webdriver floating through the code-verse.
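One way to make sure the browser always closes, even when a step in the middle fails, is a small context manager. This is a sketch of my own, not part of Selenium; managed_browser and its make_driver argument are hypothetical names.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(make_driver):
    """Yield a driver and always quit it, even if the body raises."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # quit() closes every window and stops the driver process.
        driver.quit()
```

You would use it as `with managed_browser(lambda: webdriver.Chrome()) as browser:` and put the whole download routine inside the with block.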
