Dynamic Web Scraping — 7: Using Proxies with Selenium

Irfan Ahmad
Published in Geek Culture
2 min read · Mar 13, 2023

In my previous article, I explained the basics of using proxies, including their definition and purpose, and provided a simple demonstration of their usage. This article aims to offer a straightforward demonstration of how to effectively use proxies with Selenium for dynamic web scraping.

Let’s jump right into it.

Photo by Brett Jordan on Unsplash

In my previous article, I illustrated how to scrape proxies from a website and verify that they work, storing the working proxies in a ‘selected_proxies.csv’ file. Since then, I have been regularly updating the proxy list in my GitHub repository to keep it current. In this article, we will use the updated list to randomly select proxies for Selenium and confirm that they work by checking the current IP on the ‘whatismyip’ website. If the proxy is working as intended, the site should report the proxy’s address rather than our own.

The ‘selenium_proxy_test.py’ file is also included in the GitHub repo. The code is divided into three functions, described below.

main():
This function reads the proxies from the selected proxies file and then calls the ‘get_working_ip’ function to get the current IP from the whatismyip website.

get_working_ip(proxies):
This function takes a list of proxies, randomly selects one proxy from the list, and passes it to the ‘test_ip’ function.

test_ip(ip_address):
This function takes the IP address and opens a browser through Selenium, configured to use that address as its proxy. It then opens ‘whatismyip.com’ and reads the current IPv4 address shown on the page. The process is repeated 4 times.
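The way these three functions fit together can be sketched roughly as follows. This is a minimal illustration, not the code from the repo: ‘load_proxies’, the ‘attempts’ parameter, and the assumption that the CSV holds one ‘host:port’ proxy in its first column with no header row are all hypothetical. ‘test_ip’ is passed in as a callable so that the flow can be tried without starting a browser.

```python
import csv
import random


def load_proxies(lines):
    """Return the first column of each non-empty CSV row (assumed layout)."""
    return [row[0].strip() for row in csv.reader(lines) if row and row[0].strip()]


def get_working_ip(proxies, test_ip, attempts=4):
    """Try randomly chosen proxies until test_ip reports an IP or attempts run out."""
    for _ in range(attempts):
        proxy = random.choice(proxies)
        try:
            reported_ip = test_ip(proxy)
        except Exception:
            # A dead proxy usually makes the browser call fail; try another one.
            continue
        if reported_ip:
            return proxy, reported_ip
    return None, None


def main(lines, test_ip):
    """Load the proxy list and look for one proxy that reports an IP."""
    proxies = load_proxies(lines)
    return get_working_ip(proxies, test_ip)
```

With a real ‘test_ip’ function, this could be called as main(open("selected_proxies.csv", newline=""), test_ip). Retrying with a fresh random proxy on failure is a design choice of this sketch; free proxies go offline often, so a single attempt would fail frequently.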

I have used Remote WebDriver for this purpose. Here is the code that sets up a proxy with Selenium.

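import time

from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
proxy = f"{ip_address}"  # ip_address is the proxy passed in to test_ip
options.add_argument(f"--proxy-server={proxy}")  # route browser traffic through the proxy
# Connect to a Selenium server listening on localhost:4444
driver = webdriver.Remote(command_executor='http://localhost:4444', options=options)
driver.get('https://www.whatismyip.com/')
time.sleep(5)  # give the page time to load
ip = driver.find_element(By.ID, "ipv4").text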
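Reading the IP is only half of the check, because the value still has to be judged. A small helper along the following lines could do that. It is only a sketch: the function names are hypothetical, and it assumes you already know your own direct IP (‘direct_ip’), for example from loading the same page once without a proxy. It compares against your own address rather than the proxy’s because some proxies exit through a different address than the one you connect to.

```python
import ipaddress


def is_valid_ipv4(text):
    """Return True if text is a well-formed IPv4 address."""
    try:
        ipaddress.IPv4Address(text.strip())
        return True
    except ValueError:
        return False


def proxy_is_active(reported_ip, direct_ip):
    """True when the page showed a valid IPv4 address other than our own."""
    reported_ip = reported_ip.strip()
    return is_valid_ipv4(reported_ip) and reported_ip != direct_ip.strip()
```

For example, proxy_is_active(ip, direct_ip) could be called right after ‘ip’ is read from the page. An empty or malformed value, which is what a failed page load tends to produce, is treated as a failed proxy.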

The complete code can be found in my GitHub repo.


Irfan Ahmad

A freelance Python programmer, web developer, and web scraper; data science and bioinformatics student.