Web Scraping Instagram with Selenium
and that darn NoSuchElementException
Selenium is a very powerful web scraping tool: it can target specific content elements on a webpage and extract them mercilessly!
But great power also leaves room for great errors. In this short tutorial, I will show handy ways to work around them and automate the entire process of image extraction.
We’ll focus on one task: scraping a full collection of cat images from Instagram. We’ll do it step by step, discussing the challenges and the reasoning behind certain commands:
- Log in to our personal Instagram account
- Handle the pop-up messages by clicking on “not now”
- Search for the keyword “#cat”
- Scroll down and select all the thumbnails on the page
- Create a new directory on your computer
- Save all the images inside the new directory
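The last two steps need nothing beyond Python's standard library. Here is a minimal sketch, assuming the images have already been downloaded as raw bytes. The function name `save_images`, the folder name `cats`, and the `0.jpg`, `1.jpg`, ... naming scheme are illustrative choices, not part of the tutorial's own code.

```python
import os


def save_images(images, directory="cats"):
    """Write each bytes object in `images` to `directory` as 0.jpg, 1.jpg, ..."""
    # exist_ok=True makes reruns safe: no error if the folder already exists.
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index, data in enumerate(images):
        path = os.path.join(directory, f"{index}.jpg")
        # "wb" because image data is binary, not text.
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    return paths
```

Calling `save_images([first_image_bytes, second_image_bytes])` would create the `cats` folder in the current working directory and return the list of file paths it wrote.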
Install Chrome Driver
Download Chrome Webdriver: https://chromedriver.chromium.org/downloads
- A quick tip: I highly recommend saving chromedriver.exe in your project’s root folder or in another folder that is on your system PATH. That way you won’t need to specify the file’s location every time you initialize the driver (please refer to the comment inside the following code cell).
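To check whether the driver can actually be found before launching a browser, a small sketch like this may help. The helper names `find_chromedriver` and `start_driver` are hypothetical, and `start_driver` assumes Selenium 4, where the driver location is passed through a `Service` object; older Selenium versions take the path differently.

```python
import shutil
from pathlib import Path


def find_chromedriver(explicit_path=None):
    """Return the chromedriver location as a string, or None if not found."""
    if explicit_path is not None:
        # An explicit location was given: accept it only if the file exists.
        candidate = Path(explicit_path)
        return str(candidate) if candidate.is_file() else None
    # No location given: search the folders listed in PATH.
    # On Windows, shutil.which also matches "chromedriver.exe".
    return shutil.which("chromedriver")


def start_driver(explicit_path=None):
    """Start Chrome. Assumes Selenium 4 is installed (imported lazily here)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    location = find_chromedriver(explicit_path)
    if location is None:
        raise FileNotFoundError("chromedriver not found; pass its path explicitly")
    return webdriver.Chrome(service=Service(executable_path=location))
```

If `find_chromedriver()` returns `None`, the driver is not on your PATH, and you would need to pass its full location, for example `start_driver(r"C:\tools\chromedriver.exe")` (an illustrative path).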