How to Parse a Webpage Using Selectors | Scraping with 🐍
Ever heard of XPath or CSS selectors? Today we’ll go through these powerful tools, which sit at the core of every scraping project.
8 min read · Feb 10, 2021
This post is part of the series “Scraping with Python 🐍”, where I intend to explain and teach scraping concepts, from the basics to the most advanced.
“The first rule of web crawling is you do not harm the website. The second rule of web crawling is you do NOT harm the website.” — by ScrapingHub
Introduction
Selectors are powerful tools that let you pick out any node, or any group of nodes, in an HTML tree. They are at the heart of the scraping process, and also at the heart of this post 😉
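To make "selecting a node in an HTML tree" concrete, here is a minimal sketch. It assumes a small, well-formed HTML snippet invented for illustration, and it uses only the limited XPath subset that Python's standard-library `xml.etree.ElementTree` supports. Real pages are rarely well-formed XML, so an actual scraper would more likely rely on a dedicated HTML parser such as `lxml` or `parsel`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed HTML snippet used only for illustration.
HTML = """
<html>
  <body>
    <div class="post">
      <h1>Scraping with Python</h1>
      <a href="/episode-1">Episode 1</a>
      <a href="/episode-2">Episode 2</a>
    </div>
  </body>
</html>
"""

# Parse the markup into a tree of nodes.
root = ET.fromstring(HTML)

# Select a single node: the first <h1> anywhere in the tree.
title = root.find(".//h1").text

# Select a bunch of nodes: every <a> that is a direct child
# of a <div> whose class attribute equals "post".
links = [a.get("href") for a in root.findall(".//div[@class='post']/a")]

print(title)
print(links)
```

Each path expression describes where the wanted nodes sit in the tree, so the same two lines keep working however many links the page contains.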
- In the first episode, we went through the underlying principles of scraping: we discovered how the web works, described the different ways of collecting data, and presented some Python tools useful for web scraping.
- Today, we’ll go through the parsing steps. Although there are different Python packages to analyze the content of a webpage, the exploration relies…