Fig. 1 — Scraping with Python — How to parse a webpage using selectors (Image by author)

How to Parse a Webpage Using Selectors | Scraping with 🐍

Ever heard of XPath or CSS selectors? Today we’ll go through these powerful tools, which are at the core of every scraping project.

Thibaud Lamothe 🤠
8 min read · Feb 10, 2021


This post is part of the series “Scraping with Python 🐍”, where I intend to explain and teach scraping concepts, from the most basic to the most advanced.

“The first rule of web crawling is you do not harm the website. The second rule of web crawling is you do NOT harm the website.” — by ScrapingHub

Introduction

Selectors are expressions that let you pick out any node, or any group of nodes, in an HTML tree, which makes targeting the data you want very easy. They are at the heart of the scraping process, and also the heart of this post 😉
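To make the idea concrete, here is a minimal sketch of node selection using only Python's standard library. The HTML snippet, its class names, and its URLs are hypothetical and exist only for illustration. `xml.etree.ElementTree` supports just a limited subset of XPath and requires well-formed markup, so treat this as a toy: real pages usually call for a dedicated parsing package. The CSS selectors appear only in comments, as rough equivalents.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed HTML snippet used only for illustration.
HTML = """
<html>
  <body>
    <div class="article">
      <h1>Scraping with Python</h1>
      <a href="/episode-1">Episode 1</a>
      <a href="/episode-2">Episode 2</a>
    </div>
    <div class="footer">
      <a href="/about">About</a>
    </div>
  </body>
</html>
"""

root = ET.fromstring(HTML)

# XPath: every <a> node anywhere in the tree.
# (Rough CSS equivalent: "a")
all_links = [a.get("href") for a in root.findall(".//a")]

# XPath: <a> nodes that are direct children of the <div> whose class is "article".
# (Rough CSS equivalent: "div.article > a")
article_links = [a.get("href") for a in root.findall(".//div[@class='article']/a")]

# XPath: the text of the first <h1> node in the tree.
title = root.findtext(".//h1")

print(all_links)      # ['/episode-1', '/episode-2', '/about']
print(article_links)  # ['/episode-1', '/episode-2']
print(title)          # Scraping with Python
```

The same expression language selects one node or many: narrowing the path (here, by adding the `div[@class='article']` step) is all it takes to go from "every link on the page" to "only the links inside the article".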

  • In the first episode, we went through the underlying principles of scraping: we discovered how the web works, detailed the different ways of collecting data, and presented some Python tools useful for web scraping.
  • Today, we’ll go through the parsing steps. Although there are different Python packages to analyze the content of a webpage, the exploration relies…


Thibaud Lamothe 🤠

Head of Data @ Iroko, from Paris | Visit my website www.etomal.com | Unlock unlimited content medium.etomal.com/membership | 🤠