Let’s Discover the Wonderful World of Scrapy | Scraping with 🐍
Scrapy is a Python framework for collecting data from web pages. Here is an overview of how it works and how to get started with it.
9 min read · Feb 18, 2021
This article is part of the series “Scraping with Python 🐍”, in which I present and teach scraping concepts, from the most basic to the most advanced.
“The first rule of web crawling is you do not harm the website. The second rule of web crawling is you do NOT harm the website.” — by ScrapingHub
- In the previous episodes, we discovered the concepts of web scraping and how to access information in a structured HTML file using selectors.
- Today, we’ll go through the Scrapy framework, learn what a spider is and how to start a scraping project.
- Next time, we’ll start scraping with a tangible use case: scraping products on Airbnb’s website.
Main concepts — Overview
I’d say it takes only a few steps to understand how a scraping project is structured. The first thing to know is that scraping algorithms are usually called spiders. The reason is quite simple: they are moving…