Scrapy and Scrapyrt: how to create your own API from (almost) any website

5 min readAug 25, 2018

Introduction

Scrapy is a free and open-source web crawling framework written in Python. It allows you to send requests to websites and to parse the HTML code that you receive as response.

With Scrapyrt (Scrapy realtime), you can create an HTTP server that can control Scrapy through HTTP requests. The response send by the server are data formatted in JSON containing the data scraped by Scrapy.

It basically means that with the combination of these two tools, you can create an entire API without even having a database! I will show you how to process to achieve this.

Set up Scrapy and create your spider

If you don’t have Scrapy installed on your machine yet, run the following command (I will assume you have Python installed on your computer):

pip install scrapy

It will install Scrapy globally on your machine. You can also do it on a virtual environment if you prefer.

Once the installation completed, you can start a Scrapy project by running:

scrapy startproject <project_name>

In my case (and if you want to follow along the article), I’ll do

Scrapy and Scrapyrt: how to create your own API from (almost) any website

Introduction

Set up Scrapy and create your spider

Written by Jérôme Mottet