Scrapy Tutorial — Part 5

Jebaseelan Ravi
Apr 16, 2022


How to deploy a Scrapy spider into production?

This blog is part of a tutorial series: PART 1, PART 2, PART 3, PART 4, PART 5

Running Scrapy spiders on your local machine is very convenient for the (early) development stage, but not so much when you need to execute long-running spiders or keep spiders running continuously in production. This is where the solutions for deploying Scrapy spiders come in.

Popular choices for deploying Scrapy spiders are Scrapyd (open source) and Zyte Scrapy Cloud (cloud based).

Let us look at how we can deploy the spider using Scrapyd.

Deploying to a Scrapyd Server


Scrapyd is an open source application for running Scrapy spiders. It provides a server with an HTTP API that can run and monitor Scrapy spiders.

To deploy spiders to Scrapyd, you can use the scrapyd-deploy tool provided by the scrapyd-client package. Please refer to the scrapyd-deploy documentation for more information.

Scrapyd is maintained by some of the Scrapy developers.

Install the packages

pip install scrapyd
pip install scrapyd-client

Start the scrapyd server

$ scrapyd

You should see something like

Scrapyd server logs

If you open http://localhost:6800/ in your browser, you should see the Scrapyd web interface:

Scrapyd UI
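The same check can be done from code instead of the browser, since Scrapyd exposes a daemonstatus.json endpoint over HTTP. The snippet below is a minimal sketch using only the standard library; the helper names endpoint and daemon_status are made up for this example, and it assumes the server is running on the default http://localhost:6800.

```python
import json
from urllib.request import urlopen

# Assumption: Scrapyd is running locally on its default port.
SCRAPYD_URL = "http://localhost:6800"


def endpoint(name: str, base_url: str = SCRAPYD_URL) -> str:
    """Build the URL of a Scrapyd JSON endpoint such as daemonstatus.json."""
    return f"{base_url.rstrip('/')}/{name}"


def daemon_status(base_url: str = SCRAPYD_URL, timeout: float = 5.0):
    """Return the parsed daemonstatus.json response, or None if unreachable."""
    try:
        with urlopen(endpoint("daemonstatus.json", base_url), timeout=timeout) as response:
            return json.load(response)
    except OSError:
        # URLError and HTTPError are subclasses of OSError, so this also
        # covers "connection refused" and timeouts.
        return None
```

Calling daemon_status() against a running server should return a dictionary with a "status" key plus counts of pending, running and finished jobs; it returns None when nothing is listening at that address.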

Now that the Scrapyd server is running, we can deploy our spider to it.

Update quotesspider/scrapy.cfg with the following content:

scrapy.cfg is a deploy configuration file that describes how you are going to deploy your spider.

[settings]
default = quotesspider.settings

[deploy:local]
# where your scrapyd server is running
url = http://localhost:6800/
# project name
project = quotesspider
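One detail of this file format is worth knowing: scrapy.cfg is an INI-style file, and Python's configparser does not strip a trailing "# ..." from a value line unless it is explicitly configured to, so comments are safest on their own lines. The sketch below illustrates this with configparser's default settings; it assumes the deploy tooling reads the file the same way, and deploy_targets is a hypothetical helper written for this example.

```python
from configparser import ConfigParser

# The configuration from this tutorial, with each comment on its own line.
SCRAPY_CFG = """
[settings]
default = quotesspider.settings

[deploy:local]
# where your scrapyd server is running
url = http://localhost:6800/
# project name
project = quotesspider
"""


def deploy_targets(cfg_text: str) -> dict:
    """Map each [deploy:<name>] section to a dict of its options."""
    parser = ConfigParser()  # default settings: inline comments are NOT stripped
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section.startswith("deploy:"):
            name = section.split(":", 1)[1]
            targets[name] = dict(parser.items(section))
    return targets


# With a trailing comment on the value line, the comment becomes part of the URL.
INLINE_COMMENT_CFG = "[deploy:local]\nurl = http://localhost:6800/ # my server\n"
```

Here deploy_targets(SCRAPY_CFG) yields a single target named local with a clean url, while deploy_targets(INLINE_COMMENT_CFG) keeps "# my server" inside the url value.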

Now run

$ cd quotesspider
# format: scrapyd-deploy <target> -p <project>
$ scrapyd-deploy local -p quotesspider

This will eggify your project (package it as a Python egg) and upload it to the target. If you have a setup.py file in your project, it will be used; otherwise one will be created automatically. If successful, you should see a JSON response similar to the following:

Scrapy deploy logs
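To double-check that the upload worked, you can ask Scrapyd which spiders it now knows for the project through its listspiders.json endpoint. This is a rough sketch, assuming the default local server and the quotesspider project name used in this tutorial; list_spiders_url and list_spiders are names invented for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Scrapyd is running locally on its default port.
SCRAPYD_URL = "http://localhost:6800"


def list_spiders_url(project: str, base_url: str = SCRAPYD_URL) -> str:
    """Build the listspiders.json URL for a deployed project."""
    query = urlencode({"project": project})
    return f"{base_url.rstrip('/')}/listspiders.json?{query}"


def list_spiders(project: str, base_url: str = SCRAPYD_URL) -> list:
    """Return the spider names Scrapyd reports for the project."""
    with urlopen(list_spiders_url(project, base_url), timeout=10) as response:
        payload = json.load(response)
    if payload.get("status") != "ok":
        # Scrapyd reports failures in the JSON body, e.g. for an unknown project.
        raise RuntimeError(payload.get("message", "unexpected Scrapyd response"))
    return payload["spiders"]
```

If the deploy succeeded, list_spiders("quotesspider") should include the quotes spider.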

Now you can schedule (run) the spider using:

# format: scrapyd-client schedule -p <project_name> <spider_name>
$ scrapyd-client schedule -p quotesspider quotes

Scrapyd schedule log
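Scheduling can also be done by calling Scrapyd's schedule.json endpoint directly, which is handy when you want to trigger runs from another program. The sketch below builds and sends that POST request with the standard library; build_schedule_request and schedule are hypothetical helper names, and the default URL assumes the local server from this tutorial.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: Scrapyd is running locally on its default port.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project: str, spider: str,
                           base_url: str = SCRAPYD_URL, **spider_args) -> Request:
    """Build the POST request for Scrapyd's schedule.json endpoint.

    Extra keyword arguments are sent as additional form fields, which
    Scrapyd passes on to the spider as spider arguments.
    """
    fields = {"project": project, "spider": spider, **spider_args}
    body = urlencode(fields).encode("utf-8")
    return Request(f"{base_url.rstrip('/')}/schedule.json", data=body, method="POST")


def schedule(project: str, spider: str, **spider_args) -> dict:
    """Send the request and return Scrapyd's parsed JSON reply."""
    request = build_schedule_request(project, spider, **spider_args)
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

A call such as schedule("quotesspider", "quotes") should return a dictionary containing a "status" field and the "jobid" of the new run.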

Now if you go to http://localhost:6800/jobs, you can see that our spider is running.

Scrapyd UI
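The information on the jobs page is also available as JSON from the listjobs.json endpoint, which groups jobs into pending, running and finished lists. Here is a small sketch that fetches that response and counts the jobs per state; the helper names and the sample payload are illustrative only and were not taken from a real run.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Scrapyd is running locally on its default port.
SCRAPYD_URL = "http://localhost:6800"

JOB_STATES = ("pending", "running", "finished")


def fetch_jobs(project: str, base_url: str = SCRAPYD_URL) -> dict:
    """Fetch the raw listjobs.json response for a project."""
    query = urlencode({"project": project})
    url = f"{base_url.rstrip('/')}/listjobs.json?{query}"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def summarize_jobs(payload: dict) -> dict:
    """Count the jobs in each state of a listjobs.json response."""
    return {state: len(payload.get(state, [])) for state in JOB_STATES}


# Illustrative payload in the shape Scrapyd returns (the id is made up).
SAMPLE_PAYLOAD = {
    "status": "ok",
    "pending": [],
    "running": [{"id": "example-job-id", "spider": "quotes"}],
    "finished": [],
}
```

For the sample above, summarize_jobs(SAMPLE_PAYLOAD) reports one running job; against a live server you would pass it the result of fetch_jobs("quotesspider").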

You can also view the log for the run:

Scrapyd log for spider run

Woohoo!! We have deployed our spider. You can run Scrapyd anywhere (for example, in the cloud), replace the url in scrapy.cfg, and deploy the same way.

That’s a wrap!!

Happy Scraping!! 🕷

Please leave a comment if you face any issues.
