Introduction

This release includes the following updates:

  1. Better support for Scrapy spiders;
  2. Support Git repository synchronization;
  3. Support long-task spiders;
  4. Better spider management.

Change Log

Features / Enhancement

  • Better Support for Scrapy. Spider identification, settings.py configuration, log level selection, and spider selection. #435
  • Git Sync. Allow users to sync Git projects to Crawlab.
  • Long Task Support. Users can add long-task spiders, which are expected to keep running without finishing. #425
  • Spider List Optimization. Task counts by status, task detail popup, and legend. #425
  • Upgrade Check. Check for the latest version and notify users to upgrade.
  • Spiders Batch Operation. Allow users to run/stop spider tasks and delete spiders in batches.
  • Copy Spiders. Allow users to copy an existing spider to create a new one.
  • WeChat Group QR Code.
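
To illustrate the Scrapy item above: the log level and individual settings.py entries can be set per run through Scrapy's own command-line options (`-L` for the log level, `-s NAME=VALUE` for a setting override). The sketch below only builds such a command line. The helper name `build_scrapy_command` is hypothetical, and this is not necessarily how Crawlab itself launches Scrapy spiders.

```python
from typing import Dict, List, Optional

# Log levels accepted by Scrapy's LOG_LEVEL setting.
SCRAPY_LOG_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


def build_scrapy_command(
    spider: str,
    log_level: str = "INFO",
    settings: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the argv for `scrapy crawl` (hypothetical helper, not Crawlab code)."""
    level = log_level.upper()
    if level not in SCRAPY_LOG_LEVELS:
        raise ValueError(f"unknown log level: {log_level}")
    cmd = ["scrapy", "crawl", spider, "-L", level]
    # Each -s NAME=VALUE pair overrides one entry from settings.py for this run.
    for name, value in (settings or {}).items():
        cmd += ["-s", f"{name}={value}"]
    return cmd
```

The returned list can be passed to `subprocess.run` from the Scrapy project directory; the spider name and setting values are supplied by the caller.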

Bug Fixes

  • Schedule Spider Selection Issue. Fields not responding to spider change.
  • Cron Jobs Conflict. Possible bug when two spiders have cron jobs scheduled for the same time. #515 #565
  • Task Log Issue. Different tasks write to the same log file if triggered at the same time. #577
  • Task List Filter Options Incomplete.
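
The task log issue above is a typical path collision: if a log file name is derived only from the spider and the start time, two tasks triggered in the same second end up with the same file. A minimal sketch of one common remedy, assuming each task has a unique ID, is shown below. The function name `task_log_path` and the directory layout are hypothetical and do not describe Crawlab's actual fix.

```python
import os
import uuid
from datetime import datetime


def task_log_path(log_dir: str, spider_id: str, task_id: str, now: datetime) -> str:
    """Build a log path that is unique per task rather than per timestamp."""
    stamp = now.strftime("%Y%m%d%H%M%S")
    # Including the task ID keeps paths distinct even when two tasks of the
    # same spider start within the same second.
    return os.path.join(log_dir, spider_id, f"{stamp}_{task_id}.log")


if __name__ == "__main__":
    now = datetime(2020, 3, 1, 12, 0, 0)
    first = task_log_path("logs", "spider-a", uuid.uuid4().hex, now)
    second = task_log_path("logs", "spider-a", uuid.uuid4().hex, now)
    print(first)
    print(second)
```

Both paths share the same timestamp prefix but differ in the task ID, so concurrent tasks no longer write to the same file.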

Product Plan

Results Display

  • Support other databases

Configurable Spiders

  • Support Splash
  • Support CrawlSpider
  • Support regex fields
  • Support converting configurable spiders into customized spiders

Task

  • Task re-run

Schedule

  • Calendar display

Server

  • Support terminal of Docker container

SDK

  • Support more commands
  • Support Golang and Java

Plugin System
