Published in Geek Culture

Scrapy Item with general fields

General/unknown/dynamic fields

The source code is available as part of the scrapy-item project.

You can install scrapy-item from PyPI:

python3 -m pip install -U scrapy-item

You should also have Scrapy installed.

Sometimes you’re scraping a web page, or receiving a response from a web service, that has a dynamic structure. Maybe it is just a table, and you want to derive the item and its field attributes from it; maybe a REST API call returns a different result structure depending on some state. Or maybe you just have many attributes, and they change depending on the parameters you pass to the request.
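To make the problem concrete, here is a minimal sketch in plain Python (no Scrapy or scrapy-item involved) of a table whose column headers are only known at scrape time. The function name rows_to_items and the sample data are hypothetical, made up for illustration; this is not the GeneralItem API. It relies only on the fact that a dict can carry arbitrary keys.

```python
from typing import Dict, Iterable, List


def rows_to_items(headers: List[str], rows: Iterable[List[str]]) -> List[Dict[str, str]]:
    """Build one dict item per table row, using the header cells as field names."""
    # Normalize header text into identifier-like keys, e.g. "Unit Price" -> "unit_price".
    keys = [h.strip().lower().replace(" ", "_") for h in headers]
    # zip() pairs each key with the cell in the same column.
    return [dict(zip(keys, row)) for row in rows]


# Hypothetical table content; in a spider these lists would come from selectors.
headers = ["Name", "Unit Price", "In Stock"]
rows = [["foo", "10", "yes"], ["bar", "5", "no"]]

items = rows_to_items(headers, rows)
print(items[0])  # {'name': 'foo', 'unit_price': '10', 'in_stock': 'yes'}
```

Plain dicts work, but they give up the declared-item style that Scrapy's Item class offers, which is the gap an item with general fields is meant to fill.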

You can use my GeneralItem directly or (better) define your own item that inherits from it, like this:

Usage example:

Another example:

You may also want to look at the itemadapter project:

The ItemAdapter class is a wrapper for data container objects, providing a common interface to handle objects of different types in a uniform manner, regardless of their underlying implementation.

Currently supported types are:

dict

scrapy.item.Item

dataclass-based classes

attrs-based classes

>>> from scrapy.item import Item, Field
>>> from itemadapter import ItemAdapter
>>> class InventoryItem(Item):
...     name = Field()
...     price = Field()
...
>>> item = InventoryItem(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
{'name': 'bar', 'price': 5}

https://github.com/scrapy/itemadapter

The first release of this project was on April 25, 2020. It is a sub-project of Scrapy.

alex_ber

Senior Software Engineer at Pursway