Scrapy Item with general fields
General/unknown/dynamic fields
The source code is available as part of the scrapy-item project.
You can install scrapy-item from PyPI:
python3 -m pip install -U scrapy-item
You should also have Scrapy installed.
Sometimes you scrape a web page, or receive a response from a web service, that has a dynamic structure. Maybe it is just a table and you want to build the item and its field attributes from it, or maybe you receive a different result structure depending on some state of a REST API call. Or maybe you simply have many attributes, and they may change depending on the parameters you pass to the request.
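To make the problem concrete, here is a rough sketch in plain Python. It uses neither Scrapy nor scrapy-item: `FixedFieldsItem`, `DynamicFieldsItem` and `rows_to_items` are hypothetical stand-ins invented for this illustration, and the table data is made up. It only shows why a container with fields declared up front gets in the way when the column names are known only at run time.

```python
# Plain-Python stand-ins; none of these names come from Scrapy or scrapy-item.


class FixedFieldsItem(dict):
    """Hypothetical container that only accepts fields declared up front."""

    declared = ("name", "price")

    def __setitem__(self, key, value):
        if key not in self.declared:
            raise KeyError(f"{type(self).__name__} does not support field: {key}")
        super().__setitem__(key, value)


class DynamicFieldsItem(dict):
    """Hypothetical container that accepts whatever field names it is given."""


def rows_to_items(header, rows, item_cls):
    """Build one item per table row, using the header cells as field names."""
    items = []
    for row in rows:
        item = item_cls()
        for column, cell in zip(header, row):
            item[column] = cell
        items.append(item)
    return items


# A table whose columns are only known once the page has been fetched.
header = ["name", "price", "warehouse"]
rows = [["foo", 10, "A1"], ["bar", 5, "B7"]]

# The dynamic container takes every column as a field.
items = rows_to_items(header, rows, DynamicFieldsItem)

# The fixed container fails on the first column it did not declare.
try:
    rows_to_items(header, rows, FixedFieldsItem)
    fixed_failed = False
except KeyError:
    fixed_failed = True
```

With the dynamic container every row becomes an item with a `warehouse` field, while the fixed one stops at that column with a `KeyError`. An item with general fields is meant to give you the first behaviour.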
You can use my GeneralItem directly or (better) define your own item that inherits from it, like this:
Usage example:
Another example:
You may also want to look at the itemadapter project:
The ItemAdapter class is a wrapper for data container objects, providing a common interface to handle objects of different types in a uniform manner, regardless of their underlying implementation.
Currently supported types are:
dataclass-based classes
attrs-based classes
…
>>> from scrapy.item import Item, Field
>>> from itemadapter import ItemAdapter
>>> class InventoryItem(Item):
...     name = Field()
...     price = Field()
...
>>> item = InventoryItem(name="foo", price=10)
>>> adapter = ItemAdapter(item)
>>> adapter.item is item
True
>>> adapter["name"]
'foo'
>>> adapter["name"] = "bar"
>>> adapter["price"] = 5
>>> item
{'name': 'bar', 'price': 5}
https://github.com/scrapy/itemadapter
The first release of this project was on April 25, 2020. It is a sub-project of Scrapy.