DATA STORIES | WEB SCRAPING | KNIME ANALYTICS PLATFORM

Scrape your Competitors’ Website with an Advanced Web Scraper

Retrieve product categories and prices with a codeless approach

BI-FI BLOGS
Low Code for Data Science


As first published in BI-FI Blogs

Photo by The Nix Company on Unsplash.

In this article, we will go over the details of our latest project: creating an Advanced Web Scraper for H&M Germany using KNIME Analytics Platform.

This KNIME workflow scrapes the H&M Germany website and collects the product category and subcategory information, along with all product page URLs and prices, so that the data can be aggregated at different levels of the category hierarchy.

This template can be adapted for other retailers. However, because their websites are designed differently, some changes to the workflow will be required.

Download the workflow “Advanced Web Scraper for H&M Germany” from the KNIME Community Hub.

For each category on the H&M website, we will replicate the steps below. Keep in mind that when the website gets UI updates, the workflow may need to be changed as well.

First, we will use the Webpage Retriever and XPath nodes to connect to the website and retrieve product categories. Then, we will do some transformations to extract product prices, and format the data for our reports.

Connecting to the website and retrieving product categories.
Transformation steps to extract product prices.
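For readers who want to see what these two steps do conceptually, here is a minimal Python sketch, not the workflow itself. It parses a small, made-up HTML snippet with XPath-style queries and converts German-formatted prices to numbers. The markup, class names, URLs, and function names are all assumptions for illustration; the real H&M pages are structured differently, and real-world HTML usually needs a more tolerant parser than the standard library one used here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The real H&M pages look different.
SAMPLE_PAGE = """
<html><body>
  <ul class="menu">
    <li class="category"><a href="/de_de/damen.html">Damen</a></li>
    <li class="category"><a href="/de_de/herren.html">Herren</a></li>
  </ul>
  <div class="product"><a href="/de_de/p1.html">T-Shirt</a><span class="price">9,99 €</span></div>
  <div class="product"><a href="/de_de/p2.html">Jeans</a><span class="price">1.029,99 €</span></div>
</body></html>
"""


def parse_price(text):
    """Convert a German-formatted price such as '1.029,99 €' to a float."""
    match = re.search(r"\d[\d.]*(?:,\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators, then turn the decimal comma into a point.
    return float(match.group().replace(".", "").replace(",", "."))


def extract_categories(root):
    """Return (name, url) pairs, similar in spirit to an XPath node query."""
    links = root.findall(".//li[@class='category']/a")
    return [(a.text, a.get("href")) for a in links]


def extract_products(root):
    """Return one row per product with its name, page URL, and numeric price."""
    rows = []
    for product in root.findall(".//div[@class='product']"):
        link = product.find("a")
        price = product.find("span[@class='price']")
        rows.append({
            "name": link.text,
            "url": link.get("href"),
            "price": parse_price(price.text),
        })
    return rows


root = ET.fromstring(SAMPLE_PAGE.strip())
categories = extract_categories(root)
products = extract_products(root)
```

In the KNIME workflow, the Webpage Retriever node fetches the page and the XPath node runs the queries, so none of this code is needed there; the sketch only makes the logic visible.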

We will also calculate the product count at each price level per category to see how prices are distributed within each category. After these transformations, the data will be ready to be sent to Power BI for further reporting with the Send to Power BI node.
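The product count per price level is a simple group-and-count operation. As a rough illustration, assuming the scraped rows have already been reduced to a category and a numeric price (the sample rows and the function name below are made up), it could be sketched in Python like this:

```python
from collections import Counter

# Hypothetical sample rows; in the workflow these come from the scraping steps.
rows = [
    {"category": "Damen", "price": 9.99},
    {"category": "Damen", "price": 9.99},
    {"category": "Damen", "price": 19.99},
    {"category": "Herren", "price": 19.99},
]


def count_by_price_level(rows):
    """Count products for every (category, price) combination."""
    counts = Counter((row["category"], row["price"]) for row in rows)
    # Sort by category, then by price, to get a stable, report-friendly order.
    return sorted((cat, price, n) for (cat, price), n in counts.items())


price_distribution = count_by_price_level(rows)
for category, price, n in price_distribution:
    print(f"{category}: {n} product(s) at {price:.2f}")
```

In KNIME, this kind of aggregation is typically done with a grouping node rather than code, so the sketch is only meant to show what the resulting table contains.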

Using metanodes, we can wrap a group of nodes, run it with just one click, and have the data ready for Power BI. We can also export the results to Excel.

H&M metanode, containing all the steps above.

Here, the final result tables are exported as Excel files.

If you liked this article, please don’t forget to share it and leave a comment below!

