WEB SCRAPING IS MADE EASY BY POWER AUTOMATE DESKTOP

Web Scraping | Web data extraction | Screen Scraping | Web Harvesting | Power Automate Desktop

bedy kharisma
5 min read · Nov 23, 2022
Photo by Maxim Hopman on Unsplash

Data is everywhere. The challenge is how to get it, and that is not so hard either. If you are familiar with Python and Beautiful Soup, you probably already know how to pull data from a web page, a practice known as web scraping. But if you are not familiar with coding, this article may be just right for you.

Introducing Power Automate Desktop

Power Automate Desktop is software developed by Microsoft (download link here) that saves you from boring, repetitive manual tasks simply by recording mouse clicks, keystrokes, and copy-paste steps on your desktop. Power Automate is all about automation. It is also built to require almost no coding skills; you just need to know what you want done. And since it is built to automate repetitive tasks, no one knows the flow better than you. That makes it a perfect tool for scraping data from a website, or even from multiple pages of a website, just by dragging and dropping “Actions”.

Get Started

Once it is installed, you will land on this opening window, where you can read an introduction in the Home tab, see your current flows, or browse some examples. Since we want to use it to scrape data from the internet, let’s walk through a demonstration that grabs data from the Shopee website. Shopee is an e-commerce site headquartered in Singapore and owned by Sea Limited, which was founded in 2009 by Forrest Li. Shopee first launched in Singapore in 2015 and has since expanded to Malaysia, Thailand, Taiwan, Indonesia, Vietnam, and the Philippines (as per Wikipedia). That makes it a great starting point for web scraping, with the mission of getting the full list of products returned for a search on this site.

Opening window

Let's start by creating a new flow: click the + button in the left corner of the window. This opens a separate window where you can choose actions, structure the flow, and modify actions, instances, and variables.

And I kid you not, the number of “lines” (actions) required to do web scraping with Power Automate Desktop (PAD) is as few as 10, or 6 if we exclude comments and disabled actions. To understand each step better, let’s break it down.

1. Launch the browser

PAD works best with Chrome, Edge, and Firefox. Pick whichever suits you by searching for it in the Actions panel on the left side of the window. Look for “Browser automation” or “Launch new Chrome”.

In this dialog, you define which website to open by setting the initial URL. In this example, we type “shopee.co.id” as our target.

2. Ask the Keyword

Next, we ask the user for the product keyword to scrape by displaying an input dialog.

There is not much to set here; you can leave the fields blank if you like.

3. Input the keyword

Next, we take the keyword the user entered and tell PAD to type it into the search field on the web page. To do this, we move the mouse to the location of the search box on the page, have PAD type the keyword, and then press Enter.
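For readers who do know a little Python, steps 1 to 3 can be thought of as turning a keyword into a search address. The sketch below is only an illustration, not what PAD does internally: it assumes the site serves search results at a path like /search?keyword=..., which this article does not confirm, and the function name build_search_url is made up for the example.

```python
from urllib.parse import urlencode, urlunsplit

# Assumption: search results live at /search?keyword=... on this host.
# The article only shows typing into the search box, not this URL.
BASE_HOST = "shopee.co.id"


def build_search_url(keyword: str, host: str = BASE_HOST) -> str:
    """Return a search URL for the given product keyword."""
    cleaned = " ".join(keyword.split())  # trim and collapse stray whitespace
    if not cleaned:
        raise ValueError("keyword must not be empty")
    query = urlencode({"keyword": cleaned})  # spaces become '+', symbols are escaped
    return urlunsplit(("https", host, "/search", query, ""))


if __name__ == "__main__":
    print(build_search_url("  wireless   mouse "))
```

Cleaning the keyword before encoding it mirrors what the input dialog step is for: whatever the user types ends up in the search box in a predictable form.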

4. The Main Course

This is the heart of the task, and luckily PAD makes it easy. Just click the Recorder button at the top of the window while the web page is open.

Once recording has started, right-click each piece of information you want to extract from the page. In this example I grab the description, the price, the number sold, and the location.

Next, we need to tell PAD to do this for every item on the page. This is done by repeating the same steps on a neighboring item. Once you have done this once, every matching item on the page is marked with a green dashed outline.

Lastly, we need to tell PAD which element on the page works as the pager (pagination control).

Once that is done, PAD knows to extract all the information on the page and repeat the same task on every available page.

Set the Store data mode to an Excel spreadsheet if you want to save the results there.

Finally, all you have to do is click Run and wait a bit for the machine to do the task. Once it has finished, an Excel file containing all the data will pop up for you to look at.
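If you are curious what "do this for every item on the page" looks like in code, here is a minimal Python sketch using only the standard library. It is not how PAD works under the hood. The HTML snippet and the class names (item, name, price, sold, location) are invented for illustration and do not reflect the real page structure.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are made up for the example.
SAMPLE_HTML = """
<div class="item"><span class="name">USB cable</span><span class="price">Rp15.000</span>
<span class="sold">120 sold</span><span class="location">Jakarta</span></div>
<div class="item"><span class="name">Phone case</span><span class="price">Rp25.000</span>
<span class="sold">87 sold</span><span class="location">Bandung</span></div>
"""

FIELDS = ("name", "price", "sold", "location")


class ItemParser(HTMLParser):
    """Collect one dict per <div class="item">, keyed by field name."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None  # dict for the item currently being parsed
        self._field = None    # name of the field whose text we are inside

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        if tag == "div" and css_class == "item":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the known field spans.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.items.append(self._current)
            self._current = None


def extract_items(html: str) -> list[dict]:
    """Return every item found in the HTML as a list of dicts."""
    parser = ItemParser()
    parser.feed(html)
    parser.close()
    return parser.items


if __name__ == "__main__":
    for row in extract_items(SAMPLE_HTML):
        print(row)
```

The same pattern is applied to every repeated block, which is the idea behind marking a second, neighboring item in the recorder: two examples are enough to describe the pattern for all of them.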
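The pagination idea can be sketched in a few lines of Python as well. This is an illustration under stated assumptions, not PAD's actual behavior: the "site" is a made-up dictionary, and fetch_page is a hypothetical stand-in for loading and parsing a real page.

```python
from typing import Callable, Iterator, Optional

# Stand-in for a website: each "page" holds some rows and the key of the next page.
# The data is invented for illustration.
FAKE_SITE = {
    "page1": {"rows": ["item A", "item B"], "next": "page2"},
    "page2": {"rows": ["item C"], "next": "page3"},
    "page3": {"rows": ["item D", "item E"], "next": None},
}


def fetch_page(key: str) -> dict:
    """Hypothetical fetcher; a real one would load and parse a web page."""
    return FAKE_SITE[key]


def scrape_all(
    start: str,
    fetch: Callable[[str], dict],
    max_pages: int = 100,
) -> Iterator[str]:
    """Yield rows page by page, following 'next' until there is none."""
    key: Optional[str] = start
    visited = 0
    while key is not None and visited < max_pages:
        page = fetch(key)
        yield from page["rows"]
        key = page["next"]  # None on the last page ends the loop
        visited += 1


if __name__ == "__main__":
    print(list(scrape_all("page1", fetch_page)))
```

The max_pages cap is a safety net in case a pager never runs out, which is worth having in any loop that follows "next" links.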
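For comparison, a coded version of the "store data" step might look like the sketch below. Note the swap: PAD writes straight to Excel, while this example writes CSV text, which Excel can open, because the Python standard library has no Excel writer. The rows are made-up sample data and the function name rows_to_csv is chosen for this example only.

```python
import csv
import io

FIELDS = ["name", "price", "sold", "location"]

# Made-up rows standing in for the extracted data.
ROWS = [
    {"name": "USB cable", "price": "Rp15.000", "sold": "120 sold", "location": "Jakarta"},
    {"name": "Phone case", "price": "Rp25.000", "sold": "87 sold", "location": "Bandung"},
]


def rows_to_csv(rows, fields=FIELDS) -> str:
    """Serialise a list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(ROWS))
```

To save the result to disk, the same text can be written to a file opened with newline="" so that the line endings produced by the csv module are left untouched.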


bedy kharisma

Indonesian Strategist, Analyst, Researcher, Achievement Addict who helps companies grow their business by applying data-driven management.