reduce are not part of this ecosystem. Even more surprising is the lack of a built-in function to check whether an element is part of a slice. Where you would simply write 5 in [1, 2, 3, 4, 5] in Python, you'll discover that it's a whole other story in Go.
The first reflex, of course, is to Google the classic “golang check if element in slice”, hoping to find a built-in way of doing such operations. The first links on the results page will quickly put an end to your expectations. You’ll learn that Go doesn’t have a built-in way to handle those operations. …
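To make the membership check concrete, here is a minimal sketch of a hand-written helper. The name Contains is my own choice for illustration, and the generic signature assumes Go 1.18 or later. Newer toolchains (Go 1.21+) also ship slices.Contains in the standard library, which may be preferable when available.

```go
package main

import "fmt"

// Contains reports whether target is present in items.
// Hypothetical helper for illustration: a plain linear scan, O(n).
// Requires Go 1.18+ because it uses type parameters.
func Contains[T comparable](items []T, target T) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	fmt.Println(Contains(numbers, 5)) // true
	fmt.Println(Contains(numbers, 9)) // false
}
```

A linear scan is fine for small slices. For repeated lookups on large collections, a map keyed by the element is usually the better fit.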
Let’s unleash the power of Go and Colly to see how we can scrape Amazon’s product list.
This post is the follow-up to my previous article. If you haven’t read it yet, I’d recommend having a look at it first, so you have a better understanding of what I’m talking about here and can code along more easily.
In this post, I’ll show you how to improve the project we started by adding features such as random User-Agents, a proxy switcher, pagination handling, random delays between requests, and parallel scraping.
The goal of these techniques is, first, to speed up the collection of the information we need. Second, we also need to avoid getting blocked by the platform we’re extracting data from. Some websites will block you if they notice you’re sending them too many requests. I want to make clear that our goal here is not to flood them with requests, but just to avoid getting blocked while extracting the data we need at an appropriate speed. …
In this article, we’ll explore the power of Go(lang). We’ll see how to create a scraper able to get basic data about products on Amazon.
The goal of this scraper will be to fetch an Amazon results page, loop through the listed products, parse the data we need, go to the next page, write the results to a CSV file and… repeat.
To do this, we’ll use a library called Colly. Colly is a scraping framework written in Go. It’s lightweight but offers a lot of functionality out of the box, such as parallel scraping and a proxy switcher.
This article will cover the basics of the Colly framework. …