Python is The Sure Way to Make Money (Part 2)
“The Secret Is Web Scraping”
Part 1 of this article discussed several reasons why Python is a sure way to make money; please read it first. In Part 2, we’ll go over how to start making money with Python in the shortest amount of time. Can we begin working in Python within two months? What kind of job should we prioritize first? The answer is already in the title of this article: web scraping.
Some of us may view web scraping negatively because it is frequently linked with hackers, scammers, and other undesirables. We forget that Google, Yahoo, and other search engines began with web scraping before expanding into other businesses. The technique itself is neutral; it depends on the user and what they use it for.
In this part, we’ll learn how to do simple web scraping in Python in fewer than 20 lines of code. We’ll use indeed.com as our example target website and extract the job title, company name, and company location from the first page of its search results.
Python and PyCharm (IDE) Installation
First, we need to install Python itself and the IDE or editor that we’ll use; in this case, PyCharm. Download the Python installer from the official Python website and follow the installation instructions as usual. After that, install PyCharm (IDE) from the JetBrains website, picking the version for Windows, macOS, or Linux depending on your operating system. Follow the installation instructions until the end.
If you have difficulty installing PyCharm, you can follow one of the many installation tutorials on YouTube.
Package Installation
We briefly discussed packages in Part 1. Put simply, a package is code that someone else has written to perform specific functions and that others can reuse. For example, “requests” is a package for retrieving a web page, which we will use later. It was created specifically to help others retrieve web pages without writing lengthy code. Hundreds of thousands of such packages are available for free download at https://pypi.org/. Requests and BeautifulSoup are the only packages we will use. BeautifulSoup helps us extract data from the HTML that requests has downloaded from the internet, in our case from indeed.com.
We can install both packages at once by running this command in the “Terminal” tab at the bottom of PyCharm.
pip install beautifulsoup4 requests
We then create a new file in our PyCharm project folder and give it a name such as “indeed_scraping.py”. If you are unsure how to do this, there are plenty of tutorials on YouTube.
Start Coding
In the first two lines of our code, we import the packages so they can be used in our program.
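A minimal sketch of those two import lines, assuming both packages were installed with the pip command above (the beautifulsoup4 package is imported under the name bs4). The two print lines are only a quick check that the imports worked and are not part of the scraper itself.

```python
import requests                # downloads web pages over HTTP
from bs4 import BeautifulSoup  # parses the downloaded HTML

# Quick check that both packages were imported successfully.
print(requests.__version__)
print(BeautifulSoup.__name__)
```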
That’s it. Next, we retrieve the first HTML page from indeed.com; as an example, we look for “Python Developer” jobs in “New York State”.
We specify the job and location that we want to find in the variable “params”. Think of a variable as a page in a notebook where we jot down simple notes, such as a name and phone number. We then pass params to the requests package to retrieve the HTML.
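The request described above can be sketched roughly as follows. The search address and the parameter names "q" (job) and "l" (location) are assumptions about how the site's search URL is built, and fetch_search_page is a hypothetical helper name used only for illustration. The first part builds the URL without sending anything, so we can see how requests turns params into a query string.

```python
import requests

# Assumed search address and query parameter names:
# "q" for the job we want, "l" for the location.
URL = "https://www.indeed.com/jobs"
params = {"q": "Python Developer", "l": "New York State"}

# Build the request without sending it, to see the final URL.
prepared = requests.Request("GET", URL, params=params).prepare()
print(prepared.url)


def fetch_search_page(url, params):
    """Download one page of search results and return its HTML as text."""
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # raise an error if the site refused the request
    return response.text
```

Calling fetch_search_page(URL, params) performs the actual download. A site may refuse automated requests, in which case raise_for_status() raises an error instead of returning HTML.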
The first page of indeed.com results containing “Python Developer” jobs in “New York State” has now been retrieved. In the next step, we load this HTML into the BeautifulSoup package so that we can extract data from it.
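As a small illustration of this step, the sketch below loads HTML into BeautifulSoup. The html string is a made-up stand-in for the text that requests would have downloaded, so the example runs without an internet connection.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the HTML text downloaded by requests.
html = "<html><head><title>Python Developer Jobs</title></head><body></body></html>"

# "html.parser" is the parser that ships with Python itself.
soup = BeautifulSoup(html, "html.parser")

# The soup object lets us reach into the page, for example its title.
print(soup.title.string)
```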
Once the HTML has been loaded into the variable soup, we pull out the data we need using BeautifulSoup’s find function. In this example, we need the job title, the company name, and the company location.
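The extraction step might look something like the sketch below. The sample HTML, the tag names, and the class names (job-card, job-title, company-name, company-location) are all hypothetical; on the real site they have to be looked up by inspecting the page in the browser, and they can change over time.

```python
from bs4 import BeautifulSoup

# Made-up sample HTML; the real page uses different tags and class names.
sample_html = """
<div class="job-card">
  <h2 class="job-title">Python Developer</h2>
  <span class="company-name">Example Company A</span>
  <div class="company-location">Albany, NY</div>
</div>
<div class="job-card">
  <h2 class="job-title">Junior Python Developer</h2>
  <span class="company-name">Example Company B</span>
  <div class="company-location">Buffalo, NY</div>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

jobs = []
# find_all returns every matching block; find returns the first match inside it.
for card in soup.find_all("div", class_="job-card"):
    title = card.find("h2", class_="job-title").get_text(strip=True)
    company = card.find("span", class_="company-name").get_text(strip=True)
    location = card.find("div", class_="company-location").get_text(strip=True)
    jobs.append((title, company, location))

for title, company, location in jobs:
    print(title, "|", company, "|", location)
```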
Then select Run from the PyCharm menu and voilà! All the jobs, with their company names and locations, will be printed in the PyCharm Run tab at the bottom.
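Put together, a complete script could look roughly like the sketch below. It is not the original listing: the search address, the parameter names "q" and "l", the class names, and the function names parse_jobs and main are all assumptions made for illustration, and the class names in particular must be replaced with the ones found on the real page.

```python
import requests
from bs4 import BeautifulSoup

URL = "https://www.indeed.com/jobs"  # assumed search address
PARAMS = {"q": "Python Developer", "l": "New York State"}  # assumed parameter names


def parse_jobs(html):
    """Return a list of (title, company, location) tuples found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    # Hypothetical tag and class names; replace them with the ones
    # found by inspecting the real page in the browser.
    for card in soup.find_all("div", class_="job-card"):
        title = card.find("h2", class_="job-title")
        company = card.find("span", class_="company-name")
        location = card.find("div", class_="company-location")
        if title is None or company is None or location is None:
            continue  # skip cards that are missing one of the three fields
        jobs.append((title.get_text(strip=True),
                     company.get_text(strip=True),
                     location.get_text(strip=True)))
    return jobs


def main():
    """Download the first results page and print one line per job."""
    response = requests.get(URL, params=PARAMS, timeout=10)
    response.raise_for_status()  # stop early if the site refused the request
    for title, company, location in parse_jobs(response.text):
        print(f"{title} | {company} | {location}")
```

Adding a call to main() as the last line of the file and choosing Run in PyCharm would print one line per job, provided the site accepts the request and the class names match the real page. Keeping the parsing in its own function also makes it easy to try out on a saved HTML file first.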
It only takes 16 lines of code to retrieve data from the first page of indeed.com. Of course, real-world jobs require additional lines of code to meet the employer’s requests, but the core code remains the same. The code is simple to follow, right? Is it really impossible to learn such simple programming in two months? This simplicity is a large part of why Python is so popular for web scraping. Hopefully, these two parts of the article will shed some light for any of us looking for a new job or an extra source of income.