Easy Guide on Scraping LinkedIn With Python

Birappa Goudanavar
Towards Data Engineering
2 min read · Nov 20, 2023

LinkedIn is a huge source of data that’s publicly available for users and non-users alike, but it’s quite difficult to scrape, especially because it serves its content dynamically.

Photo by Bastian Riccardi on Unsplash

The best thing about this script is that it does not require any proxy, even though everyone knows that LinkedIn has one of the more advanced bot tracking and blocking systems. Comment if you are excited to know why it won’t get flagged as a bot.

In this article, I will show you how to build a web scraper in Python that doesn’t infringe any privacy policies or require a proxy, keeping the project as simple and manageable as possible.

We’ll extract the job title, hiring company, location, and link to the job listing using Playwright, then export the data to a CSV file for later analysis or use.

Here we are not violating LinkedIn’s privacy policy; we are extracting only data that is publicly available.

Basic requirements

The goal of this script is to extract job details for a list of companies. For this, you need the company name, or a CSV file listing the companies along with their LinkedIn page URLs.

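Setting up the project

Let’s start by installing all the dependencies we’ll be using for this project. The csv module ships with Python’s standard library, so it does not need to be installed with pip; Playwright, on the other hand, also needs to download the browser it will drive.

pip install playwright pandas
playwright install chromium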

To save the installed dependencies to a requirements.txt file, and to reinstall them from that file later:

pip freeze > requirements.txt
pip install -r requirements.txt

Initialization of the Playwright module

Playwright is a Python library to automate Chromium, Firefox and WebKit browsers with a single API. Playwright delivers automation that is ever-green, capable, reliable and fast.

Next, go through the function that extracts the job details. It does not require any proxy and works for any number of companies.
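As a rough idea of what such a function could look like, here is a minimal sketch using Playwright’s synchronous API. The CSS selectors and the names extract_jobs, text_or_empty and save_jobs are illustrative assumptions, not taken from the author’s script; LinkedIn’s markup changes over time, so check the selectors in your browser’s developer tools before relying on them. The sketch writes the CSV with the standard csv module to stay small; pandas would work just as well.

```python
import csv

# Assumed CSS selectors for a public job-listing card (hypothetical;
# verify them against the live page before use).
CARD = "ul.jobs-search__results-list > li"
FIELDS = {
    "title": ".base-search-card__title",
    "company": ".base-search-card__subtitle",
    "location": ".job-search-card__location",
}
LINK = "a.base-card__full-link"


def text_or_empty(card, selector):
    """Return the stripped text of the first match, or '' if nothing matches."""
    match = card.locator(selector)
    return match.first.inner_text().strip() if match.count() else ""


def extract_jobs(url):
    """Open `url` in Chromium and return one dict per job card found."""
    # Imported here so save_jobs() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    jobs = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        for card in page.locator(CARD).all():
            job = {name: text_or_empty(card, sel) for name, sel in FIELDS.items()}
            link = card.locator(LINK)
            job["link"] = (link.first.get_attribute("href") or "") if link.count() else ""
            jobs.append(job)
        browser.close()
    return jobs


def save_jobs(jobs, path):
    """Write the extracted jobs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "company", "location", "link"])
        writer.writeheader()
        writer.writerows(jobs)
```

With this in place, save_jobs(extract_jobs(url), "jobs.csv") would produce the CSV file described earlier, where url is the address of a public job-listing page.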

This function reads the input file from the data directory and checks the validity of each website URL, then calls the main method to extract the data.
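A driver along those lines might be sketched as follows. The column names company and url, the function names, and the rule that a valid URL must be an http(s) address on linkedin.com are all assumptions made for illustration; adapt them to your own input file.

```python
import csv
from urllib.parse import urlparse


def is_valid_company_url(url):
    """True for http(s) URLs on linkedin.com or one of its subdomains."""
    parts = urlparse(url.strip())
    host = (parts.hostname or "").lower()
    return parts.scheme in ("http", "https") and (
        host == "linkedin.com" or host.endswith(".linkedin.com")
    )


def load_companies(csv_path):
    """Read the input rows; 'company' and 'url' are assumed column names."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def run(csv_path, extract):
    """Call extract(url) for each row with a valid URL; return results by company."""
    results = {}
    for row in load_companies(csv_path):
        url = (row.get("url") or "").strip()
        if not is_valid_company_url(url):
            print(f"Skipping {row.get('company')!r}: invalid URL {url!r}")
            continue
        results[row["company"]] = extract(url)
    return results
```

Here extract can be any callable that takes a URL and returns the scraped data, so the validation logic can be exercised without opening a browser, for example run("data/companies.csv", lambda url: []) with a hypothetical file path.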

By following the above procedure, you can implement the script and automate it to run on a regular basis using GitHub Actions.

Contact me on Upwork if you need help with your projects. Thanks.


Data Engineer at Alcon | Freelance Data Guru, Helping You Unlock Insights