World Salary analysis for various job posts and visualization

Jaymeet Mehta
3 min read · Nov 4, 2020


Photo by Jp Valery on Unsplash

Introduction:

Data analysis is a field that deals with large amounts of data and provides useful insights by studying it. While one might get a rough idea by skimming the data, the underlying patterns usually remain hidden. Visualizations are one of the most important parts of data analysis: they ensure that anyone, even with little or no knowledge of the topic, can read a visualization and grasp the information it conveys.

Flow of work:

Scraping of data:

The data was scraped using Python's BeautifulSoup library and ChromeDriver.

BeautifulSoup : Beautiful Soup is a Python package for parsing HTML and XML documents. It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping.

ChromeDriver : ChromeDriver is an open-source tool that drives the Chrome browser programmatically (via the WebDriver protocol) and is commonly used for automated testing and web scraping.
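As a minimal sketch of how this step might look, the snippet below uses Selenium with ChromeDriver to load a page and BeautifulSoup to parse the rendered HTML. The URL and the CSS selectors are placeholders, since the original scraping code is not shown in the post.

```python
# Sketch of the scraping step: ChromeDriver (via Selenium) loads the page,
# BeautifulSoup parses the rendered HTML. URL and selectors are placeholders.
from selenium import webdriver
from bs4 import BeautifulSoup

def scrape_salaries(url):
    driver = webdriver.Chrome()  # requires ChromeDriver to be installed
    try:
        driver.get(url)
        soup = BeautifulSoup(driver.page_source, "html.parser")
        rows = []
        for card in soup.select("div.salary-card"):  # hypothetical selector
            rows.append({
                "job_title": card.select_one("h2").get_text(strip=True),
                "average_salary": card.select_one(".avg").get_text(strip=True),
            })
        return rows
    finally:
        driver.quit()

data = scrape_salaries("https://example.com/salaries")  # placeholder URL
```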

Format of processed data:

After processing the raw data, we gathered the following fields (a small illustrative example in pandas follows the list):

· Country

· Country code

· Job title

· Currency notation

· Average salary

· Minimum paid salary

· Maximum paid salary

· Cities

· Variation in salary according to cities

· Male/female ratio

· Currency against dollar

· Average salary in dollars

· Minimum salary in dollars

· Maximum salary in dollars

· Cost of living of one person

· Top 3 skills required for the given job post

· Subsidiary skills required
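To make the field list concrete, here is one illustrative row of the processed data loaded into a pandas DataFrame, together with the dollar conversion implied by the "currency against dollar" field. The column names and values are assumptions made for illustration, not the project's actual output.

```python
import pandas as pd

# One illustrative row of the processed dataset; column names and values are
# assumptions based on the field list above, not the project's actual output.
processed = pd.DataFrame([{
    "country": "Germany",
    "country_code": "DE",
    "job_title": "Data Scientist",
    "currency": "EUR",
    "avg_salary": 68000,
    "min_salary": 48000,
    "max_salary": 95000,
    "cities": "Berlin; Munich; Hamburg",
    "male_female_ratio": 1.8,
    "currency_vs_dollar": 1.17,          # exchange rate used for conversion
    "cost_of_living_per_person": 1400,   # assumed monthly figure, in USD
    "top_skills": "Python; SQL; Machine Learning",
}])

# Salaries in USD derived from the local-currency figures and the exchange rate
for col in ["avg_salary", "min_salary", "max_salary"]:
    processed[col + "_usd"] = processed[col] * processed["currency_vs_dollar"]

print(processed[["country", "job_title", "avg_salary_usd"]])
```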

Tableau and visualizations:

Tableau is a powerful visualization tool that can connect to many kinds of data sources and build attractive, informative visualizations. It lets us present the processed information through dashboards and story points, which makes the findings easy to grasp. Below are some visualizations made using the preprocessed data.

World map visualization
The storyboard depicting the journey of analysis

Exploratory data analysis (EDA):

Exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

EDA can be accomplished with the help of Python libraries such as pandas, matplotlib, and seaborn.
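As a sketch of this step, assuming the processed data is stored in a CSV with columns like those listed earlier (the file name and column names here are placeholders), the EDA might look like this:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Placeholder file and column names; adjust to the actual processed dataset.
df = pd.read_csv("world_salaries.csv")

# Summary statistics of the dollar-converted salaries
print(df[["avg_salary_usd", "min_salary_usd", "max_salary_usd"]].describe())

# Top 10 countries by average salary (in USD) for a given job title
subset = df[df["job_title"] == "Data Scientist"]
top10 = subset.nlargest(10, "avg_salary_usd")

plt.figure(figsize=(10, 5))
sns.barplot(data=top10, x="country", y="avg_salary_usd")
plt.xticks(rotation=45, ha="right")
plt.title("Top 10 average Data Scientist salaries (USD)")
plt.tight_layout()
plt.show()
```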

Conclusion:

We have thus developed a data science project that helps migrants compare salaries and job titles across countries, so that instead of moving only to the country they originally had in mind, they can choose a country where the salary goes further relative to its cost of living.

Here I am providing a link to the dashboard:

Below is the GitHub link for the project:
