Freelance Data Analysis Trends and Success Strategies
More people are choosing to work as freelance data analysts because the role offers the freedom to choose their own hours and projects, be their own boss, and potentially earn more. However, it also comes with challenges such as competition, income uncertainty, isolation, and a lack of benefits.
One of the key factors that determines the success of a freelancer is finding the right jobs that match their skills, interests, and goals. But how can a freelancer find the best opportunities in the vast and diverse freelance market? What are the most in-demand skills, categories, and locations for freelance jobs? How much can a freelancer expect to earn from different types of projects?
To answer these questions, I scraped over 9,000 data analysis job postings from freelancer.com, one of the most popular and reputable freelance websites in the world. This dataset includes a wealth of information about freelance jobs, such as the job title, project description, price range, client's location, etc.
In this article, I will use this data to explore the trends and characteristics of data analysis freelance jobs on Freelancer.com, and share insights and tips on how to succeed as a freelancer based on the analysis. The article is divided into two sections: in the first, I clean and prepare the data for analysis; in the second, I explore the data to uncover patterns and insights about freelance jobs.
I hope you find this article interesting and useful. Let's get started!
Data Cleaning
To gain a better understanding of the data, let’s take a look at the first five rows of the dataset:
| job_title | projectId | job_description | tags | price_range | rate | client_location | client_average_rating | client_review_count |
|:------------------------------------------------------------------------------------|:------------|:----------------------------------|:---------------------------------------------------------------------------------------|:--------------|:---------|:-------------------|------------------------:|----------------------:|
| development and implementation of a federated learning system within a Flutter app. | #37426471 | please bid only if you are rea... | ['Algorithm', 'Java', 'Python', 'Machine Learning (ML)', 'Flutter'] | 8-30 | EUR | Heilbronn, Germany | 5 | 17 |
| Data Entry -- 2 | #37421546 | Project Title: Data Entry - Da... | ['Excel', 'Statistical Analysis', 'Statistics', 'SPSS Statistics', 'Data Processing'] | 750-1250 | INR/hour | Nagpur, India | 0 | 0 |
| Data Scrap | #37400492 | I am looking for a freelancer ... | ['Web Scraping', 'Data Mining', 'Data Entry', 'Excel', 'Web Search'] | 30-250 | EUR | Eaubonne, France | 5 | 1 |
| Big Data Project | #37404568 | Store Sales Data Analysis: A D... | ['Big Data Sales', 'Data Science', 'Data Mining', 'Statistical Analysis', 'Analytics'] | 5000-5500 | INR | Mundra, India | 5 | 2 |
| Virtual Assistant / Research Assistant | #37397423 | Job Description: I am seeking ... | ['Data Entry', 'Virtual Assistant', 'Web Search', 'Excel', 'Presentations'] | 5-15 | USD/hour | United States | 0 | 0 |
1. Removing Hashtags from Project IDs
# Remove Hashtag from 'projectId' column
data['projectId'] = data['projectId'].str[1:]
A hashtag was prefixed to each project ID. It carries no information and interferes with sorting, filtering, and numeric conversion. Removing it leaves consistent IDs that can be used for accurate identification.
2. Splitting Price Range
# Split 'price_range' into two columns
data[['min_price', 'max_price']] = data['price_range'].str.split('-', expand=True)
The 'price_range' column combined the minimum and maximum prices into a single string, making it difficult to analyze price distributions or perform calculations. Splitting it into separate 'min_price' and 'max_price' columns allows for more detailed analysis of pricing trends.
3. Converting Numeric Values
data['min_price'] = pd.to_numeric(data['min_price'], errors='coerce')
data['max_price'] = pd.to_numeric(data['max_price'], errors='coerce')
data["projectId"] = pd.to_numeric(data["projectId"], errors='coerce')
Converting the 'min_price', 'max_price', and 'projectId' columns to numeric data types ensures that mathematical operations can be performed accurately; calculations on string-typed columns lead to errors in analysis.
4. Extracting Currency and Rate Type
# Create 'currency' and 'rate_type' columns from 'rate' column
rate_split = data['rate'].str.split('/', expand=True)
data["currency"] = rate_split[0]
data["rate_type"] = rate_split[1].apply(lambda x: "fixed" if pd.isna(x) else "hourly")
The 'rate' column included both the currency and rate type (hourly or fixed) in a single string. Separating these values into separate columns for 'currency' and 'rate_type' provides a clearer understanding of the pricing structure and allows for filtering based on currency or rate type.
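To make the split concrete, here is a minimal sketch on a toy Series whose values mirror the 'rate' column above (a bare currency code means a fixed-price project; "CUR/hour" means an hourly one):

```python
import pandas as pd

# Toy stand-in for the 'rate' column.
rates = pd.Series(["EUR", "INR/hour", "USD/hour"])

rate_split = rates.str.split("/", expand=True)
currency = rate_split[0]
# Rows with no "/" get None in column 1, so pd.isna marks them fixed-price.
rate_type = rate_split[1].apply(lambda x: "fixed" if pd.isna(x) else "hourly")

print(list(currency))   # ['EUR', 'INR', 'USD']
print(list(rate_type))  # ['fixed', 'hourly', 'hourly']
```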
5. Handling Incomplete Location Data
# Split 'client_location' into 'state' and 'country' columns
data["location_split"] = data['client_location'].str.split(', ')
data['client_state'] = data["location_split"].apply(lambda x: None if len(x) == 1 else x[0])
data['client_country'] = data["location_split"].apply(lambda x: x[1] if len(x) > 1 else x[0])
The 'client_location' column contained both state and country information, but not all rows had both values. Splitting the string into separate columns for 'client_state' and 'client_country' allows for analysis of location data at both the state and country levels.
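A quick sketch of how the split handles both cases, using two toy locations (one "City, Country" pair and one bare country):

```python
import pandas as pd

# Toy locations mirroring the 'client_location' column.
locs = pd.Series(["Heilbronn, Germany", "United States"])

split = locs.str.split(", ")
# Single-element rows have no state; the lone value is the country.
state = split.apply(lambda x: None if len(x) == 1 else x[0])
country = split.apply(lambda x: x[1] if len(x) > 1 else x[0])

print(list(state))    # ['Heilbronn', None]
print(list(country))  # ['Germany', 'United States']
```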
6. Filtering Currency Length
# Filter out rows where 'currency' length is greater than 3
data = data[data["currency"].str.len() <= 3]
Currency codes typically have a standard length of three characters. Filtering the data to include only currencies with a maximum length of three ensures consistency and avoids potential errors arising from non-standard currency codes.
7. Calculating Average Price
# Calculate and add 'avg_price' column
data["avg_price"] = (data["min_price"] + data["max_price"]) / 2
The 'min_price' and 'max_price' columns provide a range for project pricing, but a single representative value is more useful for analysis. Averaging the two into an 'avg_price' column gives a meaningful point estimate that can be used for more accurate analysis of pricing trends.
8. Removing Missing Average Prices
# Remove rows where 'avg_price' is NaN
data = data[pd.notna(data["avg_price"])]
Missing 'avg_price' values can affect the overall accuracy of price calculations. Removing rows with missing 'avg_price' values ensures that the price calculations are based on complete data and provides a more reliable measure of pricing.
9. Converting Tags to Lowercase
# Convert 'tags' to lowercase
data["tags"] = data["tags"].apply(lambda x: [element.lower() for element in x])
Tags were a mix of uppercase and lowercase, which could affect search and analysis. Converting all tags to lowercase ensures consistency and facilitates case-insensitive searches and comparisons.
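One caveat: if the tags are stored as string literals rather than Python lists (as happens after a round-trip through CSV), they must be parsed before lowercasing. A minimal sketch, assuming the stored format looks like "['Excel', 'SPSS Statistics']":

```python
import ast
import pandas as pd

# Hypothetical tags stored as string literals rather than real lists.
tags = pd.Series(["['Excel', 'Statistical Analysis']", "['Web Scraping']"])

parsed = tags.apply(ast.literal_eval)                        # str -> list
lowered = parsed.apply(lambda lst: [t.lower() for t in lst])

print(lowered[0])  # ['excel', 'statistical analysis']
```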
10. Removing Unnecessary Columns
# Remove the unnecessary columns
data.drop(["client_location", "location_split", "price_range", "rate"], axis=1, inplace=True)
The 'client_location', 'location_split', 'price_range', and 'rate' columns were no longer needed after data transformation and could add clutter to the dataset. Removing these columns streamlines the dataset and improves its readability and usability.
11. Reordering Columns
# List of column names in the desired order
column_order = ['projectId', 'job_title', 'job_description', 'tags', 'client_state', 'client_country', 'client_average_rating', 'client_review_count', 'min_price', 'max_price', 'avg_price', 'currency', 'rate_type']
# Reorder the columns
data = data[column_order]
Reordering the columns to a more logical and visually appealing order improves the readability and usability of the dataset. This makes it easier to navigate and analyze the data.
12. Saving the Cleaned Data
# Save the cleaned data to a CSV file
data.to_csv("freelancer_job_postings.csv", index = False)
Saving the cleaned dataset as a CSV file preserves the transformed data and allows for future analysis and reference. This ensures that the cleaned data is readily available for further exploration and insights.
By carefully undertaking these data cleaning steps, we transformed the raw dataset into a refined and reliable source of information, ready for further exploration and analysis. These steps ensured the accuracy, consistency, and usability of the data, enabling us to extract meaningful insights from the job postings on freelancer.com.
A Glimpse into the Cleaned Dataset
| projectId | job_title | job_description | tags | client_state | client_country | client_average_rating | client_review_count | min_price | max_price | avg_price | currency | rate_type |
|------------:|:------------------------------------------------------------------------------------|:----------------------------------|:---------------------------------------------------------------------------------------|:---------------|:-----------------|------------------------:|----------------------:|------------:|------------:|------------:|:-----------|:------------|
| 37426471 | development and implementation of a federated learning system within a Flutter app. | please bid only if you are rea... | ['algorithm', 'java', 'python', 'machine learning (ml)', 'flutter'] | Heilbronn | Germany | 5 | 17 | 8 | 30 | 19 | EUR | fixed |
| 37421546 | Data Entry -- 2 | Project Title: Data Entry - Da... | ['excel', 'statistical analysis', 'statistics', 'spss statistics', 'data processing'] | Nagpur | India | 0 | 0 | 750 | 1250 | 1000 | INR | hourly |
| 37400492 | Data Scrap | I am looking for a freelancer ... | ['web scraping', 'data mining', 'data entry', 'excel', 'web search'] | Eaubonne | France | 5 | 1 | 30 | 250 | 140 | EUR | fixed |
| 37404568 | Big Data Project | Store Sales Data Analysis: A D... | ['big data sales', 'data science', 'data mining', 'statistical analysis', 'analytics'] | Mundra | India | 5 | 2 | 5000 | 5500 | 5250 | INR | fixed |
| 37397423 | Virtual Assistant / Research Assistant | Job Description: I am seeking ... | ['data entry', 'virtual assistant', 'web search', 'excel', 'presentations'] | | United States | 0 | 0 | 5 | 15 | 10 | USD | hourly |
Data Exploration and Insights
What are the most common types of freelance data analysis projects?
-- Count frequencies of project types excluding specific ones and filter by certain project types
SELECT * FROM
(
SELECT project_type, COUNT(*) AS frequency FROM
(
SELECT UNNEST(tags) AS project_type FROM freelancerdata
) as unnested_data
WHERE project_type != 'data analysis' AND project_type != 'data analytics'
GROUP BY project_type
ORDER BY frequency DESC
) as project_type_table
WHERE (project_type ILIKE ANY (ARRAY['data%', '%analysis%', 'machine learning (ml)', '%scraping%', '%research%', 'report writing', 'statistical modeling', 'biostatistics', 'regression testing', 'etl']));
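The same UNNEST-and-count logic can be sketched in pandas with `explode` (toy data standing in for the real dataset):

```python
import pandas as pd

# Hypothetical mini-frame standing in for the scraped postings.
df = pd.DataFrame({
    "tags": [
        ["data processing", "excel"],
        ["data processing", "statistical analysis"],
        ["data entry", "data analysis"],
    ]
})

counts = (
    df["tags"]
    .explode()                                              # UNNEST(tags)
    .loc[lambda s: ~s.isin(["data analysis", "data analytics"])]
    .value_counts()                                         # GROUP BY + COUNT, sorted desc
)
print(counts.index[0], counts.iloc[0])  # data processing 2
```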
The most popular types of freelance data analysis projects are data processing, followed by statistical analysis and data entry. This suggests that clients are primarily looking for freelancers to help with the early stages of the data analytics process, such as cleaning and preparing data. The presence of machine learning among the top tags indicates that some projects also involve building predictive models.
What are the most in-demand skills for freelance data analysts?
-- Categorize skills and count the occurrences for each category
SELECT * FROM
(
SELECT
CASE
WHEN skills ILIKE '%excel%' THEN 'excel'
WHEN skills ILIKE '%sql%' THEN 'sql'
ELSE skills
END AS categorized_skills,
COUNT(*) AS frequency
FROM
(
SELECT UNNEST(tags) AS skills FROM freelancerdata
) as unnested_data
GROUP BY categorized_skills
ORDER BY frequency DESC
) as skills_table
WHERE categorized_skills ILIKE ANY (ARRAY['excel', 'sql', 'python', 'tableau', 'power bi', 'r %', 'sas', 'powerpoint', 'word', 'azure', 'oracle', 'sap', 'aws', 'go', 'flow', '%vba%', 'snowflake', 'java', 'looker', 'qlik', 'spark']);
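The CASE-based aliasing, which folds any tag containing "excel" or "sql" into a single bucket, can be mirrored in pandas; a sketch on made-up tags:

```python
import pandas as pd

# Made-up skill tags: 'MySQL' should fold into 'sql' and
# 'Excel Macros' into 'excel', mimicking the ILIKE '%...%' CASE.
skills = pd.Series(["Excel Macros", "excel", "MySQL", "sql", "python"]).str.lower()

categorized = skills.where(~skills.str.contains("excel"), "excel")
categorized = categorized.where(~categorized.str.contains("sql"), "sql")

print(categorized.value_counts().to_dict())  # {'excel': 2, 'sql': 2, 'python': 1}
```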
With Excel, Python, and SQL as the top three skills, the chart shows that these are the most in-demand tools for data analysts, likely because they cover every stage of the workflow, from data cleaning and preparation through analysis and visualization. If you are interested in a career in data analysis, developing skills in SQL, Excel, and Python will make you more competitive in the job market.
Where are the most popular locations for freelance data analysis projects?
-- Count the number of jobs for each client country and order by count in descending order
SELECT client_country, COUNT(*) AS job_count
FROM freelancerdata
GROUP BY client_country
ORDER BY job_count DESC;
India clearly dominates the list, with almost twice as many jobs posted as the second-place United States. The United Kingdom, Australia, and Saudi Arabia also have a significant presence in the freelance market. Strikingly, USD is still the most common currency in the postings even though India dominates the list of client countries.
What are the most popular tools used by freelance data analysts for different types of projects?
-- Count pairs of skills, ordered by frequency, and limit to the top 20
SELECT skill1, skill2, COUNT(*) AS pair_count
FROM (
SELECT tags[i] AS skill1, tags[j] AS skill2
FROM freelancerdata, generate_series(1, ARRAY_LENGTH(tags, 1)) AS i, generate_series(1, ARRAY_LENGTH(tags, 1)) AS j
WHERE i < j
) AS pairs
GROUP BY skill1, skill2
ORDER BY pair_count DESC
LIMIT 20;
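Outside the database, the same pair counting is a few lines with `itertools.combinations` (toy tag lists; the real query runs over the 'tags' arrays):

```python
from collections import Counter
from itertools import combinations

# Toy tag lists standing in for the dataset's 'tags' column.
tag_lists = [
    ["excel", "python", "data entry"],
    ["excel", "python"],
    ["excel", "data entry"],
]

# combinations(tags, 2) yields each unordered pair once per posting,
# like the i < j condition in the SQL query.
pair_counts = Counter(
    pair for tags in tag_lists for pair in combinations(tags, 2)
)
print(pair_counts[("excel", "python")])  # 2
```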
This suggests that Excel and Python are the most popular tools for data analysis projects: Excel is well-suited to data cleaning and data entry, Python to machine learning, and statistics-oriented tools to statistical analysis.
It is important to note that this is a general overview, and the most popular tool or skill for a particular project may vary depending on its specific requirements. For example, a project that involves complex machine learning algorithms may require a more specialized tool such as TensorFlow or PyTorch.
How much can freelance data analysts expect to earn?
-- Calculate median price and frequency for different currency-rate_type combinations
SELECT currency, rate_type,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY avg_price) AS median_price,
COUNT(rate_type) AS frequency
FROM freelancerdata GROUP BY currency, rate_type ORDER BY frequency DESC;
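In pandas, the equivalent of PERCENTILE_CONT(0.5) per currency/rate_type group is a grouped median; a sketch with made-up prices:

```python
import pandas as pd

# Made-up avg_price values for two currency/rate_type groups.
df = pd.DataFrame({
    "currency":  ["USD", "USD", "USD", "INR", "INR"],
    "rate_type": ["fixed", "fixed", "fixed", "hourly", "hourly"],
    "avg_price": [100, 140, 200, 600, 800],
})

medians = df.groupby(["currency", "rate_type"])["avg_price"].median()
print(medians[("USD", "fixed")])  # 140.0
```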
This suggests that fixed rate projects are more popular than hourly rate projects, and that the median price for fixed rate projects is higher than the median price for hourly rate projects.
Additionally, the data suggests that the median price for fixed rate projects varies depending on the currency. For example, the median price for fixed rate projects in USD is $140, while the median price for fixed rate projects in INR is ₹7000. This may be due to factors such as the cost of living, the exchange rate, and the demand for freelance services in each country.
What are the top currencies used?
-- Count jobs for each currency and order by count in descending order
SELECT currency, COUNT(*) AS count FROM freelancerdata GROUP BY currency ORDER BY count DESC;
The most popular currency for freelance job postings on Freelancer.com is USD, with 4444 postings, followed by INR with 2433 postings and EUR with 639. This suggests that USD is the most widely accepted currency for freelance work on this platform.
How are client ratings distributed?
-- Categorize client ratings and count the occurrences for each category
WITH cfield AS (SELECT
client_average_rating,
CASE
WHEN client_average_rating >= 4.5 THEN '5'
WHEN client_average_rating >= 3.5 AND client_average_rating < 4.5 THEN '4'
WHEN client_average_rating >= 2.5 AND client_average_rating < 3.5 THEN '3'
WHEN client_average_rating >= 1.5 AND client_average_rating < 2.5 THEN '2'
WHEN client_average_rating >= 0.5 AND client_average_rating < 1.5 THEN '1'
ELSE 'No rating'
END AS rating_category
FROM
freelancerdata)
SELECT rating_category, COUNT(*) AS count FROM cfield
GROUP BY rating_category ORDER BY count DESC;
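The same bucketing can be done in pandas with `pd.cut`; the half-open bins below reproduce the CASE boundaries (a sketch on sample ratings):

```python
import pandas as pd

ratings = pd.Series([0.0, 5.0, 3.7, 2.0, 0.3])

# right=False gives [lo, hi) bins, matching the >= lower-bound CASE
# logic; anything below 0.5 (including unrated 0s) is 'No rating'.
categories = pd.cut(
    ratings,
    bins=[-0.1, 0.5, 1.5, 2.5, 3.5, 4.5, 5.1],
    labels=["No rating", "1", "2", "3", "4", "5"],
    right=False,
)
print(list(categories))  # ['No rating', '5', '4', '2', 'No rating']
```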
The majority of clients have no rating, which may be because they are new to the platform or have not yet been rated by their freelancers. A significant number of clients have high ratings, with over 40% rated 5 stars. A small percentage have low ratings, with less than 1% of clients rated 2 stars or below.
This suggests that most clients on Freelancer.com are reputable and have a good track record. However, it is important to note that there is a small percentage of clients with low ratings, and it is always advisable to do your research before accepting a project from a new client.
GitHub
The source code for the project, which includes the website spider, is available on my GitHub repository.
Conclusion
The freelance data analytics landscape offers a multitude of opportunities for skilled professionals to leverage their expertise and earn a living independently. By understanding the prevailing trends, in-demand skills, and popular locations for freelance data analysis projects, individuals can strategically position themselves to succeed in this competitive field.
The insights derived from the analysis of freelance data analysis job postings on Freelancer.com highlight the importance of possessing strong skills in Excel, Python, and SQL, as these tools are consistently sought after by clients. Additionally, developing expertise in specific data analysis techniques and tools can further enhance one's marketability.
As the demand for data-driven insights continues to grow, the freelance data analytics market is expected to expand, presenting even greater opportunities for skilled individuals to contribute their expertise and make a significant impact. By staying abreast of industry developments, adapting to evolving client needs, and consistently delivering high-quality work, freelancers can thrive in this dynamic and rewarding field.