Sales prediction using a Linear regression model.
Analyzing and anticipating the sales for the given budget for TV, radio, and newspapers.
Hola, in this project I created a prediction model for sales analysis. In this model, we need to feed the advertising budget of TV, radio, and newspapers to the model and the model will forecast the possible sales. For designing the model, the machine learning method I opted for is simple linear regression, and the programming was done in Jupyter notebook.
Dataset Description:
The advertising dataset captures the sales revenue generated with respect to advertisement costs across numerous platforms like radio, TV, and newspapers. Find my Kaggle notebook here.
Data:
Features variable :
- TV: advertising dollars spent on TV.
- Radio: advertising dollars spent on Radio.
- Newspaper: advertising dollars spent on Newspaper.
Target variable:
Sales budget.
Step 1: Import the required libraries and dataset.
The dataset I chose for this exercise or program is in the form of CSV so, I used pd.read_csv from the panda's module as shown in the picture below dataset contains 4 columns named TV, radio, newspaper, and sales.
Step 2: Check for null values in the dataset and data inspection.
After the extraction of data, it’s time to check the dataset for null values and duplicate values.
Step 3: Exploratory Data Analysis (EDA).
In EDA we are gonna find the relationship between features and the target variables.
Distplot:
Displot is used to represent the univariate distribution of data(involving one variate or variable quantity) against the density.
Step 4: Statistical Tasks
Standard Deviation(std) is a function used to depict how much variation is from the mean.
Correlation(corr) is a function used to identify the relationship between the variables.
Variance(var) is a function used to check the dispersion that takes into account the spread of all data points in a data set.
Mean returns the average of the dataset.
The median calculates the middle value of the dataset.
Step 5: Linear regression model building and prediction.
The linear regression graph is created by train data and the model line is shown by the blue line which is created using test data and predicted data as we can see most of the red dots are on the line, thus we can say that model has produced the best-fit line.
My Previous Articles:
Conclusion:
In a nutshell, TV advertising is the best for sales prediction. It’s a good starting point, especially when attempting to understand the relevance of python as well as statistics.
Finally…
I really hope this article has been a great read and a source of inspiration for everyone thinking to pursue a career in the field of data science.
Please Comment for suggestions and feedback. I am still learning. Please help me improve so that I could help you by upgrading my writing skills as well as knowledge and presenting myself to you in a much better way through my subsequent article releases.
Thank you and Happy coding :)