Marketing Analytics using Python — Series Introduction

Kamna Sinha
Published in Data At The Core !
4 min read · Sep 24, 2023

As a data professional who has worked in the field for 14 years, from my days at EMC around 2009 to running a successful data analytics startup today, I have found it a nonstop learning experience, and a joyful and enriching one.

I owe it to the field to share what I have learnt and to simplify it for beginners and learners who are eager to contribute.

As part of this initiative, I have created this series to discuss the fundamentals of marketing analytics using Python, which is one of the most popular and versatile choices for analysts and data scientists.

A brief outline of what's to come is given below. I hope you enjoy the journey as much as I enjoyed creating this content!

Part 1 : We start by creating our own dataset: the total weekly sales of 2 competing products at a chain of stores. We will see various techniques for simulating different kinds of data so that they closely match real data.

We then store the data in a pandas DataFrame to make it ready for further analysis.

We use the numpy.random module and its various functions in the process.
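
To give a flavour of Part 1, here is a minimal sketch of simulating such a dataset. The column names, the number of stores and weeks, the price points and the distribution parameters are all my illustrative assumptions here, not necessarily the values used in the story itself.

```python
import numpy as np
import pandas as pd

# All names and parameter values below are illustrative assumptions.
rng = np.random.default_rng(seed=42)

n_stores = 20
n_weeks = 104  # two years of weekly data

# One row per store per week
store_sales = pd.DataFrame({
    "store_id": np.repeat(np.arange(1, n_stores + 1), n_weeks),
    "week": np.tile(np.arange(1, n_weeks + 1), n_stores),
})
n_rows = len(store_sales)

# Promotion flags (0/1): binomial draws with a 10% and 15% chance
store_sales["p1_promo"] = rng.binomial(n=1, p=0.10, size=n_rows)
store_sales["p2_promo"] = rng.binomial(n=1, p=0.15, size=n_rows)

# Prices: sampled from a small set of assumed price points
price_points = np.array([2.19, 2.29, 2.49, 2.79, 2.99])
store_sales["p1_price"] = rng.choice(price_points, size=n_rows)
store_sales["p2_price"] = rng.choice(price_points, size=n_rows)

# Weekly unit sales: Poisson counts with assumed means of 120 and 100
store_sales["p1_sales"] = rng.poisson(lam=120, size=n_rows)
store_sales["p2_sales"] = rng.poisson(lam=100, size=n_rows)

print(store_sales.head())
```

Fixing the seed keeps the simulated data reproducible from run to run, which helps when comparing results across the later parts.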

Part 2 : After creating our data, we move on to summarizing it to get an initial idea of how the values [discrete and continuous] are distributed, as a first step towards further analysis. We also summarize the entire dataframe as an important prerequisite step.

The functions we use are groupby, value_counts, plot.bar, pandas.crosstab, descriptive statistics functions (min, max, mean, median, std, var, mad, quantile), describe [for dataframes], iloc, apply, and a small example of lambda functions.
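
As a rough sketch of what these summaries look like, the snippet below runs a few of them on a small simulated dataframe. The dataframe and its column names are hypothetical stand-ins for the series dataset.

```python
import numpy as np
import pandas as pd

# Small hypothetical dataframe; names and values are assumptions.
rng = np.random.default_rng(seed=0)
n = 200
df = pd.DataFrame({
    "store_id": rng.integers(1, 5, size=n),              # stores 1 to 4
    "p1_promo": rng.binomial(1, 0.2, size=n),            # 0 or 1
    "p1_price": rng.choice([2.29, 2.49, 2.99], size=n),  # discrete price points
    "p1_sales": rng.poisson(120, size=n),                # weekly unit sales
})

# Discrete variable: how often each price point occurs
price_counts = df["p1_price"].value_counts()

# Two-way frequency table: price point versus promotion flag
promo_by_price = pd.crosstab(df["p1_price"], df["p1_promo"])

# Grouped summary: mean sales per store
mean_sales_by_store = df.groupby("store_id")["p1_sales"].mean()

# Continuous-like variable: quartiles of sales
sales_quantiles = df["p1_sales"].quantile([0.25, 0.5, 0.75])

# Whole-dataframe summary
summary = df.describe()

# apply with a lambda: range (max minus min) of each selected column
col_range = df[["p1_price", "p1_sales"]].apply(lambda col: col.max() - col.min())

print(price_counts)
print(promo_by_price)
print(summary)
```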

Part 3 : Next, we look at our data visually to understand its distribution, skewness, etc.

We use pandas.DataFrame.hist, which calls matplotlib.pyplot.hist() and produces one histogram per column; pandas.DataFrame.boxplot to compare sales across stores; scipy.stats.probplot to compare the data to a specified distribution [Q-Q plot]; statsmodels.distributions to look at the cumulative distribution; and cartopy.io.shapereader to plot data on a map.
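
As one small illustration of the Q-Q idea, scipy.stats.probplot can be called without a plot target, in which case it only returns the quantile pairs and a least-squares fit. The sketch below uses two simulated samples (my assumption, not the series data) and compares how closely each follows a normal distribution.

```python
import numpy as np
from scipy import stats

# Simulated samples; the parameters are illustrative assumptions.
rng = np.random.default_rng(seed=1)
normal_sample = rng.normal(loc=100, scale=15, size=500)
skewed_sample = rng.lognormal(mean=4.5, sigma=0.6, size=500)

# With no plot argument, probplot returns
# (theoretical quantiles, ordered values), (slope, intercept, r)
(_, _), (_, _, r_normal) = stats.probplot(normal_sample, dist="norm")
(_, _), (_, _, r_skewed) = stats.probplot(skewed_sample, dist="norm")

# r close to 1 means the points lie close to a straight line on the Q-Q plot
print("normal sample r:", r_normal)
print("skewed sample r:", r_skewed)
```

Passing plot=plt (with matplotlib.pyplot imported as plt) would draw the Q-Q plot itself instead of only returning the numbers.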

Part 4 : We move on to the next step in data analysis, exploring relationships between variables in the data.

We use histograms and scatterplots to do so.

Part 5 : A broader and smarter way to do initial exploratory analysis is to plot the values of all possible pairs of variables and then assess the relationships. We do this using the matplotlib subplot function and pandas.plotting.scatter_matrix on our example dataset.
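
A minimal sketch of the pairwise-plot idea with pandas.plotting.scatter_matrix follows. The three columns are hypothetical, and I have assumed that sales fall as price rises purely so that one panel shows a visible relationship.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data; names and parameters are assumptions.
rng = np.random.default_rng(seed=2)
n = 300
price = rng.choice([2.29, 2.49, 2.99], size=n)
df = pd.DataFrame({
    "p1_price": price,
    "p1_sales": rng.poisson(lam=300 / price),  # assumed: lower price, higher sales
    "p2_sales": rng.poisson(lam=100, size=n),
})

# One scatterplot per pair of columns, with histograms on the diagonal
axes = scatter_matrix(df, figsize=(7, 7), diagonal="hist", alpha=0.5)
print(axes.shape)
```

In a notebook the Agg line can be dropped, and the grid of plots is displayed directly.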

Part 6 : In this story we use the seaborn PairGrid class to get more flexible plots than scatter_matrix, for better visualization when analysing relationships between variables.

Part 7 : To understand the relationships between variables through actual numerical values, we need to go beyond visualization and use statistics, correlation coefficients in particular.

We start with the numpy cov function, move on to the numpy corrcoef function, then assess the significance of the correlation using scipy.stats.pearsonr(), use the dataframe corr() method to get coefficient matrices, and plot them using matplotlib imshow and seaborn heatmaps.
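
The numerical side of this can be sketched as below. The price and sales columns are simulated under an assumed negative linear relationship, so the numbers only illustrate how the four calls relate to each other.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Simulated data with an assumed negative price/sales relationship.
rng = np.random.default_rng(seed=3)
n = 500
price = rng.uniform(2.0, 3.0, size=n)
sales = 400 - 100 * price + rng.normal(0, 10, size=n)
df = pd.DataFrame({"price": price, "sales": sales, "noise": rng.normal(size=n)})

# Covariance matrix: its scale depends on the units of the variables
cov_matrix = np.cov(df["price"], df["sales"])

# Correlation matrix: covariance rescaled to the range -1 to 1
corr_matrix = np.corrcoef(df["price"], df["sales"])

# Pearson r together with a p-value for the null hypothesis of no correlation
r, p_value = stats.pearsonr(df["price"], df["sales"])

# Full correlation matrix for every pair of dataframe columns
corr_df = df.corr()

print(corr_matrix)
print(r, p_value)
print(corr_df)
```

The off-diagonal entry of corr_matrix, the r from pearsonr and the price/sales entry of corr_df are the same quantity computed three ways.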

Part 8 : Transforming variables that are not normally distributed is often an important prerequisite before moving forward with any data analysis step that aims to find relationships and make correct predictions.

We look into options for doing so in our ongoing example using the scipy.stats boxcox() function, and use the results to understand how the transformation makes a difference in drawing meaningful insights.
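
Here is a small sketch of what boxcox() does, using a simulated right-skewed sample as an assumed stand-in for a sales variable. Note that Box-Cox requires strictly positive values.

```python
import numpy as np
from scipy import stats

# Simulated right-skewed, strictly positive data (parameters are assumptions).
rng = np.random.default_rng(seed=4)
sales = rng.lognormal(mean=4.5, sigma=0.6, size=1000)

# With no lambda supplied, boxcox estimates it by maximum likelihood
# and returns the transformed data along with the fitted lambda
transformed, fitted_lambda = stats.boxcox(sales)

# Skewness near 0 indicates a more symmetric distribution
skew_before = stats.skew(sales)
skew_after = stats.skew(transformed)

print("fitted lambda:", fitted_lambda)
print("skewness before:", skew_before)
print("skewness after:", skew_after)
```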

Part 9 : Our final story is a brief one. It picks up a categorical variable from our dataset and shows how certain transformations need to be applied to it in order to produce meaningful visualizations.

With this we conclude the series on the fundamentals of marketing analytics using Python.

Watch this space for more on this topic, including advanced data analysis and examples on segmentation and classification.

Please clap/comment if you enjoyed the content :)
