The 5 Pillars of Data Science

Ammar Jamshed
Published in Geek Culture
5 min read, Jan 29, 2022


Photo designed by Sabah Shahed

1. Consumer Desire

Source: https://muirbury.com/modern-consumer-and-the-opportunity/

We must always start by asking how our data science project is going to benefit our stakeholders.

Suppose we build a machine learning model to predict consumer churn. Will it help the organization determine which consumers are leaving or about to leave, and why, so that it can build business strategies to retain them? Surprisingly, many machine learning models for consumer churn barely address which consumers are going to leave and only estimate how many might. Deep learning models with more advanced algorithms can make it possible to tackle that question.
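The difference between knowing how many consumers may leave and knowing which ones can be sketched in a few lines. The customer IDs and churn probabilities below are made-up values standing in for the output of any churn model; they are not results from a real dataset, and the threshold is an assumption.

```python
# Hypothetical per-customer churn probabilities from some trained model.
churn_scores = {
    "C001": 0.82,
    "C002": 0.10,
    "C003": 0.55,
    "C004": 0.05,
    "C005": 0.91,
}

# "How many": the expected number of churners is the sum of the probabilities.
expected_churners = sum(churn_scores.values())

# "Which": rank customers by risk and keep those above a chosen threshold.
THRESHOLD = 0.5  # assumed cut-off; a real project would tune this
at_risk = sorted(
    (cid for cid, p in churn_scores.items() if p >= THRESHOLD),
    key=lambda cid: churn_scores[cid],
    reverse=True,
)

print(f"Expected churners: {expected_churners:.2f}")
print("Customers to contact first:", at_risk)
```

The aggregate number helps with budgeting, but only the ranked list tells a retention team whom to approach.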

2. Business objectives

Source: Kaylan City blogspot

We need to establish what the business plans to achieve with the project. Does it need the predictions in order to build:

  • A forecasting tool.
  • A budget tool.
  • An application programming interface (API) that serves predictions on all upcoming sales and merges them into the online work tool for data-driven decision making.
  • A business use case website to check predictions on order volumes, customer churn and new customer acquisitions.
  • Or does it only require a one-time answer for a decision it is making to meet the current year's business objectives?

In each of these cases the data scientist or ML engineer will have to partner with the right technical talent, such as a web developer, a domain expert or even a data management expert, to execute and deploy the end service.
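As a rough illustration of the API option, the sketch below wraps a prediction in a JSON payload that a web developer could serve from any framework. The function names, the field names and the naive "average of recent sales" forecast are all assumptions made for this example, not a recommended forecasting method.

```python
import json

def forecast_next_sales(recent_sales):
    """Naive placeholder forecast: the mean of the recent sales figures."""
    return sum(recent_sales) / len(recent_sales)

def build_prediction_response(product, recent_sales):
    """Package the forecast as JSON so other tools can consume it."""
    payload = {
        "product": product,
        "forecast_units": round(forecast_next_sales(recent_sales), 1),
        "based_on_periods": len(recent_sales),
    }
    return json.dumps(payload)

# Hypothetical sales figures for the last three periods.
response = build_prediction_response("apples", [120, 135, 150])
print(response)
```

Keeping the forecast function separate from the packaging step means the same model can feed a one-time report, a budget tool or a website without being rewritten.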

3. Context of data in business settings.

/* WHAT DO WE DO WITH 28389 32923 & 9020 */

We first need to understand the data and the reason it exists in the business.

“The Astuteness and Acuteness of data within the business matter the most during any model creation process or machine learning framework design for the business.” Ammar Jamshed (Author)

Astuteness refers to how much of the daily operations are based on data. In a hospital it is nearly 100%, since staff have to look at statistical factors such as blood reports and genetic history when diagnosing a patient. In a small-scale food manufacturer, only sales rely on data analysis and forecasts, while manufacturing is not that dependent on data unless the company operates at a large scale.

Acuteness refers to how sensitive the business is to errors in the data. A 5% margin of error at an investment house may not be considered a big problem, as it only implies a 5% chance of a decline in profit, and money can be recovered later through market appreciation. A 5% margin of error at a hospital is a very serious problem, as it implies a 5% chance of more patients dying than assumed, which can result in financial losses, legal issues and staff demoralization.
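To make acuteness concrete, the same error rate can be turned into an expected number of wrong decisions. The yearly volumes below are illustrative assumptions only; the point is that an identical figure carries very different consequences.

```python
def expected_errors(error_rate, decisions):
    """Expected number of wrong predictions among a number of decisions."""
    return error_rate * decisions

ERROR_RATE = 0.05  # the 5% chance of being wrong discussed above

# Hypothetical yearly volumes, chosen only for illustration.
trades = 1000
diagnoses = 1000

wrong_trades = expected_errors(ERROR_RATE, trades)
wrong_diagnoses = expected_errors(ERROR_RATE, diagnoses)

print(f"Investment house: about {wrong_trades:.0f} losing trades, recoverable later")
print(f"Hospital: about {wrong_diagnoses:.0f} misjudged patients, not recoverable")
```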

We also have to look at the infrastructure in which the data is stored. If a business keeps its data in spreadsheets, does it do so for low-cost maintenance, or for the tabular views that employees find easy to understand? And does it generate its data through manual entry or through tool exports, for example from Google Analytics or Firebase?

Each business, depending on the product or service it offers, has different data requirements.

  • Software houses, which build applications directly, keep their product data in their integrated development environments and can export it as HTML to upload to their employee websites or as Excel files to share with other departments.
  • Food manufacturers and consumer goods companies mainly store their data in tabular form, in spreadsheets that everyone can understand, and may even link their Excel sheets to their budget tools so that they update automatically whenever new data is entered.

These are a few examples of data management settings by industry and company.
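Whatever the storage choice, the data usually reaches the data scientist as an export. A minimal sketch of reading a spreadsheet saved as CSV with Python's standard library follows; the column names and rows are invented, and an in-memory string stands in for the exported file.

```python
import csv
import io

# Stand-in for a spreadsheet exported as CSV (hypothetical columns and values).
exported = io.StringIO(
    "date,product,units_sold\n"
    "2022-01-03,apples,120\n"
    "2022-01-04,apples,135\n"
    "2022-01-04,bananas,80\n"
)

rows = []
for record in csv.DictReader(exported):
    # CSV values arrive as text, so numeric columns must be converted.
    record["units_sold"] = int(record["units_sold"])
    rows.append(record)

total_units = sum(r["units_sold"] for r in rows)
print(f"Loaded {len(rows)} rows, {total_units} units in total")
```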

4. Statistical journey of data

Source: Dreaming Andy - Fotolia

After getting access to the client's data, we first need to check the datasets for anomalies and errors so that our analysis and predictions are as accurate as possible.

There is, of course, no such thing as 100% accuracy, because the data is handled, and to some extent altered, throughout the business process before it reaches us. For example, an inventory manager can enter a wrong value that then sits unnoticed in the database among millions of other entries.
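A first pass over the data can be as simple as flagging missing, impossible or extreme entries. The sketch below applies a median-based rule to made-up inventory counts; this rule is only one possible check chosen for illustration, and the cut-off of 3.5 is a common rule of thumb rather than a requirement.

```python
import statistics

# Hypothetical inventory counts; None marks a missing entry.
stock = [102, 98, 101, None, 97, 1000, 99, -5, 103]

present = [v for v in stock if v is not None]
median = statistics.median(present)
# Median absolute deviation: a spread measure that outliers barely affect.
mad = statistics.median(abs(v - median) for v in present)

def is_suspicious(value, cutoff=3.5):
    """Flag missing, negative, or far-from-median values."""
    if value is None or value < 0:
        return True
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return abs(0.6745 * (value - median) / mad) > cutoff

flagged = [(i, v) for i, v in enumerate(stock) if is_suspicious(v)]
print(flagged)
```

Flagged entries still need a human decision: a count of 1000 might be a typing error or a genuine bulk delivery, and only someone who knows the business can tell.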

We then have to visualize the data and understand which of its features are important. In the case of fruit sales data, the features that give a holistic view of the data, and that can also be used to predict fruit prices, would be fruit prices, fruit sales, fruit quantities and fruit categories.
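One simple way to judge whether a feature relates to the price is to compute its correlation with the price alongside any plots. The fruit figures below are invented, and Pearson correlation is only one of several possible checks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly figures for one fruit.
quantity_available = [500, 450, 400, 350, 300]
price = [1.00, 1.10, 1.25, 1.40, 1.60]

r = pearson(quantity_available, price)
print(f"Correlation between quantity and price: {r:.2f}")
```

A value close to -1 or 1 suggests the feature moves with the price and is worth keeping; a value near 0 suggests it adds little on its own.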

We often have to apply encoding algorithms to convert categorical values, such as fruit categories, into numbers, because most machine learning algorithms only work on numerical data.
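One-hot encoding is one common way to do this. A minimal sketch in plain Python follows, with hypothetical fruit categories and a helper name chosen for this example:

```python
def one_hot_encode(values):
    """Map each category to a list with a single 1 in its own position."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded

fruit_categories = ["citrus", "berry", "tropical", "berry"]
columns, matrix = one_hot_encode(fruit_categories)

print(columns)  # ['berry', 'citrus', 'tropical']
for row in matrix:
    print(row)
```

In practice, libraries such as pandas and scikit-learn provide ready-made encoders, so a hand-written version like this is mainly useful for understanding what they do.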

Before we apply predictive machine learning or deep learning algorithms, we have to revisit each step to check that we made no errors on our end.

5. Communication of results achieved.

/* You Dear are worth 9 million Dollars in cash */

This is often the most difficult part, because the audience comprises people with vastly diverse backgrounds and mindsets, and they cannot all be catered to individually each time.

So we have to come up with a narrative that is understandable to everyone, regardless of their way of thinking or background.

If we explain a data story using technical terms and advanced mathematical calculations, only people with a background in those fields will understand it.

We have to make sure that everyone in our audience understands what we are saying so that they adopt our solution.

Even if we word our explanations in basic terminology, some will still not understand what we are trying to say.

We have to think of something that every person in the audience can relate to. Food is one such thing, as everyone eats to survive.

Author's picture

We can explain our machine learning prediction as a café menu that shows what we will order from the items available at the café, and that sometimes shows the wrong item, one we would not order. If our machine learning model has 98% accuracy, we can explain that out of every 100 visits to the café, the smart menu we designed as a machine learning model will correctly predict our order about 98 times.
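The accuracy figure itself is just the share of correct predictions. A small sketch with invented café orders follows, where the "smart menu" guesses are hypothetical. With ten visits and one wrong guess the accuracy is 90%; a model with 98% accuracy would miss about 2 orders in every 100 visits.

```python
def accuracy(predicted, actual):
    """Share of predictions that match what was actually ordered."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Ten hypothetical café visits: what the menu guessed vs. what was ordered.
predicted = ["latte", "tea", "latte", "mocha", "tea",
             "latte", "latte", "tea", "mocha", "latte"]
actual = ["latte", "tea", "latte", "mocha", "tea",
          "latte", "tea", "tea", "mocha", "latte"]

score = accuracy(predicted, actual)
print(f"The menu guessed {score:.0%} of the orders correctly")
```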


Ammar Jamshed

Data Science Specialist - Data Sciences | Business Intelligence | Social Sciences | Machine Learning | Coding | Lifestyle