Analytika: The Analytics Process

Vi · Clayming Space · Apr 3, 2018

In this second part of Analytika, we delve into the process of analytics in decision making. Like any project, it proceeds in stages, and the analytics process can likewise be broken into distinct stages.

The Process of Analytics

The Process of Analytics © Boston University

Define business objective

The analytics process always starts with a business strategy objective, which drives the method through a systems-thinking approach. Some examples of these business strategy objectives:

  • How should our product lines evolve to increase sales?
  • How do we keep our customers on our platforms or products and maintain a low attrition rate?

When in doubt during the analytics process or data strategy, we always refer back to this objective or requirement as our guide to the critical path.

Translate business objective to a data project

In this part of the analytics process, we focus on transforming the objective into specific, preferably quantified, metrics.

For example, if we were directing a marketing campaign for an electric car, we would need to find the characteristics of the end user: are they sympathetic to environmental issues (a qualitative attribute); what is their age group (younger people generally tend to be more environmentally focused); what is their annual income (can they afford our electric car); and more. These metrics will allow us to direct future marketing campaigns at these specific individuals.
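
As a minimal sketch, the translation might be captured as a handful of candidate metrics and one concrete target. All names and numbers below are illustrative assumptions, not real figures.

```python
# Illustrative translation of "increase electric car sales" into
# quantifiable customer metrics. Names and thresholds are assumptions.
candidate_metrics = {
    "env_score": "proxy for environmental consciousness (qualitative)",
    "age_group": "age bucket, e.g. 18-29, 30-44, 45+",
    "annual_income": "modeled or self-reported income in USD",
}

# A testable target derived from the business objective.
campaign_target = {
    "metric": "campaign_response_rate",
    "baseline": 0.02,  # assumed historical response rate
    "goal": 0.03,      # assumed uplift the campaign should reach
}
print(candidate_metrics, campaign_target)
```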

Select and obtain the data

Having established a plan for transforming the business objective into the specific data required to achieve the goal, we now select the data we need.

For example, in the above case of an electric car, we look at past campaigns, whether our own or others'. If we are in a new industry or pursuing a new idea, we look at parallels in history, and the process becomes more experimental.

We may not have all the data. At times, we will need to go out and get data that is either free and publicly available or must be paid for. We also need to think ahead about data sourcing, especially if we do not have the data internally: is this something we can collect from our customers in the future as part of the product or service, or would we need to keep obtaining it from external third-party organizations? A sketch of combining internal and external sources follows.
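
Here is a minimal sketch of pulling together an internal extract and a purchased third-party file; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sources: an internal CRM extract plus a purchased
# third-party demographics file.
internal = pd.read_csv("crm_customers.csv")        # e.g. customer_id, age, annual_income
external = pd.read_csv("vendor_demographics.csv")  # e.g. customer_id, env_score

# Record provenance so future runs know which data was collected
# internally and which was bought from a third party.
internal["source"] = "internal"
external["source"] = "vendor"
```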

Explore the data

After accumulating most of the data (we may come back for more later, based on our analysis and testing) and agreeing on a method of analysis, we explore the data.

Exploration involves understanding the data, identifying whether it needs correcting, establishing what those corrections are, and determining how they may affect our analysis.

Going back to the electric car marketing campaign: it is harder to establish whether people are environmentally conscious than to find their age or gender. In such a case, we need to account for how much this will affect the optimization of our marketing campaign, and we will need a strategy for cleaning and transforming the data.
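
In practice, exploration often starts with a few routine checks. A minimal sketch, assuming a hypothetical past-campaign file:

```python
import pandas as pd

df = pd.read_csv("campaign_history.csv")  # hypothetical past-campaign data

# Basic exploration: size, types, and how much of each column is missing.
print(df.shape)
print(df.dtypes)
print(df.isna().mean().sort_values(ascending=False))  # fraction missing per column

# Hard-to-measure attributes (like environmental consciousness) often
# surface here as mostly-missing or free-text columns.
print(df.describe(include="all"))
```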

Transform and clean the data

This aspect of the process typically consists of data wrangling and cleansing. We also need to merge any data that we already have with any data we obtain from other sources.

Because we cannot reliably establish whether individuals are environmentally conscious, we may decide to discard that attribute from the campaign, while keeping in mind that if we gain access to such data in the future, our analysis should accommodate it.
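
Continuing the sketch, the wrangling step might merge the internal and external files and drop the unreliable attribute; keys and column names remain hypothetical.

```python
import pandas as pd

internal = pd.read_csv("crm_customers.csv")
external = pd.read_csv("vendor_demographics.csv")

# Merge internal and external data on a shared key (assumed: customer_id).
df = internal.merge(external, on="customer_id", how="left")

# Discard the unreliable attribute for now, as discussed above; if better
# data arrives later, this is the one line to revisit.
df = df.drop(columns=["env_score"], errors="ignore")

# Routine cleansing: remove duplicate customers and fill missing income
# with the median rather than dropping rows.
df = df.drop_duplicates(subset="customer_id")
df["annual_income"] = df["annual_income"].fillna(df["annual_income"].median())
```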

Conduct data analysis

With the data wrangled, cleansed, and merged, we are ready to conduct formal analysis.

A common approach here is regression. A regression model will surface customer attributes in our historical data set that are strongly correlated (positively or negatively) with whether or not customers responded to our electric car marketing campaigns.

We tend to run several models, sometimes of different types, to find the one with the best predictive performance: a model that distinguishes good potential prospects from bad ones.
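
A minimal sketch of that comparison using scikit-learn, assuming the hypothetical labeled data and feature columns from the earlier steps. Since the outcome here is responded/did not respond, logistic regression is the natural regression variant.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("campaign_history.csv")  # hypothetical labeled data
X = df[["age", "annual_income"]]          # assumed feature columns
y = df["responded"]                       # assumed: 1 if the customer responded

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Try more than one model type and keep whichever best separates
# likely responders from unlikely ones on held-out data.
for model in (LogisticRegression(max_iter=1000), RandomForestClassifier()):
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(type(model).__name__, round(auc, 3))
```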

Translate data insights into action

Once we’ve agreed on a model, we focus on translating it into actionable insights. In this case, we want to improve the response rate of our future electric car marketing campaigns.

Given the predictive outcomes we’ve determined, we can target individuals with the specific characteristics that the model predicts will respond well to the campaign, which makes our campaigns more cost- and time-effective.
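
Continuing the sketch above, acting on the model might look like scoring a fresh prospect list and contacting only the most promising decile. The 10% cutoff, file names, and columns are illustrative assumptions.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Fit the chosen model on all labeled history, then score new prospects.
history = pd.read_csv("campaign_history.csv")
model = LogisticRegression(max_iter=1000).fit(
    history[["age", "annual_income"]], history["responded"]
)

prospects = pd.read_csv("prospects.csv")  # hypothetical new leads
prospects["p_respond"] = model.predict_proba(
    prospects[["age", "annual_income"]]
)[:, 1]

# Contact only the top decile of predicted responders; the cutoff is an
# illustrative budget choice, not a rule.
cutoff = prospects["p_respond"].quantile(0.90)
target_list = prospects[prospects["p_respond"] >= cutoff]
print(f"Targeting {len(target_list)} of {len(prospects)} prospects")
```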

Assess outcomes

Finally, we need to assess the outcomes of the analytics process for the specific business objective. We need to ask ourselves:

  • Did the model predict the actual response rate, and what was the margin of error? (A quick check is sketched after this list.)
  • What did we learn?
  • Can we use what we learned in the next campaign to be more effective?
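
The first question can be answered with a one-line comparison once the campaign results are in; the numbers below are placeholders.

```python
# Compare the response rate the model predicted with what actually
# happened. Both figures are placeholder values, not real results.
predicted_rate = 0.031  # mean predicted probability over the target list
actual_rate = 0.027     # observed responses / contacts

error = actual_rate - predicted_rate
print(f"predicted {predicted_rate:.1%}, actual {actual_rate:.1%}, error {error:+.1%}")
```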

The analytics process is a continuous loop, whether it runs over one business objective or the business as a whole. It's important to remember that things change with time, and so should our analytics process and business objectives.


Analytics Process Tools

In the world of data analytics there is a myriad of tools available, and more are always being built, so it can be hard for a leader to keep up. The best way to understand these tools is to categorize them by the domain in which they are used during decision making. These domains follow the steps of the analytics process, and when it comes to data there are four main ones:

  1. Data Storage
  2. Data Transformation
  3. Data Exploration and Analysis
  4. Data Visualization

Most tools focus on one of these steps, while some technologies and tools span a few or all of them.

Data Storage

Efficient and effective storage is vital in any data-related venture. The important distinction here is between SQL and NoSQL data.

SQL, or Structured Query Language, is well suited to structured data. SQL was designed at a time when acquiring and storing data was costly.

NoSQL, on the other hand, tends to be well suited to unstructured data (which most modern data is), such as images, social media posts, and videos, to name a few. That said, storing unstructured data requires more processing power and storage capacity than it would if the data were better organized.
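
To make the contrast concrete, here is a small sketch that stores the same kind of information both ways, using SQLite for the structured side and JSON documents standing in for a NoSQL document store. The schema and records are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured (SQL) storage: a fixed schema of rows and columns.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age INTEGER, income REAL)")
conn.execute("INSERT INTO customers VALUES (1, 34, 72000.0)")

# Unstructured data, NoSQL-style: each record is a free-form document.
# Here we emulate a document store by saving JSON blobs.
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
post = {"user": "vi", "text": "Loving my new EV!", "tags": ["ev", "green"]}
conn.execute("INSERT INTO documents VALUES (1, ?)", (json.dumps(post),))

print(conn.execute("SELECT * FROM customers").fetchall())
print(json.loads(conn.execute("SELECT body FROM documents").fetchone()[0]))
```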

Data Transformation

SQL is used to transform structured data held as rows and columns in database tables. Unstructured data is not easily categorized and so requires a different approach, which is where NoSQL comes to its aid.

Apache Hadoop is a popular framework for dividing storage and processing tasks across clusters of cheap commodity hardware, rather than large data silos, while analyzing big data sets in parallel; MongoDB, a popular NoSQL document database, distributes data across clusters in a similar spirit.
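
The split-and-combine idea behind such systems can be illustrated on a single machine. The toy word count below partitions the data, processes the chunks in parallel, and merges the partial results, which is what Hadoop-style systems do across many machines.

```python
from multiprocessing import Pool

def count_words(chunk):
    """Map step: count words within one partition of the data."""
    counts = {}
    for line in chunk:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "the fox again"] * 1000
    chunks = [lines[i::4] for i in range(4)]  # partition into 4 chunks

    with Pool(4) as pool:                     # process partitions in parallel
        partials = pool.map(count_words, chunks)

    # Reduce step: merge the partial counts into one result.
    total = {}
    for partial in partials:
        for word, n in partial.items():
            total[word] = total.get(word, 0) + n
    print(total["the"])
```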

Data Exploration and Analysis

Descriptive analytics is usually associated with Business Intelligence (BI) tools. Their purpose is to aggregate and break large data sets down into small nuggets, displayed in a visually appealing manner, such as tables and plots that highlight domains of interest. Common tools in the BI space are Microsoft Power BI and Tableau.
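
Behind the dashboards, the core operation is usually an aggregation like the one below; the data is made up just to show the shape of the result.

```python
import pandas as pd

# The kind of summary a BI tool builds behind the scenes: collapse a
# large table into a small, readable nugget per segment.
df = pd.DataFrame({
    "age_group": ["18-29", "18-29", "30-44", "30-44", "45+"],
    "responded": [1, 0, 1, 1, 0],
})
summary = df.groupby("age_group")["responded"].agg(["count", "mean"])
summary.columns = ["contacts", "response_rate"]
print(summary)
```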

Machine Learning (ML) tools are used for predictive analytics, even though that is only one small aspect of ML; they can also be used for computer vision, natural language processing, path planning, games, and so on. Popular examples include scikit-learn, TensorFlow, Amazon ML, and Azure ML.

Finally, statistical modelling tools like R, SAS, and SPSS are versatile tools that perform descriptive, predictive, and prescriptive analytics. They tend to require more programming skill than the others.

Data Visualization

Visualization tools enable a better representation of the analytics performed on data. They tend to follow one principle: simple and clean. They use a variety of graphical methods to communicate data insights. Most BI, ML, and statistical package tools have some visualization component of their own.
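
A minimal sketch of that principle with matplotlib: one chart, one message. The response rates are illustrative.

```python
import matplotlib.pyplot as plt

# Illustrative response rates per age group from the campaign example.
groups = ["18-29", "30-44", "45+"]
rates = [0.034, 0.029, 0.012]

fig, ax = plt.subplots()
ax.bar(groups, rates)
ax.set_xlabel("Age group")
ax.set_ylabel("Campaign response rate")
ax.set_title("Who responds to the electric car campaign?")
plt.show()
```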


Analytics Team Roles

Analytics projects within organizations happen in teams. Each team member has a role matched to the skill set the project needs. The roles may carry the same or different names depending on the organization, but they tend to map onto the same four main roles. These four main roles in an analytics team are:

  1. Data Translator
  2. Data Analyst
  3. Data Scientist
  4. Data Engineer

Data Translator

  • The translator is the interface between the data analytics team and the business.
  • They tend to be business savvy and have broad understanding of the technical aspects of data analytics and science.
  • They define the business goals for the data projects, help translate the results into business insights, develop the strategy for getting the appropriate data, and help stakeholders use these insights to make critical decisions.
  • They also tend to oversee the entire process; sometimes a separate Data Manager is involved in managing it.

Data Analyst

  • The analyst tends to be the junior in the team.
  • They tend to do a lot of the preliminary tasks, such as creating databases and warehouses and using BI and visualization tools to create charts and reports.

Data Scientist

  • The scientist is responsible for constructing and executing analytical models from the data obtained.
  • They tend to be statisticians well versed in statistical analysis and more advanced techniques such as machine learning, natural language processing, deep learning, and social network theory and analysis.

Data Engineer

  • The engineer is responsible for the collection, transformation, and organization of the data for the project.
  • They tend to be computer scientists and software engineers with background and experience in data modelling, programming and databases.
