Adding context to analytics with different types of data
Data can be vast and overwhelming, so understanding the different types we can use in our analytics projects can help us narrow our focus. Even with the treasure trove of data most organizations have in-house, there are tons of additional data sets that can be included to add valuable context and create even deeper insights. It’s important to keep in mind what type of data it is, when and where it was created, what else was going on in the world when this data was created, and so forth. Using the example of a restaurant, let’s look at some different types of data and how they could impact an analytics project.
Numerical data is measurable and always expressed in numerical form: for example, the number of diners attending a particular restaurant over the course of a month, or the number of appetizers sold during a dinner service. It can be segmented into two sub-categories.
Discrete data represent items that can be counted. Each value is an exact number, and the possible values can be listed out. The list of possible values may be fixed (also called finite), or it may go from 0, 1, 2, on to infinity (making it countably infinite). For example:
- Number of diners that ate at the restaurant on a particular day (you can’t have half a diner).
- Number of beverages sold each week.
- Number of employees on staff at the restaurant on a given day.
Continuous data represent measurements; their possible values cannot be counted and can only be described using intervals on the real number line. For example, the exact amount of vodka left in a 750 mL bottle would be continuous data from 0 mL to 750 mL, represented by the interval [0, 750], inclusive. Other examples:
- Pounds of steak sold during dinner service
- The high temperature in the city on a particular day
- How many ounces of wine were poured in a given week
You should be able to do most mathematical operations on numerical data, as well as list it in ascending or descending order and display it as fractions.
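As a rough illustration of the operations on numerical data, here is a minimal Python sketch. The diner counts and steak weights are invented for the example and do not come from any real restaurant; the variable names are likewise hypothetical.

```python
from fractions import Fraction
from statistics import mean

# Hypothetical sample data, invented for illustration only.
diners_per_day = [112, 98, 134, 120]             # discrete: whole counts
steak_pounds_sold = [41.5, 38.25, 52.0, 47.75]   # continuous: measurements

total_diners = sum(diners_per_day)                     # arithmetic on counts
avg_steak = mean(steak_pounds_sold)                    # arithmetic on measurements
busiest_first = sorted(diners_per_day, reverse=True)   # descending order
steak_as_fraction = Fraction(steak_pounds_sold[1])     # 38.25 shown as a fraction

print(total_diners, avg_steak, busiest_first, steak_as_fraction)
```

Note that the counts are stored as integers and the measurements as floats, which mirrors the discrete/continuous split described above.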
Categorical data represent characteristics, such as sex or the types of books someone likes. Categorical data can take on numerical values (such as “1” indicating male and “2” indicating female), but those numbers don’t have a mathematical meaning and can’t be added together. (Categorical data may also be referred to as qualitative data or, when there are only two categories, Yes/No data.) For example:
- Marital status
- Favorite types of restaurant
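To make the point about numeric codes concrete, the following Python sketch counts categories instead of doing arithmetic on them. The cuisine labels, codes, and survey answers are all hypothetical.

```python
from collections import Counter

# Hypothetical coding scheme: the numbers are labels, not quantities.
CUISINE_LABELS = {1: "Italian", 2: "Thai", 3: "Mexican"}
favorite_cuisine_codes = [1, 3, 2, 3, 3, 1]  # invented survey answers

# Counting how often each category occurs is meaningful.
counts = Counter(CUISINE_LABELS[code] for code in favorite_cuisine_codes)
most_common_label, most_common_count = counts.most_common(1)[0]

# Adding the codes runs without error, but the result has no interpretation.
meaningless_total = sum(favorite_cuisine_codes)

print(counts, most_common_label, most_common_count)
```

Frequencies and the most common category are the kinds of summaries that suit categorical data; sums and averages of the codes are not.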
Ordinal data combines features of numerical and categorical data. The data does fall into categories, but the numbers assigned to each category have meaning because they put the categories in order. Take Yelp as an example: rating restaurants on a scale from 0 (lowest) to 5 (highest) stars yields ordinal data. Ordinal data is often visualized in charts and graphics, but unlike categorical data, its numbers do have an associated mathematical meaning. In a survey of 1,000 people rating a restaurant on a scale from 0 to 5, the average of the 1,000 responses has meaning, so the ratings would not be classified as categorical data. Other examples:
- Average customer satisfaction rating for a given month
- Google seller rating
- OpenTable rating
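A short Python sketch of summarizing ratings like these follows. The star ratings are invented, and the choice of summaries is only one reasonable option: the mean treats the gaps between stars as equal, while the median and the share of high ratings rely only on the ordering.

```python
from statistics import mean, median

# Hypothetical star ratings on the 0-5 scale described above.
ratings = [5, 4, 4, 3, 5, 2, 4]

average_rating = mean(ratings)    # assumes equal spacing between stars
middle_rating = median(ratings)   # uses only the order of the ratings
share_four_plus = sum(r >= 4 for r in ratings) / len(ratings)

print(round(average_rating, 2), middle_rating, round(share_four_plus, 2))
```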
Looking at our restaurant, there are a lot of different ways we can approach analysis using contextual data. Incorporating weather data and seasonality with sales data may help us better understand which items sell better during specific seasons and weather conditions. It could also be interesting to identify how the number of employees working in the restaurant and the number of diners on a given day affect the average sale amount per diner and the average Yelp review score. If we owned a chain of restaurants, we could create a benchmark scorecard for each location to enable performance comparison within our group of restaurants.
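One way to sketch the weather-and-sales idea in Python is to join the two data sets on date and compare average spend per diner by weather condition. Everything here is hypothetical: the dates, revenue figures, diner counts, and weather labels are invented, and three days of data are far too few to support a conclusion. The point is only the mechanics of enriching one data set with another.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records keyed by date; all values are invented.
sales = {
    "2024-07-01": {"revenue": 4200.0, "diners": 120},
    "2024-07-02": {"revenue": 3100.0, "diners": 100},
    "2024-07-03": {"revenue": 4500.0, "diners": 125},
}
weather = {
    "2024-07-01": "sunny",
    "2024-07-02": "rainy",
    "2024-07-03": "sunny",
}

# Enrich each sales record with that day's weather, then group by condition.
spend_by_weather = defaultdict(list)
for day, record in sales.items():
    condition = weather.get(day)
    if condition is None:
        continue  # skip days that have no weather record
    spend_by_weather[condition].append(record["revenue"] / record["diners"])

avg_spend_by_weather = {cond: mean(vals) for cond, vals in spend_by_weather.items()}
print(avg_spend_by_weather)
```

The same pattern extends to staffing levels or review scores: key each data set by date (or by location, for a chain) and combine them before summarizing.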
The specific data sets that will be most relevant to a particular analytics use case will vary based on industry and the focus of the project, but the main point to keep in mind is that no data point lives in a vacuum. Regardless of the type of data, what these examples highlight is that there is plenty of information you can gather from the information you already have; when you enrich that data your insights grow more profound.
If you’re interested in learning more about Keboola, check out how we helped CSC add context to their digital marketing analytics by bringing together over 50 data sources.