Little Crash-Course in Big Data – Collecting Data
A designer’s guide to quantitative data from one amateur to another (Part 1 of 3)
If you are a designer, chances are that you are quite excited about Big Data. But chances are also that data basics and statistics were never part of your curriculum at university. This means that you’ve got some catching up to do. Before jumping in the deep end with Big Data, let’s get a bit more familiar with Plain Old Data™ first.
This article is the first in a series of three about how to collect, analyse and visualise quantitative data while avoiding common pitfalls. You can find the other parts here — Analysing Data and Visualising Data.
Big, dark or dirty?
First, let’s cover a few data-related terms.
Big Data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. (Wikipedia)
Gartner defines Big Data as: “Big Data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimisation.”
Big Data is tricky. Big Data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process data. (Wikipedia)
Big Data is awesome. It allows correlations to be found to “spot business trends, prevent diseases, combat crime and so on.” (Wikipedia)
Big Data is a popular buzzword in the design world. When we’re working with data as designers, are we working with Big Data or just Plain Old Data? Here are some examples of Big Data projects that can help orient you.
Here are some other useful terms in the data space.
Dark Data refers to underutilised information assets that have been collected for a single purpose and then archived. Given the right circumstances, that data can be mined for other purposes.
Dirty Data is data that is inaccurate, incomplete or full of errors. Unclean data can contain mistakes like spelling or punctuation errors, incorrect data associated with a field, incomplete or outdated data etc.
Small Data is a term to describe data that is small enough in size for human comprehension. Small Data connects people with timely, meaningful insights (derived from Big Data), organised and packaged — often visually — to be accessible, understandable, and actionable for everyday tasks.
Designers’ current expertise is probably more closely related to Small Data than to Big Data. Unfortunately, the term Small Data lacks a certain cachet.
Collecting the data
A variable is a value that can vary. For example: yes-no, a colour, scales like “how much do you like this on a scale from 1 to 5,” temperature, distance you’ve walked, number of hard brakes while driving, time to deliver etc.
LEVELS OF MEASUREMENT
Variables are measured on four types of scale: nominal, ordinal, interval and ratio.
The nominal scale can be organised in categories but has no natural order. Examples are political opinion, gender, concept preference, cat-owners vs dog-owners etc.
The ordinal scale can be arranged in order e.g. a Likert scale which is commonly used in surveys (“Indicate on a scale 1–5 how much you like x. ‘Strongly agree, Agree, Neutral, Disagree, Strongly disagree‘”).
Most psychological data collected by psychometric instruments and tests are measured on an ordinal scale. A lot of data in design is also measured this way, especially early in the design process when we are trying to gauge sentiment towards ideas or statements e.g. a survey about which concept is more appealing.
Unlike the ordinal scale, the interval scale has equal distance between values, but it doesn’t have a true zero. A great example is temperature. Fahrenheit and Celsius chose different, arbitrary zero points, and the scale goes both up and down from there. (Note: Kelvin has a true zero.)
The ratio scale has equal distance and a true zero. Weight, height, speed etc are examples here. Most measurements in the physical sciences and engineering are made on a ratio scale. Many things measured later in the design process fall into this category as well, for example when we can directly measure clicks, time spent on tasks, money spent etc.
Depending on the level of measurement you are working with, different mathematical operations (addition, subtraction, multiplication and division) should be used.
The ordinal scale is the most controversial in this respect. For example, strictly speaking one shouldn’t calculate a mean of a sentiment scale like a Likert scale, since a mean assumes equal distances between the values and an ordinal scale only guarantees their order. However, this is often done anyway since it does provide a useful measurement, albeit less stringent than some would like. Read this excellent article about ordinal scales if you are curious to find out more.
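To make the difference concrete, here is a minimal sketch in Python. The seven answers are made up for illustration, and the variable names are my own. It summarises the same Likert item three ways: with a mean (which treats the steps between answers as equal), with a median (which only relies on their order) and with plain counts (which assume nothing).

```python
from collections import Counter
from statistics import mean, median

# Hypothetical answers to one Likert item
# (1 = Strongly disagree ... 5 = Strongly agree).
responses = [1, 1, 1, 4, 5, 5, 5]

likert_mean = mean(responses)      # assumes equal distance between answers
likert_median = median(responses)  # only uses the order of the answers
counts = Counter(responses)        # just counts how often each answer occurs

print(f"mean:   {likert_mean:.2f}")
print(f"median: {likert_median}")
print(f"counts: {dict(sorted(counts.items()))}")
```

With this polarised set of answers the mean lands close to “Neutral”, an answer nobody actually gave, while the counts show the split straight away. That is one reason it can be worth looking at the distribution alongside any average.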
An example from the interval scale is when people make the mistake of saying “it’s twice as warm as yesterday” if the temperature goes from 10°C to 20°C. The interval scale doesn’t have a true zero so multiplication, division and ratios are not appropriate.
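The temperature example can be checked with a few lines of Python. This is only a sketch: the helper name `celsius_to_kelvin` is my own, and it uses the standard offset of 273.15 between Celsius and Kelvin.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15

yesterday_c, today_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = today_c - yesterday_c

# Ratios are not: this "twice as warm" depends on Celsius' arbitrary zero.
naive_ratio = today_c / yesterday_c

# On a scale with a true zero, the ratio is far more modest.
kelvin_ratio = celsius_to_kelvin(today_c) / celsius_to_kelvin(yesterday_c)

print(f"difference:   {difference} degrees")
print(f"naive ratio:  {naive_ratio}")
print(f"Kelvin ratio: {kelvin_ratio:.3f}")
```

The difference of 10 degrees holds up, but the ratio drops from 2 to roughly 1.035 once the zero point means something.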
RELIABILITY AND VALIDITY
To make sure we get a good result when gathering and measuring data, we need to consider reliability and validity.
Reliability is about consistency. A tool that gives you the same (or almost the same) result every time you measure the same thing is considered a measurement/tool with high reliability. A tape measure is a more reliable tool than eyeballing it. This gets trickier when you want to reliably measure things like stress, confidence, mood and productivity.
Validity is about accuracy: making sure you’re measuring what you intend to measure. A valid method truly measures the idea or construct in question. Using a scale for weight and a tape measure for height is obvious, but what about measuring stress, happiness and affinity?
Below are visualisations explaining reliability and validity.
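The same idea can be sketched in numbers. The simulation below is a simplification under assumptions of my own: the three “tools”, their error sizes and the true height are all made up. It treats reliability as the spread of repeated readings and uses a systematic offset as a stand-in for poor validity (a stretched tape measure is consistently off target).

```python
import random
from statistics import mean, stdev

TRUE_HEIGHT_CM = 180.0  # the value we are trying to measure (made up)

def simulate_readings(bias_cm, noise_cm, repeats=1000, seed=1):
    """Simulate repeated measurements of the same person with one tool."""
    rng = random.Random(seed)
    return [TRUE_HEIGHT_CM + bias_cm + rng.gauss(0, noise_cm)
            for _ in range(repeats)]

# Hypothetical tools: (systematic error, random error), both in cm.
tools = {
    "tape measure": (0.0, 0.2),            # consistent and on target
    "stretched tape measure": (5.0, 0.2),  # consistent but off target
    "eyeballing it": (0.0, 5.0),           # on target on average, inconsistent
}

summary = {}
for name, (bias_cm, noise_cm) in tools.items():
    readings = simulate_readings(bias_cm, noise_cm)
    summary[name] = {
        "spread": stdev(readings),                   # small = reliable
        "offset": mean(readings) - TRUE_HEIGHT_CM,   # small = on target
    }
    print(f"{name}: spread {summary[name]['spread']:.2f} cm, "
          f"offset {summary[name]['offset']:+.2f} cm")
```

Note that real validity questions, like whether a questionnaire measures stress at all, can’t be reduced to a single offset. The sketch only illustrates that a tool can be very consistent and still consistently wrong.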
This is why scientists use tests that have been proven previously and don’t just make it up. They don’t want to “test the test” and the thing they want to measure at the same time.
One way to think about it is that someone else has hooked up people to sensors and measured cortisol levels in people while at the same time making them answer questionnaires about stress. They have figured out that e.g. three specific questions are really good predictors for stress. Those three questions are much more likely to be a better measurement of stress than your “made up” questions.
The term ecological validity refers to making sure that what you’re measuring isn’t a product of the lab or, in our case, the interview or prototype session.
This speaks to the advantage of testing ‘in the wild’ rather than in focus groups. There are many things we can do to increase ecological validity. For example, resisting the urge to explain the service, tool or design we are testing simulates a more realistic first encounter with it.
Using more advanced prototypes that users can live with over time and that we can remotely monitor (and do follow-up interviews about) does even more to assure ecological validity. The question then becomes one of cost. Is such a high-investment approach justified by the data and insight you are likely to get from it?
POPULATION AND SAMPLE
Normally you can’t test everyone, a.k.a. the population (e.g. all UK citizens or all Volvo drivers), so you test a few, a.k.a. a sample. The goal is to generalise the findings from the sample to the whole population.
Large samples tend to be better since they balance out outliers and extremes that might skew the data in a small sample.
It’s very important to make sure that the sample is representative of the population so that you can confidently generalise your findings and say something about the population based on your findings in the sample. Random sampling is usually best but it is hard to do practically.
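Here is a small simulation of why sample size matters, as a sketch under made-up assumptions: the population of daily step counts is invented, including a handful of extreme walkers, and the function name is my own. It draws many random samples of a given size and looks at how much the sample averages jump around.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: daily steps for 10,000 people,
# of which 100 are extreme walkers.
population = (
    [rng.gauss(6000, 2000) for _ in range(9_900)]
    + [rng.gauss(25000, 3000) for _ in range(100)]
)

def spread_of_sample_means(sample_size, trials=200):
    """Draw many random samples and measure how much their means vary."""
    means = [mean(rng.sample(population, sample_size)) for _ in range(trials)]
    return stdev(means)

small_sample_spread = spread_of_sample_means(10)
large_sample_spread = spread_of_sample_means(1000)

print(f"population mean:           {mean(population):.0f} steps")
print(f"spread of means (n=10):    {small_sample_spread:.0f} steps")
print(f"spread of means (n=1000):  {large_sample_spread:.0f} steps")
```

The averages from the small samples vary far more from one draw to the next, because a single extreme walker can drag a sample of ten a long way. `rng.sample` picks people at random without replacement, which is the easy part in code and the hard part in real recruitment.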
Remember, you don’t always want to exclude extremes and outliers. Especially early in the design process, outliers and extremes are often the source of great insight since they display behaviours and needs present in the “average user” in a more obvious way. These extremes also often spark new solutions through their workarounds and personal strategies.
DEPENDENT AND INDEPENDENT VARIABLE
Experiments are used to determine causal relationships e.g. does noise affect math performance? Or does our app affect the amount of walking people do?
The dependent variable is the variable you want to measure (e.g. amount of walking). The independent variable is the variable that you suspect affects the dependent variable (in this case, our app or a feature in our app).
Best practice is to start with a hypothesis and not just to look at the data after you have collected it. It’s easy to fall into the trap of post-rationalisation, interpreting the data favourably, regardless of the result.
Confounding variables are variables other than the independent variable that may have affected the results. This could be age, time of day, education level, gender, weather etc. This is why good sampling and random assignment to the test group and the control group are so important. Potential confounding variables are often measured as well so that when it comes to analysing the data, their influence can be assessed and corrected for.
The mere fact that someone is being tested/monitored can be a confounding variable and skew the data.
To set up an experiment, the sample is divided into a test group which is exposed to the independent variable and a control group which is measured but not exposed to the independent variable. We could e.g. recruit a group of people and randomly divide them into two groups.
One group gets our new app that we think will increase the amount of walking. The control group also gets an app that tracks walking, but there is no interface; the app simply tracks in the background.
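The random split itself could be done along these lines. This is a minimal sketch: the participant IDs are placeholders and `assign_groups` is a helper name of my own, not part of any particular tool.

```python
import random

def assign_groups(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical recruits.
participants = [f"participant_{i:02d}" for i in range(1, 21)]

# A fixed seed makes the assignment reproducible for later auditing.
test_group, control_group = assign_groups(participants, seed=7)

print("test group (full app):         ", sorted(test_group))
print("control group (tracking only): ", sorted(control_group))
```

Letting chance decide who lands in which group means that potential confounders like age or fitness should, on average, be spread evenly across both groups rather than piling up in one of them.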
After a few weeks we can compare how far the two groups have walked and if our app made a difference. But now we’re getting into analysing data and that’s the topic for the next article.
Ok, that’s some basics when it comes to collecting data. I’d love suggestions for other terms/concepts to add to this article, so please speak up if you think of any.
Part 1: Collecting Data (this article)
Part 2: Analysing Data
Part 3: Visualising Data