A Quick Guide to Statistics for Empirical UX Research
If you’re not sure where to start then this guide can help you get your bearings.
If you’re anything like me, you’ve taken statistics courses at university, but they were too general and didn’t really prepare you for doing quantitative research and data analysis. I found myself super confused by the names of all the different tests and how to use them, and while I had a basic understanding of statistics, honestly nothing made sense and there were so many unfamiliar concepts. I was really good at qualitative analysis, but my quantitative analysis skills were falling short.
So, I decided to change that. Using a mix of this amazing (and free!) course from TU Delft and asking ChatGPT for a lot of help, I managed to really refine my knowledge of statistics for empirical research, especially human-centered and user-experience research.
What are the topics you need to know?
First things first, I asked ChatGPT for a comprehensive list of every topic and test I should be aware of, and also did some research of my own using other sources. This is what I came up with:
Basic/Necessary:
1. Descriptive Statistics — Summarise & describe data
Mean, Mode, Median, Range, Standard Deviation, Variance
2. Data Visualisation — Visualise data
Bar charts, Histograms, Box plots, Scatter plots, Distributions and Tests for Normality
3. Inferential Statistics — Inferences and conclusions about populations given a sample
Hypothesis Testing, p-values, Confidence Intervals, Standard Error, Type I & Type II Errors
4. T-Tests — Comparing two groups or conditions for significant differences in a dependent variable given a change in an independent variable
Independent t-tests & Paired t-tests, Degrees of Freedom
5. Analysis of Variance (ANOVAs) — Comparing more than two groups or conditions for significant differences in a dependent variable given a change in an independent variable
One-Way ANOVAs, Omnibus ANOVAs, Factorial ANOVAs, Repeated Measures ANOVAs
6. Correlation Analysis — Capturing the strength and direction of the relationship between two continuous variables
Pearson Correlation Coefficient, Spearman Correlation Coefficient
7. Chi-Squared Test of Independence — Checking for significant associations between categorical variables
8. Reliability & Validity — Verifying the consistency, stability, appropriateness and accuracy of the results obtained
Cronbach’s Alpha for Survey Questions, Inter-Rater Reliability for Interview Coding, Split-Half Method & Spearman-Brown Prophecy Coefficient
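To make a few of the basic topics concrete, here is a minimal sketch in Python using only the standard library. Python itself is my assumption (nothing above depends on it), and the task times and survey scores are made-up numbers for illustration, not real study data. It covers descriptive statistics, an independent-samples t statistic (Welch's version), Pearson's correlation, and Cronbach's alpha.

```python
import math
import statistics as st

# Hypothetical task-completion times (seconds) for two interface designs.
design_a = [34.0, 41.5, 38.2, 45.1, 39.9, 36.4, 42.8, 40.3]
design_b = [29.5, 33.1, 36.0, 31.2, 35.4, 30.8, 34.6, 32.9]

# Hypothetical survey: 3 Likert items, each answered by the same 5 people.
survey_items = [[4, 5, 3, 4, 2], [4, 4, 3, 5, 2], [5, 5, 2, 4, 3]]


def describe(sample):
    """Descriptive statistics: centre and spread of one sample."""
    return {
        "mean": st.mean(sample),
        "median": st.median(sample),
        "range": max(sample) - min(sample),
        "variance": st.variance(sample),  # sample variance (divides by n - 1)
        "sd": st.stdev(sample),           # sample standard deviation
    }


def welch_t(x, y):
    """Independent-samples t statistic, Welch's version (unequal variances).

    Returns (t, degrees_of_freedom). Turning t into a p-value needs the
    t distribution, which the standard library does not provide.
    """
    vx, vy = st.variance(x) / len(x), st.variance(y) / len(y)
    t = (st.mean(x) - st.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def pearson_r(x, y):
    """Pearson correlation: strength and direction of a linear relationship."""
    mx, my = st.mean(x), st.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per survey item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance = sum(st.variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / st.variance(totals))


print(describe(design_a))
t, df = welch_t(design_a, design_b)
print(f"t = {t:.2f}, df = {df:.1f}")
print(f"r = {pearson_r(design_a, design_b):.2f}")
print(f"alpha = {cronbach_alpha(survey_items):.2f}")
```

In practice you would probably reach for a statistics library such as SciPy (for example `scipy.stats.ttest_ind` or `scipy.stats.pearsonr`), which also returns the p-value. Writing the formulas out by hand is mainly useful for seeing what each test actually computes.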
Advanced/Extra:
1. Non-Parametric Tests — Alternatives to parametric tests (e.g. t-tests, ANOVAs, etc.) when the assumptions of parametric tests are violated
Mann-Whitney U Test, Wilcoxon Signed-Rank Test, Kruskal-Wallis Test
2. Effect Size — Understanding the magnitude of the relationship/difference between variables
Cohen’s d, Eta-Squared, Partial Eta-Squared
3. Power Analysis — Working out the sample size needed to detect a given effect size with the desired statistical power
Sample Size, Statistical Power
4. Multiple Comparisons Correction — Correcting for false positives when running multiple tests on the same data
Bonferroni Correction, Tukey’s HSD Test, False Discovery Rate (FDR)
5. Resampling Methods — Overcoming small sample sizes or non-standard distributions
Bootstrapping, Permutations
6. Data Cleaning & Preparation — Dealing with outliers and missing data
Outliers and Missing Data, Data Transformation
7. Regression Analysis — Modelling how a dependent variable changes with one or more independent variables
Simple Linear Regression, Multiple Regression
8. Factor Analysis — Understanding the underlying structure of a set of variables and latent factors
Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA)
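Some of the advanced topics are also easier to grasp once you see them written out. The sketch below is again standard-library Python with made-up data, and the function names are my own for illustration. It shows Cohen's d, a Bonferroni correction, a rough sample-size estimate for a two-group comparison (using a normal approximation, so a dedicated power-analysis tool may give a slightly larger number), a permutation test, and a percentile bootstrap confidence interval.

```python
import math
import random
import statistics as st


def cohens_d(x, y):
    """Effect size: mean difference in units of the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * st.variance(x) + (ny - 1) * st.variance(y)) / (nx + ny - 2)
    return (st.mean(x) - st.mean(y)) / math.sqrt(pooled_var)


def bonferroni(p_values, alpha=0.05):
    """Multiple comparisons: test each p-value against alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def sample_size_per_group(d, alpha=0.05, power=0.8):
    """Power analysis: participants per group for a two-sided, two-group test.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2.
    """
    z = st.NormalDist()
    n = 2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / d) ** 2
    return math.ceil(n)


def permutation_p(x, y, n_perm=5000, seed=0):
    """Resampling: two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(st.mean(x) - st.mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(st.mean(pooled[:len(x)]) - st.mean(pooled[len(x):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # +1 keeps the estimate above zero


def bootstrap_ci(sample, n_boot=5000, level=0.95, seed=0):
    """Resampling: percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        st.mean(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lower = means[int((1 - level) / 2 * n_boot)]
    upper = means[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Hypothetical task-completion times (seconds) for two interface designs.
design_a = [34.0, 41.5, 38.2, 45.1, 39.9, 36.4, 42.8, 40.3]
design_b = [29.5, 33.1, 36.0, 31.2, 35.4, 30.8, 34.6, 32.9]

print(f"Cohen's d = {cohens_d(design_a, design_b):.2f}")
print("Bonferroni:", bonferroni([0.01, 0.02, 0.2]))
print("n per group for d = 0.5:", sample_size_per_group(0.5))
print(f"permutation p = {permutation_p(design_a, design_b):.4f}")
print("95% CI for mean of design A:", bootstrap_ci(design_a))
```

The fixed `seed` only makes the resampling repeatable between runs. More resamples (`n_perm`, `n_boot`) give steadier estimates at the cost of run time.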
A handy summary of everything you need
Below is a summary of everything I’ve learnt after months of research. There’s a flowchart of the whole process from start to finish (keep in mind that a lot of the steps are optional, but it’s nice to know about them), and a table of the different tests involved in hypothesis testing and when to use each of them. I hope you find these useful!
Where I Fit In
My PhD project looks at leveraging tools and techniques from design fields to make the design of AI systems accessible and inclusive. I’m working towards creating a participatory process, and a toolkit to support it, to systematically involve people throughout the AI life-cycle — with a focus on respecting stakeholders’ values.
You can check out the official page for my project on the Imperial College London website. You can also check out this other article I wrote explaining the details of my PhD project.
I’ve set up this Medium account to publish interesting findings as I work on my PhD project to hopefully spread news and information about AI systems in a way that makes it understandable to anyone and everyone.