Fairness and Bias, Notes from Industry

Race and Ethnicity in Health Data Science

Why it’s important and how we should approach it

Lathan Liou
Towards Data Science
9 min read · Aug 23, 2021

Photo by Jon Tyson on Unsplash

It’s undeniable that considering race or ethnicity (abbreviated here as R/E and treated as a singular noun, although the statements in this article refer to race and ethnicity collectively) is important when studying healthcare outcomes quantitatively. Ask any respectable statistician, epidemiologist, or data scientist and they’ll tell you at least this much! While the importance of R/E is widely understood, I think we can always strive to build a stronger fundamental vocabulary for why it’s important. This article aims to summarize conclusions from the (fairly mature) literature on R/E in model-building. Specifically, I want to briefly cover R/E in explanatory and predictive contexts (for more on the difference, check out my previous article on the topic!).

Below is an outline of this article. I’ve ordered the topics by my own progression in understanding R/E: (1) what this variable represents, (2) how we record its measurement, (3) why it is important, and (4) how we draw statistical conclusions about it. Please feel free to skip around to whichever topic interests you most!

Written by Lathan Liou

Data Scientist at Merck. Tidyverse enthusiast and a neRd.
