Why Data Science Needs Diversity

by Emily Glassberg Sands

I’ve always liked math, more for the problems it could solve than for the theory itself. As a child, I loved word problems: shortest paths, miles to the gallon, price with tax. The more applied the question, the more interested I was in solving for the answer. By early college, the connection between the math I was learning and the types of education, labor, and policy issues I cared about felt tenuous. I considered majoring in a less quantitative field. I would not have been alone. While women account for more than 75% of U.S. undergraduates in health and education, they are a minority of math and statistics majors, and make up just 20% of computer science and engineering majors.

I was lucky to stumble on inspiring mentors in college, women who showed me how math could be applied directly and powerfully to social issues. Ceci Rouse, a labor and education economist who later served on Obama’s Council of Economic Advisers, was at the time studying the causal effect of student debt on employment outcomes. Chris Paxson, a health economist who chaired Princeton’s public policy school and is now president of Brown, was researching the impact of Hurricane Katrina on the mental and physical health of low-income parents. Their work, and the work of applied economists like them, motivated me to stay in the game. I’m glad I did. Economics has empowered me to ask and answer questions that are important to me, including how gender impacts success, what drives herd behavior, and why job referrals matter.

Today I’m an economist in tech or, I suppose, a “data scientist.” Data scientists are solving problems in education, labor, health, personal finance, and the environment with products that touch lives globally. Given the scope and difficulty of the problems we’re tackling, diversity will be key to identifying and answering questions that might have eluded us before. Yet only 16% of technical roles at major tech companies are held by women.

The need for diversity is particularly strong in data science. The core of the profession is in identifying, framing, and answering key questions about why humans (or some other actors) do what they do. It’s about drilling down into major product, business, and societal challenges to come up with solutions using math, theory, and your choice of applied methods — experimentation, causal inference, machine learning, you name it. Empirical and computational skills are tools in the data scientist’s toolbox: necessary to be good, but not sufficient to be great. The heart of the discipline is analytical creativity and, as the literature consistently reminds us, diverse teams are more creative. Our own diversity can also facilitate empathy for our diverse users.

Take the case of Coursera. Our vision is ambitious: to be a place where anyone, anywhere can transform their life through access to the world’s best learning experience. The questions we face day-to-day are correspondingly big and hairy. What learning experience does each potential learner need to reach their goals? How can we help them commit to investing in their education? How can we optimize their experience both for enjoyment and for quality of learning? We have fascinating data that can teach us how people learn second by second, what motivates and demotivates them, and which learning pathways facilitate particular personal, educational, or career outcomes for which individuals. These data can inform our strategy, drive our product and business, and help us reach our ambitious vision. That requires a creative and diverse team bringing unique skills and perspectives to the table. It also requires empathy for our learners, who are themselves diverse.

The tech industry is starting to see the value of diversity. Companies are filling out their data science teams with talent from a broadening array of fields, including the social sciences. Statisticians, operations researchers, and economists are joining the ranks alongside the more traditional computer scientists and mathematicians.

Further up the talent pipeline, the proliferation of interesting and impactful data science roles in a range of tech sectors has the potential to attract increasingly diverse talent to technical studies. Imagine a world where anyone dreaming of making an impact on education, health, or other societal challenges sees training in math, stats, econ, or computer science as a clear and direct path to that impact. For me, inspiring academic mentors showed the way. I hope the robust and engaging data science market can illuminate the path for many more.

To the current tech minorities: join us. Solving the global education challenges of today demands socially minded technical talent of both genders and all races. We need you. And it’s never too late. At Coursera, we aspire to be the place where anyone, anywhere can get the skills they need to ask and answer the questions that matter to them, and to the world.

Originally published at building.coursera.org on March 11, 2016.