Why I steered my career away from Theoretical Physics and Towards Data Science for Good

Transitioning away from science for curiosity’s sake to working with HIV and global development, the data science way.

Amir Emami
Published in Palindrome Data
Mar 30, 2020

Happiness, wealth, knowledge, status? We’re always optimising for something, usually prescribed to us by the society we’re raised in. We work and work, trying to reach some externally set benchmark so that we can deem ourselves successful or, failing this, failures.

What if we took control and optimised for goals that align with our own personal values? How would that shape our day-to-day lives and the world we live in?

This is my early-career journey from Science-for-Curiosity to Data-Science-for-Good that led me to work with Palindrome Data in summer 2019.

“When I’m older I want to be an ice cream van man!”

…is what the 3-year-old version of me would exclaim to anyone who asked the classic question.

I was optimising for pleasure, it seems, given that concepts like wealth or knowledge were beyond my grasp. As I grew older and thoughts about ice cream started to take up less than 50% of my brain-space, “inventor” became the dream. This dream was lost to the Iranian school system, which rewarded me for getting the right answers to predetermined questions, eventually replacing my ideal of “invention” with “discovery”.

The caricature of Da Vinci had now been replaced with that of Einstein, and optimising for knowledge became the goal. Over the following few years, the dream of becoming a Theoretical Physicist was born.

A transition of idols.

In Pursuit of the Truth

Seven years after a move to Scotland, I began studying maths and physics at the University of St Andrews. As I was exposed to new topics and problem areas outside my field, however, my ideas about what I should dedicate my life to started to lose footing.

Enter…

“Effective altruism is about answering one simple question: how can we use our resources to help others the most?”

While working as a videographer on a few interviews that later made their way onto this piece on effective charities, I was hit with concepts that crucially exposed the mismatch between my values and my career plans. Effective Altruism brought about another re-evaluation, but this time consciously optimising for impact. An “impactful” career that utilises my BSc-born mathematical and analytical skills became the focus — but what does “impactful” mean, anyway?

The Three Labours of Impact Maximisation

After months of research and reaching out to professionals already working in different for-good areas, it became clear to me that modern data were under-utilised in efforts relating to global development by charities, NGOs, and government agencies. Given how closely data analytics aligns with my personal strengths and interests, it began to draw me in.

The effective use of data for global development presented itself as a worthwhile cause when assessed against the three-component framework developed by 80,000 Hours (a career guidance resource built on EA principles):

  1. Scale — What is the magnitude of the positive effect once the problem is addressed?
  2. Solvability — How tractable is solving the problem?
  3. Neglectedness — How many resources are already dedicated to the problem?

It was at this point in my research that I came across Palindrome Data — a South African development analytics partnership working on the exact problem that I was beginning to understand. In a post on the UN’s Global Partnership for Sustainable Development Data website, one of Palindrome’s founders outlined how and why the company came to be, and I was hooked.

Eager to learn and to gain first-hand experience with the concepts I had been wrestling with for months, I emailed Palindrome looking to get involved (side-note: Steve Jobs told me to do it). After interviews and a background check, I was welcomed by the team and given the task of performing exploratory analysis of South Africa’s national mother-to-child HIV transmission data.

My First Data-for-Good Engagement

At Palindrome, not only was I given the responsibility of working directly on a national health dataset, I was also given the agency to explore as a creative data researcher.

As I began my first encounter with real-world development data (as opposed to the clean laboratory data of academia and university courses), it became clear that possible findings wouldn’t be restricted by modelling or statistical methods, but by the quality of the data itself. Although there were many positive things to be said about the mother-to-child HIV transmission dataset (for one, its size was very close to the full population it was trying to capture, missing only a few entries), there was one major systematic flaw that could not be overlooked. Since each baby was not identified by a unique ID upon testing (but simply by name and date of birth), there was no robust way of tracking the babies’ test results over time.

Thankfully, work had been done on this dataset by Boston University to produce approximate unique identifiers for the babies, enabling longitudinal analysis. After attaching these identifiers and labelling each baby with the transmission route through which it had been infected, the team at Palindrome delegated to me the task of exploring how different factors affected which transmission route HIV took to infect each child.
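To make the idea of an approximate identifier concrete, here is a minimal Python sketch. It is not the method Boston University used, which this post does not describe. It assumes the simplest possible approach: exact matching on a normalised name plus date of birth. The function names, field names, and records are all hypothetical and made up for illustration.

```python
import hashlib
import unicodedata
from collections import defaultdict


def normalise_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    letters = "".join(ch if ch.isalpha() else " " for ch in no_accents.lower())
    return " ".join(letters.split())


def approximate_id(name: str, date_of_birth: str) -> str:
    """Hypothetical linkage key: a short hash of normalised name + date of birth."""
    key = f"{normalise_name(name)}|{date_of_birth}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def group_tests_by_child(records):
    """Group test records under their approximate ID, ordered by test date."""
    histories = defaultdict(list)
    for record in records:
        child = approximate_id(record["name"], record["dob"])
        histories[child].append((record["test_date"], record["result"]))
    # ISO-formatted dates (YYYY-MM-DD) sort correctly as plain strings.
    return {child: sorted(tests) for child, tests in histories.items()}


# Made-up records for illustration only; not drawn from any real dataset.
records = [
    {"name": "Alpha  Example", "dob": "2018-03-02", "test_date": "2018-05-10", "result": "positive"},
    {"name": "alpha example", "dob": "2018-03-02", "test_date": "2018-03-05", "result": "negative"},
    {"name": "Beta Example", "dob": "2018-03-02", "test_date": "2018-03-06", "result": "negative"},
]
histories = group_tests_by_child(records)
```

Exact matching like this would miss misspelt names and would merge two different babies who share a name and birthday, which is why real record linkage tends to rely on fuzzy or probabilistic matching instead.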

The analysis suggested that the baby’s gender, whether it lived in a rural or urban area, and even which district of South Africa it lived in all correlated with the route through which HIV was more likely to be passed on to the baby. These were insights that, upon further exploration, could inform future policy on care services so that they target the spread of the disease more effectively.
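For readers curious what checking such a correlation can look like, here is a minimal Python sketch of one standard approach: a Pearson chi-square test of independence between a categorical factor and the transmission route. It is an assumption that a test like this fits the data; it is not a record of the analysis actually run at Palindrome. The route labels and counts below are invented placeholders.

```python
from collections import Counter


def chi_square_statistic(pairs):
    """Pearson chi-square statistic and degrees of freedom for (factor, outcome) pairs."""
    pairs = list(pairs)
    observed = Counter(pairs)
    row_totals = Counter(factor for factor, _ in pairs)
    col_totals = Counter(outcome for _, outcome in pairs)
    total = len(pairs)

    statistic = 0.0
    for factor, row_total in row_totals.items():
        for outcome, col_total in col_totals.items():
            # Count expected in this cell if factor and outcome were independent.
            expected = row_total * col_total / total
            statistic += (observed[(factor, outcome)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Invented counts with placeholder route labels; not real results.
sample = (
    [("rural", "route_a")] * 30
    + [("rural", "route_b")] * 10
    + [("urban", "route_a")] * 20
    + [("urban", "route_b")] * 40
)
statistic, dof = chi_square_statistic(sample)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to significance at the 5% level. Even then, such a test only flags an association worth exploring further, not a cause.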

Optimising for Impact

The fact that there was (and is) still so much left to uncover from such an important dataset is itself testimony to the urgent need for those of us who are inclined towards mathematics and the sciences to be working on these problems.

We don’t have to float into the private sector on autopilot or carry on into an obscure sub-field within our academic disciplines never to be seen again, optimising for things we never chose.

Partnerships like Palindrome Data are needed to transfer data knowledge across sectors to utilise it for good, and our careers are arguably our biggest contributions to the world, so let’s optimise them for impact.

Amir Emami
Palindrome Data

Master’s graduate focused on utilising data science for public good.