My Journey into Data Science

Vivian Peng
4 min read · Dec 17, 2019

The last time I blogged about data science was 5 years ago, when I wrote a post for Gilt Tech on how I got into data science.

If you missed that story, here are the CliffsNotes:

It’s hard to pinpoint what exactly about data science captured my interest that day. I think it was this sense that we can take on really complex issues that we previously would have shied away from. Coming from a public health background, this felt like a game changer. In public health, we are constantly working with population data and trying to predict patterns to design interventions that either treat or prevent an event from happening. But I felt our tools were limited and we were siloed from other industries.

In data science, I found the tools to gather information and make connections that weren’t immediately intuitive or obvious. I liked being able to dive into an issue, look for connections, and build understanding.

Since that first Gilt workshop, Jared’s become my mentor. I’ve assisted him in training workshops for R and given talks at NY R, DC R, ODSC, and Strata Singapore.

My talks were mostly inspired by my work at Doctors Without Borders, advocating for access to healthcare for those most in need. For several years, I ran a campaign to get Pfizer and GSK to lower the price of the pneumonia vaccine. (Which they did!) Then I switched to field work, completing a mission in Kenya in 2017 and one in Myanmar in 2018. As the person in charge of creating the content and strategy for how our stories are told, I became really familiar with the media landscape. I found myself feeling frustrated that so many important data stories were either misinterpreted by the public or quickly buried because the information was not told in a way that could compete with headline news. So I gave talks about storytelling with data: how to tell compelling stories, and what ethical considerations to keep in mind when telling data stories.

This was the part of data science I fell in love with: how to interpret dense, complex information and visualize it. But I always felt a limit to what I could do with data visualizations, for two reasons:

1. My work is not reproducible. As you can see, my illustrations are hand-drawn, which means that I can’t create templates for visualizing certain types of data. As a data scientist, I believe reproducibility is really important and I would love to contribute some useful templates/tools to the open-source community.

2. My understanding of statistical modeling is limited to what I learned in one epidemiology course during my grad program. I know enough to be able to run models in R, but not enough to interpret the results and extract recommendations. With that limited understanding, I cannot effectively visualize the overall message. As someone who nerds out on both the data side and the art side of things, I want to bridge that flow of information.

With these reasons in mind, I recently enrolled in a data science immersive at General Assembly. I want to master my technical skills so that I can be the most effective communicator and create tools/templates that would be useful for others. In addition to that, I am motivated to be a technical expert who can help shape the impact of tech in the healthcare industry. I see Big Tech’s increasing reach into the health sector, and am cautious given their track record of mishandling data. As public health practitioners, we have to update our tools and actively inform how health data should be collected and shared.

It’s been one week of class and I feel like I’m back in my element — with one exception. This course is taught in Python! I feel like I’m cheating on R (sorry), but I know being able to program in both languages will make me a better data scientist at the end of the day.

Since more and more folks are becoming bilingual these days, here are a couple musings (more to come, maybe…):

Vivian Peng

Artist, activist, data scientist. Currently Lead Data Scientist at The Rockefeller Foundation. Formerly @ Mayor's of Los Angeles and Doctors Without Borders