Faces of data science 3

Casey Doyle
Data Science at Microsoft
11 min read · Mar 31, 2020

For the third in our “Faces of Data Science” series, I’ve interviewed three colleagues working in data science: Rose Nyameke, Kirk Li, and Yael Brumer. All are currently members of the Customer Growth Analytics team in Microsoft’s Cloud+AI division.

Rose Nyameke

Data & Applied Scientist II

What’s your educational background, Rose? I have a bachelor’s in neurobiology from Harvard University and a master’s in analytics from North Carolina State University. When I came to this country to further my education I intended to go into medicine, and because Harvard doesn’t have an undergraduate pre-med program, I had to choose another that would fulfill pre-med requirements. I decided on neurobiology because it seemed super cool when I was exploring majors. For my graduate work, it was Microsoft that influenced my decision.

Interesting. How so? I interned at Microsoft in 2013 in marketing during the summer between my junior and senior years. That came about because the prior summer I’d interned as part of a study abroad program at a pharmaceutical company in Switzerland, and I was accidentally placed in the marketing department instead of in the research internship that was more aligned with my background. But I rolled with it and really enjoyed it, though I wasn’t sure if what I enjoyed was the healthcare aspect or the marketing aspect. To find out, I did another marketing internship outside of healthcare, and that was at Microsoft in the Server and Tools group. My mentor was trained as a data scientist, and when I explained to him that I wanted a more scientific basis for the marketing recommendations I was making, he told me about data science. I started thinking about data science for graduate school, and when considering programs, North Carolina State appealed to me because of its reputation, its student placement record, and the fact that it compresses what is essentially a two-year graduate program into one year.

In addition to your mentor, were there any other factors that led you to consider data science as a career? Yes. My favorite part of neurobiology was looking into human behavior and memory, and so when I first started exploring data science, it was another way for me to understand human behavior. Except instead of trying to raise zebrafish or slice into brains, I could approach it from a data perspective in trying to figure out what are people doing. The underlying motivations are the same between neurobiology and data science.

What do you like most about your work in data science? I like that the possibilities are endless regarding the questions that can be explored — it’s really just a matter of figuring out whether the data exists or what it takes to generate that data. And it feels very objective, which I like. For a lot of other things I’ve been interested in, much of it has had to do with having an instinct or just being able to think and speak the language naturally. With data science, I feel that even though there are some elements such as having good business intuition, at the end of the day it’s writing code and it’s math. If someone asks me why I enjoy it, I can objectively say that’s why.

What has surprised you the most about your work in data science? I’d say the frequency with which the answer is that nothing is there. Let me explain. When I was in graduate school, we laughed in our time series forecasting class when we had super low error rates and our instructor would remind us we were working with manufactured data and not with data in the real world. In reality things are often not as predictable because everything is blurry; sometimes the data is erratic, and sometimes the answer to the question being asked is that the effect is not significant, even if you expect it to be. Sometimes the result is inconclusive, but sometimes, in looking for the factors that lead to X, Y, and Z, the conclusion is that there are none: the factors you’re exploring either aren’t there or are not significant.

How do you continue to learn? My primary learning comes from my co-workers. I like to talk to people about what they’re working on and I ask myself, could I do that? And if the answer is no, I ask myself, why can I not do that? In the beginning when I joined the team and the answer was no, it was because I didn’t yet know a certain tool or it was because I didn’t have a particular context. I would then take the relevant internal course for knowledge or context. I also use Reddit, where I primarily spend time in the comments of a data visualization subreddit. I get to see how people think about data and interpret it, and what they find confusing or misleading. I also look for online courses that are relevant or will help me sharpen my skills. I’ve used Coursera, where I particularly liked the content from instructors at Johns Hopkins University, as well as Pluralsight and DataCamp.

How would you advise someone to get started in data science? I think it depends on what your background is. I had to ease into it a bit because by the time I realized what data science was, it was my senior year of college. Luckily, I had already taken a programming course, but not statistics, so I took it my senior year. I worked for two years between college and grad school, and during that time I started to look for more responsibilities that would give me a feel for what I could do in data science such as learning from the DevOps team and sharpening my data, querying, and architecting skills, and also taking the Coursera courses I mentioned earlier. All in all, I would say to make sure you really want to go into data science by getting your hands dirty: take advantage of free resources, or look at your current work to see if you can incorporate elements of data science into it. I also think it’s important to think about what kind of data scientist you want to be, because I think that shapes what you learn and what you think about exploring. So, if you want to be a model builder you have a different path from doing something that’s more stats heavy. I think it’s important to explore job postings and look at the types of roles out there and what they require. Figure out which one appeals to you the most and tailor your education that way.

Anything else you’d like to add? One thing I’d add is that there is no single definition of a data scientist, and so it’s OK to want to be a specific sort of data scientist. I think a lot of people get caught up in things like the glamour and shininess of building something like a very complicated neural network without understanding some of the questions involved, such as what does it mean if I’m tuning this parameter, or if I see this type of result? Statistics can really help with that, as can matrix algebra and other concepts we don’t often confront; they’re useful for troubleshooting when things don’t make sense. So I’d say the key things for anyone who wants to be a data scientist are having a solid foundation and understanding that it’s OK not to chase the shiniest part of data science.

Kirk Li

Senior Data Scientist

What’s your educational background, Kirk? I studied applied math and statistics at Stony Brook University in New York for my bachelor’s and master’s degrees, and statistics at the University of Washington for my Ph.D. I also had a second major in economics for my undergraduate studies and earned a graduate certificate in computational finance during my doctoral studies. Most of my education has focused on quantitative analysis. I’ve had a lot of interest in math since high school and did pretty well in it, but I also wanted my knowledge to be applicable to real-world scenarios. I like using numbers and doing hypothesis testing to justify my understanding, including calculating the significance of one statement versus another. The economics second major was a good complement to statistics, because economics is also very quantitative, but it also focuses on real-world problems. Computational finance was also very quantitative and involved a lot of programming.

What led you to work in data science? For me it was a natural direction given my educational background. Statistics is the closest field of study to data science and everything I learned during my Ph.D. was very applicable to data science problems.

What do you like most about your work in data science? I like everything, but I really enjoy coming up with conclusions and decisions based on data-driven approaches. I trust data more than I trust the information I collect from what people are saying (laughs). If you handle data correctly, it doesn’t create much bias and provides a foundation for decisions. I also like to see my ideas realized in actual business scenarios. In this way data science is one of the areas in which people can translate their creativity and channel their passion into influencing decision making around actual products.

What has surprised you the most about your work in data science? How other backgrounds also contribute to the data science world. People come from different educational backgrounds. They can also do very good work in data science and provide unique contributions and perspectives. For example, I see how data visualization or design helps make data science more intelligent and more interpretable, increasing impact and the ability to explain or communicate the ideas behind the data. It’s surprised and excited me to see how data science can merge and blend with other backgrounds to make everyone more successful.

How do you continue to learn? Within my team we have a weekly continuous learning study group. We study research papers and ideas in academia and see how we can adapt those ideas to be more applicable to our business. We participate in online courses like Coursera and DataCamp to study common practices in data science. We also attend academic conferences to stay updated with the latest developments in our research areas so that the models and algorithms we’re developing are world class.

How would you advise someone who wanted to get started in data science? What steps do you think they should take? I think it’s useful to have education in a related field. Many schools and programs are offering learning materials on data science and there are also good ones online. Course providers like Coursera provide an entry-level data science curriculum. Practice is also very important. For those with no educational background in data science, participating in Kaggle competitions, joining study groups, and keeping track of the latest developments are good ways to learn from a DIY perspective. Data science is a fast-changing area for everyone, and knowledge can easily get outdated, so you always need to refresh yourself and re-invest your time to learn the latest knowledge, technology, models, and algorithms. But one thing I also say to my students when teaching is that you don’t have to be very sophisticated in math or programming to be a good data scientist. Data science is a broad area, and so you can make your interest your expertise. You don’t have to be the best programmer or the best in math. If you’re a problem solver you can make yourself successful in data science. For example, if you can make sense of data and explain it so that more people can understand the problem and the solution, that’s a great achievement. The same goes if you are a UI designer or writer. I have seen people who develop awesome UI apps for mobile or desktop applications that explain everything very well, and those who write articles to explain concepts and help readers avoid data science fallacies. That’s all very useful too.

Yael Brumer

Senior Data & Applied Scientist

What’s your educational background, Yael? I have bachelor’s and master’s degrees in software engineering from Ben-Gurion University in Israel. My graduate degree includes a focus in machine learning. I have always loved coding and math, and I love the challenge of figuring out and solving problems. In Israel, where I grew up, it’s very common for people to pursue these areas of study. They’re very popular, and there are many jobs in these areas. And Israel is a start-up nation, and so there’s lots of software engineering everywhere, even in school. I started my master’s degree while working at Microsoft with two small children at home, but I was excited about being a data scientist and so I told myself, despite all the difficulties, I’m going to do this. I define myself as a person who always tries to stretch herself as much as possible. Sometimes it’s hard, and sometimes I say, OK, that’s too much for me, but I’m always trying to reach the upper limit as much as possible.

What made you want to work in data science? I didn’t know at first that I wanted to be a data scientist. But I always knew I loved working with data and customers. I worked as an intern at Intel in business analytics and it helped me understand that was the direction I wanted to go. This was six years ago and the data science discipline didn’t exist yet, and so I thought being a program manager would be the best fit for me because it involves working with customers and data. But in Israel, there was a requirement to be a software engineer before being a program manager, and so I started with that and then began to steer myself toward roles involving more data and customers. When Microsoft CEO Satya Nadella announced the data discipline in 2014, I had the opportunity I’d been preparing for.

What has surprised you the most about your work in data science? The shift over the last few years, across many disciplines, toward making decisions based on data. When I was getting started, it was more common to add features without looking at customer data. I remember working on a project where the data indicated customer drop-off at a certain point, and so I started to investigate. It turned out there was a problem in the installation process. I convinced the team to make sure that customers were using the product as we expected and if not, to approach them directly to see if we could help them and learn about their experience. It was a mindset shift for everyone at the time. Today I think it’s much clearer to everyone that we should base all our decisions on data.

How do you continue to learn in your role in data science? I’m always trying to stay updated on the most recent papers in the field, and I take online courses, such as those on Coursera. There are also conferences, and I submit papers to them. For me, the papers provide a lot to learn from. What are the advanced technologies out there and what is the state of the art for all the algorithms we’re working with? Every year I try to pick the top four or five papers to focus on. I also keep up with blogs and online resources, particularly Medium and Stack Overflow.

How would you advise someone who wanted to get started in data science? Everyone’s case differs, but I think it was beneficial for me to start as a software engineer. It gave me a basic understanding of how to design a solution and write code, which is still helping me every day. It has helped me push things much faster, without having to wait for someone to put my code into production. It’s also helped me talk with software engineers about the problems I’m facing as a data scientist. Beyond my own experience, I’d say online courses are helpful. Because online courses can sometimes be overwhelming, it helps to be focused on what you’re looking for and to persevere over the long term, because there is so much to learn. If you are already working in a company, try to talk with people who are working as data scientists. Ask them for ideas about where to focus. I think the combination of doing online courses in areas that are interesting to you along with a couple of years working as a software engineer, even if it’s not your passion, is very important for the long-term journey.

Casey Doyle
Data Science at Microsoft

Principal Data Scientist of a data storytelling program fostering thought leadership in information design and data visualization inside and outside Microsoft.