How to become a data scientist after earning a PhD in the social sciences

Ceren Altincekic
Data Science at Microsoft
7 min read · Aug 23, 2022

So maybe you got a PhD in a social science field and decided you don’t want to teach or be bound by the constraints of academia. Now what? In this article, I discuss the data science career path for recent PhD graduates and the skills they develop during their studies, as well as the aspects of academic life they need to improve upon to become successful data scientists.

Academia or data science? Photo by Kaleidico on Unsplash.

I experienced the academia-to-industry transition first-hand. After I graduated with a PhD in political science from the University of Colorado – Boulder in 2012, I packed everything I owned into a compact car and drove to Vancouver, Canada, to start a non-academic career.

I knew that I did not want to continue in academia for various reasons, including the precariousness of tenure-track jobs and their typical small-town locations. Instead, I built a career in data science that started with market research, progressed to econometric analysis, and pushed me to learn to code (in Python, specifically). That in turn led me to build predictive models for business questions such as churn, locational analysis, customer similarities, recommendations, and more. As I’ve continued to learn and grow as a data scientist since joining Microsoft, I’m pausing to share my take on the transition from PhD to data science in industry.

I think of what I’ve learned as being in three categories: The good, the tricky, and the promising path ahead.

The good

The good news for graduates with PhDs is that the skills they have developed over the course of their studies make them extremely well suited for a career in data science. Here are six skills that most PhDs develop that can help them further their non-academic careers:

  1. Hypothesis testing: Most, if not all, graduate research projects involve some form of scientific hypothesis testing. Graduate studies teach students to be curious and ask multiple questions, testing various hypotheses instead of believing in one solution and pursuing it until the data tells them what they want to hear. This not only makes for better research but also creates an environment of growth and learning that is not stuck in a confirmation bias loop (the tendency to cherry-pick evidence that confirms existing hypotheses and biases). The result is a hypothesis-testing mindset that does not assume or “want” a particular outcome but asks, with curiosity, what the data is trying to reveal.
  2. Qualitative research/subject matter expertise: One of the best ways to generate hypotheses to test is to conduct qualitative research. Social scientists typically run case studies on their subject of expertise to understand the topic more deeply, learn from locals and knowledge centers, and then generate new hypotheses to test within their more quantitative work. I find that this process lends itself very well to the data science field, where the data scientist must collaborate with stakeholders and SMEs to understand the business context in order to build high-quality models.
  3. Quantitative/statistical skills: Perhaps surprisingly to some, most social scientists complement qualitative methods with quantitative ones in their research. They implement a variety of econometric methods such as regression, classification, time series analysis, and, importantly, causal inference. Any student who has experience with these methods has an easier time transitioning to data science and other analytical fields.
  4. Coding skills: Similar to quantitative skills, coding skills are being taught and used more and more in social science graduate studies. In my experience, statistical packages such as Stata, SAS, and SPSS are more commonly used than, say, Python or R, due to their more user-friendly interfaces and ready-made modeling options. But even if the student is not exposed to the more common data science programming languages such as R and Python, any coding experience helps tremendously in the transition to a more flexible programming language. Finally, several social sciences such as economics and political science increasingly use R, which is an excellent tool for future data scientists to master.
  5. Communication/presentation skills: Graduate students attend multiple conferences and talks, and also defend their theses during their studies. The amount of time and effort they put into preparing these communications makes them skilled presenters by the time they graduate. This is largely due to the amount of scrutiny they are exposed to by peers, professors, and other experts in their fields. A future academic must learn how to concisely get to the point and make a data-driven, defensible point about a complex subject. These skills are sought after in the data science field: The ability to distill complex model results into actionable business insights is a key skill for any data scientist.
  6. Project management: Graduate studies are essentially projects that the student must manage and see through from end to end. This skill is essential for data science projects where often the individual contributor is asked to take on an ambiguous question, fit it into a solvable data science problem, and execute the solution. I find that this is the skill I most employ in my work at Microsoft daily and it saves me time and energy while also preventing obstacles down the line.

The tricky

Now here’s the tricky part. Getting a PhD is a very long, tedious process that makes the person so deeply specialized that it is hard to find applications of that specialty in almost anything but academia. Moreover, PhD students tend to feel (perhaps rightly) that they are the expert on a given topic and therefore deserve seniority, at least on certain aspects of a given job. This can become contentious in an industry setting, where workplace experience, stakeholder management, and business wisdom are valued and their absence is considered a handicap for fresh PhD graduates.

In my experience, the best way to resolve this tension comes from humility and an eagerness to learn and grow. Having a growth mindset is an excellent panacea for a mismatch between expectations and the reality of the modern workplace. I started my career without expectations and just wanted to be a true sponge, learning about how the business side of the equation works. Having no assumptions and showing curiosity and genuine interest in “how things work” is by far the best advice I have gotten in industry. This attitude helped me seamlessly transition from academia into more business-facing roles that require business perspectives that are not always acquired during graduate school. Here are some tips and tricks to make the transition less clunky:

  1. Always collaborate within and across teams: Much like a graduate student needs the “buy-in” of their advisor, other professors, peer-reviewers, and editors in their field, in business it is critical to find the right support mechanisms and buy-ins for a project to land and make a difference. Although there are similarities between the two sides, there are still a few aspects of the world of industry that require extra effort on behalf of the new graduate. In academia, peers can sometimes be highly critical of each other’s work, and this can make the outcome better in the end. I find that, in industry, it is much more productive to focus on the complementarity of the work rather than on a critique of it. Because every team in business has a different mandate, it benefits all parties to focus on concerted efforts.
  2. Reverse the engagement timeline: In academia, students typically engage with others in the field after writing and presenting an article. Their initial engagement with existing work comes from literature reviews, not necessarily from talking to the contributors. In industry, we have the luxury of reversing this process by engaging with stakeholders early on, getting an understanding of what they may be looking for before we even start building a model or thinking about a solution. I find this reversal extremely helpful in putting the right foot forward and making the project as relevant and impactful as it can be.
  3. Perfection versus MVP (minimum viable product): This is particularly tricky because timelines can be a lot longer in academia. Typically, an academic paper may take six months to two years to get published in a peer-reviewed journal. Timelines are much shorter in industry. At Microsoft, engineering (which includes data science) works in six-month cycles that we call semesters. A recent graduate must appreciate the importance of these cycles and let go of perfection in favor of the MVP way of thinking. This does not mean that the work must be of lower quality. The Pareto principle (also known as the 80/20 rule) states that roughly 80 percent of outcomes stem from 20 percent of causes. In industry, it is much more productive and faster to produce an MVP from that critical 20 percent and then improve on it as time permits.
  4. High-end data science solution versus basic business need: Academia comes with a certain level of freedom in the research and methodologies employed. One of the big challenges in transitioning to data science is the sometimes-false expectation that you will be building deep learning models all the time to answer any business question. I tackle the reality with a middle-ground strategy: I first build the simplest possible model or solution that helps the business. Then I iterate on it with more technical solutions that might improve the results in due time. In my experience, most managers are happy to let data scientists experiment with different methods as long as the MVP is delivered and the business questions are answered.
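
To make the "simplest model first, then iterate" strategy from the last two tips concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the toy churn data, the single feature (months since last purchase), and the helper names are my assumptions, not a real project. It first builds the most basic baseline possible (always predict the most common outcome) and then takes one small step up (a one-feature threshold rule), so that any fancier model has something to beat.

```python
from collections import Counter

# Hypothetical toy data: (months_since_last_purchase, churned) pairs.
customers = [(1, 0), (2, 0), (1, 0), (3, 0), (6, 1),
             (7, 1), (2, 0), (8, 1), (5, 1), (4, 0)]

features = [months for months, _ in customers]
labels = [churned for _, churned in customers]


def accuracy(predictions, labels):
    """Share of predictions that match the observed labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Step 1 (the MVP): always predict the most common outcome.
majority_label = Counter(labels).most_common(1)[0][0]
baseline_acc = accuracy([majority_label] * len(labels), labels)


# Step 2 (first iteration): predict churn when the customer has been
# inactive for at least `threshold` months; pick the best threshold.
def best_threshold(features, labels):
    candidates = sorted(set(features))
    return max(
        candidates,
        key=lambda t: accuracy([int(x >= t) for x in features], labels),
    )


threshold = best_threshold(features, labels)
rule_acc = accuracy([int(x >= threshold) for x in features], labels)

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"threshold rule (>= {threshold} months) accuracy: {rule_acc:.2f}")
```

Note that both accuracies here are measured on the same data used to choose the threshold, which flatters the rule; in real work you would evaluate on held-out data. The point is only the order of operations: deliver the baseline, then justify each added layer of complexity by how much it improves on it.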

These are some of the differences between academia and industry that might trip up a fresh graduate in the workplace. Now here comes my favorite part about transitioning from one to the other.

The promising path ahead

After learning the ropes in industry and developing the skills to manage the business context, I would like to think that the sky is the limit for recent graduates.

Data science is an almost perfect field for social science PhDs who want to leave academia but also want to make good use of the skills they have developed over their educational career. It offers everything from a quantitative bent and endless learning prospects to opportunities for presenting, teaching, and communicating, all while making a real impact on business outcomes and leaving room to explore new research ideas.

I encourage any and all PhD students to consider data science as a productive, fun, scientific field where they can both contribute in heaps and shine as individuals.

Ceren Altincekic is on LinkedIn.
