Taking the Plunge: How to Transition from Analyst to Data Scientist

Chris Bruehl
Learning Data
10 min read · Oct 3, 2023



Most data professionals begin their career as some sort of analyst. They learn tools like Excel, SQL, and Power BI, and go to work diving into data to extract insights that help companies make better decisions.

At some point, whether it’s driven by curiosity, boredom, or the desire for more pay, many analysts see what data scientists in their company or on the internet are doing and wonder whether the transition to data science is right for them. I wrote about the factors analysts should consider before making this decision here.

For those of you who decide you are interested — I’m going to lay out how to make that transition, including a basic curriculum of topics & skills to get familiar with, common paths to transitioning roles, and how to position yourself when applying to data science jobs without prior experience.

What Skills do I need to know?

Data scientists often joke about how job postings seem to require a million skills and an improbable amount of experience with new tools. The truth is, few data scientists are truly masters of all of the following skills, and instead have varying levels of strength with them.

The curriculum can be daunting, so I will add some commentary on areas that can be picked up on the job or that require basic familiarity rather than proficiency to get your foot in the door.

Probability & Statistics

Whether you are focused on A/B testing, machine learning, or simply tasked with a deep technical analysis, you will need to be strong in this area. You don’t need PhD-level coursework, but you should be familiar with the following concepts:

  • Central Tendency: Mean, median, mode and when to use each
  • Spread and Shape of Distributions: Standard deviation, variance, and skew
  • Correlation & Linear Relationships
  • Bayes’ Theorem
  • Probability distributions (e.g. Normal, Binomial, Poisson)
  • The Central Limit Theorem
  • Hypothesis Testing: p-values, types of error, A/B Testing
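To make the hypothesis testing bullet a bit more concrete, here is a minimal sketch of a two-sided, two-proportion z-test for an A/B test, using only Python’s standard library. The function name and the conversion counts are made up for illustration, and on the job you would more likely reach for a statistics library than write this by hand.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between two groups."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 200 of 2,000 control users convert vs. 250 of 2,000 in the variant
z_stat, p_val = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z_stat:.2f}, p-value = {p_val:.4f}")
```

A small p-value here says the observed gap would be unlikely if the two versions truly converted at the same rate. It does not rule out a Type I error, which is why the "types of error" part of that bullet matters.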

Mathematics

Data scientists do not need to be able to perform complex multivariate calculus by hand, but they do need to have some intuition about the following concepts:

  • Multivariable Calculus, including derivatives and gradients
  • Linear Algebra: Understand basic concepts like vectors, matrices and transposition, as well as dot products, eigenvalues, and eigenvectors
  • Optimization methods — particularly Gradient Descent
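If gradient descent feels abstract, a tiny one-dimensional sketch can help build the intuition. The objective function, learning rate, and step count below are assumptions I picked for illustration; real models apply the same idea across many parameters at once.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective: f(x) = (x - 3)^2, whose derivative is f'(x) = 2 * (x - 3)
def grad_f(x):
    return 2 * (x - 3)


x_min = gradient_descent(grad_f, start=0.0)
print(f"Estimated minimum at x = {x_min:.4f}")
```

Each step moves x a little in the direction that lowers f, so the estimate settles near x = 3, where the derivative is zero.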

Programming

You do not need to be an elite coder, but you should be proficient with SQL and a data-oriented programming language like R or Python. Within Python, having experience with NumPy, pandas, and scikit-learn is a good foundation to start with. In many cases you will learn machine learning libraries as you learn machine learning models.

Data Manipulation & Cleaning

If you’re an analyst you likely already have these skills.

  • Be able to work with relational databases, understand star schemas, etc
  • Be able to select, filter, join, aggregate, and calculate statistical metrics for data
  • Understand how to work with missing data and the pros and cons of methods like deletion and imputation
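As a quick self-check on these skills, here is a small sketch using Python’s built-in sqlite3 module. The tables, column names, and values are invented for illustration. It joins a fact table to a dimension table, then compares two ways of handling a missing order amount: deleting the row versus imputing the overall mean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
    INSERT INTO orders VALUES
        (10, 1, 100.0), (11, 1, NULL), (12, 2, 50.0), (13, 3, 200.0), (14, 2, 150.0);
""")

# Deletion: join orders to customers, drop rows with a missing amount, aggregate by region
deletion_query = """
    SELECT c.region, COUNT(*) AS n_orders, AVG(o.amount) AS avg_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.amount IS NOT NULL
    GROUP BY c.region
    ORDER BY c.region
"""
deleted = conn.execute(deletion_query).fetchall()

# Imputation: keep every row and fill missing amounts with the overall mean
imputation_query = """
    SELECT c.region, COUNT(*) AS n_orders,
           AVG(COALESCE(o.amount, (SELECT AVG(amount) FROM orders))) AS avg_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""
imputed = conn.execute(imputation_query).fetchall()
conn.close()

print("Deletion:  ", deleted)
print("Imputation:", imputed)
```

Deletion keeps the averages honest but shrinks the row count, while mean imputation keeps every row at the cost of pulling the affected group toward the overall average. Which trade-off is acceptable depends on why the data is missing.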

Data Visualization

Again, this is a skill you likely already have some proficiency in, but being familiar with common visualizations like scatterplots, histograms, and line charts, and knowing when to use them, is just as essential for data scientists as it is for analysts.

  • In Python, this means being able to confidently work with some combination of libraries like matplotlib, seaborn, and plotly

Machine Learning

This is the piece that draws most folks into the field, and it is very satisfying to build and deploy a model that has a positive impact for your organization, but we can’t credibly build models without the skills listed above. You should be familiar with how and when to use the following concepts and models:

  • Supervised vs. Unsupervised Learning — Are we predicting a clearly defined value (supervised) or are we looking for hidden patterns in our data (unsupervised)
  • Linear Regression — The first model most people learn, used for predicting numeric targets in an intuitive manner
  • Logistic Regression: A simple and effective classification modelling technique
  • Tree-Based Models: Decision Trees, Random Forests, and Gradient Boosting, for both regression (predicting numeric variables) and classification (predicting categorical variables)
  • Clustering & Dimensionality Reduction: K-Means Clustering and Principal Components Analysis are good starting points
  • A basic understanding of time series forecasting — know how to split data and use a forecasting tool like Facebook Prophet to start, and branch into more advanced techniques like ARIMA or deep learning methods if needed
  • Natural Language Processing: Often a specialty domain, but having the basics of NLP down can help with more traditional ML problems.
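To show what the simplest supervised learning workflow looks like end to end, here is a sketch of one-feature linear regression with a train/test split, written in plain Python so nothing is hidden. The synthetic data and the helper function name are assumptions for illustration; in practice you would typically use a library like scikit-learn rather than fitting by hand.

```python
import random


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise
rng = random.Random(42)
data = [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in range(100)]

# Hold out 20% of the rows so the model is scored on data it has not seen
rng.shuffle(data)
train, test = data[:80], data[80:]

slope, intercept = fit_simple_linear_regression(*zip(*train))

# Mean absolute error on the held-out test set
mae = sum(abs(y - (slope * x + intercept)) for x, y in test) / len(test)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, test MAE = {mae:.2f}")
```

The fitted slope and intercept should land close to the true values of 2 and 1, and the test error gives an honest read on how the model handles new data. A random split like this is not appropriate for time series, where the test set should come after the training set chronologically.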

The Data Science Workflow

Know how to plan and execute a data science project, from project scoping, target definition, and defining success to data splitting and model selection & evaluation methods. You will learn many of these as you learn machine learning methods.

Other Tools

These are usually nice-to-haves for entry-level roles, but a basic understanding of cloud computing environments like AWS or Azure, Git/GitHub, and a big data tool like Spark can help your chances of landing a role. Deep learning is a domain that can also be learned later on, depending on your area of interest.

Again, this can seem daunting. Many of these topics take a few hours or days at most to learn individually, but pieced together they will take several months to really get comfortable with. I want to stress that you don’t need to be a master of all of these to land a role, but the more boxes you check, the better your odds.

Common Paths to Data Science Roles

Transitioning into data science is hard. There is a lot of gatekeeping in the field for data scientist roles, and you will often be interviewed on topics that have little bearing on your day-to-day work. Many data science roles also are primarily analyst roles with a larger focus on statistical testing and quantitative metrics. I’ve seen data scientists get roles in a variety of ways, and as you move down the following list, the cost increases, but the difficulty of getting a job decreases.

Self Study

There are a ton of great free resources that cover data science topics on the web. One of my personal favorites is StatQuest on YouTube by Josh Starmer. There are also relatively inexpensive courses and books on these subjects (like mine on Maven Analytics or Udemy), but given the length of the curriculum, it can be challenging to stay focused and tackle all of these topics.

It’s also hard to build confidence in these fields when you’re not getting feedback from someone with experience. In my role at a data science bootcamp, a lot of the feedback I’d give to students was simply “keep going, you’re on the right track”, but there are a lot of small things, like knowing when your model is truly done or what makes a good project, that many of these courses simply can’t cover. If you are disciplined and able to participate in communities that provide feedback, this is a good option, although even then many organizations might not believe your skills are credible. Some will give you the option to demonstrate your skills, but getting past an initial resume screen can be hard.

Mentorship

If you are currently working as an analyst in an organization that has data scientists, it is possible to express your interest in data science and try to find opportunities that get you closer to a data science team. I’ve seen several folks make the transition into this field by working with data scientists and slowly transitioning to data science tasks as they learn more about the work they do. Many teams will be happy to mentor you in exchange for a bit of grunt data work ;), but this will really depend on your organization and the culture of the teams.

Bootcamps

Bootcamps can be a great way to build these skills quickly, especially if you are already coming from an analyst or STEM background. If the bootcamp has a good reputation (you need to be careful here), this is often a great path in terms of cost-benefit analysis.

The pace of bootcamps can be grueling, but learning with a group of people who are on the same path has always been a massive benefit to me and many of the students I taught. Being immersed in a group and learning the ‘language’ of data science, while sharing ideas and seeing what other folks are building can help you wrap your head around these concepts much more quickly.

My biggest piece of advice here is to dig into the track record of graduates. Bootcamps that have some incentive or staff to help you find a job afterwards are a big plus, because trying to navigate the job market on your own can be an emotional rollercoaster.

Formal Degrees

There are now a host of Bachelor’s and Master’s degrees in data science. They are expensive relative to the other paths, but as long as you’re applying to reputable institutions, you have a much higher chance of landing a job via this route. Employers tend to rate a formal degree as a much stronger signal of competence than self study or bootcamps, and universities tend to have established career services and relationships with employers that make getting interviews much easier.

Given what I’ve laid out above, it’s obviously not a trivial decision to jump into this field, but the ROI even for formal degrees is worth it, if you are passionate about these topics. If you’re not sure, then I suggest checking out a few resources for self study and seeing if they click. It’s a tough road to go down if you’re only doing it for the money.

How do I get a job once I’ve learned what I need?

I’ll be blunt — it’s not easy. This is especially true if you’ve taken the self-study path and are navigating the job market on your own. But it’s not impossible, either.

More than anything, it takes a healthy amount of patience and persistence. Even with some experience, and no problems getting called in for initial interviews, every interview process is unique and each team seems to have specific factors that might be red flags for them.

So it’s important to keep putting yourself out there and know that a lot of the time, if you interview for a role and don’t get it, there aren’t any lessons to be learned; it just wasn’t a fit. (Note: I still haven’t been able to avoid the emotional rollercoaster of interviewing entirely, but it does get easier.)

All that said, let’s focus on some things we can control:

  1. Project Portfolio: This is huge for career changers. You don’t have the benefit of having a host of data science roles on your resume that you can speak to. Recruiters and hiring managers will be going out on a limb to bring a career changer into the interview process, so you’re going to want to show them through projects that you have the skills necessary to do data science work. I recommend tackling a few different types of projects (regression, classification, unsupervised), as well as industries (e.g. healthcare, retail, e-commerce) to showcase your versatility. If you have the time, or really want a specific role, building a project that is related to the role you’re applying to is a great idea.
  2. Target Entry or Junior Level Roles: This probably shouldn’t be too controversial, but unless you have a very powerful referral, it’s very unlikely you’ll be considered for a Senior role, even if you had a Sr. or Lead analyst title in the past — remember, this is a new career ladder, and most of us have to start at the bottom or one rung up from the bottom.
  3. Networking & Meetups: People will be much more likely to take a chance on interviewing you and hiring you if they’ve met you and can put a face to your resume. Cold applying to roles means your resume without data science experience will be compared directly to many others that likely have that experience. But you can tilt the odds significantly in your favor by meeting hiring managers at meetups and career fairs as well as using LinkedIn to connect with recruiters and hiring managers. At the very least they may save you some time by telling you it’s not a fit before spending all that time applying.
  4. Interview Prep: Make sure you are ready for interviewing before you apply. There are a lot of free and paid sites that have interview questions asked in real data science interviews. If you aren’t getting many correct — make sure to hit the books more. But once you begin applying, make sure to look at GlassDoor and other sites for hints as to what might be covered in the interview process — your competition will be doing the same. Also, don’t be afraid to ask your recruiter for clues — larger tech companies provide entire interview guides that lay out what is fair game, but smaller ones will often give you some general topics to focus on if you ask.
  5. Technical Challenges: Many interviews will ask you to code live in SQL or Python to solve a data problem. These will vary vastly in difficulty, and sometimes you’ll get brain freeze despite your best preparation. That said, make sure to practice querying and analyzing new datasets with the tools you’ll be using on the job to help prepare, and if you can, practice with a timer, or even better, a partner. I write a lot more about interview prep here.

So, I’ve just laid out in broad terms the steps it takes to transition from analyst to data scientist. If it seems like a lot, that’s because it is. What will carry you through this process is a genuine passion for data, problem solving, statistics, mathematics, and coding. Patience and consistency are also critical. You won’t be able to absorb all of these skills and knowledge in a month, nor will you retain that knowledge without consistent practice.

But the journey can be very rewarding, and if you stick with it through the difficult concepts in the curriculum, as well as the interviewing process, you might just find yourself with the title of Data Scientist.

Do you have more questions about making the leap? Or perhaps disagree with what I’ve laid out here? I’d love to hear it. Drop a thoughtful comment and I’ll do my best to reply!

In the meantime, if you’re curious about becoming a data scientist or want to dip your toe into the water, check out our Python and Data Science courses on the Maven Platform or Udemy.
