Data Scientist. What does it mean and how do you get in on it?

Guy Tsror
Published in That Data Guy
Nov 21, 2018

Let me just open by saying: if you’re in the field of data science, this post is probably completely irrelevant for you. But if you are on the cusp of a big career move or a new job opportunity, have a certain amount of curiosity, and find yourself attracted to the words ‘data’ and ‘science’, this might be more relevant for you.

I decided to write this piece mostly for myself, to recap some things I learned and realized during TMLS2018, which helped me get my head around some basic facts and fictions around data science in general and machine learning in particular. Most significantly, I think this is where I gave myself the rubber stamp acknowledging this area is definitely where I would like to grow and evolve, and I can’t wait to get my hands dirty.

Photo by Zia Syed on Unsplash

So, are you confused about what the often-vague term ‘Data Scientist’ means, and what you should think about when considering a career step in that direction? Well, the next couple of paragraphs might clear some things up.

First, data science is a very, very broad area, and can include different types of work. To simplify, we can speak of roughly five sub-roles under data science:

  • Data Engineer: responsible for data sourcing, meaning bringing the data in, cleaning it and handling it before it gets to the data scientists and others down the funnel.
  • Data Scientist: the one who gets to fiddle around with the data and create models based on it. Sometimes this might include cleaning the data and loading it into databases, and it often includes analyzing the data to better understand products and customers, and communicating the results to the team and management.
  • Business Analyst: the person responsible for the front-end of the data (visualization and insights) and for the higher-level understanding of the business itself and its clients.
  • Research Scientist: takes part in fundamental ML research, as in, researching new approaches and developing new methods in the field.
  • Machine Learning Engineer: responsible for the practical implementation of the models a data scientist may come up with, and possibly for developing more ML models.

Keep in mind, you should take these with a grain of salt. These are generalizations and can differ significantly from one company to another, and sometimes a single person might be required to do almost all of the aspects mentioned above. But as the field of data science is growing and evolving, companies understand that there is just too much to do for a single person to handle.

Still in for the ride on the data scientist train? Perfecto.

Photo by Carlos Muza on Unsplash

Now comes the question of how to get on it if you are not coming from previous work in the field. This has (of course) a lot of variables, mostly your professional and educational background, but in general, you should probably pick up some of these skills:

  • Methodology and terminology: what is a schema? What is a database? Relational DBs? RDBMS? Learn the basics (you might want to take a look at some Coursera courses on the topic).
  • Probability theory and statistics: these are essential parts of data science, and of machine learning in particular, and you had better have some sort of understanding of them.
  • Data tools and some basic programming: programming is essential, like in so many other technical fields, sorry if you thought you could get by without it. Lots of languages are being used today for data science purposes, but more and more companies rely on Python as the go-to. Data processing and data management tools like SQL and R are something you will deal with on a daily basis, so better pick up some of that as you go.
  • Build your portfolio: data science can be an art, in a weird way, and as with designers and other artistic professions, having a portfolio is going to give you points. Start playing around with data sets online (Kaggle is fantastic for that), implement some of the things you learned and practice the different tools you used. Having something to show or link to is always more convincing than saying ‘I know Python, I swear’.
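To make the first few bullets a bit more concrete, here is a minimal sketch of what ‘schema’, ‘relational’ and ‘basic statistics’ can look like in practice. It only uses Python’s standard library (sqlite3 and statistics). The table names, column names and numbers are all made up for illustration, so treat it as a toy and not as how any particular company does things.

```python
import sqlite3
import statistics

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema: which tables exist, which columns they have, and how they relate.
# (Hypothetical tables, invented for this example.)
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      REAL NOT NULL
    );
""")

# A few made-up rows.
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 10.0), (1, 30.0), (2, 20.0)],
)

# The 'relational' part: join the two tables on their shared key
# and compute the total spent per customer.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Basic descriptive statistics on the order amounts.
amounts = [amount for (amount,) in conn.execute("SELECT amount FROM orders")]
mean_amount = statistics.mean(amounts)
stdev_amount = statistics.stdev(amounts)  # sample standard deviation

print(rows)
print(mean_amount, stdev_amount)
```

If you can read the query and explain why the join works, you already have a decent handle on the terminology from the first bullet. A natural next step would be to swap the toy rows for a real data set, which is also a nice start for a portfolio piece.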

Some of the skills I mentioned above are very basic and you probably have a grasp of them from your studies or work experience, but others are really important if you are switching to this field from something not directly related, like different areas of engineering for example, or general software development.

As you build up your skills, stay up to date with what’s happening in the field. Read some blog posts. Follow companies that you are interested in and understand their use of data. Go to meetups in your city. Interact with professionals from an early stage, and you might learn something about the path they took, and who knows, maybe even get your foot in some company’s door!

As I stated in the beginning, I wrote this mostly to organize some thoughts and pieces of information I had in my head, and decided to share them with the world. I hope you find even a small piece of it helpful! Feel free to let me know how you feel and what you think, I just love me some criticism ❤.
