
New Series: The Full Stack Data Scientist

What a data scientist should know to build end-to-end data science solutions

Chris Schon
Apr 29, 2019 · 3 min read

Stack Overflow recently released their 2019 developer survey. It was full of interesting developer insights into everything from preferred technologies to optimism about the future. It made me think about the role of data science in technology and the skills required to integrate that role into the wider ecosystem. Developers have coined the term ‘full stack’ for a developer who is comfortable working on all aspects of web development. What would be the equivalent for data science?

Most respondents (51.9%) identify their roles as ‘full-stack developers’, with ‘data scientist or machine learning specialist’ taking up 7.9% of responses. Other data-related roles include data or business analyst (7.7%), data engineer (7.2%) and scientist (4.4%).

Stack Overflow survey 2019 ‘Developer Types’

Since many data scientists don’t have the luxury of support from large teams of developers, they must be able to build things and perform tasks that aren’t traditionally thought of as part of their role. This could include business analysis, data engineering, DevOps, database management and web development. I would consider a data scientist who is capable in all these areas to be a full stack data scientist. It’s not an option in the survey, yet… :)

Being able to build end-to-end solutions is the best way to prepare yourself for any role or project, work with a variety of teams, and ensure your insights bring value to the business. To do this, I believe you need a good knowledge of each of these areas:

💼 Business analysis. A sound understanding of the requirements, available data and goals of a project.

🏛 Infrastructure. The ability to efficiently design, deploy and work with a wide range of technologies and data management systems.

🚂 ETL. Data scientists should be able to build effective data processing pipelines so that their models and analysis are easily maintained.

💡 Machine learning. Extensive knowledge of techniques to build intelligent systems.

🖥 DevOps. Version control, deployment and monitoring of solutions are made easier by tools like Git, Docker and Airflow.

📱 Web app & API development. Building simple web applications and API endpoints will make it easier to integrate insights into other applications.

📊 Data visualisation. Create intuitive visualisations using a variety of tools.
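
To make the ETL item in the list above a little more concrete, here is a minimal sketch of an extract-transform-load pipeline using only the Python standard library. It is an illustration, not code from this series: the sample data, the `customer_totals` table and the function names are all hypothetical, and a real pipeline would read from files, APIs or a production database rather than an in-memory string.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; in practice this would come from a file, API or database.
RAW_CSV = """customer,amount
alice,10.50
bob,not_a_number
alice,4.50
carol,20.00
"""


def extract(text):
    """Read CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with unparseable amounts and total the rest per customer."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip malformed records
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + amount
    return totals


def load(totals, conn):
    """Write the per-customer totals into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO customer_totals VALUES (?, ?)",
        totals.items(),
    )
    conn.commit()


def run_pipeline(text, conn):
    """Chain the three stages so the whole job is a single call."""
    load(transform(extract(text)), conn)


# Assumption: an in-memory database stands in for a real data store.
conn = sqlite3.connect(":memory:")
run_pipeline(RAW_CSV, conn)
result = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(result)
```

Keeping each stage as its own small function means any one of them can be tested or swapped out independently, which is a large part of what makes a pipeline easy to maintain.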

The aim of this series is to cover each of these areas. Where a post showcases a particular tool, it will walk through a GitHub repository.

The first part of the series is already live! Check out The Full Stack Data Scientist Part 1: Productionise Your Models with Django APIs.

What would you like to see next? Vote for the next post below!


Applied Data Science is a London based consultancy that implements end-to-end data science solutions for businesses, delivering measurable value. If you’re looking to do more with your data, please get in touch via our website.

