The Definitive Data Scientist Environment Setup

David Adrián Cañones
Published in DataTau
13 min read · May 4, 2020

Two Data Scientists at work

Intro and motivation

In this post I would like to describe in detail our setup and development environment (hardware & software) and how to set it up, step by step.

I have been using this setup for more than 5 years with few changes (mainly hardware improvements) at many companies, and it has helped me develop dozens of data projects. I have never missed a single feature while using it. This is the standard setup that both Pedro and I use at WhiteBox.

Why this guide? Over time, we found many students and fellow Data Scientists looking for a solid environment with some fundamental features:

  • Standard Data Science tools like Python, R, and their libraries are easy to install and maintain.
  • Most libraries just work out of the box with little extra configuration.
  • It covers the full spectrum of data-related tasks, from Small to Big Data, and from standard Machine Learning models to Deep Learning prototyping.
  • You do not need to break the bank to buy expensive hardware and software.

Hardware

Your laptop should have:

  • At least 16GB of RAM. This is the most important feature as it will limit the amount of data you can…

David Adrián Cañones
DataTau

🔬 Data Scientist | 🤖 Machine Learning Engineer | 💹 MBA