Data Scientist — Training Plan

Mikhail Raevskiy
Deep Learning Digest
2 min read · Jul 19, 2020

Thanks to Data Science, we can control takeoffs and landings at the largest airports, and we can analyze and predict the emergence of epidemics without doctors. Thanks not only to modern technology but also to the trained programs built into it, doctors in some branches of medicine can make complex diagnoses with high accuracy. Moreover, we even have unmanned vehicles, and every year they get better and smarter! I suggest you not stand aside from such an interesting and promising science, and instead take part in its development.

Photo by Campaign Creators on Unsplash

Basic (bridging module)

  1. CS50: Harvard's programming course. It covers how to construct algorithms and how to find the most efficient methods for solving a problem, with practice in C, Python, and JavaScript;
  2. Mathematics and Informatics: Why does a programmer need computer science?
  3. Learning to program with Udacity's “Learn to Code” Nanodegree.

Why is this module needed?

If you came to programming from another field, or previously worked as a layout designer or web designer, this module will help broaden your programming knowledge.

The core of Data Science

Udacity's “Data Analyst” continuing-education program (paid, in English).

If you would rather tailor the program to yourself, you can follow it for free. We suggest the following sequence of free courses, all of which are part of the program above:

  1. Introduction to inferential statistics;
  2. Introduction to descriptive statistics;
  3. Introduction to analytical data processing (using NumPy and Pandas);
  4. Introduction to primary data processing;
  5. SQL for analytical data processing;
  6. MongoDB for analytical data processing (if you are able and willing, you can also study analytical data processing using the R language);
  7. Introduction to machine learning;
  8. Data visualization and working with D3.js;
  9. A/B testing.

Machine Learning

  1. Introduction to machine learning;
  2. Learning TensorFlow with real-world applications;
  3. Machine learning and scaling;
  4. Neural networks and machine learning.

Software Development

Python

  1. Testing and debugging;
  2. Using Git and GitHub for version control;
  3. Building Reactive Analytical Web Applications in Python (article).

R

R Software Development (the individual courses that make up this curriculum are listed below).

  1. Environment and R;
  2. Advanced level of writing programs in R;
  3. Building R packages;
  4. Building data visualization tools.

Additional materials

  1. Introduction to Hadoop and MapReduce;
  2. Python as a web data access tool.

You don’t have to try to master everything at once. Some believe that a Data Scientist only needs to know Python and can do without R; others think the opposite. Either way, experiment, choose what suits you best, and you will undoubtedly succeed.

Mikhail Raevskiy
Deep Learning Digest

Bioinformatician at Oncobox Inc. (@oncobox). Research Associate