Data Science Progress — 6 weeks

EKS Project
Into becoming a Data Scientist
3 min read · Mar 8, 2021

I have a clear goal: to become a Data Scientist specialized in Bio-Sciences. Following a colleague's advice, I decided to imagine where I want to work and to take advantage of being at the beginning of my career. He once said, “Commit to something and go towards a specific goal.”

It is time to invest my time in taking courses, a lot of courses, writing a shining project, creating my personal brand and brushing up my skills, so that I can combine it all with my previous experience.

On January 26th I started a bootcamp on Data Science at the school “The Bridge” in Madrid.

It was exciting, and I have to say that I had not realized how much I missed interacting with others. I guess “remote working” has its pros and cons.

THE CURRICULUM

The course started with a so-called “Ramp up” to introduce us to the basics of Python and bring the whole class to the same level.

The topics we covered are the following:

PYTHON

0. Installation of Anaconda and Jupyter notebooks

  1. Markdown language in Jupyter
  2. Variables, print, comments, del statement, data types, data type conversion, input & None type
  3. Basic arithmetic operations, comparison operations, boolean algebra, built-in functions, methods, lists
  4. Control flow using loops (for, while) and conditional clauses (if/elif/else), break/continue and try/except
  5. Collections: lists, tuples, dictionaries and sets. Operations, methods and converting between collections
  6. Functions: Definition, syntax, applications
  7. Object-oriented programming: Classes, attributes, constructor, methods, documentation
  8. Libraries and modules.
  9. Functional programming: Map, reduce, filter, timeit
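
To give an idea of what the functional programming topic looks like in practice, here is a minimal sketch using only the Python standard library. The list of numbers is made up for illustration and is not taken from the course material.

```python
from functools import reduce
import timeit

numbers = [1, 2, 3, 4, 5, 6]  # made-up example data

# map: apply a function to every element
squares = list(map(lambda x: x ** 2, numbers))

# filter: keep only the elements that satisfy a condition
even_squares = list(filter(lambda x: x % 2 == 0, squares))

# reduce: fold a sequence into a single value (here, a sum)
total = reduce(lambda acc, x: acc + x, even_squares, 0)

print(squares)       # [1, 4, 9, 16, 25, 36]
print(even_squares)  # [4, 16, 36]
print(total)         # 56

# timeit: measure how long a snippet takes; the result depends on the machine
elapsed = timeit.timeit(lambda: list(map(lambda x: x ** 2, numbers)), number=10_000)
print(f"10,000 runs took {elapsed:.4f} seconds")
```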

SQL

I should point out that we were using SQLite, whose SQL dialect differs slightly from that of other database engines. We worked with it inside Jupyter notebooks, using Python libraries and other tools.

0. Configure the environment

  1. Data model
  2. Queries: SELECT, LIMIT, DISTINCT, WHERE, ORDER BY, GROUP BY, JOIN
  3. Accessing relational Databases
  4. Creating relational Databases
  5. NoSQL databases
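
As a rough sketch of how the queries listed above can be run from Python, the example below uses the standard library `sqlite3` module with an in-memory database. The `patients` and `visits` tables and their contents are hypothetical, invented only for this illustration.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables, created only for this example
cur.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute("CREATE TABLE visits (patient_id INTEGER, cost REAL)")
cur.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [(1, "Ana", "Madrid"), (2, "Luis", "Madrid"), (3, "Eva", "Bilbao")],
)
cur.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, 50.0), (1, 30.0), (2, 20.0), (3, 70.0)],
)

# DISTINCT and ORDER BY
cities = cur.execute("SELECT DISTINCT city FROM patients ORDER BY city").fetchall()
print(cities)  # [('Bilbao',), ('Madrid',)]

# JOIN, WHERE, GROUP BY, ORDER BY and LIMIT in a single query
rows = cur.execute(
    """
    SELECT p.city, COUNT(*) AS n_visits, SUM(v.cost) AS total_cost
    FROM patients AS p
    JOIN visits AS v ON v.patient_id = p.id
    WHERE v.cost > 10
    GROUP BY p.city
    ORDER BY total_cost DESC
    LIMIT 5
    """
).fetchall()
print(rows)  # [('Madrid', 3, 100.0), ('Bilbao', 1, 70.0)]

conn.close()
```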

MATHEMATICS

Calculus

  1. Functions
  2. Derivatives
  3. Optimization: Least Squares

Algebra

Working with Numpy

  1. Matrices
  2. Identity matrix, Transpose and inverse
  3. Vectors
  4. Operations with matrices & vectors
  5. Dot product
  6. Similarity measurement (cosine)
  7. Linear combinations
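
A few of these algebra topics fit in a short NumPy sketch. The vectors and the matrix below are made up for illustration, and `cosine_similarity` is a small helper written for this example, not a NumPy function.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a
c = np.array([3.0, 0.0, -1.0])  # perpendicular to a (dot product is 0)

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: dot product over product of norms."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(cosine_similarity(a, b))  # close to 1: the vectors point the same way
print(cosine_similarity(a, c))  # 0: the vectors are perpendicular

# Transpose, inverse and identity matrix
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
M_inv = np.linalg.inv(M)
print(M.T)                                # transpose (M is symmetric, so M.T equals M)
print(np.allclose(M @ M_inv, np.eye(2)))  # True: M times its inverse is the identity
```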

Statistics

  • Descriptive statistics
  1. Mean, standard deviation
  2. Position statistics
  3. Histograms and frequencies
  4. Relations between numerical variables
  5. Exploratory data analysis
  6. Interpretation and presentation of data
  • Inferential statistics
  1. Probability
  2. Random variables
  3. Probability distributions
  4. Normal distributions
  5. Confidence intervals
  6. Absolute error
  7. Sample size
  8. Hypothesis testing

After all this we started the “Data Analysis” module, covering the following:

NUMPY

  1. Lists and matrices
  2. Numpy
  3. Arrays
  4. Array attributes
  5. Indexing
  6. Slicing, subarrays
  7. Reshape
  8. Type of data
  9. Concatenate
  10. Substitution
  11. Copy
  12. Split
  13. Aggregations
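
Several of the array operations in this list can be tried in a few lines. This is only a small sketch on a made-up array, not an exercise from the course.

```python
import numpy as np

arr = np.arange(12)        # the integers 0 to 11
grid = arr.reshape(3, 4)   # reshape into 3 rows and 4 columns

# Array attributes
print(grid.shape, grid.ndim, grid.size)  # (3, 4) 2 12

# Indexing and slicing: rows 0-1, columns 1-2
sub = grid[:2, 1:3]
print(sub)  # [[1 2]
            #  [5 6]]

# A slice is a view on the original data; copy() gives an independent array
sub_copy = sub.copy()
sub_copy[0, 0] = 99        # grid is not affected by this change

# Concatenate and split
stacked = np.concatenate([grid, grid], axis=0)  # shape (6, 4)
left, right = np.split(grid, 2, axis=1)         # two arrays of shape (3, 2)

# Aggregations
col_sums = grid.sum(axis=0)
print(col_sums)     # [12 15 18 21]
print(grid.mean())  # 5.5
```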

PANDAS

  1. Introduction to pandas objects
  2. Series
  3. DataFrame
  4. Index
  5. DataFrame dimensions, columns, few observations, types of data, missing values and statistics
  6. Indexing (loc, iloc)
  7. Slicing
  8. Masking
  9. Fancy indexing
  10. Missing values: None, np.nan, null
  11. Aggregation and grouping: GroupBy (aggregate, filter, transform, apply, applymap, map)
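
To show how a few of these pandas topics fit together, here is a minimal sketch on a tiny, made-up DataFrame. The column names `group` and `value` are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Made-up data with one missing value
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c"],
    "value": [1.0, 3.0, 10.0, np.nan, 5.0],
})

# Missing values
n_missing = df["value"].isna().sum()
print(n_missing)  # 1

# loc is label-based, iloc is position-based
by_label = df.loc[0, "value"]
by_position = df.iloc[0, 1]

# Masking: keep rows where the condition holds (NaN never satisfies it)
big = df[df["value"] > 2]

# GroupBy + aggregate: one result per group (NaN is skipped by mean)
means = df.groupby("group")["value"].mean()
print(means)  # a -> 2.0, b -> 10.0, c -> 5.0

# GroupBy + transform: result has the same length as the original column
df["centered"] = df["value"] - df.groupby("group")["value"].transform("mean")

# GroupBy + filter: keep only the groups with more than one row
multi = df.groupby("group").filter(lambda g: len(g) > 1)
print(multi["group"].unique())  # ['a' 'b']
```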

All this content is both theoretical and practical. As the saying goes, “Practice makes perfect.”

Now, reflecting on all this content, I am quite proud of what I have been able to learn in these 6 weeks. It has been very productive, and even if I have not mastered every skill I learnt, I would say I have increased my Python skills by 500%.

I also have to say that I was very strict during the first weeks, and it did me no good to be so focused that I would not go out or meet any friends or family. I think this is a long-distance race, and I need to keep a good balance between effort and leisure time.

I will keep learning and posting my progress. I am looking forward to working with real-life datasets and applying all my new knowledge to everyday problems.

See you in the next post! Thanks for reading!


EKS project is about a young woman hyped about life. Looking for a purpose. Data Scientist & Biomedical Engineer working my way up. erika.kvalem@gmail.com