Data Science

The Basics.



You know when you commit to something and then realize it’s bigger than you thought it would be? You start to thank yourself for that first step you took. Such a journey in my life has been the Udacity Bertelsmann scholarship.

As big as the term ‘Data Science’ sounds, its fundamentals are so simple you might just miss them. This is why I’m putting together a quick fix for those starting late in this journey and for others who, like me, just want to revise the basics again.

I begin by crediting the Udacity Bertelsmann lessons for most of the info in this article, along with some outside reading. I made the illustrations below to make understanding and learning more fun. Enjoy!

Let’s get started:

If you’re just starting out on the Udacity journey, do NOT miss out on the following. Make sure to understand them before moving on to the other lessons.

Data Science basic definitions

Now that we’ve got the definitions, let’s look at some examples of constructs and operational definitions.

Constructs and Operational definition

Population and Sample.

Example of Kenya’s population and sample
Population and Statistic
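
To make the population and sample idea concrete, here is a minimal Python sketch. The ages are simulated with random numbers (they are NOT real census figures), and the names `population`, `sample`, `population_mean` and `sample_mean` are just my own labels for illustration. The mean of the whole population is a parameter, while the mean of the sample is a statistic that estimates it.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same result every run

# Hypothetical population: ages of 10,000 people (simulated, not real data).
population = [random.randint(0, 90) for _ in range(10_000)]

# A sample is a smaller subset drawn at random from the population.
sample = random.sample(population, k=200)

# A parameter describes the whole population; a statistic describes the sample.
population_mean = statistics.mean(population)  # parameter
sample_mean = statistics.mean(sample)          # statistic

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
print(f"Gap between the two:         {abs(population_mean - sample_mean):.2f}")
```

The two means will usually be close but not identical, which is exactly why we keep the words parameter and statistic apart.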

In this data science course, there was a lot of emphasis on EXPERIMENTATION and the elements associated with it. Let’s break it down:

(i) Case study:

Firstly, in any experiment, there needs to be a treatment group. A treatment group is the group of subjects chosen to receive a treatment. This group is monitored to understand the results of the treatments. What is measured and observed on each subject is known as the dependent variable, and it is plotted on the y-axis of a graph. The variable that the experimenter can change is called the independent variable, and it is plotted on the x-axis. In this case, the independent variable is the type of fertilizer.

In the above illustration, there are 4 subjects in the experiment. The first 3 pots/subjects make up the treatment group, and each receives a different treatment. Treatment 1 is to pour high-nitrogen fertilizer, treatment 2 is to pour high-potassium fertilizer and finally treatment 3 is to pour a high-phosphorus fertilizer. Now, the last pot remains untouched. This is called a ‘control group’: a subject (or group of subjects) that receives no treatment. It’s the baseline used for comparison, to know the actual impact of each treatment on the other subjects.
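
The comparison against the control pot can be sketched in a few lines of Python. The heights below are numbers I invented purely for illustration (the lesson does not give measurements), and `effect_vs_control` is a hypothetical helper name. The fertilizer type plays the role of the independent variable, and the plant height is the outcome we measure.

```python
# Hypothetical final plant heights in cm, one value per pot.
# These numbers are invented for illustration only.
heights_cm = {
    "high-nitrogen": 14.0,
    "high-potassium": 12.5,
    "high-phosphorus": 11.0,
    "control": 10.0,  # the untouched pot
}


def effect_vs_control(results, control_key="control"):
    """Return each treatment's outcome minus the control's outcome."""
    baseline = results[control_key]
    return {
        name: value - baseline
        for name, value in results.items()
        if name != control_key
    }


effects = effect_vs_control(heights_cm)
for treatment, difference in effects.items():
    print(f"{treatment}: {difference:+.1f} cm compared with the control pot")
```

Without the control pot there would be no baseline, so we could not tell how much of the growth came from the fertilizer and how much would have happened anyway.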

(ii) Studies where there are NO treatments are called observational studies. These kinds of studies do not interfere with the subjects but merely observe them. They usually compare two groups of people from a population.

Case study:

The observation on the left is only for comparison purposes. Therefore, no subjects are altered and no treatments are given.

This sort of experiment results in associations. To know more on this read my previous article on

In short, are the energy levels of these people only a result of what they eat? Energy levels can also depend on the number of hours slept, general health conditions or perhaps even dehydration. Always be careful before saying y is the result of x. There might be a correlation but no causation.
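
Here is a small simulation of that trap, as a sketch under made-up assumptions. In this pretend world, hours slept drive both a person’s diet score and their energy level, and the diet score is never used to calculate energy. All the numbers are randomly generated, and the names `sleep_hours`, `diet_score`, `energy` and `pearson` are my own.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Simulated hidden factor: hours slept per night for 1,000 people.
sleep_hours = [random.uniform(4, 9) for _ in range(1000)]

# Both outcomes depend on sleep plus independent noise.
# Note that diet_score is never used when computing energy.
diet_score = [hours + random.gauss(0, 1) for hours in sleep_hours]
energy = [hours + random.gauss(0, 1) for hours in sleep_hours]

r = pearson(diet_score, energy)
print(f"Correlation between diet score and energy: {r:.2f}")
```

The correlation comes out clearly positive even though, by construction, diet has no effect on energy here. An observational study alone cannot tell these two situations apart.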

(iii) The third part of experiments is to make a mental note of how placebos and blinding work.

Generally, a placebo is packaged to look like the real drug, treatment or shot, but inherently it’s a dupe. It has no active effect on health. Scientists include placebos in their experiments so that the effects of the real drug can be compared against them.

Below is an example:

The participants in the sample initially don’t know whether they’re given the real drug or the placebo. This is known as blinding, and because only the participants are kept in the dark, it is called single-blind. This way the experimenters can determine the effectiveness of the multi-vitamin and its side effects.
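
A single-blind setup can be sketched like this in Python. The participant IDs are hypothetical, and the 50/50 split is my own assumption (the lesson does not say how the groups were formed). The experimenter keeps `assignment_key`, while every participant only ever sees the same generic label.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical participant IDs, invented for illustration.
participants = [f"P{i:02d}" for i in range(1, 11)]

# Random assignment: shuffle, then give the first half the real
# multi-vitamin and the second half the placebo (assumed 50/50 split).
shuffled = participants[:]
random.shuffle(shuffled)
half = len(shuffled) // 2

# Only the experimenter sees this key.
assignment_key = {pid: "multi-vitamin" for pid in shuffled[:half]}
assignment_key.update({pid: "placebo" for pid in shuffled[half:]})

# What each participant sees: an identical-looking capsule.
participant_view = {pid: "capsule" for pid in participants}

for pid in participants:
    print(f"{pid} sees: {participant_view[pid]}")
```

Because every participant sees the same thing, their expectations cannot differ between the two groups, so any difference in results is easier to attribute to the multi-vitamin itself.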


This should give anyone a good head start in the Udacity Data Science course. I hope to make summaries and illustrations for the other lessons more often.

Remember, there’s more to data than you might think!

Kalpa Vrikshika
Udacity Bertelsmann Data Science Scholarship 2018/19 Blog

~Data foundations graduate~ ~Udacity Bertelsmann Data Science Scholar~ ~Believing until I become it~ ~Happy place~