I hold a Bachelor’s degree in statistics, which means I am comfortable with numbers but had no coding background. After graduation, I enrolled in a Big Data Analytics program in Canada. What I learned through my education is that combining numbers with coding knowledge lets me keep up with the new century. However, one important point to remember is that data is the oil of our stories. Without data, we cannot do anything, even if we have programming knowledge.

The world is a complex place, and people need answers to make decisions or discover new things across many topics. Numbers provide a way to find those answers, but people can get confused about which numbers are useful or what they mean. Statistics describes a situation using numbers and helps us quantify uncertainty.

AI is a hot topic in today’s world, but most people don’t realize how necessary statistics is. Our daily life is surrounded by data, and we need statistics to make that data useful. Let’s begin with the basic concepts of statistics.

According to Wikipedia, “statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data.”

Data Collection

The first step is data collection. Whether you’re a business, a researcher, or a student, collecting data needs to be one of your top priorities.

Data collection is one of the most important stages of conducting research. Even the best project in the world cannot succeed if you cannot collect enough data. Data collection can be a tough part of a project, which is why you need thorough planning, hard work, and patience to complete it successfully. Before you start collecting data, identify the goals and questions of the project; based on them, you can select a sample from a certain population.

Types of Data

Statistical tests help us make decisions about processes. The purpose is to determine whether or not there is enough evidence to reject a hypothesis about the process. To decide which tests are useful, we need to understand which data type we are dealing with.

A discrete variable can take only certain, countable values. Examples include sex and blood group.

A continuous variable can take any value within the range of the scale. For example, it could be a floating-point number such as 1.50.
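As a rough illustration of the difference, the sketch below stores one discrete and one continuous variable in Python. The records and the names `blood_groups` and `heights_m` are invented for this example; they are not real data. A discrete variable is naturally summarized by counting how often each value occurs, while a continuous one can be averaged.

```python
from collections import Counter
from statistics import mean

# Hypothetical records, invented for illustration only.
blood_groups = ["A", "O", "B", "O", "AB", "O", "A"]      # discrete: a fixed set of values
heights_m = [1.50, 1.62, 1.75, 1.68, 1.81, 1.59, 1.70]  # continuous: any value in a range

# Discrete data: count how many times each value occurs.
group_counts = Counter(blood_groups)

# Continuous data: arithmetic on the values is meaningful.
average_height = mean(heights_m)

print(group_counts.most_common())
print(round(average_height, 2))
```

Averaging the blood groups would make no sense, which is one reason the data type determines which summaries and tests are applicable.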

In general, knowing the type of data helps data collectors choose useful measurements. A natural question is which type of data is better. We care about many things that cannot be measured effectively on a continuous scale; opinions, for example, are discrete data. Generally, continuous variables are considered “better” because they offer higher sensitivity (how close to the target a measurement is) and more options for analysis.

Population and Sample

A population is any group of interest, that is, any group that researchers want to learn more about. The population can consist of people or of other things, such as animals or objects, about which information is desired.

A sample is a group of data drawn from the population of interest.

A population parameter is a characteristic of the population, such as the population mean.

A sample statistic is a characteristic of the sample, such as the sample mean.
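The four definitions above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated (random ages between 18 and 90), the sample size of 200 is arbitrary, and the mean is just one possible characteristic to compare.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated ages of 10,000 people.
population = [random.randint(18, 90) for _ in range(10_000)]

# Simple random sample of 200 members, drawn without replacement.
sample = random.sample(population, k=200)

population_mean = mean(population)  # a population parameter
sample_mean = mean(sample)          # a sample statistic

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

In practice the population parameter is usually unknown, and the sample statistic is what we use to estimate it.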

Measures of Central Tendency

Measures of central tendency play a very important role in descriptive statistics. Descriptive statistics provide simple summaries about the sample and the measures.

Mean (arithmetic): The mean is the sum of the values of all observations in a dataset divided by the number of observations.

Mode: The mode is the most commonly occurring value in a distribution.

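Median: The median is the middle value in a distribution when the values are arranged in order. With an even number of observations, it is the average of the two middle values.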
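A minimal sketch of the three measures, using Python's built-in `statistics` module. The `scores` list is made up for this example and is not taken from any real dataset.

```python
from statistics import mean, median, mode

# Hypothetical observations, already sorted for readability.
scores = [2, 3, 3, 5, 7, 10]

mean_score = mean(scores)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median_score = median(scores)  # average of the two middle values, 3 and 5 = 4.0
mode_score = mode(scores)      # 3 occurs twice, more than any other value

print(mean_score, median_score, mode_score)
```

Note that the three measures disagree here (5, 4.0, and 3): the single large value 10 pulls the mean upward, while the median and mode are unaffected by it.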

In the example;

Thanks a lot for reading!



Öykü Başaran
Deep Learning Türkiye

Bachelor’s Degree in statistics at YTU, and post-grad in Big Data Analytics in Canada. Currently, I work for Global AI Hub as a Business Development Specialist.