Basic Statistics

Kerem Erçin
Published in EVERYTHİNG · 4 min read · Mar 7, 2021

Statistical Thinking

Statistical thinking sits at the very beginning of the data analysis process, so basic statistics is the first thing to learn in data science. It also strengthens your analytical thinking skills, which help you solve complex problems more quickly.

Photo by KOBU Agency on Unsplash

Now I will explain the basic statistical concepts to you.

Sample:

A sample is a small group of data selected from a larger group, which is called the population. If data is collected from every member of the population instead, this is called a census (a complete count).

For example, processing 100 million records can be very difficult or take a lot of time, but it is much easier to take a sample of that data and process 1 million records.
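As a rough illustration of taking a sample, here is a minimal sketch using Python's standard `random` module. The population of record IDs and the sizes are made-up assumptions, scaled down from the numbers above so the example runs quickly.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000,000 record IDs standing in for a large data set.
population = range(1_000_000)

# Draw 10,000 distinct records at random, without replacement.
sample = random.sample(population, k=10_000)

print(len(sample))       # 10000
print(len(set(sample)))  # 10000, no record is drawn twice
```

Every later calculation can then be run on `sample` instead of the full population.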

Observation Unit:

The observation unit is the entity that carries the feature being examined in the data set and on which a measurement can be recorded. It is the smallest element of the population.

For example, in a study of schools, the observation unit could be students or teachers. In a study of hospitals, it could be patients or doctors.

Parameter:

Parameters are the properties of the population that can be analyzed and expressed numerically. To determine a parameter exactly, all the information in the population must be accessible. A population can have more than one parameter.

Numerical values measured from a sample are called statistics. Numerical values measured from the population are called parameters.
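To make the distinction concrete, the sketch below computes the same quantity (a mean) once from a whole population and once from a sample. The heights are simulated data invented for this example, not real measurements.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights in cm for 100,000 people.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Parameter: computed from every element of the population.
population_mean = statistics.fmean(population)

# Statistic: computed from a sample of 500 elements only.
sample = random.sample(population, k=500)
sample_mean = statistics.fmean(sample)

print(round(population_mean, 2), round(sample_mean, 2))
```

The two numbers should be close but not identical: the statistic is an estimate of the parameter.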

Variables:

Variables are properties that can take different values from one observation to another. Here they are divided into 6 types: quantitative, qualitative, continuous, discrete, independent, and dependent.

Quantitative variable:

Variables that can be measured and expressed numerically are called quantitative variables.

Qualitative variable:

Variables that are expressed in words or categories rather than numbers are called qualitative variables.

Continuous variable:

Variables that can take any value within an interval, and therefore an unlimited number of possible values, are called continuous variables.

Discrete variable:

Variables that can take only a countable set of distinct values are called discrete variables.

Independent variable:

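Variables that are not affected by the other variables in the analysis, and that are used to explain changes in another variable, are called independent variables.

Dependent variable:

Variables whose values change according to the independent variables are called dependent variables.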

Scale:

Scales describe the level at which something is measured. They are divided into 4 categories: classification (nominal), ranking (ordinal), evenly spaced (interval), and equally proportional (ratio).

Classification scales:

Classification scales categorize and group objects. The categories have no order, so no arithmetic is performed on them; it is the scale type that supports the fewest mathematical operations.

Ranking scales:

Ranking scales order objects from large to small or from small to large according to a specific characteristic. The gaps between ranks are not necessarily equal, so arithmetic operations are not performed.

Evenly spaced scales:

Evenly spaced scales measure a feature in equal intervals from an arbitrarily chosen starting point. Addition and subtraction are meaningful.

Equally proportional scales:

Equally proportional scales measure a feature from a true zero point, so values can be compared as ratios. Multiplication and division are meaningful in addition to addition and subtraction.

The difference between them is that on an evenly spaced scale only differences are meaningful, while on an equally proportional scale ratios are meaningful as well.
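Temperature is a common way to see this difference. The sketch below is only an illustration with made-up readings: Celsius is treated as an evenly spaced (interval) scale and Kelvin as an equally proportional (ratio) scale.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading to Kelvin."""
    return c + 273.15

a_c, b_c = 10.0, 20.0  # two hypothetical readings in Celsius

# Differences are meaningful on both scales and agree with each other.
diff_c = b_c - a_c                                        # 10.0
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)  # also about 10

# Ratios are only meaningful on the scale with a true zero (Kelvin).
ratio_c = b_c / a_c                                        # 2.0, but 20 °C is not "twice as hot"
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)  # about 1.035

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```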

Photo by Jeswin Thomas on Unsplash

Arithmetic mean:

The arithmetic mean is the sum of the elements divided by the number of elements. It is also the balance point of the data: the deviations above and below it sum to zero.

Median:

When the elements are ordered from small to large or from large to small, the value in the middle is called the median. If the number of elements is even, the median is the average of the two middle values.

Mode:

The most frequently repeated value among the elements is called the mode, also known as the peak value. It is the element with the highest frequency. There can be more than one mode.
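The three measures above can be tried out with Python's standard `statistics` module. The data set is a small made-up example, chosen so that it has an even number of elements and two modes.

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 8, 10]  # hypothetical data, already sorted

mean = statistics.mean(data)        # 46 / 8 = 5.75
median = statistics.median(data)    # even count: (5 + 7) / 2 = 6.0
modes = statistics.multimode(data)  # [3, 8], both values appear twice

print(mean, median, modes)
```

`multimode` is used instead of `mode` here because it returns every value that shares the highest frequency.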

Quartiles:

Quartiles are the 3 values that divide the elements, ordered from small to large, into 4 parts with an equal number of elements.

Range:

The difference between the largest value and the smallest value is called the range (or range of variation).
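Quartiles and the range can be computed as in the sketch below. The data is made up, and the `"inclusive"` method is just one of several conventions for locating quartiles, so other tools may report slightly different values for the same data.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data

# n=4 asks for the 3 cut points that split the data into 4 parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print(q1, q2, q3)   # 4.0 7.0 10.0
print(data_range)   # 14
```

The middle quartile `q2` is the same value as the median of the data.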

Standard deviation:

To find the standard deviation, sum the squared differences of the elements from the arithmetic mean, divide by one less than the number of elements, and take the square root of the result.

If the standard deviation is small, the values lie close to the mean; if it is large, they are spread far from it.

Variance:

The variance is the sum of the squared differences of the elements from the arithmetic mean divided by the number of values (for a sample, by one less than the number of values, as with the standard deviation). It is the square of the standard deviation.
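The sketch below works through both definitions step by step on a small made-up data set and then compares the results with Python's `statistics` module. It assumes the usual convention that the sample versions divide by n - 1 and the population versions divide by n.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

n = len(data)
mean = sum(data) / n                                # 5.0
sum_sq_diff = sum((x - mean) ** 2 for x in data)    # 32.0

# Sample versions: divide by n - 1.
sample_variance = sum_sq_diff / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Population versions: divide by n.
population_variance = sum_sq_diff / n               # 4.0
population_sd = math.sqrt(population_variance)      # 2.0

# The standard library gives the same results.
print(sample_variance, statistics.variance(data))
print(sample_sd, statistics.stdev(data))
print(population_variance, statistics.pvariance(data))
print(population_sd, statistics.pstdev(data))
```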

Photo by Antoine Dautry on Unsplash

Skewness and kurtosis:

Skewness and kurtosis are measured to examine the shape of a distribution. Skewness shows whether a distribution is symmetrical or not, and kurtosis shows how heavy its tails are.

In right-skewed distributions the typical order is mean > median > mode; in left-skewed distributions it is mode > median > mean.

When kurtosis is high, the tails contain more extreme values, and these can inflate the standard deviation.
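The ordering of mean, median, and mode can be checked on a small right-skewed data set. The data below is made up, and the skewness coefficient is computed by hand as the third central moment divided by the 1.5 power of the second central moment, which is one common definition among several.

```python
import statistics

data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]  # hypothetical data with a long right tail

mean = statistics.mean(data)      # 45 / 10 = 4.5
median = statistics.median(data)  # (3 + 3) / 2 = 3.0
mode = statistics.mode(data)      # 2

# Moment-based skewness: positive for a right-skewed distribution.
n = len(data)
m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
skewness = m3 / m2 ** 1.5

print(mean, median, mode)  # 4.5 3.0 2
print(skewness > 0)        # True
```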

Thank you for reading. You can learn more about the software by following me.
