Statistics and Probability

The fundamentals of machine learning

vasanth ambrose
PerceptronAI
2 min read · Aug 6, 2020


Photo by Tomáš Malík from Pexels

Data is among the most abundant and valuable resources of the present age. The word "data" is the plural of the Latin "datum", meaning "something given"; today it refers to characteristics or information collected through observation. It can take any form, such as text, audio, video, or images.

Because data is so abundant, we can build machine learning models that learn from past data to make predictions about new, unseen data. Before learning, data in any form is converted to a numeric representation (ultimately stored in binary), which is the machine-readable form.

Statistics and probability play an important role in machine learning because they deal with data in numeric form.

Statistics in machine learning

Statistics is generally considered a prerequisite for machine learning. Raw observations on their own are not useful until they are turned into information. Statistics helps us summarize the observations we have and answer questions about similar, related data. It deals with the analysis of data gathered in the past, and that data may be discrete (such as counts or categories) or continuous (such as heights or temperatures).
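As a small illustration of turning raw observations into information, the sketch below summarizes a handful of made-up temperature readings with Python's standard `statistics` module. The numbers are hypothetical and only there to show the idea.

```python
import statistics

# Hypothetical raw observations: daily temperatures in degrees Celsius.
observations = [21.0, 23.5, 22.0, 24.5, 22.0, 25.0]

# A few summary statistics that turn the raw list into information.
summary = {
    "mean": statistics.mean(observations),      # central tendency
    "median": statistics.median(observations),  # middle value, robust to outliers
    "stdev": statistics.stdev(observations),    # sample standard deviation (spread)
}

print(summary)
```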

For example, the problem may be regression or classification. In both cases, the algorithm uses statistical methods to produce a model. Linear regression assumes that the errors (the differences between predicted and actual values) are normally distributed. The mean squared error is calculated by squaring each error and then taking the mean of those squares. Gradient descent repeatedly adjusts the model's parameters to reduce this error, and the process continues until the error stops decreasing.
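The mean squared error and the gradient descent loop can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names, the toy data, the learning rate, and the number of steps are all assumptions chosen for the example.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        preds = [w * x + b for x in xs]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_b = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data (hypothetical) generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

w, b = fit_line(xs, ys)
final_error = mean_squared_error(ys, [w * x + b for x in xs])
print(w, b, final_error)
```

With this toy data the fitted slope and intercept should approach 2 and 1, and the error should shrink toward zero. A learning rate that is too large would make the loop diverge instead.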

Probability in machine learning

Probability measures how likely an event is to happen, expressed as a value between 0 and 1.

An event is a set of outcomes of an experiment to which a probability is assigned. A variable that represents the outcome is known as a random variable. In a classification network, the final layer typically outputs values between 0 and 1 that can be read as a probability distribution over the classes. Activation functions such as softmax transform the raw output scores into values between 0 and 1 that sum to 1. Bayes' rule relates conditional probabilities and is the basis of the Naive Bayes algorithm. In the end, the model classifies new data by choosing the class with the highest predicted probability.
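The two ideas above, softmax and Bayes' rule, can be sketched as follows. This is a minimal example using only the standard library; the scores and the spam-filter probabilities are made up for illustration, and `bayes_posterior` is a hypothetical helper name rather than a function from any library.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that lie in (0, 1) and sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bayes_posterior(prior, likelihood, evidence):
    """Bayes' rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical raw scores from the last layer of a three-class classifier.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))  # class with the highest probability

# Hypothetical spam example: how likely is a message to be spam,
# given that it contains a certain word?
p_spam = 0.2
p_word_given_spam = 0.6
p_word_given_ham = 0.05
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)
p_spam_given_word = bayes_posterior(p_spam, p_word_given_spam, p_word)

print(probs, predicted_class, p_spam_given_word)
```

Naive Bayes applies the same rule to many features at once, under the simplifying assumption that the features are independent given the class.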

Conclusion

As machine learning engineers, we design statistical models that use probability to help us predict future outcomes.
