Understanding Standard Deviation, Confidence Intervals, and Variance in Machine Learning

Akshay Ravindran
Published in Javarevisited
4 min read · May 11, 2023



Standard Deviation

Standard deviation is a statistical measure of how spread out a set of data is from its mean (average) value — in other words, the amount of variation or dispersion among the data points.
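As a quick illustration (the numbers here are made up), the standard deviation is the square root of the average squared deviation from the mean:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = data.mean()
# Standard deviation: square root of the mean squared deviation from the mean
std = np.sqrt(((data - mean) ** 2).mean())
print(std)  # 2.0 — equivalent to data.std()
```

NumPy's `data.std()` computes the same quantity directly (pass `ddof=1` for the sample standard deviation).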

In machine learning, standard deviation is commonly used to measure how spread out the data points in a dataset are. This is useful in tasks such as data preprocessing, feature engineering, and model selection.

In data preprocessing:

Standard deviation can be used to identify and remove outliers — data points that fall outside the normal range of values for a given feature. Removing such noisy data can help improve the accuracy of the model.
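A common way to do this is z-score filtering: drop points that lie more than a chosen number of standard deviations from the mean. A minimal sketch with invented data and an assumed threshold of 2:

```python
import numpy as np

values = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 95.0])  # 95 is an outlier
mean, std = values.mean(), values.std()

# z-score: distance from the mean in units of standard deviation
z_scores = np.abs((values - mean) / std)

# Keep only points within 2 standard deviations of the mean
cleaned = values[z_scores < 2.0]
```

The threshold (2 here) is a tuning choice; 3 is also common and keeps more data.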

In feature engineering:

Standard deviation can be used to scale and normalize features, which can improve model performance by making features more comparable and easier to work with.
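This is the idea behind standardization (z-score scaling): subtract the mean and divide by the standard deviation so the feature has mean 0 and standard deviation 1. A sketch with made-up values:

```python
import numpy as np

feature = np.array([50.0, 60.0, 70.0, 80.0, 90.0])

# Standardize: resulting values have mean 0 and standard deviation 1
scaled = (feature - feature.mean()) / feature.std()
```

In practice, libraries such as scikit-learn provide this via `StandardScaler`, which also remembers the training-set mean and standard deviation for transforming new data.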

In model selection:

Standard deviation can be used as a metric to evaluate the performance of different models. For example, a model with a lower standard deviation on a test dataset may…
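One concrete version of this idea: compare the standard deviation of cross-validation scores across models — lower spread suggests more stable performance. The scores below are invented for illustration:

```python
import numpy as np

# Hypothetical cross-validation accuracy scores for two models
scores_a = np.array([0.81, 0.82, 0.80, 0.83, 0.81])
scores_b = np.array([0.90, 0.70, 0.85, 0.75, 0.88])

# Similar mean accuracy, but model A varies far less across folds
spread_a = scores_a.std()
spread_b = scores_b.std()
```

All else being equal, the model with the smaller spread generalizes more consistently, even when the mean scores are close.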



Code -> Understand-> Repeat is my motto. I am a Data Engineer who writes about everything related to Data Science and Interview Preparation for SDE.