3 Easy Steps to Perform Dimensionality Reduction Using Principal Component Analysis (PCA)
Running the PCA algorithm twice is the most effective way of performing PCA
What is the dimensionality of a dataset?
In both statistics and machine learning, the dimensionality of a dataset is the number of input variables (features) it contains.
If the dataset contains only two input variables, as in the following image, it is called a two-dimensional dataset. In this case, the observations (data points) can be plotted in a 2D scatterplot.
If we add another variable called Age to the same dataset, we implicitly add another dimension to the dataset. Now, the dataset becomes three-dimensional and the observations (data points) can be plotted in a 3D scatterplot.
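To make the idea concrete, here is a minimal sketch that stores a dataset as a NumPy array with one row per observation and one column per variable, so the dimensionality is simply the number of columns. The height and weight columns and all of the values are hypothetical, chosen only for illustration; only the Age variable comes from the text above.

```python
import numpy as np

# Hypothetical toy data: 5 observations, 2 input variables
# (assumed here to be height in cm and weight in kg).
X_2d = np.array([
    [170.0, 65.0],
    [182.0, 80.0],
    [158.0, 52.0],
    [175.0, 72.0],
    [165.0, 60.0],
])

# Hypothetical Age values, one per observation.
age = np.array([34.0, 45.0, 23.0, 51.0, 29.0])

# Appending Age as a new column adds one dimension to the dataset.
X_3d = np.column_stack([X_2d, age])

# shape is (number of observations, number of input variables).
n_obs_2d, dim_2d = X_2d.shape
n_obs_3d, dim_3d = X_3d.shape

print(f"{n_obs_2d} observations, dimensionality {dim_2d}")
print(f"{n_obs_3d} observations, dimensionality {dim_3d}")
```

Adding a variable changes the number of columns but not the number of rows: the observations stay the same, and each one gains a coordinate.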
Likewise, a dataset with many variables is high-dimensional. It is…