My Experience Learning about PCA: From Clueless to Dreaming About It

Fé Valvekens
3 min read · Jun 5, 2023

Last night, I had a vivid dream about a topic I was not even familiar with 2 days ago. Yes, it was a full moon, but still.

The Conference of the Birds, artwork by Juan Ford

I started an online course on Data Science 2 weeks ago. So far I have covered Python (what a ride!) and Statistics. I graduated as an Engineer, then moved to the creative field after 7 years of corporate work in a consulting firm. From low/no coding in my 20s, to entrepreneurship in my 30s, I am now back to full coding in my 40s. Side note: I’m loving the ease of Python. Coming from my first line of code in Turbo Pascal, I can now understand the buzz around it.

Yesterday, I felt proud when I submitted my first graded assignment, which involved performing univariate and multivariate data analysis on a given dataset, exploring correlations between variables and plotting the data to create cool visualizations. Riding on this wave of accomplishment, I started preparing for the next class, which would be about Principal Component Analysis (PCA) and Stochastic Neighbor Embedding (SNE).

I plunged back into matrices and vectors, multiplying matrices and determining eigenvectors. This was challenging because it brought me back to my math classes in Engineering school, where I was lost in the algebra Matrix, so to speak, and decided that I would not pursue a career in this field. I broke through that chain of thought and learned more about dimensionality reduction. I even enjoyed the 3Blue1Brown video on eigenvectors and eigenvalues. Just fascinating.

Back to my dream. I was floating in space together with millions of little points, scattered all around me. Then a force pulled all the points together and I could clearly see the axes around which the points were gravitating: PC1, along the direction of the largest variance, and PC2, perpendicular to PC1. At that moment, it all made sense. It was a powerful and beautiful visual experience.

A scatter plot of samples that are distributed according to a multivariate (bivariate) Gaussian distribution centered at (1,3) with a standard deviation of 3 in the (0.866, 0.5) direction and of 1 in the orthogonal direction. The directions represent the Principal Components (PC) associated with the distribution. https://commons.wikimedia.org/wiki/File:GaussianScatterPCA.svg
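
For the curious, here is a minimal sketch in Python (NumPy) of the picture described in the caption above. It draws points from a bivariate Gaussian centered at (1, 3), then recovers PC1 and PC2 as the eigenvectors of the covariance matrix. The function name `principal_components`, the random seed and the sample count are illustrative assumptions of this sketch, not taken from the course or from the Wikimedia figure.

```python
import numpy as np


def principal_components(points):
    """Return (variances, components), sorted from largest to smallest variance.

    Each column of `components` is one principal direction (a unit vector).
    """
    centered = points - points.mean(axis=0)    # move the cloud to the origin
    covariance = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]      # largest variance first
    return eigenvalues[order], eigenvectors[:, order]


# Build a cloud like the one in the caption (seed and size are assumptions).
rng = np.random.default_rng(seed=0)
direction = np.array([0.866, 0.5])
direction = direction / np.linalg.norm(direction)      # make it a unit vector
orthogonal = np.array([-direction[1], direction[0]])   # rotate by 90 degrees
n_samples = 5000

samples = (
    np.array([1.0, 3.0])                                         # center
    + rng.normal(0.0, 3.0, size=(n_samples, 1)) * direction      # std 3 along PC1
    + rng.normal(0.0, 1.0, size=(n_samples, 1)) * orthogonal     # std 1 along PC2
)

variances, components = principal_components(samples)
pc1, pc2 = components[:, 0], components[:, 1]

print("PC1 direction:", pc1)
print("PC2 direction:", pc2)
print("Standard deviations along PC1 and PC2:", np.sqrt(variances))
```

An eigenvector is only defined up to its sign, so PC1 may come out pointing the opposite way from (0.866, 0.5). It still describes the same axis.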

With that in mind, I no longer shy away from the mathematical formulas that I once dreaded in Engineering school. To be clear, I am not falling in love with them, but seeing the application in ‘real life’ adds a new flavor.

From feeling clueless and overwhelmed by the math formulas, to having a powerful dream that helped me understand the concept on a deeper level, I have come a long way in just 2 weeks into this course. Whether you’re a creative professional like me, or just someone who is curious about the world of data science, I invite you to take on learning something new, out of your comfort zone. Who knows, you might just have a dream about PCA too!

Fé Valvekens

engineer, mom of 3, yogi, movement addict, interior designer, budding data scientist