Reducing Dimensionality from Dimensionality Reduction Techniques
Elior Cohen

This is great. What if I have a mixed dataset with both numeric and categorical values? What do I need to do, given that PCA works only on numbers? Is it OK to factorise the categorical variables and use the underlying numeric representation of each factor? If we instead use the dummy approach (as implemented in the dummies R package, which basically creates one binary column for each level), we end up with a potentially very high number of columns. The weights of the principal components will then be very low, so you basically need to consider a lot of components to avoid losing information. Does that make sense?
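To make the two options concrete, here is a minimal sketch in plain Python (not the dummies R package). The three colour levels and the helper names are hypothetical, chosen only for illustration. It compares the distances that each encoding implies between levels, since PCA works from the geometry of the numeric columns it is given.

```python
import math
from itertools import combinations

# Hypothetical levels of one categorical variable, in alphabetical order
# (the order in which a factor would typically assign its integer codes).
levels = ["blue", "green", "red"]

def integer_code(level):
    """Factorise: represent a level by its integer position (one column)."""
    return [float(levels.index(level))]

def dummy_code(level):
    """Dummy approach: one binary column per level."""
    return [1.0 if level == other else 0.0 for other in levels]

def pairwise_distances(encode):
    """Euclidean distance between every pair of encoded levels."""
    return {
        (a, b): math.dist(encode(a), encode(b))
        for a, b in combinations(levels, 2)
    }

code_distances = pairwise_distances(integer_code)
dummy_distances = pairwise_distances(dummy_code)

print(code_distances)   # distances differ between pairs of levels
print(dummy_distances)  # every pair of levels is equally far apart
```

Under these assumptions, the integer codes place blue twice as far from red as from green, an ordering that exists only because of the alphabetical coding. The dummy columns keep every pair of levels equally far apart, at the cost of one column per level.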
