Can t-SNE help you analyze features? The answer is obviously “Yes”! (and with clustering it is even better)
Laurae: This post is about visualizing dense features in a two-dimensional space using t-SNE, without having to spend hours and hours on t-SNE computations, all with a simple clustered visualization. It uses the correlation matrix of the features. The post was originally made at Kaggle.
t-SNE on features is pretty interesting, I must say. No wonder v50 is loved by nearly (if not) all predictive models.
Settings: perplexity = 43, exact t-SNE (theta = 0), no PCA, dropped v107 for being a duplicate.
Post-processing: 3 clusters.
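The setup above can be sketched in a few lines. This is a minimal reading of the post, assuming each row of the correlation matrix is fed to t-SNE as that feature's vector; the data here is random stand-in data, not the actual Kaggle features, and the hyperparameters mirror the settings listed above (perplexity 43, exact t-SNE, no PCA initialization, 3 k-means clusters):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in data: the original post used ~131 dataset features (v1, v2, ...).
X = rng.normal(size=(500, 131))

# Each feature becomes one point: its vector of correlations to all features.
corr = np.corrcoef(X, rowvar=False)  # 131 x 131 correlation matrix

# Exact t-SNE (theta = 0 in Barnes-Hut terms -> method="exact"),
# random init instead of PCA init, matching "no PCA".
emb = TSNE(n_components=2, perplexity=43, method="exact",
           init="random", random_state=0).fit_transform(corr)

# Post-processing: k-means with 3 clusters on the 2-D embedding.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(emb)
print(emb.shape, np.bincount(labels))
```

Each point in the resulting scatter plot is a feature, and the cluster labels can be used to color it.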
P.S.: there were only 131 rows in my input matrix, in case you were about to ask.
From the edit: cumulated 1000 t-SNE runs (the dimensions are therefore ~100 times smaller). One t-SNE is not enough, but cumulating runs revealed things that cannot be seen using only a single t-SNE. 1000-run small version here and 1000-run large version here
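The post doesn't spell out how the repeated runs are cumulated. One plausible reading, sketched below under that assumption, is averaging the pairwise-distance matrices of the embeddings across runs, since raw t-SNE coordinates are not directly comparable between runs (each run can be arbitrarily rotated or flipped). The data and the small sizes here are stand-ins to keep the sketch fast:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in correlation matrix for 40 hypothetical features.
corr = np.corrcoef(rng.normal(size=(300, 40)), rowvar=False)

n_runs = 10  # the post accumulated up to 1000 runs
acc = np.zeros((40, 40))
for seed in range(n_runs):
    emb = TSNE(n_components=2, perplexity=10, method="exact",
               init="random", random_state=seed).fit_transform(corr)
    # Pairwise distances are invariant to rotation/flip, so they can be summed.
    acc += squareform(pdist(emb))
avg_dist = acc / n_runs  # consensus distance matrix across runs
```

The consensus matrix `avg_dist` can then be clustered or re-embedded, which is one way repeated runs could surface structure that a single run misses.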
Don’t ask me why v74 got preferred in cluster 2; you probably don’t agree with k-means repeated 500 times, and neither do I. I think tomorrow I’ll run 10000 separate 1000-iteration t-SNEs. It takes only 0.30 seconds to do a 1000-iteration t-SNE on a 131x131 matrix (and not that long for 1000 repetitions).