R and Python code for t-distributed Stochastic Neighbor Embedding: all you have to do is prepare a data set (very simple, easy and practical)

I have released R and Python code for t-distributed Stochastic Neighbor Embedding (tSNE). It is very easy to use: prepare your data set and run the code, and you get the two-dimensional tSNE map. Very simple and easy!

You can buy each code from the URLs below.


 Please download the supplemental zip file (this is free) from the URL below to run the tSNE code.


Procedure of tSNE in the R and Python codes

To perform tSNE appropriately, the R and Python codes follow the procedure below after the data set is loaded.

1. Autoscale the explanatory variables (X) (if necessary)
 Autoscaling means centering and scaling. In centering, the mean of each variable is subtracted from that variable, so that its mean becomes zero. In scaling, each variable is divided by its standard deviation, so that its standard deviation becomes one.
 tSNE is based on the Euclidean distance between samples, so variables with large numerical ranges dominate the map unless the data are scaled. If the distance should be computed after autoscaling, please autoscale the X-variables. Usually, autoscaling is done.

2. Decide the perplexity
 Perplexity can be interpreted as the effective number of nearest neighbors considered for each sample in the visualization. Values from 5 to 50 are recommended.

3. Visualize the data set by tSNE
 If the samples are not well dispersed on the map, please change the perplexity and rerun tSNE.
 tSNE does not produce a mapping that can be applied to new samples afterwards. When new samples are obtained and should be visualized as well, please rerun tSNE with both the initial samples and the new samples.
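
The three steps above can be sketched in Python as follows. This is only a minimal illustration and not the released code: it assumes NumPy and scikit-learn are installed, and the function names `autoscale` and `run_tsne`, the toy data, and the perplexity value of 10 are all made up for this example.

```python
import numpy as np
from sklearn.manifold import TSNE


def autoscale(X):
    """Center each column to mean 0 and scale it to standard deviation 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by 0
    return (X - mean) / std


def run_tsne(X, perplexity=30.0, do_autoscale=True, random_state=0):
    """Return an (n_samples, 2) array of tSNE scores."""
    if do_autoscale:
        X = autoscale(X)  # step 1
    # Step 2: perplexity must be smaller than the number of samples.
    model = TSNE(n_components=2, perplexity=perplexity, random_state=random_state)
    return model.fit_transform(X)  # step 3


# Toy data (illustrative only): 60 samples, 5 variables on very different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5)) * np.array([1.0, 10.0, 100.0, 0.1, 5.0])
scores = run_tsne(X_demo, perplexity=10.0)
```

If the map looks too compressed, calling `run_tsne` again with a different `perplexity` corresponds to the advice in step 3. To add new samples, stack them under the initial samples (for example with `np.vstack`) and call `run_tsne` on the combined array.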

How to perform tSNE?

1. Buy the code and unzip the file

R: https://gum.co/dKiRZ

Python: https://gum.co/FvUuE

2. Download and unzip the supplemental zip file (this is free)

R: http://univprofblog.html.xdomain.jp/code/R_scripts_functions.zip

Python: http://univprofblog.html.xdomain.jp/code/supportingfunctions.zip

3. Place the supplemental files in the same directory (folder) as the tSNE code.

4. Prepare the data set. For the data format, see the article below.


5. Run the code!

The scores of all samples in the two-dimensional space are saved as “LowDimensionScore.csv”.

Required settings

Please see the article below.

Examples of execution results