# Crowding problem

**What is the crowding problem?**

In general, t-SNE tries to preserve distances within each point's neighborhood (N). This can create a problem during embedding. Let us see why with a small example, argued by contradiction.

Suppose two-dimensional data is to be projected to one dimension. *x1*, *x2*, *x3* and *x4* are the four corners of a square, listed in order around it, and the side of the square is *d*.

Neighborhood of x1 contains x2 and x4

N(x1) = {x2, x4}

Neighborhood of x3 contains x4 and x2

N(x3) = {x4, x2}
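The two neighborhoods above can be checked numerically. The sketch below is only an illustration: the coordinates, the choice `d = 1.0`, and the helper `neighborhood` are assumptions made for this example, with a neighborhood taken to be the two nearest corners.

```python
import math

d = 1.0  # side of the square (assumed value; any positive number works)

# Corners listed in order around the square, so x1 and x3 sit on a diagonal.
points = {
    "x1": (0.0, 0.0),
    "x2": (d, 0.0),
    "x3": (d, d),
    "x4": (0.0, d),
}

def neighborhood(name, k=2):
    """Return the k nearest other corners to `name` (Euclidean distance)."""
    others = [p for p in points if p != name]
    others.sort(key=lambda p: math.dist(points[name], points[p]))
    return set(others[:k])

print("N(x1) =", sorted(neighborhood("x1")))
print("N(x3) =", sorted(neighborhood("x3")))
```

Both corners end up with the same two neighbors, *x2* and *x4*, which is what makes the one-dimensional placement awkward later on.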

When t-SNE is applied, the points in the two-dimensional space are embedded into the one-dimensional space one after another. As *x2* is in the neighborhood of both *x1* and *x3*, its distance to each of them needs to be preserved, so *x2* lands between them at a distance *d* from each.

In the two-dimensional space the distance between *x1* and *x3* is √2·*d*, but in the one-dimensional space it becomes 2*d*. As *x3* is not in the neighborhood of *x1*, there is no need to preserve the distance between them.
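As a quick check of this arithmetic, here is a minimal sketch. The coordinates and the one-dimensional positions in `line` are assumptions chosen for illustration (with `d = 1.0`), not something t-SNE itself would output.

```python
import math

d = 1.0  # side of the square (assumed value)

# Two-dimensional positions of the diagonal pair x1 and x3.
x1_2d, x3_2d = (0.0, 0.0), (d, d)
dist_2d = math.dist(x1_2d, x3_2d)  # diagonal of the square, sqrt(2) * d

# One-dimensional positions: x2 sits between x1 and x3, at distance d from each.
line = {"x3": 0.0, "x2": d, "x1": 2 * d}
dist_1d = abs(line["x1"] - line["x3"])  # 2 * d

# How much the non-neighbor distance has been stretched by the embedding.
stretch = dist_1d / dist_2d

print(f"2-D distance x1-x3: {dist_2d:.3f}")
print(f"1-D distance x1-x3: {dist_1d:.3f}")
print(f"stretch factor:     {stretch:.3f}")
```

The neighbor distances *x1*–*x2* and *x2*–*x3* stay at *d*, while the non-neighbor pair is pushed apart by a factor of √2.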

Now the problem is placing the point *x4*, as it is in the neighborhood of both *x1* and *x3*, so its distance to each of them needs to be preserved. Take the points on the line in the order *x3*, *x2*, *x1* from left to right. If we place *x4* to the right of *x1*, its distance to *x3* becomes 3*d*, whereas it is *d* in the two-dimensional space. Similarly, if we place *x4* to the left of *x3*, its distance from *x1* becomes 3*d*. The only position at distance *d* from both *x1* and *x3* is the one *x2* already occupies, where two distinct points would collapse onto each other. It is impossible to preserve both distances and still keep the points apart. So, the crowding problem occurs.
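The placement argument can be tried out with a few lines of code. This is a sketch under stated assumptions: the points sit on the line in the order *x3*, *x2*, *x1* with spacing `d = 1.0`, and `errors` is a hypothetical helper that reports how far each of the two required distances is from *d*.

```python
d = 1.0  # side of the square (assumed value)

# Assumed one-dimensional layout, left to right: x3, x2, x1.
line = {"x3": 0.0, "x2": d, "x1": 2 * d}

def errors(x4):
    """Return (error to x1, error to x3): how far each distance is from d."""
    err_x1 = abs(abs(x4 - line["x1"]) - d)
    err_x3 = abs(abs(x4 - line["x3"]) - d)
    return err_x1, err_x3

candidates = {
    "right of x1": line["x1"] + d,
    "left of x3": line["x3"] - d,
    "on top of x2": line["x2"],
}

for label, position in candidates.items():
    err_x1, err_x3 = errors(position)
    collapsed = position == line["x2"]  # x4 would share x2's spot
    print(f"{label:>13}: error to x1 = {err_x1}, error to x3 = {err_x3}, "
          f"collapses onto x2 = {collapsed}")
```

Either outer position keeps one distance and is off by 2*d* on the other. The only position with no error is the spot *x2* already occupies, so the two points would be stacked on top of each other, which is the crowding the example is meant to show.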

*Sometimes it is impossible to preserve the distances in every neighborhood (N). This is called the crowding problem.*

Earlier, Stochastic Neighbor Embedding (SNE) was used for dimensionality reduction, but it often ran into the crowding problem, so t-SNE was introduced to overcome it. t-SNE does not guarantee to resolve the crowding problem every time; it does its best to reduce crowding while preserving neighborhood distances.
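One way to get a feel for why t-SNE copes better than SNE: in the embedding, SNE scores similarity with a Gaussian, while t-SNE uses a heavy-tailed Student t distribution with one degree of freedom. The sketch below compares the two unnormalized kernels only. The function names are made up for this example, and it leaves out the normalization and optimization that the real algorithms perform.

```python
import math

def gaussian_kernel(r):
    """Unnormalized Gaussian similarity exp(-r^2) at embedding distance r."""
    return math.exp(-r * r)

def student_t_kernel(r):
    """Unnormalized Student t similarity 1 / (1 + r^2), one degree of freedom."""
    return 1.0 / (1.0 + r * r)

for r in (0.5, 1.0, 2.0, 3.0):
    g = gaussian_kernel(r)
    t = student_t_kernel(r)
    print(f"r = {r}: gaussian = {g:.5f}, student t = {t:.5f}, ratio = {t / g:.1f}")
```

The Student t kernel decays much more slowly, so moderately distant points can sit further apart in the embedding without being penalized as heavily, which leaves more room for true neighbors to stay close.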

That’s all folks,

See you in my next article.