t-SNE

In this lesson, we show you how to visualize high-dimensional data with t-SNE.

What is t-SNE?

Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high.

This lesson does not cover manifold learning in depth. We introduce it because we use one of its techniques to visualize our dataset. Many real-world datasets are high-dimensional, and such data cannot be plotted directly in 2D or 3D space. One option is to reduce the data to two dimensions with a linear method such as PCA and plot the result, but a linear projection can miss nonlinear structure. Manifold learning can be thought of as an attempt to generalize linear frameworks like PCA to be sensitive to nonlinear structure in data. There are many manifold learning methods, such as Isomap, MDS, LLE, and t-SNE. t-SNE is one of the most commonly used, and you may have already seen it in many papers.

t-distributed stochastic neighbor embedding (t-SNE) is one of a family of stochastic neighbor embedding methods. The algorithm converts the distances between pairs of data points in the high-dimensional space into probabilities that the points are neighbors, and then chooses a low-dimensional embedding whose pairwise probabilities form a similar distribution. In this lesson, we focus on t-SNE.
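To make the idea of "a similar distribution" concrete, here is a minimal sketch in plain Python. It is a simplification, not the full algorithm: it assumes a single fixed `sigma` for the Gaussian kernel (real t-SNE tunes one per point from a perplexity setting), and it only scores two hand-made embeddings instead of optimizing one. The function names and the toy data are made up for illustration.

```python
import math

def pairwise_sq_dists(points):
    """Squared Euclidean distance between every pair of points."""
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]

def normalize(weights):
    """Scale a matrix of non-negative weights so all entries sum to 1."""
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def high_dim_affinities(points, sigma=1.0):
    """Gaussian similarities in the original space.
    Assumption: one fixed sigma for every point (a simplification)."""
    d2 = pairwise_sq_dists(points)
    n = len(points)
    weights = [[0.0 if i == j else math.exp(-d2[i][j] / (2 * sigma ** 2))
                for j in range(n)] for i in range(n)]
    return normalize(weights)

def low_dim_affinities(embedding):
    """Student-t (one degree of freedom) similarities in the embedding."""
    d2 = pairwise_sq_dists(embedding)
    n = len(embedding)
    weights = [[0.0 if i == j else 1.0 / (1.0 + d2[i][j])
                for j in range(n)] for i in range(n)]
    return normalize(weights)

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q): the mismatch between the two distributions."""
    return sum(pij * math.log((pij + eps) / (qij + eps))
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0)

# Toy data: two tight clusters in 3D (hypothetical values).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (5.0, 5.0, 5.0), (5.1, 5.0, 4.9)]
P = high_dim_affinities(points)

# Two candidate 2D embeddings: one keeps the clusters, one mixes them up.
good = [(0.0, 0.0), (0.1, 0.0), (10.0, 0.0), (10.1, 0.0)]
bad = [(0.0, 0.0), (10.0, 0.0), (0.1, 0.0), (10.1, 0.0)]

print("KL for cluster-preserving embedding:", kl_divergence(P, low_dim_affinities(good)))
print("KL for scrambled embedding:", kl_divergence(P, low_dim_affinities(bad)))
```

The embedding that keeps neighbors together gets the lower divergence. t-SNE searches for such an embedding by gradient descent on this kind of objective. The heavy-tailed Student-t kernel in the low-dimensional space is where the "t" in the name comes from.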
