Exercise

Plotting spatial distribution of true classes

As seen in the video, you can use the lower-dimensional representation produced by t-SNE to classify new digits based on their Euclidean distance to known clusters of digits. As a first step toward this task, plot the spatial distribution of the digit labels in the embedding space. The output of a t-SNE run on 10K MNIST records is available as tsne, and the true labels are in a dataset named mnist_10k.

In this exercise, you will use the first 5K records of both tsne and mnist_10k. The goal is to visualize the resulting t-SNE embedding.

The ggplot2 package has been loaded for you.

Instructions
100 XP
  • Prepare a data frame containing both dimensions of the first 5000 records of tsne$Y and the true labels from the first 5000 rows of mnist_10k.
  • Plot the embedding of these first 5K records with ggplot(), mapping both the text label and the color of each point to the true digit label.
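
One possible way to approach the two steps above is sketched here. It is not the official solution. The synthetic tsne and mnist_10k objects are hypothetical stand-ins so the sketch runs on its own, and the column name label in mnist_10k, as well as the names tsne_plot, tsne_x, tsne_y and digit, are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-ins for the exercise objects (assumption: in the
# exercise, tsne and mnist_10k are already loaded, and the true digit
# is stored in a column called `label`).
set.seed(1234)
n <- 10000
mnist_10k <- data.frame(label = sample(0:9, n, replace = TRUE))
tsne <- list(
  Y = cbind(rnorm(n, mean = mnist_10k$label),
            rnorm(n, mean = -mnist_10k$label))
)

# Step 1: first 5000 rows of both embedding dimensions plus the true label.
tsne_plot <- data.frame(
  tsne_x = tsne$Y[1:5000, 1],
  tsne_y = tsne$Y[1:5000, 2],
  digit  = as.factor(mnist_10k$label[1:5000])
)

# Step 2: draw every point as its digit, colored by that digit.
p <- ggplot(tsne_plot, aes(x = tsne_x, y = tsne_y, color = digit)) +
  ggtitle("t-SNE embedding of the first 5K MNIST records") +
  geom_text(aes(label = digit)) +
  theme(legend.position = "none")
print(p)
```

Converting the label to a factor makes ggplot2 treat the digits as ten discrete classes rather than a continuous scale, and the legend is hidden because each point already displays its own digit.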