Exercise

Computing the centroids of each class

The previous low-dimensional plot of the digits showed sensible groupings, so the next step is to compute the centroid of each class in that low-dimensional space. Each centroid can serve as a prototype of its digit, and you can classify a new digit by assigning it to the class whose centroid is nearest in Euclidean distance.

The MNIST data mnist_10k and t-SNE output tsne are available in the workspace. The data.table package has been loaded for you.

Instructions
  • Get the first 5000 records from the precomputed t-SNE output tsne as a data.table and set the column names to "X" and "Y".
  • Add the label column from the mnist_10k dataset (for the same 5000 records) to the data.table as a column named label with factor data type.
  • Compute two new columns named mean_X and mean_Y by calculating the mean() of X and Y for every label.
  • Keep only the unique records, one for every label (10 records in total).
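
The four steps above can be sketched roughly as follows. This is a minimal sketch, not the official solution. It assumes that tsne has the shape of Rtsne() output, with the embedded coordinates stored in the matrix tsne$Y; the exercise text does not confirm this. The first few lines build hypothetical stand-ins for mnist_10k and tsne so that the block runs on its own. In the exercise workspace those objects already exist and the stand-ins should be skipped. The name centroids is also chosen here for illustration.

```r
library(data.table)

# Hypothetical stand-ins so this sketch is self-contained.
# In the exercise, mnist_10k and tsne are already in the workspace.
set.seed(1)
n <- 10000
mnist_10k <- data.frame(label = rep(0:9, length.out = n))
# Assumption: tsne looks like Rtsne() output, with coordinates in tsne$Y.
tsne <- list(Y = matrix(rnorm(2 * n), ncol = 2) + mnist_10k$label)

# Step 1: first 5000 embedded points as a data.table with columns X and Y
centroids <- as.data.table(tsne$Y[1:5000, ])
setnames(centroids, c("X", "Y"))

# Step 2: attach the labels of the same 5000 records as a factor
centroids[, label := as.factor(mnist_10k$label[1:5000])]

# Step 3: per-label means, written onto every row of that label
centroids[, mean_X := mean(X), by = label]
centroids[, mean_Y := mean(Y), by = label]

# Step 4: keep a single row per label
centroids <- unique(centroids, by = "label")
```

After the last step, each remaining row still carries the X and Y of one arbitrary point from its class; the prototype of the class is the pair (mean_X, mean_Y).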