Separating house prices with t-SNE

t-SNE is a non-linear dimensionality reduction technique that embeds high-dimensional data into a lower-dimensional space while trying to keep each point close to its original neighbors. In this exercise, you will create a t-SNE plot to compare with the PCA plot from the previous exercise. PCA is a linear projection that tends to preserve the global structure of the data, but not necessarily the local structure. t-SNE instead aims to preserve local structure: points that are neighbors in the high-dimensional space should stay close together in the low-dimensional embedding. You will see this difference in the plots.

You will apply t-SNE to reduce the dimensionality of house_sales_df, whose target variable is price. The tidyverse and Rtsne packages have been loaded for you.
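
To make the shape of a t-SNE result concrete, here is a minimal sketch on made-up data. The matrix toy and the object fit are hypothetical names used only for illustration; they are not part of the exercise. The sketch assumes the Rtsne package is installed and shows that the embedding is returned in the Y element, one row per observation and one column per output dimension.

```r
library(Rtsne)

set.seed(1234)

# Hypothetical toy data: 200 rows, 10 numeric features, two shifted groups
n <- 200
toy <- matrix(rnorm(n * 10), nrow = n)
toy[1:100, ] <- toy[1:100, ] + 5

# Rtsne requires 3 * perplexity < nrow(X) - 1; 30 is the package default
fit <- Rtsne(toy, dims = 2, perplexity = 30, check_duplicates = FALSE)

# The embedding lives in fit$Y: 200 rows, 2 columns
dim(fit$Y)
```

The first column of Y can serve as the x coordinate and the second as the y coordinate when plotting.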

This exercise is part of the course

Dimensionality Reduction in R

Exercise instructions

Fit t-SNE to house_sales_df using Rtsne().
Bind the t-SNE X and Y coordinates to house_sales_df.
Plot the t-SNE results using ggplot(), encoding the target variable in color.
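
The three steps above can be sketched end to end on a stand-in data set. Everything named toy_* below is hypothetical and generated on the spot, since house_sales_df is not available outside the exercise; mutate() is used here as one way to attach the coordinates, and other approaches may work equally well.

```r
library(dplyr)
library(ggplot2)
library(Rtsne)

set.seed(1234)

# Hypothetical stand-in for house_sales_df: three predictors and a price target
toy_sales_df <- tibble(
  sqft     = runif(300, 500, 4000),
  bedrooms = sample(1:6, 300, replace = TRUE),
  age      = runif(300, 0, 100)
) %>%
  mutate(price = 150 * sqft + 10000 * bedrooms - 500 * age +
           rnorm(300, sd = 20000))

# Step 1: fit t-SNE on the predictors only (the target is dropped first)
toy_tsne <- Rtsne(toy_sales_df %>% select(-price), check_duplicates = FALSE)

# Step 2: attach the two embedding columns to the original rows
toy_tsne_df <- toy_sales_df %>%
  mutate(tsne_x = toy_tsne$Y[, 1], tsne_y = toy_tsne$Y[, 2])

# Step 3: plot the embedding, encoding the target variable in color
p <- toy_tsne_df %>%
  ggplot(aes(x = tsne_x, y = tsne_y, color = price)) +
  geom_point() +
  scale_color_gradient(low = "gray", high = "blue")

print(p)
```

Because t-SNE is stochastic, the exact layout depends on the seed; set.seed() keeps the sketch reproducible.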

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Fit t-SNE
set.seed(1234)
tsne <- ___(___ %>% select(-___), check_duplicates = FALSE)

# Bind t-SNE coordinates to the data frame
tsne_df <- ___ %>% 
  ___(tsne_x = ___$___[,___], tsne_y = ___$___[,___])

# Plot t-SNE
___ %>% 
  ___(aes(x = ___, y = ___, color = ___)) +
  geom_point() +
  scale_color_gradient(low = "gray", high = "blue")
