Exercise

Correlation between PC1 and library size

The first principal component (PC1) captures the largest share of variance in the data and is often correlated with library size. When that happens, the dominant source of variation is technical (the total number of reads sequenced for each cell) rather than biological.
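The effect can be reproduced on simulated data. The sketch below uses base R only and is an illustration, not part of the exercise: the gene and cell numbers, the depth range, and the log2(counts + 1) transform are all assumptions chosen for the demonstration. Each cell gets its own sequencing depth, and nothing else differs between cells.

```r
set.seed(1)
n_genes <- 200
n_cells <- 100

# Assumed per-cell sequencing depth, varying about 10-fold between cells.
depth <- runif(n_cells, min = 0.5, max = 5)

# Each gene has a baseline expression level; expected counts scale with depth.
baseline <- rgamma(n_genes, shape = 2, rate = 0.2)
mu <- outer(baseline, depth)  # genes x cells matrix of expected counts
sim_counts <- matrix(rpois(n_genes * n_cells, lambda = mu),
                     nrow = n_genes, ncol = n_cells)

# Library size: total counts per cell.
libsize <- colSums(sim_counts)

# PCA on log-transformed counts, with cells as rows.
pca_sim <- prcomp(t(log2(sim_counts + 1)))
pc1 <- pca_sim$x[, 1]

# The sign of a principal component is arbitrary, so look at the magnitude.
r <- cor(pc1, libsize)
print(abs(r))
```

Because depth is the only systematic difference between the simulated cells, PC1 tracks library size closely here. In real data the correlation is usually weaker, since biological variation is also present.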

You will plot PC1 (x-axis) against library size (y-axis) using ggplot() to check whether the two are correlated. The SingleCellExperiment object sce has been preloaded for you, and the SingleCellExperiment and ggplot2 libraries have been loaded.

Instructions
  • Extract the first two PCs from the sce object using reducedDim(), assigning them to pca.

  • Create a data frame cdata using data.frame() with three columns: PC1, libsize, and batch, containing the first component of pca, the total counts per cell of the SCE object, and the batch column of sce, respectively.

  • Create a scatterplot of PC1 (x-axis) versus libsize (y-axis), with the cells colored by batch.
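
One way the three steps could look is sketched below on a toy object. The preloaded sce is not available here, so the sketch builds a small stand-in first. The names toy_counts, the reduced dimension name "PCA", and the colData column name batch are assumptions for illustration, and library size is computed with colSums() rather than read from a stored column. The preloaded object may use different names.

```r
library(SingleCellExperiment)
library(ggplot2)

set.seed(42)

# Toy stand-in for the preloaded object: 50 genes x 20 cells, two batches.
toy_counts <- matrix(rpois(50 * 20, lambda = 10), nrow = 50, ncol = 20,
                     dimnames = list(paste0("gene", 1:50), paste0("cell", 1:20)))
sce <- SingleCellExperiment(
  assays = list(counts = toy_counts),
  colData = DataFrame(batch = rep(c("A", "B"), each = 10))
)

# Store a PCA of the log-transformed counts (cells as rows) under "PCA".
reducedDim(sce, "PCA") <- prcomp(t(log2(toy_counts + 1)))$x

# Step 1: extract the first two PCs.
pca <- reducedDim(sce, "PCA")[, 1:2]

# Step 2: one row per cell with PC1, library size, and batch.
cdata <- data.frame(
  PC1 = pca[, 1],
  libsize = colSums(counts(sce)),  # total counts per cell
  batch = sce$batch
)

# Step 3: scatterplot of PC1 versus library size, colored by batch.
p <- ggplot(cdata, aes(x = PC1, y = libsize, colour = batch)) +
  geom_point()
print(p)
```

If the points fall along a clear trend, PC1 is tracking library size. Coloring by batch shows whether that trend also separates the batches.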