Dimensionality reduction: feature extraction
1. Dimensionality reduction: feature extraction
Welcome to Chapter 3 on unsupervised learning!
2. Unsupervised learning methods
As you know, unsupervised learning means there is no target variable to predict. Instead, unsupervised learning is used to learn the patterns that exist in the data: principal component analysis and singular value decomposition for dimensionality reduction, grouping, aka clustering, and exploratory data mining. We'll cover PCA and SVD in this lesson and clustering in later lessons.
3. Dimensionality reduction != feature selection
Sometimes dimensionality reduction can be confused with feature selection, which we covered earlier in chapter 2. Although both methods reduce the number of features in a dataset, dimensionality reduction techniques do so by creating NEW combinations of features. Feature selection includes or excludes features based on their relationship to the target variable, but there is no feature transformation happening as there is with dimensionality reduction.
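Here is a minimal sketch of that distinction, using the iris data purely for illustration: feature selection keeps a subset of the original columns, while feature extraction manufactures entirely new ones.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)             # 150 samples, 4 features

# Feature selection: keep 2 of the ORIGINAL columns, ranked against the target y.
selected = SelectKBest(score_func=f_classif, k=2).fit_transform(X, y)

# Feature extraction: build 2 NEW features as combinations of all 4 originals,
# with no target variable involved at all.
extracted = PCA(n_components=2).fit_transform(X)

print(selected.shape, extracted.shape)        # both (150, 2), but very different columns
```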
4. Curse of dimensionality
As data scientists, we tend to learn that starting with as many features as possible is best practice. However, there is a phenomenon where model performance decreases as the number of features increases. This is the so-called 'curse of dimensionality.' In a high-dimensionality context, the feature space becomes sparse, meaning there is more and more empty space between the values it contains. That sparsity makes it easy for a model to find a seemingly "perfect" fit to the training data, which of course indicates overfitting. We've discussed overfitting and why it leads to poor model generalization. The answer to preventing overfitting due to high dimensionality is to perform dimensionality reduction.
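To make the sparsity concrete, here is a small numeric sketch (the point count and dimensions are arbitrary choices): with a fixed number of points, each point's nearest neighbor drifts farther away as dimensions are added.

```python
# Rough sketch of the curse of dimensionality: fixed number of points,
# growing number of dimensions, growing nearest-neighbor distances.
import numpy as np

rng = np.random.default_rng(0)
for d in (1, 2, 10, 100):
    X = rng.uniform(size=(200, d))                # 200 points in the d-dimensional unit cube
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)               # ignore each point's distance to itself
    print(d, round(dists.min(axis=1).mean(), 3))  # mean nearest-neighbor distance grows
```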
5. 1-D search
To put this concept into perspective, imagine that you took your dog for a walk to the end of the block and back, and somewhere along the way you lost your wedding ring. To locate it, you'd simply retrace your steps.
6. 2-D search
But now imagine you lost it while you were playing soccer with your friends. It wouldn't be as easy to find, but still doable.
7. 3-D search
And now imagine that you're a backhoe operator on a construction site and you just spent the whole day moving dirt. Your heart sinks as you realize your search is 3-dimensional. You can quickly imagine how adding more dimensions continues to exponentially increase the search space. Good thing your ring is at home sitting by the sink.
8. Dimensionality reduction methods
In the exercises, you'll practice principal component analysis, more commonly called PCA, and singular value decomposition, or SVD.
9. PCA
PCA, instead of attempting to predict the target variable y from the X matrix, learns about the relationships among the features within X by finding the principal axes. Through translating, rotating, and scaling the data to locate the direction of maximal variance, the first principal component is assigned, with the second pc, as it's called, being a direction perpendicular to the first. This results in a lower-dimensional projection of the data that maintains the maximal amount of variance.
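Here is a minimal sketch of those ideas on synthetic data (the shapes and seed are arbitrary): the fitted components come back as perpendicular unit vectors, and the first one captures most of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
x = rng.normal(size=500)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.5, size=500)])  # two correlated features

pca = PCA(n_components=2).fit(X)
print(pca.components_)                      # the two principal axes (unit vectors)
print(pca.components_ @ pca.components_.T)  # ~identity matrix: the axes are perpendicular
print(pca.explained_variance_ratio_)        # first component carries most of the variance
```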
10. SVD
SVD also gives back principal components, but uses linear algebra to decompose the data matrix into three matrices, which yields what are called singular values. The sum of the squares of the singular values should approximately equal the total variance in the original data matrix. We won't get into the technical aspects here, as they aren't necessary to understand in order to use the method.
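For the curious, here is a small numpy sketch of the decomposition itself (the random matrix is just for illustration). With centered data and the usual n - 1 scaling, the sum of squared singular values matches the total variance exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))              # a made-up 100 x 5 data matrix
Xc = X - X.mean(axis=0)                    # center each column first

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # the three matrices
print(np.allclose(Xc, U @ np.diag(s) @ Vt))        # True: the product rebuilds the data

# Scaled sum of squared singular values equals the total variance of the data.
print((s ** 2).sum() / (len(Xc) - 1))
print(Xc.var(axis=0, ddof=1).sum())
```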
11. Dimension reduction functions
As usual, we'll use some packages from sklearn to practice. sklearn.decomposition.PCA is used to perform principal component analysis, while sklearn.decomposition.TruncatedSVD is used for singular value decomposition. The workflow is the same for both PCA and SVD, with the .fit_transform() method doing just that and the .explained_variance_ratio_ attribute giving the variance explained by each principal component. Finally, the sklearn.decomposition module offers many other matrix decomposition methods.
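As a quick sketch of that workflow (the digits dataset and component count are arbitrary choices for illustration):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, TruncatedSVD

X, _ = load_digits(return_X_y=True)          # 1797 samples, 64 features

pca = PCA(n_components=10)
X_pca = pca.fit_transform(X)                 # reduce 64 features to 10 components
print(X_pca.shape)                           # (1797, 10)
print(pca.explained_variance_ratio_.sum())   # share of total variance those 10 retain

svd = TruncatedSVD(n_components=10)
X_svd = svd.fit_transform(X)                 # same two-step API; also works on sparse input
print(svd.explained_variance_ratio_.sum())
```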
12. Let's practice!
Finally, it's time for you to try dimension reduction yourself!