Hierarchies of stocks
In chapter 1, you used k-means clustering to cluster companies according to their stock price movements. Now, you'll perform hierarchical clustering of the companies. You are given a NumPy array of price movements, movements, where the rows correspond to companies, and a list of the company names, companies. SciPy hierarchical clustering doesn't fit into a sklearn pipeline, so you'll need to use the normalize() function from sklearn.preprocessing instead of Normalizer.
linkage and dendrogram have already been imported from scipy.cluster.hierarchy, and PyPlot has been imported as plt.
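To see what normalize() does, here is a minimal sketch on a made-up array (toy_movements is a hypothetical stand-in, not the exercise data). With its default settings, the function rescales each row to unit Euclidean length, which should match what a Normalizer transformer produces:

```python
import numpy as np
from sklearn.preprocessing import Normalizer, normalize

# Hypothetical toy data: 3 "companies" (rows) with 2 price movements each.
toy_movements = np.array([[3.0, 4.0],
                          [1.0, 0.0],
                          [-2.0, 2.0]])

# normalize() is a plain function: by default it rescales each row
# (axis=1) to unit L2 norm.
normalized = normalize(toy_movements)

# Every row should now have Euclidean length 1.
row_norms = np.linalg.norm(normalized, axis=1)

# The Normalizer transformer applies the same row-wise rescaling.
same_as_transformer = np.allclose(
    normalized, Normalizer().fit_transform(toy_movements)
)

print(normalized)
print(row_norms)
print(same_as_transformer)
```

Because normalize() is a function rather than a transformer object, it can be called directly on the array before the result is handed to SciPy.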
This exercise is part of the course
Unsupervised Learning in Python
Exercise instructions
- Import normalize from sklearn.preprocessing.
- Rescale the price movements for each stock by using the normalize() function on movements.
- Apply the linkage() function to normalized_movements, using 'complete' linkage, to calculate the hierarchical clustering. Assign the result to mergings.
- Plot a dendrogram of the hierarchical clustering, using the list companies of company names as the labels. In addition, specify the leaf_rotation=90 and leaf_font_size=6 keyword arguments as you did in the previous exercise.
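The steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the movements array and companies list here are randomly generated placeholders, not the stock data supplied by the exercise, so the resulting dendrogram has no real meaning.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.preprocessing import normalize

# Placeholder data (assumption): 6 made-up companies, 20 daily movements each.
rng = np.random.default_rng(0)
movements = rng.normal(size=(6, 20))
companies = ["Company A", "Company B", "Company C",
             "Company D", "Company E", "Company F"]

# Rescale each company's row of movements to unit length.
normalized_movements = normalize(movements)

# Hierarchical clustering with complete linkage.
mergings = linkage(normalized_movements, method="complete")

# Plot the dendrogram with company names as leaf labels.
tree = dendrogram(
    mergings,
    labels=companies,
    leaf_rotation=90,
    leaf_font_size=6,
)
plt.show()
```

For n companies, linkage() returns an array with n - 1 rows, one per merge, and dendrogram() returns a dictionary whose "ivl" entry lists the leaf labels in plotted order.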
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import normalize
____
# Normalize the movements: normalized_movements
normalized_movements = ____
# Calculate the linkage: mergings
mergings = ____
# Plot the dendrogram
____
plt.show()