Clustering the fish data

You'll now use your standardization and clustering pipeline from the previous exercise to cluster the fish by their measurements, and then create a cross-tabulation to compare the cluster labels with the fish species.

As before, samples is the 2D array of fish measurements. Your pipeline is available as pipeline, and the species of every fish sample is given by the list species.
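The exercise environment preloads `samples`, `species`, and `pipeline`, so none of them are defined on this page. To follow along outside that environment, a minimal stand-in might look like the sketch below. The data values, the two species labels, and the choice of `n_clusters=2` are all made up for illustration; the course's own pipeline is assumed (not confirmed here) to be a `StandardScaler` followed by `KMeans`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 6 fish, 2 measurements each (weight in g, length in cm).
samples = np.array([
    [240.0, 23.0],
    [290.0, 24.0],
    [340.0, 24.5],
    [  7.0, 10.0],
    [  9.0, 11.0],
    [ 10.0, 11.5],
])

# One species name per row of samples (illustrative labels only).
species = ["Bream", "Bream", "Bream", "Smelt", "Smelt", "Smelt"]

# Assumed pipeline shape: standardize every feature, then cluster with k-means.
# n_clusters=2 matches the made-up data above, not necessarily the course setting.
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)

# make_pipeline names each step after its lowercased class name.
print([name for name, _ in pipeline.steps])
```

Standardizing first matters here because the weight column has a much larger spread than the length column and would otherwise dominate the distance computation in k-means.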

This exercise is part of the course

Unsupervised Learning in Python

Exercise instructions

  • Import pandas as pd.
  • Fit the pipeline to the fish measurements samples.
  • Obtain the cluster labels for samples by using the .predict() method of pipeline.
  • Using pd.DataFrame(), create a DataFrame df with two columns named 'labels' and 'species', using labels and species, respectively, for the column values.
  • Using pd.crosstab(), create a cross-tabulation ct of df['labels'] and df['species'].

Interactive hands-on exercise

Try to solve this exercise by completing the sample code.

# Import pandas
import pandas as pd

# Fit the pipeline to samples
____

# Calculate the cluster labels: labels
labels = ____

# Create a DataFrame with labels and species as columns: df
df = ____

# Create crosstab: ct
ct = ____

# Display ct
print(ct)
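One way the blanks could be filled in is sketched below, following the listed instructions step by step. It is not the course's official solution. Because `samples`, `species`, and `pipeline` are only available inside the exercise environment, the sketch builds small hypothetical replacements first (nine made-up fish, three illustrative species, `n_clusters=3`); only the four lines after the "exercise steps" comment correspond to the template.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# --- Hypothetical stand-ins for the preloaded objects ---
# Columns: length (cm), weight (g). Values are invented for illustration.
samples = np.array([
    [10.0,  100.0], [11.0,  110.0], [12.0,  120.0],   # species "A"
    [30.0,  900.0], [31.0,  950.0], [32.0, 1000.0],   # species "B"
    [60.0,  300.0], [62.0,  320.0], [61.0,  310.0],   # species "C"
])
species = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
pipeline = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=3, n_init=10, random_state=0),
)

# --- Exercise steps ---
# Fit the pipeline to samples
pipeline.fit(samples)

# Calculate the cluster labels: labels
labels = pipeline.predict(samples)

# Create a DataFrame with labels and species as columns: df
df = pd.DataFrame({"labels": labels, "species": species})

# Create crosstab: ct (rows are cluster labels, columns are species)
ct = pd.crosstab(df["labels"], df["species"])

# Display ct
print(ct)
```

The cluster numbers themselves are arbitrary, so which row lines up with which species can differ between runs or settings. What the cross-tabulation shows is how cleanly each cluster maps onto a single species: with well-separated data like the stand-in above, each row should have its counts concentrated in one column.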