Scatter matrix of numeric columns

You've investigated the new farmers' market data, and it's rather wide, with many columns of information in each market's row. Rather than painstakingly making a scatter plot for every pair of numeric columns to look for correlations, you decide to make a scatter matrix using the pandas built-in function pd.plotting.scatter_matrix().

Increasing the figure size with the figsize argument will give the dense visualization some breathing room. Because many points will overlap, lowering the point opacity with the alpha argument will help show how dense those overlaps are.
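To make the effect of these two arguments concrete, here is a minimal sketch on synthetic data rather than the markets data. The DataFrame demo, its column names x, y, and z, and the particular figsize and alpha values are assumptions chosen for illustration only; they are not the values the exercise asks for.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data (hypothetical; not the markets DataFrame)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "x": rng.normal(size=500),
    "y": rng.normal(size=500),
    "z": rng.normal(size=500),
})
demo["y"] += 0.8 * demo["x"]  # make x and y visibly correlated

# figsize is (width, height) in inches; alpha runs from 0 (transparent) to 1 (opaque)
axes = pd.plotting.scatter_matrix(demo, figsize=(8, 6), alpha=0.3)

# scatter_matrix returns a 2D array of Axes, one per pair of columns
fig = axes[0, 0].get_figure()
```

Calling plt.show() afterwards displays the grid. With a low alpha, regions where many points pile up appear darker than sparse regions, which is what makes the overlap density readable.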

This exercise is part of the course

Improving Your Data Visualizations in Python


Exercise instructions

  • Subset the columns of the markets DataFrame to numeric_columns so the scatter matrix only shows numeric non-binary columns.
  • Increase figure size to 15 by 10 to avoid crowding.
  • Reduce point opacity to 50% to show regions of overlap.
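The exercise supplies numeric_columns ready-made. As a rough sketch of how such a list of numeric non-binary columns might be derived, the example below filters a small made-up table. The toy DataFrame, its values, and the binary Cheese column are hypothetical, and the rule "more than two distinct values" is an assumed heuristic, not necessarily how the course built its list.

```python
import pandas as pd

# Hypothetical toy table: one text column, two numeric columns, one 0/1 indicator
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "lat": [40.1, 35.2, 47.6, 29.9],
    "months_open": [6, 12, 4, 9],
    "Cheese": [0, 1, 1, 0],
})

# Keep only numeric dtypes, then drop columns with at most two distinct values
numeric = toy.select_dtypes(include="number")
candidates = [col for col in numeric.columns if numeric[col].nunique() > 2]

# Indexing a DataFrame with a list of names returns just those columns
subset = toy[candidates]
```

Indexing with a list, as in toy[candidates], is the same pattern the first instruction asks for when subsetting markets.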

Interactive exercise

Try this exercise by completing this sample code.

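# Assumption: the markets DataFrame is already loaded by the exercise environment
import pandas as pd
import matplotlib.pyplot as plt

# Select just the numeric columns (excluding individual goods)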
numeric_columns = ['lat', 'lon', 'months_open', 'num_items_sold', 'state_pop']

# Make a scatter matrix of numeric columns
pd.plotting.scatter_matrix(markets[____],
                           # Make figure large to show details
                           figsize=____,
                           # Lower point opacity to show overlap
                           alpha=____)

plt.show()