Scatter matrix of numeric columns

You've investigated the new farmers' market data, and it's rather wide, with many columns of information in each market's row. Rather than painstakingly plotting every pair of numeric columns to look for correlations, you decide to make a scatter matrix using the pandas built-in function pd.plotting.scatter_matrix().

Increasing the figure size with the figsize argument will help give the dense visualization some breathing room. Since there will be a lot of overlap for the points, decreasing the point opacity will help show the density of these overlaps.
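
As a rough illustration of these two arguments, the sketch below builds a small synthetic DataFrame and passes figsize and alpha to pd.plotting.scatter_matrix(). The demo frame, its columns a, b, and c, and the specific values are made up for this example and are not part of the markets data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in data: three random numeric columns
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "a": rng.normal(size=500),
    "b": rng.normal(size=500),
    "c": rng.normal(size=500),
})

# figsize is (width, height) in inches; alpha is point opacity between 0 and 1
axes = pd.plotting.scatter_matrix(demo, figsize=(12, 8), alpha=0.3)
```

The call returns a grid of matplotlib Axes, one per pair of columns. Calling plt.show() afterwards displays the figure, as in the exercise code.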

This exercise is part of the course

Improving Your Data Visualizations in Python

Exercise instructions

  • Subset the columns of the markets DataFrame to numeric_columns so the scatter matrix only shows numeric non-binary columns.
  • Increase figure size to 15 by 10 to avoid crowding.
  • Reduce point opacity to 50% to show regions of overlap.
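
The first step relies on selecting columns with a list of names. A minimal sketch of that pattern, using a hypothetical toy DataFrame (its values and the sells_honey column are invented for illustration, not taken from the markets data):

```python
import pandas as pd

# Hypothetical toy data with a text column, two numeric columns, and a binary flag
toy = pd.DataFrame({
    "name": ["North", "South", "East"],
    "lat": [40.1, 35.2, 42.7],
    "lon": [-75.3, -80.1, -71.0],
    "sells_honey": [1, 0, 1],
})

# Indexing with a list of column names returns a DataFrame with only those columns
wanted = ["lat", "lon"]
subset = toy[wanted]
```

Passing a list keeps the result a DataFrame, which is the type pd.plotting.scatter_matrix() takes as its first argument.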

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Select just the numeric columns (excluding individual goods)
numeric_columns = ['lat', 'lon', 'months_open', 'num_items_sold', 'state_pop']

# Make a scatter matrix of numeric columns
pd.plotting.scatter_matrix(markets[____],
                           # Make figure large to show details
                           figsize = ____,
                           # Lower point opacity to show overlap
                           alpha = ____)

plt.show()