Calculate several metrics by sector and exchange
The .agg() method allows you to aggregate your data in even more ways. Passing a list of statistical method names calculates more than one summary statistic at once. You can give the aggregated columns new names using the .rename() method, which takes a dictionary argument where the keys are the names of the metrics you computed and the values are your desired new names.
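As a minimal sketch of this pattern, the snippet below aggregates a small invented DataFrame. The names df, group, value, and stats are placeholders chosen for illustration and are not part of the exercise.

```python
import pandas as pd

# Toy data, invented purely for illustration
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 3.0, 10.0, 20.0],
})

# A list of method names passed to .agg() yields one column per statistic
stats = df.groupby("group")["value"].agg(["mean", "median", "std"])

# .rename() maps the default column names (keys) to new labels (values)
stats = stats.rename(
    columns={"mean": "Average", "median": "Median", "std": "Standard Deviation"}
)
print(stats)
```

The result has one row per group and one column per statistic, so the dictionary keys must match the method names that were passed to .agg().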
In this exercise, you will calculate the mean, median, and standard deviation of market capitalizations in millions of USD. pandas as pd and matplotlib.pyplot as plt have been imported, and the listings DataFrame, with reference column 'Exchange', is available in your workspace.
This exercise is part of the course
Importing and Managing Financial Data in Python
Exercise instructions
- With broadcasting and .div(), create a new column 'market_cap_m' that contains the market capitalization data in millions of USD.
- Group your data by both 'Sector' and 'Exchange', assigning the result to by_sector_exchange.
- Assign the market_cap_m column of by_sector_exchange to a variable bse_mcm.
- Use .agg() to calculate the mean, median, and standard deviation for market_cap_m, call the rename method with a dictionary argument for the keyword parameter columns so that the results are stored in 'Average', 'Median', and 'Standard Deviation', respectively, and assign the result to summary.
- Print the result to your console.
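The steps above can be sketched on a small stand-in DataFrame. This is only an illustration under stated assumptions: the rows are invented, and the source column name 'Market Capitalization' is a guess, since the exercise text does not name the column holding the raw values.

```python
import pandas as pd

# Hypothetical stand-in for the listings DataFrame. The values and the
# 'Market Capitalization' column name are assumptions, not course data.
listings = pd.DataFrame({
    "Exchange": ["nyse", "nyse", "nasdaq", "nasdaq", "nasdaq"],
    "Sector": ["Energy", "Energy", "Technology", "Technology", "Energy"],
    "Market Capitalization": [2e9, 4e9, 1e9, 3e9, 5e8],
})

# Broadcasting: .div(1e6) divides every row by one million
listings["market_cap_m"] = listings["Market Capitalization"].div(1e6)

# Group by two columns; the result is indexed by (Sector, Exchange) pairs
by_sector_exchange = listings.groupby(["Sector", "Exchange"])

# Select the single column to aggregate
bse_mcm = by_sector_exchange["market_cap_m"]

# Compute three statistics at once, then relabel the columns
summary = bse_mcm.agg(["mean", "median", "std"]).rename(
    columns={"mean": "Average", "median": "Median", "std": "Standard Deviation"}
)
print(summary)
```

In this toy data the ('Energy', 'nasdaq') group has a single row, so its standard deviation is NaN: pandas computes the sample standard deviation by default, which is undefined for one observation.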
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create market_cap_m
listings['market_cap_m'] = ____[____].div(1e6)
# Group listings by both Sector and Exchange
by_sector_exchange = ____.____(['Sector', 'Exchange'])
# Subset market_cap_m of by_sector_exchange
bse_mcm = ____[____]
# Calculate mean, median, and std in summary
summary = ____.____(['____', '____', '____']).rename(columns={'mean': ____, 'median': ____, 'std': ____})
# Print the summary
print(summary)