Calculate several metrics by sector and IPO year
The seaborn pointplot() function facilitates the comparison of summary statistics of a numerical variable for different levels of categorical variables:
seaborn.pointplot(x=None, y=None, hue=None, data=None, ...)
In the video, you saw a visualization for the market capitalization (the numerical variable) differentiated by whether the IPO (the categorical variable) occurred before (first level) or after (second level) the year 2000.
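As a rough illustration of the summary statistic involved: by default, pointplot() draws the mean of the numerical variable for each level of the categorical variable. The sketch below computes those per-level means with a plain pandas groupby. The toy DataFrame, its values, and the column name ipo_period are made up for illustration and are not part of the course data.

```python
import pandas as pd

# Hypothetical toy data: market cap (USD million) by IPO period
toy = pd.DataFrame({
    "ipo_period": ["before 2000", "before 2000", "after 2000", "after 2000"],
    "market_cap_m": [100.0, 300.0, 50.0, 150.0],
})

# Mean of the numerical variable for each level of the categorical variable;
# these are the point estimates a default pointplot would place on the y-axis
group_means = toy.groupby("ipo_period")["market_cap_m"].mean()
print(group_means)
```

The plot adds error bars around each of these points, which the groupby table does not show.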
In this exercise, you will compare the mean market capitalization for each year since 2000 for the NYSE and the NASDAQ, after excluding outliers beyond the 95th percentile. pandas as pd and matplotlib.pyplot as plt have been imported, and the listings DataFrame with reference column 'Exchange' is available in your workspace.
This exercise is part of the course
Importing and Managing Financial Data in Python
Exercise instructions
- Import seaborn as sns.
- Filter listings to have companies with IPOs after 2000 from all exchanges except the 'amex'.
- Convert the data in column 'IPO Year' to integers.
- Create the column market_cap_m to express market cap in USD million.
- Filter market_cap_m to exclude values above the 95th percentile.
- Create a pointplot of listings using the column 'IPO Year' for x, 'market_cap_m' for y, and 'Exchange' for hue. Show the result after rotating the xticks by 45 degrees.
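The data-preparation steps above can be sketched on a small stand-in for listings. This is one possible approach, not the official solution: the six rows and the lowercase exchange labels 'nyse' and 'nasdaq' are assumptions made up for illustration, and only the column names come from the exercise.

```python
import pandas as pd

# Hypothetical stand-in for the course's listings DataFrame
listings = pd.DataFrame({
    "Exchange": ["nyse", "nasdaq", "amex", "nyse", "nasdaq", "nyse"],
    "IPO Year": [2001.0, 2005.0, 2010.0, 1995.0, 2012.0, 2003.0],
    "Market Capitalization": [5e8, 2e9, 1e9, 3e9, 7e8, 9e10],
})

# Keep IPOs after 2000 from every exchange except 'amex';
# .copy() avoids chained-assignment warnings in the steps below
mask = (listings["IPO Year"] > 2000) & (listings["Exchange"] != "amex")
listings = listings[mask].copy()

# Convert IPO Year from float to integer
listings["IPO Year"] = listings["IPO Year"].astype(int)

# Express market capitalization in USD million
listings["market_cap_m"] = listings["Market Capitalization"].div(1e6)

# Drop rows at or above the 95th percentile of market_cap_m
cutoff = listings["market_cap_m"].quantile(.95)
listings = listings[listings["market_cap_m"] < cutoff]

print(listings)
```

With so few rows, the 95th percentile is interpolated between the two largest values, so only the single largest company is dropped. On the full dataset the cutoff removes roughly the top 5% of companies.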
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the seaborn library as sns
____
# Keep IPOs after 2000 and drop listings from the 'amex'
listings = ____[(____['IPO Year'] > ____) & (listings.Exchange != ____)]
# Convert IPO Year to integer
listings['IPO Year'] = ____['IPO Year'].____(____)
# Create market_cap_m
listings['market_cap_m'] = ____['Market Capitalization'].div(1e6)
# Exclude outliers
listings = listings[listings.____ < listings.____.____(.95)]
# Create the pointplot
sns.pointplot(x=____, y=____, hue=____, data=____)
# Rotate xticks
plt.____(____=____)
# Show the plot
plt.show()
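For reference, here is a minimal, self-contained sketch of the plotting step on made-up data, assuming seaborn and matplotlib are installed. The toy values and the 'nyse'/'nasdaq' labels are hypothetical. The non-interactive Agg backend is selected only so the sketch runs without a display; in an interactive session you would skip that line and call plt.show() at the end.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in an interactive session
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical, already-prepared data: two exchanges, two IPO years
toy = pd.DataFrame({
    "IPO Year": [2001, 2001, 2002, 2002, 2001, 2001, 2002, 2002],
    "Exchange": ["nyse"] * 4 + ["nasdaq"] * 4,
    "market_cap_m": [100.0, 300.0, 200.0, 400.0, 50.0, 150.0, 80.0, 120.0],
})

# One point per (IPO Year, Exchange) pair, placed at the group mean
ax = sns.pointplot(x="IPO Year", y="market_cap_m", hue="Exchange", data=toy)

# Rotate the x-axis tick labels by 45 degrees
plt.xticks(rotation=45)
```

Each exchange gets its own line, so the yearly means for the two exchanges can be compared directly.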