Exploring with box plots
Two common formats of DataFrames are the wide format and the long format. The wide format stores each variable in its own column, while the long format stores all variables in two columns: one holding the variable name and the other holding the corresponding value.
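As a minimal sketch of the difference, the snippet below builds a small wide DataFrame and reshapes it to long format with pandas' melt method. The data values and the names wide and long_df are made up for illustration and are not part of the exercise.

```python
import pandas as pd

# Hypothetical wide-format data: one column per variable
wide = pd.DataFrame({"bmi": [1.5, -0.3, 2.1], "hdl": [-4.0, 0.7, 3.2]})

# melt stacks the selected columns into two columns:
# one with the variable name and one with the corresponding value
long_df = wide.melt(value_vars=["bmi", "hdl"], var_name="variable", value_name="value")
print(long_df)
```

Each of the three rows in the wide DataFrame becomes two rows in the long one, so the result has six rows.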
Long-format DataFrames make it easy to create many kinds of visualizations, including the boxplot that you will create in this exercise after converting df_diffs (loaded for you) from wide to long format.
pandas has been loaded for you as pd, matplotlib.pyplot as plt, and Seaborn as sns.
This exercise is part of the course
Monte Carlo Simulations in Python
Instructions
- Convert the bmi and hdl columns (specified in that order) of the df_diffs DataFrame from wide to long format; save the long DataFrame as hdl_bmi_long and name the column that will contain the variable values y_diff.
- Use a boxplot to visualize the results of patients in the first or last quartile of the hdl and bmi variables.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Convert the hdl and bmi columns of df_diffs from wide to long format, naming the values column "y_diff"
hdl_bmi_long = df_diffs.____(value_name=____, value_vars=____)
print(hdl_bmi_long.head())
# Use a boxplot to visualize the results
____
plt.show()
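One possible way to fill in the template is sketched below. It is not the official solution. Because the real df_diffs is only available inside the exercise, the sketch substitutes a hypothetical stand-in filled with random numbers, and it assumes that the boxplot should show the variable name on the x-axis and y_diff on the y-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Hypothetical stand-in for df_diffs; the real DataFrame is preloaded in the exercise
df_diffs = pd.DataFrame({
    "bmi": rng.normal(20, 5, 100),
    "hdl": rng.normal(-10, 5, 100),
})

# Wide to long: melt names the name column "variable" by default,
# and value_name sets the name of the values column
hdl_bmi_long = df_diffs.melt(value_name="y_diff", value_vars=["bmi", "hdl"])
print(hdl_bmi_long.head())

# One box per variable, drawn from the long-format data
ax = sns.boxplot(x="variable", y="y_diff", data=hdl_bmi_long)
# In an interactive session, call plt.show() (from matplotlib.pyplot) to display the figure
```

The long format is what lets a single boxplot call draw both variables: Seaborn groups the rows by the variable column instead of needing one call per column.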