Evaluating BMI and HDL outcomes
What is the difference in the predicted disease progression (the response y) for patients who are in both the top 10% of BMI and the top 25% of HDL compared to those in both the lowest 10% of BMI and the lowest 25% of HDL? Again, a simulation has already been performed for you: your task is to evaluate the simulation results in df_summary, which combines the sampled predictors in df_results with the predicted y values, to find an answer to this question!
The following libraries have been imported: pandas as pd, numpy as np, and scipy.stats as st.
This exercise is part of the course
Monte Carlo Simulations in Python
Exercise instructions
- Complete the mean outcome definitions by filtering the results for patients who are in both the top 10% of BMI and the top 25% of HDL, and then for patients who are in both the lowest 10% of BMI and the lowest 25% of HDL, using hdl_q25, hdl_q75, bmi_q10, and bmi_q90, which are already defined for you.
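The filtering pattern in the instruction can be tried on a small deterministic table first. The sketch below uses invented toy values (not the course data) and a hypothetical DataFrame named toy; it only illustrates how two quantile conditions are combined with & and why each condition needs its own parentheses.

```python
import numpy as np
import pandas as pd

# Toy data invented for illustration only; not the course dataset.
bmi = np.arange(1, 21)
hdl = np.arange(1, 21)
hdl[[2, 18]] = hdl[[18, 2]]  # swap two values so hdl is not identical to bmi
toy = pd.DataFrame({"bmi": bmi, "hdl": hdl, "predicted_y": 10 * bmi})

# Quantile thresholds, computed the same way as in the exercise.
bmi_q10 = np.quantile(toy["bmi"], 0.10)
bmi_q90 = np.quantile(toy["bmi"], 0.90)
hdl_q25 = np.quantile(toy["hdl"], 0.25)
hdl_q75 = np.quantile(toy["hdl"], 0.75)

# Each comparison is wrapped in parentheses because & binds tighter than > or <.
high_mask = (toy["bmi"] > bmi_q90) & (toy["hdl"] > hdl_q75)
low_mask = (toy["bmi"] < bmi_q10) & (toy["hdl"] < hdl_q25)

# Select the outcome column for the matching rows, then average it.
high_mean = toy.loc[high_mask, "predicted_y"].mean()
low_mean = toy.loc[low_mask, "predicted_y"].mean()

print(high_mean, low_mean, high_mean - low_mean)  # 200.0 15.0 185.0
```

Only one of the two highest-BMI rows also has a high HDL value here, so the second condition changes which rows enter the mean.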
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
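# mean_dia, cov_dia, and regr_model are not defined here; they are assumed to exist already
# Sample 20,000 simulated patients from a multivariate normal distribution
simulation_results = st.multivariate_normal.rvs(mean=mean_dia, size=20000, cov=cov_dia)
df_results = pd.DataFrame(simulation_results, columns=["age", "bmi", "bp", "tc", "ldl", "hdl", "tch", "ltg", "glu"])
# Predict the response y for every simulated patient
predicted_y = regr_model.predict(df_results)
df_y = pd.DataFrame(predicted_y, columns=["predicted_y"])
# Combine the predictors and the predicted response column-wise
df_summary = pd.concat([df_results, df_y], axis=1)
# Quantile thresholds used to define the patient groups
hdl_q25 = np.quantile(df_summary["hdl"], 0.25)
hdl_q75 = np.quantile(df_summary["hdl"], 0.75)
bmi_q10 = np.quantile(df_summary["bmi"], 0.10)
bmi_q90 = np.quantile(df_summary["bmi"], 0.90)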
# Complete the mean outcome definitions
bmi_q90_hdl_q75_outcome = np.mean(df_summary[(df_summary["bmi"] > bmi_q90) & (____)]____)
bmi_q10_hdl_q15_outcome = np.mean(df_summary[(df_summary["bmi"] < bmi_q10) & (____)]____)
y_diff = bmi_q90_hdl_q75_outcome - bmi_q10_hdl_q15_outcome
print(y_diff)
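For readers without access to the course environment, the whole workflow can be sketched end to end with stand-ins. Everything below is an assumption made for illustration: mean_toy and cov_toy replace mean_dia and cov_dia, only the bmi and hdl columns are simulated, a made-up linear formula replaces regr_model.predict, and NumPy's random generator is swapped in for st.multivariate_normal.rvs. The resulting number therefore says nothing about the real diabetes data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# Hypothetical stand-ins for mean_dia and cov_dia, limited to two predictors.
mean_toy = np.array([0.0, 0.0])
cov_toy = np.array([[1.0, -0.4],
                    [-0.4, 1.0]])

# Draw simulated patients (NumPy used here instead of scipy.stats).
samples = rng.multivariate_normal(mean=mean_toy, cov=cov_toy, size=20000)
df_summary = pd.DataFrame(samples, columns=["bmi", "hdl"])

# Made-up linear predictor standing in for regr_model.predict.
df_summary["predicted_y"] = 150.0 + 25.0 * df_summary["bmi"] - 10.0 * df_summary["hdl"]

# Quantile thresholds, as in the exercise.
hdl_q25 = np.quantile(df_summary["hdl"], 0.25)
hdl_q75 = np.quantile(df_summary["hdl"], 0.75)
bmi_q10 = np.quantile(df_summary["bmi"], 0.10)
bmi_q90 = np.quantile(df_summary["bmi"], 0.90)

# Top group: high BMI and high HDL. Bottom group: low BMI and low HDL.
top = (df_summary["bmi"] > bmi_q90) & (df_summary["hdl"] > hdl_q75)
bottom = (df_summary["bmi"] < bmi_q10) & (df_summary["hdl"] < hdl_q25)

top_outcome = df_summary.loc[top, "predicted_y"].mean()
bottom_outcome = df_summary.loc[bottom, "predicted_y"].mean()

y_diff = top_outcome - bottom_outcome
print(y_diff)
```

Because both conditions must hold at once, each group is much smaller than 10% of the sample, so a large simulation size helps keep the group means stable.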