Evaluating distribution fit for the ldl variable

In this exercise, you'll focus on one variable of the diabetes dataset dia: ldl, a blood serum measurement. You'll determine whether the normal distribution is still a good choice for ldl based on the additional information provided by a Kolmogorov-Smirnov test.

The dia DataFrame has been loaded for you. The following libraries have also been imported: pandas as pd, numpy as np, and scipy.stats as st.

This exercise is part of the course

Monte Carlo Simulations in Python

Instructions

  • Define a list called list_of_dists containing your candidate distributions: Laplace, normal, and exponential (in that order); use the correct names from scipy.stats.
  • Inside the loop, fit the corresponding probability distribution to the data, saving the fitted parameters as param.
  • Perform a Kolmogorov–Smirnov test to evaluate goodness-of-fit, saving the results as result.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# List candidate distributions to evaluate
list_of_dists = [____]
for i in list_of_dists:
    dist = getattr(st, i)
    # Fit the data to the probability distribution
    param = dist.____
    # Perform the ks test to evaluate goodness-of-fit
    result = ____
    print(result)
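
For reference, here is one way the blanks could be filled in. This is a minimal sketch, not the course's official solution. The real dia DataFrame is not available outside the exercise, so the sketch builds a hypothetical stand-in with synthetic ldl values (the mean, spread, and sample size are made up). It assumes the scipy.stats names laplace, norm, and expon for the three candidates, and passes the fitted parameters to st.kstest through its args argument.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

# Hypothetical stand-in for the course's dia DataFrame:
# synthetic, roughly normal ldl values (all numbers here are made up).
rng = np.random.default_rng(0)
dia = pd.DataFrame({"ldl": rng.normal(loc=115, scale=30, size=442)})
ldl = dia["ldl"]

# List candidate distributions to evaluate (scipy.stats names)
list_of_dists = ["laplace", "norm", "expon"]
results = {}
for i in list_of_dists:
    dist = getattr(st, i)
    # Fit the distribution to the data; returns a tuple of fitted parameters
    param = dist.fit(ldl)
    # Perform the ks test against the fitted distribution
    result = st.kstest(ldl, i, args=param)
    results[i] = result
    print(i, result)
```

A smaller test statistic (and a larger p-value) indicates a closer match between the data and the fitted distribution. Because the parameters are estimated from the same sample that is then tested, the p-values tend to be optimistic, so they are best read as a relative comparison between the candidates.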