Exercise

# Computing the K-S statistic

Write a function to compute the Kolmogorov-Smirnov statistic from two datasets, `data1` and `data2`, in which `data2` consists of samples from the theoretical distribution you are comparing your data to. Note that this means we are using hacker stats to compute the K-S statistic for a dataset and a theoretical distribution, *not* the K-S statistic for two empirical datasets. Conveniently, the function you just selected for computing values of the formal ECDF is given as `dcst.ecdf_formal()`.
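For reference, the K-S statistic is the largest vertical distance between the formal ECDF of the data and the theoretical CDF. Because the ECDF is a staircase and the CDF never decreases, that largest distance can only occur at a corner of a step, so it is enough to check the corners. The notation below is introduced here for illustration and is not part of the exercise: $F_n$ is the formal ECDF of the $n$ points in `data1`, $F$ is the theoretical CDF, and $x_{(k)}$ is the $k$-th smallest data point. The second equality assumes $F$ is continuous and the data points are distinct; in this exercise $F$ is only approximated from the samples in `data2`.

$$
D = \sup_x \left| F_n(x) - F(x) \right|
  = \max_{1 \le k \le n} \max\!\left( \frac{k}{n} - F\!\left(x_{(k)}\right),\; F\!\left(x_{(k)}\right) - \frac{k-1}{n} \right)
$$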

Instructions

- Compute the values of the concave corners of the formal ECDF for `data1` using `dcst.ecdf()`. Store the results in the variables `x` and `y`.
- Use `dcst.ecdf_formal()` to compute the values of the theoretical CDF, determined from `data2`, at the corner positions `x`. Store the result in the variable `cdf`.
- Compute the distances between the concave corners of the formal ECDF and the theoretical CDF. Store the result as `D_top`.
- Compute the distances between the convex corners of the formal ECDF and the theoretical CDF. Note that you will need to subtract `1/len(data1)` from `y` to get the `y`-value at the convex corner. Store the result in `D_bottom`.
- Return the K-S statistic as the maximum of all entries in `D_top` and `D_bottom`. You can pass `D_top` and `D_bottom` together as a tuple to `np.max()` to do this.
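The steps above can be sketched as follows. This is one possible solution under stated assumptions, not the official one: so that the block runs without the course's `dcst` module, the helpers `ecdf()` and `ecdf_formal()` are hypothetical NumPy stand-ins for `dcst.ecdf()` and `dcst.ecdf_formal()`, and the function name `ks_stat` is chosen here for illustration.

```python
import numpy as np


def ecdf(data):
    """Assumed stand-in for dcst.ecdf(): sorted x, and y = k/n at the top of each step."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


def ecdf_formal(x, data):
    """Assumed stand-in for dcst.ecdf_formal(): fraction of `data` that is <= each x."""
    data = np.sort(data)
    return np.searchsorted(data, x, side="right") / len(data)


def ks_stat(data1, data2):
    """K-S statistic of data1 against the distribution sampled by data2."""
    # Corner positions x and the tops of the steps y (concave corners)
    x, y = ecdf(data1)

    # Theoretical CDF, approximated from data2, evaluated at the corner positions
    cdf = ecdf_formal(x, data2)

    # Distances between the concave corners and the theoretical CDF
    D_top = y - cdf

    # Distances between the convex corners (one step lower, y - 1/n) and the CDF
    D_bottom = cdf - y + 1 / len(data1)

    # Largest distance over both sets of corners
    return np.max((D_top, D_bottom))


# Usage: compare 100 Normal draws with 10,000 draws from the same distribution
rng = np.random.default_rng(0)
print(ks_stat(rng.normal(size=100), rng.normal(size=10000)))
```

Passing the tuple `(D_top, D_bottom)` to `np.max()` works because both arrays have the same length, so NumPy stacks them into a single two-row array before taking the maximum.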