ComenzarEmpieza gratis

Computing the value of a formal ECDF

To be able to do the Kolmogorov-Smirnov test, we need to compute the value of a formal ECDF at arbitrary points. In other words, we need a function, ecdf_formal(x, data) that returns the value of the formal ECDF derived from the dataset data for each value in the array x. Two of the functions accomplish this. One will not. Of the two that do the calculation correctly, one is faster. Label each.

As a reminder, the ECDF is formally defined as ECDF(x) = (number of samples ≤ x) / (total number of samples). You also might want to check out the doc string of np.searchsorted().

a)

def ecdf_formal(x, data):
    return np.searchsorted(np.sort(data), x) / len(data)

b)

def ecdf_formal(x, data):
    return np.searchsorted(np.sort(data), x, side='right') / len(data)

c)

def ecdf_formal(x, data):
    output = np.empty(len(x))

    data = np.sort(data)

    for i, x_val in x:
        j = 0
        while j < len(data) and x_val >= data[j]:
            j += 1

        output[i] = j

    return output / len(data)

Este ejercicio forma parte del curso

Case Studies in Statistical Thinking

Ver curso

Ejercicio interactivo práctico

Pon en práctica la teoría con uno de nuestros ejercicios interactivos

Empezar ejercicio