Random row selection
In this exercise, you will compare the two methods described for selecting random rows (entries) with replacement in a pandas DataFrame:
- The built-in pandas function .sample()
- The NumPy random integer generator np.random.randint()
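To make the two options concrete, here is a minimal sketch of both calls. It assumes a small synthetic DataFrame named `df` with made-up columns as a stand-in for the exercise data, and it fixes the random seeds only so the result is repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the poker_hands DataFrame used in the exercise
df = pd.DataFrame({"hand_id": np.arange(1000), "rank": np.arange(1000) % 13})
n_rows = df.shape[0]
n_draws = int(0.75 * n_rows)

# Method 1: pandas .sample() with replace=True
pandas_draw = df.sample(n=n_draws, replace=True, random_state=0)

# Method 2: random integer positions from NumPy, passed to .iloc
np.random.seed(0)
positions = np.random.randint(low=0, high=n_rows, size=n_draws)
numpy_draw = df.iloc[positions]

# Sampling with replacement allows the same row to be drawn more than once
has_repeats = bool(
    pandas_draw.index.duplicated().any() or numpy_draw.index.duplicated().any()
)
```

Note that `high` is exclusive in `np.random.randint()`, so passing the row count keeps every position valid for `.iloc`.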
In statistics and machine learning, a common practice when training an algorithm is to fit it on about 75% of the available data and then test its performance on the remaining 25%.
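The 75/25 idea can be sketched as follows. This is only an illustration on hypothetical data with made-up column names; unlike the sampling done in this exercise, a train/test split like this one draws rows without replacement so that the two parts do not overlap.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are for illustration only
data = pd.DataFrame({"feature": np.arange(100), "label": np.arange(100) % 2})

rng = np.random.default_rng(seed=42)
shuffled = rng.permutation(len(data))  # random ordering of row positions, no repeats
cut = int(0.75 * len(data))

train = data.iloc[shuffled[:cut]]  # first 75% of the shuffled positions
test = data.iloc[shuffled[cut:]]   # remaining 25%
```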
For this exercise, we will randomly sample 75% of all the available poker hands using each of the above methods and check which one is faster.
This exercise is part of the course
Writing Efficient Code with pandas
Interactive exercise
Try this exercise by completing the sample code.
# Imports needed below; poker_hands is assumed to be loaded already
import time
import numpy as np
# Extract the number of rows in the dataset
N = poker_hands.shape[0]
# Time the selection of 75% of the dataset's rows
rand_start_time = time.time()
poker_hands.iloc[np.random.randint(____=0, high=____, ____=int(0.75 * N))]
print("Time using NumPy: {} sec".format(time.time() - rand_start_time))
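For readers without access to the exercise environment, the comparison could be run along these lines. This is a sketch under stated assumptions: `poker_hands_demo` is a synthetic stand-in for the real dataset, `time_call` is a hypothetical helper, and `time.perf_counter()` is swapped in for `time.time()` because it is better suited to short measurements. Which method comes out faster may depend on the machine and library versions, so no result is claimed here.

```python
import time

import numpy as np
import pandas as pd

# Synthetic stand-in for poker_hands (an assumption; the real dataset is not shown here)
poker_hands_demo = pd.DataFrame(
    np.random.randint(1, 14, size=(25_000, 5)),
    columns=[f"C{i}" for i in range(1, 6)],
)
N = poker_hands_demo.shape[0]
k = int(0.75 * N)


def time_call(func, repeats=5):
    """Return the best wall-clock time of `func` over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


# NumPy: draw k random row positions with replacement, then index with .iloc
numpy_time = time_call(
    lambda: poker_hands_demo.iloc[np.random.randint(low=0, high=N, size=k)]
)
# pandas: draw k rows with replacement using .sample()
pandas_time = time_call(lambda: poker_hands_demo.sample(n=k, replace=True))

print("Time using NumPy: {:.6f} sec".format(numpy_time))
print("Time using pandas .sample(): {:.6f} sec".format(pandas_time))
```

Taking the best of several runs reduces the effect of one-off delays, which matters when each call takes only a few milliseconds.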