Get startedGet started for free

Random row selection

In this exercise, you will compare the two methods described for selecting random rows (entries) with replacement in a pandas DataFrame:

  • The built-in pandas function .random()
  • The NumPy random integer number generator np.random.randint()

Generally, in the fields of statistics and machine learning, when we need to train an algorithm, we train the algorithm on the 75% of the available data and then test the performance on the remaining 25% of the data.

For this exercise, we will randomly sample the 75% percent of all the played poker hands available, using each of the above methods, and check which method is more efficient in terms of speed.

This exercise is part of the course

Writing Efficient Code with pandas

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Extract number of rows in dataset
N=poker_hands.shape[0]

# Select and time the selection of the 75% of the dataset's rows
rand_start_time = time.time()
poker_hands.iloc[np.random.randint(____=0, high=____, ____=int(0.75 * N))]
print("Time using Numpy: {} sec".format(time.time() - rand_start_time))
Edit and Run Code