Blocking experimental data

You are working with a manufacturing firm that wants to conduct some experiments on worker productivity. Their dataset only contains 100 rows, so it's important that experimental groups are balanced.

This sounds like a great opportunity to use your knowledge of blocking to assist them. They have provided a productivity_subjects DataFrame. Split the provided dataset into two even groups of 50 entries each.

The libraries numpy and pandas have been imported as np and pd, respectively.

This exercise is part of the course

Experimental Design in Python

Exercise instructions

  • Randomly select 50 subjects from the productivity_subjects DataFrame into a new DataFrame block_1 without replacement.
  • Set a new column, block, to 1 for the block_1 DataFrame.
  • Assign the remaining subjects to a DataFrame called block_2 and set the block column to 2 for this DataFrame.
  • Concatenate the blocks together into a single DataFrame, and print the count of each value in the block column to confirm the blocking worked.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Randomly assign half
block_1 = productivity_subjects.____(____, random_state=42, ____)

# Set the block column
block_1['block'] = ____

# Create second assignment and label
block_2 = ____
block_2['block'] = ____

# Concatenate and print
productivity_combined = pd.____([block_1, block_2], axis=0)
print(productivity_combined['block'].value_counts())
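
For reference, here is one way the four steps could be carried out end to end. This is a minimal sketch, not the official solution: the real productivity_subjects DataFrame is not shown in the exercise, so the example builds a hypothetical 100-row stand-in with made-up subject_id and productivity_score columns. It assumes the blanks are meant for DataFrame.sample() with frac and replace, DataFrame.drop() on the sampled index, and pd.concat().

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for productivity_subjects (the real columns are not
# shown in the exercise; subject_id and productivity_score are assumptions)
rng = np.random.default_rng(0)
productivity_subjects = pd.DataFrame({
    "subject_id": np.arange(1, 101),
    "productivity_score": rng.normal(loc=50, scale=10, size=100).round(1),
})

# Randomly assign half of the subjects, sampling without replacement
block_1 = productivity_subjects.sample(frac=0.5, random_state=42, replace=False)

# Set the block column
block_1["block"] = 1

# The remaining subjects are the rows whose index was not sampled
block_2 = productivity_subjects.drop(block_1.index)
block_2["block"] = 2

# Concatenate and count the rows in each block
productivity_combined = pd.concat([block_1, block_2], axis=0)
block_counts = productivity_combined["block"].value_counts()
print(block_counts)
```

Dropping by block_1.index relies on the index labels being unique, which holds for the default RangeIndex used here. If the real DataFrame had duplicate index labels, calling reset_index(drop=True) before sampling would be a safer starting point.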