Non-random assignment of subjects
An agricultural firm is conducting an experiment to measure how feeding sheep different types of grass affects their weight. They have asked for your help to properly set up the experiment. One of their managers has said you can perform the subject assignment by taking the top 250 rows from the DataFrame and that should be fine.
Your task is to use your analytical skills to demonstrate why this might not be a good idea. Assign the subjects to two groups using non-random assignment (the first 250 rows) and observe the differences in descriptive statistics.
You have received the DataFrame `weights`, which has a column containing the `weight` of the sheep and a unique `id` column. `numpy` and `pandas` have been imported as `np` and `pd`, respectively.
This exercise is part of the course Experimental Design in Python.
Exercise instructions
- Use DataFrame slicing to put the first 250 rows of `weights` into `group1_non_rand` and the remaining rows into `group2_non_rand`.
- Generate descriptive statistics of the two groups and concatenate them into a single DataFrame.
- Print out to observe the differences.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Non-random assignment
group1_non_rand = ____
group2_non_rand = ____
# Compare descriptive statistics of groups
compare_df_non_rand = ____([group1_non_rand['weight'].____, group2_non_rand['weight'].____], axis=1)
compare_df_non_rand.columns = ['group1', 'group2']
# Print to assess
print(____)
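
One possible way to complete the template is sketched below. The real `weights` DataFrame is not included here, so the sketch builds a hypothetical stand-in: 500 rows (an assumed size) whose `weight` values happen to be sorted. That ordering is an assumption chosen to show how taking the top 250 rows can produce unbalanced groups. It is not a claim about the course data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the exercise's `weights` DataFrame.
# Assumptions: 500 sheep, weights drawn from a normal distribution,
# and rows sorted by weight (the situation that makes
# top-of-file assignment risky).
rng = np.random.default_rng(42)
weights = pd.DataFrame({
    "id": np.arange(500),
    "weight": np.sort(rng.normal(loc=70, scale=10, size=500)),
})

# Non-random assignment: first 250 rows vs. the remaining rows
group1_non_rand = weights.iloc[:250, :]
group2_non_rand = weights.iloc[250:, :]

# Compare descriptive statistics of the two groups side by side
compare_df_non_rand = pd.concat(
    [group1_non_rand["weight"].describe(),
     group2_non_rand["weight"].describe()],
    axis=1,
)
compare_df_non_rand.columns = ["group1", "group2"]

# Print to assess
print(compare_df_non_rand)
```

With ordered data like this, the mean of `group1` sits well below the mean of `group2`, so any later difference in weight could not be attributed to the grass alone. If the rows of the real data are already in arbitrary order, the gap may be much smaller.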