Exercise

Summary statistics on different kinds of sample

Now that you have three types of sample (simple, stratified, and cluster), you can compare point estimates from each sample to the population parameter. That is, you can calculate the same summary statistic on each sample and see how it compares to the value of that statistic for the whole population.

Here, you'll look at how satisfaction with the company relates to whether or not an employee leaves the company. That is, you'll calculate the proportion of employees who left the company (those with an Attrition value of "Yes") for each value of RelationshipSatisfaction.
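
A proportion like this can be computed as the mean of a logical vector, because R treats TRUE as 1 and FALSE as 0. The following is a minimal sketch of that idea; the vector `attrition` is a hypothetical toy example, not data from the attrition datasets.

```r
# Hypothetical toy vector, not taken from the attrition data
attrition <- c("Yes", "No", "No", "Yes")

# Comparing to "Yes" gives a logical vector: TRUE FALSE FALSE TRUE
left <- attrition == "Yes"

# mean() counts TRUE as 1 and FALSE as 0, so this is the proportion of "Yes"
prop_left <- mean(left)
print(prop_left)
```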

attrition_pop, attrition_srs, attrition_strat, and attrition_clust are available; dplyr is loaded.

Instructions

  • 1
    • Group the population dataset, attrition_pop, by RelationshipSatisfaction level.
    • Summarize to calculate a column named mean_attrition as the mean of the condition that Attrition is equal to "Yes".
  • 2
    • Calculate the proportion of employee attrition for each relationship satisfaction group, this time on the simple random sample, attrition_srs.
  • 3
    • Calculate the proportion of employee attrition for each relationship satisfaction group, this time on the stratified sample, attrition_strat.
  • 4
    • Calculate the proportion of employee attrition for each relationship satisfaction group, this time on the cluster sample, attrition_clust.
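
The four steps share the same group-then-summarize pattern, which can be sketched as below. This is an illustrative sketch, not the exercise's reference solution: `toy_pop` is a made-up data frame standing in for attrition_pop, and the helper `mean_attrition_by_group()` is a hypothetical name introduced here for convenience.

```r
library(dplyr)

# Hypothetical toy data standing in for attrition_pop (not the real dataset)
toy_pop <- data.frame(
  Attrition = c("Yes", "No", "No", "Yes", "No", "No", "No", "Yes"),
  RelationshipSatisfaction = c(
    "Low", "Low", "Medium", "Medium",
    "High", "High", "Very_High", "Very_High"
  )
)

# Hypothetical helper: proportion of "Yes" within each satisfaction group
mean_attrition_by_group <- function(df) {
  df %>%
    group_by(RelationshipSatisfaction) %>%
    summarize(mean_attrition = mean(Attrition == "Yes"))
}

mean_attrition_toy <- mean_attrition_by_group(toy_pop)
print(mean_attrition_toy)
```

Assuming the datasets from the exercise are loaded, the same helper could be applied in turn to attrition_pop, attrition_srs, attrition_strat, and attrition_clust, so that the per-group proportions from each sample can be compared against the population values.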