Variables inspection
Now that we've added several new variables to abaloneMod, the next set of exercises will explore the quality of the data using summary statistics and graphical visualization.
You will also use the dplyr::filter() function to remove cases (rows in the abaloneMod dataset) that have errors or illogical values. For example, a few abalones have a height of 0 mm, which is not physically possible and is probably the result of typographical or measurement errors.
The abaloneMod dataset has been loaded for you along with the dplyr and ggplot2 packages. After filtering out cases, you will create a new modified copy of the dataset called abaloneKeep, which will hold the final cases kept for analysis in future lessons.
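As a rough illustration of this kind of filtering, here is a minimal sketch on a small made-up data frame. The name toyAbalone and all of its values are hypothetical and are not taken from the course data.

```r
library(dplyr)

# Hypothetical toy data; the values are invented for illustration only
toyAbalone <- data.frame(
  id     = 1:5,
  height = c(0, 15, 20, 0, 30)  # heights in mm; the zeros mimic bad records
)

# Look at the suspect rows before dropping them
toyAbalone %>%
  dplyr::filter(height == 0)

# Keep only the rows with a positive height
toyKeep <- toyAbalone %>%
  dplyr::filter(height > 0)

nrow(toyAbalone)  # number of rows before filtering
nrow(toyKeep)     # number of rows after filtering
```

Writing dplyr::filter() with the package prefix avoids accidentally calling stats::filter(), which has the same name but does something unrelated.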
This exercise is part of the course R For SAS Users.
Exercise instructions
- Get summary statistics for abalone heights.
- Keep cases with heights greater than 0 and assign these cases to a new data frame, abaloneKeep.
- For the abalones kept in abaloneKeep, make a histogram of heights, which should all now be greater than 0.
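The summary and histogram steps can be sketched on toy data as follows. This is only an illustration under stated assumptions: toyKeep is a hypothetical stand-in for an already-filtered dataset, its values are invented, and the binwidth of 5 is an arbitrary choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for already-filtered data (all heights > 0)
toyKeep <- data.frame(height = c(15, 20, 25, 30, 35, 20, 25))

# pull() extracts one column as a plain vector; summary() then reports
# the minimum, quartiles, mean, and maximum of that vector
heightSummary <- toyKeep %>%
  pull(height) %>%
  summary()
heightSummary

# Histogram of the heights; binwidth = 5 is an arbitrary illustrative choice
heightHist <- ggplot(toyKeep, aes(x = height)) +
  geom_histogram(binwidth = 5)
heightHist
```

Checking that the minimum in the summary is above 0 is a quick way to confirm that a filter step worked before looking at the plot.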
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Pull height from abaloneMod and run summary()
___ %>%
___ %>%
___
# Keep cases with height > 0 and assign to abaloneKeep
___ <- ___ %>%
___
# Make histogram of updated heights in abaloneKeep
ggplot(___) +
___