
Master data overview

So far you have combined information from rating and survey datasets with your original dataset.

We also added other employee-related information, such as compensation, no_leaves_taken (the number of vacation days taken), and hiring_source, to the dataset org_final. Explore this dataset before starting feature engineering in the next chapter.

This exercise is part of the course

HR Analytics: Predicting Employee Churn in R


Exercise instructions

  • Use glimpse() to view the structure of the org_final dataset.
  • Assign the number of variables (columns) in the org_final dataset to an object named variables.
  • Generate a box plot to visualize the distribution of distance_from_home for Active and Inactive employees.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# View the structure of the dataset
___

# Number of variables in the dataset
variables <- ___

# Compare the travel distance of Active and Inactive employees
ggplot(org_final, aes(x = ___, y = ___)) +
  ___
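
One possible way to fill in the blanks above is sketched here. The real org_final is only available inside the course environment, so this sketch builds a small hypothetical stand-in with invented values. The column name status (holding "Active" and "Inactive") is an assumption, since the exercise text does not name it; distance_from_home, compensation, and no_leaves_taken come from the text above.

```r
library(dplyr)    # provides glimpse()
library(ggplot2)  # provides ggplot(), aes(), geom_boxplot()

# Hypothetical stand-in for org_final with made-up values.
# The `status` column name is an assumption, not confirmed by the exercise.
org_final <- data.frame(
  status = c("Active", "Active", "Inactive", "Inactive", "Active", "Inactive"),
  distance_from_home = c(5, 12, 25, 30, 8, 18),
  compensation = c(50000, 62000, 48000, 51000, 70000, 45000),
  no_leaves_taken = c(10, 7, 15, 20, 5, 12)
)

# View the structure of the dataset
glimpse(org_final)

# Number of variables (columns) in the dataset
variables <- ncol(org_final)

# Compare the travel distance of Active and Inactive employees
p <- ggplot(org_final, aes(x = status, y = distance_from_home)) +
  geom_boxplot()
print(p)
```

ncol() counts columns, which is what "number of variables" means for a data frame. Mapping the grouping column to x and distance_from_home to y gives one box per employee status.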