Exploring the UCI SECOM data
To round out this chapter and solidify your understanding of bagging, it's time to work with a new dataset! This data comes from a semiconductor manufacturing process and was obtained from the UCI Machine Learning Repository.
Each row represents a production entity. The features are measurements from sensors or points in the process. The labels represent whether the entity passes (1) or fails (-1) the test.
The dataset is loaded and available to you as uci_secom. The target variable is the 'Pass/Fail' column. Use the .value_counts() and .describe() methods to check this variable. What do you notice?
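As a rough sketch of what that check might look like, the snippet below calls both methods on the target column. Because the real uci_secom DataFrame is only preloaded inside the exercise, this example builds a small stand-in with made-up values; the sensor_0 column name and all numbers are assumptions for illustration, not the actual SECOM data.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded `uci_secom` DataFrame.
# All values here are invented for illustration only.
uci_secom = pd.DataFrame({
    "sensor_0": [0.1, 0.4, 0.3, 0.9, 0.5, 0.2, 0.7, 0.6, 0.8, 0.0],
    "Pass/Fail": [1, 1, 1, 1, 1, 1, 1, 1, -1, -1],
})

target = uci_secom["Pass/Fail"]

# How many rows carry each label
counts = target.value_counts()
print(counts)

# Summary statistics (count, mean, std, min, quartiles, max) of the label column
summary = target.describe()
print(summary)
```

With labels coded as 1 and -1, a mean far from 0 in the .describe() output is a quick hint that one class outnumbers the other, which .value_counts() then shows directly.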
This exercise is part of the course Ensemble Methods in Python.
