Exploring the UCI SECOM data
To round out this chapter and solidify your understanding of bagging, it's time to work with a new dataset! This data comes from a semiconductor manufacturing process and was obtained from the UCI Machine Learning Repository.
Each row represents a production entity. The features are measurements from sensors or measurement points in the process. The labels represent whether the entity passes (-1) or fails (1) the test.
The dataset is loaded and available to you as uci_secom. The target variable is the 'Pass/Fail' column. Use the .value_counts() and .describe() methods to check this variable. What do you notice?
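As a rough sketch of that check: the snippet below builds a tiny, made-up stand-in for uci_secom so it can run on its own. The 'sensor_0' column and every value in it are hypothetical; in the exercise you would skip the construction step and use the preloaded DataFrame directly.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded uci_secom DataFrame.
# The sensor column name and all values are invented for illustration.
uci_secom = pd.DataFrame({
    "sensor_0": [0.1, 0.4, 0.3, 0.9, 0.5, 0.2, 0.7, 0.6, 0.8, 0.35],
    "Pass/Fail": [-1, -1, -1, -1, -1, -1, -1, -1, -1, 1],
})

target = uci_secom["Pass/Fail"]

# Absolute and relative frequency of each label
counts = target.value_counts()
shares = target.value_counts(normalize=True)
print(counts)
print(shares)

# Summary statistics (count, mean, std, min, quartiles, max)
summary = target.describe()
print(summary)
```

Comparing the counts of the two labels, or their shares via normalize=True, is a quick way to see whether one class heavily outnumbers the other.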
This exercise is part of the course
Ensemble Methods in Python