
Training and testing a classification model with scikit-learn

1. Training and testing a classification model with scikit-learn

In this video, we'll use the features we have extracted to train and test a supervised classification model.

2. Naive Bayes classifier

A Naive Bayes model is commonly used for NLP classification problems because of its basis in probability. The Naive Bayes algorithm attempts to answer the question: given a particular piece of data, how likely is a particular outcome? For example, thinking back to our movie genres dataset: if the plot mentions a spaceship, how likely is it that the movie is Sci-Fi? And given a spaceship and an alien, how likely is it NOW a Sci-Fi movie? Each word from our CountVectorizer acts as a feature, helping classify our text using probability. Naive Bayes has been used for text classification problems since the 1960s and continues to be used today despite the growth of many other models, algorithms, and neural network architectures. That said, it is not always the best tool for the job, but it is a simple and effective one that you will use to build a fake news classifier.
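To make that probability question concrete, here is a minimal sketch using made-up counts (not the course data): it applies Bayes' theorem by hand to estimate how likely a plot mentioning "spaceship" is to belong to a Sci-Fi movie. All numbers and variable names are hypothetical and only illustrate the reasoning.

```python
# Hypothetical counts, purely to illustrate the Bayes reasoning described above
n_scifi, n_action = 300, 700                         # movies per genre
spaceship_in_scifi, spaceship_in_action = 150, 14    # plots mentioning "spaceship"

# Prior probabilities P(genre)
p_scifi = n_scifi / (n_scifi + n_action)
p_action = n_action / (n_scifi + n_action)

# Likelihoods P("spaceship" | genre)
p_word_given_scifi = spaceship_in_scifi / n_scifi
p_word_given_action = spaceship_in_action / n_action

# Bayes' theorem: P(Sci-Fi | "spaceship")
numerator = p_word_given_scifi * p_scifi
denominator = numerator + p_word_given_action * p_action
print(numerator / denominator)   # roughly 0.91 with these made-up counts
```

MultinomialNB does essentially this for every word in the vocabulary at once, multiplying the per-word likelihoods together under the "naive" assumption that words are independent given the genre.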

3. Naive Bayes with scikit-learn

We'll use scikit-learn's Naive Bayes implementation to take a look at our Sci-Fi versus Action plot classification problem. Recall that the data we're using is simply IMDb plot summaries and whether each movie is science fiction or action. First, we import the Naive Bayes model class, Multinomial Naive Bayes, which works well with CountVectorizers because it expects integer inputs. MultinomialNB also handles multi-class classification, where there are more than two labels. This model may not work as well with floats, such as TF-IDF weighted inputs; for those, consider support vector machines or linear models instead, although I recommend trying Naive Bayes first to see whether it also works well. We use the metrics module to evaluate model performance. We initialize our class and call fit with the training data. If you recall from the previous video, this determines the model's internal parameters based on the dataset. We pass the training count vectors first and the training labels second. After fitting the model, we call predict with the count vectors of the test data. Predict uses the trained model to predict labels from the test vectors. We save the predicted labels in the variable pred to test accuracy. Finally, we measure accuracy using accuracy_score from the metrics module, passing the test labels and the predicted labels. Accuracy for our model means the percentage of correct genre guesses out of total guesses. Our model has about 86% accuracy, which is pretty good for a first try! You'll be applying the Multinomial Naive Bayes classifier to the fake news dataset in the following exercises.
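As a runnable sketch of the workflow described above, the code below builds count vectors, fits MultinomialNB, predicts, and scores accuracy. The tiny plots list, the genre labels, and names such as count_train and nb_classifier are stand-ins, not the course's IMDb dataset, so the printed accuracy will not match the 86% from the video.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# Hypothetical stand-in for the IMDb plot summaries and genre labels
plots = [
    "A spaceship crew battles an alien on a distant planet",
    "A detective chases a gang of bank robbers through the city",
    "Robots from the future hunt a rebel scientist",
    "An ex-soldier fights mercenaries to rescue hostages",
]
genres = ["Sci-Fi", "Action", "Sci-Fi", "Action"]

# Split plots and labels into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    plots, genres, test_size=0.5, random_state=42, stratify=genres
)

# Build bag-of-words count vectors, as in the previous video
count_vectorizer = CountVectorizer(stop_words="english")
count_train = count_vectorizer.fit_transform(X_train)
count_test = count_vectorizer.transform(X_test)

# Fit the classifier on the training vectors and labels, then predict
nb_classifier = MultinomialNB()
nb_classifier.fit(count_train, y_train)
pred = nb_classifier.predict(count_test)

# Accuracy: share of correct genre predictions on the test set
print(metrics.accuracy_score(y_test, pred))
```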

4. Confusion matrix

To further evaluate our model, we can also check the confusion matrix, which shows correct and incorrect labels. The confusion_matrix function from the metrics module takes the test labels, the predictions, and a list of labels. If the label list is not passed, scikit-learn orders the labels in sorted order. The confusion matrix is a bit easier to read when we transform it into a table. The first and last values of the matrix (its main diagonal) show the correct classifications of both Action and Sci-Fi films based on the plot bag-of-words vectors. In a confusion matrix, the predicted labels are shown across the top and the true labels down the side. This confusion matrix shows 864 Sci-Fi movies incorrectly labeled as Action and 563 Action movies incorrectly labeled as Sci-Fi. We can see from the distribution of true positives and negatives that our dataset is a bit skewed: we have many more Action films than Sci-Fi. This could be one reason that our Action movies are predicted more accurately.
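Here is a small self-contained sketch of the confusion_matrix call; the y_test and pred lists are toy values rather than the course results, so the cell counts will not match the numbers discussed above.

```python
from sklearn.metrics import confusion_matrix

# Toy true and predicted genre labels (not the course data)
y_test = ["Action", "Action", "Sci-Fi", "Sci-Fi", "Action"]
pred = ["Action", "Sci-Fi", "Sci-Fi", "Action", "Action"]

# Rows follow the true labels and columns the predicted labels,
# in the order given by the labels argument; without it, scikit-learn
# sorts the label values.
print(confusion_matrix(y_test, pred, labels=["Action", "Sci-Fi"]))
```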

5. Let's practice!

Now it's your turn to train and test a Naive Bayes model for the fake news problem!