Improving your model
Your job in this exercise is to test a few different alpha levels using the TF-IDF vectors to determine whether there is a better-performing combination.
The training and test sets have been created, and tfidf_vectorizer, tfidf_train, and tfidf_test have been computed.
This exercise is part of the course
Introduction to Natural Language Processing in Python
Exercise instructions
- Create a list of alphas to try using np.arange(). Values should range from 0 to 1 with steps of 0.1.
- Create a function train_and_predict() that takes in one argument: alpha. The function should:
  - Instantiate a MultinomialNB classifier with alpha=alpha.
  - Fit it to the training data.
  - Compute predictions on the test data.
  - Compute and return the accuracy score.
- Using a for loop, print the alpha, score and a newline in between. Use your train_and_predict() function to compute the score. Does the score change along with the alpha? What is the best alpha?
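A note on the first step: np.arange() excludes its stop value, so a range "from 0 to 1 with steps of 0.1" yields ten values ending at 0.9, not eleven ending at 1. The short sketch below assumes the intended call is np.arange(0, 1, 0.1); that exact call is an assumption, not something the instructions spell out.

```python
import numpy as np

# Assumed call: start at 0, stop before 1, step 0.1.
alphas = np.arange(0, 1, 0.1)

# The stop value is excluded, so the last alpha is approximately 0.9.
print(len(alphas))
print(alphas.round(1))
```

Because the steps are floating-point values, the entries are only approximately multiples of 0.1, which is why the sketch rounds them before printing.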
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the list of alphas: alphas
alphas = ____
# Define train_and_predict()
def ____(____):
    # Instantiate the classifier: nb_classifier
    nb_classifier = ____
    # Fit to the training data
    ____
    # Predict the labels: pred
    pred = ____
    # Compute accuracy: score
    score = ____
    return score
# Iterate over the alphas and print the corresponding score
for alpha in alphas:
    print('Alpha: ', alpha)
    print('Score: ', ____)
    print()
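
For readers without access to the course environment, here is a minimal, self-contained sketch of how the steps above might fit together. It is not the course's reference solution. The toy corpus, the labels, and the names train_texts, test_texts, y_train, and y_test are all hypothetical stand-ins for the prepared data; only tfidf_vectorizer, tfidf_train, tfidf_test, and train_and_predict() come from the exercise text. Note that alpha=0 disables smoothing entirely, and scikit-learn may emit a warning for that value depending on the installed version.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the course's prepared train/test split.
train_texts = [
    "the team won the match with a late goal",
    "the striker scored twice in the final game",
    "the coach praised the players after the win",
    "the senate passed the budget bill today",
    "the president signed the new tax law",
    "voters will elect a new governor next month",
]
y_train = ["sports"] * 3 + ["politics"] * 3
test_texts = [
    "the players celebrated the goal",
    "the governor vetoed the tax bill",
]
y_test = ["sports", "politics"]

# Fit the vectorizer on the training text only, then reuse it for the test text.
tfidf_vectorizer = TfidfVectorizer()
tfidf_train = tfidf_vectorizer.fit_transform(train_texts)
tfidf_test = tfidf_vectorizer.transform(test_texts)


def train_and_predict(alpha):
    """Train MultinomialNB with the given alpha and return test accuracy."""
    nb_classifier = MultinomialNB(alpha=alpha)
    nb_classifier.fit(tfidf_train, y_train)
    pred = nb_classifier.predict(tfidf_test)
    return accuracy_score(y_test, pred)


# Assumed range: 0 up to (but not including) 1 in steps of 0.1.
alphas = np.arange(0, 1, 0.1)

scores = []
for alpha in alphas:
    score = train_and_predict(alpha)
    scores.append(score)
    print('Alpha: ', alpha)
    print('Score: ', score)
    print()
```

On a corpus this small the scores say little about which alpha is best; the point is only the shape of the loop. With the course data, comparing the printed scores answers the question the instructions pose.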