Improving your model

In this exercise, you will test several values of alpha (the smoothing parameter of the naive Bayes classifier) with the TF-IDF vectors to see whether any of them gives a better-performing model.

The training and test sets have been created, and tfidf_vectorizer, tfidf_train, and tfidf_test have been computed.

This exercise is part of the course

Introduction to Natural Language Processing in Python


Exercise instructions

Create a list of alphas to try using np.arange(). Values should start at 0 and go up to, but not include, 1, in steps of 0.1.
Create a function train_and_predict() that takes in one argument: alpha. The function should:
- Instantiate a MultinomialNB classifier with alpha=alpha.
- Fit it to the training data.
- Compute predictions on the test data.
- Compute and return the accuracy score.
Using a for loop, print each alpha and its score, followed by a blank line. Use your train_and_predict() function to compute the score. Does the score change with alpha? Which alpha gives the best score?
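
As a small sketch of the first instruction, assuming NumPy is imported as np: np.arange() leaves out its stop value, so stepping by 0.1 from 0 to 1 gives ten candidates, 0.0 through 0.9. The second array, alphas_with_one, is a hypothetical variant shown only for comparison and is not something the exercise asks for.

```python
import numpy as np

# np.arange() excludes the stop value: ten alphas, 0.0 through 0.9.
alphas = np.arange(0, 1, 0.1)
print(alphas)

# Hypothetical variant (not required by the exercise) that also includes 1.0.
# np.linspace() takes the number of points, so the endpoint is hit exactly.
alphas_with_one = np.linspace(0, 1, 11)
print(alphas_with_one)
```

Because the step is a floating-point number, some printed values may show small rounding artifacts rather than clean one-decimal numbers.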

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Create the list of alphas: alphas
alphas = ____

# Define train_and_predict()
def ____(____):
    # Instantiate the classifier: nb_classifier
    nb_classifier = ____
    # Fit to the training data
    ____
    # Predict the labels: pred
    pred = ____
    # Compute accuracy: score
    score = ____
    return score

# Iterate over the alphas and print the corresponding score
for alpha in alphas:
    print('Alpha: ', alpha)
    print('Score: ', ____)
    print()
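
One way the blanks could be filled in is sketched below. To stay self-contained it builds a tiny, made-up corpus and recomputes tfidf_vectorizer, tfidf_train, and tfidf_test, which the exercise already provides. The label names y_train and y_test are assumptions, since the text above does not name them, and the scores on toy data carry no meaning.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Made-up stand-in corpus; the exercise supplies real data instead.
texts = [
    "stocks rally as markets rise", "team wins the final match",
    "investors buy shares and bonds", "coach praises players after game",
    "bank raises interest rates", "striker scores twice in the match",
    "markets fall on rate fears", "fans cheer the winning team",
]
labels = ["finance", "sport"] * 4

# y_train and y_test are assumed names for the label splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the vectorizer on the training text only, then transform the test text.
tfidf_vectorizer = TfidfVectorizer()
tfidf_train = tfidf_vectorizer.fit_transform(X_train)
tfidf_test = tfidf_vectorizer.transform(X_test)

# Create the list of alphas: 0.0, 0.1, ..., 0.9
alphas = np.arange(0, 1, 0.1)

def train_and_predict(alpha):
    """Fit MultinomialNB with the given alpha and return test accuracy."""
    nb_classifier = MultinomialNB(alpha=alpha)
    nb_classifier.fit(tfidf_train, y_train)
    pred = nb_classifier.predict(tfidf_test)
    score = accuracy_score(y_test, pred)
    return score

# Iterate over the alphas and print the corresponding score
for alpha in alphas:
    print('Alpha: ', alpha)
    print('Score: ', train_and_predict(alpha))
    print()
```

The first value in the list is alpha=0, which means no smoothing at all. Depending on the scikit-learn version, that setting may produce a warning about numeric errors, so it is worth reading the score for alpha=0 with some caution.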
