How many positive and negative reviews are there?
As in other data science problems, a good first step in a sentiment analysis task is to explore the dataset in more detail.
You will work with a sample of IMDB movie reviews. A dataset called movies has been created for you; it is a sample of the data we saw in the slides. Feel free to explore it in the IPython Shell, for example by calling the .head() method.
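As a rough illustration of that kind of exploration, here is a minimal sketch. The real movies dataset is preloaded in the exercise environment, so this example builds a small hypothetical stand-in. The label column name comes from the sample code in this exercise; the review column name, the review texts, and the label values are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the preloaded `movies` DataFrame.
# The `review` column name and all values are assumptions, not the real data.
movies = pd.DataFrame({
    "review": [
        "A wonderful film with a moving story.",
        "Dull plot and wooden acting.",
        "Great cast, great soundtrack.",
        "I walked out halfway through.",
        "One of the best movies this year.",
    ],
    "label": [1, 0, 1, 0, 1],  # assumed encoding: 1 = positive, 0 = negative
})

# Show the first rows to see which columns are present and what they contain
print(movies.head())

# Number of rows and columns
print(movies.shape)
```

Inspecting the first few rows and the shape is usually enough to confirm which column holds the text and which holds the sentiment label before counting anything.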
Be aware that this exercise uses real data, and as such there is always a risk that it may contain profanity or other offensive content (in this exercise, and any following exercises that also use real data).
This exercise is part of the course Sentiment Analysis in Python
Exercise instructions
- Find the number of positive and negative reviews in the movies dataset.
- Find the proportion (percentage) of positive and negative reviews in the dataset.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Find the number of positive and negative reviews
print('Number of positive and negative reviews: ', movies.label.____)
# Find the proportion of positive and negative reviews
print('Proportion of positive and negative reviews: ', movies.label.____ / ____(movies))
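For reference, one common way to count categories in a pandas column, and to turn those counts into proportions, is sketched below on toy data. The toy DataFrame and its values are hypothetical and are not the exercise data. This is one possible approach, not necessarily the exact solution the exercise expects.

```python
import pandas as pd

# Hypothetical toy data: 1 = positive, 0 = negative (assumed encoding)
toy = pd.DataFrame({"label": [1, 0, 1, 1, 0]})

# Count how many times each label occurs
counts = toy.label.value_counts()
print("Counts per label:")
print(counts)

# Divide the counts by the number of rows to get proportions
proportions = counts / len(toy)
print("Proportion per label:")
print(proportions)

# pandas can also compute the proportions directly
print(toy.label.value_counts(normalize=True))
```

Dividing by len() of the DataFrame makes the arithmetic explicit, while value_counts(normalize=True) gives the same proportions in a single call.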