Introducing the movie database
Throughout this chapter, you'll be working with the TMDb (The Movie Database). This contains metadata on around 5000 movies.
The dataset is loaded and available to you as movies
.
Your main objective is to predict movie revenue - more specifically, log-revenue, which is the normalized version of the revenue
feature.
Use the .describe()
method to explore the variable target
, containing the values shown in the histogram. You can also inspect the histogram to the right. What can you conclude?
This exercise is part of the course
Ensemble Methods in Python
Hands-on interactive exercise
Turn theory into action with one of our interactive exercises
