Exploring the 20 News Groups dataset
In this exercise, you will be given a sample of the 20 News Groups dataset obtained using the fetch_20newsgroups() function from sklearn.datasets, filtered to only three classes: sci.space, alt.atheism and soc.religion.christian.
The dataset is loaded in the variable news_dataset. Its attributes are printed so you can explore them on the console.
For more details on how to use this function, see the Sklearn documentation.
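As a rough sketch of how such a dataset could be obtained outside the exercise environment, the helper below calls fetch_20newsgroups() with the three categories named above. The helper name load_news_sample and the choice of subset="train" are assumptions made for illustration; the exercise does not say how its sample was drawn, so the result will not necessarily match the preloaded news_dataset.

```python
# The three classes named in the exercise text.
CATEGORIES = ["sci.space", "alt.atheism", "soc.religion.christian"]


def load_news_sample(subset="train"):
    """Fetch the 20 News Groups data restricted to CATEGORIES.

    Hypothetical helper: the subset used by the exercise is an assumption.
    """
    # Imported here so scikit-learn is only needed when the data is fetched.
    from sklearn.datasets import fetch_20newsgroups

    # Downloads the data on first use and caches it locally afterwards.
    return fetch_20newsgroups(subset=subset, categories=CATEGORIES)


# Usage (requires scikit-learn and, on first call, network access):
# news_dataset = load_news_sample()
# print(list(news_dataset.keys()))   # names of the available attributes
# print(news_dataset.target_names)   # class names for the integer targets
```

The returned object behaves like a dictionary whose keys are also accessible as attributes, which is why the attribute names can be listed before deciding which one to inspect.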
You will tokenize the texts and one-hot encode the labels step by step to understand how the transformations happen.
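To make the two transformations concrete, here is a minimal plain-Python sketch of tokenizing texts into integer sequences and one-hot encoding integer labels. It is an illustration under stated assumptions, not the course's solution: the toy texts and labels are made up, and the helper functions build_word_index, texts_to_sequences and one_hot are hypothetical stand-ins that only loosely mirror what a library tokenizer and label encoder would do.

```python
from collections import Counter

# Hypothetical toy data, not taken from the real dataset.
texts = [
    "the rocket reached orbit",
    "the debate about belief",
    "the church and the choir",
]
labels = [0, 1, 2]  # one integer class id per text
num_classes = 3


def build_word_index(corpus):
    """Map each word to an integer id, most frequent first (ids start at 1)."""
    counts = Counter(word for text in corpus for word in text.lower().split())
    # most_common() keeps first-seen order among equally frequent words.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(corpus, word_index):
    """Replace every known word in each text by its integer id."""
    return [
        [word_index[word] for word in text.lower().split() if word in word_index]
        for text in corpus
    ]


def one_hot(label_ids, n_classes):
    """Turn each class id into a vector with a single 1 at that position."""
    return [[1 if j == label else 0 for j in range(n_classes)] for label in label_ids]


word_index = build_word_index(texts)
sequences = texts_to_sequences(texts, word_index)
encoded_labels = one_hot(labels, num_classes)

print(word_index)
print(sequences)       # [[1, 2, 3, 4], [1, 5, 6, 7], [1, 8, 9, 1, 10]]
print(encoded_labels)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Each text becomes a list of word ids of varying length, while each label becomes a fixed-length vector with one entry per class.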
This exercise is part of the course Recurrent Neural Networks (RNNs) for Language Modeling with Keras.
Have a go at this exercise by completing this sample code.
# See example article
print(news_dataset.____[5])