Carrier
In this exercise you'll create a StringIndexer and a OneHotEncoder to code the carrier column. To do this, you'll call the class constructors with the arguments inputCol and outputCol.
The inputCol is the name of the column you want to index or encode, and the outputCol is the name of the new column that the Transformer should create.
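To make the two steps concrete, here is a minimal pure-Python sketch of what indexing and one-hot encoding do to a string column. It is an illustration only, not Spark's implementation: the helper names index_labels and one_hot and the toy carrier list are hypothetical. It assumes the defaults as commonly documented, where the most frequent label gets index 0 and the encoder drops the last category.

```python
from collections import Counter


def index_labels(values):
    """Map each distinct string to an integer index, most frequent label first."""
    counts = Counter(values)
    # Sort by descending frequency; break ties alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: idx for idx, label in enumerate(ordered)}


def one_hot(index, num_categories, drop_last=True):
    """Turn a category index into a 0/1 vector, optionally dropping the last category."""
    size = num_categories - 1 if drop_last else num_categories
    vec = [0.0] * size
    if index < size:
        vec[index] = 1.0  # the dropped last category stays all zeros
    return vec


# Hypothetical toy data standing in for the carrier column.
carriers = ["AA", "UA", "AA", "DL", "UA", "AA"]

mapping = index_labels(carriers)                   # plays the role of StringIndexer
carrier_index = [mapping[c] for c in carriers]     # values of the indexed column
carrier_fact = [one_hot(i, len(mapping)) for i in carrier_index]  # values of the encoded column

for row in zip(carriers, carrier_index, carrier_fact):
    print(row)
```

The input column here is the list of carrier strings, and each step produces a new column rather than overwriting the old one, which mirrors the inputCol and outputCol arguments described above.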
This exercise is part of the course Foundations of PySpark.
Exercise instructions
- Create a StringIndexer called carr_indexer by calling StringIndexer() with inputCol="carrier" and outputCol="carrier_index".
- Create a OneHotEncoder called carr_encoder by calling OneHotEncoder() with inputCol="carrier_index" and outputCol="carrier_fact".
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a StringIndexer
carr_indexer = StringIndexer(____)
# Create a OneHotEncoder
carr_encoder = OneHotEncoder(____)