
Encoding shirt sizes

You have data for a consignment of t-shirts. The data includes the size of the shirt, which is given as either S, M, L or XL.

Here are the counts for the different sizes:

+----+-----+
|size|count|
+----+-----+
|   S|    8|
|   M|   15|
|   L|   20|
|  XL|    7|
+----+-----+

The sizes are first converted to an index using StringIndexer and then one-hot encoded using OneHotEncoder.
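To make the two steps concrete, here is a minimal pure-Python sketch of what they would produce for these counts. It assumes the PySpark defaults: StringIndexer orders labels by descending frequency, and OneHotEncoder drops the last category. The helpers `frequency_index` and `one_hot` are hypothetical names written for illustration, not part of PySpark.

```python
from collections import Counter

# Size counts taken from the table above.
counts = Counter({"S": 8, "M": 15, "L": 20, "XL": 7})

def frequency_index(counts):
    """Mimic StringIndexer's default ordering: most frequent label gets index 0."""
    # Ties are broken alphabetically here; no ties occur in this data.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}

def one_hot(index, n_categories, drop_last=True):
    """Mimic OneHotEncoder: with drop_last, the final category is all zeros."""
    size = n_categories - 1 if drop_last else n_categories
    vector = [0.0] * size
    if int(index) < size:
        vector[int(index)] = 1.0
    return vector

indices = frequency_index(counts)
encoded = {label: one_hot(idx, len(indices)) for label, idx in indices.items()}

# Show the index and encoded vector assigned to each size.
for label, idx in indices.items():
    print(label, idx, encoded[label])
```

Under these assumptions L, the most frequent size, gets index 0 and XL, the least frequent, gets index 3. Because the last category is dropped, each vector has three slots and XL is represented by all zeros.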

Which of the following statements is not true?

This exercise is part of the course

Machine Learning with PySpark
