Binning values
For many continuous variables, you care less about the exact value in a numeric column than about the bucket it falls into. Binning can be useful when plotting values or when simplifying your machine learning models. It is mostly used on continuous variables where precision is not the biggest concern, e.g. age, height, wages.
Bins are created using pd.cut(df['column_name'], bins), where bins can be an integer specifying the number of evenly spaced bins, or a list of bin boundaries.
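As a rough illustration of the two forms of the bins argument, here is a minimal sketch on toy data. The DataFrame, the 'age' column, and the label names are all hypothetical and are not part of the exercise dataset.

```python
import pandas as pd

# Hypothetical toy data; the 'age' column is an assumption for illustration only.
df = pd.DataFrame({'age': [3, 17, 25, 42, 68, 90]})

# Integer bins: split the observed range of 'age' into 3 equal-width intervals.
df['equal_binned'] = pd.cut(df['age'], 3)

# List of boundaries: intervals are (0, 18], (18, 65], (65, 100] (right edge inclusive by default).
# The labels are made up for this example.
df['boundary_binned'] = pd.cut(df['age'], [0, 18, 65, 100],
                               labels=['child', 'adult', 'senior'])

print(df)
```

With an integer, pandas picks the bin edges from the minimum and maximum of the data; with a list, you control the edges yourself, which is usually preferable when the boundaries carry meaning.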
This exercise is part of the course
Feature Engineering for Machine Learning in Python
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Bin the continuous variable ConvertedSalary into 5 bins
so_survey_df['equal_binned'] = ____(so_survey_df['ConvertedSalary'], ____)
# Print the first 5 rows of the equal_binned column
print(so_survey_df[['equal_binned', 'ConvertedSalary']].head())