Exercise

Writing out your results to a csv for submission

At last, you're ready to submit some predictions for scoring. In this exercise, you'll write your predictions to a .csv using the .to_csv() method on a pandas DataFrame. Then you'll evaluate your performance according to the LogLoss metric discussed earlier!

You'll need to make sure your submission obeys the correct format.

To do this, you'll use your predicted values to create a new DataFrame, prediction_df.

Interpreting LogLoss & Beating the Benchmark:

When interpreting your log loss score, keep in mind that the score will change based on the number of samples tested. To get a sense of how this very basic model performs, compare your score to the DrivenData benchmark model, which merely submitted uniform probabilities for each class and scored 2.0455.

Remember, the lower the log loss the better. Is your model's log loss lower than 2.0455?
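To see why a uniform-probability submission makes a natural baseline, here is a minimal sketch. It uses a simplified single-label, multi-class log loss, not the competition's exact metric, and the names multiclass_log_loss, true_classes, uniform, and confident are made up for illustration. Under this simplified definition, predicting 1/K for each of K classes always scores ln(K), and a model that puts more probability on the true class scores lower.

```python
import numpy as np


def multiclass_log_loss(actual, predicted, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    Simplified illustration only; not the course's scoring function.
    """
    # Clip so that log(0) can never occur.
    predicted = np.clip(predicted, eps, 1 - eps)
    return float(-np.mean(np.sum(actual * np.log(predicted), axis=1)))


n_classes = 4
true_classes = np.array([0, 2, 1, 3, 2, 0])  # toy labels (assumption)
actual = np.eye(n_classes)[true_classes]     # one-hot encode the labels

# Baseline: the same probability for every class in every row.
uniform = np.full(actual.shape, 1.0 / n_classes)
uniform_loss = multiclass_log_loss(actual, uniform)

# A toy "better" model: 0.85 on the true class, 0.05 on each other class.
confident = np.where(actual == 1, 0.85, 0.05)
confident_loss = multiclass_log_loss(actual, confident)

print(uniform_loss, np.log(n_classes))  # the two values agree
print(confident_loss)                   # lower than the uniform baseline
```

Any model that has learned something useful should beat the uniform baseline, which is why the benchmark score is a reasonable first target.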

Instructions

  • Create the prediction_df DataFrame by passing the following arguments to the provided parameters of pd.DataFrame():
    • pd.get_dummies(df[LABELS]).columns.
    • holdout.index.
    • predictions.
  • Save prediction_df to a csv file called 'predictions.csv' using the .to_csv() method.
  • Submit the predictions for scoring by using the score_submission() function with pred_path set to 'predictions.csv'.
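
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the exercise solution: df, LABELS, holdout, and predictions are tiny hypothetical stand-ins for the objects the course provides, and the file is written to a temporary directory. The mapping of the three arguments to columns, index, and data is also an assumption based on the standard pd.DataFrame() parameters. The course's score_submission() function is not reproduced here.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the course's objects (not the real data).
LABELS = ["Function", "Use"]
df = pd.DataFrame({
    "Function": ["Teacher", "Admin", "Teacher"],
    "Use": ["Instruction", "Operations", "Instruction"],
})
holdout = pd.DataFrame({"Total": [10.0, 20.0]}, index=[101, 102])

# One output column per dummy-encoded label value.
label_columns = pd.get_dummies(df[LABELS]).columns

# Placeholder probabilities: one row per holdout sample, one column per label value.
predictions = np.full((len(holdout), len(label_columns)), 0.5)

# Combine column names, row index, and predicted values into one DataFrame.
prediction_df = pd.DataFrame(columns=label_columns,
                             index=holdout.index,
                             data=predictions)

# Write the submission file (a temporary directory is used for this sketch).
pred_path = os.path.join(tempfile.mkdtemp(), "predictions.csv")
prediction_df.to_csv(pred_path)

# Read the file back to check that the index and columns survived the round trip.
round_trip = pd.read_csv(pred_path, index_col=0)
print(round_trip)
```

In the exercise itself, the path of the saved file is what you would pass as pred_path to score_submission().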