Define a competition metric
Competition metric is used by Kaggle to evaluate your submissions. Moreover, you also need to measure the performance of different models on a local validation set.
For now, your goal is to manually develop a couple of competition metrics in case if they are not available in sklearn.metrics
.
In particular, you will define:
Mean Squared Error (MSE) for the regression problem: $$MSE = \frac{1}{N}\sum_{i=1}^{N}{(y_i - \hat{y}_i)^2}$$
Logarithmic Loss (LogLoss) for the binary classification problem: $$LogLoss = -\frac{1}{N}\sum_{i=1}^{N}{(y_i\ln p_i + (1-y_i)\ln (1-p_i))}$$
This exercise is part of the course
Winning a Kaggle Competition in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import numpy as np
# Import MSE from sklearn
from sklearn.metrics import mean_squared_error
# Define your own MSE function
def own_mse(y_true, y_pred):
# Raise differences to the power of 2
squares = np.____(y_true - y_pred, 2)
# Find mean over all observations
err = np.____(squares)
return err
print('Sklearn MSE: {:.5f}. '.format(mean_squared_error(y_regression_true, y_regression_pred)))
print('Your MSE: {:.5f}. '.format(own_mse(y_regression_true, y_regression_pred)))