1. Loss functions
We now know how to import datasets and perform TensorFlow operations on them, but how can we use this knowledge to train models? In this video, we'll move closer to that goal by taking a look at loss functions.
2. Introduction to loss functions
Loss functions play a fundamental role in machine learning.
We need loss functions to train models because they tell us how well our model explains the data.
Without this feedback, it is unclear how to adjust model parameters during the training process.
A high loss value indicates that the model fit is poor.
Typically, we train the model by selecting parameter values that minimize the loss function. In some cases, we may want to maximize a function instead. Fortunately, we can always place a minus sign in front of the function we want to maximize and minimize the result instead. For this reason, we will always talk about loss functions and minimization.
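As a toy illustration of the negation trick, here is a minimal sketch in plain Python; the function and the grid of candidate values are invented purely for this example.

```python
# Hedged sketch: maximizing a function f is equivalent to
# minimizing its negation -f. The function and grid below are
# invented purely for illustration.
def f(x):
    return -(x - 3.0) ** 2  # concave, with its maximum at x = 3

# Pick the grid point that minimizes -f(x) ...
grid = [i / 10 for i in range(61)]  # 0.0, 0.1, ..., 6.0
best_x = min(grid, key=lambda x: -f(x))
# ... which is the same point that maximizes f(x).
# best_x == 3.0
```

In practice an optimizer would take the place of the grid search, but the negation step is exactly the same.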
3. Common loss functions in TensorFlow
While it is possible to define a custom loss function, this is typically not necessary, since many common options are available in TensorFlow. Typical choices for training linear models include the mean squared error loss, the mean absolute error loss, and the Huber loss. All of these are accessible from tf dot keras dot losses.
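As a rough sketch of what these three losses compute, here are plain-Python stand-ins written from the standard formulas; the TensorFlow versions in tf dot keras dot losses operate on tensors instead of lists, and the data values below are invented.

```python
# Hedged sketch of the three loss formulas in plain Python.
# The data values are invented for illustration.
def mse(targets, predictions):
    # Mean squared error: average of squared differences.
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def mae(targets, predictions):
    # Mean absolute error: average of absolute differences.
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def huber(targets, predictions, delta=1.0):
    # Huber loss: quadratic for small errors, linear for large ones.
    def per_point(e):
        if abs(e) <= delta:
            return 0.5 * e ** 2
        return delta * (abs(e) - 0.5 * delta)
    return sum(per_point(t - p) for t, p in zip(targets, predictions)) / len(targets)

targets = [1.0, 2.0, 3.0]
predictions = [1.5, 2.0, 5.0]  # the last point is an outlier
print(mse(targets, predictions))    # (0.25 + 0 + 4) / 3, about 1.417
print(mae(targets, predictions))    # (0.5 + 0 + 2) / 3, about 0.833
print(huber(targets, predictions))  # (0.125 + 0 + 1.5) / 3, about 0.542
```

Notice how the single outlier dominates the MSE but not the MAE or Huber loss.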
4. Why do we care about loss functions?
Here, we plot the MSE, MAE, and Huber loss for error values between minus two and two. Note that the MSE strongly penalizes outliers and has high sensitivity near the minimum, since its gradient shrinks smoothly as the error approaches zero. The MAE scales linearly with the size of the error and has low sensitivity near the minimum: its gradient is the same for small and large errors. And the Huber loss is similar to the MSE near zero and similar to the MAE away from zero.
For greater sensitivity near the minimum, you will want to use the MSE or Huber loss. To minimize the impact of outliers, you will want to use the MAE or Huber loss.
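To make this trade-off concrete, here is a sketch of the penalty each loss assigns to a single small error and a single outlier-sized error; the error values are invented for illustration.

```python
# Hedged sketch of the per-observation penalty of each loss.
# The error values below are invented for illustration.
def squared(e):
    return e ** 2

def absolute(e):
    return abs(e)

def huber_penalty(e, delta=1.0):
    # Quadratic within delta of zero, linear beyond it.
    if abs(e) <= delta:
        return 0.5 * e ** 2
    return delta * (abs(e) - 0.5 * delta)

for e in (0.1, 3.0):
    print(e, squared(e), absolute(e), huber_penalty(e))
# error 0.1: squared ~0.01, absolute 0.1, huber ~0.005
# error 3.0: squared 9.0,   absolute 3.0, huber 2.5
```

The squared penalty is tiny for the small error and enormous for the outlier, while the absolute and Huber penalties grow only linearly for large errors.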
5. Defining a loss function
Let's say we decide to use the MSE loss. We'll need two tensors to compute it: the actual values or "targets" tensor and the predicted values or "predictions." Passing them to the MSE operation will return a single number: the average of the squared differences between the actual and predicted values.
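The two-tensor pattern just described can be sketched in plain Python, using lists in place of tensors; the target and prediction values here are invented.

```python
# Hedged sketch: two inputs in, one number out.
# The values below are invented for illustration.
targets = [2.0, 4.0, 6.0]
predictions = [2.5, 3.5, 6.5]

# The average of the squared differences collapses to a single number.
mse_value = sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)
# mse_value == 0.25
```

The TensorFlow MSE operation behaves the same way: two tensors in, one scalar loss out.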
6. Defining a loss function
In many cases, the training process will require us to supply a function that accepts our model's variables and data and returns a loss.
Here, we'll first define a model, "linear_regression," which takes the intercept, slope, and features as arguments and returns the model's predictions.
We'll next define a loss function called "loss_function" that accepts the slope and intercept of a linear model -- the variables -- and the input data, the targets and the features.
It then makes a prediction and computes and returns the associated MSE loss.
Note that we've defined both functions to use default argument values for features and targets. We will do this whenever we train on the full sample to simplify the code.
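The two functions just described can be sketched in plain Python; the sample data and parameter values below are invented, and a TensorFlow version would use tensors and the tf dot keras dot losses MSE operation instead of the hand-written average.

```python
# Invented full-sample data, used as the default arguments.
default_features = [1.0, 2.0, 3.0, 4.0]
default_targets = [2.0, 4.0, 6.0, 8.0]

def linear_regression(intercept, slope, features=default_features):
    # The model: predictions = intercept + slope * features.
    return [intercept + slope * x for x in features]

def loss_function(intercept, slope,
                  features=default_features, targets=default_targets):
    # Make predictions, then compute and return the associated MSE loss.
    predictions = linear_regression(intercept, slope, features)
    return sum((t - p) ** 2
               for t, p in zip(targets, predictions)) / len(targets)

# Evaluate on the default (full) sample ...
print(loss_function(0.0, 2.0))  # a perfect fit here: 0.0

# ... or pass a separate test set explicitly.
test_features = [5.0, 6.0]
test_targets = [9.0, 13.0]
print(loss_function(0.0, 2.0, test_features, test_targets))  # 1.0
```

Omitting the data arguments falls back on the defaults, which is the behavior described for training on the full sample.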
7. Defining the loss function
Notice that we've nested TensorFlow's MSE loss function within a function that first uses the model to make predictions and then uses those predictions as an input to the MSE loss function. We can then evaluate this function for a given set of parameter values and input data.
Here, we've evaluated the loss function using a test dataset and it returned a loss value of ten point seven seven. If we had omitted the data arguments, test_targets and test_features, the loss function would have instead used the default targets and features arguments we set to evaluate model performance.
8. Let's practice!
It's now time to put what you've learned to work in some exercises.