Initialization and activation
Unstable (vanishing or exploding) gradients are a common challenge when training deep neural networks. In this and the following exercises, you will expand the model architecture you built for the water potability classification task to make it more robust to these problems.
As a first step, you'll improve the weight initialization by using the He (Kaiming) initialization strategy. To do so, you will need to call the proper initializer from the torch.nn.init module, which has been imported for you as init. Next, you will update the activation functions from the default ReLU to the often better-performing ELU.
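To see the two pieces in isolation before tackling the exercise, here is a minimal, standalone sketch; the layer and batch sizes are arbitrary and chosen only for illustration:

import torch
import torch.nn as nn
import torch.nn.init as init

layer = nn.Linear(9, 16)
# He (Kaiming) initialization rescales the weights based on the layer's fan-in
init.kaiming_uniform_(layer.weight)

x = torch.rand(4, 9)  # dummy batch of 4 samples with 9 features
# ELU behaves like ReLU for positive inputs but stays smooth and non-zero below zero
out = nn.functional.elu(layer(x))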
This exercise is part of the course Intermediate Deep Learning with PyTorch.
Exercise instructions
- Call the He (Kaiming) initializer on the weight attribute of the second layer, fc2, similarly to how it's done for fc1.
- Call the He (Kaiming) initializer on the weight attribute of the third layer, fc3, accounting for the different activation function used in the final layer.
- Update the activation functions in the forward() method from relu to elu.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)

        # Apply He initialization
        init.kaiming_uniform_(self.fc1.weight)
        ____(____)
        ____(____, ____)

    def forward(self, x):
        # Update ReLU activation to ELU
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x
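For reference, one way the completed network could look is sketched below. It assumes the final layer's sigmoid is accounted for by passing the nonlinearity argument to kaiming_uniform_; this is one reasonable reading of the instructions, not the only possible solution.

import torch.nn as nn
import torch.nn.init as init

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)

        # He (Kaiming) initialization for the hidden layers
        init.kaiming_uniform_(self.fc1.weight)
        init.kaiming_uniform_(self.fc2.weight)
        # The final layer feeds a sigmoid, so pass the matching nonlinearity
        init.kaiming_uniform_(self.fc3.weight, nonlinearity="sigmoid")

    def forward(self, x):
        # ELU activations in the hidden layers
        x = nn.functional.elu(self.fc1(x))
        x = nn.functional.elu(self.fc2(x))
        # Sigmoid output for the binary potability classification
        x = nn.functional.sigmoid(self.fc3(x))
        return x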