Initialization and activation
Unstable (vanishing or exploding) gradients are a common challenge when training deep neural networks. In this and the following exercises, you will expand the model architecture you built for the water potability classification task to make it more robust to these problems.
As a first step, you'll improve the weight initialization by using the He (Kaiming) initialization strategy. To do so, you will need to call the proper initializer from the torch.nn.init module, which has been imported for you as init. Next, you will update the activation functions from the default ReLU to ELU, which often performs better.
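As a quick refresher, here is a minimal sketch of the two calls involved; the layer shape and tensor names below are illustrative only and not part of the exercise:

import torch
import torch.nn as nn
from torch.nn import init

layer = nn.Linear(9, 16)

# He (Kaiming) initialization scales the weights based on the layer's fan-in,
# which keeps activation and gradient magnitudes roughly stable in deep
# networks built from ReLU-family activations.
init.kaiming_uniform_(layer.weight, nonlinearity='relu')

x = torch.randn(4, 9)              # dummy batch: 4 samples, 9 features
h = nn.functional.elu(layer(x))    # ELU: like ReLU for positive inputs, smooth exponential for negative ones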
This exercise is part of the course
Intermediate Deep Learning with PyTorch
Exercise instructions
- Call the He (Kaiming) initializer on the weight attribute of the second layer, fc2, similarly to how it's done for fc1.
- Call the He (Kaiming) initializer on the weight attribute of the third layer, fc3, accounting for the different activation function used in the final layer.
- Update the activation functions in the forward() method from relu to elu.
Hands-on interactive exercise
Give this exercise a try by completing this sample code.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)
        # Apply He initialization
        init.kaiming_uniform_(self.fc1.weight)
        ____(____)
        ____(____, ____)

    def forward(self, x):
        # Update ReLU activation to ELU
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x
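For reference, one possible completed version of the scaffold is sketched below. It assumes nn and init are imported as described above (torch.nn as nn and torch.nn.init as init); the key detail is that fc3 feeds a sigmoid rather than an ELU, so its initializer gain is adjusted via the nonlinearity argument.

import torch.nn as nn
from torch.nn import init

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)
        # Apply He initialization: fc1 and fc2 feed ELU (a ReLU-family
        # activation), so the default gain is appropriate; fc3 feeds a
        # sigmoid, so we pass nonlinearity='sigmoid' to adjust the gain.
        init.kaiming_uniform_(self.fc1.weight)
        init.kaiming_uniform_(self.fc2.weight)
        init.kaiming_uniform_(self.fc3.weight, nonlinearity='sigmoid')

    def forward(self, x):
        # ELU activations in the hidden layers, sigmoid for the binary output
        x = nn.functional.elu(self.fc1(x))
        x = nn.functional.elu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x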