
Initialization and activation

Unstable (vanishing or exploding) gradients are a common challenge when training deep neural networks. In this and the following exercises, you will expand the model architecture that you built for the water potability classification task to make it more robust to these problems.

As a first step, you'll improve the weight initialization by using the He (Kaiming) initialization strategy. To do so, you will need to call the proper initializer from the torch.nn.init module, which has been imported for you as init. Next, you will update the activation functions from the default ReLU to the often better-performing ELU.
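For reference, here is a minimal sketch of how these two pieces look in isolation. The layer and tensor names are illustrative only and are not part of the exercise:

```python
import torch
import torch.nn as nn
import torch.nn.init as init

hidden = nn.Linear(16, 8)

# He (Kaiming) uniform initialization; the default gain is the ReLU-family
# gain, which is commonly reused for ELU as well
init.kaiming_uniform_(hidden.weight)

# For a layer that feeds into a sigmoid, pass the nonlinearity explicitly
# so the gain is computed for sigmoid rather than ReLU
output = nn.Linear(8, 1)
init.kaiming_uniform_(output.weight, nonlinearity="sigmoid")

# ELU activation from the functional API
x = torch.randn(4, 16)
h = nn.functional.elu(hidden(x))
```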

This exercise is part of the course Intermediate Deep Learning with PyTorch.

Exercise instructions

  • Call the He (Kaiming) initializer on the weight attribute of the second layer, fc2, similarly to how it's done for fc1.
  • Call the He (Kaiming) initializer on the weight attribute of the third layer, fc3, accounting for the different activation function used in the final layer.
  • Update the activation functions in the forward() method from relu to elu.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)
        
        # Apply He initialization
        init.kaiming_uniform_(self.fc1.weight)
        ____(____)
        ____(____, ____)

    def forward(self, x):
        # Update ReLU activation to ELU
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x
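For comparison, below is one possible completed version, shown only as a sketch of what the filled-in blanks might look like, followed by a quick forward pass on a dummy batch (the dummy input is illustrative, not part of the exercise):

```python
import torch
import torch.nn as nn
import torch.nn.init as init

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)

        # He initialization for every layer; the final layer feeds a sigmoid,
        # so its nonlinearity is passed explicitly
        init.kaiming_uniform_(self.fc1.weight)
        init.kaiming_uniform_(self.fc2.weight)
        init.kaiming_uniform_(self.fc3.weight, nonlinearity="sigmoid")

    def forward(self, x):
        # ELU activations in the hidden layers, sigmoid on the output
        x = nn.functional.elu(self.fc1(x))
        x = nn.functional.elu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x

# Smoke test: a batch of 4 samples with 9 features each
model = Net()
print(model(torch.randn(4, 9)).shape)  # torch.Size([4, 1])
```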