
Hidden layers and parameters

1. Hidden layers and parameters

So far, we've used one input layer and one linear layer. Now, we'll add more layers to help the network learn complex patterns.

2. Stacking layers with nn.Sequential()

We'll stack three linear layers using nn.Sequential(), a PyTorch container that applies layers in sequence. This network takes an input, passes it through each linear layer in turn, and returns an output. Here, the layers between the input and the final output layer are the hidden layers.

3. Stacking layers with nn.Sequential()

Here, the first argument in the first layer represents the number of input features (n_features), and the last argument in the final layer represents the number of output classes (n_classes), both defined by the dataset.
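As a sketch, such a network might look like this; n_features and n_classes are placeholder dataset dimensions, and the hidden sizes 16 and 12 are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical dataset dimensions
n_features = 8
n_classes = 2

# Three linear layers applied in sequence
model = nn.Sequential(
    nn.Linear(n_features, 16),  # first argument: number of input features
    nn.Linear(16, 12),
    nn.Linear(12, n_classes),   # last argument: number of output classes
)

x = torch.randn(4, n_features)  # a batch of 4 samples
output = model(x)
print(output.shape)  # torch.Size([4, 2])
```

Calling model(x) runs the input through each layer in the order they were listed.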

4. Adding layers

We can add

5. Adding layers

as many hidden layers

6. Adding layers

as we like

7. Adding layers

as long as the input dimension of each layer matches the output dimension of the previous one.

8. Adding layers

In our three-layer example,

9. Adding layers

the first layer takes 10 features and outputs 18.

10. Adding layers

The second layer takes 18 and outputs 20.

11. Adding layers

Finally, the third layer takes 20 and outputs 5.
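The three-layer example can be written as follows; note how each layer's input size matches the previous layer's output size:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 18),  # takes 10 features, outputs 18
    nn.Linear(18, 20),  # input size (18) matches the previous output
    nn.Linear(20, 5),   # takes 20, outputs 5
)

x = torch.randn(1, 10)
print(model(x).shape)  # torch.Size([1, 5])
```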

12. Layers are made of neurons

A layer is fully connected when each neuron links to all neurons in the previous layer, as shown in red in the figure.

13. Layers are made of neurons

Each neuron in a linear layer

14. Layers are made of neurons

performs a linear operation using all neurons from the previous layer.

15. Layers are made of neurons

Therefore, a single neuron has N plus one learnable parameters: N weights, where N is the output dimension of the previous layer, plus one bias.

16. Parameters and model capacity

Increasing the number of hidden layers increases the total number of parameters in the model. This parameter count is a common measure of model capacity. Higher-capacity models can handle more complex datasets but may take longer to train. An effective way to assess a model's capacity is to calculate its total number of parameters. Let's break it down with a two-layer network.

17. Parameters and model capacity

The first layer has four neurons.

18. Parameters and model capacity

Each neuron has eight weights and one bias, so the four neurons contribute 4 × 9 = 36 parameters.

19. Parameters and model capacity

The second layer has two neurons.

20. Parameters and model capacity

Each neuron has four weights and one bias, so the two neurons contribute 2 × 5 = 10 parameters. Adding them together, this model has 46 learnable parameters in total.

21. Parameters and model capacity

We can also calculate this in PyTorch using the .numel() method, which returns the number of elements in a tensor. By looping over the model's parameters and summing the number of elements in each, we can confirm that the total is 46.
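One way to write this check, using the same two-layer network (eight inputs, four hidden neurons, two outputs):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(8, 4),  # 4 neurons x (8 weights + 1 bias) = 36 parameters
    nn.Linear(4, 2),  # 2 neurons x (4 weights + 1 bias) = 10 parameters
)

# Sum the number of elements across all weight and bias tensors
total = sum(p.numel() for p in model.parameters())
print(total)  # 46
```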

22. Balancing complexity and efficiency

Understanding parameter count helps us balance model complexity and efficiency. Too many parameters can lead to long training times or overfitting, while too few might limit learning capacity.

23. Let's practice!

Let's practice!