Overfitting detection
In this exercise, we'll work with a small subset of the examples from the original sign language letters dataset. A small sample, coupled with a heavily parameterized model, will generally lead to overfitting: the model simply memorizes the class of each example rather than learning features that generalize to new examples.
You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss, and whether it increases with further training. With a small sample and a high learning rate, the model will struggle to converge on an optimum, so you will set a low learning rate for the optimizer, which will make overfitting easier to identify.
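The two checks described above can be sketched in plain Python. The loss values below are hypothetical stand-ins for the per-epoch `loss` and `val_loss` lists that `model.fit(...).history` returns, and `looks_overfit` (with its `gap` threshold) is an illustrative helper, not part of Keras:

```python
# Hypothetical per-epoch losses from an overfit run: training loss
# keeps falling while validation loss bottoms out and then rises.
train_loss = [1.35, 0.90, 0.52, 0.24, 0.09, 0.03]
val_loss = [1.38, 1.10, 1.05, 1.12, 1.31, 1.55]

def looks_overfit(train_loss, val_loss, gap=0.5):
    """Flag overfitting: final validation loss is substantially higher
    than final training loss AND validation loss has risen past its minimum."""
    gap_large = val_loss[-1] - train_loss[-1] > gap
    val_rising = val_loss[-1] > min(val_loss)
    return gap_large and val_rising

print(looks_overfit(train_loss, val_loss))  # → True for this run
```

A run where validation loss tracks training loss downward would fail both checks and return False.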
Note that keras has been imported from tensorflow.
This exercise is part of the course Introduction to TensorFlow in Python.
Exercise instructions
- Define a sequential model in keras named model.
- Add a first dense layer with 1024 nodes, a relu activation, and an input shape of (784,).
- Set the learning rate to 0.001.
- Set the fit() operation to iterate over the full sample 50 times and use 50% of the sample for validation purposes.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Define sequential model
____
# Define the first layer
____
# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))
# Finish the model compilation
model.compile(optimizer=keras.optimizers.Adam(learning_rate=____),
              loss='categorical_crossentropy', metrics=['accuracy'])
# Complete the model fit operation
model.fit(sign_language_features, sign_language_labels, epochs=____, validation_split=____)
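One way the completed skeleton might look, using the values from the instructions (1024-node relu layer, learning rate 0.001, 50 epochs, 50% validation split). Since the course's sign_language_features and sign_language_labels aren't available here, this sketch substitutes random synthetic arrays of the same shape; it also uses learning_rate, the current Keras name for the older lr argument:

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in for the course data: 40 flattened 28x28 images
# (784 features each) with one-hot labels over 4 classes.
rng = np.random.default_rng(0)
sign_language_features = rng.random((40, 784)).astype("float32")
sign_language_labels = np.eye(4)[rng.integers(0, 4, size=40)].astype("float32")

# Define sequential model
model = keras.Sequential()

# Define the first layer: 1024 nodes, relu activation, input shape (784,)
model.add(keras.layers.Dense(1024, activation='relu', input_shape=(784,)))

# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))

# Compile with a low learning rate so overfitting is easier to spot
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])

# 50 passes over the data, holding out 50% of the sample for validation
history = model.fit(sign_language_features, sign_language_labels,
                    epochs=50, validation_split=0.5, verbose=0)
```

With such a small, heavily parameterized setup, `history.history['loss']` typically keeps falling while `history.history['val_loss']` stalls or rises, which is the overfitting signature the exercise asks you to look for.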