
Quiz 4 - Question 2

A transformer model has 6 layers with 4 attention heads each, and each layer processes the input in 0.5 seconds. How long will it take for an input to pass through the entire model?
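The reasoning can be sketched in a few lines of Python. Note the key assumption: the layers run sequentially (each layer's output feeds the next), while the attention heads within a layer operate in parallel and therefore do not add to the end-to-end latency.

```python
# Quiz parameters, taken from the question.
num_layers = 6        # layers applied one after another
time_per_layer = 0.5  # seconds each layer takes to process the input

# The 4 attention heads per layer run in parallel inside a layer,
# so only the number of sequential layers determines total latency.
total_time = num_layers * time_per_layer
print(total_time)  # 3.0 seconds
```

So the input takes 3 seconds to traverse all 6 layers.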

This exercise is part of the course

Google DeepMind: Discover The Transformer Architecture
