Scalable AI Models with PyTorch Lightning


Exercise

Comparing quantized model performance

Evaluating performance improvements isn't just about accuracy. Quantized models often offer faster inference, a key benefit in deployment scenarios. In this exercise, you'll measure how long the original and quantized models each take to process the test set.

The measure_time() function has been predefined. It sets the model to evaluation mode, runs a forward pass on all batches in the dataloader, and returns the elapsed time.

Both model (the original model) and model_quantized (the quantized version) are preloaded along with test_loader.
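The exercise provides measure_time() for you, so you don't need to write it. As a rough sketch of what such a helper might look like (the tiny model and dataloader below are hypothetical stand-ins for the preloaded objects, not the exercise's actual data):

```python
import time

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def measure_time(model, dataloader):
    """Set the model to eval mode, run a forward pass on every
    batch in the dataloader, and return the elapsed time in seconds."""
    model.eval()
    start = time.perf_counter()
    with torch.no_grad():  # gradients aren't needed for inference timing
        for features, _ in dataloader:
            model(features)
    return time.perf_counter() - start

# Hypothetical stand-ins for the preloaded model and test_loader
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
test_loader = DataLoader(
    TensorDataset(torch.randn(64, 16), torch.randint(0, 4, (64,))),
    batch_size=8,
)

elapsed = measure_time(model, test_loader)
print(f"Elapsed: {elapsed:.4f} s")
```

Using time.perf_counter() rather than time.time() gives a monotonic, high-resolution clock, which is the usual choice for short benchmark intervals.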

Instructions

  • Compute inference time for the original and quantized models.
  • Print both times rounded to two decimal places.
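A self-contained sketch of the full comparison is below. In the exercise, model, model_quantized, test_loader, and measure_time() are all preloaded; here they are recreated with hypothetical shapes so the example runs on its own, with model_quantized produced by PyTorch's dynamic quantization of the Linear layers:

```python
import time

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def measure_time(model, dataloader):
    # Eval mode, then a timed forward pass over all batches,
    # mirroring the predefined helper described in the exercise.
    model.eval()
    start = time.perf_counter()
    with torch.no_grad():
        for features, _ in dataloader:
            model(features)
    return time.perf_counter() - start

# Hypothetical stand-ins for the preloaded objects
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
test_loader = DataLoader(
    TensorDataset(torch.randn(256, 64), torch.randint(0, 10, (256,))),
    batch_size=32,
)

# Dynamic quantization: Linear weights are stored as int8 and
# dequantized on the fly, which typically speeds up CPU inference
model_quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Compute inference time for the original and quantized models
time_original = measure_time(model, test_loader)
time_quantized = measure_time(model_quantized, test_loader)

# Print both times rounded to two decimal places
print(f"Original model inference time: {time_original:.2f} s")
print(f"Quantized model inference time: {time_quantized:.2f} s")
```

Speedups depend on hardware and model size; on very small models like this stand-in, quantization overhead can mask the gains, so don't read too much into the toy numbers.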