Speeding up inference in quantized models

Your company has been running a quantized Llama model as its customer service chatbot for a while now. One of the biggest customer complaints you receive is that the bot answers questions very slowly and sometimes produces nonsensical answers.

You suspect the odd answers might come from quantizing to 4-bit without normalization. In your investigation, you also suspect that the slow responses come from the inference computations, which are still performed in 32-bit floats.

You want to adjust the quantization configuration to improve your model's inference speed. The following imports have already been loaded: AutoModelForCausalLM, AutoTokenizer, and BitsAndBytesConfig.

This exercise is part of the course Fine-Tuning with Llama 3.

Exercise instructions

  • Set the quantization type to normalized 4-bit (NF4) to reduce outliers and produce fewer nonsensical answers.
  • Set the compute dtype to bfloat16 to speed up inference computations (a configuration sketch follows below).
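
Below is a minimal sketch of how these two settings could be passed to BitsAndBytesConfig when loading the model. The checkpoint name meta-llama/Meta-Llama-3-8B and the device_map argument are assumptions for illustration, not part of the exercise setup; the exercise environment already provides the imports.

    # Minimal sketch: NF4 quantization with bfloat16 compute
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_name = "meta-llama/Meta-Llama-3-8B"  # assumed placeholder checkpoint

    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # normalized 4-bit reduces outliers
        bnb_4bit_compute_dtype=torch.bfloat16,  # faster inference than 32-bit floats
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quantization_config,
        device_map="auto",  # assumption: let accelerate place the layers
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)

With this configuration, weights are stored in normalized 4-bit form while matrix multiplications during generation run in bfloat16, which addresses both the quality and the speed complaints described above.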
