This chapter introduces the basics of Reinforcement Learning from Human Feedback (RLHF), a technique that uses human input to help AI models learn more effectively. Get started with RLHF by understanding how it differs from traditional reinforcement learning and why human feedback can improve AI performance across domains.

Introduction to RLHF

Text generation with RLHF

Classifying generated text for RLHF

RL vs. RLHF

Exploring pre-trained LLMs

Tokenize a text dataset

Fine-tuning for review classification

Preparing data for RLHF

Preparing the preference dataset

Extracting prompts

Foundational Concepts

Discover how to set up systems for gathering human feedback in this chapter. Learn best practices for collecting high-quality data, from pairwise comparisons to uncertainty sampling, and explore strategies for improving your data collection.

Methods for collecting high-quality feedback

Options

Comparison

Rating

Understanding comparison and rating in RLHF

Comparing slogans for a gym campaign

Measuring feedback quality and relevance

Low confidence

K-means for clustering feedback

Active learning

Implementing an active learning pipeline

Active learning loop

Gathering Human Feedback
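
As a rough illustration of the uncertainty sampling idea named in this chapter, the sketch below picks the unlabeled samples a model is least confident about, so they can be sent to human annotators first. The function name `least_confident` and the probability values are hypothetical, chosen only for illustration. The course exercises may use a different approach.

```python
def least_confident(probabilities, k):
    """Return indices of the k samples whose top class probability is lowest."""
    # Confidence of a prediction = probability assigned to its most likely class.
    confidences = [max(p) for p in probabilities]
    # Rank sample indices from least to most confident.
    ranked = sorted(range(len(probabilities)), key=lambda i: confidences[i])
    return ranked[:k]


# Hypothetical class probabilities for four unlabeled samples (illustrative only).
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

# Indices of the two samples to route to human annotators.
to_label = least_confident(pool_probs, k=2)
print(to_label)
```

In an active learning loop, the selected samples would be labeled, added to the training set, and the model retrained before the next selection round.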

In this chapter, you'll get into the core of Reinforcement Learning from Human Feedback training. This includes exploring fine-tuning with PPO, techniques for training efficiently, and handling potential divergences from the objectives your metrics track.

Reward models explored

Initializing the reward

Setting up the reward trainer

Training with PPO

Initialize the PPO trainer

PPO fine-tuning

Efficient fine-tuning in RLHF

Prepare for 8-bit Training

Train with LoRA

Tuning Models with Human Feedback

Explore key techniques for assessing and improving model performance in this final chapter of Reinforcement Learning from Human Feedback (RLHF). From fine-tuning metrics to incorporating diverse feedback sources, you'll gain a comprehensive toolkit for refining your models effectively.

Model metrics and adjustments

Mitigating negative KL divergence

Checking the reward model

Incorporating diverse feedback sources

Majority voting on multiple data sources

Unreliable data source identification

Evaluating RLHF models

Interpreting curves

Evaluating RLHF with metrics

Wrapping up your RLHF journey

Model Evaluation
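
The majority voting and unreliable-source topics listed in this chapter can be sketched roughly as follows. This is a minimal example under assumed data: the annotator names, the binary preference labels, and the helper names `majority_vote` and `agreement_rates` are all hypothetical. A low agreement rate is only one possible signal of an unreliable source.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties resolve to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def agreement_rates(votes_by_source):
    """Compute the per-item consensus and each source's agreement with it."""
    sources = list(votes_by_source)
    n_items = len(votes_by_source[sources[0]])
    # Consensus label for each item across all sources.
    consensus = [
        majority_vote([votes_by_source[s][i] for s in sources])
        for i in range(n_items)
    ]
    # Fraction of items on which each source matches the consensus.
    rates = {
        s: sum(v == c for v, c in zip(votes_by_source[s], consensus)) / n_items
        for s in sources
    }
    return consensus, rates


# Hypothetical preference labels from three feedback sources (illustrative only).
votes = {
    "annotator_a": [1, 0, 1, 1],
    "annotator_b": [1, 0, 1, 0],
    "annotator_c": [0, 1, 0, 0],
}

consensus, rates = agreement_rates(votes)
print(consensus)
print(rates)
```

A source whose agreement rate sits far below the others, such as `annotator_c` here, is a candidate for review before its labels are used for training.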

Combine the efficiency of Generative AI with the insight of human expertise in this course on Reinforcement Learning from Human Feedback. You'll learn how to make GenAI models truly reflect human values and preferences while getting hands-on practice with LLMs. You'll also navigate the complexities of reward models and learn how to build on LLMs to produce AI that not only learns but also adapts to real-world scenarios.

Deep Reinforcement Learning in Python

Learn how to make GenAI models reflect human values while gaining experience with advanced LLMs.

Reinforcement Learning from Human Feedback (RLHF)


Developing Large Language Models

Reinforcement Learning in Python

Choose which of the following movies is better: "Interstellar" or "Oppenheimer"

Rate each movie from 1 to 5 based on importance: "Titanic", "Gladiator", "Interstellar"
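
The two prompts above illustrate comparison feedback and rating feedback. As a hedged sketch of how the two relate, the code below turns a set of ratings into pairwise preferences of the kind a preference dataset typically stores. The scores are hypothetical, since the source gives no ratings, and `ratings_to_pairs` is an illustrative helper rather than a library function.

```python
from itertools import combinations


def ratings_to_pairs(ratings):
    """Convert item ratings into (preferred, rejected) pairs, skipping ties."""
    pairs = []
    for a, b in combinations(ratings, 2):
        if ratings[a] > ratings[b]:
            pairs.append((a, b))
        elif ratings[b] > ratings[a]:
            pairs.append((b, a))
    return pairs


# Hypothetical 1-to-5 ratings (illustrative only, not from the course).
ratings = {"Titanic": 3, "Gladiator": 4, "Interstellar": 5}

pairs = ratings_to_pairs(ratings)
print(pairs)
```

A direct comparison, such as choosing between "Interstellar" and "Oppenheimer", already yields one such pair without any conversion.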
