
Working with discrete distributions

You are soon going to work with stochastic policies: policies which represent the agent's behavior in a given state as a probability distribution over actions.

PyTorch can represent discrete distributions using the torch.distributions.Categorical class, which you will now experiment with.

You will see that the input numbers do not actually need to sum to 1, as probabilities do: Categorical normalizes them automatically by dividing each one by their total.
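As a minimal sketch of that normalization (the weights below are illustrative values, not taken from the exercise), the following compares the probabilities stored by Categorical against a manual division by the sum:

```python
import torch
from torch.distributions import Categorical

# Illustrative non-negative weights that sum to 4, not 1 (assumed values)
weights = torch.tensor([2.0, 1.0, 1.0])
dist = Categorical(probs=weights)

# The distribution stores the normalized probabilities in dist.probs
normalized = dist.probs
manual = weights / weights.sum()

print(normalized)                          # tensor([0.5000, 0.2500, 0.2500])
print(torch.allclose(normalized, manual))  # True
```

The original weights are not kept on the distribution object; only the normalized values are available through dist.probs.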

This exercise is part of the course

Deep Reinforcement Learning in Python


Exercise instructions

  • Instantiate the categorical probability distribution.
  • Take one sample from the distribution.
  • Specify 3 positive numbers summing to 1, to act as probabilities.
  • Specify 5 positive numbers; Categorical will silently normalize them to obtain probabilities.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

import torch
from torch.distributions import Categorical

def sample_from_distribution(probs):
    print(f"\nInput: {probs}")
    probs = torch.tensor(probs, dtype=torch.float32)
    # Instantiate the categorical distribution
    dist = ____(probs)
    # Take one sample from the distribution
    sampled_index = ____
    print(f"Taking one sample: index {sampled_index}, with associated probability {dist.probs[sampled_index]:.2f}")

# Specify 3 positive numbers summing to 1
sample_from_distribution([.3, ____, ____])
# Specify 5 positive numbers that do not sum to 1
sample_from_distribution([2, ____, ____, ____, ____])
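A single sample says little about the underlying distribution. The sketch below, which is separate from the exercise and uses assumed weights and an arbitrary seed, draws many samples at once and compares the empirical frequency of each index with the normalized probabilities:

```python
import torch
from torch.distributions import Categorical

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable

# Illustrative weights that sum to 10, not 1 (assumed values)
weights = torch.tensor([2.0, 1.0, 1.0, 3.0, 3.0])
dist = Categorical(probs=weights)

# sample() accepts a sample shape, so many draws can be taken in one call
samples = dist.sample((10_000,))

# Count how often each index was drawn and convert counts to frequencies
counts = torch.bincount(samples, minlength=len(weights))
frequencies = counts.float() / samples.numel()

print("normalized probs:", dist.probs)
print("empirical freqs: ", frequencies)
```

With this many draws, the empirical frequencies should land close to the normalized probabilities, which is what one would expect of an agent repeatedly sampling actions from a stochastic policy in the same state.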