Quiz 2 - Question 1
Consider the following two text segments from the Africa Galore dataset:
the vibrant colors and intricate patterns of kente cloth , a symbol of ghanaian royalty and prestige , tell stories of culture , history , and social status . woven on narrow looms by skilled artisans , each strip of kente is a testament to patience and artistry.
bogolanfini , or mud cloth , from mali , is a textile steeped in tradition and symbolism . its distinctive patterns , created using fermented mud and natural dyes , tell stories of malian culture , history , and beliefs . the process of creating bogolanfini is as rich and complex as the designs themselves .
For the purpose of this exercise, all words have been transformed to lowercase and punctuation marks have been split from the words. Assume that punctuation marks are considered to be individual words as well (e.g., “history , culture” is split into the three words “history”, “,”, and “culture”).
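As a minimal sketch of this tokenization convention, assuming tokens are separated by single spaces exactly as in the segments above, plain whitespace splitting is enough to treat each punctuation mark as its own word. The variable names are illustrative only.

```python
# Whitespace splitting keeps each space-separated punctuation mark as its own token.
text = "history , culture"
tokens = text.split()
print(tokens)
```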
Compute the following probabilities under a bigram language model estimated from only these two text segments:
- P(, ∣ cloth)
- P(cloth ∣ ,)
- P(history ∣ culture)
- P(history ∣ ,)
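One way to check such values is the usual maximum likelihood estimate, P(w2 ∣ w1) = count(w1 w2) / count(w1), where count(w1) is the number of times w1 is followed by any word. The sketch below is not part of the course material. It assumes no smoothing, whitespace tokenization, and that bigrams are counted within each segment separately (no bigram spans the boundary between the two segments). The function names `count_bigrams` and `bigram_probability` are hypothetical helpers introduced for illustration.

```python
from collections import Counter

# The two segments, copied verbatim from the exercise.
SEGMENTS = [
    "the vibrant colors and intricate patterns of kente cloth , a symbol of ghanaian royalty and prestige , tell stories of culture , history , and social status . woven on narrow looms by skilled artisans , each strip of kente is a testament to patience and artistry.",
    "bogolanfini , or mud cloth , from mali , is a textile steeped in tradition and symbolism . its distinctive patterns , created using fermented mud and natural dyes , tell stories of malian culture , history , and beliefs . the process of creating bogolanfini is as rich and complex as the designs themselves .",
]


def count_bigrams(segments):
    """Count contexts and bigrams, treating each segment independently (an assumption)."""
    context_counts = Counter()  # how often a word is followed by some other word
    bigram_counts = Counter()   # how often the pair (previous word, next word) occurs
    for segment in segments:
        tokens = segment.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, nxt)] += 1
    return context_counts, bigram_counts


def bigram_probability(word, context, context_counts, bigram_counts):
    """Unsmoothed estimate of P(word | context) = count(context word) / count(context)."""
    if context_counts[context] == 0:
        raise ValueError(f"context {context!r} never occurs before another word")
    return bigram_counts[(context, word)] / context_counts[context]


context_counts, bigram_counts = count_bigrams(SEGMENTS)

# The four (word, context) pairs asked for in the question.
queries = [(",", "cloth"), ("cloth", ","), ("history", "culture"), ("history", ",")]
for word, context in queries:
    p = bigram_probability(word, context, context_counts, bigram_counts)
    numerator = bigram_counts[(context, word)]
    denominator = context_counts[context]
    print(f"P({word} | {context}) = {numerator}/{denominator} = {p:.4f}")
```

Counting by hand follows the same recipe: list every occurrence of the context word, note which word follows each occurrence, and divide the number of matches by the number of occurrences.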
This exercise is part of the course
Google DeepMind: Build Your Own Small Language Model