
Splitting semantically

All of the splitting strategies you've used up to this point share a drawback: they choose split points without considering the meaning of the surrounding text, so context can easily be lost across chunk boundaries.

In this exercise, you'll create and apply a semantic text splitter, an experimental method for splitting text based on semantic meaning. When the splitter detects that the meaning of the text has shifted past a certain threshold, it performs a split.
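The idea can be sketched without any embedding model. The example below is a minimal illustration, not LangChain's implementation: `toy_embed` is a hypothetical stand-in that counts a few hand-picked keywords, and the threshold of 0.6 is an arbitrary assumption. It measures the cosine distance between consecutive sentences and starts a new chunk whenever that distance exceeds the threshold.

```python
import math

# Hypothetical toy vocabulary; a real splitter would use a learned embedding model.
VOCAB = ["cat", "dog", "pet", "stock", "market", "price"]

def toy_embed(sentence):
    """Stand-in embedding: count occurrences of each vocabulary word."""
    words = sentence.lower().replace(".", "").split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_distance(a, b):
    """Return 1 - cosine similarity; treat an all-zero vector as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def semantic_split(sentences, threshold=0.6):
    """Group consecutive sentences, splitting where the distance exceeds the threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, curr in zip(sentences, sentences[1:]):
        if cosine_distance(toy_embed(prev), toy_embed(curr)) > threshold:
            chunks.append([curr])      # meaning shifted: start a new chunk
        else:
            chunks[-1].append(curr)    # still on topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]

sentences = [
    "The cat is a popular pet.",
    "A dog is also a common pet.",
    "The stock market fell today.",
    "Every stock price dropped.",
]
chunks = semantic_split(sentences)
for chunk in chunks:
    print(chunk)
```

With these made-up sentences, the two pet sentences stay together and the two stock sentences form a second chunk, because only the pet-to-stock transition crosses the threshold.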

This exercise is part of the course

Retrieval Augmented Generation (RAG) with LangChain


Exercise instructions

  • Instantiate the 'text-embedding-3-small' embedding model from OpenAI.
  • Create a semantic text splitter that uses vector gradients to determine semantic similarity and uses 0.8 as the threshold at which to split.
  • Split the document using the semantic splitter.
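For intuition about the gradient option, here is a simplified pure-Python sketch; it is not the library's code, and the library may derive its cutoff differently. It assumes a precomputed list of distances between consecutive sentences (the numbers are made up) and reports every position where the rate of change of that distance exceeds a cutoff. The helper names and the `cutoff` parameter are hypothetical.

```python
def gradient(values):
    """Finite-difference gradient: central differences inside, one-sided at the ends."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    grads = [values[1] - values[0]]
    for i in range(1, n - 1):
        grads.append((values[i + 1] - values[i - 1]) / 2)
    grads.append(values[-1] - values[-2])
    return grads

def gradient_breakpoints(distances, cutoff):
    """Return the indices where the distance curve rises faster than the cutoff."""
    return [i for i, g in enumerate(gradient(distances)) if g > cutoff]

# Made-up distances between consecutive sentences; the spike suggests a topic change.
distances = [0.10, 0.12, 0.11, 0.70, 0.15]
breakpoints = gradient_breakpoints(distances, cutoff=0.2)
print(breakpoints)
```

Working with the gradient rather than the raw distance makes the cutoff sensitive to sudden changes instead of absolute levels, which can help when all distances in a document are uniformly high or low.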

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Instantiate an OpenAI embeddings model
embedding_model = ____(api_key="", model='____')

# Create the semantic text splitter with desired parameters
semantic_splitter = ____(
    embeddings=____, breakpoint_threshold_type="____", breakpoint_threshold_amount=____
)

# Split the document
chunks = ____
print(chunks[0])