

Splitting semantically

All of the splitting strategies you've used up to this point share the same drawback: they choose split points without considering the meaning of the surrounding text, so related content can easily end up in separate chunks and lose its context.

In this exercise, you'll create and apply a semantic text splitter, which is a cutting-edge experimental method for splitting text based on semantic meaning. When the splitter detects that the meaning of the text has deviated past a certain threshold, a split will be performed.
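To make the idea concrete, here is a minimal, dependency-free sketch of threshold-based semantic splitting. It is not the library's implementation: `embed` is a toy word-count stand-in for a real embedding model, the function names are hypothetical, and the split rule is a plain distance threshold between neighbouring sentences rather than the gradient-based rule used later in the exercise.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for an embedding model (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty sentence as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def semantic_split(text, threshold=0.8):
    """Start a new chunk whenever neighbouring sentences differ by more than threshold."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for previous, current in zip(sentences, sentences[1:]):
        if cosine_distance(embed(previous), embed(current)) > threshold:
            chunks.append([current])  # meaning shifted: begin a new chunk
        else:
            chunks[-1].append(current)  # still on topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]


sample = (
    "Cats purr when happy. Cats sleep when tired. "
    "Stock markets fell sharply today. Stock markets may recover tomorrow."
)
for chunk in semantic_split(sample):
    print(chunk)
```

With a real embedding model the distances reflect meaning rather than shared words, but the control flow is the same: compare neighbouring pieces of text and split where the change exceeds the threshold.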

Instantiate the 'text-embedding-3-small' embedding model from OpenAI.
Create a semantic text splitter that uses the gradient of the distances between embeddings to decide where to split, with 0.8 as the breakpoint threshold.
Split the document using the semantic splitter.
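
The three steps above might be sketched as follows. This assumes the `langchain-openai` and `langchain-experimental` packages are installed and that you have a valid OpenAI API key; the function names and the `api_key` and `documents` parameters are hypothetical, and `documents` is assumed to be a list of already-loaded LangChain documents. The imports sit inside the function only so that the sketch can be loaded without those packages present.

```python
def build_semantic_splitter(api_key):
    """Create a gradient-based semantic splitter backed by OpenAI embeddings."""
    # Assumption: langchain-openai and langchain-experimental are installed.
    from langchain_openai import OpenAIEmbeddings
    from langchain_experimental.text_splitter import SemanticChunker

    # Step 1: instantiate the embedding model
    embedding_model = OpenAIEmbeddings(
        api_key=api_key,
        model="text-embedding-3-small",
    )

    # Step 2: create the splitter with the gradient breakpoint type
    # and a threshold amount of 0.8
    return SemanticChunker(
        embeddings=embedding_model,
        breakpoint_threshold_type="gradient",
        breakpoint_threshold_amount=0.8,
    )


def split_semantically(documents, api_key):
    """Step 3: split the loaded documents into semantically coherent chunks."""
    splitter = build_semantic_splitter(api_key)
    return splitter.split_documents(documents)
```

Calling `split_semantically(documents, api_key)` sends text to the OpenAI embeddings endpoint, so it needs network access and will incur API usage.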