From vectors to graphs

1. From vectors to graphs

In this chapter, we'll discuss how to use a graph database in our RAG architecture to replace vector storage and retrieval.

2. Vector RAG limitations

The RAG architectures we've explored involve embedding a user input and querying a vector store to return relevant documents based on their semantic similarity. Although powerful, this approach does have some limitations. Firstly, document embedding captures semantic meaning but struggles to capture themes and relationships between entities in the document corpus.
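To make the retrieval step concrete, here's a minimal sketch of similarity-based lookup in plain Python. The three-dimensional vectors and document names are hypothetical stand-ins for real embeddings, and a real vector store would use an embedding model and an index rather than a dictionary.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings"; real ones have hundreds of dimensions
store = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}

def retrieve(query_vector, k=2):
    """Return the k document names most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embedded user input
top = retrieve([1.0, 0.0, 0.0])
print(top)
```

Notice that each document is scored against the query on its own. Nothing in this lookup records how the entities mentioned in one document relate to those in another.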

3. Vector RAG limitations

Moreover, as the volume of the database grows, the retrieval process can become less efficient, as the computational load increases with the search space.

4. Vector RAG limitations

Lastly, vector RAG systems don't easily accommodate structured or diverse data, which are harder to embed.

5. Graph databases

We can address all of those challenges with graphs. Graphs are great at representing and storing diverse and interconnected information in a structured manner. Entities, like people, places, and sports teams

6. Graph databases - nodes

are represented by nodes, and relationships between entities

7. Graph databases - edges

are represented by labeled edges. Notice that edges are directional, so relationships can apply from one entity to another, but not necessarily the other way around. We'll look at this more closely in a later video.
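As a rough illustration of directionality, we can sketch edges as (source, label, target) triples in plain Python. The entity names and edge labels here are hypothetical and aren't taken from the slides.

```python
# Each edge is a (source, label, target) triple; the order encodes direction
edges = [
    ("Alice", "PLAYS_FOR", "City FC"),
    ("City FC", "BASED_IN", "Springfield"),
]

def has_edge(source, label, target):
    """Check whether a labeled edge exists in this exact direction."""
    return (source, label, target) in edges

forward = has_edge("Alice", "PLAYS_FOR", "City FC")   # Alice -> City FC
backward = has_edge("City FC", "PLAYS_FOR", "Alice")  # City FC -> Alice

print(forward, backward)
```

The relationship holds from Alice to City FC, but the reverse lookup finds nothing, because no edge was stored in that direction.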

8. Neo4j graph databases

Neo4j is a powerful graph database option designed to store and efficiently query complex relationships. Throughout this chapter, we'll integrate Neo4j with LangChain to build a Graph RAG architecture.

9. From graphics to graphs...

Let's return to our example and build a Neo4j graph using this information.

10. From graphics to graphs...

Our entities are represented as nodes, where the color indicates the entity type, such as a person. The relationships are represented by edges with types like LOCATED and INTERESTED.

11. From graphics to graphs...

Each node also has a type and a unique identifier. Nodes can contain any number of properties, represented as key-value pairs.
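One way to picture this structure is with small Python dataclasses. This is only a sketch of the data model, not how Neo4j stores things internally. The class names, identifiers, and property values are hypothetical, while the edge types LOCATED and INTERESTED come from our example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str     # unique identifier
    type: str   # entity type, such as "Person"
    properties: dict = field(default_factory=dict)  # arbitrary key-value pairs

@dataclass
class Edge:
    source: Node
    target: Node
    type: str   # relationship type, such as "LOCATED"

# Hypothetical entities for illustration
person = Node(id="person-1", type="Person", properties={"name": "Jane", "age": 30})
city = Node(id="place-1", type="Place", properties={"name": "Paris"})
topic = Node(id="topic-1", type="Topic")  # a node with no properties at all

edges = [
    Edge(source=person, target=city, type="LOCATED"),
    Edge(source=person, target=topic, type="INTERESTED"),
]

for edge in edges:
    print(f"({edge.source.id})-[{edge.type}]->({edge.target.id})")
```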

12. Loading and chunking Wikipedia pages

So how do we go from the unstructured text data we've seen to a nice structured graph? There are a few ways to do this, but we'll be using LLMs. Let's load the Wikipedia results from searching "large language model" using the WikipediaLoader class, and split the first few documents into chunks. Each document has page content and metadata as seen here.
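The loading and chunking step might look something like the sketch below. It assumes the langchain-community, langchain-text-splitters, wikipedia, and tiktoken packages are installed. The choice of TokenTextSplitter, the chunk sizes, the number of documents, and the function name load_and_chunk are all assumptions for illustration.

```python
def load_and_chunk(query="large language model", n_docs=3,
                   chunk_size=100, chunk_overlap=20):
    """Load Wikipedia search results and split the first few into chunks."""
    # Imports are deferred so that defining this function doesn't
    # require the packages; calling it does (plus network access)
    from langchain_community.document_loaders import WikipediaLoader
    from langchain_text_splitters import TokenTextSplitter

    # Each result is a Document with .page_content and .metadata
    raw_documents = WikipediaLoader(query=query).load()

    # Assumed splitter settings; tune these for your own data
    splitter = TokenTextSplitter(chunk_size=chunk_size,
                                 chunk_overlap=chunk_overlap)
    return splitter.split_documents(raw_documents[:n_docs])

# Usage (requires network access):
# documents = load_and_chunk()
# print(documents[0].page_content)
# print(documents[0].metadata)
```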

13. From text to graphs!

We begin by defining the LLM to use for the transformation, and use it to create an LLMGraphTransformer. Note that we use temperature=0 to produce more deterministic graphs for greater reliability. The LLM creates structured graph documents by parsing and categorizing entities and their relationships, which it infers from the documents. The transformation is performed using the .convert_to_graph_documents() method on the documents.
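In code, these steps could be sketched roughly as follows. This assumes the langchain-openai and langchain-experimental packages are installed and that an OPENAI_API_KEY environment variable is set. The model name and the function name build_graph_documents are assumptions, since the narration doesn't specify them.

```python
def build_graph_documents(documents, model="gpt-4o-mini"):
    """Turn text chunks into graph documents with an LLM."""
    # Imports are deferred so that defining this function doesn't
    # require the packages or an API key; calling it does
    from langchain_openai import ChatOpenAI
    from langchain_experimental.graph_transformers import LLMGraphTransformer

    # temperature=0 makes the extracted graphs more deterministic
    llm = ChatOpenAI(model=model, temperature=0)  # model name is an assumption

    transformer = LLMGraphTransformer(llm=llm)

    # The LLM infers entities and relationships from each document
    return transformer.convert_to_graph_documents(documents)

# Usage, given a list of chunked Document objects named documents:
# graph_documents = build_graph_documents(documents)
# print(graph_documents[0])
```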

14. From text to graphs!

In the output, we can see how the model inferred many entities from the text and created nodes with ids and types to match. Relationships between the entities were also inferred and mapped using edges going from a source node to a target node.
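To read that output more easily, we could flatten a graph document into simple tuples. The helper below relies only on the nodes and relationships attributes of a graph document, and on the id, type, source, and target attributes of their items. The example object is a hypothetical stand-in built with SimpleNamespace, not real model output.

```python
from types import SimpleNamespace

def summarize_graph_document(graph_document):
    """Return (id, type) pairs for nodes and (source, type, target) for edges."""
    nodes = [(node.id, node.type) for node in graph_document.nodes]
    edges = [
        (rel.source.id, rel.type, rel.target.id)
        for rel in graph_document.relationships
    ]
    return nodes, edges

# Hypothetical stand-in mimicking the shape of a graph document
llm_node = SimpleNamespace(id="Large Language Model", type="Concept")
task_node = SimpleNamespace(id="Text Generation", type="Task")
example = SimpleNamespace(
    nodes=[llm_node, task_node],
    relationships=[
        SimpleNamespace(source=llm_node, target=task_node, type="USED_FOR"),
    ],
)

nodes, edges = summarize_graph_document(example)
print(nodes)
print(edges)
```

Each edge tuple reads from the source node to the target node, matching the direction the model inferred.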

15. Let's practice!

Let's create some graphs!