Storing and querying documents

1. Storing and querying documents

Now that we've created our graph documents, we'll need to store them for querying.

2. Instantiating the Neo4j database

We'll be using Neo4j to store and query our graph documents. We won't be going through the Neo4j setup in this course, but feel free to check out the link shown for information on how to do this. Neo4j has both cloud-based and local versions to suit different use cases. For our purposes, we'll assume that the database is already available locally, which will also be the case in the exercises. We instantiate a graph with the Neo4jGraph class, specifying the URL of the Neo4j database server and the credentials needed to access it. In a production setting, these credentials should be stored as environment variables rather than committed to a codebase, for better security.
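As a rough sketch of that setup, here is one way we might read the connection details from environment variables before creating the graph. The variable names NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD, the default URL, and both helper functions are assumptions made for illustration. The import path assumes the langchain_community package; newer releases also ship the class in a separate langchain_neo4j package.

```python
import os


def neo4j_settings(env=None):
    """Collect connection settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return {
        "url": env.get("NEO4J_URI", "bolt://localhost:7687"),  # assumed local default
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": env["NEO4J_PASSWORD"],  # no default: never hard-code a password
    }


def connect_to_graph(env=None):
    """Instantiate the graph from the settings above."""
    # Imported lazily so the settings helper works without the library installed.
    from langchain_community.graphs import Neo4jGraph

    return Neo4jGraph(**neo4j_settings(env))
```

With the variables set in the shell, calling connect_to_graph() would give us the graph object used in the rest of this lesson.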

3. Storing graph documents

Carrying on from the graph documents we created from Wikipedia results on large language models, we can add these graph documents to our database

4. Storing graph documents

using the .add_graph_documents() method. The include_source parameter links nodes to their source documents, which are also represented as nodes, by including a MENTIONS relationship in the graph, enabling better traceability and context preservation. The baseEntityLabel parameter assigns an additional __Entity__ label to each node, improving query performance.
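To make the call concrete, here is a minimal sketch of storing the documents with both parameters switched on. The wrapper function store_graph_documents is a hypothetical helper. It assumes graph is a Neo4jGraph instance and graph_documents is the list we created earlier.

```python
def store_graph_documents(graph, graph_documents):
    """Write graph documents to the database with source links and entity labels."""
    graph.add_graph_documents(
        graph_documents,
        include_source=True,   # add source document nodes and MENTIONS relationships
        baseEntityLabel=True,  # add the extra __Entity__ label to every node
    )
```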

5. Visualizing the graph

Here is what our graph looks like. There's a lot here, so let's zoom in on a subsection of the graph.

6. Visualizing the graph

We can see lots of familiar nodes and relationships representing the different entities in the search results. The pink nodes are the source documents we specified when adding the graph documents; each one has a MENTIONS relationship from the source to the entity mentioned.

7. Database schema

We can also view the database schema with the .get_schema attribute. Here, we can see the different node types and relationships, including their direction.
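Reading the schema in code might look like the sketch below. The helper name current_schema is hypothetical, and graph is assumed to be a Neo4jGraph instance. Refreshing first is a cautious choice here so that recently stored nodes are reflected, and it may not be needed in every setup.

```python
def current_schema(graph):
    """Return the schema text after asking the graph to re-read it."""
    graph.refresh_schema()   # re-read the schema from the database
    return graph.get_schema  # a property, so no parentheses
```

Printing the returned string lists the node labels, their properties, and the relationships between them.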

8. Querying Neo4j - Cypher Query Language

Now let's talk about querying our database. Neo4j introduced Cypher Query Language in 2011 as a declarative query language for intuitively navigating and manipulating graph data using a SQL-like syntax. Consider the example of a social network using graphs to map social connections. We could use Cypher to find who James is friends with.

9. Querying Neo4j - Cypher Query Language

Here's what the Cypher code looks like. Let's break it down.

10. Querying Neo4j - Cypher Query Language

The code looks for a match between a node with name "James" and another node, indicated with the friend variable, joined with a FRIEND relationship. Notice that the relationship sits in the center of an arrow that indicates the direction of the relationship from one node to another.

11. Querying Neo4j - Cypher Query Language

Person is the node label. We identify the source node by its name property, and friend is a variable that stands in for the node we're searching for. Notice the code finishes with RETURN friend, so the mystery friend is revealed.

12. Querying Neo4j - Cypher Query Language

This query reveals the mystery friend's name!
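For reference, the query just described can be sketched in text form and run from Python as shown below. This is a reconstruction from the description rather than the exact code on the slide. The helper find_friends_of_james is hypothetical, and graph is assumed to be a Neo4jGraph instance connected to a social network database.

```python
# Cypher reconstructed from the walkthrough: a Person node named "James",
# an outgoing FRIEND relationship, and a variable for the node on the other end.
FRIENDS_OF_JAMES = """
MATCH (:Person {name: "James"})-[:FRIEND]->(friend)
RETURN friend
"""


def find_friends_of_james(graph):
    """Run the query and return one result row per matching friend."""
    return graph.query(FRIENDS_OF_JAMES)
```

Each row in the result is a dictionary keyed by the name used in the RETURN clause, here friend.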

13. Querying the LLM graph

Let's query our database of Wikipedia results about LLMs. We'll query the database to find out who developed the GPT-4 model. Our query looks for a match between a model and an organization joined by the DEVELOPED_BY relationship. There we have it!
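A sketch of what that query might look like is shown below. The labels Model and Organization, the id value "GPT-4", and the helper name who_developed_gpt4 are assumptions. The actual labels and ids depend on what was extracted when the graph documents were created, so it's worth checking them against the schema first.

```python
# Assumed labels and id value; adjust them to match the schema of your own graph.
GPT4_DEVELOPER = """
MATCH (model:Model {id: "GPT-4"})-[:DEVELOPED_BY]->(org:Organization)
RETURN org.id AS organization
"""


def who_developed_gpt4(graph):
    """Return the ids of organizations linked to GPT-4 by DEVELOPED_BY."""
    rows = graph.query(GPT4_DEVELOPER)
    return [row["organization"] for row in rows]
```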

14. Let's practice!

Time to practice storing graph documents and querying the database.
