Memory graphs
1. Memory graphs
Having both short-term and long-term memory is key to the success of our application.
2. Memory in RAG applications
Short-term memory typically exists for a short amount of time, for example within the scope of a single conversation, and doesn't persist beyond it. Long-term memory refers to the facts that the application learns over time, for example, a user's preferences for output structure. Long-term memory is often used to personalize the application for organizations or individuals.
3. Short-term memory
While most LangChain memory implementations treat conversation history as simply a list of messages, those messages contain rich information about the user.
6. Short-term to long-term memory
Using structured outputs, we can extract facts from the conversation history and use them to enrich our knowledge graph. These facts can be linked to the conversation or even to the individual messages.
11. Neo4j chat message history
The langchain-neo4j package provides a Neo4jChatMessageHistory class that saves and retrieves conversation history from a Neo4j database. The class expects a database URL, a username and password, and a string representing the ID of the current session. From there, you can build up the conversation history using the .add_user_message() and .add_ai_message() methods.
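As a rough sketch, assuming the langchain-neo4j package is installed and a Neo4j instance is reachable (the URL, credentials, session ID, and messages below are placeholders), the history can be built up like this:

```python
from langchain_neo4j import Neo4jChatMessageHistory

# Connect the chat history to a Neo4j database (placeholder connection details).
history = Neo4jChatMessageHistory(
    session_id="session-1",        # identifies the current conversation
    url="bolt://localhost:7687",
    username="neo4j",
    password="password",
)

# Append messages to the stored conversation as they occur.
history.add_user_message("Which suppliers did we discuss last week?")
history.add_ai_message("You discussed Acme Corp and Globex, focusing on delivery times.")
```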
12. Problems with storing everything
This implementation stores absolutely everything in memory, which may work for a while but can become impractical over time and harm context retrieval. We'll look at two variations of this implementation to handle this issue: a context window and conversation summaries.
13. Neo4j chat message history
The class maintains the history in the database, so the next time you instantiate the class with the same session ID, the history is retrieved from the database. We can specify the window parameter to control the number of latest messages to retrieve. The idea is that these messages, being the most recent, will hopefully be the most relevant. Reducing the context window improves cost and latency, since we're processing less information, but it may also degrade performance.
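A minimal sketch of both behaviours, reusing the placeholder connection details from above; it assumes window is passed when the class is instantiated, and the value of 4 is chosen purely for illustration:

```python
# Re-instantiating with the same session ID retrieves the stored history;
# window limits retrieval to the most recent messages.
recent_history = Neo4jChatMessageHistory(
    session_id="session-1",
    url="bolt://localhost:7687",
    username="neo4j",
    password="password",
    window=4,
)

print(recent_history.messages)  # only the latest messages for this session
```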
14. Summarizing conversations
Alternatively, we can summarize the conversation to capture longer-term memories in a more manageable context size. We do this by defining a structured output class using pydantic's BaseModel class. Recall that we did something similar when extracting entities to build domain graphs in Chapter 2. We will use this class to hold the facts from the conversation as fields for easier retrieval, for example, the people, subjects, and relationships.
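A sketch of such a class; the field names (people, subjects, relationships) mirror the examples above but are otherwise illustrative:

```python
from typing import List

from pydantic import BaseModel, Field

class ConversationFacts(BaseModel):
    """Facts extracted from a conversation, held as structured fields."""
    people: List[str] = Field(description="People mentioned in the conversation")
    subjects: List[str] = Field(description="Subjects or topics discussed")
    relationships: List[str] = Field(
        description="Relationships between the people and subjects"
    )
```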
This output is then bound to a language model using the .with_structured_output() method, passing in the class we created.
Finally, we create a chat prompt template with instructions to extract any facts from the conversation, and use the MessagesPlaceholder class to hold the conversation history. Then, when the chain is invoked, we can use the messages property on the conversation history instance to populate the history. The output will be a list of facts that can be saved to a knowledge graph.
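Putting the pieces together, a minimal sketch of the extraction chain might look like this; the model choice and prompt wording are assumptions, and history is the Neo4jChatMessageHistory instance from earlier:

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

# Bind the structured output class to a language model (model choice is an assumption).
llm = ChatOpenAI(model="gpt-4o-mini")
fact_extractor = llm.with_structured_output(ConversationFacts)

# Prompt with instructions to extract facts, plus a placeholder for the history.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Extract any facts about people, subjects, and relationships "
               "from the conversation below."),
    MessagesPlaceholder(variable_name="history"),
])

chain = prompt | fact_extractor

# Populate the placeholder with the stored messages and extract the facts.
facts = chain.invoke({"history": history.messages})
print(facts.people, facts.subjects, facts.relationships)
```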
17. Let's practice!
Let's take all of these elements and summarize a conversation already in our Neo4j database.