Splitting the play into Acts

Converting unstructured text into a hierarchical lexical graph is an iterative process: you build a splitter for each lexical entity, then split the text by each one in turn.
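As a rough illustration of "splitting by each entity in turn", the sketch below uses only Python's standard `re` module rather than the splitter from this exercise. The miniature `sample` text, the `split_keep_heading` helper, and the heading patterns are all hypothetical stand-ins chosen for the example; the dictionary it builds is just one simple way to hold an act-to-scene hierarchy.

```python
import re

# Hypothetical miniature text standing in for the full play.
sample = (
    "ACT I\n\n"
    "SCENE I. A public place.\nSAMPSON.\n...\n\n"
    "SCENE II. A street.\n...\n\n"
    "ACT II\n\n"
    "SCENE I. An open place.\n...\n"
)

def split_keep_heading(text, heading_pattern):
    """Split text just before each line that starts with heading_pattern."""
    # The lookahead marks the split point without consuming the heading.
    parts = re.split(rf"(?m)^(?={heading_pattern})", text)
    return [part.strip() for part in parts if part.strip()]

# Level 1: acts. Level 2: scenes inside each act.
graph = {}
for act in split_keep_heading(sample, r"ACT [IVX]+"):
    act_title, _, act_body = act.partition("\n")
    scenes = split_keep_heading(act_body, r"SCENE [IVX]+\.")
    graph[act_title] = [scene.split("\n")[0] for scene in scenes]

for act_title, scene_titles in graph.items():
    print(act_title, scene_titles)
```

Each level reuses the same helper with a different pattern, which is the iterative idea: one splitter per entity, applied to the output of the level above.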

In this exercise, you'll design a splitter to split the play, Romeo and Juliet, into acts. Here is a preview of the structure of the play:

The Project Gutenberg eBook of Romeo and Juliet
This ebook is for the use of anyone anywhere in the United States...

**PROLOGUE:**

 Enter Chorus.

CHORUS.
Two households, both alike in dignity...

ACT I

SCENE I. A public place.
 Enter Sampson and Gregory armed with swords and bucklers.

SAMPSON.
Gregory, on my word, we’ll not carry coals...
...

This exercise is part of the course

Graph RAG with LangChain and Neo4j

Exercise instructions

Update the separators argument to also split the text on the pattern \n\nACT.
Configure the act_splitter to treat the separators list as regular expressions.
Split romeo_and_juliet using act_splitter.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

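# romeo_and_juliet (the play text as a string) is assumed to be preloaded
from langchain_text_splitters import RecursiveCharacterTextSplitter

act_splitter = RecursiveCharacterTextSplitter(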
  separators=[ 
    r"\n\nTHE PROLOGUE.",
    r"\n\n\*\*\* END",
    # Split by the word ACT
    r"____"
  ],
  # Configure the patterns as regular expressions
  ____=True
)

# Split the play using act_splitter
acts = act_splitter.____(____)

for act in acts:
  print(act.strip().split("\n")[0])
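To see what the separator patterns match, and why they need to be read as regular expressions, here is a small sketch using only the standard `re` module. It is not the LangChain splitter and does not reproduce its recursive or chunk-size behaviour; it only approximates the effect of cutting the text just before each separator. The `sample` string is an abbreviated stand-in for the play, and `"\n\nACT"` is the assumed value of the blank, taken from the instructions.

```python
import re

# Abbreviated stand-in for the play text; the real file is much longer.
sample = (
    "The Project Gutenberg eBook of Romeo and Juliet\n\n"
    "THE PROLOGUE.\n\nCHORUS.\nTwo households, both alike in dignity...\n\n"
    "ACT I\n\nSCENE I. A public place.\n\n"
    "ACT II\n\nSCENE I. ...\n\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK ***\n"
)

# Same patterns as the exercise; "\n\nACT" is the assumed value of the blank.
separators = [r"\n\nTHE PROLOGUE.", r"\n\n\*\*\* END", r"\n\nACT"]

# In a raw string, \n is a backslash followed by "n", so a literal
# substring search finds nothing; only the regex engine reads it as a newline.
literal_hit = separators[2] in sample
regex_hit = re.search(separators[2], sample) is not None
print(literal_hit, regex_hit)

# Split just before any separator, keeping each separator with the piece it starts.
boundary = "|".join(f"(?={sep})" for sep in separators)
pieces = re.split(boundary, sample)

# Mirror the exercise's loop: show the first line of every piece.
first_lines = [piece.strip().split("\n")[0] for piece in pieces]
for line in first_lines:
    print(line)
```

The same reasoning applies to `\*\*\*`: the backslashes are regex escapes for literal asterisks, so the pattern only finds the Project Gutenberg footer when it is interpreted as a regular expression.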

