
Exercise

Recursively splitting documents

Splitting on a single character is simple and predictable, but it often produces suboptimal chunks. In this exercise, you'll apply recursive character splitting to the Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks paper you loaded in an earlier exercise.

Recall that recursive character splitting works through a list of separators in order: it splits on the first separator, and any piece still longer than the chunk_size limit is split again on the next separator in the list.
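To make the recursion concrete, here is a minimal pure-Python sketch of the idea. It is an illustration only, not LangChain's implementation: the function name recursive_split is hypothetical, and the sketch drops the separators, applies no overlap, and does not merge small pieces back together as the library does.

```python
def recursive_split(text, separators, chunk_size):
    """Simplified recursive splitting (hypothetical helper, not LangChain's code).

    Tries each separator in order and only moves to the next one for
    pieces that are still longer than chunk_size.
    """
    if len(text) <= chunk_size:
        # Small enough already; skip empty pieces left over from splitting.
        return [text] if text else []
    if not separators:
        # No separators left to try, so return the oversized piece as-is.
        return [text]

    sep, remaining = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut the text at fixed character widths.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    chunks = []
    for piece in text.split(sep):
        # Pieces that are still too long fall through to the next separator.
        chunks.extend(recursive_split(piece, remaining, chunk_size))
    return chunks


sample = "First line is short.\nThe second line is quite a bit longer than the limit allows."
chunks = recursive_split(sample, ["\n", ".", " ", ""], chunk_size=30)
for chunk in chunks:
    print(repr(chunk))
```

Because this sketch never merges neighbouring pieces, the long second line ends up as single words. A production splitter recombines adjacent pieces up to the size limit, which is why its chunks are usually much closer to chunk_size.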

Instructions

  • Define a LangChain recursive character text splitter to split recursively through the character list ['\n', '.', ' ', ''] with a chunk size of 75 and chunk overlap of 10.
  • Split document using the text_splitter you defined and an appropriate method.
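One possible shape for these two steps is sketched below. It assumes the langchain-text-splitters and langchain-core packages are installed (older installs expose the same class under langchain.text_splitter), and it builds a small stand-in Document because the paper loaded in the earlier exercise is not available here. The helper names build_splitter, split_pages, and demo are hypothetical; the imports sit inside the functions so that merely defining them does not require the packages.

```python
def build_splitter():
    # Imported lazily so the dependency is only needed when this is called.
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    return RecursiveCharacterTextSplitter(
        separators=["\n", ".", " ", ""],  # tried in this order
        chunk_size=75,                    # maximum characters per chunk
        chunk_overlap=10,                 # characters shared between neighbours
    )


def split_pages(pages):
    """Split a list of LangChain Document objects into smaller Documents."""
    text_splitter = build_splitter()
    return text_splitter.split_documents(pages)


def demo():
    from langchain_core.documents import Document

    # Stand-in for the loaded paper (an assumption made for illustration).
    document = [
        Document(
            page_content=(
                "Large pre-trained language models store factual knowledge "
                "in their parameters.\nTheir ability to access and precisely "
                "manipulate that knowledge is still limited."
            ),
            metadata={"page": 0},
        )
    ]
    chunks = split_pages(document)
    for chunk in chunks:
        print(len(chunk.page_content), repr(chunk.page_content))
    return chunks
```

Calling demo() prints each chunk with its length. In the exercise itself, the list returned by your document loader would take the place of the stand-in document.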