
Splitting Python files

Although text and code files contain the same characters, code files have structure beyond natural language, such as class and function definitions. To retain this code-specific context during document splitting, configure the splitter to try splitting on the most common code structures first. Fortunately, LangChain provides functionality to do just that!

All of the necessary classes have been imported for you, including Language from langchain_text_splitters.
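To make the idea of "splitting on code structures first" concrete, here is a minimal sketch in plain Python. It is a simplified stand-in, not LangChain's implementation: the `split_code` helper and the `SEPARATORS` list are hypothetical names chosen for illustration, and unlike the LangChain splitter this version neither merges small pieces nor adds overlap between chunks.

```python
# Hypothetical separators, ordered from coarsest code structure to finest.
# (Assumption for illustration; not the exact list LangChain uses.)
SEPARATORS = ["\nclass ", "\ndef ", "\n\n", "\n", " "]


def split_code(text, chunk_size, separators=SEPARATORS):
    """Recursively split text, trying structural separators first."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, rest = separators[0], separators[1:]
    if sep not in text:
        return split_code(text, chunk_size, rest)

    pieces = text.split(sep)
    # Re-attach the separator to the front of every piece after the first,
    # so "class"/"def" keywords stay with the code that follows them.
    pieces = [pieces[0]] + [sep + p for p in pieces[1:]]

    chunks = []
    for piece in pieces:
        # Oversized pieces are split again with the finer separators.
        chunks.extend(split_code(piece, chunk_size, rest))
    return chunks


source = (
    "import os\n"
    "\ndef load(path):\n    return open(path).read()\n"
    "\nclass Parser:\n    def parse(self, text):\n        return text.split()\n"
)

chunks = split_code(source, chunk_size=80)
for i, chunk in enumerate(chunks, 1):
    print(f"Chunk {i}:\n{chunk}\n")
```

Because the class boundary is tried before any finer separator, the `Parser` class lands in its own chunk rather than being cut mid-definition, which is the behavior a Python-aware splitter aims for.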

This exercise is part of the course

Retrieval Augmented Generation (RAG) with LangChain


Exercise instructions

  • Create a recursive character splitter that will split on common Python code structures.
  • Split the python_data document loader into chunks.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create a Python-aware recursive character splitter
python_splitter = RecursiveCharacterTextSplitter.____(
    ____, chunk_size=300, chunk_overlap=100
)

# Split the Python content into chunks
chunks = ____

for i, chunk in enumerate(chunks[:3]):
    print(f"Chunk {i+1}:\n{chunk.page_content}\n")