Defining a function for chunking
To be able to batch upserts in a reproducible way, you'll need to define a function to split your list of vectors into chunks.
The built-in itertools module has already been imported for you.
This exercise is part of the course Vector Databases for Embeddings with Pinecone.
Exercise instructions
- Convert the iterable input into an iterator.
- Slice it into chunks of size batch_size using the itertools module.
- Yield the current chunk.
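The mechanics behind these steps can be tried on their own before touching the exercise. The sketch below is only an illustration: the list numbers and the chunk size of 2 are made up for the demo. It shows that an iterator remembers its position, so repeated itertools.islice calls return consecutive slices and finally an empty tuple once the input is exhausted.

```python
import itertools

# Hypothetical sample data, chosen only for this demo
numbers = [1, 2, 3, 4, 5]

# iter() returns an iterator that keeps track of how far it has been consumed
it = iter(numbers)

# Each islice call takes up to 2 items from wherever the iterator currently is
first = tuple(itertools.islice(it, 2))
second = tuple(itertools.islice(it, 2))
third = tuple(itertools.islice(it, 2))   # only one item is left
fourth = tuple(itertools.islice(it, 2))  # iterator is exhausted

print(first, second, third, fourth)  # (1, 2) (3, 4) (5,) ()
```

The empty tuple at the end is falsy, which is what lets a while loop over the chunk stop on its own.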
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    # Convert the iterable into an iterator
    it = ____
    # Slice the iterator into chunks of size batch_size
    chunk = tuple(itertools.____(it, ____))
    while chunk:
        # Yield the chunk
        ____
        chunk = tuple(itertools.islice(it, batch_size))
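
For reference, one possible completion of the scaffold is sketched below, followed by a quick check of the batch sizes it produces. This is a sketch under stated assumptions, not necessarily the course's official solution. The vectors list is hypothetical placeholder data (250 dictionaries with made-up ids and values), and the import is included only so the snippet runs outside the exercise environment.

```python
import itertools

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    # Convert the iterable into an iterator
    it = iter(iterable)
    # Slice the iterator into chunks of size batch_size
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        # Yield the chunk
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

# Hypothetical placeholder vectors, used only to exercise the helper
vectors = [{"id": str(i), "values": [0.1, 0.2]} for i in range(250)]

sizes = [len(batch) for batch in chunks(vectors, batch_size=100)]
print(sizes)  # [100, 100, 50]
```

Because chunks is a generator, each batch is built only when it is requested, so the whole list of batches never has to sit in memory at once.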