
Batching upserts in parallel

In this exercise, you'll practice ingesting vectors into the 'datacamp-index' Pinecone index in parallel. You'll need to connect to the index, upsert vectors in batches asynchronously, and check the index's updated metrics.

The chunks() helper function you created earlier is still available to use:

import itertools

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))
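To see how the helper behaves, here's a small standalone example (the five-item range is just illustrative data, not part of the exercise):

    import itertools

    def chunks(iterable, batch_size=100):
        """A helper function to break an iterable into chunks of size batch_size."""
        it = iter(iterable)
        chunk = tuple(itertools.islice(it, batch_size))
        while chunk:
            yield chunk
            chunk = tuple(itertools.islice(it, batch_size))

    # Five items split into batches of two: the final batch is smaller.
    batches = list(chunks(range(5), batch_size=2))
    print(batches)  # [(0, 1), (2, 3), (4,)]

Note that the last chunk simply contains whatever is left over, so no vectors are dropped when the total count isn't a multiple of the batch size.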

This exercise is part of the course

Vector Databases for Embeddings with Pinecone


Exercise instructions

  • Initialize the Pinecone client to allow 20 simultaneous requests.
  • Upsert the vectors stored in the vectors list in batches of 200 vectors per request asynchronously, configuring 20 simultaneous requests.
  • Print the updated metrics of the 'datacamp-index' Pinecone index.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Initialize the client
pc = Pinecone(api_key="____", ____)

index = pc.Index('datacamp-index')

# Upsert vectors in batches of 200 vectors
with pc.Index('datacamp-index', ____) as index:
    async_results = [____(vectors=chunk, ____) for chunk in chunks(vectors, batch_size=____)]
    [async_result.get() for async_result in async_results]

# Retrieve statistics of the connected Pinecone index
print(____)
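The overall pattern is: submit every batch without waiting, then collect all the results. Here is a minimal, runnable sketch of that pattern using a thread pool, with a hypothetical fake_upsert function and made-up vectors standing in for the real index.upsert call and your actual data (in the exercise itself, the Pinecone client manages the request pool for you):

    from concurrent.futures import ThreadPoolExecutor
    import itertools

    def chunks(iterable, batch_size=100):
        """A helper function to break an iterable into chunks of size batch_size."""
        it = iter(iterable)
        chunk = tuple(itertools.islice(it, batch_size))
        while chunk:
            yield chunk
            chunk = tuple(itertools.islice(it, batch_size))

    def fake_upsert(batch):
        """Stand-in for a real upsert request; returns how many vectors it 'wrote'."""
        return len(batch)

    # Made-up vectors in the id/values shape Pinecone expects.
    vectors = [{"id": str(i), "values": [0.1, 0.2]} for i in range(450)]

    with ThreadPoolExecutor(max_workers=20) as pool:
        # Submit all batches first (non-blocking)...
        async_results = [pool.submit(fake_upsert, chunk)
                         for chunk in chunks(vectors, batch_size=200)]
        # ...then block until every batch has completed.
        upserted = sum(result.result() for result in async_results)

    print(upserted)  # 450

The key idea carries over directly: building the full list of in-flight requests before calling .get() (or .result() here) is what lets the 20 workers run concurrently instead of one batch at a time.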