Batching upserts in parallel
In this exercise, you'll practice ingesting vectors into the 'datacamp-index' Pinecone index in parallel. You'll need to connect to the index, upsert vectors in batches asynchronously, and check the updated metrics of the 'datacamp-index' index.
The chunks() helper function you created earlier is still available to use:
import itertools

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))
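As a quick sanity check, chunks() can be run on a small range to confirm how the batch boundaries fall (a minimal example; the exercise's vectors variable isn't needed here):

```python
import itertools

def chunks(iterable, batch_size=100):
    """Break an iterable into tuples of at most batch_size items."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

# Five items in batches of two: the final batch is shorter.
batches = list(chunks(range(5), batch_size=2))
print(batches)  # [(0, 1), (2, 3), (4,)]
```

Note that the last batch simply holds whatever remains, so callers don't need the total count to divide evenly by the batch size.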
This exercise is part of the course Vector Databases for Embeddings with Pinecone.
Exercise instructions
- Initialize the Pinecone client to allow 20 simultaneous requests.
- Upsert the vectors in vectors asynchronously in batches of 200 vectors per request, configuring 20 simultaneous requests.
- Print the updated metrics of the 'datacamp-index' Pinecone index.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Initialize the client
pc = Pinecone(api_key="____", ____)
index = pc.Index('datacamp-index')

# Upsert vectors in batches of 200 vectors
with pc.Index('datacamp-index', ____) as index:
    async_results = [____(vectors=chunk, ____) for chunk in chunks(vectors, batch_size=____)]
    [async_result.get() for async_result in async_results]

# Retrieve statistics of the connected Pinecone index
print(____)
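Running the real solution requires a live Pinecone API key, but the batching-with-parallelism pattern itself can be sketched with a thread pool and a stand-in upsert function. The fake_upsert name and the 1,000 dummy vectors below are illustrative assumptions, not part of the exercise; collecting futures and then blocking on them mirrors how the Pinecone client's async_req results are resolved with .get():

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

def chunks(iterable, batch_size=100):
    """Break an iterable into tuples of at most batch_size items."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

def fake_upsert(vectors):
    """Stand-in for an index upsert call: reports how many vectors it received."""
    return len(vectors)

# 1,000 dummy (id, values) pairs in place of real embeddings.
vectors = [(str(i), [0.0, 0.0]) for i in range(1000)]

# Submit each 200-vector batch to a pool of 20 workers (the exercise's
# "20 simultaneous requests"), then block on every future before summing.
with ThreadPoolExecutor(max_workers=20) as pool:
    async_results = [pool.submit(fake_upsert, chunk)
                     for chunk in chunks(vectors, batch_size=200)]
    upserted = sum(result.result() for result in async_results)

print(upserted)  # 1000
```

The key idea is the same as in the exercise: all batch requests are issued first so they run concurrently, and only afterwards does the code wait on each result.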