Multitenancy and namespaces

1. Multitenancy and namespaces

Batching is just one way of optimizing your query performance and data infrastructure; in this video, we'll explore another!

2. Multitenancy

Multitenancy is a software architecture where a system can serve multiple groups of users (more generally referred to as tenants) in isolation. In Pinecone, this typically refers to storing distinct datasets in separate namespaces while being served by a single index. This architecture not only allows us to separate data from different groups of users, to ensure security and privacy, but can also reduce query latency, as queries can be directed to specific namespaces, reducing the search space.

3. Multitenancy strategies

As we've discussed, namespaces can be used to separate vectors within a single index, which enables targeted queries that scan fewer records. While this reduces the need for extra indexes, sharing resources between tenants and deleting data can become a challenge. There are also other strategies available, which may suit certain use cases. With metadata filtering, we've seen that we can attach metadata to records and restrict queries to records with specific metadata. This enables querying across tenants, but tenants share resources and tracking costs per tenant becomes harder. Lastly, we could create a separate index for each tenant, providing dedicated resources to each one. This model offers maximum isolation, but it requires additional effort and cost to maintain multiple indexes. We'll focus on the first option: namespaces.
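To make the three strategies concrete, here is a minimal sketch of how the arguments to a query might differ under each one. The helper build_query, the index names, and the "tenant" metadata field are all hypothetical; only the namespace and filter argument names and the $eq operator are taken from Pinecone's query interface.

```python
def build_query(strategy, tenant, vector, top_k=3):
    """Return (index_name, query_kwargs) for one tenant under a given strategy.

    Hypothetical helper for illustration; index names and the "tenant"
    metadata field are assumptions, not part of the course material.
    """
    base = {"vector": vector, "top_k": top_k}
    if strategy == "namespace":
        # One shared index, one namespace per tenant
        return "shared-index", {**base, "namespace": tenant}
    if strategy == "metadata":
        # One shared index, tenant stored as a metadata field on each record
        return "shared-index", {**base, "filter": {"tenant": {"$eq": tenant}}}
    if strategy == "index":
        # One dedicated index per tenant, no extra query arguments needed
        return f"{tenant}-index", base
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("namespace", "metadata", "index"):
    index_name, kwargs = build_query(strategy, "acme", [0.1, 0.2, 0.3])
    print(strategy, index_name, kwargs)
```

With a real client, the returned keyword arguments would be passed along as index.query(**kwargs) on the index with the returned name.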

4. Namespaces

Namespaces are created implicitly during upsertion if they don't already exist. For example, if we upsert vector_set1 and pass namespace1 to the namespace argument, namespace1 will be created automatically. We can continue to create more namespaces by specifying new namespace names when upserting other sets of vectors.
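The implicit-creation behavior can be sketched with a toy in-memory stand-in, so the example runs without an API key. InMemoryIndex is hypothetical and is not the Pinecone client; it only mimics the upsert(vectors=..., namespace=...) call shape. The vector values and the index name in the comment are made up for illustration.

```python
from collections import defaultdict


class InMemoryIndex:
    """Toy stand-in for a Pinecone index (NOT the real client)."""

    def __init__(self):
        # namespace name -> {record id -> vector values}
        self.namespaces = defaultdict(dict)

    def upsert(self, vectors, namespace=""):
        # Writing to a namespace that doesn't exist yet creates it implicitly
        for record in vectors:
            self.namespaces[namespace][record["id"]] = record["values"]
        return {"upserted_count": len(vectors)}


vector_set1 = [
    {"id": "0", "values": [0.1, 0.2, 0.3]},
    {"id": "1", "values": [0.4, 0.5, 0.6]},
]
vector_set2 = [{"id": "0", "values": [0.7, 0.8, 0.9]}]

# With the real client this would be something like: index = pc.Index("my-index")
index = InMemoryIndex()
index.upsert(vectors=vector_set1, namespace="namespace1")
index.upsert(vectors=vector_set2, namespace="namespace2")

created = sorted(index.namespaces)
print(created)
```

Neither namespace was declared ahead of time; each one exists only because an upsert named it.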

5. Inspecting namespaces

We can inspect our namespaces with the .describe_index_stats() method we've used previously. In the output, we can see that the vectors have been successfully upserted and distributed into the two new namespaces.
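As a rough illustration of reading that output, the sketch below pulls the per-namespace counts out of a stats result. It assumes the result is available as a plain dict with namespaces and total_vector_count keys; sample_stats is made-up data and namespace_counts is a hypothetical helper.

```python
# Illustrative data only, shaped like the stats output described above
sample_stats = {
    "dimension": 3,
    "index_fullness": 0.0,
    "namespaces": {
        "namespace1": {"vector_count": 2},
        "namespace2": {"vector_count": 1},
    },
    "total_vector_count": 3,
}


def namespace_counts(stats):
    """Map each namespace name to the number of vectors it holds."""
    return {name: info["vector_count"] for name, info in stats["namespaces"].items()}


counts = namespace_counts(sample_stats)
for name, count in counts.items():
    print(f"{name}: {count} vectors")

# The per-namespace counts should add up to the index-wide total
totals_match = sum(counts.values()) == sample_stats["total_vector_count"]
print(totals_match)
```

Checking that the counts sum to the total is a quick way to confirm that every upsert landed in the namespace we intended.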

6. Querying vectors from namespaces

Now that we've created separate namespaces, it's time to query them! This only requires one additional argument to indicate the name of the namespace that should be queried. Querying isn't the only operation that can be done at the namespace level; in fact, all of the index operations we've looked at can be directed to specific namespaces.
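The effect of the namespace argument on the search space can be sketched with a toy query function over plain dictionaries. This is not the Pinecone client: the query function, the data, and the dot-product scoring are assumptions chosen to keep the example small. With the real client, the equivalent call would pass namespace alongside vector and top_k to the index's query method.

```python
# Toy data: two namespaces that both contain a record with ID "0"
namespaces = {
    "namespace1": {"0": [1.0, 0.0], "1": [0.0, 1.0]},
    "namespace2": {"0": [0.5, 0.5]},
}


def query(store, vector, top_k, namespace):
    """Score only the records in one namespace and return the best top_k."""
    records = store.get(namespace, {})
    scored = [
        {"id": record_id, "score": sum(a * b for a, b in zip(vector, values))}
        for record_id, values in records.items()
    ]
    # Highest score first, mirroring a similarity search
    scored.sort(key=lambda match: match["score"], reverse=True)
    return scored[:top_k]


matches1 = query(namespaces, vector=[1.0, 0.0], top_k=1, namespace="namespace1")
matches2 = query(namespaces, vector=[1.0, 0.0], top_k=1, namespace="namespace2")
print(matches1)
print(matches2)
```

The same query vector returns different records depending on the namespace, because records outside the named namespace are never scored.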

7. Deleting vectors from namespaces

For instance, we can delete vectors by ID from specific namespaces. This is particularly useful, as IDs only need to be unique within a namespace, so there may be other records in different namespaces with the same ID, and we wouldn't want to delete those accidentally!
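Here is a minimal sketch of why scoping a delete matters, again using plain dictionaries rather than the Pinecone client. The delete function and the data are hypothetical; with the real client, the corresponding call would pass ids and namespace to the index's delete method.

```python
# Toy data: the ID "0" exists in both namespaces
namespaces = {
    "namespace1": {"0": [0.1, 0.2, 0.3], "1": [0.4, 0.5, 0.6]},
    "namespace2": {"0": [0.7, 0.8, 0.9]},
}


def delete(store, ids, namespace):
    """Remove the given IDs from one namespace, leaving the others untouched."""
    records = store.get(namespace, {})
    for record_id in ids:
        # pop with a default so a missing ID is ignored rather than raising
        records.pop(record_id, None)


delete(namespaces, ids=["0"], namespace="namespace1")

remaining1 = sorted(namespaces["namespace1"])
remaining2 = sorted(namespaces["namespace2"])
print(remaining1)
print(remaining2)
```

The record with ID "0" in namespace2 survives, because the delete was directed at namespace1 only.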

8. Let's practice!

Now it's your turn!
