Querying vectors

1. Querying vectors

Aside from fetching vectors, we can also query them.

2. The power of querying

Recall that querying involves sending a vector to our index and receiving the most semantically similar vectors in return. Querying is used in all kinds of AI applications. In chatbots, user prompts can be embedded and used to query a vector database, and the results are passed to the model as additional, more relevant information. In effect, this gives the model more context than it had from its training data. Similarly, in semantic search engines, user searches are embedded and used to query the database to return the most relevant results.

3. The .query() method

We can use the .query() method to query the index. We pass it a vector to query with and the top_k argument, which sets the number of results to return: the top_k most similar vectors. As with other index methods, the query vector must have the same dimensionality as the index and the vectors stored in it. The output shows the IDs of the records with the most similar vectors. Each record is also given a score, which measures how similar its vector is to the query vector. For brevity, the vector values aren't shown by default.

4. The .query() method

To return the vector values in the output, set the include_values argument to True. Notice that both of these queries consumed five read units.
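To make the behavior of top_k, the score, and include_values concrete, here is a minimal in-memory sketch. ToyIndex and its upsert helper are hypothetical stand-ins written for this illustration, not the Pinecone client, and the brute-force cosine scoring is an assumption about what a query does conceptually rather than how Pinecone implements it.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyIndex:
    """Hypothetical in-memory stand-in for a vector index (not the Pinecone client)."""

    def __init__(self, dimension):
        self.dimension = dimension
        self.records = {}

    def upsert(self, records):
        # Each record is an (id, values) pair; values must match the index dimension.
        for record_id, values in records:
            if len(values) != self.dimension:
                raise ValueError("vector dimensionality does not match the index")
            self.records[record_id] = list(values)

    def query(self, vector, top_k, include_values=False):
        # The query vector must have the same dimensionality as the index.
        if len(vector) != self.dimension:
            raise ValueError("query dimensionality does not match the index")
        # Score every stored vector against the query, most similar first.
        scored = [
            (cosine_similarity(vector, values), record_id)
            for record_id, values in self.records.items()
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        matches = []
        for score, record_id in scored[:top_k]:
            match = {"id": record_id, "score": score}
            if include_values:
                # Vector values are only returned when explicitly requested.
                match["values"] = self.records[record_id]
            matches.append(match)
        return {"matches": matches}


index = ToyIndex(dimension=3)
index.upsert([
    ("a", [1.0, 0.0, 0.0]),
    ("b", [0.9, 0.1, 0.0]),
    ("c", [0.0, 1.0, 0.0]),
])

result = index.query(vector=[1.0, 0.0, 0.0], top_k=2)
result_with_values = index.query(vector=[1.0, 0.0, 0.0], top_k=2, include_values=True)
```

Here result lists only the IDs and scores of the two closest records, while result_with_values also carries each record's vector values.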

5. Read units (RUs) for querying

Querying, like fetching, also consumes read units. For querying, calculating the number of read units an operation will consume is less straightforward than for fetching records. The number of RUs depends on the number of records in the namespace being read and the size of the records, which itself depends on the dimensionality of the vectors and the amount of metadata stored. Pinecone provides a guide on RUs consumed for different configurations, but if these numbers don't reflect your project, I recommend taking a deeper dive into Pinecone's documentation linked here.

6. Distance metrics

So how does the query method actually determine the most similar vectors? The query vector is compared against the vectors in the index using a distance metric, and the vectors closest to it are returned. There are different metrics for calculating the distance between vectors, with cosine similarity being one of the most common and the default in Pinecone.

7. Distance metrics

There's also the Euclidean distance, which is essentially a straight-line distance,

8. Distance metrics

and the dot product.
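The three metrics can be written out in a few lines of plain Python. This is a sketch for intuition only; the function names and example vectors are chosen for this illustration and are not part of any library.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; grows with both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two points; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product of the normalized vectors; depends only on direction."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # points the same way as query, but is longer
orthogonal = [0.0, 1.0]      # at a right angle to query

cos_same = cosine_similarity(query, same_direction)
cos_orth = cosine_similarity(query, orthogonal)
dist_same = euclidean_distance(query, same_direction)
dist_orth = euclidean_distance(query, orthogonal)
dot_same = dot_product(query, same_direction)
dot_orth = dot_product(query, orthogonal)
```

Note how the metrics can disagree: cosine similarity treats same_direction as a perfect match because it ignores length, whereas by Euclidean distance the orthogonal vector is actually closer to the query. This is one reason choosing a metric can take some experimentation.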

9. Setting the distance metric

Determining the most appropriate distance metric can require some experimentation. The distance metric is set when creating the index using the metric argument, and can't be changed afterwards. The argument can take the strings 'cosine', 'euclidean', or 'dotproduct'.
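Because the metric is fixed at creation time, it can help to validate it before creating the index. The sketch below only builds and checks the settings; index_settings, the index name, and the dimension are hypothetical values for illustration, and the closing comment describes how such settings would typically be handed to the Pinecone client under the assumption that you have already set up a client object and a deployment spec.

```python
# The three metric strings accepted by the metric argument.
VALID_METRICS = {"cosine", "euclidean", "dotproduct"}


def index_settings(name, dimension, metric="cosine"):
    """Build index settings, rejecting metric strings outside the allowed set.

    Hypothetical helper for illustration; cosine is used as the default
    to mirror the default mentioned above.
    """
    if metric not in VALID_METRICS:
        raise ValueError(
            f"metric must be one of {sorted(VALID_METRICS)}, got {metric!r}"
        )
    return {"name": name, "dimension": dimension, "metric": metric}


# Example values only: the name and dimension are assumptions.
settings = index_settings("demo-index", dimension=8, metric="dotproduct")

# With a configured Pinecone client (assumed here to be called pc) and a
# deployment spec of your choosing, these settings would typically be passed as:
#     pc.create_index(**settings, spec=your_spec)
# To use a different metric later, a new index has to be created.
```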

10. Your turn to query!

Time to give querying a go!