Vector ingestion
1. Vector ingestion
Great work so far!
2. Creating and connecting to an index
Now that we've created an index and connected to it, we're ready to start ingesting some vectors!
3. Ingesting vectors
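As a rough sketch of the record layout used in this step: each record is a dictionary with a unique string under "id" and a list of floats under "values". The helper `make_placeholder_vectors` and the random numbers are hypothetical stand-ins for real embeddings, and the dimension of 1536 is assumed to match the index.

```python
import random

DIMENSION = 1536  # assumed to match the dimensionality of the index


def make_placeholder_vectors(count, dimension):
    """Build `count` records shaped like {'id': ..., 'values': [...]}.

    Hypothetical helper: the values are random placeholders, not embeddings.
    """
    rng = random.Random(0)  # fixed seed so repeated runs give the same data
    return [
        {"id": str(i), "values": [rng.random() for _ in range(dimension)]}
        for i in range(count)
    ]


vectors = make_placeholder_vectors(10, DIMENSION)
print(vectors[0]["id"], len(vectors[0]["values"]))
```

In practice the values would come from an embedding model, and the IDs from whatever identifies each source document.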
Here's a series of ten vectors that we'll ingest into our index. It's a list of dictionaries containing unique IDs and vector values. Pinecone requires vectors to be structured in this particular way, so we may need to perform a bit of list and dictionary manipulation to get to this point. Before we start ingesting these vectors, it's also important to check that they have the same dimensionality as the index.
4. Checking dimensionality
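A minimal sketch of the dimensionality check for this step, assuming a 1536-dimensional index. The `vectors` list here is placeholder data, and the `bad_vectors` example is added only to show what a failing check looks like.

```python
# Placeholder data: ten records with 1536 values each
vectors = [{"id": str(i), "values": [0.1] * 1536} for i in range(10)]

# One boolean per vector: does its length match the index dimension?
vector_dims = [len(vector["values"]) == 1536 for vector in vectors]

# all() is True only if every entry in the list is True
print(all(vector_dims))

# A single vector of the wrong length makes the check fail
bad_vectors = vectors + [{"id": "10", "values": [0.1] * 768}]
bad_dims = [len(vector["values"]) == 1536 for vector in bad_vectors]
print(all(bad_dims))
```

The first call prints True and the second prints False, since one record in `bad_vectors` has only 768 values.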
To check dimensionality, we create a list comprehension that tests whether the length of the list stored under the 'values' key is 1536 for each vector in vectors. To confirm that the condition holds for every vector, we call the all() function on the resulting list. Since it returns True, we can be sure that these vectors are safe to ingest. If the dimensionality didn't match, we'd get an exception from the API when we tried to ingest them.
5. Upserting vectors
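To make the update-or-insert behavior of this step concrete, here is a small sketch that runs without a Pinecone account. `InMemoryIndex` is a hypothetical stand-in written for illustration: it mimics the `.upsert()` and `.describe_index_stats()` method names but is not the Pinecone client, and its return values are simplified.

```python
class InMemoryIndex:
    """Hypothetical stand-in that mimics upsert semantics; not the Pinecone client."""

    def __init__(self, dimension):
        self.dimension = dimension
        self._records = {}  # maps vector ID -> record

    def upsert(self, vectors):
        for vector in vectors:
            if len(vector["values"]) != self.dimension:
                raise ValueError(
                    f"vector {vector['id']!r} has {len(vector['values'])} values, "
                    f"expected {self.dimension}"
                )
            # Existing ID: the record is overwritten (update).
            # New ID: the record is added (insert).
            self._records[vector["id"]] = vector
        return {"upserted_count": len(vectors)}

    def describe_index_stats(self):
        return {
            "dimension": self.dimension,
            "total_vector_count": len(self._records),
        }


index = InMemoryIndex(dimension=1536)
vectors = [{"id": str(i), "values": [0.1] * 1536} for i in range(10)]
index.upsert(vectors=vectors)

# Re-sending an existing ID updates it rather than adding an eleventh record
index.upsert(vectors=[{"id": "0", "values": [0.9] * 1536}])

stats = index.describe_index_stats()
print(stats)

# With the real client, the same two calls would be made on a connected index,
# for example (assumed API key and index name, not run here):
#   from pinecone import Pinecone
#   pc = Pinecone(api_key="YOUR_API_KEY")
#   index = pc.Index("my-index")
#   index.upsert(vectors=vectors)
#   print(index.describe_index_stats())
```

The count stays at 10 after the second call because the ID "0" was already present, so its values were replaced.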
The .upsert() method is commonly used for ingesting vectors. Upsert is a combination of update and insert: if we try to ingest a vector ID that is already present in the index, it will get updated with the new data. If the vector isn't in the index, it will be inserted. We call the .upsert() method on our index, and pass it the vectors. To check that the vectors were successfully upserted, we can call the .describe_index_stats() method. We can see all 10 vectors were inserted. The index_fullness is still zero, as 10 is a tiny fraction of the index's maximum capacity. Note that it can take a few moments for this data to refresh, so .describe_index_stats() may not immediately show the updated statistics.
6. Ingesting vectors with metadata
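A short sketch of how metadata might be attached to existing records for this step. The helper `attach_metadata` and the metadata labels ("genre", "year") are hypothetical examples; the only structure taken from the course is a dictionary stored under the "metadata" key.

```python
def attach_metadata(vectors, metadata_by_id):
    """Return new records with a 'metadata' dict added where one is available."""
    enriched = []
    for vector in vectors:
        record = dict(vector)  # shallow copy so the input list is left untouched
        if vector["id"] in metadata_by_id:
            record["metadata"] = dict(metadata_by_id[vector["id"]])
        enriched.append(record)
    return enriched


vectors = [
    {"id": "0", "values": [0.1] * 1536},
    {"id": "1", "values": [0.2] * 1536},
]

# Hypothetical labels and values, stored as key:value pairs
metadata_by_id = {
    "0": {"genre": "productivity", "year": 2020},
    "1": {"genre": "recreation", "year": 2021},
}

vectors_with_metadata = attach_metadata(vectors, metadata_by_id)
print(vectors_with_metadata[0]["metadata"])
```

Each record now has three keys: "id", "values", and "metadata".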
Many vectors will have some form of metadata associated with them. Metadata is data about our data: this could be a date or time, a category designation, or user attribution. In our vectors list of dictionaries, it should be stored as a dictionary under the "metadata" key, where metadata labels and values are key:value pairs. Ingesting this data into the database can be very useful. When we come to query the data, rather than having to search through every vector in the index, we can filter by metadata to only search over the most relevant records, which is much faster. We'll cover metadata filtering in the next chapter; for now, we'll ingest it.
7. Upserting vectors with metadata
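For this step, a hedged sketch of upserting records that carry metadata. The wrapper `upsert_with_metadata` is hypothetical and adds a strictness check that Pinecone itself does not require, since metadata is optional. `RecordingIndex` is a stub used so the example runs offline; it is not the Pinecone client.

```python
def upsert_with_metadata(index, vectors):
    """Check that every record has a dict under 'metadata', then upsert as usual."""
    for vector in vectors:
        if not isinstance(vector.get("metadata"), dict):
            raise TypeError(
                f"vector {vector.get('id')!r} needs a dict under the 'metadata' key"
            )
    # Same call as for records without metadata
    return index.upsert(vectors=vectors)


class RecordingIndex:
    """Hypothetical stub that records what it receives; not the Pinecone client."""

    def __init__(self):
        self.received = []

    def upsert(self, vectors):
        self.received.extend(vectors)
        return {"upserted_count": len(vectors)}


index = RecordingIndex()
vectors = [
    {"id": "0", "values": [0.1] * 1536, "metadata": {"genre": "productivity"}},
    {"id": "1", "values": [0.2] * 1536, "metadata": {"genre": "recreation"}},
]
response = upsert_with_metadata(index, vectors)
print(response)
```

With a real connected index, `index.upsert(vectors=vectors)` would be called in the same way, with the metadata travelling alongside each record's ID and values.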
Fortunately, the syntax is the same as before. Provided that the metadata is structured as a dictionary under a "metadata" key, Pinecone knows exactly what to do!
8. Let's practice!
Let's ingest some vectors!