Getting started with ChromaDB
In the following exercises, you'll use a vector database to embed and query 1000 films and TV shows from the Netflix dataset introduced in the video. The goal will be to use this data to generate recommendations based on a search query. To get started, you'll create the database and collection to store the data.
chromadb is available for you to use, and OpenAIEmbeddingFunction() has been imported from chromadb.utils.embedding_functions. As with the first two chapters, you don't need to provide an OpenAI API key in this chapter.
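To make the idea of embedding and querying concrete, here is a minimal sketch of similarity search over a few toy vectors. The titles and three-dimensional vectors are invented for illustration and are not taken from the Netflix dataset. Cosine similarity is assumed as the ranking measure, and the function names are hypothetical. In the exercises, the vectors come from an OpenAI embedding model and the vector database performs the search.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(query_vector, catalog, n_results=2):
    """Return the n_results titles whose vectors are most similar to the query."""
    ranked = sorted(
        catalog,
        key=lambda title: cosine_similarity(query_vector, catalog[title]),
        reverse=True,
    )
    return ranked[:n_results]


# Invented toy "embeddings"; real embeddings have hundreds of dimensions or more.
toy_catalog = {
    "Space Documentary": [0.9, 0.1, 0.0],
    "Romantic Comedy": [0.0, 0.2, 0.9],
    "Sci-Fi Thriller": [0.8, 0.5, 0.1],
}

# Stand-in for an embedded search query.
query = [1.0, 0.2, 0.0]
print(recommend(query, toy_catalog))
```

The two titles whose vectors point in roughly the same direction as the query are ranked first, which is the core of a recommendation based on a search query.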
This exercise is part of the course Introduction to Embeddings with the OpenAI API.
Exercise instructions
- Create a persistent client to save the database files to disk; you can leave out the file path for these exercises.
- Create a database collection called netflix_titles that uses the OpenAI embedding function.
- List all of the collections in the database.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a persistent client
client = chromadb.____()
# Create a netflix_titles collection using the OpenAI embedding function
collection = client.create_collection(
    name="____",
    ____=____(model_name="text-embedding-3-small", api_key="")
)
# List the collections
print(client.____())
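
When working with a persistent client outside the course environment, re-running a script that calls create_collection() with a name that already exists on disk raises an error. The following is a minimal sketch of one way around that, not the course solution. The helper name open_netflix_collection is hypothetical, and it uses the client's get_or_create_collection() method instead. The commented usage assumes that chromadb and openai are installed and that an API key is stored in an OPENAI_API_KEY environment variable.

```python
def open_netflix_collection(client, embedding_function, name="netflix_titles"):
    """Return the named collection, creating it on first use.

    `client` is expected to be a chromadb client, for example the object
    returned by chromadb.PersistentClient(). The helper name is hypothetical.
    """
    # get_or_create_collection avoids an error when the collection already
    # exists on disk from an earlier run with a persistent client.
    return client.get_or_create_collection(
        name=name,
        embedding_function=embedding_function,
    )


# Example usage (assumes chromadb and openai are installed and that the
# OPENAI_API_KEY environment variable holds a valid key):
#
#   import os
#   import chromadb
#   from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
#
#   client = chromadb.PersistentClient()
#   ef = OpenAIEmbeddingFunction(
#       model_name="text-embedding-3-small",
#       api_key=os.environ["OPENAI_API_KEY"],
#   )
#   collection = open_netflix_collection(client, ef)
#   print(client.list_collections())
```

Passing the client and embedding function in as arguments keeps the helper independent of how they were constructed, so the same function works with the course's preconfigured environment or a local setup.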