Filtering using metadata

Having metadata available to use in the database can unlock the ability to more easily filter results based on additional conditions. Imagine that the film recommendations you've be creating could access the user's set preferences and use those to further filter the results.

In this exercise, you'll be using additional metadata to filter your Netflix film recommendations. The netflix_titles collection has been updated to add metadatas to each title, including the 'rating', the age rating given to the title, and 'release_year', the year the title was initially released.

Here's a preview of an updated item:

{'ids': ['s999'],
 'embeddings': None,
 'metadatas': [{'rating': 'TV-14', 'release_year': 2021}],
 'documents': ['Title: Searching For Sheela (Movie)\nDescription: Journalists and fans await Ma Anand Sheela as the infamous former Rajneesh commune’s spokesperson returns to India after decades for an interview tour.\nCategories: Documentaries, International Movies'],
 'uris': None,
 'data': None}

Deze oefening maakt deel uit van de cursus

Introduction to Embeddings with the OpenAI API

Bekijk cursus

Oefeninstructies

Query two results from the collection using reference_texts.
Filter the results for titles with a 'G' rating that were also released before 2019.

Interactieve oefening met praktijkervaring

Probeer deze oefening door deze voorbeeldcode aan te vullen.

collection = client.get_collection(
  name="netflix_titles",
  embedding_function=OpenAIEmbeddingFunction(model_name="text-embedding-3-small", api_key="")
)

reference_texts = ["children's story about a car", "lions"]

# Query two results using reference_texts
result = collection.query(
  ____,
  ____,
  # Filter for titles with a G rating released before 2019
  ____={
    ____: [
        {"____": 
        	{____}
        },
        {"____": 
         	{____}
        }
    ]
  }
)

print(result['documents'])

Code bewerken en uitvoeren