
Sorting by similarity

Now that you've embedded all of your features, the next step is to compute similarities. In this exercise, you'll define a function called find_n_closest() that computes the cosine distance between a query vector and each embedding in a list, then returns the n smallest distances along with their indexes.

In the next exercise, you'll use this function to enable your semantic product search application.

distance has been imported from scipy.spatial.
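
As a reminder of what is being measured, cosine distance is one minus the cosine of the angle between two vectors, so vectors pointing the same way score 0 and the score grows as they diverge. The sketch below spells the formula out in plain Python for illustration only. The helper name cosine_distance and the toy vectors are assumptions made here, and in the exercise itself you would call distance.cosine from scipy.spatial instead.

```python
import math

def cosine_distance(u, v):
    """Hypothetical helper: 1 - (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Toy 2-D vectors (made up for illustration)
same_direction = cosine_distance([1.0, 0.0], [2.0, 0.0])   # 0.0: length is ignored
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])       # 1.0: unrelated directions
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])        # 2.0: opposite directions
```

Because a smaller distance means a closer match, sorting distances in ascending order puts the most similar items first.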

This exercise is part of the course

Introduction to Embeddings with the OpenAI API


Exercise instructions

  • Calculate the cosine distance between the query_vector and embedding.
  • Append a dictionary containing dist and its index to the distances list.
  • Sort the distances list by the 'distance' key of each dictionary.
  • Return the first n elements in distances_sorted.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def find_n_closest(query_vector, embeddings, n=3):
  distances = []
  for index, embedding in enumerate(embeddings):
    # Calculate the cosine distance between the query vector and embedding
    dist = ____
    # Append the distance and index to distances
    distances.append({"distance": ____, "index": ____})
  # Sort distances by the distance key
  distances_sorted = ____
  # Return the first n elements in distances_sorted
  return ____
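
For reference, here is one way the blanks might be filled in, following the four instructions in order. Treat it as a sketch rather than the official solution: the toy_embeddings list and the query vector are made up for illustration, and the import is repeated so the snippet runs on its own.

```python
from scipy.spatial import distance

def find_n_closest(query_vector, embeddings, n=3):
    distances = []
    for index, embedding in enumerate(embeddings):
        # Cosine distance between the query vector and this embedding
        dist = distance.cosine(query_vector, embedding)
        # Keep the distance together with the embedding's position
        distances.append({"distance": dist, "index": index})
    # Ascending sort: smallest distance (most similar) first
    distances_sorted = sorted(distances, key=lambda x: x["distance"])
    # Slice off the n closest entries
    return distances_sorted[0:n]

# Toy 2-D "embeddings" (assumed values, not real model output)
toy_embeddings = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
hits = find_n_closest([1.0, 0.1], toy_embeddings, n=2)
closest_indexes = [hit["index"] for hit in hits]
```

With these toy vectors, the query points almost exactly along the second embedding, so index 1 comes first, followed by index 2. The returned indexes can then be used to look up the matching items in whatever list the embeddings were built from.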