Query a Chat Completion Model
1. Query a Chat Completion Model
Now, let's learn about some of the AI functionality that Databricks provides. Specifically, we will dive into how to use Databricks to query Chat Completion Models.
2. Query serving endpoint
The Serving Endpoint API exposes a `query` function that can be used to query the AI model served by a specified serving endpoint. Some of the parameters it accepts are `name`, `messages`, and `max_tokens`. `name` is the name of the serving endpoint that serves the model. `messages` is a list of `ChatMessage` objects, and `max_tokens` sets the maximum number of "tokens" (roughly, word pieces) the model can include in its response, allowing us to limit the length of responses from the model.
3. ChatMessage
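The shape of the call can be sketched as follows. This is a minimal sketch, not a live query: the real classes come from the `databricks-sdk` package (shown in the comments), the `ChatMessage` here is a stdlib stand-in, and the endpoint name is a placeholder.

```python
# Sketch of the serving-endpoint query call. With the real SDK you would write:
#   from databricks.sdk import WorkspaceClient
#   from databricks.sdk.service.serving import ChatMessage, ChatMessageRole
#   w = WorkspaceClient()
#   response = w.serving_endpoints.query(name=..., messages=[...], max_tokens=...)
# Here we only assemble the parameters, so the example runs without credentials.
from dataclasses import dataclass

@dataclass
class ChatMessage:  # stand-in mirroring databricks.sdk.service.serving.ChatMessage
    role: str
    content: str

query_kwargs = {
    "name": "databricks-meta-llama-3-70b-instruct",  # placeholder endpoint name
    "messages": [ChatMessage(role="user", content="Hello!")],
    "max_tokens": 100,  # cap the length of the model's reply
}

print(sorted(query_kwargs))  # → ['max_tokens', 'messages', 'name']
```

Passing these keyword arguments to `serving_endpoints.query()` on a `WorkspaceClient` is what actually sends the query to the served model.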
We mentioned that we pass a list of `ChatMessage` objects to the `messages` parameter when querying a serving endpoint. A `ChatMessage` requires `content`, which is the content of the query or directive we want to send to the LLM, and a `role`, which is a `ChatMessageRole`.
4. ChatMessageRole
If we want to provide instructions to the LLM about how to respond to future queries, we send a `ChatMessage` with the `SYSTEM` `ChatMessageRole`. If we want to ask the LLM a question, we specify the `USER` `ChatMessageRole`.
5. Query chat completion Large Language Model (LLM)
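The pairing of roles and messages can be sketched like this. The `ChatMessageRole` enum and `ChatMessage` dataclass below are stdlib stand-ins mirroring the classes in `databricks.sdk.service.serving`, so the example runs without the SDK installed.

```python
from dataclasses import dataclass
from enum import Enum

class ChatMessageRole(Enum):  # stand-in for databricks.sdk.service.serving.ChatMessageRole
    SYSTEM = "system"
    USER = "user"

@dataclass
class ChatMessage:  # stand-in for databricks.sdk.service.serving.ChatMessage
    role: ChatMessageRole
    content: str

messages = [
    # SYSTEM: instructions that shape how the model answers later queries
    ChatMessage(ChatMessageRole.SYSTEM, "You are a helpful assistant."),
    # USER: the actual question we want answered
    ChatMessage(ChatMessageRole.USER, "What is a serving endpoint?"),
]

print([m.role.value for m in messages])  # → ['system', 'user']
```

A typical query sends one `SYSTEM` message followed by one or more `USER` messages, exactly as in this list.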
Let's put this all together with an example. First, we use the `SYSTEM` role to tell the LLM to act like a helpful assistant when responding to future queries. Next, we ask it to explain what the Fibonacci sequence is, using the `USER` role. The response from the query contains a list of `choices` objects. We can use `response.choices[0].message.content` to retrieve the content of the first chat message response from the LLM.
6. Query response structure
If we ask a single question, we can parse the content of the AI model response as shown in the code snippet.
7. Example: summarize text
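The parsing step can be exercised with a mocked response object. The `choices[0].message.content` access path mirrors the response structure described above; the mock itself is just a stdlib `SimpleNamespace`, and the answer text is invented for illustration.

```python
from types import SimpleNamespace

# Mocked response shaped like a chat-completion result:
# response.choices is a list; each choice holds a .message with .role and .content.
response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(
                role="assistant",
                content="The Fibonacci sequence starts 0, 1, and each later term is the sum of the previous two.",
            )
        )
    ]
)

# Retrieve the content of the first chat message in the response.
answer = response.choices[0].message.content
print(answer)
```

The same one-liner works on the real response object returned by `serving_endpoints.query()`.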
LLMs in Databricks can do more than just provide responses to queries; they can also summarize large amounts of content succinctly and accurately. To demonstrate the Chat model's ability to summarize text, we query the Databricks Meta Llama model to summarize a long string of text stored in a variable `some_long_text`.
8. Example: generate content
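The summarization prompt can be sketched as below. The text assigned to `some_long_text` is filler invented for the sketch, and the commented-out call shows how it would be sent through the SDK; the endpoint name is a placeholder.

```python
# Compose a summarization query. some_long_text stands in for the long string
# from the example; the real request would go through serving_endpoints.query().
some_long_text = (
    "Databricks serving endpoints expose hosted models behind a simple API. "
    "Clients send chat messages and receive model-generated responses. "
) * 3  # repeat to simulate a long document

user_content = f"Summarize the following text in one sentence:\n{some_long_text}"

# With the real SDK (placeholder endpoint name):
#   response = w.serving_endpoints.query(
#       name="databricks-meta-llama-3-70b-instruct",
#       messages=[ChatMessage(role=ChatMessageRole.USER, content=user_content)],
#       max_tokens=150,
#   )
print(user_content.splitlines()[0])  # → Summarize the following text in one sentence:
```

Keeping the instruction and the text in a single `USER` message is the simplest pattern; `max_tokens` then bounds the length of the summary itself.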
Here is an example of using the `serving_endpoints.query()` method on the `WorkspaceClient` to query the Meta Llama 3 Instruct model. To demonstrate the Chat model's ability to generate content, we first tell the AI agent to act like a ghostwriter for a famous country singer, using the `SYSTEM` `ChatMessageRole`. We also send a message with the `USER` `ChatMessageRole`, asking it to write the lyrics to a country song that takes place in Mississippi.
9. Let's practice!
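The content-generation setup can be sketched like this. The `ChatMessage` below is a stdlib stand-in for the SDK class, the persona and request strings follow the example above, and the commented-out call uses a placeholder endpoint name.

```python
from dataclasses import dataclass

@dataclass
class ChatMessage:  # stand-in for databricks.sdk.service.serving.ChatMessage
    role: str
    content: str

# SYSTEM message sets the persona; USER message carries the actual request.
messages = [
    ChatMessage("system", "You are a ghostwriter for a famous country singer."),
    ChatMessage("user", "Write the lyrics to a country song that takes place in Mississippi."),
]

# Real call through the SDK (placeholder endpoint name):
#   response = w.serving_endpoints.query(
#       name="databricks-meta-llama-3-70b-instruct",
#       messages=messages,  # built with the real ChatMessage/ChatMessageRole classes
#       max_tokens=400,
#   )

roles = [m.role for m in messages]
print(roles)  # → ['system', 'user']
```

A larger `max_tokens` value makes sense here than for short Q&A, since generated lyrics run much longer than a one-line answer.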
Let's practice using the Databricks SDK to interact with AI models and ask them to do useful tasks for us.