Designing multi-step AI workflows
1. Designing multi-step AI workflows
We've worked on individual Snowflake Cortex functions, but the magic happens when we combine them into multi-step AI workflows.

2. Cortex review workflow
In this video, we'll learn how to chain together multiple Snowflake Cortex functions to create end-to-end solutions for real business needs. Specifically, we'll translate reviews into English, summarize them, assign a category, and generate a response. Lastly, we'll translate the response back into the customer's language.

3. Extracting reviews
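As a runnable sketch of this extraction step, here is the same idea with a hard-coded pandas DataFrame standing in for the Snowflake query result (the review text and table contents are invented for illustration; in Snowflake the DataFrame would come from a query result converted with `to_pandas()`):

```python
import pandas as pd

# In Snowflake, this DataFrame would come from a query such as
# session.sql("SELECT description FROM reviews").to_pandas().
# Here we hard-code one Spanish review so the sketch runs anywhere.
reviews_df = pd.DataFrame({
    "DESCRIPTION": [
        "El personal fue poco amable y la habitación no estaba limpia."
    ]
})

# Extract the first review from the "DESCRIPTION" column.
review = reviews_df["DESCRIPTION"].iloc[0]
print(review)
```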
Let's start by gathering a Spanish review from our database. We convert the output to a pandas DataFrame, as usual. To build our workflow, we'll extract the first review from the `"DESCRIPTION"` column.

4. Spanish review
Looking at the review, we can see it is quite long and, as expected, is written in Spanish. Let's build our pipeline!

5. Translation
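The translation step can be sketched as follows. The real work happens inside Snowflake's Cortex translate function; here a canned stub stands in for it so the chaining shape is runnable anywhere, and the argument order (`text, from_language, to_language`) is an assumption to check against the Snowflake documentation:

```python
# Stand-in for the Cortex translate function. Inside Snowflake you would
# call the real service; this stub returns a canned English translation
# so the example is self-contained.
def translate(text: str, from_language: str, to_language: str) -> str:
    canned = {
        ("es", "en"): "The staff were unfriendly and the room was not clean.",
    }
    return canned[(from_language, to_language)]

spanish_review = "El personal fue poco amable y la habitación no estaba limpia."
english_review = translate(spanish_review, "es", "en")
print(english_review)
```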
First, we generate translated text using the `translate()` function to convert the review from Spanish to English. Check out the results - `translate()` even works well with this large body of text!

6. Summarization
Next, we summarize the translated text to stay focused on the key points.

7. Classification
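The classification step picks one label from a fixed candidate set. Below is a toy keyword-scoring stand-in for the Cortex classification call, runnable anywhere; in Snowflake you would instead pass the summary and the label list to the Cortex classification function:

```python
# Toy stand-in for Cortex text classification: score each candidate label
# by counting keyword hits in the text, then return the best-scoring label.
LABEL_KEYWORDS = {
    "staff": ["staff", "service", "unfriendly", "rude"],
    "cleanliness": ["clean", "dirty", "dust"],
    "pricing": ["price", "expensive", "cheap"],
}

def classify_stub(text: str) -> str:
    text = text.lower()
    scores = {
        label: sum(word in text for word in words)
        for label, words in LABEL_KEYWORDS.items()
    }
    return max(scores, key=scores.get)

summary = "The staff were unfriendly and the room was not clean."
print(classify_stub(summary))  # "staff" scores highest (two keyword hits)
```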
Now, we classify the text based on labels including staff, cleanliness, and pricing. This summary is classified by the model as primarily being about the staff at the hotel, so we can run our response by the support team for review.

8. Text generation
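The generation prompt can be assembled with an f-string along these lines (the prompt wording, model name, and `complete()` call shape are illustrative assumptions, not the course's exact text):

```python
summary = "The staff were unfriendly and the room was not clean."

# Build the generation prompt with an f-string, embedding the summarized review.
prompt = (
    f"A customer left the following hotel review: {summary}\n"
    "Write a short, empathetic response that acknowledges the issues "
    "and explains the measures we will take to prevent them."
)

# Inside Snowflake, the prompt would then be sent to a model, e.g.:
# response = complete("mistral-large", prompt)
print(prompt)
```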
The penultimate step is to generate a response, feeding the summarized review into the prompt using an f-string. Displaying the output, we see that the AI has acknowledged the customer's frustrations and highlighted that we will take measures to ensure these issues do not happen again.

9. Response translation
Finally, we translate our AI-generated response from English back to Spanish. Now we can respond directly to this customer in their native language!

10. Cortex cost model
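To build intuition for token-based billing, here is a back-of-the-envelope estimate using the common rough heuristic of about four characters per English token. Both the heuristic and the per-million-token price below are illustrative assumptions, not Snowflake's actual rates:

```python
def rough_token_count(text: str) -> int:
    # Very rough heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

prompt = "Summarize the following hotel review: " + "x" * 400
response = "We are sorry to hear about your experience at our hotel."

# Both input and output tokens are billed.
total_tokens = rough_token_count(prompt) + rough_token_count(response)

# Hypothetical price purely for illustration: $2 per million tokens.
cost_usd = total_tokens * 2 / 1_000_000
print(total_tokens, cost_usd)
```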
We've seen the power of Snowflake Cortex for building automated AI workflows, but using these tools effectively requires strategy. First, consider cost. Each Cortex function bills based on compute usage and token volume - both input and output. Since tokens represent small text chunks, lengthy prompts or verbose responses quickly increase expenses.

11. Limiting cost
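A sketch of the two cost-limiting knobs: trimming input before sending it, and passing `max_tokens` and `temperature` options when calling `complete()`. Treat the exact option keys and call shape as assumptions to verify against the Snowflake documentation:

```python
def trim_input(text: str, max_chars: int = 1000) -> str:
    # Send only the relevant head of the text to the model.
    return text[:max_chars]

long_review = "El personal fue poco amable. " * 100  # a long review
trimmed = trim_input(long_review)

# Options limiting output length and variance; key names assumed.
options = {"max_tokens": 150, "temperature": 0.1}

# Inside Snowflake, something like:
# response = complete("mistral-large", trimmed, options=options)
print(len(trimmed), options)
```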
To optimize usage, start by trimming inputs. Process only relevant text before passing it to models. When using `complete()`, intentionally set the `max_tokens` parameter to limit output length. For routine tasks, keep `temperature` low to reduce output variance and avoid repeated calls.

12. Cortex best practices
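Caching results for recurring queries can be as simple as memoizing a call wrapper. The sketch below uses `functools.lru_cache` around a stand-in for a billable Cortex call; the stub counts invocations to show that repeated identical prompts hit the cache instead of triggering a new call:

```python
from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=1024)
def cached_complete(prompt: str) -> str:
    # Stand-in for a billable Cortex complete() call.
    calls["count"] += 1
    return f"response to: {prompt}"

cached_complete("Summarize review 1")
cached_complete("Summarize review 1")  # identical prompt: served from cache
cached_complete("Summarize review 2")
print(calls["count"])  # 2 billable calls for 3 requests
```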
Smart model chaining can also save money. In our pipeline, summarizing first makes sense because multiple downstream functions - `text_classify()`, `complete()`, and `translate()` - all use that summary. However, if you're only summarizing and generating text, providing full text directly to `complete()` might cost less. For production systems, implement logging, cache results for recurring queries, and run pipelines in batch mode rather than one-off executions. While these techniques fall outside our course scope, they're essential for scale. In short, Snowflake Cortex is powerful, but to keep usage cost-effective, we need to be deliberate with what we send, how often we call Cortex functions, and how we structure our workflows.

13. Let's practice!
Time to build your own multi-step AI workflow!