Task-specific LLM functions & Helper functions

1. Task-specific LLM functions & Helper functions

Welcome back. In the last video, we covered the Complete function in detail. Now, we will look at the task-specific functions available to us. These are purpose-built and handle common tasks such as translation, sentiment analysis, summarization, and text classification without the need to write a prompt or choose an LLM. It's a good time to pause the video and log into your Snowflake account if you're not already logged in. Navigate to Projects on the left panel, select Notebooks, and click on the Intro to LLM Functions notebook. We have already run the cells up to Intro to Complete. Look at the cells below Intro to Task-Specific Functions.

First, let's look at the Translate function. Translate is a helpful function that, as the name implies, allows us to easily translate strings, or whole columns in batch, from one language to another. Let's take a look at how we do this in Python. When we use Cortex Translate, we pass the text or table column, the source language, and the target language to the function. If we don't know the source language ahead of time, it can be auto-detected by leaving the source language as an empty string. Cortex Translate currently supports 11 languages, but it is always good to check the documentation to see which languages are supported and which new ones have been added over time.

Another task-specific function that we can use in Cortex is Sentiment. Sentiment allows us to do aspect-based sentiment classification, which analyzes a text string or column and returns a value between negative one and one, depending on how negative or positive the sentiment of the text is. As with all the other LLM functions, we can run this on a single string or on an entire column in a table. We'll show that here.

Next is the Summarize function. Summarize automatically generates a concise, high-level summary of a large body of text. It relies on a pre-trained LLM that understands general language patterns and produces a neutral summary.
The last task-specific function we will look at is ClassifyText. ClassifyText classifies free-form text into categories that you provide. It allows us to do zero-shot classification into an arbitrary set of labels. All we need to do is pass the text, or a column of text, that we want classified and the list of categories that we want ClassifyText to use. There is no need to train a predictor, as in classical NLP, or even to craft a custom prompt.

Here is what the basic structure looks like in both Python and SQL. In the SQL example here, we see how this is used to classify an entire table of support call transcripts. In this example, we are calling the function and specifying the input column, transcript, from the table call transcripts. We are also specifying the two categories that we want the function to use. To improve text classification performance even further, we can include descriptions and examples for each label, along with a description of the overall task, if we like.

While text classification is a useful task on its own, it can also serve as a powerful starting point for LLM applications. One common way to do this is routing. ClassifyText can categorize the user's intent based on the query, and that category can decide whether to send the query to a search system, to a function call, or to one of several language models of varying sizes and capabilities. Now in Python, we can see a simple illustration of how ClassifyText is used to classify the user's intent for downstream routing.
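The routing idea can be sketched in plain Python. This is a minimal illustration, not the notebook's code: `keyword_classifier` is a hypothetical stand-in for a zero-shot classifier such as ClassifyText, and the two handler functions and the `ROUTES` table are made-up names. Only the two labels, how-to and recommendations, come from the lesson.

```python
from typing import Callable, Dict, List

# The two intent labels used in the lesson.
INTENT_LABELS: List[str] = ["how-to", "recommendations"]


def keyword_classifier(query: str, labels: List[str]) -> str:
    """Hypothetical stand-in for a zero-shot classifier such as ClassifyText.

    A real application would send `query` and `labels` to the classifier;
    here a simple keyword rule keeps the sketch runnable offline.
    """
    text = query.lower()
    if "how do i" in text or "how to" in text:
        return "how-to"
    return "recommendations"


def answer_how_to(query: str) -> str:
    # Placeholder for the next step in the chain, e.g. a search system.
    return f"[search system] {query}"


def answer_recommendation(query: str) -> str:
    # Placeholder for the next step in the chain, e.g. a larger model.
    return f"[recommendation model] {query}"


# Map each label to the handler that should process the query.
ROUTES: Dict[str, Callable[[str], str]] = {
    "how-to": answer_how_to,
    "recommendations": answer_recommendation,
}


def route(
    query: str,
    classify: Callable[[str, List[str]], str] = keyword_classifier,
) -> str:
    """Classify the query's intent, then dispatch to the matching handler."""
    label = classify(query, INTENT_LABELS)
    handler = ROUTES.get(label)
    if handler is None:
        raise ValueError(f"No route defined for label: {label!r}")
    return handler(query)


if __name__ == "__main__":
    print(route("How do I reset my password?"))
    print(route("Which skis would suit a beginner?"))
```

Because the classifier is passed in as an argument, the keyword rule could be swapped for a real classification call without changing the routing logic.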
Here, we want to classify each query as either looking for how-to or recommendations. The classified intent would then determine the next step in the LLM chain. We will see these functions in more detail in a later video. As we've seen, Complete and the task-specific LLM functions are quite powerful. It is worth noting that, as we saw, they can be run either on a single text string or on an entire column of a table in batch.

Next, we will move on to the helper functions, which support us as we go from prototyping an LLM application to promoting it to production. Cortex offers two additional functions that can help as we build applications for production use: TryComplete and CountTokens.

The TryComplete function helps with handling failures gracefully. Conceptually, it is similar to using a try-except block in Python. It behaves the same as Complete if the operation is successful, but returns null rather than an error if the operation fails. This is particularly useful when the input text we pass to an LLM is longer than the context window it supports. These kinds of failures while building AI applications can be dealt with gracefully by using TryComplete. Very helpful.

Now, let's talk about the helper function CountTokens. The CountTokens function calculates the total number of tokens in a prompt that is going to be used by Complete, or in the input text of a task-specific function. This is important because each model uses a slightly different tokenizer under the hood, resulting in different token counts. A token is the smallest unit of text processed by the Snowflake Cortex LLM functions, and it is the basic unit of data when working with language models. Each token is approximately equal to four characters and can be a word, a single character, or part of a word. To better understand the Snowflake credits used based on the token count, refer to the reading prior to this video.
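To make the two helper ideas concrete, here is a rough, offline Python sketch. It does not call Cortex: `estimate_tokens` applies the four-characters-per-token rule of thumb only (the exact count depends on each model's tokenizer, which is what CountTokens is for), `try_call` mimics the return-null-on-failure behavior described for TryComplete, and `fake_llm`, its `context_window` size, and the credit rate in the usage lines are hypothetical values chosen for illustration.

```python
import math
from typing import Callable, Optional

CHARS_PER_TOKEN = 4  # rule of thumb from the lesson, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count as characters / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_credits(token_count: int, credits_per_million_tokens: float) -> float:
    """Convert a token count into credits for a given per-million-token rate."""
    return token_count / 1_000_000 * credits_per_million_tokens


def try_call(llm_call: Callable[[str], str], prompt: str) -> Optional[str]:
    """Return the call's result, or None if it raises.

    This mirrors the idea behind TryComplete: a failure becomes a null
    value instead of an error that stops the whole batch.
    """
    try:
        return llm_call(prompt)
    except Exception:
        return None


def fake_llm(prompt: str, context_window: int = 8) -> str:
    """Hypothetical stand-in for an LLM that rejects over-long prompts."""
    if estimate_tokens(prompt) > context_window:
        raise ValueError("prompt exceeds the context window")
    return f"answer to: {prompt}"


if __name__ == "__main__":
    prompts = ["short question", "x" * 100]
    for prompt in prompts:
        tokens = estimate_tokens(prompt)
        # 1.5 credits per million tokens is a made-up rate for illustration.
        credits = estimate_credits(tokens, credits_per_million_tokens=1.5)
        print(tokens, credits, try_call(fake_llm, prompt))
```

In a batch job, rows where the result is None can be filtered out and inspected later rather than failing the whole run.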
Look for the Snowflake service consumption table, which lists each function's cost in credits per million tokens.

Now that we have covered the three types of LLM functions, Cortex Complete, the task-specific functions, and the helper functions, let's move on to the access controls for them. If you took our introductory course or have prior experience with Snowflake, you will be familiar with role-based access control. It is a mechanism for restricting users' access to data and apps based on their role. Within Snowflake Cortex, the CORTEX_USER database role ensures that only authorized users can use the Cortex LLM functions or other Cortex features. By default, the CORTEX_USER role is granted to the PUBLIC role. This allows all users on a Snowflake account to use the Cortex LLM functions. However, an account admin can revoke the CORTEX_USER role from the PUBLIC role to restrict access as needed. The CORTEX_USER role defines a set of access permissions that can be granted to others who will be using our applications. Without the CORTEX_USER role, a user cannot access Cortex capabilities.

In this video, we've gone over a lot. Let's remind ourselves of what we looked at. We looked at the three types of LLM functions available with Cortex. We started by looking at how Cortex Complete can run inference on models like Claude 3.5 Sonnet to perform text generation. From there, we looked at how to use the task-specific LLM functions for translation, sentiment analysis, text classification, and summarization to extract useful insights from our data. Then we looked at the helper functions, TryComplete and CountTokens. Finally, we looked at how we can control access to these functions using the CORTEX_USER database role.

Now that we have looked at all the LLM functions, we are ready to roll up our sleeves and get some hands-on experience with them. In the next video, we are going to dig in and start using Cortex Complete. See you in the next video.

2. Let's practice!
