Controlling output with top_p and max_tokens
Besides temperature, the parameters top_p (nucleus sampling) and max_tokens also affect AI model outputs. top_p controls output diversity by restricting sampling to the smallest set of the most likely tokens whose cumulative probability reaches the top_p threshold, while max_tokens caps the length of the response. In this exercise, you will test the function with different parameter combinations to see how the responses differ.
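To build intuition for what top_p does, here is a minimal toy sketch (not part of the exercise; the tokens and probabilities are invented) that truncates a next-token distribution to its nucleus:

def nucleus(probs, top_p):
    # Keep the smallest set of most-likely tokens whose cumulative
    # probability reaches the top_p threshold; sampling then happens
    # only within this set.
    kept, total = [], 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept

# Invented toy distribution for illustration
probs = {"the": 0.45, "a": 0.25, "robot": 0.15, "cookbook": 0.10, "zzz": 0.05}
print(nucleus(probs, 0.5))   # ['the', 'a'] -- low top_p keeps only the likeliest tokens
print(nucleus(probs, 0.95))  # ['the', 'a', 'robot', 'cookbook'] -- high top_p allows more variety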
In this exercise, the boto3 and json libraries, and the bedrock client, have been pre-imported.
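If you are working outside this pre-configured environment, a typical setup looks like the sketch below (the region is an assumption; use one where you have model access):

import boto3
import json

# invoke_model is served by the 'bedrock-runtime' client;
# region_name below is an example, not a requirement.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")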
Exercise instructions
- Initialize the Bedrock client.
- Generate a concise story using a low top_p and low max_tokens, and a more creative story using a high top_p and high max_tokens, keeping max_tokens to a maximum of 200.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def generate_story_with_params(bedrock, top_p, max_tokens):
    # Single-turn conversation: one user message with the story prompt
    messages = [{"role": "user",
                 "content": "Write a story about our robot writing a bestselling AI cookbook memoir."}]
    # Pass top_p and max_tokens as inference parameters in the request body
    request_body = json.dumps({"anthropic_version": "bedrock-2023-05-31",
                               "max_tokens": max_tokens,
                               "top_p": top_p,
                               "messages": messages})
    response = bedrock.invoke_model(body=request_body,
                                    modelId='anthropic.claude-3-5-sonnet-20240620-v1:0')
    # Read and decode the response stream, then extract the generated text
    response_body = json.loads(response.get('body').read().decode())
    return response_body["content"][0]["text"]
# Modify the parameters to create the two stories
short_focused = generate_story_with_params(bedrock, ____, ____)
long_diverse = generate_story_with_params(bedrock, ____, ____)

print("More focused:", short_focused)
print("More creative:", long_diverse)