
Controlling output with top_p and max_tokens

Besides temperature, the parameters top_p (nucleus sampling) and max_tokens also affect model output. top_p controls output diversity by restricting sampling to the most probable tokens whose cumulative probability reaches top_p, while max_tokens caps the length of the response, measured in tokens. In this exercise, you will call the function with different parameter combinations to see how the responses differ.
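
To make the effect of top_p concrete, here is a minimal sketch of nucleus filtering over a toy probability table. The function name nucleus_filter and the toy distribution are illustrative assumptions, not part of Bedrock or the course code, and the sampling a hosted model actually performs may differ in detail.

```python
def nucleus_filter(probs, top_p):
    """Keep the most probable tokens until their cumulative
    probability reaches top_p, then renormalize what is left."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    # Rank tokens from most to least probable
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution, for illustration only
toy = {"robot": 0.5, "chef": 0.3, "oven": 0.15, "quasar": 0.05}

print(sorted(nucleus_filter(toy, 0.5)))  # low top_p: few candidates survive
print(sorted(nucleus_filter(toy, 0.9)))  # high top_p: more candidates survive
```

A low top_p leaves only the likeliest tokens in play, which tends to give more focused text, while a high top_p lets less likely tokens through. max_tokens works independently of this: it only stops generation once the limit is reached.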

The boto3 and json libraries have been pre-imported, and the bedrock client is already available.

This exercise is part of the course

Introduction to Amazon Bedrock


Exercise instructions

  • Initialize Bedrock client.
  • Generate a concise story using a low top_p and a low max_tokens, and a more creative story using a high top_p and a high max_tokens, keeping max_tokens at or below 200.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def generate_story_with_params(bedrock, top_p, max_tokens):
    # Single-turn conversation containing the story prompt
    messages = [{"role": "user",
                 "content": "Write a story about our robot writing a bestselling AI cookbook memoir."}]
    # Request body in the Anthropic Messages format, carrying both sampling parameters
    request_body = json.dumps({"anthropic_version": "bedrock-2023-05-31", "max_tokens": max_tokens,
                               "top_p": top_p, "messages": messages})
    response = bedrock.invoke_model(body=request_body, modelId='anthropic.claude-3-5-sonnet-20240620-v1:0')
    # The response body is a stream: read it, then parse the JSON payload
    response_body = json.loads(response.get('body').read().decode())
    # Return the text of the first content block
    return response_body["content"][0]["text"]
    
# Modify the parameters to create the two stories
short_focused = generate_story_with_params(bedrock, ____, ____)
long_diverse = generate_story_with_params(bedrock, ____, ____)

print("More focused: ", short_focused, "More creative: ", long_diverse)
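
Before sending anything to the model, it can help to check the parameter pairs locally. The sketch below builds the same request body as the function above without calling Bedrock. The helper name build_request_body, the constant MAX_TOKENS_LIMIT, and the example values (0.1 with 50 tokens, 0.95 with 200 tokens) are illustrative assumptions, not the exercise's official solution.

```python
import json

MAX_TOKENS_LIMIT = 200  # upper bound taken from the exercise instructions


def build_request_body(prompt, top_p, max_tokens):
    """Hypothetical helper: validate the parameters and return the
    JSON request body as a string, without contacting any service."""
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if not 1 <= max_tokens <= MAX_TOKENS_LIMIT:
        raise ValueError(f"max_tokens must be between 1 and {MAX_TOKENS_LIMIT}")
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "top_p": top_p,
        "messages": [{"role": "user", "content": prompt}],
    })


prompt = "Write a story about our robot writing a bestselling AI cookbook memoir."

# Illustrative values only: a low/low pair and a high/high pair
focused_body = build_request_body(prompt, top_p=0.1, max_tokens=50)
diverse_body = build_request_body(prompt, top_p=0.95, max_tokens=200)

print(json.loads(focused_body)["top_p"], json.loads(focused_body)["max_tokens"])
print(json.loads(diverse_body)["top_p"], json.loads(diverse_body)["max_tokens"])
```

Because max_tokens is a hard cap, a story generated with a small value will usually end mid-sentence instead of being condensed to fit. In the Anthropic Messages response format, the stop_reason field reports "max_tokens" when the cap was what ended generation.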