
Implementing rate limit handling

When making multiple requests to Bedrock, we need to handle rate limits gracefully. In this exercise, you'll create a function that retries throttled requests with exponential backoff, waiting longer after each failed attempt and giving up after a fixed number of attempts, so the application stays reliable under heavy load.
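To make the idea of exponential backoff concrete, here is a minimal sketch of one common schedule, in which the wait doubles after every attempt. The function name `backoff_delays` is hypothetical, and the doubling factor is an assumption; the exercise itself leaves the exact expression for you to fill in.

```python
def backoff_delays(base_delay=0.5, max_attempts=3):
    """Return the wait (in seconds) after each attempt.

    Assumes a doubling schedule: base_delay * 2**attempt.
    """
    return [base_delay * (2 ** attempt) for attempt in range(max_attempts)]


print(backoff_delays())  # [0.5, 1.0, 2.0]
```

With a base delay of 0.5 seconds and three attempts, the waits grow from half a second to two seconds, which gives a throttled service progressively more room to recover.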

The json, time and boto3 libraries are preloaded. The bedrock client and model_id have also been preloaded.

This exercise is part of the course

Introduction to Amazon Bedrock


Exercise instructions

  • Complete the API request string with the prompt.
  • Implement exponential backoff.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def smart_retry(prompt, max_attempts=3):
  base_delay = 0.5
  
  for attempt in range(max_attempts):
    try:
      # Complete the API request string with the prompt
      response = bedrock.invoke_model(
        modelId=model_id,
        body=json.dumps({"anthropic_version": "bedrock-2023-05-31", "max_tokens": 100,
                         "messages": [{"role": "user", "content": [{"type": "text", "text": ____}]}]}))
      return json.loads(response["body"].read().decode())["content"][0]["text"]
    
    except Exception as e:
      if "ThrottlingException" in str(e):
        # Implement exponential backoff
        time.sleep(____)
      else:
        raise
  return "Max retries exceeded"

print(smart_retry("Tell me about podcasting."))
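The retry pattern can also be tried without calling Bedrock at all. The sketch below is an offline illustration under stated assumptions: `FakeThrottlingError`, `make_flaky_call` and `retry_with_backoff` are hypothetical names, the simulated call stands in for the real request, and the doubling delay is one common choice rather than the only valid one. The `sleep` parameter is injectable so the waits can be recorded instead of actually pausing.

```python
import time


class FakeThrottlingError(Exception):
    """Hypothetical stand-in for a throttling error raised by a client."""


def make_flaky_call(failures):
    """Return a function that is throttled `failures` times, then succeeds."""
    state = {"calls": 0}

    def call():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise FakeThrottlingError("ThrottlingException: rate exceeded")
        return f"ok after {state['calls']} calls"

    return call


def retry_with_backoff(call, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry `call` on throttling errors, doubling the wait each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as e:
            # Only throttling errors are retried; anything else propagates.
            if "ThrottlingException" not in str(e):
                raise
            sleep(base_delay * (2 ** attempt))
    return "Max retries exceeded"


# Record the waits in a list instead of sleeping.
waits = []
print(retry_with_backoff(make_flaky_call(2), sleep=waits.append))  # ok after 3 calls
print(waits)  # [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the default behaviour (`time.sleep`) while making the backoff schedule easy to inspect in a quick check.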