Implementing rate limit handling
When making multiple requests to Bedrock, we need to handle rate limits gracefully. In this exercise, you'll create a function that implements exponential backoff while keeping track of request attempts, ensuring our application remains reliable under heavy load.
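As a quick illustration (not part of the exercise code), exponential backoff simply doubles the wait time on each retry. With a base delay of 0.5 seconds, three attempts would wait 0.5s, 1.0s, and 2.0s before retrying:

base_delay = 0.5
for attempt in range(3):
    # Delay doubles with each attempt: 0.5s, 1.0s, 2.0s
    print(f"attempt {attempt}: wait {base_delay * 2 ** attempt:.1f}s before retrying")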
The json, time, and boto3 libraries are preloaded. The bedrock client and model_id have also been preloaded.
This exercise is part of the course Introduction to Amazon Bedrock.
Exercise instructions
- Complete the API request string with the prompt.
- Implement exponential backoff.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def smart_retry(prompt, max_attempts=3):
    base_delay = 0.5
    for attempt in range(max_attempts):
        try:
            # Complete the API request string with the prompt
            response = bedrock.invoke_model(
                modelId=model_id,
                body=json.dumps({"anthropic_version": "bedrock-2023-05-31", "max_tokens": 100,
                                 "messages": [{"role": "user", "content": [{"type": "text", "text": ____}]}]}))
            return json.loads(response["body"].read().decode())["content"][0]["text"]
        except Exception as e:
            if "ThrottlingException" in str(e):
                # Implement exponential backoff
                time.sleep(____)
            else:
                raise e
    return "Max retries exceeded"

print(smart_retry("Tell me about podcasting."))
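For reference, one way to fill in the blanks is sketched below, using the preloaded bedrock client and model_id. It assumes the intended backoff is base_delay * 2 ** attempt (0.5s, 1.0s, 2.0s); any exponentially growing delay follows the same idea, and the prompt argument is passed as the text of the user message.

def smart_retry(prompt, max_attempts=3):
    base_delay = 0.5
    for attempt in range(max_attempts):
        try:
            # Pass the prompt as the text content of the user message
            response = bedrock.invoke_model(
                modelId=model_id,
                body=json.dumps({"anthropic_version": "bedrock-2023-05-31", "max_tokens": 100,
                                 "messages": [{"role": "user",
                                               "content": [{"type": "text", "text": prompt}]}]}))
            return json.loads(response["body"].read().decode())["content"][0]["text"]
        except Exception as e:
            if "ThrottlingException" in str(e):
                # Wait longer after each throttled attempt (assumed backoff: 0.5s, 1.0s, 2.0s)
                time.sleep(base_delay * 2 ** attempt)
            else:
                raise e
    return "Max retries exceeded"

print(smart_retry("Tell me about podcasting."))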