Amazon Bedrock Agents are like smart assistants for your AWS infrastructure — they can reason, decide what to do next, and trigger actions using Lambda functions.
In this article, I’ll show how I built a Supervisor Agent that orchestrates multiple AWS Lambdas to:
- List EC2 instances,
- Fetch their CPU metrics from CloudWatch,
- Combine both results intelligently, all without the agent ever calling AWS APIs directly.
By the end, you’ll understand how Bedrock Agents work, how to use action groups, and how to chain Lambdas through a supervisor function — a clean, scalable pattern for multi-step automation.
Let’s look at a diagram and a few other examples of what an agent does, to make the idea easier to picture:
A user calls the Bedrock agent (1) with some task, say, “How many TVs do you have in stock?”. The agent knows from its prompt that questions about stock status must be routed (2) to the “database” action group (3, AG). In the database AG we defined a Lambda function to use (4); this Lambda checks the DynamoDB table (5), gets the response (6, 7), and the answer is returned to the user (8).
Let’s check one more example: 
Each agent can have multiple action groups. For example, if we want information about AWS resources, such as a list of all ECS tasks, the logic is the same as in the previous example. To tell the agent which AG to call, we use the prompt; I will explain this later.
And one more example:

We added one more AG, this time for EKS. As you can see, each action group can have more than one function to call. In this example, the functions list and delete resources in an existing K8S cluster.
An action group and its Lambda function can implement any functionality, including calls to a third-party API to fetch weather data or flight ticket availability.
I hope that is a bit clearer now. Let’s get back to our supervisor agent setup:
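One way to serve several functions from a single Lambda is a small dispatch table keyed by the function name the agent sends. This is a minimal sketch, not the setup used later in this article; the handler names (list_pods, delete_pod) and their return strings are hypothetical placeholders.

```python
from typing import Any, Callable, Dict


# Hypothetical handlers; real ones would call AWS or a third-party API.
def list_pods(params: Dict[str, str]) -> str:
    return f"Listing pods in namespace {params.get('namespace', 'default')}"


def delete_pod(params: Dict[str, str]) -> str:
    return f"Deleting pod {params['name']}"


# Maps the function name sent by the agent to the code that handles it.
HANDLERS: Dict[str, Callable[[Dict[str, str]], str]] = {
    "list_pods": list_pods,
    "delete_pod": delete_pod,
}


def dispatch(event: Dict[str, Any]) -> str:
    """Route an agent event to the handler named in event['function']."""
    # Parameters arrive as a list of {"name", "type", "value"} items.
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    handler = HANDLERS.get(event["function"])
    if handler is None:
        return f"Unknown function '{event['function']}' requested."
    return handler(params)
```

Adding a new capability then means adding one handler and one entry in HANDLERS, without touching the routing logic.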
In the AWS console, open Bedrock → Agents → Create agent

Give it a name and click Create

Once the agent is created, you can change the model or keep the default Claude model. I will switch to Nova Premier 1.0.
Add a description and instructions for the agent. We will create the action group in the next step:
You are the main AWS Supervisor Agent.
Goal: Help analyze AWS infrastructure.
Action Groups:
- ec2: list_instances → returns instance list + instanceIds
Rules:
- Never call AWS APIs directly.
- For EC2:
- Call ec2__list_instances
Always use before actions.
Note:
ec2 – action group name
list_instances – function name; as I mentioned previously, you can have multiple functions per action group

Click “Save”

and then “Prepare” at the top. Prepare becomes active once you save.

Scroll down to Action groups → Add


Name – EC2. Action group invocation – create a new Lambda function. The function name, list_instances, must be the same as the one we defined in the agent instructions

Add the action group name and description, click Create, and again click “Save” and “Prepare”.
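Before writing the handler, it helps to see what the agent sends to the Lambda. The sketch below follows the function-details event shape described in the AWS documentation for agent Lambda functions; all values (agent name, IDs, input text) are made up for illustration, and read_event is a hypothetical helper.

```python
from typing import Any, Dict, Tuple

# Illustrative event; every value here is a placeholder.
SAMPLE_EVENT: Dict[str, Any] = {
    "messageVersion": "1.0",
    "agent": {"name": "supervisor", "id": "AGENT123", "alias": "TSTALIASID", "version": "DRAFT"},
    "inputText": "List my EC2 instances",
    "sessionId": "session-1",
    "actionGroup": "ec2",
    "function": "list_instances",
    "parameters": [],  # list of {"name": ..., "type": ..., "value": ...}
    "sessionAttributes": {},
    "promptSessionAttributes": {},
}


def read_event(event: Dict[str, Any]) -> Tuple[str, str, Dict[str, Any]]:
    """Return the action group, the function name, and parameters as a dict."""
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    return event["actionGroup"], event["function"], params
```

The handler in the next step reads the same actionGroup, function, and parameters keys.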
Go to your Lambda function (Bedrock created it with an EC2 prefix in the name) and add this code:
import logging
from typing import Dict, Any
from http import HTTPStatus

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

ec2_client = boto3.client('ec2')


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """
    AWS Lambda handler for processing Bedrock agent requests related to EC2 instances.
    Supports:
    - Listing all EC2 instances
    - Describing a specific instance by ID
    """
    try:
        action_group = event['actionGroup']
        function = event['function']
        message_version = event.get('messageVersion', 1)
        parameters = event.get('parameters', [])
        response_text = ""

        if function == "list_instances":
            # List all EC2 instances
            instances = ec2_client.describe_instances()
            instance_list = []
            for reservation in instances['Reservations']:
                for instance in reservation['Instances']:
                    instance_list.append({
                        'InstanceId': instance.get('InstanceId'),
                        'State': instance.get('State', {}).get('Name'),
                        'InstanceType': instance.get('InstanceType'),
                        'PrivateIpAddress': instance.get('PrivateIpAddress', 'N/A'),
                        'PublicIpAddress': instance.get('PublicIpAddress', 'N/A')
                    })
            response_text = f"Found {len(instance_list)} EC2 instance(s): {instance_list}"

        elif function == "describe_instance":
            # Expect a parameter with the instance ID
            instance_id_param = next((p for p in parameters if p['name'] == 'instanceId'), None)
            if not instance_id_param:
                raise KeyError("Missing required parameter: instanceId")
            instance_id = instance_id_param['value']
            result = ec2_client.describe_instances(InstanceIds=[instance_id])
            instance = result['Reservations'][0]['Instances'][0]
            response_text = (
                f"Instance {instance_id} details: "
                f"State={instance['State']['Name']}, "
                f"Type={instance['InstanceType']}, "
                f"Private IP={instance.get('PrivateIpAddress', 'N/A')}, "
                f"Public IP={instance.get('PublicIpAddress', 'N/A')}"
            )

        else:
            response_text = f"Unknown function '{function}' requested."

        # Format Bedrock agent response
        response_body = {
            'TEXT': {
                'body': response_text
            }
        }
        action_response = {
            'actionGroup': action_group,
            'function': function,
            'functionResponse': {
                'responseBody': response_body
            }
        }
        response = {
            'response': action_response,
            'messageVersion': message_version
        }
        logger.info('Response: %s', response)
        return response

    except KeyError as e:
        logger.error('Missing required field: %s', str(e))
        return {
            'statusCode': HTTPStatus.BAD_REQUEST,
            'body': f'Error: {str(e)}'
        }
    except Exception as e:
        logger.error('Unexpected error: %s', str(e))
        return {
            'statusCode': HTTPStatus.INTERNAL_SERVER_ERROR,
            'body': f'Internal server error: {str(e)}'
        }
NOTE: the function’s response must be in the Bedrock-specific format; details can be found in the documentation:
https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
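Since every Lambda in this article returns the same envelope, it can be built in one place. The helper below is a sketch, and build_response is a hypothetical name. The optional responseState field (FAILURE or REPROMPT) is taken from the documentation linked above, so treat its exact behavior as something to verify there. Note that the error branches in the handler above return a statusCode/body dictionary instead of this envelope; returning the envelope with an error message may give the agent something it can actually read.

```python
from typing import Any, Dict, Optional


def build_response(event: Dict[str, Any], text: str, state: Optional[str] = None) -> Dict[str, Any]:
    """Wrap plain text in the response envelope expected from an agent Lambda."""
    function_response: Dict[str, Any] = {
        "responseBody": {"TEXT": {"body": text}}
    }
    if state is not None:
        # Assumed optional field; see the linked documentation for allowed values.
        function_response["responseState"] = state
    return {
        "messageVersion": event.get("messageVersion", "1.0"),
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": function_response,
        },
    }
```

In an except block this could be used as `return build_response(event, f"Error: {e}", state="FAILURE")`, provided the event itself contains actionGroup and function.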
After updating the function code, go to the function’s Configuration → Permissions → Role name and create a new inline policy:
As JSON:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
Now we can go back to our agent, click “Test”, and enter some text to check that it actually works.
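The same check can be scripted instead of using the console. The sketch below assumes a boto3 bedrock-agent-runtime client and its invoke_agent call; the agent ID and alias ID are placeholders you would copy from the console, and ask_agent is a hypothetical helper name.

```python
import uuid
from typing import Any, List


def ask_agent(client: Any, agent_id: str, alias_id: str, prompt: str) -> str:
    """Send one prompt to the agent and join the streamed answer chunks."""
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=alias_id,
        sessionId=str(uuid.uuid4()),  # a fresh session for every call
        inputText=prompt,
    )
    parts: List[str] = []
    # The answer arrives as an event stream; text is carried in "chunk" events.
    for event in response["completion"]:
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)


# Usage (needs AWS credentials; the IDs are placeholders):
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# print(ask_agent(client, "<agent_id>", "<alias_id>", "List my EC2 instances"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub in unit tests.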

Cool! The first action group works as expected. Let’s add one more action group to fetch CloudWatch metrics:

Name of the action group – cloudwatch

The function name is getMetrics. Add a description and parameters, since this Lambda must know which instance or instances to check metrics for
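The agent may pass a list parameter as a real list, as a JSON string, or as a Python-style string such as "['i-0abc', 'i-0def']", so the Lambda has to be tolerant. A minimal sketch of such parsing follows; parse_instance_ids is a hypothetical helper, and the regular expression assumes instance IDs look like "i-" followed by lowercase hex digits.

```python
import json
import re
from typing import Any, List


def parse_instance_ids(raw: Any) -> List[str]:
    """Normalize an instance_ids parameter value into a list of ID strings."""
    if isinstance(raw, list):
        return [str(x).strip() for x in raw if str(x).strip()]
    if not isinstance(raw, str):
        return []
    try:
        # Well-formed JSON such as '["i-0abc1234"]'
        parsed = json.loads(raw)
        if isinstance(parsed, list):
            return [str(x).strip() for x in parsed if str(x).strip()]
    except json.JSONDecodeError:
        pass
    # Fallback: pull anything that looks like an instance ID out of the text.
    return re.findall(r"i-[0-9a-f]+", raw)
```

The function code shown later handles the same cases with string replacement; a regex fallback like this one is a bit more forgiving of stray quotes and whitespace.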
Update the agent prompt to explain how we want to use the new action group, and click “Save” and “Prepare” again:
You are the main AWS Supervisor Agent.
Goal: Help analyze AWS infrastructure.
Action Groups:
- ec2: list_instances → returns instance list + instanceIds
- cloudwatch: getMetrics → needs instance_ids
Rules:
- Never call AWS APIs directly.
- For EC2 + CPU:
- Call ec2__list_instances
- Extract instanceIds
- Call cloudwatch__getMetrics
- Combine results.
Always use before actions.
Now let’s update our cloudwatch function code:
import boto3
import datetime
import logging
import json
from typing import Dict, Any
from http import HTTPStatus

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    try:
        action_group = event["actionGroup"]
        function = event["function"]
        message_version = event.get("messageVersion", 1)
        parameters = event.get("parameters", [])
        region = "us-east-1"
        instance_ids = []

        # --- Parse parameters ---
        for param in parameters:
            if param.get("name") == "region":
                region = param.get("value")
            elif param.get("name") == "instance_ids":
                raw_value = param.get("value")
                if isinstance(raw_value, str):
                    # Clean up stringified list from Bedrock agent
                    raw_value = raw_value.strip().replace("[", "").replace("]", "").replace("'", "")
                    instance_ids = [x.strip() for x in raw_value.split(",") if x.strip()]
                elif isinstance(raw_value, list):
                    instance_ids = raw_value
        logger.info(f"Parsed instance IDs: {instance_ids}")

        if not instance_ids:
            response_text = f"No instance IDs provided for CloudWatch metrics in {region}."
        else:
            cloudwatch = boto3.client("cloudwatch", region_name=region)
            # Timezone-aware UTC timestamps for the last hour
            now = datetime.datetime.now(datetime.timezone.utc)
            start_time = now - datetime.timedelta(hours=1)
            metrics_output = []
            for instance_id in instance_ids:
                try:
                    metric = cloudwatch.get_metric_statistics(
                        Namespace="AWS/EC2",
                        MetricName="CPUUtilization",
                        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
                        StartTime=start_time,
                        EndTime=now,
                        Period=300,
                        Statistics=["Average"]
                    )
                    datapoints = metric.get("Datapoints", [])
                    if datapoints:
                        # Datapoints are not guaranteed to be ordered; take the most recent one
                        datapoints.sort(key=lambda x: x["Timestamp"])
                        avg_cpu = round(datapoints[-1]["Average"], 2)
                        metrics_output.append(f"{instance_id}: {avg_cpu}% CPU (latest 5-min average)")
                    else:
                        metrics_output.append(f"{instance_id}: No recent CPU data")
                except Exception as e:
                    logger.error(f"Error fetching metrics for {instance_id}: {e}")
                    metrics_output.append(f"{instance_id}: Error fetching metrics")
            response_text = (
                f"CPU Utilization (last hour) in {region}:\n" +
                "\n".join(metrics_output)
            )

        # --- Bedrock Agent response format ---
        response_body = {
            "TEXT": {
                "body": response_text
            }
        }
        action_response = {
            "actionGroup": action_group,
            "function": function,
            "functionResponse": {
                "responseBody": response_body
            }
        }
        response = {
            "response": action_response,
            "messageVersion": message_version
        }
        logger.info("Response: %s", response)
        return response

    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        return {
            "statusCode": HTTPStatus.INTERNAL_SERVER_ERROR,
            "body": f"Internal server error: {str(e)}"
        }
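The handler above reports the most recent 5-minute datapoint for each instance. If a figure for the whole hour is wanted, the datapoints can be summarized before formatting. A minimal sketch, assuming the Datapoints list returned by get_metric_statistics with the Average statistic; summarize_cpu is a hypothetical helper, and the mean of the period averages only approximates the true hourly average.

```python
import datetime
from typing import Any, Dict, List, Optional


def summarize_cpu(datapoints: List[Dict[str, Any]]) -> Optional[Dict[str, float]]:
    """Return the latest and the mean of the per-period CPU averages."""
    if not datapoints:
        return None
    # CloudWatch does not guarantee ordering, so sort by timestamp first.
    ordered = sorted(datapoints, key=lambda d: d["Timestamp"])
    averages = [d["Average"] for d in ordered]
    return {
        "latest": round(averages[-1], 2),
        "hour_mean": round(sum(averages) / len(averages), 2),
    }


# Example with made-up datapoints:
t0 = datetime.datetime(2024, 1, 1, 12, 0)
sample = [
    {"Timestamp": t0 + datetime.timedelta(minutes=5), "Average": 30.0},
    {"Timestamp": t0, "Average": 10.0},
]
print(summarize_cpu(sample))
```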
And update the cloudwatch Lambda permissions as we did for the ec2 Lambda:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricStatistics"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
And test it again

We have EC2 and CloudWatch action groups, and they can be called from the agent to get a list of EC2 instances and their CPU metrics. Now, let’s add a Supervisor function to make this process smarter and more efficient.
Instead of the agent calling both EC2 and CloudWatch separately, the Supervisor takes care of that logic. It first calls the EC2 function to get all instances, then passes those instance IDs to the CloudWatch function to fetch metrics, and finally combines everything into one clear result.
This way, the agent only needs to call one action — the Supervisor — while the Supervisor coordinates all the steps in the background. It’s cleaner, faster, and easier to maintain.

Give it a name and description

Give the function a name (the code below expects analyzeInfrastructure) and a description
And update the agent instructions to avoid a direct call to the ec2 and CloudWatch action groups:

And click “Save” and “Prepare”.
Update the supervisor Lambda function code.
NOTE: replace the EC2 and CloudWatch function names in the code below with your own:
import boto3
import json
import logging
import re
import ast

logger = logging.getLogger()
logger.setLevel(logging.INFO)

lambda_client = boto3.client("lambda")


def lambda_handler(event, context):
    try:
        action_group = event["actionGroup"]
        function = event["function"]
        parameters = event.get("parameters", [])
        message_version = event.get("messageVersion", "1.0")

        # Parse parameters
        region = "us-east-1"
        for param in parameters:
            if param.get("name") == "region":
                region = param.get("value")

        # Decide routing
        if function == "analyzeInfrastructure":
            logger.info("Supervisor: calling EC2 and CloudWatch")

            # Step 1: call EC2 Lambda
            ec2_payload = {
                "actionGroup": "ec2",
                "function": "list_instances",
                "parameters": [{"name": "region", "value": region}],
                "messageVersion": "1.0"
            }
            ec2_response = invoke_lambda("ec2-yeikw", ec2_payload)  #### CHANGE TO YOUR EC2 FUNCTION NAME
            instances = extract_instance_ids(ec2_response)

            # Step 2: call CloudWatch Lambda (if instances found)
            if instances:
                cw_payload = {
                    "actionGroup": "cloudwatch",
                    "function": "getMetrics",
                    "parameters": [
                        {"name": "region", "value": region},
                        {"name": "instance_ids", "value": instances}
                    ],
                    "messageVersion": "1.0"
                }
                cw_response = invoke_lambda("cloudwatch-ef6ty", cw_payload)  #### CHANGE TO YOUR CLOUDWATCH FUNCTION NAME
                final_text = merge_responses(ec2_response, cw_response)
            else:
                final_text = "No instances found to analyze."
        else:
            final_text = f"Unknown function: {function}"

        # Construct Bedrock-style response
        response = {
            "messageVersion": message_version,
            "response": {
                "actionGroup": action_group,
                "function": function,
                "functionResponse": {
                    "responseBody": {
                        "TEXT": {"body": final_text}
                    }
                }
            }
        }
        logger.info("Supervisor response: %s", response)
        return response

    except Exception as e:
        logger.exception("Error in supervisor")
        return {
            "statusCode": 500,
            "body": f"Supervisor error: {str(e)}"
        }


def invoke_lambda(name, payload):
    """Helper to call another Lambda and parse response"""
    response = lambda_client.invoke(
        FunctionName=name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload),
    )
    result = json.loads(response["Payload"].read())
    return result


def extract_instance_ids(ec2_response):
    """Extract instance IDs from EC2 Lambda response"""
    try:
        body = ec2_response["response"]["functionResponse"]["responseBody"]["TEXT"]["body"]
        # Try to extract the Python-style list after "Found X EC2 instance(s):"
        if "Found" in body and "[" in body and "]" in body:
            data_part = body.split(":", 1)[1].strip()
            try:
                instances = ast.literal_eval(data_part)  # safely parse the list
                return [i["InstanceId"] for i in instances if "InstanceId" in i]
            except Exception:
                pass
        # fallback regex in case of plain text
        return re.findall(r"i-[0-9a-f]+", body)
    except Exception as e:
        logger.error("extract_instance_ids error: %s", e)
        return []


def merge_responses(ec2_resp, cw_resp):
    """Combine EC2 and CloudWatch outputs"""
    ec2_text = ec2_resp["response"]["functionResponse"]["responseBody"]["TEXT"]["body"]
    cw_text = cw_resp["response"]["functionResponse"]["responseBody"]["TEXT"]["body"]
    return f"{ec2_text}\n\n{cw_text}"
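One thing the invoke_lambda helper above does not do is check whether the downstream function actually succeeded. When an invoked function raises or times out, the boto3 invoke response carries a FunctionError key and the payload holds the error details. The variant below is a sketch that surfaces such failures early; invoke_lambda_checked is a hypothetical name, and the client is passed in as an argument so the helper can be tried without AWS access.

```python
import json
from typing import Any, Dict


def invoke_lambda_checked(client: Any, name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Invoke another Lambda synchronously and fail loudly on errors."""
    response = client.invoke(
        FunctionName=name,
        InvocationType="RequestResponse",
        Payload=json.dumps(payload),
    )
    result = json.loads(response["Payload"].read())
    # Set by Lambda when the invoked function raised or timed out.
    if response.get("FunctionError"):
        raise RuntimeError(f"{name} failed: {result.get('errorMessage', result)}")
    # The EC2 and CloudWatch handlers return a statusCode/body dict on errors,
    # which lacks the "response" key that the supervisor expects.
    if "response" not in result:
        raise RuntimeError(f"{name} returned an unexpected payload: {result}")
    return result
```

With this in place, a failure in the EC2 or CloudWatch function would show up in the supervisor logs with the downstream error message instead of a generic KeyError.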
And again, add permissions so the supervisor Lambda can invoke our EC2 and CloudWatch functions, for example:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": [
                "arn:aws:lambda:us-east-1:<account_id>:function:ec2-<id>",
                "arn:aws:lambda:us-east-1:<account_id>:function:cloudwatch-<id>"
            ]
        }
    ]
}
Let’s test the function again. Surprisingly, it fails:

I checked my supervisor function logs and saw this:

At first glance it doesn’t show anything useful, but there is a hint: 3000.00 ms. That is the default Lambda function timeout, so let’s adjust it. Go to the supervisor function → Configuration → General configuration and edit the Timeout parameter; I changed it to 10 seconds.

And it helped!
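The same change can be scripted. The sketch below assumes a boto3 Lambda client and its update_function_configuration call; set_timeout and the function name in the usage comment are hypothetical. Because the supervisor waits for both downstream Lambdas, its timeout needs to cover their combined run time plus some margin.

```python
from typing import Any


def set_timeout(client: Any, function_name: str, seconds: int) -> int:
    """Update a Lambda function's timeout and return the value now configured."""
    # Lambda accepts timeouts from 1 second up to 900 seconds (15 minutes).
    if not 1 <= seconds <= 900:
        raise ValueError("Timeout must be between 1 and 900 seconds")
    response = client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return response["Timeout"]


# Usage (needs AWS credentials; the function name is a placeholder):
# import boto3
# set_timeout(boto3.client("lambda"), "<your-supervisor-function>", 10)
```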

You can keep extending this setup, for example by adding AWS billing analysis to find your most expensive resources or EC2 instances. You are not limited to AWS resources either; feel free to add external functionality.
