Observing and evaluating AI agentic workflows with Strands Agents SDK and Arize AX

Machine Learning Blog

This article shows how the Strands Agents SDK and Arize AX can be used together to observe and evaluate agentic AI workflows, addressing key challenges in generative AI application development.

- AI agents are nondeterministic: the same input can produce different results.
- Key challenges include unpredictable behavior, hidden failure modes, and complex tool integration.
- Arize AX provides observability features:
  - Tracing LLM operations
  - Automated quality monitoring
  - Prompt management
  - Real-time dashboards and alerts
- The solution walks through building a restaurant reservation agent with the Strands Agents SDK and instrumenting it with Arize AX.
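
To make the tracing and automated-evaluation ideas above concrete, here is a minimal sketch that uses only the Python standard library. It does not use the real Strands Agents SDK or Arize AX APIs: `Tracer`, `Span`, `book_table`, `run_agent`, and `evaluate` are all hypothetical names introduced for illustration. The assumption is that each agent run and each tool call is recorded as a span sharing a trace ID, and that a simple automated check is then computed over the recorded spans.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    """One recorded operation (hypothetical structure, not an Arize AX type)."""
    name: str
    trace_id: str
    attributes: dict = field(default_factory=dict)
    duration_ms: float = 0.0
    status: str = "ok"


class Tracer:
    """Collects spans in memory; a real setup would export them to a backend."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name, trace_id, **attributes):
        record = Span(name=name, trace_id=trace_id, attributes=dict(attributes))
        start = time.perf_counter()
        try:
            yield record
        except Exception as exc:
            # Mark the span as failed, then let the error propagate.
            record.status = "error"
            record.attributes["error"] = repr(exc)
            raise
        finally:
            record.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(record)


def book_table(party_size, time_slot):
    """Hypothetical reservation tool standing in for a real booking backend."""
    if party_size < 1:
        raise ValueError("party_size must be at least 1")
    return {"confirmed": True, "party_size": party_size, "time_slot": time_slot}


def run_agent(tracer, party_size, time_slot):
    """Stand-in for an agent run: one outer span wrapping one tool-call span."""
    trace_id = uuid.uuid4().hex
    with tracer.span("agent.run", trace_id, request=f"{party_size} at {time_slot}"):
        with tracer.span("tool.book_table", trace_id, party_size=party_size) as tool_span:
            result = book_table(party_size, time_slot)
            tool_span.attributes["confirmed"] = result["confirmed"]
    return result


def evaluate(spans):
    """Toy automated check: fraction of tool spans that finished without error."""
    tool_spans = [s for s in spans if s.name.startswith("tool.")]
    if not tool_spans:
        return 0.0
    return sum(s.status == "ok" for s in tool_spans) / len(tool_spans)
```

Calling `run_agent(Tracer(), 4, "19:00")` records two spans with the same trace ID, and `evaluate` can then be run over `tracer.spans` on a schedule or after each batch of requests. In a production setup the in-memory list would presumably be replaced by an exporter to an observability platform, and the toy success-rate check by richer evaluations.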

The article emphasizes that observability, automatic evaluations, and proactive monitoring are critical for deploying reliable AI agents in production environments.



May 16, 2025
Introducing Strands Agents, an Open Source AI Agents SDK

Nov 20, 2025
Introducing Strands Agent SOPs – Natural Language Workflows for AI Agents

Mar 18, 2026
Evaluating AI agents for production: A practical guide to Strands Evals

Jul 23, 2026
Evaluating AI Agents: A production blueprint with Strands and AgentCore



Related articles