ToolSimulator: scalable tool testing for AI agents
Machine Learning Blog
This article introduces ToolSimulator, an LLM-powered framework within Strands Evals for testing tool-using AI agents safely and at scale.
- Replaces live API calls, and the side effects they risk, with adaptive LLM-based simulations
- Maintains consistent state across multi-turn workflows without touching production systems
- Validates tool responses against Pydantic schemas to catch malformed outputs
- Generates realistic, context-appropriate responses based on tool schemas and agent inputs
- Supports independent simulator instances for parallel testing configurations
- Integrates with Strands Evals evaluation pipelines and telemetry
- Available in Strands Evals SDK; no AWS account required for local testing
ToolSimulator enables developers to test complex agent workflows safely, catch integration bugs early, and deploy production-ready agents without managing test infrastructure or risking unintended side effects.
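To make the points about shared state, response validation, and independent instances concrete, here is a minimal, dependency-free sketch of the idea. It is not the Strands Evals API: every name (`ToySimulator`, `SimulatedTool`, `build_simulator`, the order tools) is hypothetical, a deterministic function stands in for the LLM call, and a plain field/type check stands in for the Pydantic schema validation described above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

State = Dict[str, Any]
Payload = Dict[str, Any]


@dataclass
class SimulatedTool:
    """Hypothetical description of one simulated tool."""
    name: str
    # Stand-in for a Pydantic response schema: field name -> expected type.
    response_fields: Dict[str, type]
    # Stand-in for the LLM call: (shared state, tool arguments) -> response.
    responder: Callable[[State, Payload], Payload]


def validate(response: Payload, fields: Dict[str, type]) -> None:
    """Reject responses with missing or wrongly typed fields."""
    for key, expected in fields.items():
        if key not in response:
            raise ValueError(f"missing field: {key}")
        if not isinstance(response[key], expected):
            raise ValueError(f"field {key} is not {expected.__name__}")


class ToySimulator:
    """Each instance owns its own state, so instances never interfere."""

    def __init__(self) -> None:
        self.state: State = {}
        self.tools: Dict[str, SimulatedTool] = {}

    def register(self, tool: SimulatedTool) -> None:
        self.tools[tool.name] = tool

    def call(self, name: str, **arguments: Any) -> Payload:
        tool = self.tools[name]
        response = tool.responder(self.state, arguments)
        validate(response, tool.response_fields)  # catch malformed output
        return response


def create_order(state: State, args: Payload) -> Payload:
    # Writes to simulator state only; no real system is touched.
    order_id = f"order-{len(state) + 1}"
    state[order_id] = {"item": args["item"], "status": "created"}
    return {"order_id": order_id, "status": "created"}


def get_order(state: State, args: Payload) -> Payload:
    # Reads what an earlier turn wrote, keeping the workflow consistent.
    order = state.get(args["order_id"])
    status = order["status"] if order is not None else "not_found"
    return {"order_id": args["order_id"], "status": status}


def build_simulator() -> ToySimulator:
    schema = {"order_id": str, "status": str}
    sim = ToySimulator()
    sim.register(SimulatedTool("create_order", schema, create_order))
    sim.register(SimulatedTool("get_order", schema, get_order))
    return sim


if __name__ == "__main__":
    sim = build_simulator()
    created = sim.call("create_order", item="book")
    print(sim.call("get_order", order_id=created["order_id"]))
```

In this sketch a second `build_simulator()` instance would report the same order as `not_found`, which is the property that makes parallel test configurations safe. In the real framework, the responder role is played by an LLM and the validation by Pydantic schemas, per the summary above.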