Implementing programmatic tool calling on Amazon Bedrock
Machine Learning Blog
This article explains programmatic tool calling (PTC), a paradigm that enables LLMs to generate Python code that orchestrates multiple tool calls within a sandboxed environment, dramatically reducing latency and token consumption compared to traditional sequential tool calling.
- The model generates code once, the sandbox executes the tool calls, and only the final result returns to the model context
- Reduces token consumption by 87–92% by eliminating intermediate results from context window
- Improves accuracy by using Python for data processing instead of natural language reasoning
- Three implementation approaches: self-hosted Docker on ECS, managed AgentCore Code Interpreter, Anthropic SDK proxy
- Self-hosted solution uses an IPC protocol between the orchestrator and the Docker sandbox for tool execution
- Sandbox enforces strict security: no network, read-only filesystem, non-root user, resource limits
- AgentCore pre-loads tools into sandbox; model generates code calling pre-loaded functions directly
- Tested across eight models (Claude, Qwen, DeepSeek, MiniMax, Kimi, GLM); all achieved correct results in PTC mode
- Cost savings example: 90% reduction ($14,040/month savings on 1,000 daily executions)
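The cost figure in the last bullet can be unpacked with a little arithmetic. The sketch below only back-derives the baseline and PTC costs implied by the stated numbers; it assumes a 30-day month and that the 90% reduction applies to the whole monthly bill, neither of which the summary confirms. The function name `implied_costs` is hypothetical.

```python
def implied_costs(monthly_savings_usd, reduction, daily_executions, days_per_month=30):
    """Back-derive the costs implied by a stated saving and reduction ratio.

    Assumption: days_per_month=30 and a uniform cost per execution.
    """
    # If savings = reduction * baseline, then baseline = savings / reduction.
    baseline = monthly_savings_usd / reduction
    ptc = baseline - monthly_savings_usd
    runs = daily_executions * days_per_month
    return {
        "baseline_monthly_usd": round(baseline, 2),
        "ptc_monthly_usd": round(ptc, 2),
        "baseline_per_run_usd": round(baseline / runs, 4),
        "ptc_per_run_usd": round(ptc / runs, 4),
    }


# Figures from the summary: $14,040/month saved, 90% reduction, 1,000 runs/day.
costs = implied_costs(14_040, 0.90, 1_000)
for name, value in costs.items():
    print(f"{name}: {value}")
```

Under those assumptions, the stated savings imply a baseline of about $15,600 per month (roughly $0.52 per execution) against about $1,560 per month (roughly $0.052 per execution) with PTC.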
PTC is model-agnostic and privately deployed within AWS accounts, offering significant efficiency gains for multi-tool workflows involving data processing, calculations, and orchestration.
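To make the flow concrete, here is a minimal in-process sketch of the PTC pattern, not the article's implementation. The tools (`get_team_members`, `get_expenses`), their data, and the `GENERATED_CODE` string standing in for model output are all hypothetical. A restricted `exec` namespace stands in for the sandbox purely for illustration; it is not a security boundary, and a real deployment would run the code in an isolated container as the summary describes.

```python
# Hypothetical tools with canned data; real tools would call external systems.
def get_team_members(team):
    return ["alice", "bob", "carol"]


def get_expenses(member):
    data = {
        "alice": [{"amount": 300}, {"amount": 450}],
        "bob": [{"amount": 120}],
        "carol": [{"amount": 600}],
    }
    return data[member]


TOOLS = {"get_team_members": get_team_members, "get_expenses": get_expenses}

# Stand-in for code the model would generate in a single turn.
GENERATED_CODE = '''
totals = {m: sum(e["amount"] for e in get_expenses(m))
          for m in get_team_members("eng")}
result = {m: t for m, t in totals.items() if t > 500}
'''


def run_ptc(code, tools):
    """Execute generated code against the tools and return only `result`."""
    calls = []  # tool-call log kept on the orchestrator side

    def traced(name, fn):
        def wrapper(*args, **kwargs):
            calls.append((name, args))
            return fn(*args, **kwargs)
        return wrapper

    namespace = {name: traced(name, fn) for name, fn in tools.items()}
    # Expose only the builtins the snippet needs (illustrative, NOT secure).
    namespace["__builtins__"] = {"sum": sum}
    exec(code, namespace)
    return namespace["result"], calls


result, calls = run_ptc(GENERATED_CODE, TOOLS)
print("returned to model context:", result)
print("tool calls made inside the sandbox:", len(calls))
```

Four tool calls run inside the stand-in sandbox, but only the small filtered dictionary would be sent back to the model. In sequential tool calling, each of the four intermediate results would instead pass through the context window.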