Introducing OS Level Actions in Amazon Bedrock AgentCore Browser
Machine Learning Blog
This article introduces OS Level Actions for Amazon Bedrock AgentCore Browser, enabling AI agents to interact with operating system-level UI elements beyond the browser's web layer.
- Agents can now control native dialogs, security prompts, and OS-rendered content invisible to the Chrome DevTools Protocol (CDP)
- Supports mouse actions: click, move, drag, and scroll at OS level
- Supports keyboard actions: type text, press individual keys, and execute key combinations
- Full desktop screenshots capture native UI for vision model analysis
- Action-screenshot-reaction loop allows agents to observe and respond to dynamic UI
- Eight supported actions in total (four mouse, three keyboard, and one screenshot), each with configurable parameters and constraints
- Requires an IAM execution role with specific Bedrock AgentCore permissions
- Coordinates map to screen pixels within configured viewport dimensions
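The action-screenshot-reaction loop and the viewport coordinate constraint listed above can be sketched in a few lines of Python. This is a minimal illustration, not the AgentCore Browser API: `Viewport`, `Action`, `validate`, `run_loop`, and the action kind labels are all hypothetical names, and the zero-based pixel range is an assumption. The screenshot, decision, and execution steps are passed in as plain callables so the sketch stays independent of any SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical labels for the eight actions; NOT the real AgentCore identifiers.
MOUSE_ACTIONS = {"click", "move", "drag", "scroll"}
KEYBOARD_ACTIONS = {"type", "press", "shortcut"}


@dataclass(frozen=True)
class Viewport:
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        # Assumption: pixel coordinates are zero-based, origin at top-left.
        return 0 <= x < self.width and 0 <= y < self.height


@dataclass(frozen=True)
class Action:
    kind: str
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


def validate(action: Action, viewport: Viewport) -> None:
    """Reject actions that could not be meaningful for this viewport."""
    if action.kind in MOUSE_ACTIONS:
        if action.x is None or action.y is None:
            raise ValueError(f"{action.kind} needs x and y")
        if not viewport.contains(action.x, action.y):
            raise ValueError(f"({action.x}, {action.y}) is outside the viewport")
    elif action.kind in KEYBOARD_ACTIONS:
        if not action.text:
            raise ValueError(f"{action.kind} needs text or key names")
    elif action.kind != "screenshot":
        raise ValueError(f"unknown action kind: {action.kind}")


def run_loop(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes], Optional[Action]],
    perform: Callable[[Action], None],
    viewport: Viewport,
    max_steps: int = 10,
) -> List[Action]:
    """Observe the desktop, let the model pick an action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        image = take_screenshot()   # observe: full desktop, including native UI
        action = decide(image)      # react: e.g. a vision model picks the next step
        if action is None:          # nothing left to handle
            break
        validate(action, viewport)  # fail before sending a bad request
        perform(action)             # act at the OS level
        history.append(action)
    return history
```

In practice, `take_screenshot` and `perform` would wrap whatever calls the service exposes, and `decide` would wrap a vision model prompt. Validating coordinates on the client side is a design choice for the sketch; it surfaces out-of-bounds clicks before they are sent rather than relying on a service-side error.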
OS Level Actions extend AgentCore Browser beyond web automation, enabling agents to handle production scenarios involving native dialogs and system prompts previously inaccessible to web automation tools.