Manage multi-tenant Amazon Bedrock costs using application inference profiles

Machine Learning Blog

This article explains how to manage multi-tenant Amazon Bedrock costs with application inference profiles, and presents a solution for tracking and controlling generative AI spending per tenant.

- Application inference profiles enable granular cost tracking by associating metadata with each inference request.
- The solution includes CloudWatch dashboards, SNS alerts, and an API Gateway endpoint for cost monitoring.
- A GitHub sample solution demonstrates:
  - Tracking usage across multiple tenants
  - Creating tenant-specific cost allocation
  - Implementing a multi-level alerting system
- Key features include:
  - Tagging requests with tenant ID, business unit, or application ID
  - Creating alarms for token costs, tokens per minute, and requests per minute
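The tagging and cost-allocation ideas above can be sketched in plain Python. This is a minimal illustration and not the code from the GitHub sample: the prices, the tag key names (`tenantId`, `businessUnit`, `applicationId`), and the helpers `build_profile_tags`, `cost_by_tenant`, and `tenants_over_budget` are all hypothetical, and the key/value tag shape is assumed to match what Bedrock expects when a profile is created.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-1,000-token prices in USD. Real prices depend on the
# model and Region and should be taken from the Bedrock pricing page.
INPUT_PRICE_PER_1K = 0.003
OUTPUT_PRICE_PER_1K = 0.015


@dataclass
class InferenceRecord:
    """Token usage for one request, attributed to a tenant."""
    tenant_id: str
    input_tokens: int
    output_tokens: int


def build_profile_tags(tenant_id, business_unit, application_id):
    """Build a tag list for a tenant's application inference profile.

    The key names are illustrative; the list-of-key/value-dicts shape is
    an assumption about the tagging format.
    """
    return [
        {"key": "tenantId", "value": tenant_id},
        {"key": "businessUnit", "value": business_unit},
        {"key": "applicationId", "value": application_id},
    ]


def cost_by_tenant(records):
    """Sum the estimated token cost (USD) for each tenant."""
    totals = defaultdict(float)
    for r in records:
        totals[r.tenant_id] += (
            r.input_tokens / 1000 * INPUT_PRICE_PER_1K
            + r.output_tokens / 1000 * OUTPUT_PRICE_PER_1K
        )
    return dict(totals)


def tenants_over_budget(costs, budget_usd):
    """Return the tenants whose estimated cost exceeds the budget, sorted."""
    return sorted(t for t, c in costs.items() if c > budget_usd)


if __name__ == "__main__":
    usage = [
        InferenceRecord("tenant-a", 1000, 1000),
        InferenceRecord("tenant-a", 2000, 0),
        InferenceRecord("tenant-b", 500, 100),
    ]
    costs = cost_by_tenant(usage)
    print(costs)
    print(tenants_over_budget(costs, budget_usd=0.01))
```

In a real deployment the per-tenant totals would come from CloudWatch metrics or billing data grouped by tag, and the budget check would be a CloudWatch alarm that publishes to SNS instead of a local function.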

The solution gives organizations per-tenant visibility into, and control over, resource consumption for multi-tenant generative AI services.


Related articles

- Aug 4, 2025: Cost tracking multi-tenant model inference on Amazon Bedrock
- May 21, 2026: Building multi-tenant agents with Amazon Bedrock AgentCore
- Aug 28, 2024: Implementing tenant isolation using Agents for Amazon Bedrock in a multi-tenant environment
- Dec 16, 2024: Multi-tenant RAG with Amazon Bedrock Knowledge Bases