
Manage multi-tenant Amazon Bedrock costs using application inference profiles

Machine Learning Blog



This article explains how to manage multi-tenant Amazon Bedrock costs with application inference profiles, which let you track and control generative AI spending for each tenant.

  • Application inference profiles enable granular cost tracking by associating metadata with each inference request
  • The solution includes CloudWatch dashboards, SNS alerts, and an API Gateway endpoint for cost monitoring
  • A sample solution on GitHub demonstrates:
    • Tracking usage across multiple tenants
    • Allocating costs to specific tenants
    • Implementing a multi-level alerting system
  • Key features include:
    • Tagging requests with tenant ID, business unit, or application ID
    • Creating alarms for token costs, tokens per minute, and requests per minute
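
The tagging described above can be sketched in Python with boto3. This is a minimal illustration, not the code from the GitHub sample: the function names, the tag keys (`tenantId`, `businessUnit`, `applicationId`), and the profile naming scheme are assumptions made for this example. Only `create_inference_profile` and its parameters come from the boto3 Bedrock client.

```python
from typing import Dict, List


def build_profile_request(tenant_id: str, model_arn: str,
                          extra_tags: Dict[str, str]) -> dict:
    """Build keyword arguments for bedrock.create_inference_profile.

    The tag keys and the "<tenant>-profile" naming scheme are hypothetical;
    pick whatever matches your own cost allocation tags.
    """
    tags: List[Dict[str, str]] = [{"key": "tenantId", "value": tenant_id}]
    # Sort the remaining tags so the request is deterministic.
    tags += [{"key": k, "value": v} for k, v in sorted(extra_tags.items())]
    return {
        "inferenceProfileName": f"{tenant_id}-profile",
        "description": f"Application inference profile for tenant {tenant_id}",
        # copyFrom points at the foundation model (or system-defined
        # inference profile) that the new profile wraps.
        "modelSource": {"copyFrom": model_arn},
        "tags": tags,
    }


def create_profile(request: dict) -> str:
    """Create the profile and return its ARN (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    bedrock = boto3.client("bedrock")
    response = bedrock.create_inference_profile(**request)
    return response["inferenceProfileArn"]
```

The returned profile ARN would then be passed as the `modelId` on inference calls, so that usage is attributed to the profile and its tags.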

The solution helps organizations manage costs for multi-tenant generative AI services by giving them visibility into, and control over, resource consumption.
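
As a rough illustration of the arithmetic behind token-cost alarms and multi-level alerting, the sketch below estimates a tenant's cost from token counts and maps it to an alert level. The prices, thresholds, and level names are hypothetical placeholders, not values from the article. The summary above mentions CloudWatch alarms and SNS alerts; this example only shows the underlying calculation in plain Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-1,000-token prices; substitute your model's rates."""
    input_per_1k: float
    output_per_1k: float


def token_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Estimate cost from input and output token counts."""
    return ((input_tokens / 1000) * pricing.input_per_1k
            + (output_tokens / 1000) * pricing.output_per_1k)


def alert_level(value: float, warning: float, critical: float) -> str:
    """Map a metric value to a level; the level names are illustrative."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


# Example: one tenant's usage checked against two assumed thresholds.
pricing = Pricing(input_per_1k=0.5, output_per_1k=1.5)
cost = token_cost(input_tokens=2000, output_tokens=1000, pricing=pricing)
level = alert_level(cost, warning=2.0, critical=5.0)
```

The same threshold check could be applied to tokens per minute or requests per minute by passing those values instead of the cost.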





Related articles

Aug 4, 2025: Cost tracking multi-tenant model inference on Amazon Bedrock
May 21, 2026: Building multi-tenant agents with Amazon Bedrock AgentCore
Aug 28, 2024: Implementing tenant isolation using Agents for Amazon Bedrock in a multi-tenant environment
Dec 16, 2024: Multi-tenant RAG with Amazon Bedrock Knowledge Bases
