Serverless generative AI architectural patterns – Part 1
Compute Blog
This article explores serverless architectural patterns for generative AI applications, focusing on three primary design approaches for real-time implementations:
- Synchronous request-response pattern using REST APIs, GraphQL, and chatbot interfaces
- Asynchronous request-response pattern with WebSocket APIs for bidirectional communication
- Asynchronous streaming response pattern for incremental result delivery
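To make the difference between the first and third patterns concrete, here is a minimal sketch, not the article's implementation. It assumes a Lambda-style `event` with a JSON `body`. The names `fake_model_stream`, `handle_sync`, `handle_streaming`, and the `send` callback are hypothetical, and the model is a local stub rather than a real Amazon Bedrock call.

```python
import json
from typing import Callable, Iterator


def fake_model_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model that emits output incrementally."""
    for word in f"echo: {prompt}".split():
        yield word


def handle_sync(event: dict) -> dict:
    """Synchronous request-response: wait for the full completion, reply once."""
    prompt = json.loads(event["body"])["prompt"]
    completion = " ".join(fake_model_stream(prompt))
    return {"statusCode": 200, "body": json.dumps({"completion": completion})}


def handle_streaming(event: dict, send: Callable[[str], None]) -> dict:
    """Streaming response: push each chunk to the client as soon as it exists.

    `send` is an assumed callback that delivers one message to the client,
    for example over an open WebSocket connection.
    """
    prompt = json.loads(event["body"])["prompt"]
    for chunk in fake_model_stream(prompt):
        send(json.dumps({"chunk": chunk}))
    # Tell the client that no further chunks will follow.
    send(json.dumps({"done": True}))
    return {"statusCode": 200}


if __name__ == "__main__":
    event = {"body": json.dumps({"prompt": "hello world"})}
    print(handle_sync(event)["body"])
    handle_streaming(event, send=print)
```

In the synchronous handler the caller sees nothing until the whole completion is ready. In the streaming handler the caller receives partial output immediately, which matters when generation takes several seconds. With an API Gateway WebSocket API, `send` would typically wrap a call that posts a message to the client's connection.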
The recommended architecture follows a three-tier approach:
- Frontend layer for user interaction
- Middleware layer with API, prompt engineering, and orchestration sub-layers
- Backend services for model hosting and private data sources
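The layering above can be sketched as plain functions, one per sub-layer. This is an illustrative sketch under stated assumptions: `api_handler`, `build_prompt`, `orchestrate`, and `retrieve_context` are hypothetical names, the in-memory document store and its contents are made-up sample data standing in for a private data source, and `model` is any callable that stands in for a hosted model.

```python
from typing import Callable, List

# Made-up sample data standing in for a private data source (backend tier).
_DOCS = {"refund": "Refunds are processed within 5 days."}


def retrieve_context(query: str) -> List[str]:
    """Backend tier: look up private data relevant to the query."""
    return [text for key, text in _DOCS.items() if key in query.lower()]


def build_prompt(user_input: str, context: List[str]) -> str:
    """Prompt engineering sub-layer: combine context and question into a prompt."""
    context_block = "\n".join(context) if context else "(no context found)"
    return f"Context:\n{context_block}\n\nQuestion: {user_input}"


def orchestrate(user_input: str, model: Callable[[str], str]) -> str:
    """Orchestration sub-layer: sequence retrieval, prompt building, and inference."""
    context = retrieve_context(user_input)
    prompt = build_prompt(user_input, context)
    return model(prompt)


def api_handler(body: dict, model: Callable[[str], str]) -> dict:
    """API sub-layer: validate the request and shape the response."""
    user_input = body.get("prompt")
    if not user_input:
        return {"statusCode": 400, "body": "missing 'prompt'"}
    return {"statusCode": 200, "body": orchestrate(user_input, model)}


if __name__ == "__main__":
    # An echo model makes the assembled prompt visible.
    print(api_handler({"prompt": "How does a refund work?"}, model=lambda p: p)["body"])
```

Keeping each sub-layer behind its own function boundary means the model or the data source can be swapped without touching the API contract that the frontend depends on.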
Key AWS services highlighted include Amazon Bedrock, Amazon API Gateway, AWS AppSync, AWS Step Functions, and AWS Lambda for building scalable, flexible generative AI applications.