
Serverless generative AI architectural patterns – Part 1

Compute Blog



This article explores serverless architectural patterns for generative AI applications, focusing on three primary design approaches for real-time implementations:

  • Synchronous request-response pattern using REST APIs, GraphQL, and chatbot interfaces
  • Asynchronous request-response pattern with WebSocket APIs for bidirectional communication
  • Asynchronous streaming response pattern for incremental result delivery
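
As a rough illustration of how the three patterns differ from the caller's point of view, here is a minimal sketch in plain Python. It makes no AWS calls: `fake_model_tokens` is a hypothetical stand-in for a model invocation, and an `asyncio.Queue` stands in for the WebSocket channel the article describes. All function names are assumptions made for this example, not part of the original article.

```python
import asyncio
from typing import AsyncIterator, Iterator


def fake_model_tokens(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for a model call; yields the reply word by word."""
    for word in f"echo: {prompt}".split():
        yield word


def synchronous_response(prompt: str) -> str:
    """Pattern 1: the caller blocks until the complete response is ready."""
    return " ".join(fake_model_tokens(prompt))


async def asynchronous_response(prompt: str, inbox: asyncio.Queue) -> None:
    """Pattern 2: the request is accepted now and the complete result is
    pushed back later over a separate channel (a queue in this sketch)."""
    await asyncio.sleep(0)  # simulate work happening off the request path
    await inbox.put(" ".join(fake_model_tokens(prompt)))


async def streaming_response(prompt: str) -> AsyncIterator[str]:
    """Pattern 3: partial results are delivered incrementally."""
    for token in fake_model_tokens(prompt):
        await asyncio.sleep(0)  # yield control between chunks
        yield token


async def demo() -> dict:
    inbox: asyncio.Queue = asyncio.Queue()
    # Submit the asynchronous request and carry on without waiting for it.
    task = asyncio.create_task(asynchronous_response("hello world", inbox))
    # Consume the streaming response chunk by chunk.
    streamed = [token async for token in streaming_response("hello world")]
    # Collect the pushed result once it arrives.
    pushed = await inbox.get()
    await task
    return {
        "sync": synchronous_response("hello world"),
        "async": pushed,
        "stream": streamed,
    }


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

All three calls produce the same content; what changes is when the caller receives it: all at once while blocked, all at once later, or piece by piece.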

The recommended architecture follows a three-tier approach:

  • Frontend layer for user interaction
  • Middleware layer with API, prompt engineering, and orchestration sub-layers
  • Backend services for model hosting and private data sources
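
One way to picture how the middleware sub-layers hand off to each other is the sketch below. It is an assumption-laden illustration rather than the article's implementation: every function name, the in-memory document store, and the stubbed model are hypothetical, and in a real deployment each piece would map to a managed service instead of a local function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    user_id: str
    question: str


def retrieve_private_context(question: str) -> List[str]:
    """Backend: hypothetical private data source (made-up example data)."""
    documents = {"refund": "Refunds are processed within 5 days."}
    return [text for key, text in documents.items() if key in question.lower()]


def hosted_model(prompt: str) -> str:
    """Backend: hypothetical stand-in for a hosted model."""
    return f"[model answer based on {len(prompt)} prompt characters]"


def build_prompt(question: str, context: List[str]) -> str:
    """Middleware, prompt engineering sub-layer: combine context and question."""
    context_block = "\n".join(context) if context else "(no context found)"
    return f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"


def orchestrate(request: Request, model: Callable[[str], str] = hosted_model) -> str:
    """Middleware, orchestration sub-layer: retrieval, then prompt, then model."""
    context = retrieve_private_context(request.question)
    prompt = build_prompt(request.question, context)
    return model(prompt)


def api_handler(event: dict) -> dict:
    """Middleware, API sub-layer: validate the frontend payload, shape the reply."""
    question = str(event.get("question", "")).strip()
    if not question:
        return {"statusCode": 400, "body": "question is required"}
    request = Request(user_id=str(event.get("user_id", "anonymous")), question=question)
    return {"statusCode": 200, "body": orchestrate(request)}
```

Keeping the sub-layers as separate functions means the model or data source can be swapped (for example by passing a different `model` callable to `orchestrate`) without touching the API-facing code.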

The article highlights Amazon Bedrock, Amazon API Gateway, AWS AppSync, AWS Step Functions, and AWS Lambda as the key AWS services for building scalable, flexible generative AI applications.




Related articles

  • Sep 4, 2025: Serverless generative AI architectural patterns – Part 2
  • Aug 9, 2024: Emerging Architecture Patterns for Integrating IoT and generative AI on AWS
  • Jun 6, 2024: Operationalize generative AI applications on AWS: Part II – Architecture Deep Dive
  • Apr 24, 2024: Let’s Architect! Discovering Generative AI on AWS
