How Public AI delivers sovereign LLM inference on AWS and Intel
AWS Partner Network Blog
This article describes how Public AI deployed sovereign LLM inference using AWS and Intel infrastructure for the Apertus multilingual model launch.
- Public AI built production inference platform on Amazon EKS with Intel-powered EC2 instances
- Architecture deployed in AWS Europe (Zurich) region to maintain Swiss data residency requirements
- Stack includes vLLM serving, Amazon Cognito authentication, AWS WAF protection, and Bedrock Guardrails
- Apertus 8B model runs cost-effectively on Intel R8i instances; 70B variant uses GPU instances
- Intel Xeon processors available across AWS Regions, Local Zones, and Outposts for jurisdiction flexibility
- Platform serves thousands of daily users and sustains peak concurrent demand reliably
- Repeatable blueprint enables rapid deployment of sovereign models across multiple jurisdictions
Public AI demonstrates a scalable, jurisdiction-compliant architecture for deploying open-weight LLMs using AWS managed services and Intel processors, establishing a replicable model for sovereign AI initiatives globally.
The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.
The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.