Home icon

How Public AI delivers sovereign LLM inference on AWS and Intel

AWS Partner Network Blog



This article describes how Public AI deployed sovereign LLM inference using AWS and Intel infrastructure for the Apertus multilingual model launch.

  • Public AI built production inference platform on Amazon EKS with Intel-powered EC2 instances
  • Architecture deployed in AWS Europe (Zurich) region to maintain Swiss data residency requirements
  • Stack includes vLLM serving, Amazon Cognito authentication, AWS WAF protection, and Bedrock Guardrails
  • Apertus 8B model runs cost-effectively on Intel R8i instances; 70B variant uses GPU instances
  • Intel Xeon processors available across AWS Regions, Local Zones, and Outposts for jurisdiction flexibility
  • Platform serves thousands of daily users and sustains peak concurrent demand reliably
  • Repeatable blueprint enables rapid deployment of sovereign models across multiple jurisdictions

Public AI demonstrates a scalable, jurisdiction-compliant architecture for deploying open-weight LLMs using AWS managed services and Intel processors, establishing a replicable model for sovereign AI initiatives globally.



Go to article

The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.

Related articles

May 12
2026
Enabling AI sovereignty on AWS
May 9
2024
Deploy LLMs in AWS GovCloud (US) Regions using Hugging Face Inference Containers
Mar 16
2026
Introducing Disaggregated Inference on AWS powered by llm-d
Apr 7
2025
How AWS and Intel make LLMs more accessible and cost-effective with DeepSeek

The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.