How Public AI delivers sovereign LLM inference on AWS and Intel

AWS Partner Network Blog

This article describes how Public AI deployed sovereign LLM inference using AWS and Intel infrastructure for the Apertus multilingual model launch.

Public AI built a production inference platform on Amazon EKS with Intel-powered EC2 instances
Architecture deployed in the AWS Europe (Zurich) Region to meet Swiss data residency requirements
Stack includes vLLM serving, Amazon Cognito authentication, AWS WAF protection, and Bedrock Guardrails
Apertus 8B model runs cost-effectively on Intel R8i instances; 70B variant uses GPU instances
Intel Xeon processors available across AWS Regions, Local Zones, and Outposts for jurisdiction flexibility
Platform serves thousands of daily users and sustains peak concurrent demand reliably
Repeatable blueprint enables rapid deployment of sovereign models across multiple jurisdictions
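
The placement decisions listed above (Zurich Region for Swiss residency, the 8B model on Intel R8i, the 70B variant on GPU instances) can be sketched as a small lookup. This is a minimal illustration, not Public AI's actual configuration: the model labels, the ALLOWED_REGIONS table, and the plan_placement helper are hypothetical names, and the GPU instance family is left unset because the summary does not name one. The only external fact assumed is that eu-central-2 is the Region code for AWS Europe (Zurich).

```python
from dataclasses import dataclass
from typing import Optional

# Region code for AWS Europe (Zurich).
ZURICH_REGION = "eu-central-2"

# Hypothetical residency table: jurisdiction -> Regions where data may live.
ALLOWED_REGIONS = {"CH": {ZURICH_REGION}}

# Hypothetical model labels mapped to the compute choices in the summary.
# The summary names no GPU instance family, so it stays None.
VARIANTS = {
    "apertus-8b": ("cpu", "r8i"),
    "apertus-70b": ("gpu", None),
}


@dataclass(frozen=True)
class Placement:
    model: str
    region: str
    compute: str                    # "cpu" or "gpu"
    instance_family: Optional[str]  # None when the source does not specify


def plan_placement(model: str, jurisdiction: str = "CH",
                   region: str = ZURICH_REGION) -> Placement:
    """Return a placement, refusing Regions outside the jurisdiction."""
    if region not in ALLOWED_REGIONS.get(jurisdiction, set()):
        raise ValueError(f"{region} is not allowed for jurisdiction {jurisdiction}")
    if model not in VARIANTS:
        raise KeyError(f"unknown model variant: {model}")
    compute, family = VARIANTS[model]
    return Placement(model, region, compute, family)


if __name__ == "__main__":
    print(plan_placement("apertus-8b"))
    print(plan_placement("apertus-70b"))
```

Rejecting a disallowed Region before choosing compute keeps the residency rule as the first gate, which is the property a repeatable multi-jurisdiction blueprint would need to preserve when new entries are added to the table.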

Public AI demonstrates a scalable, jurisdiction-compliant architecture for deploying open-weight LLMs using AWS managed services and Intel processors, establishing a replicable model for sovereign AI initiatives globally.

May 12, 2026: Enabling AI sovereignty on AWS

May 9, 2024: Deploy LLMs in AWS GovCloud (US) Regions using Hugging Face Inference Containers

Mar 16, 2026: Introducing Disaggregated Inference on AWS powered by llm-d

Apr 7, 2025: How AWS and Intel make LLMs more accessible and cost-effective with DeepSeek

Related articles