AWS Parallel Computing Service now supports Slurm 25.11
This article announces AWS Parallel Computing Service (AWS PCS) support for Slurm version 25.11, introducing enhanced monitoring, logging, and job recovery capabilities.
- Expedited re-queue automatically reschedules failed jobs at the highest priority for faster recovery
- New OpenMetrics endpoint provides real-time visibility into jobs, nodes, and scheduling
- Slurm database daemon and REST API daemon logs can now be sent to CloudWatch, S3, or Firehose
- Scheduler audit logs are now available as a dedicated log type for independent cost control
- These features are available in all AWS Regions where AWS PCS is supported
With this release, AWS PCS adds improved observability and automated job recovery to its managed Slurm service, simplifying HPC workload management.
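As a rough illustration of what consuming an OpenMetrics endpoint can look like, the sketch below parses a small block of OpenMetrics text into a dictionary. The announcement does not give the endpoint URL or any metric names, so the sample text and the names `example_jobs_pending` and `example_nodes` are hypothetical placeholders, and the parser handles only simple sample lines rather than the full OpenMetrics format.

```python
# Hypothetical OpenMetrics text. The metric names are illustrative
# assumptions, not names documented for Slurm or AWS PCS.
SAMPLE = """\
# TYPE example_jobs_pending gauge
example_jobs_pending 12
# TYPE example_nodes gauge
example_nodes{state="idle"} 4
example_nodes{state="allocated"} 28
# EOF
"""


def parse_openmetrics(text):
    """Return {(metric_name, label_string): value} for each sample line.

    Simplified on purpose: comment/metadata lines are skipped, and any
    optional timestamp after the value is ignored.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, # TYPE/# HELP metadata, and # EOF
        if "{" in line:
            # Form: name{label="value",...} value
            name, _, rest = line.partition("{")
            labels, _, tail = rest.rpartition("}")
        else:
            # Form: name value
            name, _, tail = line.partition(" ")
            labels = ""
        samples[(name, labels)] = float(tail.split()[0])
    return samples


metrics = parse_openmetrics(SAMPLE)
for (name, labels), value in sorted(metrics.items()):
    print(name, labels or "-", value)
```

In practice, the text would come from an HTTP request to the cluster's metrics endpoint, and a monitoring agent that understands OpenMetrics would normally do this parsing for you.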