Introducing AWS Glue 5.1 for Apache Spark
Big Data Blog
This article introduces AWS Glue 5.1, a new version of the serverless data integration service that upgrades to Apache Spark 3.5.6 with enhanced capabilities for data lakes and ETL workloads.
- Upgrades Spark to 3.5.6, Python to 3.11, Scala to 2.12.18 with improved libraries
- Updates open table formats: Apache Hudi 1.0.2, Iceberg 1.10.0, Delta Lake 3.3.2
- Supports Apache Iceberg Materialized Views and format version 3.0 with new data types
- Enables row lineage in Iceberg tables for auditing and tracking row-level modifications
- Extends Lake Formation fine-grained access control to write operations
- Makes S3A the default S3 connector for enhanced performance
- Introduces an Apache Spark troubleshooting agent for automated job debugging
- Accessible via AWS Glue Studio, console, SDK, and CLI
AWS Glue 5.1 delivers performance improvements and advanced data governance features for modern data integration workloads.
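Because the release is available through the SDK, opting a job into it could look roughly like the sketch below. This is a minimal illustration, not taken from the article: it assumes the version string passed to boto3's `create_job` is `"5.1"`, and the job name, role ARN, script path, worker settings, and the helper names `build_glue_job_request` and `create_job` are all hypothetical placeholders.

```python
from typing import Any


def build_glue_job_request(
    name: str,
    role_arn: str,
    script_location: str,
    datalake_formats: str = "iceberg",
) -> dict[str, Any]:
    """Build keyword arguments for the boto3 Glue client's create_job call.

    Assumption: "5.1" is the GlueVersion string that selects this release.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "5.1",
        "Command": {
            "Name": "glueetl",            # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Glue job parameter that loads the bundled open table format libraries
            "--datalake-formats": datalake_formats,
        },
        # Worker settings are illustrative, not a recommendation
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }


def create_job(request: dict[str, Any]) -> str:
    """Submit the request to AWS Glue; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    glue = boto3.client("glue")
    return glue.create_job(**request)["Name"]
```

Keeping the request builder separate from the call that reaches AWS makes the job definition easy to inspect or unit test before anything is created, for example `build_glue_job_request("demo-job", "arn:aws:iam::123456789012:role/GlueRole", "s3://my-bucket/scripts/job.py")`, where every argument is a placeholder.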