Access Amazon Redshift Managed Storage tables through Apache Spark on AWS Glue and Amazon EMR using Amazon SageMaker Lakehouse
Big Data Blog
This article discusses how to access Amazon Redshift Managed Storage (RMS) tables through Apache Spark using Amazon SageMaker Lakehouse, AWS Glue, and Amazon EMR.
- SageMaker Lakehouse unifies data across Amazon S3 data lakes and Amazon Redshift data warehouses.
- It exposes RMS tables through Apache Iceberg APIs and the AWS Glue Data Catalog.
- It integrates with SageMaker Unified Studio, Amazon EMR 7.5.0 and later, and AWS Glue 5.0.
- Accessing RMS tables from Spark requires specific Spark configurations.
- The walkthrough creates and queries RMS tables using Spark SQL in a JupyterLab notebook.
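As a rough illustration of the kind of Spark configuration involved, the sketch below assembles settings that register an Apache Iceberg catalog backed by the AWS Glue Data Catalog. The property names (`type`, `glue.id`, `client.region`) are standard Iceberg AWS catalog properties. The catalog name `rmscatalog`, the account ID, the database name `dev`, the region, and the `account:catalog/database` layout of `glue.id` are assumptions for illustration and are not taken from this summary, so check them against the original article before use.

```python
# Minimal sketch: build Spark settings for an Iceberg catalog on AWS Glue Data Catalog.
# All concrete values passed in below are hypothetical placeholders.

ICEBERG_EXTENSIONS = "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
ICEBERG_SPARK_CATALOG = "org.apache.iceberg.spark.SparkCatalog"


def build_rms_catalog_conf(catalog_name, account_id, rms_catalog, database, region):
    """Return a dict of Spark properties that register `catalog_name` as an Iceberg catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        "spark.sql.extensions": ICEBERG_EXTENSIONS,
        prefix: ICEBERG_SPARK_CATALOG,
        f"{prefix}.type": "glue",
        # Assumed identifier layout: <account-id>:<catalog>/<database>
        f"{prefix}.glue.id": f"{account_id}:{rms_catalog}/{database}",
        f"{prefix}.client.region": region,
    }


def to_conf_flags(conf):
    """Render the properties as `--conf key=value` flags, in insertion order."""
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


conf = build_rms_catalog_conf(
    catalog_name="rmscatalog",      # hypothetical Spark catalog alias
    account_id="111122223333",      # placeholder AWS account ID
    rms_catalog="rmscatalog",       # hypothetical Lakehouse catalog name
    database="dev",                 # hypothetical database name
    region="us-east-1",             # placeholder region
)
flags = to_conf_flags(conf)
```

The same dictionary can be applied to a session by calling `SparkSession.builder.config(key, value)` for each entry, or by passing the rendered flags to the job or notebook session configuration of whichever compute option is in use.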
The article provides a step-by-step guide to creating a Lakehouse catalog, configuring Spark sessions, and querying RMS tables from compute options such as AWS Glue and Amazon EMR Serverless.
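To make the create-and-query step concrete, here is a small sketch that builds Spark SQL statements using the three-part `catalog.namespace.table` naming that Iceberg catalogs use in Spark. The catalog alias, namespace, table name, and column schema are all hypothetical, and the `USING iceberg` clause is an assumption about how the table would be declared. The `run_all` helper only requires an object with a `sql` method, such as an existing `SparkSession`.

```python
# Minimal sketch: generate and run Spark SQL against a table in a named catalog.
# Names and schema below are illustrative placeholders, not from the article.


def qualified_name(catalog, namespace, table):
    """Join the three identifier parts into catalog.namespace.table."""
    return f"{catalog}.{namespace}.{table}"


def demo_statements(catalog, namespace, table):
    """Return create, insert, and select statements for a tiny example table."""
    name = qualified_name(catalog, namespace, table)
    return [
        f"CREATE TABLE IF NOT EXISTS {name} (order_id INT, amount DOUBLE) USING iceberg",
        f"INSERT INTO {name} VALUES (1, 19.99), (2, 5.00)",
        f"SELECT order_id, amount FROM {name} ORDER BY order_id",
    ]


def run_all(spark, statements):
    """Execute each statement in order and return the resulting DataFrames."""
    return [spark.sql(statement) for statement in statements]


statements = demo_statements("rmscatalog", "salesdb", "orders")
```

In a notebook with a configured session, `run_all(spark, statements)[-1].show()` would display the rows returned by the final `SELECT`.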