Access Amazon Redshift Managed Storage tables through Apache Spark on AWS Glue and Amazon EMR using Amazon SageMaker Lakehouse

Big Data Blog



This article discusses how to access Amazon Redshift Managed Storage (RMS) tables through Apache Spark using Amazon SageMaker Lakehouse, AWS Glue, and Amazon EMR.

  • SageMaker Lakehouse unifies data across Amazon S3 data lakes and Amazon Redshift data warehouses
  • RMS tables can be accessed through Apache Iceberg APIs and the AWS Glue Data Catalog
  • Supported integrations include SageMaker Unified Studio, Amazon EMR 7.5.0 and later, and AWS Glue 5.0
  • Accessing RMS tables requires specific Spark configurations
  • The article demonstrates creating and querying RMS tables with Spark SQL in a JupyterLab notebook
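
As a rough illustration of what "specific Spark configurations" can look like, the sketch below assembles Iceberg catalog properties for a Spark session. The property names (`spark.sql.catalog.<name>`, `type`, `glue.id`, `client.region`, `spark.sql.extensions`) are standard Apache Iceberg settings; the `<account-id>:<catalog>/<database>` shape of the `glue.id` value and every concrete value shown are assumptions for illustration, not details taken from this summary. The helper names `build_rms_spark_conf` and `to_conf_string` are hypothetical.

```python
def build_rms_spark_conf(catalog_name, account_id, rms_catalog, database, region):
    """Return Spark properties that register an Iceberg catalog backed by AWS Glue.

    Assumption: the Glue catalog ID for an RMS catalog is written as
    "<account-id>:<rms-catalog>/<database>". Check the article for the exact form.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Register the catalog name with Iceberg's Spark catalog implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Use the AWS Glue Data Catalog as the Iceberg catalog backend.
        f"{prefix}.type": "glue",
        # Point at the specific catalog (assumed format, see docstring).
        f"{prefix}.glue.id": f"{account_id}:{rms_catalog}/{database}",
        f"{prefix}.client.region": region,
        # Enable Iceberg's SQL extensions in the session.
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    }


def to_conf_string(conf):
    """Join properties into the 'k1=v1 --conf k2=v2' form used as a single --conf value."""
    return " --conf ".join(f"{key}={value}" for key, value in conf.items())


if __name__ == "__main__":
    # Placeholder values only; replace with your own account, catalog, and region.
    conf = build_rms_spark_conf("rmscatalog", "123456789012", "rmscatalog", "dev", "us-east-1")
    for key, value in conf.items():
        print(f"{key}={value}")
```

In a notebook, the dictionary could be applied through `SparkSession.builder.config(key, value)` calls, or the joined string passed as the session's `--conf` argument, depending on the compute option in use.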

The article provides a step-by-step guide to creating a Lakehouse catalog, configuring Spark sessions, and querying RMS tables with compute options such as AWS Glue and Amazon EMR Serverless.
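
To make the create-and-query step more concrete, here is a minimal sketch of the kind of Spark SQL statements such a walkthrough might run once the catalog is configured. The catalog, namespace, table, and column names are hypothetical placeholders, and the helper `rms_table_statements` is not from the article; each string would be passed to `spark.sql(...)` in a configured session.

```python
def rms_table_statements(catalog, namespace, table):
    """Build example Spark SQL statements for a three-part Iceberg table name.

    All identifiers and the two-column schema are illustrative assumptions.
    """
    qualified = f"{catalog}.{namespace}.{table}"
    return [
        # Create the namespace (database/schema) inside the catalog if needed.
        f"CREATE NAMESPACE IF NOT EXISTS {catalog}.{namespace}",
        # Create an Iceberg table with a small example schema.
        f"CREATE TABLE IF NOT EXISTS {qualified} (id INT, amount DECIMAL(10, 2)) USING iceberg",
        # Insert two sample rows.
        f"INSERT INTO {qualified} VALUES (1, 19.99), (2, 5.00)",
        # Read the rows back.
        f"SELECT id, amount FROM {qualified} ORDER BY id",
    ]


if __name__ == "__main__":
    for statement in rms_table_statements("rmscatalog", "salesdb", "orders"):
        # In a notebook with a configured session: spark.sql(statement).show()
        print(statement)
```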



Related articles

Apr 28, 2025: Access your existing data and resources through Amazon SageMaker Unified Studio, Part 1: AWS Glue Data Catalog and Amazon Redshift
May 9, 2025: Configure cross-account access of Amazon SageMaker Lakehouse multi-catalog tables using AWS Glue 5.0 Spark
Jun 25, 2025: AWS Glue enables enhanced Apache Spark capabilities for AWS Lake Formation tables with full table access
Jun 25, 2024: Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation