Stream real-time data into Apache Iceberg tables in Amazon S3 using Amazon Data Firehose
Big Data Blog
This article discusses how to stream real-time data into Apache Iceberg tables stored in Amazon S3 using Amazon Data Firehose. It covers the benefits of using Iceberg, such as its support for concurrent data writes across different frameworks, time travel and rollback capabilities, and schema evolution.
Specifically, the article covers:
- Setting up Data Firehose to deliver real-time data streams into Iceberg tables for four different scenarios: inserting all records into a single table, performing inserts/updates/deletes in a single table, routing records to different tables based on data content using JSON Query expressions, and routing records using a Lambda function.
- Querying the data written to Iceberg tables using Amazon Athena.
- Considerations and limitations when using Data Firehose with Iceberg.
- A conclusion on how simple it is to set up real-time data ingestion into Iceberg tables with the serverless Data Firehose service.
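The Lambda-based routing scenario listed above can be sketched as follows. This is a minimal illustration, not the article's own code: the `recordId`/`result`/`data` response shape is the standard Data Firehose transformation contract, while the `otfMetadata` block and its field names reflect our reading of the Firehose documentation and should be verified there. The database name, the table names, and the `type` and `op` payload fields are hypothetical.

```python
import base64
import json

# Hypothetical routing rule: the "type" field of each record picks the table.
TABLE_BY_TYPE = {"order": "orders", "customer": "customers"}
DATABASE = "example_db"  # assumed database name, not from the article


def lambda_handler(event, context):
    """Attach Iceberg routing metadata to each incoming Firehose record."""
    output = []
    for record in event["records"]:
        # Firehose passes each record payload as base64-encoded bytes.
        payload = json.loads(base64.b64decode(record["data"]))
        table = TABLE_BY_TYPE.get(payload.get("type"))

        if table is None:
            # No matching table: mark the record as failed instead of guessing.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": record["data"],
            # Assumed field names for Iceberg routing; check the Firehose docs.
            "metadata": {
                "otfMetadata": {
                    "destinationDatabaseName": DATABASE,
                    "destinationTableName": table,
                    "operation": payload.get("op", "insert"),
                }
            },
        })
    return {"records": output}
```

Here a record whose payload is `{"type": "order", "op": "update"}` would be routed to the `orders` table as an update, and a record with an unrecognized `type` would be returned as `ProcessingFailed`.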
Related articles
- Oct 1, 2024: Amazon Data Firehose delivers data streams into Apache Iceberg format tables in Amazon S3
- Jun 20, 2025: Stream data from Amazon MSK to Apache Iceberg tables in Amazon S3 and Amazon S3 Tables using Amazon Data Firehose
- Nov 15, 2024: Amazon Data Firehose supports continuous replication of database changes to Apache Iceberg Tables in Amazon S3
- Mar 14, 2025: Amazon Data Firehose now delivers real-time streaming data into Amazon S3 Tables