Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service
Big Data Blog
This article describes an architecture for detecting, masking, and redacting personally identifiable information (PII) in data streams using AWS Glue before the data is loaded into Amazon OpenSearch Service. It gives a solution overview and walks through a use case from the financial services industry.
Specifically, the article covers:
- Solution architecture overview using AWS services such as Amazon Kinesis Data Streams, AWS Glue, an Amazon S3 data lake, and Amazon OpenSearch Service
- The business context and the structure of the dataset, which contains sensitive PII such as names, Social Security numbers (SSNs), and credit card numbers
- A detailed use case demonstrating the step-by-step process of ingesting raw data into Kinesis, processing it with AWS Glue to detect and mask PII fields, and loading the masked data into OpenSearch Service and S3
- Capabilities of AWS Glue's Detect PII action to identify and mask PII data based on patterns and sampling
- Alternative methods for PII detection and masking using other AWS services like Amazon Macie and Amazon Comprehend
- Conclusion highlighting the importance of handling sensitive data compliantly while enabling scalability
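To make the pattern-based detection and masking idea from the list above more concrete, here is a minimal sketch in plain Python. It does not use AWS Glue or its Detect PII action. The regular expressions, the `PII_PATTERNS` table, and the `detect_pii` and `mask_record` helpers are all hypothetical names invented for illustration, and the sketch leaves out sampling and the detection of free-text entities such as names.

```python
import re

# Hypothetical patterns for illustration only. A production detector would
# use broader rules plus validation (for example, a Luhn check on card numbers).
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def detect_pii(record):
    """Return {field: sorted list of PII types whose pattern matches the value}."""
    findings = {}
    for field, value in record.items():
        text = str(value)
        hits = sorted(
            name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
        )
        if hits:
            findings[field] = hits
    return findings


def mask_record(record, mask="****"):
    """Return a copy of the record with every pattern match replaced by `mask`."""
    masked = {}
    for field, value in record.items():
        original = str(value)
        text = original
        for pattern in PII_PATTERNS.values():
            text = pattern.sub(mask, text)
        # Keep the original value (and its type) when nothing was masked.
        masked[field] = text if text != original else value
    return masked


record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "card": "4111 1111 1111 1111",
    "note": "contact jane@example.com",
    "amount": 42,
}
findings = detect_pii(record)
masked = mask_record(record)
```

In this sketch, `findings` flags the `ssn`, `card`, and `note` fields, and `masked` replaces only the matching substrings, so the non-PII part of `note` and the numeric `amount` pass through unchanged. The `name` field also passes through, because a regular expression cannot reliably recognize personal names; that gap is one reason to use a managed detector instead of hand-written patterns.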
Related articles
Jan 13, 2025: Batch data ingestion into Amazon OpenSearch Service using AWS Glue
Aug 26, 2024: Copy and mask PII between Amazon RDS databases using visual ETL jobs in AWS Glue Studio
Nov 11, 2024: Achieve data resilience using Amazon OpenSearch Service disaster recovery with snapshot and restore
Feb 21, 2025: Improve search results for AI using Amazon OpenSearch Service as a vector database with Amazon Bedrock