Sentiment Analysis with Text and Audio Using AWS Generative AI Services: Approaches, Challenges, and Solutions
Machine Learning Blog
This article explores sentiment analysis approaches for both text and audio using AWS generative AI services, developed through a partnership between AWS and Itaú Unibanco's research institute.
- Text-based sentiment analysis using LLMs showed low overall accuracy across models tested
- Fine-tuned models outperformed base models but risked overfitting on domain-specific data
- Audio-based models (HuBERT, Wav2Vec, Whisper) achieved higher accuracy on fixed phrases than variable sentences
- Direct audio analysis captures prosodic cues lost in transcription-only approaches
- AWS services enable end-to-end pipelines: Kinesis for ingestion, Lambda for preprocessing, Bedrock/SageMaker for inference
- Text challenges include ambiguity, sarcasm, multilingual support; audio challenges include noise and speaker diversity
- Hybrid multimodal approaches combining audio embeddings with text analysis show promise for improved results
The article demonstrates that while both text and audio sentiment analysis have limitations, AWS services provide scalable infrastructure for building comprehensive solutions tailored to specific use cases.
The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.
Related articles
2026
2024
2026
2024
The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.