Home icon

Implement semantic video search using open source large vision models on Amazon SageMaker and Amazon OpenSearch Serverless

Machine Learning Blog



This article describes an innovative approach to semantic video search using large vision models (LVMs) on AWS services, enabling users to search video content using natural language or image queries.

  • Uses open-source multimodal models like CLIP, OpenCLIP, and SigLIP for zero-shot video search
  • Leverages Amazon SageMaker for model deployment and processing
  • Utilizes Amazon OpenSearch Serverless as a vector database for efficient search
  • Implements techniques like temporal smoothing and clustering to improve search results
  • Supports both text and image-based video search across diverse use cases

The solution provides a flexible, cost-effective method for semantic video search without requiring extensive machine learning expertise, adaptable to various applications like content discovery, video editing, and moderation.



Go to article

The AWS News Feed is currently looking for gold sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.

Related articles

Feb 5
2025
Video semantic search with AI on AWS
Jun 3
2024
Implement serverless semantic search of image and live video with Amazon Titan Multimodal Embeddings
Apr 17
2026
Power video semantic search with Amazon Nova Multimodal Embeddings
Apr 17
2026
Optimize video semantic search intent with Amazon Nova Model Distillation on Amazon Bedrock

The AWS News Feed is currently looking for silver sponsors. If you want to support the AWS community and reach a large audience of AWS professionals, consider sponsoring the AWS News Feed.