Amazon OpenSearch Serverless Next-Gen: Instant Scaling & 60% Cost Savings for AI Agents
Amazon Rebuilds OpenSearch Serverless from the Ground Up for AI Agents, Delivering 60% Cost Savings and 20x Faster Scaling
Amazon Web Services (AWS) today announced a complete re-architecture of its OpenSearch Serverless service, designed to meet the unpredictable workload demands of AI agents while cutting costs and accelerating deployment. The next-generation service scales compute from zero up to the capacity needed for thousands of requests per second, then back to zero when idle, and AWS says it costs up to 60% less than provisioning OpenSearch Service clusters for peak capacity. It also provisions infrastructure in seconds, up to 20 times faster than the previous version, and integrates natively with AI development platforms such as Vercel and Kiro.

Why It Matters
AI agents are driving a new class of dynamic, high-volume workloads that require search and vector backends capable of rapid scaling and cost efficiency. Traditional provisioned clusters often result in over-provisioning or under-provisioning, leading to either wasted resources or performance bottlenecks. AWS’s redesign addresses these challenges by decoupling compute from storage, enabling true serverless scaling without manual intervention.
Key Features and Improvements
The new architecture introduces two distinct modes:
- NextGen (default): Built for AI agents, with instant autoscaling, scale-to-zero capabilities, and support for full-text and vector search.
- Classic: The existing OpenSearch Serverless infrastructure, available for users who prefer the legacy architecture.
Users can deploy production-ready search backends in minutes via the AWS Console, AWS CLI, or SDKs. For example, creating a NextGen collection now requires minimal configuration: default settings and security policies are applied automatically. AWS provides a sample CLI command that creates a NextGen collection group:
aws opensearchserverless create-collection-group \
  --name channy-nextgen-group \
  --standby-replicas ENABLED \
  --generation NEXTGEN \
  --description "My NextGen collection group" \
  --capacity-limits '{
    "maxIndexingCapacityInOCU": 96,
    "maxSearchCapacityInOCU": 96,
    "minIndexingCapacityInOCU": 0,
    "minSearchCapacityInOCU": 0
  }' \
  --region "us-east-1"
Capacity limits are configurable in OpenSearch Compute Units (OCUs), with separate billing for compute and storage.
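The capacity limits in the sample command can also be assembled programmatically. The following is a minimal Python sketch, not an official AWS helper: the function names `capacity_limits` and `build_command` are hypothetical, the range check (0 <= min <= max) is an assumption about what a sensible configuration looks like rather than a documented service rule, and the flags and field names are copied from the sample command, assuming the AWS CLI's usual double-hyphen long options. The sketch only builds and prints the command; it does not call AWS.

```python
import json
import shlex


def capacity_limits(max_indexing, max_search, min_indexing=0, min_search=0):
    """Return the capacity-limits payload, using the field names from the sample command."""
    # Assumed sanity check: each minimum must be non-negative and not exceed its maximum.
    for label, low, high in (
        ("indexing", min_indexing, max_indexing),
        ("search", min_search, max_search),
    ):
        if low < 0 or low > high:
            raise ValueError(f"{label}: need 0 <= min <= max, got min={low}, max={high}")
    return {
        "maxIndexingCapacityInOCU": max_indexing,
        "maxSearchCapacityInOCU": max_search,
        "minIndexingCapacityInOCU": min_indexing,
        "minSearchCapacityInOCU": min_search,
    }


def build_command(name, limits, region, description):
    """Assemble the CLI argument list. Nothing is executed here."""
    return [
        "aws", "opensearchserverless", "create-collection-group",
        "--name", name,
        "--standby-replicas", "ENABLED",
        "--generation", "NEXTGEN",
        "--description", description,
        "--capacity-limits", json.dumps(limits),
        "--region", region,
    ]


limits = capacity_limits(max_indexing=96, max_search=96)
argv = build_command(
    "channy-nextgen-group", limits, "us-east-1", "My NextGen collection group"
)
# Print a shell-quoted form of the command for review before running it by hand.
print(shlex.join(argv))
```

Passing the limits as a JSON string produced by `json.dumps` avoids the hand-typed quoting mistakes that are easy to make when the JSON is written inline in a shell.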
Integration with AI Development Tools
AWS has also enhanced integrations with popular AI development platforms:
- Vercel: Users can create or connect OpenSearch collections directly within the Vercel console, enabling seamless backend setup for AI agent applications.
- Kiro and Claude Code: These tools now support OpenSearch Agent Skills, a repository of pre-built workflows that embed domain-specific knowledge and best practices into AI agents. The OpenSearch Launchpad in Kiro further accelerates development by offering guided architecture planning.
Availability and Pricing
The next-generation OpenSearch Serverless is generally available today across all AWS commercial regions where the original service operates. Pricing follows a pay-per-use model, charging for compute resources (indexing, search, and GPU acceleration) in OCUs and storage in GB-months. AWS emphasizes that users are billed only for the resources they consume, aligning with the serverless ethos.
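To make the pay-per-use model concrete, the sketch below compares a bill based on OCUs actually consumed with one that pays for the busiest hour's OCU count around the clock. Everything numeric here is hypothetical: the rates in `Rates`, the daily usage profile, and the storage size are illustrative placeholders, not AWS prices or measurements. Expressing peak provisioning in OCU-hours is also a simplification, and GPU acceleration is left out. The result illustrates the shape of the calculation and should not be read as reproducing AWS's 60% figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    ocu_hour: float  # hypothetical price per OCU-hour
    gb_month: float  # hypothetical price per GB-month of storage


def pay_per_use_cost(ocu_by_hour, storage_gb, rates):
    """Charge for the OCUs used in each hour, plus one month of storage."""
    return sum(ocu_by_hour) * rates.ocu_hour + storage_gb * rates.gb_month


def peak_provisioned_cost(ocu_by_hour, storage_gb, rates):
    """Same rates, but paying for the busiest hour's OCU count in every hour."""
    peak = max(ocu_by_hour)
    return peak * len(ocu_by_hour) * rates.ocu_hour + storage_gb * rates.gb_month


# Hypothetical day: 16 idle hours, 6 moderate hours, 2 burst hours.
day = [0] * 16 + [2] * 6 + [8] * 2
month = day * 30  # hourly OCU usage for a 30-day month

rates = Rates(ocu_hour=0.25, gb_month=0.02)  # placeholder rates, not AWS pricing
usage_bill = pay_per_use_cost(month, storage_gb=100, rates=rates)
peak_bill = peak_provisioned_cost(month, storage_gb=100, rates=rates)

print(f"pay-per-use:      {usage_bill:.2f}")
print(f"peak-provisioned: {peak_bill:.2f}")
print(f"relative saving:  {1 - usage_bill / peak_bill:.0%}")
```

The saving in this model depends entirely on how spiky the workload is: a profile that sits near its peak all day would show almost no difference between the two bills.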

What Comes Next
AWS encourages developers to test the new service and provide feedback through the AWS re:Post forum or standard AWS Support channels. The company has also committed to expanding functionality in future updates, though no additional details were provided in the announcement.
This launch underscores AWS’s focus on enabling AI innovation by removing infrastructure barriers. As AI agents become more prevalent, services like OpenSearch Serverless will play a critical role in powering the underlying search and retrieval systems that drive their decision-making capabilities.
