AWS Resilience Hub Next Generation: AI-Powered Analysis and Modular Policies

Amazon Web Services announced the general availability of the next generation of AWS Resilience Hub on May 28, 2026. The updated service provides a centralized framework for Site Reliability Engineers (SREs) and development teams to define, measure, and prove the resilience of application portfolios across an enterprise.

The update addresses a common operational challenge for organizations managing hundreds of applications: the lack of consistent standards for availability and disaster recovery. In many large-scale environments, teams use different tools and set different resilience goals, making it difficult for leadership to verify whether a portfolio meets corporate compliance or business continuity requirements.

The new version of Resilience Hub introduces a structured approach to resilience management by integrating application modeling, automated dependency discovery, and generative AI-powered analysis into a single workflow.

A Business-Centric Application Model

A primary shift in this version is the introduction of a new application model that maps technical infrastructure to business outcomes. Rather than focusing solely on individual resources, the service now uses a hierarchy consisting of systems, user journeys, and services.

In this model, a system represents a complete business application. Within that system, user journeys describe the critical paths a customer takes to achieve a business goal. Services are the deployable units (code, AWS resources, and observability tools) that support those journeys.

Resilience Hub automatically discovers these components and maps them into a topology. This allows teams to visualize exactly how data flows and how resources are connected, providing a clearer understanding of how a failure in one microservice might impact a specific business outcome.
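To make the hierarchy concrete, the following is a minimal sketch of how systems, user journeys, and services might be represented, and how a failure in one service could be traced back to the journeys it affects. The class names, fields, and the `impacted_journeys` helper are illustrative assumptions and are not taken from the Resilience Hub API.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A deployable unit; `depends_on` names the services it calls (assumed field)."""
    name: str
    depends_on: list[str] = field(default_factory=list)


@dataclass
class UserJourney:
    """A critical customer path, entered through one or more services."""
    name: str
    services: list[str]


@dataclass
class System:
    """A complete business application."""
    name: str
    journeys: list[UserJourney]
    services: dict[str, Service]

    def _reachable(self, start: str) -> set[str]:
        """Return `start` plus everything it depends on, directly or transitively."""
        seen: set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            service = self.services.get(current)
            if service is not None:
                stack.extend(service.depends_on)
        return seen

    def impacted_journeys(self, failed_service: str) -> list[str]:
        """Names of journeys that rely on `failed_service`, in declaration order."""
        impacted = []
        for journey in self.journeys:
            needed: set[str] = set()
            for entry in journey.services:
                needed |= self._reachable(entry)
            if failed_service in needed:
                impacted.append(journey.name)
        return impacted


# Hypothetical example system used only for illustration.
shop = System(
    name="online-shop",
    services={
        "web": Service("web", depends_on=["catalog", "cart"]),
        "catalog": Service("catalog"),
        "cart": Service("cart", depends_on=["payments"]),
        "payments": Service("payments"),
    },
    journeys=[
        UserJourney("browse-products", services=["catalog"]),
        UserJourney("checkout", services=["cart"]),
    ],
)

print(shop.impacted_journeys("payments"))
```

In this toy topology, a failure in the `payments` service affects only the `checkout` journey, because `cart` depends on it and `catalog` does not.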

Modular Resilience Policies

To move away from rigid, one-size-fits-all templates, AWS has implemented modular and composable resilience policies. Users can now construct policies by selecting specific requirements based on the criticality of the application.

These requirements include Service Level Objectives (SLOs), which define the target level of reliability for a service, as well as multi-Availability Zone (AZ) and multi-Region disaster recovery standards. For high-stakes environments, such as financial applications, teams can set precise Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs). An RTO is the maximum acceptable time between a failure and the restoration of service; an RPO is the maximum acceptable amount of data loss, measured in time.

The tool also allows for specific data recovery requirements, enabling administrators to define the time objective for restoring individual services from backups.
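The idea of composing a policy from independent requirements can be sketched as follows. This is an illustration under stated assumptions, not the service's actual policy schema: the `Requirement` type, the requirement names, and the sample limits are all hypothetical. The `allowed_downtime_minutes` helper shows the standard arithmetic for turning an availability SLO into a downtime budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """One policy module: a named upper limit, in seconds, on a measured value."""
    name: str
    limit_seconds: float


def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Downtime budget implied by an availability SLO over a time window."""
    return (1 - slo_percent / 100) * window_days * 24 * 60


def compose_policy(*requirements: Requirement) -> dict[str, Requirement]:
    """Combine independent requirement modules into one policy keyed by name."""
    return {req.name: req for req in requirements}


def violations(policy: dict[str, Requirement], measured: dict[str, float]) -> list[str]:
    """Names of requirements that are unmeasured or exceed their limit."""
    failed = []
    for name, req in policy.items():
        value = measured.get(name)
        if value is None or value > req.limit_seconds:
            failed.append(name)
    return failed


# Hypothetical policy for a critical payments service.
payments_policy = compose_policy(
    Requirement("multi_az_rto", 5 * 60),
    Requirement("multi_az_rpo", 60),
    Requirement("multi_region_rto", 60 * 60),
    Requirement("backup_restore_time", 4 * 60 * 60),
)

# Hypothetical measurements, in seconds; no backup restore has been tested yet.
measured = {"multi_az_rto": 180, "multi_az_rpo": 90, "multi_region_rto": 2400}

print(violations(payments_policy, measured))
print(round(allowed_downtime_minutes(99.9), 1))
```

Here the policy flags `multi_az_rpo` (90 seconds measured against a 60-second limit) and `backup_restore_time` (never measured). A 99.9% availability SLO over 30 days works out to a budget of about 43.2 minutes of downtime. A less critical application could reuse the same building blocks with only the multi-AZ requirements selected.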

AI-Powered Analysis and Dependency Discovery

The next generation of the hub incorporates generative AI to perform failure mode assessments. These assessments analyze configured services against the defined resilience policies, the AWS Resilience Analysis Framework, and AWS Well-Architected best practices.

The AI identifies potential failure modes and provides actionable recommendations to mitigate risks. To improve accuracy, users can add assertions to guide the AI agents during the assessment process.

Complementing the AI analysis is a new dependency discovery assessment. This feature uses DNS query log analysis to automatically identify internal and third-party endpoints that a service relies on. This process is designed to uncover hidden dependencies, such as unexpected cross-region calls or critical third-party APIs, which often act as single points of failure during an outage.
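As a rough illustration of the technique, the sketch below classifies hostnames found in DNS query log records and counts how often each is queried. The record shape (a dictionary with a `query_name` key), the category labels, and the internal domain suffix are assumptions made for this example; the article does not describe the log format or the classification logic Resilience Hub actually uses.

```python
import re
from collections import Counter

# Matches regional AWS endpoints such as dynamodb.eu-west-1.amazonaws.com
AWS_REGIONAL = re.compile(r"\.([a-z]{2}(?:-[a-z]+)+-\d)\.amazonaws\.com$")


def classify(query_name, home_region, internal_suffixes):
    """Label a queried hostname by the kind of dependency it suggests."""
    host = query_name.rstrip(".").lower()
    if any(host == s or host.endswith("." + s) for s in internal_suffixes):
        return "internal"
    match = AWS_REGIONAL.search(host)
    if match:
        same = match.group(1) == home_region
        return "aws-same-region" if same else "aws-cross-region"
    if host.endswith(".amazonaws.com"):
        return "aws-other"
    return "third-party"


def discover_dependencies(records, home_region, internal_suffixes):
    """Group queried hostnames by category, counting queries per hostname."""
    found = {}
    for record in records:
        host = record["query_name"].rstrip(".").lower()
        category = classify(host, home_region, internal_suffixes)
        found.setdefault(category, Counter())[host] += 1
    return found


# Hypothetical log records for a service running in us-east-1.
records = [
    {"query_name": "orders.internal.example.com."},
    {"query_name": "dynamodb.us-east-1.amazonaws.com."},
    {"query_name": "sqs.eu-west-1.amazonaws.com."},
    {"query_name": "api.payments-vendor.example."},
    {"query_name": "api.payments-vendor.example."},
]

deps = discover_dependencies(records, "us-east-1", ("internal.example.com",))
for category, hosts in sorted(deps.items()):
    print(category, dict(hosts))
```

In this sample, the call to an SQS endpoint in `eu-west-1` surfaces as a cross-region dependency, and the payment vendor's API appears as a third-party endpoint that is queried repeatedly. Both are the kind of hidden dependency the assessment is meant to expose.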

Enterprise Scaling and Deployment

For large-scale governance, the service integrates with AWS Organizations. This allows a single delegated administrator account to evaluate the resilience posture of the entire enterprise without requiring manual logins to individual member accounts.

Deployment requires the configuration of an invoker IAM role to grant Resilience Hub read-only access to resources. For those not using AWS Organizations, cross-account roles can be used to maintain visibility across different environments.
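For readers unfamiliar with cross-account roles, the following sketch builds a generic IAM trust policy that lets a central account assume a role in a member account. It uses standard IAM policy syntax, but it is not the documented Resilience Hub role configuration: the function name, the account ID, and the optional external ID are placeholders, and the permissions attached to the role are out of scope here.

```python
import json


def cross_account_trust_policy(central_account_id, external_id=None):
    """Build a trust policy allowing `central_account_id` to assume a role."""
    if not (central_account_id.isdigit() and len(central_account_id) == 12):
        raise ValueError("AWS account IDs are 12-digit numbers")
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{central_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # Optional extra check that the caller supplies an agreed identifier.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder account ID for the central (assessing) account.
policy = cross_account_trust_policy("111122223333")
print(json.dumps(policy, indent=2))
```

The resulting document would be attached as the trust relationship of a role in each member account, with read-only permissions granted separately according to the service's documentation.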

Existing customers can transition to the new model using migration APIs. These tools convert previous assessment policies into the new modular format and map legacy application structures to the new system-service hierarchy.

The service is now available in all AWS commercial Regions where Resilience Hub was previously offered. AWS has transitioned the tool to a service-based pricing model, which includes two failure mode assessments per month for services, with optional pricing for automated dependency assessments.
