AWS recently introduced Amazon Application Recovery Controller (ARC) Region switch, a fully managed, highly available capability that enables organizations to plan, practice, and orchestrate Region switches.
Previously, managing a failover required users to write and maintain intricate scripts that coordinated tasks across AWS services such as compute, databases, and DNS. ARC Region switch replaces this labor-intensive process with a centralized, highly available solution. As noted by Sébastien Stormacq, a principal developer advocate at AWS:
“It gives you a centralized solution to coordinate and automate recovery tasks across AWS services and accounts when you need to switch your application’s operations from one AWS Region to another.”
ARC Region switch enables users to create detailed recovery plans with a variety of execution blocks that define the steps for a Region switch. These include:
- ARC routing controls: To redirect traffic using DNS health checks.
- Amazon Aurora global database: To perform database failover or switchover.
- Amazon EC2, Amazon EKS, and Amazon ECS scaling: To scale compute resources in the target Region by a specified percentage.
- Custom actions: To integrate custom recovery steps using AWS Lambda functions.
- Manual approval: To add checkpoints in the recovery workflow for team review.
(Source: AWS News blog post)
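To make the idea of a plan built from ordered execution blocks concrete, here is a minimal sketch in Python. It is only an illustrative data model: the class names, fields, Region names, and the round-up rule for capacity are assumptions made for this example, not the ARC API or an AWS SDK.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class BlockType(Enum):
    # Labels mirror the block categories listed above; they are NOT AWS API identifiers.
    ROUTING_CONTROL = "ARC routing control"
    AURORA_GLOBAL_DATABASE = "Aurora global database"
    COMPUTE_SCALING = "EC2/EKS/ECS scaling"
    CUSTOM_ACTION = "Custom action (Lambda)"
    MANUAL_APPROVAL = "Manual approval"


@dataclass
class ExecutionBlock:
    name: str
    block_type: BlockType
    # Only meaningful for COMPUTE_SCALING blocks (hypothetical field).
    target_percent: Optional[int] = None


@dataclass
class RegionSwitchPlan:
    # Hypothetical container: a named, ordered list of steps for one Region switch.
    name: str
    source_region: str
    target_region: str
    blocks: List[ExecutionBlock] = field(default_factory=list)

    def describe(self) -> List[str]:
        """Return the steps in execution order as numbered, human-readable lines."""
        return [
            f"{i}. {b.name} [{b.block_type.value}]"
            for i, b in enumerate(self.blocks, start=1)
        ]


def target_capacity(source_units: int, percent: int) -> int:
    """Capacity to provision in the target Region.

    Rounding up is an assumption of this sketch, chosen so the target
    is never under-provisioned; integer math avoids float rounding.
    """
    return (source_units * percent + 99) // 100


plan = RegionSwitchPlan(
    name="orders-app-failover",
    source_region="us-east-1",
    target_region="us-west-2",
    blocks=[
        ExecutionBlock("Scale compute", BlockType.COMPUTE_SCALING, target_percent=75),
        ExecutionBlock("Fail over database", BlockType.AURORA_GLOBAL_DATABASE),
        ExecutionBlock("Run smoke tests", BlockType.CUSTOM_ACTION),
        ExecutionBlock("Team sign-off", BlockType.MANUAL_APPROVAL),
        ExecutionBlock("Shift traffic", BlockType.ROUTING_CONTROL),
    ],
)

for line in plan.describe():
    print(line)
```

The ordering in this example (scale compute first, shift traffic last) is one plausible sequence, not a prescribed one; the point is that each recovery step becomes an explicit, reviewable entry instead of a line in an ad hoc script.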
A key feature of ARC Region switch is its proactive validation, which checks resource configurations and AWS IAM permissions every 30 minutes to ensure that recovery plans remain viable. It also offers a global dashboard for monitoring the status of plans across the entire enterprise.
To balance cost and reliability, the service gives users flexibility in how they prepare standby resources. They can configure the percentage of compute capacity to provision in the destination Region during recovery, trading cost against performance needs.
The community's reception of the capability has been positive. A long-time disaster recovery architect noted on a Reddit thread that the service will “make x-region DR run books so much easier.” This sentiment was reinforced by a comment on LinkedIn, which described the service’s key innovation: “A fully orchestrated, automated failover service with CONTINUOUS VALIDATION. ARC constantly checks readiness, turning a high-stakes manual crisis into a confident, push-button drill, removing the fear from failovers.”
While AWS has introduced a consolidated, managed solution with ARC Region switch, both Google Cloud and Microsoft Azure offer similar capabilities for multi-region failover. Rather than a single service, these providers offer a set of tools that can be combined to achieve a comparable result. For instance, Azure uses services like Site Recovery for orchestrated failover and Front Door or Traffic Manager for global traffic routing. Similarly, Google Cloud customers build multi-region recovery strategies by combining Cloud DNS failover policies with global load balancers to redirect traffic during an outage.
Currently, ARC Region switch is available in all commercial AWS Regions and priced at $70 per month per plan, with each plan capable of including up to 100 execution blocks. Users can also create parent plans to orchestrate up to 25 child plans.
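The pricing and limits above lend themselves to a quick back-of-the-envelope check. The sketch below uses only the figures quoted in this article; the function names are hypothetical, and it assumes every plan is billed at the same flat monthly rate, which should be confirmed against the official pricing page.

```python
# Figures quoted in the article; verify against current AWS pricing before relying on them.
PRICE_PER_PLAN_USD = 70        # per plan, per month
MAX_BLOCKS_PER_PLAN = 100      # execution blocks in a single plan
MAX_CHILD_PLANS = 25           # child plans under one parent plan


def monthly_cost_usd(plan_count: int) -> int:
    """Flat monthly cost for a number of plans (assumes uniform per-plan billing)."""
    if plan_count < 0:
        raise ValueError("plan_count must be non-negative")
    return plan_count * PRICE_PER_PLAN_USD


def fits_limits(blocks_in_plan: int, child_plans: int = 0) -> bool:
    """Check a proposed plan layout against the quoted block and child-plan limits."""
    return (
        0 <= blocks_in_plan <= MAX_BLOCKS_PER_PLAN
        and 0 <= child_plans <= MAX_CHILD_PLANS
    )


print(monthly_cost_usd(3))    # three plans at the quoted rate
print(fits_limits(40, 5))     # a parent plan with 40 blocks and 5 children
print(fits_limits(101))       # exceeds the per-plan block limit
```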