A large multi-site customer came to Stage2Data after earlier disaster recovery (DR) tests failed to give the team enough confidence. Some systems would not come up at all. Others could be recovered, but only after recovery times stretched beyond what the business could accept.
Why the Existing Recovery Plan Was Falling Short
The customer’s environment is complex. It includes hundreds of virtual machines, multiple data centers, virtual desktop infrastructure (VDI) dependencies, mainframe-related systems, and 15 to 20 network segments that need to communicate across the recovery environment.
In a real event, recovery would not be as simple as powering on workloads in another location. Those workloads would need to retain the right network identity, connect through the right paths, and support the application dependencies that keep the business running.
Stage2Data reviewed the customer’s architecture and recovery requirements, then deployed Zerto at the production site with replication into a Stage2Data data center. We also created a dedicated Zerto long-term retention bucket, giving the customer extended offsite retention while keeping that data separate from the replication target for added security.
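The key design property of that layout is isolation: the long-term retention bucket uses storage and access separate from the replication target, so a problem at one does not expose the other. A minimal sketch of the topology as described, where all names, endpoints, and values are illustrative assumptions rather than Stage2Data's actual configuration:

```python
# Hypothetical sketch of the DR topology described above.
# Names, endpoints, and RPO value are illustrative assumptions only.
dr_topology = {
    "replication": {
        "source": "customer-production-site",
        "target": "stage2data-dc",        # continuous replication target
        "rpo_seconds": 10,                # seconds-level RPO, per the design goal
    },
    "long_term_retention": {
        "bucket": "customer-ltr-bucket",  # dedicated offsite retention bucket
        "isolated_from_replication": True # separate storage and credentials
    },
}

# The property that matters: retention copies are reachable only through
# a path distinct from the replication target.
print(dr_topology["long_term_retention"]["isolated_from_replication"])
```

Keeping the two targets logically separate means a failure or compromise of the replication environment leaves the extended retention copies intact.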
Testing Recovery in Phases Before the 300-VM Validation
Since the deployment, the customer has completed successful smaller tests and now has a recovery model designed to bring critical systems online in minutes, with recovery point objectives measured in seconds. The next step is a larger planned validation across the broader environment, including a 300-VM recovery test.
That phased approach matters. Large recoveries rarely fail on compute alone. They fail when segmented networks, VDI access, application dependencies, and user connectivity do not come back in the right order.
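One way to reason about "the right order" is as a dependency graph: each layer lists what must be online before it can start, and the recovery sequence is a topological ordering of that graph. The sketch below illustrates the idea with Python's standard-library `graphlib`; the group names and dependencies are hypothetical examples, not the customer's actual runbook or Zerto's orchestration engine.

```python
# Conceptual sketch: dependency-ordered recovery, not Zerto's actual engine.
# Recovery group names and dependency edges are illustrative assumptions.
from graphlib import TopologicalSorter

# Each recovery group maps to the groups that must be online before it starts.
dependencies = {
    "core-network": [],                            # segments and routing first
    "identity":     ["core-network"],              # directory/DNS need the network
    "databases":    ["core-network", "identity"],
    "app-servers":  ["databases"],
    "vdi":          ["identity", "app-servers"],   # user desktops come back last
}

boot_order = list(TopologicalSorter(dependencies).static_order())
print(boot_order)
```

A phased test exercises exactly this ordering: if a layer's prerequisites are wrong or missing, the sort (or the test) fails before a real incident does.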
By testing those layers before an incident, the customer can move from a recovery plan that looked good on paper to one that has been proven under realistic conditions.