On-premises disaster recovery failover and failback

In this tutorial, we will learn about failover and failback during disaster recovery of on-premises machines to Azure with Azure Site Recovery. We will also cover how to run a disaster recovery drill to Azure using a Site Recovery test failover.

Recovery stages

Failover and failback in Site Recovery have four stages:

Stage 1:

Fail over from on-premises: after you set up replication to Azure for on-premises machines, you fail those machines over to Azure when your on-premises site goes down.

Stage 2: 

Reprotect Azure VMs: you reprotect the Azure VMs so that they start replicating back to the on-premises site. The on-premises VM is turned off during reprotection to help ensure data consistency.

Stage 3:

Fail over from Azure: when the on-premises site is running normally again, you run another failover, this time to fail back the Azure VMs to your on-premises site. You can fail back to the original location from which you failed over, or to an alternate location.

Stage 4: 

Reprotect on-premises machines: after failing back, you again enable replication of the on-premises machines to Azure.
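
The four stages form a cycle: once the on-premises machines are reprotected in stage 4, they are ready to fail over to Azure again. The sketch below models that cycle only. The `Stage` names and the `full_cycle` helper are hypothetical labels chosen for illustration and are not part of any Site Recovery API.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical labels for the four stages described above."""
    FAILOVER_TO_AZURE = 1
    REPROTECT_AZURE_VMS = 2
    FAILBACK_TO_ON_PREMISES = 3
    REPROTECT_ON_PREMISES = 4


# Each stage leads to the next. After stage 4 the machines replicate to
# Azure again, so a later outage would start the cycle over at stage 1.
NEXT_STAGE = {
    Stage.FAILOVER_TO_AZURE: Stage.REPROTECT_AZURE_VMS,
    Stage.REPROTECT_AZURE_VMS: Stage.FAILBACK_TO_ON_PREMISES,
    Stage.FAILBACK_TO_ON_PREMISES: Stage.REPROTECT_ON_PREMISES,
    Stage.REPROTECT_ON_PREMISES: Stage.FAILOVER_TO_AZURE,
}


def full_cycle(start: Stage = Stage.FAILOVER_TO_AZURE) -> list[Stage]:
    """Return the four stages in order, beginning at `start`."""
    stages = [start]
    while len(stages) < len(Stage):
        stages.append(NEXT_STAGE[stages[-1]])
    return stages
```

Calling `full_cycle()` lists the stages in the order given above, which can serve as a checklist skeleton for a runbook.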

Failover

Performing a failover is part of your business continuity and disaster recovery (BCDR) strategy.

  • First, as part of your BCDR strategy, you replicate your on-premises machines to Azure on an ongoing basis. Users continue to access workloads and apps running on the on-premises source machines.
  • Second, if the need arises, for example when there's an outage on-premises, you fail the replicating machines over to Azure. Azure VMs are created using the replicated data.
  • Lastly, for business continuity, users can continue accessing apps on the Azure VMs.

Failover is a two-phase activity:
  • Failover: creates and brings up an Azure VM using the selected recovery point.
  • Commit: after failover, you verify the VM in Azure. You can then commit the failover to the selected recovery point, or select a different point for the commit.
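
The two phases above can be sketched as a small model. This is an illustration only: the `FailoverSession` class, its fields, and the recovery point names are assumptions made for the example and do not correspond to any Site Recovery SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverSession:
    """Toy model of one VM's failover; not a Site Recovery API."""
    recovery_points: list[str]
    active_point: Optional[str] = None      # point the Azure VM was brought up from
    committed_point: Optional[str] = None   # point the failover was committed to

    def failover(self, point: str) -> None:
        """Phase 1: bring up the Azure VM from the selected recovery point."""
        if point not in self.recovery_points:
            raise ValueError(f"unknown recovery point: {point}")
        self.active_point = point

    def commit(self, point: Optional[str] = None) -> str:
        """Phase 2: commit to the active point, or to a different one."""
        if self.active_point is None:
            raise RuntimeError("run failover before commit")
        chosen = self.active_point if point is None else point
        if chosen not in self.recovery_points:
            raise ValueError(f"unknown recovery point: {chosen}")
        self.committed_point = chosen
        return chosen
```

Keeping `commit` separate from `failover` mirrors the verification step in between: nothing is final until the VM in Azure has been checked.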

Connecting to Azure VMs after Failover

To connect to the Azure VMs created after failover using RDP or SSH, a number of requirements must be met.

[Figure: requirements for connecting to Azure VMs after failover. Image source: Microsoft]

Types of Failover

Site Recovery provides different failover options.

[Figure: failover types. Image source: Microsoft]

Reprotection/failback

After failing over to Azure, the replicated Azure VMs are in an unprotected state.

  • First, to fail back to your on-premises site, you need to start replicating the Azure VMs to on-premises. The reprotection process depends on the type of machines you failed over.
  • Next, after machines are replicating from Azure to on-premises, you can run a failover from Azure to your on-premises site.
  • Then, after machines are running on-premises again, you enable replication so that they replicate to Azure for disaster recovery.

How Failback Works

  • First, to fail back, a VM needs at least one recovery point. In a recovery plan, all VMs in the plan need at least one recovery point.
  • Second, there is an app-consistent recovery point option. With this option, a single VM recovers to its latest available app-consistent recovery point. For a recovery plan with a replication group, each replication group recovers to its common available recovery point.
  • Third, during failover from Azure to the on-premises site, Site Recovery shuts down the Azure VMs. When you commit the failover, Site Recovery removes the failed-back Azure VMs in Azure.
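
The recovery point requirement in the first bullet can be expressed as a simple pre-check. The sketch below assumes a recovery plan is represented as a plain dictionary mapping VM names to their recovery points. That representation and both function names are hypothetical.

```python
def vms_blocking_failback(plan: dict[str, list[str]]) -> list[str]:
    """Return the names of VMs in the plan that have no recovery point."""
    return sorted(name for name, points in plan.items() if not points)


def can_fail_back(plan: dict[str, list[str]]) -> bool:
    """A plan can fail back only if every VM has at least one recovery point."""
    return bool(plan) and not vms_blocking_failback(plan)
```

For example, a plan in which one VM has an empty list of recovery points is reported as not ready, and `vms_blocking_failback` names the VM that needs attention.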

Hyper-V reprotection/failback

For reprotecting and failing back Hyper-V VMs from Azure to on-premises:

  • First, you can only fail back Hyper-V VMs that replicate using a storage account. Failback of Hyper-V VMs that replicate using managed disks is not supported.
  • Second, on-premises Hyper-V hosts should be connected to Azure.
  • Third, you run a planned failback from Azure to on-premises. No specific components need to be set up for Hyper-V VM failback.
  • Lastly, during planned failover, you can select options for synchronizing data before failback:

Synchronizing data before failover: This minimizes downtime for virtual machines as it synchronizes machines without shutting them down.

Phase 1: This takes a snapshot of the Azure VM and copies it to the on-premises Hyper-V host.

Phase 2: This shuts down the Azure VM so that no new changes occur there. The final set of delta changes is then transferred to the on-premises server, and the on-premises VM is started up.

Synchronizing data during failover only: This option is faster as we expect that most of the disk has changed, and thus don’t perform checksum calculations. It performs a download of the disk.
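
To make the two phases of the "before failover" option concrete, here is a rough simulation. It assumes a disk can be modeled as a dictionary of block numbers to block contents, which is a simplification made for this example. The function name and its parameters are hypothetical and do not describe how Site Recovery is implemented.

```python
def sync_before_failover(
    azure_disk: dict[int, bytes],
    writes_during_copy: dict[int, bytes],
) -> tuple[dict[int, bytes], dict[int, bytes]]:
    """Simulate the two-phase sync; return (on-premises disk, delta sent)."""
    live = dict(azure_disk)  # work on a copy so the caller's data is untouched

    # Phase 1: take a snapshot and copy it while the Azure VM keeps running.
    snapshot = dict(live)
    on_prem_disk = dict(snapshot)

    # The Azure VM is still up, so new writes land after the snapshot.
    live.update(writes_during_copy)

    # Phase 2: the Azure VM is shut down, so `live` can no longer change.
    # Only blocks that differ from the snapshot are transferred.
    delta = {
        block: data
        for block, data in live.items()
        if snapshot.get(block) != data
    }
    on_prem_disk.update(delta)
    return on_prem_disk, delta
```

In this model the downtime corresponds to phase 2 only, which moves just the changed blocks. The "during failover only" option would correspond to copying the whole disk after shutdown instead.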

Running a test failover to Azure

Running a test failover helps you validate your replication and disaster recovery strategy without any data loss or downtime. A test failover doesn't impact ongoing replication or your production environment. You can run a test failover on a specific virtual machine (VM), or on a recovery plan containing multiple VMs.

To run a test failover for a recovery plan, follow the steps below:

  • First, in Site Recovery in the Azure portal, click Recovery Plans > recoveryplan_name > Test Failover.
  • Second, select a Recovery Point to which to fail over. You can use one of the following options:

Latest processed: This fails over all VMs in the plan to the latest recovery point processed by Site Recovery.

Latest app-consistent: This fails over all the VMs in the plan to the latest application-consistent recovery point processed by Site Recovery.

Latest: This first processes all the data sent to the Site Recovery service to create a recovery point for each VM before failing over to it. This option provides the lowest RPO (recovery point objective).

Latest multi-VM processed: This option is available for recovery plans with one or more VMs that have multi-VM consistency enabled. VMs with the setting enabled fail over to the latest common multi-VM consistent recovery point.

Latest multi-VM app-consistent: This option is available for recovery plans with one or more VMs that have multi-VM consistency enabled. VMs that are part of a replication group fail over to the latest common multi-VM application-consistent recovery point.

Custom: Use this option to fail over a specific VM to a particular recovery point.

  • Third, select an Azure virtual network in which test VMs will be created.
  • If you are failing over to Azure and data encryption is enabled, in Encryption Key select the certificate that was issued when you enabled encryption during Provider installation.
  • Next, track failover progress on the Jobs tab. You should be able to see the test replica machine in the Azure portal.
  • To initiate an RDP connection to the Azure VM, you need to add a public IP address on the network interface of the failed-over VM.
  • When you are done, click Cleanup test failover. This deletes the VMs that were created during test failover.
  • Lastly, in Notes, record and save any observations associated with the test failover.
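
The first three recovery point options from the second step can be sketched as a selection function. This is a simplified model under stated assumptions: the `RecoveryPoint` fields and the option strings are hypothetical, and the Latest option is approximated by picking the newest point regardless of whether it has been processed yet, since the service would process pending data first.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RecoveryPoint:
    created: datetime
    processed: bool        # already processed by Site Recovery (assumed flag)
    app_consistent: bool   # application-consistent point (assumed flag)


def pick_recovery_point(points: list[RecoveryPoint], option: str) -> RecoveryPoint:
    """Pick a recovery point for one VM according to the chosen option."""
    if option == "latest_processed":
        candidates = [p for p in points if p.processed]
    elif option == "latest_app_consistent":
        candidates = [p for p in points if p.processed and p.app_consistent]
    elif option == "latest":
        # Pending data is processed first, so every point is a candidate.
        candidates = list(points)
    else:
        raise ValueError(f"unsupported option: {option}")
    if not candidates:
        raise LookupError(f"no recovery point matches option: {option}")
    return max(candidates, key=lambda p: p.created)
```

The model shows why Latest gives the lowest RPO: it can reach a newer point than the other two options, at the cost of extra processing time before the failover starts.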

Reference: Microsoft Documentation, Documentation 2
