HADR Database Solutions on Azure Using Geo-Replication and Auto-Failover Groups

Rafael Rampineli
Sep 4, 2024

Azure provides various high-availability and disaster recovery (HADR) solutions for databases to ensure your data remains accessible and protected during failures or disasters. In this overview, we will focus on two key features: Geo-Replication and Auto-Failover Groups.

Geo-Replication

Geo-replication allows you to create readable secondary databases in different Azure regions. These secondary databases are copies of your primary database and can be used for disaster recovery, read scaling, and data localization, thereby enhancing availability and performance for global applications.

How Geo-Replication Works:

  • Replication Process: Geo-replication uses automatic asynchronous replication. The initial copy of the primary database's data to a secondary is known as seeding; after that, changes are replicated continuously. Transactions are committed on the primary database before being replicated to the secondary.
  • Planned Failover: A planned failover synchronizes the secondary with the primary before the roles are switched, so no data is lost.
  • Unplanned Failover: In an unplanned failover, the secondary database immediately takes over as the primary without synchronization. Any transactions committed on the primary but not yet replicated are lost.
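The difference between the two failover types comes down to whether the replication queue is drained before the roles switch. The following is a minimal toy model in Python, not Azure code: the `Replica` class, the `pending` queue, and all function names are hypothetical and exist only to illustrate the behavior described above.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy database replica that stores committed transaction IDs."""
    name: str
    log: list = field(default_factory=list)


def commit(primary: Replica, pending: list, txn: str) -> None:
    """Commit on the primary first; replication happens later (asynchronously)."""
    primary.log.append(txn)
    pending.append(txn)  # queued for shipping to the secondary


def replicate(secondary: Replica, pending: list) -> None:
    """Ship every queued transaction to the secondary, in commit order."""
    secondary.log.extend(pending)
    pending.clear()


def planned_failover(secondary: Replica, pending: list) -> list:
    """Synchronize first, then switch roles: nothing is lost."""
    replicate(secondary, pending)
    return []  # list of lost transactions


def unplanned_failover(pending: list) -> list:
    """Switch roles immediately: whatever was still queued is lost."""
    lost = list(pending)
    pending.clear()
    return lost


primary, secondary, pending = Replica("primary"), Replica("secondary"), []
commit(primary, pending, "t1")
replicate(secondary, pending)   # t1 reaches the secondary
commit(primary, pending, "t2")  # t2 is committed but not yet replicated

lost_unplanned = unplanned_failover(list(pending))  # simulated on a copy of the queue
lost_planned = planned_failover(secondary, pending)
print(lost_unplanned)  # ['t2']
print(lost_planned)    # []
```

In this sketch the only transaction at risk is the one committed after the last replication round, which is why replication lag directly determines how much data an unplanned failover can lose.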

Secondary Database Management:

  • Number of Secondary Databases: You can create up to four secondary databases for each primary database. If only one secondary database exists and it fails, the application is at higher risk until a new secondary database is created. Multiple secondary databases add redundancy: if one fails, the others still protect the primary.
  • Elastic Pools: Each secondary database can be part of an elastic pool or remain outside of one. The elastic pool choice for each secondary is independent of every other database in the topology. Each elastic pool is contained within a single logical server, and database names on a server must be unique. Consequently, multiple secondary databases of the same primary cannot share an elastic pool.
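These rules lend themselves to a quick sanity check before provisioning. The sketch below is a hypothetical validator, not part of any Azure SDK: the `Secondary` type, the server and pool names, and `validate_topology` are all invented for illustration, and the limit of four is taken from the text above.

```python
from typing import NamedTuple, Optional

MAX_SECONDARIES = 4  # limit stated in the text above


class Secondary(NamedTuple):
    server: str                 # hypothetical server name
    pool: Optional[str] = None  # elastic pool name on that server, or None


def validate_topology(secondaries):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(secondaries) > MAX_SECONDARIES:
        problems.append(
            f"{len(secondaries)} secondaries exceed the limit of {MAX_SECONDARIES}"
        )
    if len(secondaries) == 1:
        problems.append("only one secondary: higher risk while it is being replaced")
    used_pools = set()
    for s in secondaries:
        if s.pool is None:
            continue  # standalone secondary, nothing to check
        key = (s.server, s.pool)  # a pool is identified by server + pool name
        if key in used_pools:
            problems.append(
                f"pool {s.pool!r} on {s.server!r} is shared by two secondaries"
            )
        used_pools.add(key)
    return problems


ok = validate_topology([Secondary("srv-westeurope", "pool-a"), Secondary("srv-eastus")])
shared = validate_topology(
    [Secondary("srv-eastus", "pool-a"), Secondary("srv-eastus", "pool-a")]
)
print(ok)      # []
print(shared)  # one problem reporting the shared pool
```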

Service and Configuration:

  • Service Tier: The primary and secondary databases must have the same service tier.
  • Backup Storage and Compute Size: It is recommended that the secondary database matches the primary in backup storage redundancy and compute size (DTUs or vCores).
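The two rules above differ in strength: a service tier mismatch is a hard error, while differences in compute size or backup storage redundancy only go against the recommendation. A minimal sketch of that distinction follows; the `DbConfig` type, its field names, and the example values are assumptions made for illustration and do not come from an Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbConfig:
    service_tier: str       # e.g. "GeneralPurpose" (illustrative value)
    compute_size: int       # DTUs or vCores
    backup_redundancy: str  # e.g. "Geo" or "Local" (illustrative values)


def check_pair(primary: DbConfig, secondary: DbConfig):
    """Return (errors, warnings) for a primary/secondary pair."""
    errors, warnings = [], []
    if primary.service_tier != secondary.service_tier:
        errors.append("service tier must match")  # required
    if secondary.compute_size != primary.compute_size:
        warnings.append("compute size differs from primary")  # recommended
    if secondary.backup_redundancy != primary.backup_redundancy:
        warnings.append("backup storage redundancy differs from primary")
    return errors, warnings


primary = DbConfig("GeneralPurpose", 8, "Geo")
secondary = DbConfig("GeneralPurpose", 4, "Geo")
errors, warnings = check_pair(primary, secondary)
print(errors)    # []
print(warnings)  # ['compute size differs from primary']
```

An undersized secondary is worth flagging because it has to carry the full workload after a failover.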

Cost Considerations:

  • Geo-replication incurs additional costs for secondary databases and data transfer between regions. Review pricing and estimate costs based on your configuration and usage.

Limitations:

  • Active geo-replication is not supported for Azure SQL Managed Instance. Instead, use Auto-Failover Groups for SQL Managed Instance.

Auto-Failover Groups

Auto-failover groups provide high availability and disaster recovery by automatically managing failovers between a primary database and a secondary database. This feature is crucial for mission-critical applications requiring minimal downtime.

How Auto-Failover Groups Work:

  • Replication Process: Data from the primary database is asynchronously replicated to the secondary database, so the secondary closely tracks the primary but can lag behind by the most recently committed transactions.
  • Automatic Detection: Azure continuously monitors both the primary and secondary databases. If an issue is detected, the failover process is automatically initiated.
  • Failover Process: When a failover is triggered, Azure automatically promotes the secondary database to the primary role, redirecting applications with minimal disruption.
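Applications are redirected without a connection string change because they connect to the failover group's listener endpoints rather than to a specific server. To my understanding, an Azure SQL Database failover group exposes a read-write listener at `<group-name>.database.windows.net` and a read-only listener at `<group-name>.secondary.database.windows.net`; verify both against the current Azure documentation, since SQL Managed Instance uses a different format. The helper below is a hypothetical sketch, and `fog-demo` is an invented group name.

```python
def listener_endpoints(failover_group: str) -> dict:
    """Build the two listener host names for an Azure SQL Database failover group.

    The host name formats are an assumption to check against the Azure docs.
    """
    return {
        # always resolves to whichever server currently holds the primary role
        "read_write": f"{failover_group}.database.windows.net",
        # resolves to the current secondary, for read-only workloads
        "read_only": f"{failover_group}.secondary.database.windows.net",
    }


endpoints = listener_endpoints("fog-demo")
print(endpoints["read_write"])  # fog-demo.database.windows.net
print(endpoints["read_only"])   # fog-demo.secondary.database.windows.net
```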

Failover Management:

  • Manual Failover: You can initiate a failover manually if necessary.
  • Automatic Failover: Alternatively, you can set up automatic failover based on a user-defined policy. This option recovers multiple related databases to a secondary region automatically in case of a catastrophic failure or unplanned event.

Failover Group Configuration:

  • Single or Multiple Databases: A failover group can include one or more databases, typically those used by the same application.
  • Configuration: All databases added to a failover group must reside on the primary server.
  • Secondary Server: All databases in the failover group are replicated to a single secondary server or instance in a different region.

Failover Behavior:

  • Outage Response: With an automatic failover policy, any outage affecting databases in the group triggers an automatic failover.
  • Post-Failover Configuration: After a failover, ensure that the authentication settings for the new primary database, server, or instance are correctly configured.
  • Failover Group Management: A failover group is a collection of databases managed by a single server or managed instance that can fail over as a unit to another region.
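The fail-over-as-a-unit behavior described above can be sketched as a small toy model. This is not Azure code: the `FailoverGroup` class, its `automatic` flag (standing in for the automatic failover policy), and the server and database names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FailoverGroup:
    primary: str            # hypothetical server name
    secondary: str          # hypothetical server in another region
    databases: set = field(default_factory=set)
    automatic: bool = True  # stands in for the automatic failover policy

    def fail_over(self) -> None:
        """Swap roles for the whole group: every database moves together."""
        self.primary, self.secondary = self.secondary, self.primary

    def report_outage(self, database: str) -> bool:
        """Fail over if the policy is automatic and the outage hits a member."""
        if self.automatic and database in self.databases:
            self.fail_over()
            return True
        return False

    def location_of(self, database: str) -> str:
        """Server currently hosting the writable copy of a member database."""
        if database not in self.databases:
            raise KeyError(database)
        return self.primary


group = FailoverGroup("srv-eastus", "srv-westus", {"orders", "customers"})
failed_over = group.report_outage("orders")
print(failed_over)                     # True
print(group.location_of("customers"))  # srv-westus
```

Note that the outage was reported for `orders`, yet `customers` moved as well: the group, not the individual database, is the unit of failover. With `automatic=False`, the same outage would leave the roles unchanged until `fail_over()` is called manually.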

Geo-Replication and Auto-Failover Groups are Azure's main building blocks for database high availability and disaster recovery. Geo-Replication creates readable secondary databases in other regions for disaster recovery and read scaling. Auto-Failover Groups build on the same asynchronous replication to fail over one or more databases as a unit, and they are the option to use for Azure SQL Managed Instance, where active geo-replication is not supported. Both features require careful setup to match your recovery and cost requirements.
