Overview
All PVBS instances are hosted in the Australia East Azure region. If this region goes offline and is expected to stay offline for a substantial period (more than 24 hours), this Disaster Recovery plan can be implemented to stand up new PVBS instances within 12 hours.
To recreate PVBS in a new Azure region, the following steps should be taken:
- Deploy new instances of the globally used infrastructure resources
- Deploy new sets of resources for each tenant
- Restore each tenant’s database from a backup
- Redirect each tenant’s subdomain to the new PVBS instance
Deploy Infrastructure
The following resources are shared by all PVBS tenants. In the event of a total region failure, they would need to be redeployed to a new region.
- Key Vault
- SQL Server
- SendGrid
- App Service for webjobs
The DevOps environment includes release pipeline tasks for each of these infrastructure resources. The resources are defined in ARM templates kept in source control.
They can be deployed to a new region by updating the release pipeline.
The Argus and PVBS Service webjobs are also deployed using release pipelines. These pipelines can be updated to deploy the latest production build to the new webjobs App Service.
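As an illustration of what "updating the release pipeline" might involve, the sketch below overrides the region in an ARM deployment-parameters document. It assumes the templates expose the region through a parameter named `location`, which is a guess; the real templates may use a different name or derive the region from the resource group. Australia Southeast is used only as an example target.

```python
import json

# Assumed parameter name; the actual ARM templates may use something else.
LOCATION_PARAM = "location"


def retarget_parameters(params, new_region):
    """Return a copy of an ARM deployment-parameters document with the region overridden."""
    updated = json.loads(json.dumps(params))  # deep copy so the input is not mutated
    updated.setdefault("parameters", {})[LOCATION_PARAM] = {"value": new_region}
    return updated


# Example parameters document in the standard ARM deployment-parameters shape.
original = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {"location": {"value": "australiaeast"}},
}

retargeted = retarget_parameters(original, "australiasoutheast")
print(json.dumps(retargeted["parameters"], indent=2))
```

Keeping the region in a single parameter means the same template can be reused unchanged for the recovery deployment.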
Deploy Tenant Resources
Tenant resources are deployed to Azure using a release pipeline. Each tenant has a task in this pipeline, which defines any tenant-specific variables to be applied when the resources are created. The ARM template for a PVBS tenant’s resources is kept in source control.
This pipeline can be used to deploy a set of resources for each tenant to a new Azure region.
This includes the following resources:
- App Service
- Deploy Slot
- SQL Database
- Application Insights
Once the resources have been deployed, the latest production build of the PVBS code can be deployed to each tenant’s deploy slot, then swapped into production. This is done using the existing release pipeline, updated to point to the new Azure region.
Restore Databases
Each PVBS tenant uses an SQL database for data storage. Backups are produced automatically and retained for 7 days as standard; long-term retention can be configured if required.
Full backups are taken once per week, differential backups every 12-24 hours, and transaction log backups every 5-10 minutes.
These backups are replicated to the paired region, which in the case of Australia East is Australia Southeast.
Because the backups are replicated to the paired region, they remain accessible if the originating region goes offline.
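To show why this schedule determines which restore points are available, the sketch below picks the backups needed to reach a given target time: the latest full backup, the latest differential taken after it, and the log backups up to the first one that covers the target. Azure performs this selection automatically; the function, the tuple layout, and the timestamps here are purely illustrative.

```python
from datetime import datetime


def restore_chain(backups, target):
    """Return the backups needed to restore to `target`, in apply order.

    `backups` is a list of (kind, finished_at) tuples, kind in {"full", "diff", "log"}.
    """
    fulls = [b for b in backups if b[0] == "full" and b[1] <= target]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    full = max(fulls, key=lambda b: b[1])

    # Only the most recent differential after the chosen full backup is needed.
    diffs = [b for b in backups if b[0] == "diff" and full[1] < b[1] <= target]
    base = max(diffs, key=lambda b: b[1]) if diffs else full

    # Apply logs taken after the base, stopping at the first one that covers the target.
    # If no log covers the target, the chain ends at the last available log.
    logs = sorted((b for b in backups if b[0] == "log" and b[1] > base[1]), key=lambda b: b[1])
    needed = []
    for log in logs:
        needed.append(log)
        if log[1] >= target:
            break

    return [full] + ([base] if base is not full else []) + needed


backups = [
    ("full", datetime(2024, 1, 7, 0, 0)),
    ("diff", datetime(2024, 1, 8, 0, 0)),
    ("diff", datetime(2024, 1, 8, 12, 0)),
    ("log", datetime(2024, 1, 8, 12, 10)),
    ("log", datetime(2024, 1, 8, 12, 20)),
    ("log", datetime(2024, 1, 8, 12, 30)),
]
chain = restore_chain(backups, datetime(2024, 1, 8, 12, 15))
for kind, finished_at in chain:
    print(kind, finished_at.isoformat())
```

In this example the chain is the weekly full backup, the 12:00 differential, and the 12:10 and 12:20 log backups; the earlier differential is skipped because the later one supersedes it.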
A PowerShell script can be used to restore a copy of each tenant’s database from the most recent point-in-time restore point.
Once restored, each site can be activated.
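The existing restore script is not reproduced here. As a rough sketch of what a per-tenant restore could look like, the helper below renders two Az PowerShell commands that restore a database from its geo-replicated backup (`Get-AzSqlDatabaseGeoBackup` followed by `Restore-AzSqlDatabase -FromGeoBackup`). This is one possible approach and may differ from what the actual script does; every resource group, server, and database name is hypothetical.

```python
def geo_restore_commands(tenant_db, source_rg, source_server, target_rg, target_server):
    """Render PowerShell commands that geo-restore one tenant database to a new server."""
    # Look up the geo-replicated backup of the database on the original server.
    lookup = (
        f'$geoBackup = Get-AzSqlDatabaseGeoBackup -ResourceGroupName "{source_rg}" '
        f'-ServerName "{source_server}" -DatabaseName "{tenant_db}"'
    )
    # Restore that backup as a database with the same name on the new server.
    restore = (
        f'Restore-AzSqlDatabase -FromGeoBackup -ResourceGroupName "{target_rg}" '
        f'-ServerName "{target_server}" -TargetDatabaseName "{tenant_db}" '
        "-ResourceId $geoBackup.ResourceID"
    )
    return [lookup, restore]


# Hypothetical names, for illustration only.
commands = geo_restore_commands(
    tenant_db="tenant-a-db",
    source_rg="pvbs-prod-rg",
    source_server="pvbs-sql-ae",
    target_rg="pvbs-dr-rg",
    target_server="pvbs-sql-ase",
)
for command in commands:
    print(command)
```

Generating the commands per tenant, rather than running them directly, allows them to be reviewed before execution during an incident.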
Redirect to New Instances
Currently, the subdomain CNAME record for each tenant points to the *.azurewebsites.net URL of the old resources.
These records need to be updated to point to the new resources. Once redirected, the custom domain and SSL binding can be configured on each new site. This is done with a PowerShell script.
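To keep track of which records still need changing, a small sketch like the following could compare the current CNAME targets with the expected new ones. The domain `example.com`, the tenant names, and the App Service names are all hypothetical, and the current records are assumed to have been exported from the DNS provider by some other means.

```python
def cname_updates(tenants, domain, new_app_names):
    """Map each tenant subdomain to the *.azurewebsites.net host of its new App Service."""
    return {f"{t}.{domain}": f"{new_app_names[t]}.azurewebsites.net" for t in tenants}


def stale_records(current, expected):
    """Return the subdomains whose CNAME does not yet point at the expected new host."""
    return sorted(
        name
        for name, target in expected.items()
        # DNS names are case-insensitive and may carry a trailing dot.
        if current.get(name, "").rstrip(".").lower() != target.lower()
    )


# Hypothetical tenants and App Service names, for illustration only.
expected = cname_updates(
    ["tenant-a", "tenant-b"],
    "example.com",
    {"tenant-a": "pvbs-tenant-a-ase", "tenant-b": "pvbs-tenant-b-ase"},
)
current = {
    "tenant-a.example.com": "pvbs-tenant-a-ae.azurewebsites.net.",
    "tenant-b.example.com": "pvbs-tenant-b-ase.azurewebsites.net.",
}
print(stale_records(current, expected))
```

Here only `tenant-a.example.com` is reported, since its record still points at the old App Service.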