This topic explains automatic failover in the Episerver DXC Service. Automatic failover enables customers with business-critical websites to maintain high availability if an infrastructure component in a datacenter, or even an entire datacenter region, suffers an outage. Failover is an optional component that you can add to your DXC Service instance.
How it works
Failover prevents websites from going down in the event of a server failure. When an error on a primary server is automatically detected, traffic is routed to a backup server in a secondary, geographically redundant location within the same delivery region.
Failover in DXC Service is fully automatic and requires no manual intervention. Geo-replication is included by default, so if a datacenter outage occurs, traffic is sent to a backup server in a different location, providing redundancy across geographical regions.
The setup includes two application environments, where storage is replicated from the primary to the secondary (failover) environment. Episerver ensures that the failover web app is always in the same state as the primary one.
The web app endpoints are continuously checked for responses. If one of them stops responding, traffic is moved to the secondary web app. When the primary web app is healthy again, traffic is directed back to it. You can display a message informing visitors that the site is in read-only mode.
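The routing behavior described above can be illustrated with a minimal sketch. It is not how DXC Service is implemented; the class name, the hostnames, and the `is_healthy` probe are hypothetical, and Python is used only for illustration. It shows the rule that traffic goes to the primary endpoint whenever its health check succeeds, and to the secondary otherwise.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverRouter:
    """Hypothetical router: prefers the primary endpoint while it is healthy."""
    primary: str
    secondary: str
    is_healthy: Callable[[str], bool]  # assumed health probe, e.g. an HTTP check

    def active_endpoint(self) -> str:
        # Traffic returns to the primary as soon as its probe succeeds again.
        if self.is_healthy(self.primary):
            return self.primary
        return self.secondary


# Simulated probe results standing in for real endpoint checks.
health = {"primary.example.com": True, "failover.example.com": True}
router = FailoverRouter(
    "primary.example.com",
    "failover.example.com",
    lambda host: health[host],
)
```

Setting `health["primary.example.com"] = False` makes `router.active_endpoint()` return the failover hostname; restoring it to `True` directs traffic back to the primary.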
DXC Service includes built-in endpoint monitoring and automatic endpoint failover. To use failover, you need to update the website's configuration to handle all hosts (*), to include the failover's hostname, or both. This is done in the Episerver Admin view.
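To illustrate why the host configuration matters, here is a small sketch of host matching. It does not reflect Episerver's actual matching logic; the function name and the hostnames are hypothetical. It only shows that a request arriving on the failover hostname is handled when the configuration contains either the wildcard or that hostname explicitly.

```python
from typing import Iterable


def host_is_handled(configured_hosts: Iterable[str], request_host: str) -> bool:
    """Return True if the request host matches the wildcard or an explicit entry."""
    wanted = request_host.lower()
    # "*" handles every host; otherwise compare hostnames case-insensitively.
    return any(host == "*" or host.lower() == wanted for host in configured_hosts)


# Hypothetical hostnames used only for this example.
primary_only = ["www.example.com"]
with_failover = ["www.example.com", "failover.example.com"]
wildcard = ["*"]
```

With `primary_only`, a request for `failover.example.com` would not be handled, which is why the failover hostname or the wildcard has to be added.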
- Reduce the cache expiration so that items are not cached for an unnecessarily long time, because this may cause outdated content to be displayed on the failover web app if a failure occurs.
- The CMS and Commerce versions on your site must support read-only mode (CMS 9.7.0 and Commerce 9.9.0 or higher).
- Add-ons on the site must also support read-only mode.
- The site must be able to handle aggressive autoscaling, where enough instances to meet the maximum load are started before that load is expected, so that the failover environment is up and running quickly. This may be an issue for sites that make many calls to external systems during application startup.
- Configure warnings in your solution to handle read-only mode, for example by using application state. Features that write to the database, such as saving a posted form, and features that write to storage, such as image resizing, must be aware that the application is in read-only mode so that they do not throw write exceptions.
- Optionally, you can configure whether to display an information message to end users on the failover site when it is in read-only mode during a failure; see Find database mode by code in Database mode.
Note: When the site is in failover state, the storage it uses is read-only.
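The requirement that write features be aware of read-only mode can be sketched as follows. This is a language-neutral illustration in Python, not Episerver code; `AppState`, `save_posted_form`, and the in-memory `store` are hypothetical names. In a real solution, the read-only flag would come from the database mode lookup described in Database mode.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AppState:
    """Hypothetical application state holding the current read-only flag."""
    read_only: bool = False


@dataclass
class FormStore:
    """In-memory stand-in for a database table of posted forms."""
    saved: List[Dict[str, str]] = field(default_factory=list)


def save_posted_form(state: AppState, store: FormStore, form: Dict[str, str]) -> bool:
    """Save the form unless the site is read-only; return whether it was saved."""
    if state.read_only:
        # Skip the write instead of letting the storage layer throw.
        return False
    store.saved.append(form)
    return True


state = AppState()
store = FormStore()
```

A caller can use the returned value to show visitors a message that submissions are temporarily unavailable, rather than surfacing a write exception.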