Web application warmup during DXC Service deployments

When deploying through the DXC Service Management Portal, warmup is both configured (when needed) and tested automatically as part of the deployment. Since it's usually the most time-consuming part of a deployment, this blog post provides a few more details on what happens during this step and how to make it as quick as possible.

How is warmup configured for sites in DXC Service?

During the first part of the deployment, where code is copied and transformations are applied, the deployment engine checks whether an applicationInitialization section exists in Web.config. If it doesn't, one is created automatically (more information about this is available here). Because of the way Application Initialization works on the underlying web server, some further configuration related to rewrite rules and redirects is also applied if needed, to make the warmup requests as effective as possible. Application Initialization accepts a redirect response in the same way as a 200 OK response and does not follow it, which means that a simple http to https redirect could render the whole application initialization configuration almost useless. Rewrite rules are therefore configured so that the application initialization requests are excluded from any redirects.
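To illustrate why the redirect exclusion matters, here is a toy Python simulation. It is only a sketch: `handle_request` and `pages_actually_warmed` are hypothetical names, the site is reduced to a single http to https redirect rule, and it assumes the warmup requests arrive over plain http.

```python
from typing import List, Tuple


def handle_request(scheme: str, path: str, is_warmup: bool,
                   exclude_warmup: bool) -> Tuple[int, bool]:
    """Toy site with an http-to-https redirect rule (hypothetical model).

    Returns (status code, whether the page code actually ran).
    """
    redirect_applies = scheme == "http" and not (is_warmup and exclude_warmup)
    if redirect_applies:
        return 301, False  # redirect is sent, the page itself never executes
    return 200, True


def pages_actually_warmed(paths: List[str], exclude_warmup: bool) -> List[str]:
    """Return the warmup paths whose page code really ran."""
    warmed = []
    for path in paths:
        # Assumption: warmup requests are made over plain http.
        _status, page_ran = handle_request(
            "http", path, is_warmup=True, exclude_warmup=exclude_warmup
        )
        # A redirect is accepted like a 200 OK and is not followed,
        # so nothing is retried over https.
        if page_ran:
            warmed.append(path)
    return warmed


print(pages_actually_warmed(["/", "/products"], exclude_warmup=False))
print(pages_actually_warmed(["/", "/products"], exclude_warmup=True))
```

Without the exclusion no page is warmed, even though every warmup request "succeeded" from the web server's point of view.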

Once the "code copy" part of the deployment has finished, the warmup process begins.

What is happening during the warmup step?

Autoscaling is first temporarily disabled before any warmup starts (making the current number of instances of the web app "fixed") to avoid any unnecessary scale outs that could otherwise be triggered by the warmup process. After that, the first part of an Azure web app "swap with preview" is initiated. This ensures that the deployment slot created during the deployment process will be ready to receive production traffic after the swap is completed, and it avoids any restarts of the web app.

Once that has finished, the application initialization process begins automatically within each individual instance of the web application. On top of this, the deployment engine also performs a couple of checks to validate that this process works as expected. First, each individual instance's local cache status is verified to ensure the best possible performance and stability of the web app once it's swapped.

The application initialization section is also analyzed to find a suitable hostname that can be used to poll the web app and its instances in the slot and check its response (whenever possible, a feature called routing rules is utilized to be able to use a "real" hostname to reach the deployment slot). Some of the things analyzed in the response are:

  • Whether the header "X-AppInit-WarmingUp" is returned. If it is, the Application Initialization engine is still running.
  • Which web app instance sent the response. If the response doesn't come from the instance the request was sent to, it indicates that the load balancer in Azure has redirected the request because the Application Initialization engine is still running.
  • Whether the request times out or the instance responds with errors. Either could indicate that the Application Initialization engine is still running, so the process keeps polling the site until the response changes or a timeout value is reached.

Once all instances of the web app have been validated successfully (or the timeout value has been reached), the deployment continues by re-enabling autoscaling and allows the site to be swapped after a manual validation.
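The response checks listed above can be sketched roughly as follows. This is not the deployment engine's actual code: `ProbeResult`, `WarmupState` and `classify` are hypothetical names, and treating every status of 400 or above as an "error" is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class WarmupState(Enum):
    WARMING_UP = "warming up"
    UNKNOWN = "unknown"
    READY = "ready"


@dataclass
class ProbeResult:
    """Outcome of one poll aimed at a specific instance (hypothetical shape)."""
    target_instance: str
    responding_instance: Optional[str] = None
    status: Optional[int] = None  # None means the request timed out
    headers: Dict[str, str] = field(default_factory=dict)


def classify(probe: ProbeResult) -> WarmupState:
    # Timeouts and errors may mean warmup is still running, but prove nothing,
    # so the caller should keep polling.
    if probe.status is None or probe.status >= 400:
        return WarmupState.UNKNOWN
    # HTTP header names are case-insensitive.
    header_names = {name.lower() for name in probe.headers}
    if "x-appinit-warmingup" in header_names:
        return WarmupState.WARMING_UP
    # A response from another instance suggests the load balancer rerouted
    # the request because the targeted instance is still warming up.
    if probe.responding_instance != probe.target_instance:
        return WarmupState.WARMING_UP
    return WarmupState.READY


probe = ProbeResult("instance-a", "instance-a", 200, {"X-AppInit-WarmingUp": "1"})
print(classify(probe).value)  # warming up
```

Keeping "unknown" separate from "warming up" mirrors the distinction the post makes between detected warmup and a site that only returns errors.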

What can be done to speed this up?

Local cache

When it comes to local cache, the only thing that can be done to speed it up is to keep the total file size of the web app as small as possible. It's otherwise an internal process in Azure that could be affected by, for example, network congestion in the underlying infrastructure.

Local cache can sometimes take only a few seconds to be ready, but it could also take several minutes. There is no guarantee that it will be ready on all instances at the same time, which is why the deployment engine will check all of them individually.

Application Initialization

The time the application initialization process takes depends heavily on the site and how quickly it starts. It's also affected by the number of links that have been added to the applicationInitialization section, especially if each page takes a significant time to load, since the requests are run sequentially.
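Because the requests run one after another, a rough estimate of the total warmup time is simply the sum of the individual page load times. A minimal sketch, with made-up load times and a hypothetical function name:

```python
from typing import Iterable


def estimate_warmup_seconds(page_load_seconds: Iterable[float]) -> float:
    """Sequential warmup: the total is the sum of each page's load time."""
    return sum(page_load_seconds)


# Hypothetical numbers: a slow first request (site startup), then faster pages.
print(estimate_warmup_seconds([30.0, 2.0, 1.5]))  # 33.5
```

In this made-up example the first request dominates, which is why adding a few more pages often costs relatively little.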

Configuring a minimal applicationInitialization section to save time during deployment is not recommended, since that could cause new instances to be slow in production instead. There is, however, a balance to strike between being able to scale out faster and making sure that every single page on the web app has been warmed up.

The logic used to create this section automatically during deployments has been tried and tested over thousands of deployments. Its primary goal is to make scale outs and potential restarts of instances as seamless as possible, so it's not recommended to create this section manually.

Making sure that the site in the slot is able to respond with "200 OK", however, could save several minutes. It allows the deployment engine to actually detect when the application initialization process has finished (the "X-AppInit-WarmingUp" header is only returned on successful responses) instead of trying to validate this until it hits the timeout.

At the time of writing this blog post, the timeout values are as follows:

  • If no warmup is detected (for example if the site just keeps responding with errors), it will wait for up to 10 minutes
  • If warmup is detected for at least one instance, it will wait for a maximum time of 25 minutes
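The two timeout values can be expressed as a small polling guard. This is a sketch only: the function name is hypothetical, and it assumes both limits are measured from the moment polling starts, which the post does not state explicitly.

```python
# Limits quoted in the post (valid at the time it was written).
NO_WARMUP_DETECTED_LIMIT_MIN = 10
WARMUP_DETECTED_LIMIT_MIN = 25


def should_keep_polling(elapsed_min: float, warmup_detected: bool,
                        all_ready: bool) -> bool:
    """Decide whether another polling round is worthwhile."""
    if all_ready:
        return False  # every instance validated, nothing left to wait for
    # Assumption: both limits count from the start of polling.
    if warmup_detected:
        limit = WARMUP_DETECTED_LIMIT_MIN
    else:
        limit = NO_WARMUP_DETECTED_LIMIT_MIN
    return elapsed_min < limit


print(should_keep_polling(12, warmup_detected=False, all_ready=False))  # False
print(should_keep_polling(12, warmup_detected=True, all_ready=False))   # True
```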

Why does this have to finish before I can swap the site?

As part of the swap process, Azure also tries to validate the slot before the swap is made. These checks include things like warmup and local cache to ensure that the new slot will behave as it's supposed to after the swap. If these checks fail, Azure simply blocks the swap request, so there isn't really any point in trying until the site is ready.

How can I know what took so much time in my latest deployment?

The details of the warmup process for each individual instance are logged in the detailed deployment log, which can be accessed by first opening the output log of the deployment and then clicking the "Get Detailed Log" button (see "Deployment job log output" in this article).

Mar 26, 2019

Johan Kronberg
(By Johan Kronberg, 3/27/2019 4:08:02 PM)

Is it wrong to keep it simple and just have one line of "/" in the applicationInitialization-section?

Tried that and getting:

"Failed to validate routing rule support. The error was: Failed to locate what hostname to use: Couldn't find any valid hostname by analyzing the applicationInitialization section from the slot slot"

Anders Wahlqvist
(By Anders Wahlqvist, 3/27/2019 11:17:12 PM)

@Johan, I wouldn't go as far as to say that it's wrong, since it's valid from an IIS standpoint (not recommended though, since the site most likely won't be properly warmed up using a configuration like that). The warning/error you're seeing is logged because we try to analyze that section to find out what hostname to use when validating the site and checking the warmup status. Most customers use the "builtin" one we provide, so we know that the links and hostnames in that section work, and those who add it "manually" have most likely put some thought into what they added. In this case, however, no hostname is specified at all.

If you want to keep a minimal warmup section for this site, I recommend that you simply add a valid hostname (one used by that web app, of course) using the hostName property, as is done in the example here. The error/warning you're seeing is a "soft fail" though. We will use other methods of validating the site as a fallback, so the error in itself is not a big problem in any way.
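As an illustration of the hostname lookup discussed above, the following Python sketch scans an applicationInitialization section for the first entry with a hostName attribute. The sample config, the hostname www.example.com and the function name are made up for the example, and the real deployment engine's analysis is likely more involved.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical Web.config fragment. initializationPage and hostName are the
# attribute names IIS uses for entries in the applicationInitialization section.
SAMPLE_WEB_CONFIG = """\
<configuration>
  <system.webServer>
    <applicationInitialization>
      <add initializationPage="/" hostName="www.example.com" />
      <add initializationPage="/products" hostName="www.example.com" />
    </applicationInitialization>
  </system.webServer>
</configuration>
"""


def find_warmup_hostname(web_config_xml: str) -> Optional[str]:
    """Return the first hostName found in the section, or None if there is none."""
    root = ET.fromstring(web_config_xml)
    for section in root.iter("applicationInitialization"):
        for entry in section.findall("add"):
            host = entry.get("hostName")
            if host:
                return host
    return None


print(find_warmup_hostname(SAMPLE_WEB_CONFIG))  # www.example.com
```

A section with only `<add initializationPage="/" />` makes this function return None, which corresponds to the situation that triggers the "Couldn't find any valid hostname" message.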

However, if it's a production site, I highly recommend that you add a more robust warmup section (or simply remove it and let the deployment engine do it automatically) since having one request to the default page only will very likely cause issues for new instances. Even for a preproduction site, load tests will probably be a hit or miss during scale outs.

Normally, having a larger warmup section doesn't make the deployments or scale outs that much slower either. The first few requests might take a while, but most of the time the following requests will be very fast and provide something of a safety net in case something goes wrong with the first few requests (which happens fairly often for a number of reasons). And even if it takes some time, it's usually better that it's slow before receiving production traffic rather than being slow while it's receiving it.

We've also made a few improvements to how it's handled automatically during deployments recently, so it might be worth trying that out again in case you haven't.

Not having to do anything should hopefully be even simpler :-)

Johan Kronberg
(By Johan Kronberg, 3/28/2019 9:50:50 AM)

OK! Is it recommended to acquire and add the slot hostnames as well for this config? The one I get e-mailed now is *-slot.azurewebsites.net but I saw the docs had another domain...

Anders Wahlqvist
(By Anders Wahlqvist, 3/28/2019 10:18:43 AM)

@Johan, The slot hostnames are not needed in this config, just use one of the hostnames that've been assigned to the actual web app in that environment. The "<webapp>.dxcloud.episerver.net"-address for example.

Aleksander Rajca
(By Aleksander Rajca, 5/29/2019 10:13:58 PM)

@Anders how is the swap done when there are N > 1 production instances?
Does it create N slots, which are replaced after warmup with all running instances? Or only one slot? If so, how are the rest of the production instances warmed up and swapped then?

Anders Wahlqvist
(By Anders Wahlqvist, 6/10/2019 8:28:45 AM)

@Aleksander - First, sorry for my late reply on this. We will warm up the same number of instances for the deployment slot as the production slot currently has and swap all of them at the same time to make sure the web app has enough resources to handle the current traffic load. Hope that answers your question, if not, please reach out again!

Aleksander Rajca
(By Aleksander Rajca, 6/11/2019 11:43:04 AM)

Thanks @Anders for your answer. I have one more question. After code is deployed to the slot, autoscaling is enabled. Let's assume that the application scales out before the slot is swapped. How does the warmup work on a new instance? Does it warm up both the current and the new application code at the same time?

Anders Wahlqvist
(By Anders Wahlqvist, 6/11/2019 4:23:57 PM)

That's correct @Aleksander, we enable autoscaling again after the deployment slot has been warmed up. If the site then scales out before the "complete/go live" step is started, a new instance is started that holds both the current and the new version of the code, and both are warmed up before they receive any traffic.

We will also validate the warmup status of the new instance before proceeding with the actual swap (and we will lock down the number of instances again during the complete/go live process so it doesn't scale out again until the swap has finished).
