
harry mohammed
Feb 26, 2013

CMS editor server instance crashing main site

 

Recently I was involved in supporting a new large-scale implementation for a global client.

The implementation had gone through full testing and passed with flying colours. There were the usual cosmetic bugs and certain “known bugs” that the client had accepted.

We finally got the go-ahead to switch the new site on. This is where the real fun started.

Once the load balancer changes were made and the site was publicly available, key functionality suffered severe performance problems and the site ultimately crashed. When I say crashed, I mean the servers stopped responding.

The configuration of the implementation was as follows:

  • 1 CMS editor server
  • Front end presentation servers with CMS access switched off.
  • Akamai CDN

It seemed a standard setup, nothing new.

What we found was that cached objects were being invalidated for data sources from a different provider. After a while the front end presentation servers would stop responding and cause a failover.

An initial investigation was inconclusive and pointed at Akamai not caching correctly.

The goose chase started and a series of remedial steps were taken:

  1. Monitor the DB
  2. Look at deactivating the many languages.

Monitoring the DB showed masses of deadlocks, and the dreaded SQL query initiated by FindPagesWithCriteria was causing the DB to lock up.

However, this was not actually the issue.

Thinking about it, bringing up the CMS editor application was what caused the presentation servers and DB to max out.

So it seemed logical that the CMS was causing the issue.

To prove the point, we stopped the CMS server. The issue went away immediately and the site responded as expected.

After looking at the implementation, we identified the cause as Remote Events, which are standard built-in EPiServer functionality and which I had never seen cause an issue before.

What we realised was that the CMS and front end instances were configured to listen to sites that were not actually active. (There were 2 other sites that weren’t switched on.)

So what was happening was this: the CMS was sending out events to all the presentation servers, announcing itself and trying to register itself, which caused the presentation servers to invalidate their caches. With users still browsing the site, every server went back to the DB to repopulate its cache, which maxed out the DB and crashed the sites.
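
To illustrate why repeated cache invalidation hurts so much, here is a minimal Python sketch. It is not EPiServer code: the classes FakeDatabase and PresentationServer, and the numbers used, are made up for illustration. It simply counts how many database queries are needed when the caches are left alone versus when an event clears them before every round of traffic.

```python
class FakeDatabase:
    """Stand-in for the real DB; it only counts the queries it receives."""

    def __init__(self):
        self.queries = 0

    def load_page(self, page_id):
        self.queries += 1
        return f"content for page {page_id}"


class PresentationServer:
    """Hypothetical front end server with a simple in-memory page cache."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get_page(self, page_id):
        # Only hit the database on a cache miss.
        if page_id not in self.cache:
            self.cache[page_id] = self.db.load_page(page_id)
        return self.cache[page_id]

    def on_remote_event(self):
        # Assumption: the incoming event wipes the whole cache.
        self.cache.clear()


def simulate(rounds, pages, server_count, invalidate_every_round):
    """Return the total number of DB queries for the given traffic pattern."""
    db = FakeDatabase()
    servers = [PresentationServer(db) for _ in range(server_count)]
    for _ in range(rounds):
        if invalidate_every_round:
            for server in servers:
                server.on_remote_event()
        # Visitors request every page on every server once per round.
        for server in servers:
            for page_id in range(pages):
                server.get_page(page_id)
    return db.queries


quiet = simulate(rounds=10, pages=50, server_count=4, invalidate_every_round=False)
noisy = simulate(rounds=10, pages=50, server_count=4, invalidate_every_round=True)
print(quiet, noisy)  # 200 2000
```

With the caches left alone, each server loads each page once. With an invalidation before every round, every round pays the full cost again, so the DB load grows with the number of events rather than the number of pages.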

To fix this, do the following:

  1. Set the scheduler to false for all servers and all sites in episerver.config.
  2. If other sites are not running in IIS, make sure to comment them out in the sites section of episerver.config.
  3. Delete the content of the automaticSiteMapping section in episerverFramework.config on all servers. It is recreated automatically on site startup.
  4. Delete the table tblsiteConfig. It is recreated on startup.
  5. Make sure no scheduled jobs are active in admin mode.
  6. Restart the front end sites.
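
With several servers and sites, step 1 is easy to miss on one of them. The following is a rough Python sketch of a check you could run against each copy of episerver.config. It assumes each site element has a siteId attribute and a siteSettings child carrying an enableScheduler attribute, and it treats a missing attribute as enabled; verify both assumptions against your own config before relying on it. The sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the check works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def sites_with_scheduler_enabled(config_xml):
    """Return the siteId of every site whose scheduler is not set to false.

    Assumed layout: <site siteId="..."><siteSettings enableScheduler="..."/></site>
    """
    root = ET.fromstring(config_xml)
    offenders = []
    for element in root.iter():
        if local_name(element.tag) != "site":
            continue
        for child in element:
            if local_name(child.tag) != "siteSettings":
                continue
            # Assumption: a missing attribute means the scheduler is on.
            if child.get("enableScheduler", "true").lower() != "false":
                offenders.append(element.get("siteId", "(no siteId)"))
    return offenders


# Invented sample, not taken from the real implementation.
SAMPLE = """
<episerver>
  <sites>
    <site siteId="MainSite">
      <siteSettings enableScheduler="false" />
    </site>
    <site siteId="DormantSite">
      <siteSettings enableScheduler="true" />
    </site>
  </sites>
</episerver>
"""

print(sites_with_scheduler_enabled(SAMPLE))  # ['DormantSite']
```

An empty list means every site in that file has the scheduler switched off.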

 

The point of the above is: when switching a site to live, make sure that only what is needed is running.
