OutOfMemoryException running EPiServer out of web context

TL;DR: If you run EPiServer outside a web context, such as in a test setup, and get OutOfMemoryExceptions, the cause might be that the cache is not trimmed in time. You can work around that by setting /configuration/system.web/caching/cache[@percentagePhysicalMemoryUsedLimit] to a lower value (you can try 95, or 50 if you don't care about cache efficiency).

The problem: Cache fills up memory unless running in the web server

I ran into an issue running EPiServer (with Commerce) in a console context for integration tests: the process ran out of memory during one of the tests. I had been working on optimizing caching, so it made sense that more memory was used, but I expected the automatic cache trimming to evict cache objects before the application ran out of memory. Doing the same operation in a web context did not cause the process to run out of memory. What was going on?

Investigation: Deep-dive into System.Web.Caching source code

Since I couldn't find any detailed documentation I started digging around in the reflected source code for System.Web.Caching.Cache and related classes (which is what is used as the default cache implementation in EPiServer). Here's what I concluded, though I can't guarantee all the conclusions are correct:

The cache monitors both the private bytes used by the process and the amount of physical memory in use on the machine. If either metric goes too high, it will start trimming the cache. You can configure the limits using the privateBytesLimit and percentagePhysicalMemoryUsedLimit attributes of the /configuration/system.web/caching/cache element in your config, but it has sensible defaults for most situations.

A weakness, though, is that it seems to skip monitoring private bytes for a non-IIS process. That metric is then effectively disabled, leaving only the total physical memory usage metric. By default the "high load" threshold of that metric is set to 99 %, and the "medium load" threshold (when I think it will start trimming the cache, though not that aggressively) is set to 97 %. Although high, that seems reasonable for a web server, given that it will free up memory by trimming the cache when it approaches this limit.

But the catch is that the cache trimming only runs every so often, at intervals ranging from two minutes down to five seconds. This means that if you put a lot of data into the cache in a short time while the cache is already under high load, you could run out of memory before the next trim runs.

Solution: Configure a lower cache limit to make it trim in time

For some reason, perhaps because it can measure the private bytes, the cache trimming was run in time to avoid the OutOfMemoryException in the web context. But I could solve my issue in the console context by lowering the total physical memory usage threshold in config, in my case in the test assembly's .dll.config, by setting percentagePhysicalMemoryUsedLimit="50". I don't really care if the cache usage is optimal in the test so I just set a very low limit to avoid these issues. Probably lowering it to 90 or even 95 % would have done the trick.
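
For reference, here is a minimal sketch of what that setting looks like in a config file. It assumes an otherwise empty configuration; in a real project you would merge the cache element into the existing system.web section rather than replace the file. The value 50 is simply what I used for my tests, not a recommendation.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.web>
    <caching>
      <!-- Start trimming when 50 % of the machine's physical memory is in use.
           50 is the deliberately low value from my test setup; something like
           90 or 95 is probably enough if you care about cache efficiency. -->
      <cache percentagePhysicalMemoryUsedLimit="50" />
    </caching>
  </system.web>
</configuration>
```

For a test project this goes in the config file that is actually loaded by the process running the tests, which in my case was the test assembly's .dll.config.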

Jan 23, 2015

steve
(By steve, 1/23/2015 9:42:13 PM)

This is a tricky black box. I've struggled with similar problems, but running inside the web context (a scheduled job). The job was running for a long time, and loaded a lot of pages in a really big site. It ate memory in a steady way, and since the box had 32GB of it, it did not feel inclined to release it. However, it got OOM exceptions prior to hitting any of the thresholds. I set it low (below 50% - which still meant the site could consume 16GB), and could see it trim the cache. The problem was that it did not release it all, and it kept increasing above the threshold many times - gradually more and more after each trim.

It looks like a memory leak of some sort, but hard to track down. We eventually ended up clearing (CacheManager.Clear) the cache periodically during this job, just to make it work. Not ideal, but something we could live with.

I spent a lot of time tweaking the cache settings, but the cache seems to have a mind of its own about what it thinks is the best way to do this.

Magnus Rahl
(By Magnus Rahl, 1/24/2015 10:47:52 AM)

It could very well be that I haven't seen the full extent of the cache trimming problem, only enough to solve the problem at hand. I too am puzzled by the trimming algorithm. In my case I could see the first trim when the process hit 500-600 MB, a trim down to about 250 MB. Then came several trims around these levels before it moved to a new level, with peaks at about 2 GB and no trims below 1200 MB. That might very well mean that the "baseline" contains objects not (only) referenced by the cache (possibly leaked). In a way that might be expected as the application runs, but it still puzzles me why it trims the cache so aggressively in the beginning and then lets it run without a leash later on.
