Creating the perfect Episerver integration with HttpClient

Building an integration that keeps working during heavy user load is tricky. 

Since Episerver runs on .NET, a lot of integrations involve consuming web APIs, and the key class for that is HttpClient. It's easy to build an integration with it that works under light user load. Unfortunately, this class is the worst mess that Microsoft has ever created. It looks easy, but don't be fooled. It's like nitroglycerin: if you sneeze in its general direction, it will explode in your face.

There are a couple of bottlenecks you will run into. The first is that you will run out of sockets on your server. After that, memory consumption and the number of free threads are the limits to watch under the hood.

Here is some advice on how to keep it working during heavy load (and get better performance during light load):

  1. Reuse HttpClient instances

    HttpClient is intended to be reused for many calls. Do not wrap it in a using statement (even though, weirdly enough, it has a Dispose()), and do not create a new instance for each call. If you create a new instance every time, you will lose a lot of performance at low load, and at high load you will get SocketExceptions and crash the site. A good pattern is to have an IHttpClientFactory that stores instances that can be reused:
    using System;
    using System.Collections.Concurrent;
    using System.Net;
    using System.Net.Http;

    public class HttpClientFactory : IHttpClientFactory
    {
        // One shared HttpClient per scheme, host and port.
        protected static readonly ConcurrentDictionary<string, HttpClient> HttpClientCache = new ConcurrentDictionary<string, HttpClient>();

        public HttpClient GetForHost(Uri uri)
        {
            var key = $"{uri.Scheme}://{uri.DnsSafeHost}:{uri.Port}";

            return HttpClientCache.GetOrAdd(key, k =>
            {
                var client = new HttpClient()
                {
                    /* Other setup */
                };
                // Recycle connections regularly so DNS changes are picked up (see point 3).
                var sp = ServicePointManager.FindServicePoint(uri);
                sp.ConnectionLeaseTimeout = 60 * 1000; // 1 minute
                return client;
            });
        }
    }
    So if you are building a repository class that needs an HttpClient, you can take the IHttpClientFactory as a dependency in the repository constructor and grab a shared instance from it:
    public class ProductRepository : IProductRepository
    {
        private readonly HttpClient _client;

        public ProductRepository(IHttpClientFactory httpClientFactory)
        {
            // GetForHost takes a Uri, so wrap the base url
            _client = httpClientFactory.GetForHost(new Uri("[product base url]"));
        }
    }

    Besides solving the out-of-sockets problem, this change alone gave me 30% faster calls in one project. Setting up a completely new HttpClient, including the HTTPS handshake, is an expensive operation. In .NET Core this kind of reuse is standard, but there you should use the built-in IHttpClientFactory to inject named clients into your repositories instead.
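
    One caveat with the factory above: ConcurrentDictionary.GetOrAdd may run its value factory more than once when several threads ask for the same key at the same time, so a few extra HttpClient instances can be created and thrown away during a burst. A minimal sketch of one way around that is to store a Lazy<HttpClient> instead. The class name LazyHttpClientCache is made up for this example, and the client setup is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Http;

// Hypothetical helper for illustration, not part of Episerver or .NET.
public static class LazyHttpClientCache
{
    private static readonly ConcurrentDictionary<string, Lazy<HttpClient>> Cache =
        new ConcurrentDictionary<string, Lazy<HttpClient>>();

    public static HttpClient GetForHost(Uri uri)
    {
        // Same key as the factory in the article: scheme + host + port.
        var key = $"{uri.Scheme}://{uri.DnsSafeHost}:{uri.Port}";

        // GetOrAdd may build several Lazy wrappers during a race, but only the
        // wrapper that ends up in the dictionary has its Value evaluated,
        // so only one HttpClient is created per key.
        return Cache.GetOrAdd(key, _ => new Lazy<HttpClient>(() => new HttpClient())).Value;
    }
}
```

    Two URLs on the same host share one client, while a different host gets its own.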

  2. Set ServicePoint default connection limit

    Unfortunately, .NET has a very low default limit on how many concurrent connections can be open to the same host. If you use asynchronous programming with async/await, which you should, you should really increase this value. If you don't, you will get a TaskCanceledException when you run out of connections. You can easily raise it in application startup:
    protected void Application_Start()
    {
        ...
        ServicePointManager.DefaultConnectionLimit = int.MaxValue;
        ...
    }

    Mind you, don't hammer an external API too hard with a large number of simultaneous calls. They can get angry. With great power comes great responsibility.
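
    One way to stay polite is to cap how many requests are in flight at once. Below is a minimal sketch using SemaphoreSlim. The ThrottledCaller class and its limit are assumptions for illustration; pick a number the API owner is comfortable with.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets at most maxConcurrent calls run at the same time.
public class ThrottledCaller
{
    private readonly SemaphoreSlim _gate;

    public ThrottledCaller(int maxConcurrent)
    {
        _gate = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public async Task<T> RunAsync<T>(Func<Task<T>> call)
    {
        // Waits asynchronously (no blocked thread) until a slot is free.
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            return await call().ConfigureAwait(false);
        }
        finally
        {
            // Always hand the slot back, even if the call throws.
            _gate.Release();
        }
    }
}
```

    A repository would then wrap its calls, for example throttle.RunAsync(() => _client.GetAsync(url)), where throttle is a shared ThrottledCaller instance.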

     
  3. Make sure that HttpClient respects DNS changes

    Reusing a single HttpClient has a hidden problem that you need to know about. Let's say you have a cloud environment and are swapping slots. This means that in the background the DNS is changing to another IP. A static HttpClient that lives forever keeps its connections open, and DNS is only looked up when a new connection is made, so that change won't be picked up until you restart the entire application. That's a little evil. That's why the HttpClientFactory above has this obscure setting:
    var sp = ServicePointManager.FindServicePoint(uri);
    sp.ConnectionLeaseTimeout = 60 * 1000; // 1 minute

    This will take care of any nasty DNS change that occurs while your super fast Episerver website just keeps on running. Close to light speed. And beyond.
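
    ServicePointManager is a .NET Framework mechanism and does not affect HttpClient on .NET Core. There, the closest equivalent is SocketsHttpHandler.PooledConnectionLifetime, which retires pooled connections after a set time so that the next connection does a fresh DNS lookup. A minimal sketch, assuming .NET Core 2.1 or later; the DnsAwareClient name is made up for this example.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper for illustration.
public static class DnsAwareClient
{
    public static SocketsHttpHandler CreateHandler()
    {
        return new SocketsHttpHandler
        {
            // Retire pooled connections after 1 minute so a new connection,
            // and with it a new DNS lookup, is made for later requests.
            PooledConnectionLifetime = TimeSpan.FromMinutes(1)
        };
    }

    public static HttpClient Create()
    {
        // The client itself can still be long-lived and shared.
        return new HttpClient(CreateHandler());
    }
}
```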

  4. Dispose of the HttpResponseMessage object

    HttpResponseMessage response = null;
    try
    {
        //Create call with http client and set response...
    }
    finally
    {
        if (response != null)
            response.Dispose();
    }
    I've seen some strange behaviour when I forgot this one, with TaskCanceledException as a result. The response holds the content stream, which in some cases can stay open even though you are done with it. So always dispose this object in your favorite way: either with try/finally like above or, even better, with using (var response ...) {}. This is especially important when the call failed.

  5. Set Timeout large enough to handle large files (default is 100 seconds)

    Otherwise you will also get a TaskCanceledException weirdly enough.
    //Add to httpclient factory above
    var client = new HttpClient()
    {
        Timeout = TimeSpan.FromMinutes(10)
    };

    This is easy to forget when you are on your superfast local machine downloading small files. It also has to work for a large file on a poor network, which means a long download time. HttpClient will abort the request with a TaskCanceledException if it takes longer than the default 100 seconds. Only add this one if you need it, though.
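
    To see this behaviour without a slow network, the sketch below uses a stub handler that simply waits before answering. SlowHandler, TimeoutDemo and the delays are made up for this example. It also shows one way to tell a timeout from a cancellation you asked for yourself: if your own token was not cancelled, the exception came from Timeout.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stub: answers 200 OK after a fixed delay, no network involved.
public class SlowHandler : HttpMessageHandler
{
    private readonly TimeSpan _delay;

    public SlowHandler(TimeSpan delay) { _delay = delay; }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Honors the token, which HttpClient cancels when Timeout elapses.
        await Task.Delay(_delay, cancellationToken).ConfigureAwait(false);
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}

public static class TimeoutDemo
{
    // Returns "ok", "failed", "timeout" or "cancelled" for a single GET.
    public static async Task<string> TryGetAsync(
        HttpClient client, string url, CancellationToken callerToken)
    {
        try
        {
            using (var response = await client.GetAsync(url, callerToken).ConfigureAwait(false))
            {
                return response.IsSuccessStatusCode ? "ok" : "failed";
            }
        }
        catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
        {
            // Nobody cancelled callerToken, so HttpClient.Timeout fired.
            return "timeout";
        }
        catch (OperationCanceledException)
        {
            // The caller cancelled the request on purpose.
            return "cancelled";
        }
    }
}
```

    TaskCanceledException derives from OperationCanceledException, so both catch blocks cover it.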

  6. Avoid storing large files in memory, use streams all the way with HttpCompletionOption.ResponseHeadersRead

    Do use streams instead of byte[].
    Do use HttpCompletionOption.ResponseHeadersRead. Otherwise it won't start streaming until the entire file is loaded into memory.
    Do dispose the response object, either by calling Dispose() yourself on the response object or with the using keyword as below.

    using (var response = await httpClient.GetAsync(
        "https://test.test.com/test/",
        HttpCompletionOption.ResponseHeadersRead))
    {
        if (response.IsSuccessStatusCode)
        {
            // ReadAsStreamAsync returns a Task<Stream>, so it must be awaited
            using (var stream = await response.Content.ReadAsStreamAsync())
            {
                //Save to disk using the stream
            }
        }
    }


    Stream it directly to the user, to a file or wherever you want it. Using a byte[] will create a very expensive object in memory in the background. During heavy load with many files you will end up spending a lot of memory and CPU just juggling objects on the large object heap. Use streams all the way to the destination.
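
    Putting the three rules together, a download helper could look like the sketch below. FileDownloader, DownloadToFileAsync and the buffer size are assumptions for illustration, not an existing API.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper for illustration.
public static class FileDownloader
{
    // Streams the response body straight to disk.
    public static async Task DownloadToFileAsync(HttpClient client, Uri url, string path)
    {
        // ResponseHeadersRead returns as soon as the headers arrive,
        // so the body is not buffered in memory first.
        using (var response = await client.GetAsync(
            url, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false))
        {
            // Throws on a non-success status; the using block still disposes the response.
            response.EnsureSuccessStatusCode();

            using (var source = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
            using (var target = new FileStream(
                path, FileMode.Create, FileAccess.Write, FileShare.None,
                bufferSize: 81920, useAsync: true))
            {
                // Copies chunk by chunk; no byte[] the size of the file is allocated.
                await source.CopyToAsync(target).ConfigureAwait(false);
            }
        }
    }
}
```

    The same shape works for streaming to the user: swap the FileStream for the outgoing response stream.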

  7. Enable support for gzipped response from server

    HttpClientHandler handler = new HttpClientHandler()
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    };
    var client = new HttpClient(handler)
    {
        /* Other setup */
    };

    Why send more bytes than you have to?

Jan 15, 2020

Johan Kronberg
( By Johan Kronberg, 1/16/2020 7:31:51 AM)

Good one! I think it's also wise to use:

new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate }

Daniel Ovaska
( By Daniel Ovaska, 1/16/2020 9:50:53 AM)

Good one Johan! Adding to list. Thx!

( 1/16/2020 3:10:33 PM)

Definitely bookmarking this page!

Johan Kronberg
( By Johan Kronberg, 1/16/2020 3:14:06 PM)

Jeroen has some findings in this area as well: https://jstemerdink.blog/2019/10/03/speed-up-your-site/
