Paging with LINQ – performance

Facing a scenario where I needed to select EPiServerProfile objects based on different criteria and display a paged view of the results, I thought I should use LINQ. Neat syntax and readable code, but what about the performance? There are probably thousands of posts like this out there already, but here is mine :)

The test

I designed a test to compare the performance of LINQ to a simple looping scenario. Based on a simple collection, more advanced objects are created, their properties are inspected for matching criteria, and then a certain page of a certain size of the results is selected. This corresponds to my scenario: I have the list of all usernames, load each user's corresponding EPiServerProfile, and select objects based on properties of the profile.

The LINQ

This is the LINQ I use to select my objects:

   List<Dummy> selection = items.Select(num => new Dummy(num, delay)).
                                 Where(dum => dum.Good).
                                 Skip((page - 1) * pageSize).
                                 Take(pageSize).
                                 ToList();

The items object is a List&lt;bool&gt; whose values are used to construct the more complex Dummy instances I want. Each boolean value is stored in the Dummy object's Good property and used to determine whether the object should be selected. The delay variable controls how “heavy” the object is to construct in the test (simulating operations performed by the object, including database requests etc.).
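The post doesn't show the Dummy class or how the input list is built, but a minimal sketch might look like the following. The BuildItems helper name and the busy-wait loop are my assumptions; the loop simply stands in for expensive construction work such as database calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction of the Dummy class from the post.
public class Dummy
{
    public bool Good { get; private set; }

    public Dummy(bool good, int delay)
    {
        Good = good;
        // Busy-wait for roughly 'delay' ticks to simulate heavy construction.
        long start = DateTime.UtcNow.Ticks;
        while (DateTime.UtcNow.Ticks - start < delay) { }
    }
}

public static class TestData
{
    // Build the List<bool> input: on average one 'true' per 'scarcity'
    // items, randomly distributed across the collection.
    public static List<bool> BuildItems(int size, int scarcity, Random rnd)
    {
        var items = new List<bool>(size);
        for (int i = 0; i < size; i++)
        {
            items.Add(rnd.Next(scarcity) == 0);
        }
        return items;
    }
}
```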

The for loop

To manually gather the same results I use the following method:

   private static List<Dummy> ManualForSelect(List<bool> items, int page, int pageSize, int delay)
   {
       List<Dummy> selection = new List<Dummy>();
       int matches = 0;
       int skip = (page - 1) * pageSize;
       for (int i = 0; i < items.Count; i++)
       {
           Dummy dum = new Dummy(items[i], delay);
           if (dum.Good)
           {
               matches++;
               // Only collect matching objects once the skipped pages are passed
               if (matches > skip)
               {
                   selection.Add(dum);
               }
           }

           if (selection.Count == pageSize)
           {
               break;
           }
       }
       return selection;
   }

Collecting results

To get results that are as representative as possible, the two methods are compared with different collection sizes, object construction times (the delay variable above), scarcities of the objects to be selected (randomly distributed in the collection) and page sizes. For each set of variables, a number of loops is run in which random pages in the probable span are requested (1000 “valid” objects with page size 100 should give pages 1 to 10). The number of loops was set to match the number of pages, with some exceptions (I didn’t have the patience to wait for too many loops).
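The original harness isn't shown, but the timing loop described above could be sketched roughly like this. Bench, Time and runPage are assumed names; each iteration requests a random page in the probable span, and the total elapsed time is returned in milliseconds.

```csharp
using System;
using System.Diagnostics;

// Sketch of a timing loop for one set of test variables.
public static class Bench
{
    public static long Time(Action<int> runPage, int totalPages, int loops, Random rnd)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < loops; i++)
        {
            int page = rnd.Next(totalPages) + 1; // random page in 1..totalPages
            runPage(page);                       // run the LINQ or manual selection
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

The same Time call can then be pointed at the LINQ version and the manual loop; seeding both Random instances identically would make the two methods see the same sequence of pages.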

Results

The table below contains the test results:

Collection size | Loops | Delay (ticks) | Scarcity | Page size | Total pages | LINQ (ms) | For loop (ms) | Ratio
10000 | 10 | 10000 | 100 | 100 | 1 | 19226 | 5506 | 349%
10000 | 100 | 10000 | 10 | 100 | 10 | 12422 | 10915 | 114%
10000 | 1000 | 10000 | 10 | 10 | 100 | 8040 | 7923 | 101%
10000 | 100 | 10000 | 2 | 100 | 50 | 10020 | 9867 | 102%
10000 | 100 | 1000 | 100 | 100 | 1 | 60 | 20 | 300%
10000 | 1000 | 1000 | 10 | 100 | 10 | 31 | 20 | 155%
10000 | 10000 | 1000 | 10 | 10 | 100 | 22 | 17 | 129%
10000 | 100 | 10 | 100 | 100 | 1 | 65 | 14 | 464%
10000 | 1000 | 10 | 10 | 100 | 10 | 30 | 24 | 125%
10000 | 10000 | 10 | 10 | 10 | 100 | 22 | 17 | 129%
1000000 | 10 | 10 | 10000 | 100 | 1 | 628 | 343 | 183%
1000000 | 100 | 10 | 1000 | 100 | 10 | 412 | 321 | 128%
1000000 | 1000 | 1000 | 10 | 100 | 1000 | 358 | 331 | 108%
1000000 | 1000 | 1 | 10 | 100 | 1000 | 334 | 312 | 107%

 

Conclusion

It seems that the for loop is always faster (not surprisingly), but how much faster differs widely. The for loop’s greatest advantage comes when the parameters force the whole collection to be looped, which seems fair. When the requested objects are less scarce relative to the collection and page size, however, LINQ closes the gap. And with heavy construction times added for the objects, the two are practically equal (because less time is spent looping and more constructing objects).

In my scenario I’m looking to select from user profiles (probably more like ten thousand than a million) which are relatively heavy to construct (will probably include database calls at least the first time). Matches will be relatively common and page sizes will be between ten and a hundred. In this scenario I can afford the performance overhead of LINQ.

Sep 01, 2009

Guest
(By Guest, 9/21/2010 12:32:37 PM)

Interesting post Magnus!

Just out of curiosity, have you, or could you, test if this renders a different result?

List<Dummy> selection = items.Select(num => new Dummy(num, delay)).Where(dum => dum.Good).ToList().Skip((page - 1) * pageSize).Take(pageSize).ToList();
/ Joel Abrahamsson

Magnus Rahl
(By Magnus Rahl, 9/21/2010 12:32:37 PM)

I only did a few tests, but as I suppose you expected it is considerably slower to call ToList before paging. I guess that forces the LINQ to yield the Select and Where completely, instead of progressing item by item until the Skip and Take are fulfilled. Neat that it seems to understand how to enumerate efficiently in my test case though. That was actually my main concern and reason for doing the tests. I could probably have found information about how LINQ chains calls but testing was more fun.
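The streaming behaviour Magnus describes can be made visible by counting constructions: with deferred execution, Take stops pulling items as soon as the page is full, while an intermediate ToList() forces every element through Select and Where first. A small sketch (Probe is a hypothetical stand-in for Dummy that counts its constructor calls):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how many objects each pipeline actually constructs.
public class Probe
{
    public static int Constructed;
    public bool Good;
    public Probe(bool good) { Constructed++; Good = good; }
}

public static class DeferredDemo
{
    public static void Main()
    {
        List<bool> items = Enumerable.Repeat(true, 1000).ToList();
        int page = 3, pageSize = 10;

        // Streaming pipeline: Take stops pulling once the page is full,
        // so only page * pageSize objects are ever constructed here.
        Probe.Constructed = 0;
        var streamed = items.Select(b => new Probe(b))
                            .Where(p => p.Good)
                            .Skip((page - 1) * pageSize)
                            .Take(pageSize)
                            .ToList();
        Console.WriteLine(Probe.Constructed); // 30

        // Intermediate ToList(): every element is pushed through
        // Select and Where before the paging even starts.
        Probe.Constructed = 0;
        var materialized = items.Select(b => new Probe(b))
                                .Where(p => p.Good)
                                .ToList()
                                .Skip((page - 1) * pageSize)
                                .Take(pageSize)
                                .ToList();
        Console.WriteLine(Probe.Constructed); // 1000
    }
}
```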

Guest
(By Guest, 9/21/2010 12:32:37 PM)

Interesting post. It's too bad though, the LINQ code is so compact and nice looking!
/ Jonas Hjalmarsson
