Paging with LINQ – performance

Facing a scenario where I needed to select EPiServerProfile objects based on different criteria and display a paged view of the results, I thought I should use LINQ. Neat syntax and readable code, but what about the performance? There are probably thousands of posts like this out there already, but here is mine :)

The test

I designed a test to compare the performance of LINQ to a simple looping scenario. Based on a simple collection, more advanced objects are created, their properties are inspected for matching criteria, and then a certain page of a certain size of the results is selected. This corresponds to my scenario: I have the list of all usernames, load each user's corresponding EPiServerProfile, and select objects based on properties of the profile.

The LINQ

This is the LINQ I use to select my objects:

   List<Dummy> selection = items.Select(num => new Dummy(num, delay)).
                                 Where(dum => dum.Good).
                                 Skip((page - 1) * pageSize).
                                 Take(pageSize).
                                 ToList();

The items object is a List&lt;bool&gt; whose values are used to construct the more complex Dummy instances I want. Each boolean value is stored in the Dummy object's Good property and used to determine whether the object should be selected. The delay variable controls how “heavy” the object is to construct in the test (simulating operations performed by the object, including database requests etc.).
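The post doesn't show the Dummy class or how the input list is built, but a minimal sketch might look like the following. The BuildItems helper name and the busy-wait loop are my assumptions; the loop simply stands in for expensive construction work such as database calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction of the Dummy class from the post.
public class Dummy
{
    public bool Good { get; private set; }

    public Dummy(bool good, int delay)
    {
        Good = good;
        // Busy-wait for roughly 'delay' ticks to simulate heavy construction.
        long start = DateTime.UtcNow.Ticks;
        while (DateTime.UtcNow.Ticks - start < delay) { }
    }
}

public static class TestData
{
    // Build the List<bool> input: on average one 'true' per 'scarcity'
    // items, randomly distributed across the collection.
    public static List<bool> BuildItems(int size, int scarcity, Random rnd)
    {
        var items = new List<bool>(size);
        for (int i = 0; i < size; i++)
        {
            items.Add(rnd.Next(scarcity) == 0);
        }
        return items;
    }
}
```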

The for loop

To manually gather the same results I use the following method:

   private static List<Dummy> ManualForSelect(List<bool> items, int page, int pageSize, int delay)
   {
       List<Dummy> selection = new List<Dummy>();
       int matches = 0;
       int skip = (page - 1) * pageSize;
       for (int i = 0; i < items.Count; i++)
       {
           Dummy dum = new Dummy(items[i], delay);
           if (dum.Good)
           {
               matches++;
               // Only collect matching objects once the skipped pages are passed
               if (matches > skip)
               {
                   selection.Add(dum);
               }
           }

           if (selection.Count == pageSize)
           {
               break;
           }
       }
       return selection;
   }

Collecting results

To get results that are as representative as possible, the two methods are compared with different collection sizes, object construction times (the delay variable above), scarcities of the objects to be selected (randomly distributed in the collection) and page sizes. For each set of variables, a number of loops is run in which random pages in the probable span are requested (1000 “valid” objects with page size 100 should give pages 1 to 10). The number of loops was set to match the number of pages, with some exceptions (I didn’t have the patience to wait for too many loops).
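The original harness isn't shown, but the timing loop described above could be sketched roughly like this. Bench, Time and runPage are assumed names; each iteration requests a random page in the probable span, and the total elapsed time is returned in milliseconds.

```csharp
using System;
using System.Diagnostics;

// Sketch of a timing loop for one set of test variables.
public static class Bench
{
    public static long Time(Action<int> runPage, int totalPages, int loops, Random rnd)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < loops; i++)
        {
            int page = rnd.Next(totalPages) + 1; // random page in 1..totalPages
            runPage(page);                       // run the LINQ or manual selection
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

The same Time call can then be pointed at the LINQ version and the manual loop; seeding both Random instances identically would make the two methods see the same sequence of pages.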

Results

The table below contains the test results:

Collection size | Loops | Delay (ticks) | Scarcity | Page size | Total pages | LINQ (ms) | For loop (ms) | Ratio
10000 | 10 | 10000 | 100 | 100 | 1 | 19226 | 5506 | 349%
10000 | 100 | 10000 | 10 | 100 | 10 | 12422 | 10915 | 114%
10000 | 1000 | 10000 | 10 | 10 | 100 | 8040 | 7923 | 101%
10000 | 100 | 10000 | 2 | 100 | 50 | 10020 | 9867 | 102%
10000 | 100 | 1000 | 100 | 100 | 1 | 60 | 20 | 300%
10000 | 1000 | 1000 | 10 | 100 | 10 | 31 | 20 | 155%
10000 | 10000 | 1000 | 10 | 10 | 100 | 22 | 17 | 129%
10000 | 100 | 10 | 100 | 100 | 1 | 65 | 14 | 464%
10000 | 1000 | 10 | 10 | 100 | 10 | 30 | 24 | 125%
10000 | 10000 | 10 | 10 | 10 | 100 | 22 | 17 | 129%
1000000 | 10 | 10 | 10000 | 100 | 1 | 628 | 343 | 183%
1000000 | 100 | 10 | 1000 | 100 | 10 | 412 | 321 | 128%
1000000 | 1000 | 1000 | 10 | 100 | 1000 | 358 | 331 | 108%
1000000 | 1000 | 1 | 10 | 100 | 1000 | 334 | 312 | 107%

 

Conclusion

It seems that the for loop is always faster (not surprisingly), but how much faster differs widely. The for loop’s greatest advantage comes when the parameters force the whole collection to be looped, which seems fair. When the requested objects are less scarce relative to the collection and page size, however, LINQ closes the gap. And with heavy construction times added for the objects, the two are practically equal (because less time is spent looping and more constructing objects).

In my scenario I’m looking to select from user profiles (probably more like ten thousand than a million) which are relatively heavy to construct (will probably include database calls at least the first time). Matches will be relatively common and page sizes will be between ten and a hundred. In this scenario I can afford the performance overhead of LINQ.

Sep 01, 2009

Guest
(By Guest, 9/21/2010 12:32:37 PM)

Interesting post Magnus!

Just out of curiosity, have you, or could you, test if this renders a different result?

List<Dummy> selection = items.Select(num => new Dummy(num, delay)).Where(dum => dum.Good).ToList().Skip((page - 1) * pageSize).Take(pageSize).ToList();
/ Joel Abrahamsson

Magnus Rahl
(By Magnus Rahl, 9/21/2010 12:32:37 PM)

I only did a few tests, but as I suppose you expected it is considerably slower to call ToList before paging. I guess that forces the LINQ to yield the Select and Where completely, instead of progressing item by item until the Skip and Take are fulfilled. Neat that it seems to understand how to enumerate efficiently in my test case though. That was actually my main concern and reason for doing the tests. I could probably have found information about how LINQ chains calls but testing was more fun.
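The streaming behaviour Magnus describes can be made visible by counting constructions: with deferred execution, Take stops pulling items as soon as the page is full, while an intermediate ToList() forces every element through Select and Where first. A small sketch (Probe is a hypothetical stand-in for Dummy that counts its constructor calls):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how many objects each pipeline actually constructs.
public class Probe
{
    public static int Constructed;
    public bool Good;
    public Probe(bool good) { Constructed++; Good = good; }
}

public static class DeferredDemo
{
    public static void Main()
    {
        List<bool> items = Enumerable.Repeat(true, 1000).ToList();
        int page = 3, pageSize = 10;

        // Streaming pipeline: Take stops pulling once the page is full,
        // so only page * pageSize objects are ever constructed here.
        Probe.Constructed = 0;
        var streamed = items.Select(b => new Probe(b))
                            .Where(p => p.Good)
                            .Skip((page - 1) * pageSize)
                            .Take(pageSize)
                            .ToList();
        Console.WriteLine(Probe.Constructed); // 30

        // Intermediate ToList(): every element is pushed through
        // Select and Where before the paging even starts.
        Probe.Constructed = 0;
        var materialized = items.Select(b => new Probe(b))
                                .Where(p => p.Good)
                                .ToList()
                                .Skip((page - 1) * pageSize)
                                .Take(pageSize)
                                .ToList();
        Console.WriteLine(Probe.Constructed); // 1000
    }
}
```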

Guest
(By Guest, 9/21/2010 12:32:37 PM)

Interesting post. It's too bad though, the LINQ code is so compact and nice looking!
/ Jonas Hjalmarsson
