.NET Collection Enumeration Performance, part 1
I'm investigating the performance of various ways of enumerating collections in .NET, and have some results you might be interested in. The firstbunch concern enumeration of collections (of strings).
For reasons I'll go into another time, I'm looking into the performance of enumerating collections of strings, including string[], List<string>, IList<string>, ICollection<string>, and IEnumerable<string>.
The first point of interest is to see whether enumerating via foreach is faster or slower than using (IList<string>'s) subscript operator. In other words, which of the following three code blocks will have the best performance:
<br />// 1. via foreach: <br /> <br />foreach(string s in collection) <br />{ <br /> . . . // use s <br />} <br />// 2. via subscript (slow version): <br /> <br />for(int i = 0; i != collection.Count; ++i) <br />{ <br /> string s = collection[i]; <br /> . . . // use s <br />} <br />// 3. via subscript (faster version): <br /> <br />for(int i = 0, count = collection.Count; i != count; ++i) <br />{ <br /> string s = collection[i]; <br /> . . . // use s <br />} I tested four collection instances:
- a small array, containing four strings ("abc", "def", "ghi", "jkl")
- a small IList<string>, containing (the same) four strings
- a large array, containing 1000 randomly generated (with a fixed seed for reproducibility) strings of lengths between 1 and 32 characters
- a large IList<string>, containing (the same) 1000 strings
Each was tested in a loop such as the following (or the same but for using subscripting forms 1 & 2 described above):
<br />int total = 0; <br />for(int WARMUPS = 2; 0 != WARMUPS; --WARMUPS) <br />{ <br /> counter.Start(); <br /> total = 0; <br /> for(int i = 0; i != ITERATIONS; ++i) <br /> { <br /> foreach(string s in <strong>collection</strong>) <br /> { <br /> total += s.Length; <br /> } <br /> } <br /> counter.Stop(); <br />} The WARMPUPS stuff is to let skip any JITters. The total is counted in order to prevent the compiler from being too clever in its optimisations, and also to serve as a rudimentary check that each loop is doing the same (amount of) work. The counter is the .NET equivalent of STLSoft's performance counters, described in one of my WDJ/DDJ articles from 2003. (These are part of the Synesis Software .NET components, which will be publicly released early in 2010.)
Anyway, as I'm already going on too long for a blog post, I'll skip to the results, as follows (all times are in microseconds; ITERATIONS was 1,000,000):
| Collection Type/Size | Enumeration Method | Time |
|---|---|---|
| small array | foreach | 17,236 |
| small array | subscript (slow) | 10,412 |
| small array | subscript (fast) | 9,551 |
| large array | foreach | 4,083,337 |
| large array | subscript (slow) | 4,479,619 |
| large array | subscript (fast) | 4,482,497 |
| small list | foreach | 36,196 |
| small list | subscript (slow) | 20,846 |
| small list | subscript (fast) | 17,564 |
| large list | foreach | 8,000,430 |
| large list | subscript (slow) | 7,137,409 |
| large list | subscript (fast) | 6,315,805 |
I'm a little surprised by the small array results, since it's my understanding that the compiler will optimise a foreach on an array to a highly efficient form. equivalent to the faster subscript form, but without the implicit check on Length that is applied by the compiler on our behalf each time we subscript explicitly.
I'm not going to draw any conclusions yet, not until I've had time to look at the actual machine code in the debugger, but I am encouraged (for my "other purposes"; to be discussed another time) to see that enumeration via subscript access into a list does not cost; in fact it's actually cheaper than going via foreach!

