Why boxing doesn't keep me awake at nights

Blog: Jon Skeet's Coding Blog
Posted: 08 Oct 2008 at 19:42

I'm currently reading the (generally excellent) CLR via C#, and I've recently hit the section on boxing. Why is it that authors feel they have to scaremonger about the effects boxing can have on performance?

Here's a piece of code from the book:

using System;

public sealed class Program {
   public static void Main() {
      Int32 v = 5;   // Create an unboxed value type variable.

#if INEFFICIENT
      // When compiling the following line, v is boxed
      // three times, wasting time and memory
      Console.WriteLine("{0}, {1}, {2}", v, v, v);
#else
      // The lines below have the same result, execute
      // much faster, and use less memory
      Object o = v;

      // No boxing occurs to compile the following line.
      Console.WriteLine("{0}, {1}, {2}", o, o, o);
#endif
   }
}

In the text afterwards, the author reiterates the point:

This second version executes much faster and allocates less memory from the heap.

This seemed like an overstatement to me, so I thought I'd try it out. Here's my test application:

using System;
using System.Diagnostics;

// Define exactly one of the symbols below at compile time to choose
// which variant is timed.
public class Test
{
    const int Iterations = 10000000;

    public static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
#if CONSOLE_WITH_BOXING
            Console.WriteLine("{0} {1} {2}", i, i, i);
#elif CONSOLE_NO_BOXING
            object o = i;
            Console.WriteLine("{0} {1} {2}", o, o, o);
#elif CONSOLE_STRINGS
            string s = i.ToString();
            Console.WriteLine("{0} {1} {2}", s, s, s);
#elif FORMAT_WITH_BOXING
            string.Format("{0} {1} {2}", i, i, i);
#elif FORMAT_NO_BOXING
            object o = i;
            string.Format("{0} {1} {2}", o, o, o);
#elif FORMAT_STRINGS
            string s = i.ToString();
            string.Format("{0} {1} {2}", s, s, s);
#elif CONCAT_WITH_BOXING
            string.Concat(i, " ", i, " ", i);
#elif CONCAT_NO_BOXING
            object o = i;
            string.Concat(o, " ", o, " ", o);
#elif CONCAT_STRINGS
            string s = i.ToString();
            string.Concat(s, " ", s, " ", s);
#endif
        }
        sw.Stop();
        // The timing goes to stderr so that it still shows up when
        // stdout is redirected.
        Console.Error.WriteLine("{0}ms", sw.ElapsedMilliseconds);
    }
}

I compiled the code with one symbol defined each time, with optimisations and without debug information, and ran it from a command line, writing to nul (i.e. no disk or actual console activity). Here are the results:

Symbol               Run 1 (ms)  Run 2 (ms)  Run 3 (ms)  Average (ms)
CONSOLE_WITH_BOXING       33054       33898       33381         33444
CONSOLE_NO_BOXING         34638       32423       33294         33451
CONSOLE_STRINGS           29259       29071       26683         28337
FORMAT_WITH_BOXING        17143       18100       16389         17210
FORMAT_NO_BOXING          15814       15936       15222         15657
FORMAT_STRINGS             9178        9077        8742          8999
CONCAT_WITH_BOXING        12056       14304       11329         12563
CONCAT_NO_BOXING          11949       13145       11628         12240
CONCAT_STRINGS             5833        6263        5713          5936

So, what do we learn from this? Well, a number of things:

  • As ever, microbenchmarks like this are pretty variable. I tried to do this on a "quiet" machine, but as you can see the results varied quite a lot. (At times there was over two seconds between the best and worst runs for a single configuration!)
  • The difference due to boxing with the original code in the book is basically inside the "noise".
  • The dominant cost in the statement is writing to the console, even when the output isn't actually going anywhere real.
  • The next most important factor is whether we convert to string once or three times.
  • The next most important factor is whether we use String.Format or Concat.
  • The least important factor is boxing.
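To make the three variants in each group concrete, here's a minimal sketch showing where the int turns into an object or a string in each case. The class and method names are mine, purely for illustration; they aren't from the book or from the test application. All three methods produce the same text. They differ only in how many boxes are created and in how many times the number is converted to a string.

```csharp
using System;

public static class FormattingVariants
{
    // i is boxed three times, and the formatter converts each box to text.
    public static string WithBoxing(int i)
    {
        return string.Format("{0} {1} {2}", i, i, i);
    }

    // i is boxed once; that single box is still converted to text three times.
    public static string BoxedOnce(int i)
    {
        object o = i;
        return string.Format("{0} {1} {2}", o, o, o);
    }

    // i is converted to text once; strings are reference types, so nothing is boxed.
    public static string ConvertedOnce(int i)
    {
        string s = i.ToString();
        return string.Format("{0} {1} {2}", s, s, s);
    }

    public static void Main()
    {
        Console.WriteLine(WithBoxing(5));     // 5 5 5
        Console.WriteLine(BoxedOnce(5));      // 5 5 5
        Console.WriteLine(ConvertedOnce(5));  // 5 5 5
    }
}
```

Seen this way, the book's "efficient" version only saves two of the three boxes, and leaves all three string conversions in place.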

Now I don't want anyone to misunderstand me - I agree that boxing is less efficient than not boxing, where there's a choice. Sometimes (as here, in my view) the "more efficient" code is slightly less readable - and the efficiency benefit is often negligible compared with other factors. Exactly the same thing happened in Accelerated C# 2008, where a call to Math.Pow(x, 2) was the dominant factor in a program again designed to show the efficiency of avoiding boxing.

The performance scare of boxing is akin to that of exceptions, although I suppose it's more likely that boxing could cause a real performance concern in an otherwise-well-designed program. It used to be a much more common issue, of course, before generics gave us collections which don't require boxing/unboxing to add/fetch data.
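As a rough illustration of the collections point, here's a sketch contrasting the non-generic ArrayList with the generic List<int>. The class and method names are hypothetical, and I'm deliberately not attaching any timings to it: it only shows where the boxing and unboxing happen, not how much they cost.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionBoxing
{
    // ArrayList stores object references, so every int is boxed on the way in
    // and has to be unboxed with a cast on the way out.
    public static long SumWithArrayList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);       // boxing: Add takes an object
        }
        long sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;  // unboxing
        }
        return sum;
    }

    // List<int> stores the ints directly, so there is no boxing or unboxing.
    public static long SumWithGenericList(int count)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        long sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithArrayList(1000));    // 499500
        Console.WriteLine(SumWithGenericList(1000));  // 499500
    }
}
```

The generic version is also the more readable of the two, which is the nice case: there's no trade-off to agonise over.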

In short: yes, boxing has a cost. But please look at it in context, and if you're going to start making claims about how much faster code will run when it avoids boxing, at least provide an example where it actually contributes significantly to the overall execution cost.
