Anonymous methods and closure

This article was originally published in VSJ, which is now part of Developer Fusion.

Anonymous methods were introduced in .NET 2.0, and while they sound like something designed to implement “dirty” shortcuts, they are a welcome addition to C#. The big problem with anonymous methods is figuring out what problem they are designed to solve. So let’s take a look at what they are for.

On one level anonymous methods are just about making delegates easier to create. A delegate is just an object that “wraps” a function call. Before you can use a delegate to wrap a function, you have to create a delegate type that has the signature of the function you want to wrap. Then you instantiate the type, wrap the function and use it. What all this means is that you have to invent multiple names for what in most cases is a single idea. For example, if you want to wrap a “Hello World” function, you first create a suitable delegate type:

delegate void MyHelloDelegateType();

…then you have to create the function:

void Hello()
{
    MessageBox.Show("Hello From a Delegate");
}

…and finally create an instance of the delegate type specifying the function that it is to wrap:

MyHelloDelegateType MyHelloDelegate1 =
    new MyHelloDelegateType (Hello);

…or equivalently:

MyHelloDelegateType MyHelloDelegate1 =
    Hello;

Calling the function via the delegate is just a matter of using its name:

MyHelloDelegate1();

You can see that we have had to invent MyHelloDelegateType, Hello, and MyHelloDelegate1, which is fine if you are going to create multiple instances of the type and wrap multiple functions, but in most cases the type, the delegate and the method are more or less a single entity. The idea of an anonymous method allows you to fuse the identity of the delegate with the function it wraps. For example, we can create an instance of MyHelloDelegateType and define the function it wraps in one step:

MyHelloDelegateType MyHelloDelegate1 =
    delegate()
    {
        MessageBox.Show(
            "Hello From a Delegate");
    };
MyHelloDelegate1();

One identifier less might not seem much of a victory, but now we can recast the code to express the fact that the delegate type is really about the signature and the delegate instance is about what actually happens, by renaming the delegate “Hello1”:

MyHelloDelegateType Hello1 =
    delegate()
    {
        MessageBox.Show("Hello From a Delegate");
    };
Hello1();

You can specify parameters in the function definition by treating the keyword “delegate” as if it were the function’s name. For example:

MyHelloDelegateType2 Hello2 =
    delegate(string Msg)
    {
        MessageBox.Show(Msg);
    };

However, for this to work we need to define another delegate type:

delegate void MyHelloDelegateType2(
    string MyString);

Notice that the identifier used as the parameter in the type definition doesn’t carry any meaning – it’s just there for the syntax. Perhaps the C-style signature specification:

delegate void
    MyHelloDelegateType2(string);

…would be better, but this doesn’t work. With the type defined we can now call the delegate in the usual way:

Hello2("Hello delegate 2");

The situation is a little more complicated than this simple example suggests. In fact the anonymous method doesn’t have to match the signature of the delegate type exactly. As long as the delegate type has no out parameters, then the anonymous method can be defined with no parameters. For example, the following is perfectly legal even though the delegate type specifies a single string parameter:

MyHelloDelegateType2 Hello3 = delegate
{
    MessageBox.Show(
        "Default message!");
};

However you still have to call the delegate with the correct signature:

Hello3("dummy");

The parameter supplied is simply thrown away, and you can see why this approach doesn’t work if there is an out parameter defined – where would the value of the out parameter come from?
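The rule about omitting the parameter list can be tried out in a small console program. This is a minimal sketch rather than code from the article: the delegate types NoArgs, OneString and WithOut and the class ParameterlessDemo are names invented here, and Console output stands in for MessageBox.

```csharp
using System;

// Hypothetical delegate types, invented for this sketch.
delegate void NoArgs();
delegate void OneString(string message);
delegate void WithOut(out int value);

static class ParameterlessDemo
{
    public static int Calls;

    public static void Main()
    {
        // An anonymous method with no parameter list converts to any
        // delegate type that has no out parameters.
        NoArgs a = delegate { Calls++; };
        OneString b = delegate { Calls++; };

        a();
        b("ignored");   // the argument is required by the type, then discarded

        // The next line does not compile, because WithOut has an out
        // parameter that the anonymous method could never assign:
        // WithOut c = delegate { };

        Console.WriteLine(Calls);   // prints 2
    }
}
```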

So what are anonymous methods good for? They certainly save one level of naming confusion, but in some cases they can remove the need for any new names at all. For example, consider the Find method of the Array class, defined as:

public static T Find<T> (T[] array,
    Predicate<T> match
)

The Predicate delegate type is defined as:

public delegate bool Predicate<T>(T obj)

Without anonymous methods you would have to define a Predicate method, wrap it in a delegate and pass it to the Find method. With an anonymous method, and assuming that A is an int array, it can be as simple as:

int result = Array.Find(
    A, delegate(int x) {
        return (x < 0);
    });

In short, anonymous methods are good for short functions that you want to use “at once”.
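As a complete program, the Array.Find call might look like the following sketch. The class name FindDemo, the helper FirstNegative and the sample data are assumptions made for illustration. Note that Array.Find returns default(T) when nothing matches, which is 0 for int.

```csharp
using System;

static class FindDemo
{
    // Returns the first negative element, or 0 (default(int)) if there is none.
    public static int FirstNegative(int[] values)
    {
        return Array.Find(values, delegate(int x) { return x < 0; });
    }

    public static void Main()
    {
        int[] a = { 3, 7, -2, 5, -9 };
        Console.WriteLine(FirstNegative(a));                   // prints -2
        Console.WriteLine(FirstNegative(new int[] { 1, 2 }));  // prints 0
    }
}
```

Because 0 is also a legitimate array value, a caller that needs to tell "not found" apart from "found 0" would be better served by Array.FindIndex, which returns -1 when there is no match.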

Closure?

So far so good: anonymous methods save on names and are good when you want a function “now” – but they have hidden depths. You will often hear JScript programmers talking about “closure”, and now you can join in because C# has “closure” – of a sort. Consider the following code:

int i = 0;
MyHelloDelegateType Hello2 = delegate()
{
    i++;
    MessageBox.Show(i.ToString());
};
Hello2();
Hello2();
Hello2();
MessageBox.Show(i.ToString());

You might find it surprising that this code is legal, let alone that it reveals that the value of i is incremented each time the delegate is called and the final statement shows that it is indeed the local variable that is changed. This behaviour is described by saying that the anonymous method “captures” the variables in the scope of its containing or outer function. Many programmers refer to this as “closure” or “lexical closure”, although there is much debate about what exactly constitutes closure, and you will find some saying that C# doesn’t support it and others that it does. The issue comes down to whether the value or the variable is captured at the time of creation. C# captures the variable and, for me at least, this is good enough to be called closure.
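That it is the variable, not a snapshot of its value, that is captured can be checked in both directions: the delegate's changes are visible to the outer method, and the outer method's changes are visible to the delegate. The sketch below uses a hypothetical delegate type Step and returns the observed values instead of showing message boxes.

```csharp
using System;

// Hypothetical delegate type, invented for this sketch.
delegate void Step();

static class CaptureDemo
{
    public static int[] Run()
    {
        int i = 0;
        Step bump = delegate { i++; };   // captures the variable i itself

        bump();
        bump();
        int afterTwoCalls = i;           // the outer method sees the increments

        i = 10;                          // the delegate sees this assignment
        bump();

        return new int[] { afterTwoCalls, i };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine(r[0]);   // prints 2
        Console.WriteLine(r[1]);   // prints 11
    }
}
```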

To demonstrate how subtle the effects of closure can be, consider the following example:

MyHelloDelegateType[] Count =
    new MyHelloDelegateType[10];
for (int i = 0; i < 10; i++) {
    Count[i] = delegate()
    { MessageBox.Show(i.ToString()); };
}

Notice that we create an array of 10 delegates and each one is the same anonymous method that simply displays the current value of i. What do you think is going to be the result of calling one of the delegates, Count[0] say? The first thing to notice is that i isn’t even in scope at the end of the for loop, so if you use:

MessageBox.Show(i.ToString());

…after the for loop then you will get a compile-time error:

“The name ‘i’ does not exist in the current context”

With this in mind you might expect that calling one of the delegates would produce the same error, but no:

for (int j = 0; j < 10; j++)
{ Count[j](); }

…works perfectly and displays the value 10 for each delegate. What happens is that the variable i is captured when each of the delegates is created, but all of the delegates share the same variable with the local environment. When the outer function changes the variable then all the delegates see the change and, in this case, the delegates’ captured copy of i slowly counts up to 10. When the loop ends the local version of the variable goes out of scope, but the captured copy of i lives on in the delegates and it has the value 10. It doesn’t make any difference if you change the final loop variable from j to i – this is a different i in a different lexical context and nothing to do with the captured variable i that the delegates are using.

Examples of closure can become more complicated than this simple for loop, and if you find yourself using such constructions you should probably reconsider and find a clearer expression of what you are trying to do. However, the principle is simple enough: the compiler creates a hidden class to hold the captured variables, and the delegates that capture a given variable share a single instance of it. If a variable is recreated each time the delegate is created then each delegate will capture a new copy. For example:

MyHelloDelegateType[] Count = new
    MyHelloDelegateType[10];
for (int i = 0; i < 10; i++) {
    int j = i;
    Count[i] = delegate() {
        MessageBox.Show(j.ToString());
    };
}

In this case the variable j is recreated each time through the loop and each delegate captures its own copy. If you now try calling each delegate in turn you will find that it now displays 0,1,2, and so on, reflecting the value of i at the time the delegate was created. Notice that j is out of scope when the loop ends, so you can’t discover what its current value is – only the captured copies survive the loop.
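One way to picture the hidden class is to write something like it by hand. The sketch below is only an approximation: the names GetValue, Captured and HiddenClassSketch are invented, and the class the compiler really generates has a different name and layout. It does reproduce both behaviours: one shared holder object gives every delegate the final value, while a new holder per iteration gives each delegate its own.

```csharp
using System;

// Hypothetical delegate type, invented for this sketch.
delegate int GetValue();

// Hand-written stand-in for the compiler's hidden class: the captured
// variable becomes a field and the anonymous method becomes a method.
sealed class Captured
{
    public int Value;
    public int Read() { return Value; }
}

static class HiddenClassSketch
{
    // One holder shared by all delegates, like capturing the loop variable i.
    public static int[] Shared()
    {
        GetValue[] readers = new GetValue[3];
        Captured scope = new Captured();
        for (scope.Value = 0; scope.Value < 3; scope.Value++)
        {
            readers[scope.Value] = scope.Read;
        }
        return InvokeAll(readers);   // every delegate reads 3
    }

    // A new holder on each pass, like declaring int j = i inside the loop.
    public static int[] PerIteration()
    {
        GetValue[] readers = new GetValue[3];
        for (int i = 0; i < 3; i++)
        {
            Captured scope = new Captured();
            scope.Value = i;
            readers[i] = scope.Read;
        }
        return InvokeAll(readers);   // delegates read 0, 1, 2
    }

    static int[] InvokeAll(GetValue[] readers)
    {
        int[] results = new int[readers.Length];
        for (int k = 0; k < readers.Length; k++)
        {
            results[k] = readers[k]();
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Shared()));        // prints 3,3,3
        Console.WriteLine(string.Join(",", PerIteration()));  // prints 0,1,2
    }
}
```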

Clearly closures are fun, but what use are they? The answer is that they provide a context for a function which can be used to provide it with additional information without the need to use additional parameters. Why wouldn’t you create some additional parameters? Most likely because the signature of the function you are trying to use isn’t under your control. For example, consider the EnumWindows API call which needs a callback function that is called for each window that it enumerates. The API call is:

[DllImport("user32.dll")]
[return: MarshalAs(
    UnmanagedType.Bool)]
static extern bool EnumWindows(
    EnumWindowsProc lpEnumFunc,
    ref IntPtr lParam);

…and the callback delegate is:

public delegate bool EnumWindowsProc(
    IntPtr hWnd, ref IntPtr lParam);

The problem with using the callback delegate is that it only has the two parameters – the handle of the current window and a pointer supplied in the call to the EnumWindows function. It is this pointer that is used to communicate between the callback function and the program needing the enumeration. Closure, however, makes communication much easier. If you need a function to find a particular dialog box specified by its Owner and its Caption string then you could write a function something like:

public IntPtr getDialog(
    IntPtr Owner, String Caption) {

Now we can define the callback delegate using the fact that the Owner and Caption parameters are in scope and so are captured by an anonymous function:

    EnumWindowsProc enumProc =
        delegate(IntPtr handle,
        ref IntPtr pointer) {

First we get the window text and compare it to Caption:

        int length = GetWindowTextLength(handle);
        StringBuilder wTitle = new
            StringBuilder(length + 1);
        GetWindowText(handle, wTitle,
            wTitle.Capacity);
        if (wTitle.ToString() == Caption) {

If they match we check that the class name is correct for a dialog box and then check that Owner is the correct window:

            int max = 100;
            StringBuilder classname =
                new StringBuilder(max);
            GetClassName(handle, classname, max);
            if (classname.ToString() == "#32770")
            {
                IntPtr Parent = GetParent(handle);
                if (Parent == Owner)
                {
                    pointer = handle;
                    return false;
                }
            }
        }
        return true;
    };

This completes the anonymous callback delegate; now we can call EnumWindows:

    IntPtr DlgHwnd = IntPtr.Zero;
    EnumWindows(enumProc, ref DlgHwnd);
    return DlgHwnd;
}

Notice that the pointer in the callback delegate is used to return the handle of the dialog box that we have found, but this too could have been achieved using closure. Without any use of closure we would have had to pack the Owner and Caption into a data structure and pass this to the callback. Closure makes things much simpler in this case.
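Returning the result through a captured variable can be sketched without any Windows API at all. In the code below, EnumerateItems is a hypothetical stand-in for EnumWindows and ItemCallback for EnumWindowsProc. Neither is a real API; they exist only to show a callback with a fixed signature receiving its input (wanted) and delivering its output (index) purely through captured variables.

```csharp
using System;

// Hypothetical callback type standing in for EnumWindowsProc:
// return false to stop the enumeration.
delegate bool ItemCallback(string item);

static class ClosureResultDemo
{
    // Hypothetical stand-in for EnumWindows. Its signature leaves no room
    // for extra data to be passed to, or returned from, the callback.
    static void EnumerateItems(string[] items, ItemCallback callback)
    {
        foreach (string item in items)
        {
            if (!callback(item)) return;
        }
    }

    public static int FindIndex(string[] items, string wanted)
    {
        int index = -1;     // captured: written by the callback on a match
        int position = 0;   // captured: counts the items seen so far

        EnumerateItems(items, delegate(string item)
        {
            if (item == wanted)
            {
                index = position;
                return false;       // found it, stop enumerating
            }
            position++;
            return true;            // keep going
        });

        return index;
    }

    public static void Main()
    {
        string[] titles = { "Open", "Save As", "Print" };
        Console.WriteLine(FindIndex(titles, "Save As"));   // prints 1
        Console.WriteLine(FindIndex(titles, "Missing"));   // prints -1
    }
}
```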


Dr. Mike James’ programming career has spanned many languages, starting with Fortran. The author of Foundations of Programming, he has always been interested in the latest developments and the synergy between different languages.
