Making generics add up

This article was originally published in VSJ, which is now part of Developer Fusion.

When you first learn about generics, you can’t help but sense the potential power for expressing type-independent ideas. For example, you might easily think up as your first example of a generic method something that adds things together:

T Sum<T>(T a,T b)
{ return a+b; }

After all, addition is a concept that is largely type-independent, and this all makes sense when used as in:

Sum<int>(1,2);

Of course if you try this out you will discover that the compiler returns:

Operator '+' cannot be applied to operands of type 'T' and 'T'

This would all be very reasonable in (say) C++, where a template of this sort works as expected, but C# generics are not C++ templates. C# generics are instantiated at runtime but type checked at compile time. A little thought should reveal the error in the initial thinking about the example. Not all objects support the addition operation and, as T can be any type, at compile time the compiler simply cannot determine what a+b means when the type of a and b isn’t yet determined. But wait, surely we can do better? After all, when we give T a type value and write:

Sum<int>(1,2);

…the type of a and b is now determined, so the addition operation can be applied. This is, of course, yet more wrong thinking. Type safety is all about what you can check at compile time, and a type safe program doesn’t have to be run to determine its type correctness. A slightly more subtle argument is that it IS possible to determine the type of a and b in the call at compile time, and hence the compiler could perform the necessary type checking. This is true, but at the moment the compiler isn’t clever enough to derive and use instance information in this way, as it is a difficult problem given the range of ways the generic function can be called.

While the difficulty of creating a generic that can add two integers isn’t something to lose sleep over, the same problem occurs in many more important situations. For example, if you need to find the total of values stored in a generic list you might use:

public static T Sum<T>(List<T> list) {
    T sum=default(T);
    for(int i=0;i<list.Count;i++)
    	sum = sum + list[i];
    return sum;
}

This too doesn’t work for the same reasons as given above. The solutions presented in this article work with adding simple generic types, but they are equally applicable to the problem of summing a generic list.

T unbound to T bound

At this point you might be starting to think that generics aren’t so powerful an idea. If you specify a type as T with no constraints, i.e. T is an unbound type parameter, then you essentially have to regard all objects of type T as being instances of System.Object. What this means is that you can use assignment and any of Equals, GetHashCode, GetType and ToString, and nothing else. All of which means that the previous generic Sum method is best thought of as being equivalent to:

object Sum<T>(object a, object b)
{   return a + b; }

…which makes it very clear why there is a problem with the attempt to add a and b together. This makes generics look a lot less than powerful, but don’t give up just yet – there are ways of uncovering their real power.
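To make the point concrete, here is a minimal sketch of what an unbound T does allow. The class and method names (UnboundDemo, Describe) are invented for this illustration and are not part of the original example:

```csharp
using System;

static class UnboundDemo
{
    // With no constraint, T exposes only the members of System.Object.
    public static string Describe<T>(T a, T b)
    {
        T copy = a;                      // assignment is allowed
        bool same = copy.Equals(b);      // Equals is inherited from System.Object
        // return a + b;                 // would not compile: '+' is not defined for T
        return copy.GetType().Name + ":" + copy.ToString() + ":" + same;
    }

    static void Main()
    {
        Console.WriteLine(Describe(1, 2));   // prints "Int32:1:False"
    }
}
```

Everything used inside Describe is available on every .NET object, which is exactly why the compiler accepts it without knowing what T is.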

Constraints and Interfaces

The simplest way of increasing the range of operations a generic method can use is to put a base class constraint on the type parameter, i.e. to bind the T to a class hierarchy. The compiler can deduce from the binding what methods and properties T supports, and allows these to be used within the generic method. To show this in action we can build a simple “number” class that wraps a standard int and provides a “+” operator:

class MyInt {
    private int m_value;
    public MyInt(int a)
    { m_value = a; }
    static public MyInt
    	operator +(MyInt a,MyInt b) {
    	return new MyInt(
    		a.m_value + b.m_value); }
}

This defines a simple constructor for the object and overloads the + operator so that two such objects can be added together in a reasonable way.

Using this you can now define a fairly successful generic method that will add together any MyInt or MyInt derived objects:

T Sum<T>(T a, T b) where T:MyInt
{ return a+b; }

I say “fairly” successful because now the addition is accepted as valid between a pair of objects of type T, but there is still a problem. The compiler now complains that it can’t convert an object of type MyInt to an object of type T. This is because it can work out that objects of type T have a static “+” method that is guaranteed to add two objects of type T, but it can only determine that the method returns an object of type MyInt, which could be different from T. For example, if you derive another class, MySmallInt, from MyInt the “+” operator might add two MySmallInt objects and return a MyInt object, which isn’t a T (i.e. a MySmallInt). More to the point, although we know that the “+” operator is always going to have the signature:

T operator +(T,T)

…there is nothing enforcing this restriction – in short there is no way of making it clear that the “+” returns a T. The only solution is to use a generic cast:

return (T)(a + b);

…which compiles, but looks strange, and the cast is only checked at runtime.
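The following sketch shows what the generic cast does in practice, under the assumption that a derived class simply inherits the “+” operator from MyInt. The Value property and the CastDemo class are additions made for this illustration, and MySmallInt is the hypothetical derived class mentioned above:

```csharp
using System;

class MyInt
{
    private int m_value;
    public MyInt(int a) { m_value = a; }
    public int Value { get { return m_value; } }   // added for the demo only
    public static MyInt operator +(MyInt a, MyInt b)
    {
        return new MyInt(a.m_value + b.m_value);
    }
}

// Hypothetical derived class: it inherits '+', which still returns a plain MyInt.
class MySmallInt : MyInt
{
    public MySmallInt(int a) : base(a) { }
}

static class CastDemo
{
    public static T Sum<T>(T a, T b) where T : MyInt
    {
        return (T)(a + b);   // accepted by the compiler, verified only at runtime
    }

    // Returns true if summing two MySmallInt objects fails the cast.
    public static bool ThrowsForDerived()
    {
        try { Sum(new MySmallInt(1), new MySmallInt(2)); return false; }
        catch (InvalidCastException) { return true; }
    }

    static void Main()
    {
        Console.WriteLine(Sum(new MyInt(1), new MyInt(2)).Value);   // prints 3
        Console.WriteLine(ThrowsForDerived());                      // prints True
    }
}
```

When T is MyInt itself the cast succeeds, but when T is MySmallInt the operator hands back a MyInt that is not a MySmallInt, so the cast throws. This is the gap between what the constraint promises and what the operator actually returns.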

You can try and get around the need to cast by abandoning the use of operator overloading and using an Add method:

public MyInt Add(MyInt a, MyInt b)
{ return new MyInt(
    a.m_value+ b.m_value); }

But this doesn’t help very much because you generate exactly the same problem trying:

T Sum<T>(T a, T b) where T:MyInt
{ return a.Add(a,b); }

Again you have to use a generic cast:

return (T)(a.Add(a, b));

However this change isn’t entirely fruitless, because now we can consider using an Interface as a constraint instead of a base class. You can’t include an operator overloading function within an Interface specification, but there is nothing wrong with Add. The idea here is that instead of relying on constraining the generic to the MyInt class and its derived classes, we simply insist that any class used with the generic method has to implement the ISummable interface:

T Sum<T>(T a, T b) where T:ISummable
{ return a.Add(a,b); }

That is, any type T used in the generic function can be assumed to have implemented ISummable. The definition of the ISummable interface is just:

interface ISummable
{ MyInt Add(MyInt a, MyInt b); }

We also have to change MyInt so that it implements ISummable (the Add interface method is already implemented):

class MyInt :ISummable

Following this nothing works because the ISummable interface isn’t generic. It always expects to have an Add function that works with MyInt objects, and this causes a compiler error when you try to use it with objects of type T. We need to change this to a generic interface specified using a type parameter T:

interface ISummable<T>
{ T Add(T a, T b); }

This quite clearly gives the form of the signature of the Add method. There is only one other change: we need to provide a type when the interface is implemented:

class MyInt:ISummable<MyInt>

No other changes are needed as the type specification means that Add has to be defined exactly as it already is within MyInt:

public MyInt Add(MyInt a, MyInt b)
{ 
    return new MyInt(
    	a.m_value+ b.m_value);
}

The generic Sum method can now be written:

T Sum<T>(T a, T b) where
    T:ISummable<T>
{
    return a.Add(a,b);
}

Given one small change – the addition of <T> to the interface specification – the compiler can now work out that Add has the signature T Add(T,T) and permits it as the return value in the generic without the need for a cast. Notice that now the generic Sum method will work with any class that implements the ISummable<T> Interface, and this makes it possible to use generics across, as well as within, class hierarchies.
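As a sketch of how this carries over to the list-summing problem raised earlier, the same constraint can be applied to a method that takes a List<T>. The Value property and the ListSumDemo class are assumptions added for the illustration, and starting from the first element is just one way of avoiding the need for a generic “zero”:

```csharp
using System;
using System.Collections.Generic;

interface ISummable<T>
{
    T Add(T a, T b);
}

class MyInt : ISummable<MyInt>
{
    private int m_value;
    public MyInt(int a) { m_value = a; }
    public int Value { get { return m_value; } }   // added for the demo only
    public MyInt Add(MyInt a, MyInt b)
    {
        return new MyInt(a.m_value + b.m_value);
    }
}

static class ListSumDemo
{
    // Folds Add over a non-empty list, starting from the first element.
    public static T Sum<T>(List<T> list) where T : ISummable<T>
    {
        if (list.Count == 0)
            throw new ArgumentException("list must not be empty");
        T sum = list[0];
        for (int i = 1; i < list.Count; i++)
            sum = sum.Add(sum, list[i]);   // no cast needed: Add returns T
        return sum;
    }

    static void Main()
    {
        List<MyInt> list = new List<MyInt>();
        list.Add(new MyInt(1));
        list.Add(new MyInt(2));
        list.Add(new MyInt(3));
        Console.WriteLine(Sum(list).Value);   // prints 6
    }
}
```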

The same method of defining a generic interface can be used whenever you need to implement a generic that uses class methods – define a generic interface and make sure all classes that are designed to work with the generic method inherit one way or another from the generic interface. There is one problem however – you can’t add interface support to primitive types or sealed classes.

Summing primitives

There is lots of discussion about how .NET generics can be improved to allow basic operations on primitive types. The most logical thing would be to define an IArithmetic<T> interface, implemented by all of the primitive arithmetic types, that could be used as a constraint when defining generic methods, but this might slow things down. For the moment the best solution to generically summing primitives, from the point of view of clarity, is to define a suitable class or struct that wraps the primitive type and implements a suitable constraining interface, i.e. something similar to the MyInt class and the ISummable interface. The big problem with this approach is that the wrappers are likely to be inefficient compared to using primitive types, a problem magnified by their use in Lists and other generic collection types.

An alternative, more lightweight, approach is to define a generic calculator object or struct that implements the same sort of generic interface used as a constraint, and use this to work with the primitive type. After all, we can’t add methods to the primitive types using inheritance, but we can supply a new object that knows how to perform arithmetic on them, e.g. a calc.Add(int,int) method.

For example, to add two integers we need an ISummable interface as before:

interface ISummable<T>
{
    T Add(T a, T b);
}

For each of the native types we need a Calc struct that does the work with the appropriate type:

struct IntCalc:ISummable<int>
{
    public int Add(int a, int b)
    {
    	return a+ b;
    }
}

Notice that we need a suitably named type (class or struct) for each primitive type: IntCalc, Int64Calc and so on.

Finally we can implement the new generic sum method:

T Sum<T, C>(T a, T b) where C :
    ISummable<T>,new()
{
    C calc = new C();
    return calc.Add(a, b);
}

This needs two type parameters because we need to give the primitive type being summed, T, and the type of the calculator C that will do the summing. The clever part is the way that the C calculator type is constrained by the interface specified with the T type. After this we can create the calculator, which has a method with signature T Add(T,T), and use it to return the result. Notice the use of the “new()” constraint on type C. This means that the type has to have a public parameterless constructor, which can be used to create instances of the type, as in:

C calc = new C();

The generic sum is called using both the type to be added and the type of the calculator object, for example:

int temp = Sum<int,IntCalc>(1, 2);

The only problems with this approach are the need to quote two type parameters, and the effort needed to maintain a range of calculator objects, one for each type. But currently this seems to be the best we can do.
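A possible extension of the calculator idea to the earlier list-summing problem is sketched below. DoubleCalc and the CalcDemo class are invented for the illustration, and the sketch assumes that default(T) is an acceptable starting value, which holds for the numeric primitives where it is zero:

```csharp
using System;
using System.Collections.Generic;

interface ISummable<T>
{
    T Add(T a, T b);
}

struct IntCalc : ISummable<int>
{
    public int Add(int a, int b) { return a + b; }
}

// Hypothetical second calculator, one per primitive type as described above.
struct DoubleCalc : ISummable<double>
{
    public double Add(double a, double b) { return a + b; }
}

static class CalcDemo
{
    public static T Sum<T, C>(List<T> list) where C : ISummable<T>, new()
    {
        C calc = new C();
        T sum = default(T);                  // assumption: zero for numeric primitives
        for (int i = 0; i < list.Count; i++)
            sum = calc.Add(sum, list[i]);
        return sum;
    }

    static void Main()
    {
        List<int> ints = new List<int>(new int[] { 1, 2, 3 });
        List<double> doubles = new List<double>(new double[] { 0.5, 1.5 });
        Console.WriteLine(Sum<int, IntCalc>(ints));           // prints 6
        Console.WriteLine(Sum<double, DoubleCalc>(doubles));
    }
}
```

Both type arguments still have to be quoted at the call site, because the compiler cannot infer C from the arguments.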

On reflection

Of course there is yet another way to make a generic add up – reflection. This works in the same way as late binding to a method on a general object. This has the advantage that we don’t have to make any special modifications to the class being used in the call to the generic method, and we don’t have to impose conditions on the type parameter T. So in this case the generic Sum method starts off:

T Sum<T>(T a, T b)
{

We then get a Type object for the first parameter, and construct an object array holding the parameters we need to pass to the Add method:

   Type RT = a.GetType();
    object[] param = new object[2];
    param[0] = a;
    param[1] = b;

There is no particular reason for choosing the first parameter to construct a Type object on, and in practice you probably should check that the second parameter also supports the Add method. Now we can use the InvokeMember method to call the named method (BindingFlags is defined in the System.Reflection namespace):

   T result = (T) RT.InvokeMember(
    		"Add",
    		BindingFlags.InvokeMethod |
    		BindingFlags.Public |
    		BindingFlags.Instance,
    		null,
    		a,
    		param);
    return result;
}

Notice the use of the cast to convert from object to T. This generic method can be used with any types that provide the Add method with the correct signature. You can even modify this so that it invokes the “+” operator if it is overloaded in the class. Simply change the InvokeMember call to:

   T result = (T) RT.InvokeMember(
    		"op_Addition",
    		BindingFlags.InvokeMethod |
    		BindingFlags.Public |
    		BindingFlags.Static,
    		null,
    		a,
    		param);

Notice that the name of the operation is “op_Addition” and it has to be specified as a Static method. Sadly this doesn’t work with primitive types such as int because they use the built-in addition operator, which can’t be invoked in this way. The only way to do this seems to be dynamic code generation, but for me at least this is a step too far.
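Putting the fragments together, a complete version of both reflection-based variants might look like the sketch below. The method names SumByMethod and SumByOperator, the Value property and the ReflectionDemo class are assumptions made for the illustration. The static variant passes null as the target, since a target object is not needed for a static method:

```csharp
using System;
using System.Reflection;

class MyInt
{
    private int m_value;
    public MyInt(int a) { m_value = a; }
    public int Value { get { return m_value; } }   // added for the demo only
    public MyInt Add(MyInt a, MyInt b)
    {
        return new MyInt(a.m_value + b.m_value);
    }
    public static MyInt operator +(MyInt a, MyInt b)
    {
        return new MyInt(a.m_value + b.m_value);
    }
}

static class ReflectionDemo
{
    // Late-bound call to an instance method named Add.
    public static T SumByMethod<T>(T a, T b)
    {
        Type rt = a.GetType();
        object[] param = new object[] { a, b };
        return (T)rt.InvokeMember("Add",
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null, a, param);
    }

    // Late-bound call to the overloaded '+' operator (a static method).
    public static T SumByOperator<T>(T a, T b)
    {
        Type rt = a.GetType();
        object[] param = new object[] { a, b };
        return (T)rt.InvokeMember("op_Addition",
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Static,
            null, null, param);
    }

    static void Main()
    {
        Console.WriteLine(SumByMethod(new MyInt(1), new MyInt(2)).Value);     // prints 3
        Console.WriteLine(SumByOperator(new MyInt(1), new MyInt(2)).Value);   // prints 3
    }
}
```

Neither method places any constraint on T, so a type without a suitable Add method or “+” operator is only detected when the call is made.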

Where next?

It is to be hoped that a richer set of constraints for generics will be introduced into the language in the future, but they haven’t made it into the next version of C# as part of the Orcas upgrade. Until then you need to be aware of the limitations of unconstrained generic types, and the problems of using class and interface-based constraints. Sometimes these are enough to make a generic implementation just too complicated.


Dr. Mike James’ programming career has spanned many languages, starting with Fortran. The author of Foundations of Programming, he has always been interested in the latest developments and the synergy between different languages.
