Getting started with IL

This article was originally published in VSJ, which is now part of Developer Fusion.
If you already program in almost any .NET language, you will know that your code isn’t compiled to machine code but to Microsoft Intermediate Language (MSIL), or IL. In this sense IL is the assembly language of .NET, and as with all assembly languages, a knowledge of it helps you understand how things work and how to make them work better. However, this assembly language isn’t quite what you might expect. If you already know a machine assembly language such as x86, PowerPC or PIC, then you will be prepared for some of the low-level ideas in IL, but you might well be shocked to discover how “high” this intermediate language is. Indeed, it can be argued that IL is much easier to understand if you already program in, say, C#.

The key features of IL are:

  1. It’s a stack-oriented language
  2. It’s object-oriented
  3. It’s strongly typed
  4. It makes heavy use of the .NET Framework classes
Let’s take a look at each of these aspects in turn.

Hello World in IL

You would doubtless be disappointed without a “Hello World” example, so let’s begin with the very simplest IL program that does something, namely displays Hello World in a console. To do this you first need a copy of the IL assembler, ILasm.exe. This is included with the .NET SDK and is available for 32-bit and 64-bit machines. Notice that you don’t need Visual Studio or even any of the “Express” development environments installed. Surprisingly, Visual Studio doesn’t directly support IL development. As long as you have the .NET SDK installed you will find ILasm in \Windows\Microsoft.NET\Framework\vx.y.z, where x.y.z is the version number of the SDK.

You need to open a command window with the Path set to the assembler’s location and the current directory set to the folder that contains the files you want to assemble. In case you have forgotten DOS, this is achieved by:

PATH=C:\Windows\Microsoft.NET\Framework\vx.y.z
CD C:\folder that you are working in
You can use any text editor that can produce plain ASCII files to create .IL source files – I used Notepad.

The simplest IL program is very simple indeed. Enter the following lines and save the result as Hello.IL (if you are using Notepad, remember to surround the file name in double quotes when you save “Hello.IL”, otherwise you end up with a file called Hello.IL.TXT).

.assembly extern mscorlib {}
.assembly Hello {}
.module Hello.exe

.class Hello.Program
	extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		ldstr	"Hello World"
		call void [mscorlib]System.Console::WriteLine(string)
		ret
	}
}
To assemble this to an .EXE the command is:
ILasm Hello
This should produce a set of messages that look something like Figure 1. As long as everything has worked you should see a file called Hello.exe in the same directory as Hello.il. If you run this at the command prompt it prints the message as promised.

Figure 1: A successful assembly

Basic IL

What is interesting about this simple example is that it illustrates all of the major characteristics of the assembler. Assembler directives begin with a dot, and the first directive:
.assembly extern mscorlib {}
…informs the assembler that we are going to be using objects and methods within the mscorlib assembly, i.e. the Console class and its WriteLine method. Notice that already we have objects and the .NET Framework involved in our assembler. The next two directives simply give the program that we are creating an assembly name and a module name. Without these the assembler and the runtime don’t really know what to do with our program; declaring it to be an assembly means that it can be run.
.assembly Hello {}
.module Hello.exe
The next line declares a class and states that it inherits from Object:
.class Hello.Program extends [mscorlib]System.Object
{
We then define a CIL managed static method:
	.method static void Main(string[] args) cil managed
	{
All of which is more evidence of object orientation and use of the Framework. The entrypoint directive marks where the program should be started from, and every runnable program has to have one:
	.entrypoint
Finally we get to some IL instructions, and it’s all over very quickly!
		ldstr	"Hello World"
		call void [mscorlib]System.Console::WriteLine(string)
		ret
	}
}
The ldstr instruction, i.e. LoaD STRing, pushes a reference to the string “Hello World” onto the stack. The call instruction calls the static WriteLine method of the Console class. The method picks up its parameters from the stack, and what looks like a parameter definition, i.e. (string), is part of the method’s signature: it selects the overload of WriteLine that takes a single string and says that the top-of-stack item must be a string. The void return type means that the method doesn’t leave a return value on the stack. The ret, or Return, instruction completes the method and our program. As you can see, even this simple example demonstrates how stack- and object-oriented the language is, and how it uses both typing and the Framework.
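To make the stack traffic concrete, here is the same Hello.il again with a comment after each instruction showing what the evaluation stack should hold at that point. The comments and the .maxstack directive are additions for illustration and are not part of the original listing; treat it as a sketch rather than the only way to write it.

```il
// Hello.il annotated with the state of the evaluation stack.
.assembly extern mscorlib {}
.assembly Hello {}
.module Hello.exe

.class Hello.Program extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		.maxstack 1          // added: at most one item is on the stack at any time
		                     // stack: (empty)
		ldstr "Hello World"  // stack: string
		call void [mscorlib]System.Console::WriteLine(string)
		                     // stack: (empty), WriteLine popped its argument
		ret                  // a void method returns with an empty stack
	}
}
```

If you leave out .maxstack the assembler uses a default depth, which is why the original listing works without it.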

The stack

Of all of the features of IL, the one that high-level language programmers tend to find strange is the central role that the stack plays. In this case the stack is a little more sophisticated than a simple block of memory with a pointer. You need to think of it in terms of a strongly typed stack made up of “slots” that hold a complete data type. When you push data onto and pop data off the stack it always works in terms of a complete data type. Nearly all IL instructions work by popping input data from the stack and pushing their results on the stack. As an example, let’s add two numbers together.

The first instruction is ldc or LoaD Constant:

ldc.i4 0x01
This pushes the 4-byte integer constant, i.e. an Int32, onto the stack. To perform an add we need two values on the stack:
ldc.i4 	0x02
add
Now we have the result of adding 1 and 2, i.e. 3, on the top of the stack and we can use WriteLine again to display the result:
call void [mscorlib]System.Console::WriteLine(int32)
ret
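The fragments above only assemble once they are placed inside a method. A minimal sketch of a complete file might look like the following; the file name Sum.il and the assembly, module and class names are assumptions chosen for this example, and the stack comments are there only as a guide.

```il
// Sum.il (hypothetical file name): adds two Int32 constants and prints the result.
.assembly extern mscorlib {}
.assembly Sum {}
.module Sum.exe

.class Sum.Program extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		ldc.i4 0x01   // stack: int32
		ldc.i4 0x02   // stack: int32, int32
		add           // stack: int32 (both operands popped, sum pushed)
		call void [mscorlib]System.Console::WriteLine(int32)
		ret
	}
}
```

Assemble it with ILasm Sum in the same way as Hello.il and run the resulting Sum.exe to see the sum displayed.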
Notice that everything is still strongly typed, and the add instruction can discover the type of the two items on the top of the stack and push an appropriate type back on the stack – you can try the same with floating point numbers:
ldc.r4 0.1
ldc.r4 0.2
add
call void [mscorlib]System.Console::WriteLine(float32)
ret
The range of primitive data types available to you is similar to those in C# or VB, with some changes to the names used.

As well as the stack there are local variables, data structures and fields. But notice that in principle you can write any program using just the stack. For example, to declare a local variable called Total you would add:

.locals init(float32 Total)
The “init” modifier tells the runtime to initialise the local variables to their default values (zero in this case) when the method is entered. To store the result of the addition in Total and then load it again you have to use:
	stloc Total
	ldloc Total
…before the call to WriteLine. The instruction stloc, i.e. Store to Local, pops the top of the stack into Total. You need the ldloc instruction, i.e. LoaD from Local, to push the value back on the stack so that the WriteLine can use it.
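Putting the pieces together, a complete version of the floating point example that goes through the local variable could be sketched as follows. The file name Locals.il and the assembly and class names are assumptions made for this example; only the .locals, stloc and ldloc lines come from the text above.

```il
// Locals.il (hypothetical file name): stores a sum in a local, then reloads it.
.assembly extern mscorlib {}
.assembly Locals {}
.module Locals.exe

.class Locals.Program extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		.locals init (float32 Total)   // Total starts out as zero
		ldc.r4 0.1                     // stack: float32
		ldc.r4 0.2                     // stack: float32, float32
		add                            // stack: float32
		stloc Total                    // stack: (empty), Total now holds the sum
		ldloc Total                    // stack: float32
		call void [mscorlib]System.Console::WriteLine(float32)
		ret
	}
}
```

The round trip through Total is redundant here, because the sum is already on the stack, but it shows the pattern you need whenever a value has to be used more than once.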

Object-oriented IL

Using a class with only a static method isn’t really the same thing as taking a full object-oriented approach; it’s just a way of writing a main program. This next example is intended to give you an idea of the full extent of IL’s object facilities. Start a new program called Arith.il. First we have the usual declarations followed by a public class definition:
.assembly extern mscorlib {}
.assembly Arith {}
.module Arith.exe

.class public Arith
	extends [mscorlib]System.Object
{
	.method public specialname rtspecialname
		instance void .ctor() cil managed
	{
		ret
	}

	.method public instance float32
		Add(float32,float32) cil managed
	{
		ldarg.1
		ldarg.2
		add
		ret
	}
}
The class has two methods: .ctor which is its constructor (and which does nothing in this case), and Add. The Add method pushes its two parameters on the stack, adds them and leaves the result on the stack.

To try this class and its Add method out we use the static Main method again:

.class Test.Program
	extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		newobj instance void Arith::.ctor()
		ldc.r4 0.1
		ldc.r4 0.2
		call instance float32 Arith::Add(float32,float32)
		call void [mscorlib]System.Console::WriteLine(float32)
		ret
	}
}
The newobj instruction creates an instance of the class and calls its constructor, .ctor(). The result of newobj is a reference to the new instance, left on the top of the stack. Now we can load the stack with two parameter values and call the instance method Add. Notice that the instance on which the method is called is determined by the first argument, i.e. arg0, passed to the method. You can think of this as a “this” reference, and note that instance methods have to use it explicitly to work with instance fields. If you assemble this program you will discover that it adds two numbers together as before. IL supports instance and static methods and fields. It also supports virtual and non-virtual methods and inheritance, but these are beyond the scope of this introduction.
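To show what using the “this” reference explicitly looks like, here is a sketch of a class that keeps a running total in an instance field. The class Accum, its field total and the parameter name amount are all hypothetical names invented for this example and do not appear in the original program. Unlike the do-nothing constructor above, this one also calls the base class constructor, which is what compilers normally emit.

```il
// Accum.il (hypothetical): an instance method that reads and writes a field via arg0.
.assembly extern mscorlib {}
.assembly Accum {}
.module Accum.exe

.class public Accum extends [mscorlib]System.Object
{
	.field private float32 total

	.method public specialname rtspecialname instance void .ctor() cil managed
	{
		ldarg.0                       // this
		call instance void [mscorlib]System.Object::.ctor()
		ret
	}

	// Adds amount to the total field and returns the new total.
	.method public instance float32 Add(float32 amount) cil managed
	{
		ldarg.0                       // stack: this (kept for stfld)
		ldarg.0                       // stack: this, this
		ldfld float32 Accum::total    // stack: this, total
		ldarg.1                       // stack: this, total, amount
		add                           // stack: this, total+amount
		stfld float32 Accum::total    // stack: (empty), field updated
		ldarg.0                       // stack: this
		ldfld float32 Accum::total    // stack: total
		ret
	}
}

.class Test.Program extends [mscorlib]System.Object
{
	.method static void Main(string[] args) cil managed
	{
		.entrypoint
		newobj instance void Accum::.ctor()   // stack: Accum reference
		ldc.r4 0.5                            // stack: Accum reference, float32
		call instance float32 Accum::Add(float32)
		call void [mscorlib]System.Console::WriteLine(float32)
		ret
	}
}
```

Both ldfld and stfld expect the object reference on the stack, which is why arg0 is loaded before each field access. The field starts at zero, so the total returned by the single call to Add equals the 0.5 that was passed in.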

Where next?

Once you have the idea of the way that the object-oriented, strongly typed aspects of IL interact with the fact that it is a stack-oriented assembler you should find it easier to understand the documentation. You can find some very dry technical definitions of how it all works at MSDN.

Another good way of learning IL is to use the ILdasm tool, which is supplied with the .NET SDK tools (look in the SDK’s bin directory rather than the Framework directory that holds ILasm). This can be used to disassemble .NET programs and it provides lots of clues as to how the compilers use IL.


Harry Fairhead is a hardware-oriented programmer with an interest in computer architecture and embedded code, particularly using the .NET Compact Framework
