The key features of IL are:
- It’s a stack-oriented language
- It’s object oriented
- It’s strongly typed
- It makes heavy use of the .NET Framework classes
Hello World in IL
You would doubtless be disappointed without a “Hello World” example, so let’s begin with the very simplest IL program that does something – i.e. displays Hello World in a console. To do this you first need a copy of the IL assembler, ILasm.exe. This is included with the .NET SDK and is available for 32-bit and 64-bit machines. Notice that you don’t need Visual Studio or even any of the “Express” development environments installed. Surprisingly, Visual Studio doesn’t actually support IL development. As long as you have the .NET SDK installed you will find ILasm in \Windows\Microsoft.NET\Framework\vx.y.z, where x.y.z is the version number of the SDK.
You need to open a command window, add the assembler’s location to the path, and change to the directory that contains the files you want to assemble. In case you have forgotten DOS, this is achieved by:
PATH=C:\Windows\Microsoft.NET\Framework\vx.y.z
CD C:\folder that you are working in
You can use any text editor that can produce plain ASCII files to create .IL source files – I used Notepad.
The simplest IL program is very simple indeed. Enter the following lines and save the result as Hello.IL (if you are using Notepad, remember to surround the file name in double quotes when you save “Hello.IL”, otherwise you end up with a file called Hello.IL.TXT).
.assembly extern mscorlib {}
.assembly Hello {}
.module Hello.exe
.class Hello.Program extends [mscorlib]System.Object
{
  .method static void Main(string[] args) cil managed
  {
    .entrypoint
    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}
To assemble this to an .EXE the command is:
ILasm Hello
This should produce a set of messages that look something like Figure 1. As long as everything has worked you should see a file called Hello.exe in the same directory as Hello.il. If you run this at the command prompt it prints the message as promised.
Figure 1: A successful assembly
Basic IL
What is interesting about this simple example is that it illustrates all of the major characteristics of the assembler. Assembler directives begin with a dot, and the first directive:
.assembly extern mscorlib {}
…informs the assembler that we are going to be using objects and methods within the mscorlib assembly, i.e. the Console class and its WriteLine method. Notice that already we have objects and the .NET Framework involved in our assembler. The next two directives simply give the program that we are creating an assembly and module name. Without these the assembler and the runtime don’t really know what to do with our program – declaring it to be an assembly means that it can be run.
.assembly Hello {}
.module Hello.exe
The next line declares a class and states that it inherits from Object:
.class Hello.Program extends [mscorlib]System.Object
{
We then define a CIL managed static method:
  .method static void Main(string[] args) cil managed
  {
All of which is more evidence of object orientation and use of the Framework. The entrypoint directive marks the method where execution starts, and every runnable program has to have one:
    .entrypoint
Finally we get to some IL instructions, and it’s all over very quickly!
    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }
}
The ldstr, i.e. LoaD STRing, instruction pushes a reference to the string “Hello World” onto the stack. The call instruction calls the WriteLine method of the static Console class. The method picks up its parameters from the stack, and what looks like a parameter definition, i.e. (string), is the parameter type list of the method’s signature: it selects the WriteLine overload that takes a string and says that the item on the top of the stack is to be a string. The void return type means that the method doesn’t leave a return value on the stack. The ret, or Return, completes the method and our program. As you can see, even this simple example demonstrates how stack- and object-oriented the language is, and how it uses both typing and the Framework.
The stack
Of all of the features of IL, the one that high-level language programmers tend to find strange is the central role that the stack plays. In this case the stack is a little more sophisticated than a simple block of memory with a pointer. You need to think of it in terms of a strongly typed stack made up of “slots” that hold a complete data type. When you push data onto and pop data off the stack it always works in terms of a complete data type. Nearly all IL instructions work by popping input data from the stack and pushing their results on the stack. As an example, let’s add two numbers together.
The first instruction is ldc or LoaD Constant:
ldc.i4 0x01
This pushes the 4-byte integer constant 1, i.e. an Int32, onto the stack. To perform an add we need two values on the stack:
ldc.i4 0x02
add
Now we have the result of adding 1 and 2, i.e. 3, on the top of the stack and we can use WriteLine again to display the result:
call void [mscorlib]System.Console::WriteLine(int32)
ret
Notice that everything is still strongly typed, and the add instruction can discover the type of the two items on the top of the stack and push an appropriate type back on the stack – you can try the same with floating point numbers:
ldc.r4 0.1
ldc.r4 0.2
add
call void [mscorlib]System.Console::WriteLine(float32)
ret
The range of primitive data types available to you is similar to those in C# or VB, with some changes to the names used.
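The integer fragments above only assemble once they are wrapped in the same assembly, class and method declarations used for Hello World. A minimal sketch of the complete file follows; the names AddInts and AddInts.Program are made up for illustration and are not from the original listing. Saved as AddInts.il and assembled with ILasm, it should print 3.

```il
// AddInts.il - hypothetical file name, chosen for this sketch
.assembly extern mscorlib {}
.assembly AddInts {}
.module AddInts.exe

.class AddInts.Program extends [mscorlib]System.Object
{
  .method static void Main(string[] args) cil managed
  {
    .entrypoint
    ldc.i4 0x01   // push the Int32 constant 1
    ldc.i4 0x02   // push the Int32 constant 2
    add           // pop both values, push their sum
    // pops the Int32 result and prints it
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }
}
```

Swapping the two ldc.i4 lines for ldc.r4 and the signature for WriteLine(float32) gives the floating point version.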
As well as the stack there are local variables, data structures and fields. But notice that in principle you can write any program using just the stack. For example, to declare a local variable called Total you would add:
.locals init (float32 Total)
The “init” modifier asks the runtime to initialise the local variables to their default (zero) values when the method is entered. To store the result of the addition in Total and then retrieve it you have to use:
stloc Total
ldloc Total
…before the call to WriteLine. The instruction stloc, i.e. STore to LOCal, pops the top of the stack into Total. You need the ldloc instruction, i.e. LoaD from LOCal, to push the value back on the stack so that WriteLine can use it.
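To show where the local variable declaration and the store/load pair sit relative to the other instructions, here is a sketch of the floating point addition with Total added. The assembly and class names (Locals, Locals.Program) are assumptions made for this example.

```il
// Locals.il - hypothetical file name, chosen for this sketch
.assembly extern mscorlib {}
.assembly Locals {}
.module Locals.exe

.class Locals.Program extends [mscorlib]System.Object
{
  .method static void Main(string[] args) cil managed
  {
    .entrypoint
    .locals init (float32 Total)   // one float32 local, zeroed on entry
    ldc.r4 0.1
    ldc.r4 0.2
    add
    stloc Total    // pop the sum into Total; the stack is now empty
    ldloc Total    // push a copy of Total back for WriteLine
    call void [mscorlib]System.Console::WriteLine(float32)
    ret
  }
}
```

In this tiny program the store followed by a load is redundant, which is the point made above: the stack alone would have been enough.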
Object-oriented IL
Using a static method isn’t really the same thing as taking a full object-oriented approach – it’s just a way of writing a main program. This next example is intended to give you an idea of the full extent of IL’s object facilities. Start a new program called Arith.il. First we have the usual declarations followed by a public class definition:
.assembly extern mscorlib {}
.assembly Arith {}
.module Arith.exe
.class public Arith
{
  .method public specialname rtspecialname instance void .ctor() cil managed
  {
    ret
  }
  .method public instance float32 Add(float32, float32) cil managed
  {
    ldarg.1
    ldarg.2
    add
    ret
  }
}
The class has two methods: .ctor, which is its constructor (and which does nothing in this case, although a constructor would normally begin by calling its base class constructor), and Add. The Add method pushes its two parameters on the stack, adds them and leaves the result on the stack as its return value.
To try this class and its Add method out we use the static Main method again:
.class Test.Program extends [mscorlib]System.Object
{
  .method static void Main(string[] args) cil managed
  {
    .entrypoint
    newobj instance void Arith::.ctor()
    ldc.r4 0.1
    ldc.r4 0.2
    call instance float32 Arith::Add(float32, float32)
    call void [mscorlib]System.Console::WriteLine(float32)
    ret
  }
}
The newobj instruction creates an instance of the class and calls its constructor, .ctor(). The result of newobj is a reference to the new instance, left on the top of the stack. Now we can load the stack with two parameter values and call the instance method Add. Notice that the instance on which the method is called is determined by the first argument, i.e. arg0, passed to the method, which is why Add reads its own parameters with ldarg.1 and ldarg.2. You can think of arg0 as a “this” reference, and note that instance methods have to use it explicitly to work with instance fields. If you assemble this program you will discover that it adds two numbers together as before. IL supports instance and static methods and fields. It also supports virtual and non-virtual methods and inheritance, but this is beyond the scope of this introduction.
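For comparison, the same addition can be written as a static method, which makes the role of arg0 easier to see. This is only a sketch: the names ArithStatic and TestStatic.Program are invented for the example. With no instance there is no “this” reference, so the parameters are numbered from 0 and no newobj is needed before the call.

```il
// ArithStatic.il - hypothetical file name, chosen for this sketch
.assembly extern mscorlib {}
.assembly ArithStatic {}
.module ArithStatic.exe

.class public ArithStatic extends [mscorlib]System.Object
{
  .method public static float32 Add(float32, float32) cil managed
  {
    ldarg.0   // first parameter (there is no "this" in a static method)
    ldarg.1   // second parameter
    add
    ret       // the sum is left on the stack as the return value
  }
}

.class TestStatic.Program extends [mscorlib]System.Object
{
  .method static void Main(string[] args) cil managed
  {
    .entrypoint
    ldc.r4 0.1
    ldc.r4 0.2
    // no "instance" keyword: nothing but the two floats is popped
    call float32 ArithStatic::Add(float32, float32)
    call void [mscorlib]System.Console::WriteLine(float32)
    ret
  }
}
```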
Where next?
Once you have the idea of the way that the object-oriented, strongly typed aspects of IL interact with the fact that it is a stack-oriented assembler you should find it easier to understand the documentation. You can find some very dry technical definitions of how it all works at MSDN.
Another good way of learning IL is to use the ILdasm tool, which is supplied with the .NET SDK. This can be used to disassemble .NET programs and it provides lots of clues as to how the compilers use IL.
Harry Fairhead is a hardware-oriented programmer with an interest in computer architecture and embedded code, particularly using the .NET Compact Framework