Adventures with IL

Dean
8 min read · May 13, 2018

One of the most fun things for me in .NET is looking at the compiled IL code. IL stands for Intermediate Language. Long story short, whenever you write a piece of code in .NET, your compiler will eventually boil it down to IL.

It started out as MSIL (Microsoft Intermediate Language). It is now better known as CIL (Common Intermediate Language), to fit the ecosystem that Microsoft is creating around .NET. If you are asking yourself why you would need to know more about IL, the answer is simple: it will make you a better developer. You will understand the syntactic sugar that abstracts a lot away from you. You will see clear differences between “for” and “foreach”, and a lot more fun things. IL is the lowest-level representation of your program that is still human-readable.

If you are up to the task, you can even write a simple application in IL. To clarify, there are two flavors of IL. The first is the binary representation of IL (which is what most people mean when they mention IL). The second is IL Assembly, the textual form. My first attempt, years ago, took me back to my college days and a couple of courses where I programmed in assembly language. If you have never seen assembly language, it will look really strange to you. Don’t get discouraged; try to understand it. Once you grasp the basics, everything will start to make more sense.

Not everyone is interested in this. If you just want to see what your code looks like as IL, there are tools that can help. If nothing else, it is worth doing for educational purposes and perhaps to improve yourself as a developer. For me, it was helpful for understanding the differences between different approaches to the same problem, as well as the optimizations that the compiler does for you. From all of this, you can learn and improve your code. The tool that I always use is called ILSpy.

Now let me go into the details.

.ldstr “Hello World”

To start off, let me write a simple Hello World console application. Then I will go through the details of how it works.

.assembly extern mscorlib {}
.assembly Example
{
    .ver 1:0:1:0
}
.module example.exe

.method static void main() cil managed
{
    .maxstack 1
    .entrypoint
    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

OK, we have some code. Let us save it as example.il in a location of our choosing. To run our complicated piece of code, we need ILAsm.exe. It ships with the .NET Framework and can be found in the folder <windowsfolder>\Microsoft.NET\Framework\<version>. I added this folder to my Path environment variable so I can use it from anywhere.

So, what we need to do, to run our code, is the following:

# I assume that you are in the same folder where 
# the example.il file is...
ilasm example.il

When compilation finishes, the same folder will contain a newly created executable with the same name as your .il file (in my case, example.exe). So we are halfway there; the only thing left to do is run our application. Running example.exe prints Hello World to the console.

Voila, our application runs. When you look at it, it already resembles the code that we would write in C#:

namespace Example
{
    class Example
    {
        static void Main(string[] args)
        {
            System.Console.WriteLine("Hello World");
        }
    }
}

Now that we have both implementations, we can see the similarities and the differences. Your C# code, when compiled, will look the same (or very similar). If you don’t believe me, you can use ILSpy (or ILDasm.exe) to confirm this.

Now that we know what IL looks like, let me try to explain what is going on in the example above:

  1. The first line in the example imports an external library, in this case mscorlib. In Visual Studio terminology, this is like adding a reference to a DLL. Nothing complicated, right?
  2. Next, we provide some information about our own assembly, that is, the thing we are trying to build. Here we are saying that this is an assembly named Example. Inside the braces we provide more metadata, such as the version of our assembly. There is much more information that we could provide in this “header”.
  3. Now we are on a line with something new: .module. This just defines the module name of our assembly.
  4. The following lines are more interesting, because we are getting into the code. There is a lot of similarity with C#, but IL adds keywords that are not available in C#. Here we have a couple of new things: a) .method begins the definition of a method (you don’t say). b) cil managed indicates that the method body contains IL and that it should run as managed code. Hope that you know the difference between unmanaged and managed code in .NET.
  5. The next statement, .maxstack, could have a chapter (or post) of its own. I will try to sum it up in a couple of sentences for the sake of this explanation. It concerns the evaluation stack, which holds the values that an instruction operates on. In our example, .maxstack 1 says that the method needs at most one slot on the evaluation stack. We push the string “Hello World” onto the stack before calling Console.WriteLine on it. When the call consumes the value, it is removed from the stack. I hope this makes sense; as I said, this could have a post of its own (which sounds like a good idea).
  6. Now we are getting to an interesting point. As we are well aware, C# expects us to have a Main method. IL is a little less strict about this, in the sense that the name is not mandatory. Here we use .entrypoint to mark a method as the entry point of our application. It is easy to check this: go back to the previous example, rename the method, and test whether everything still works as expected. Then try removing .entrypoint from the code and see what happens.
  7. I hope you are not getting tired already. We are now getting to the part where we show our “Hello World” in the console. This is done by pushing our string onto the stack with the ldstr (load string) instruction. I explained this in some detail in step 5, when describing the evaluation stack. So we first push the string onto the stack. Then we can perform an operation on it (in our case, write it out). More on this in the next step.
  8. After all of this build-up, we are at the place where our string is written out to the console. Method invocation looks a little strange in ILAsm, but it still bears some resemblance to what we see in C#. The first time I saw this, it reminded me of (+ 1 2) in some languages (can someone name the language this syntax comes from?). What happens here is that System.Console.WriteLine expects a string as input. The call takes its argument from the evaluation stack, so once the string has been written out, it is no longer on the stack.
  9. And finally, we are ready to return from the method. Hence, in the last place, we see the ret instruction.

Phew, that is a lot of explanation for something that looks so simple. I hope you are now comfortable enough that ILAsm will not confuse you when you see it. In the following example, I will show a simple program with a slightly more complex structure. We will multiply two numbers and print the result. Damn, that sounds really complicated.
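
Before moving on, the experiment from step 6 can be sketched as follows. This is a minimal variation of the Hello World listing, assuming the same ilasm setup as before. The method name start is just a name picked for illustration; any other name should behave the same as long as .entrypoint stays in place.

```il
.assembly extern mscorlib {}
.assembly Example
{
    .ver 1:0:1:0
}
.module example.exe

// "start" is an arbitrary, illustrative name. The .entrypoint directive,
// not the name, marks where execution begins.
.method static void start() cil managed
{
    .entrypoint
    .maxstack 1
    ldstr "Hello World"                                     // stack: one string
    call void [mscorlib]System.Console::WriteLine(string)   // stack: empty
    ret
}
```

Assembling and running it works the same way as for the original listing.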

Ok, let us try to multiply two numbers. To make it more interesting, let us also write their result to the Console.

.assembly extern mscorlib {}
.assembly Example
{
    .ver 1:0:1:0
}
.module example.exe

.method static void main() cil managed
{
    .maxstack 2
    .entrypoint

    ldstr "2 * 3 = "
    call void [mscorlib]System.Console::Write(string)

    ldc.i4.2
    ldc.i4 3
    mul
    call void [mscorlib]System.Console::Write(int32)

    ret
}

This time I will not describe every line, since I have already gone through most of it in the previous example. OK, let me start.

As you can see, we can now put two items on the evaluation stack. Well, that makes sense, right? We are trying to multiply two numbers here, so it is only logical that the stack has to hold two items.

OK, the next thing I am doing is writing out “2 * 3 = ”. Yup, in this example I hard-code the values of the multiplication. The next example will show how to let the user enter the numbers themselves. First I load a string onto the stack, and the next line invokes Console.Write. Nothing complicated that we haven’t seen in the previous example.

Now we are coming to the “crucial” part. We are pushing the values 2 and 3 onto the stack. There are different ways to push an int32 onto the stack, and here I am showing two of them. The only difference is that the first format is limited: ldc.i4.(number), where the number goes from 0 to 8. The second format has no such limitation. It accepts any int32 as its operand and pushes it onto the stack.
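
To make the difference between the forms concrete, here is a small sketch that pushes constants in three ways. Besides the two forms used above, it uses ldc.i4.s, a short form with a one-byte operand that the listing in this post does not use. The assembly name Constants and the module name constants.exe are made up for this example.

```il
.assembly extern mscorlib {}
.assembly Constants
{
    .ver 1:0:0:0
}
.module constants.exe

.method static void main() cil managed
{
    .entrypoint
    .maxstack 2

    ldc.i4.8          // constant baked into the opcode (0 to 8; ldc.i4.m1 pushes -1)
    ldc.i4.s 100      // short form: operand is a signed 8-bit value
    add               // 8 + 100
    ldc.i4 100000     // general form: operand is a full int32
    add               // 108 + 100000
    call void [mscorlib]System.Console::WriteLine(int32)

    ret
}
```

Assembled and run like the earlier examples, this should print 100108. The shorter forms exist only to make the compiled code smaller; the value that lands on the stack is an int32 in every case.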

Now that our values are prepared on the stack, we use the mul instruction. This is the instruction we were waiting for: it takes two values from the stack, multiplies them, and pushes the result back onto the stack. And guess what? It does what we want. Incredible. The next line simply prints that int32, the result of the multiplication, to the console.

Whoa, that was cool. Now let us up our game a little bit. Let us allow the user to enter the values for the multiplication. It should not be that hard.

.assembly extern mscorlib {}
.assembly Example
{
    .ver 1:0:1:0
}
.module example.exe

.method static void main() cil managed
{
    .maxstack 2
    .entrypoint

    ldstr "X: "
    call void [mscorlib]System.Console::Write(string)
    call string [mscorlib]System.Console::ReadLine()
    call int32 [mscorlib]System.Int32::Parse(string)

    ldstr "Y: "
    call void [mscorlib]System.Console::Write(string)
    call string [mscorlib]System.Console::ReadLine()
    call int32 [mscorlib]System.Int32::Parse(string)

    mul
    call void [mscorlib]System.Console::Write(int32)

    ret
}

As promised, nothing complicated if you know how to write this in C#. A common confusion I see when I show something like this is that people’s first guess for the stack size is 4. It is easy to overlook that, as with any stack, we are both pushing and popping. When we pop something from the stack, its slot becomes free again. That is why, for this example, it is enough to allocate room on the stack for only two items.
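
One way to convince yourself is to write the stack depth next to every instruction. The sketch below repeats the program from the listing above, with the number of items on the evaluation stack after each instruction added as a comment. The comments are my own annotation, not something ilasm produces.

```il
.assembly extern mscorlib {}
.assembly Example
{
    .ver 1:0:1:0
}
.module example.exe

.method static void main() cil managed
{
    .maxstack 2
    .entrypoint

    ldstr "X: "                                          // depth 1: "X: "
    call void [mscorlib]System.Console::Write(string)    // depth 0: string consumed
    call string [mscorlib]System.Console::ReadLine()     // depth 1: input line
    call int32 [mscorlib]System.Int32::Parse(string)     // depth 1: x replaces the line

    ldstr "Y: "                                          // depth 2: x, "Y: "
    call void [mscorlib]System.Console::Write(string)    // depth 1: x
    call string [mscorlib]System.Console::ReadLine()     // depth 2: x, input line
    call int32 [mscorlib]System.Int32::Parse(string)     // depth 2: x, y

    mul                                                  // depth 1: x * y
    call void [mscorlib]System.Console::Write(int32)     // depth 0

    ret
}
```

The depth never goes above 2, because every call removes its arguments from the stack before the next value is pushed.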

br Exit

The world of IL is not somewhere we would like to dwell in our daily work. For me it is good practice, just to see what the foundation of our work looks like, especially when I want to see what my code compiles to. Along the way I have learned a couple of interesting ways to solve problems in my daily work.

For a first adventure in the world of IL, I think we did well. In future adventures I am planning to cover a lot more things:

  1. Data Types.
  2. How to declare a variable.
  3. Loops.
  4. Conditions / Branching
  5. and much more.

Best of luck until next time.
