I’m continuing work on my MSIL assembly rewriter. My primary goal for this tool is to optimize the IL in ways that the NetCF JIT compiler can’t. One such deficiency is that the NetCF JIT does a poor job of inlining. Especially troublesome is its inability to inline methods with value type parameters, return types, or local variables. That covers just about every mathematical function that games use ALL THE TIME.
Last week, I got my tool working on operator overloads. Operator overloads are actually static methods where the arguments are passed by value. For small structures like vectors, the call overhead makes operators very expensive (read: slow) compared to performing operations on each individual component of the type.
That is to say that the following addition between vectors:
Vector2 result = v1 + v2;
Is more expensive than the following sequence of additions:
Vector2 result;
result.X = v1.X + v2.X;
result.Y = v1.Y + v2.Y;
Some of the overhead is copying v1 and v2 onto the stack, some is copying the return value off the stack into result, and some is the call itself (method calls are expensive on Xbox 360).
Straight inlining of static methods is really easy in principle, and it didn’t take long to get it working (not counting the two years of procrastination). I then added some analysis to the method bodies, to determine which parameters are modified. If they’re not modified, then the parameters can be replaced by the actual arguments rather than copies placed in new local variables. For the mathematical operators, that eliminates not only the method call overhead, but ALL of the copying overhead, too.
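To make that concrete, here is a minimal sketch of the before and after, written in C# purely for illustration (the real transformation happens on the IL). The Vector2 below is a hypothetical stand-in so the example compiles on its own; it is not the XNA type, and the method names are made up for this example.

```csharp
using System;

// Hypothetical stand-in for the XNA Vector2, so the example is self-contained.
public struct Vector2
{
    public float X;
    public float Y;

    public Vector2(float x, float y)
    {
        X = x;
        Y = y;
    }

    // Compiles to a static method named op_Addition; both arguments are passed by value.
    public static Vector2 operator +(Vector2 a, Vector2 b)
    {
        Vector2 result;
        result.X = a.X + b.X;
        result.Y = a.Y + b.Y;
        return result;
    }
}

public static class InliningSketch
{
    // Before: one call, two argument copies, one return-value copy.
    public static Vector2 AddWithOperator(Vector2 v1, Vector2 v2)
    {
        return v1 + v2;
    }

    // After: the operator never writes to its parameters, so the inlined
    // body can read v1 and v2 directly instead of copies held in new locals.
    public static Vector2 AddInlined(Vector2 v1, Vector2 v2)
    {
        Vector2 result;
        result.X = v1.X + v2.X;
        result.Y = v1.Y + v2.Y;
        return result;
    }

    public static void Main()
    {
        Vector2 a = new Vector2(1f, 2f);
        Vector2 b = new Vector2(3f, 4f);
        Vector2 viaCall = AddWithOperator(a, b);
        Vector2 viaInline = AddInlined(a, b);
        // Both forms produce the same vector.
        Console.WriteLine(viaCall.X == viaInline.X && viaCall.Y == viaInline.Y); // prints True
    }
}
```

If the operator did modify one of its parameters, the inlined body would need a copy of that argument in a fresh local to preserve by-value semantics.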
This weekend, I worked on inlining value type constructors. These are interesting because they’re instance methods and so the inlining tool needs to deal with “this” references. Replacing the “this” references turned out to be easy – but it got harder when I thought about inlining Xbox 360 functions.
Pretty much the whole point of this tool is to help performance on the NetCF. In particular, I want to use it on XNA Framework games for Xbox 360 and I’m currently working on XNA Framework 3.1. (Yes, I will eventually update to XNA Framework 4.0, but I don’t have a Windows Phone right now, and I do have an Xbox 360.)
Ah, there’s a catch! The XNA Framework 3.1 has different assembly identities for Xbox 360 than for Windows, and XNA Game Studio does not include full MSIL assemblies for Xbox 360. Instead, it provides the metadata assemblies only, which don’t contain any method bodies. Argh! I can’t inline what isn’t there!
Of course, last week I’d already started on a solution for it. It just got harder when I tried instance methods. My solution was to use custom attributes to mark methods on surrogate types as inlinable replacements for calls to the identical methods on the real types.
For example,
public struct InlinableVector2
{
    [InlinableMethod(typeof(Vector2))]
    public static Vector2 op_UnaryNegation(Vector2 value)
    {
        Vector2 vector;
        vector.X = -value.X;
        vector.Y = -value.Y;
        return vector;
    }
    //…
In the example above, the op_UnaryNegation is declared as an inlinable method that replaces calls to Vector2.op_UnaryNegation (the minus operator).
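For readers wondering how a surrogate gets paired with the method it replaces, here is one way it could be done, as a sketch only. The attribute's shape (a single Type argument exposed as TargetType) is my assumption based on how it is applied above, the Vector2 is a hypothetical stand-in for the XNA type, and the lookup uses System.Reflection for brevity rather than whatever metadata reader the rewriter actually uses.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the XNA Vector2.
public struct Vector2
{
    public float X;
    public float Y;

    // The compiler emits this as a static method named op_UnaryNegation.
    public static Vector2 operator -(Vector2 value)
    {
        Vector2 result;
        result.X = -value.X;
        result.Y = -value.Y;
        return result;
    }
}

// Assumed shape of the attribute; the post only shows how it is applied.
[AttributeUsage(AttributeTargets.Method)]
public sealed class InlinableMethodAttribute : Attribute
{
    private readonly Type targetType;

    public InlinableMethodAttribute(Type targetType)
    {
        this.targetType = targetType;
    }

    public Type TargetType
    {
        get { return targetType; }
    }
}

public struct InlinableVector2
{
    [InlinableMethod(typeof(Vector2))]
    public static Vector2 op_UnaryNegation(Vector2 value)
    {
        Vector2 vector;
        vector.X = -value.X;
        vector.Y = -value.Y;
        return vector;
    }
}

public static class SurrogateLookup
{
    // Returns the method on the attribute's target type that has the same
    // name and parameter types as the surrogate, or null if there is none.
    public static MethodInfo FindTarget(MethodInfo surrogate)
    {
        object[] attrs = surrogate.GetCustomAttributes(typeof(InlinableMethodAttribute), false);
        if (attrs.Length == 0)
        {
            return null;
        }

        Type target = ((InlinableMethodAttribute)attrs[0]).TargetType;
        ParameterInfo[] parameters = surrogate.GetParameters();
        Type[] parameterTypes = new Type[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
        {
            parameterTypes[i] = parameters[i].ParameterType;
        }

        return target.GetMethod(surrogate.Name, parameterTypes);
    }

    public static void Main()
    {
        MethodInfo surrogate = typeof(InlinableVector2).GetMethod("op_UnaryNegation");
        MethodInfo target = FindTarget(surrogate);
        Console.WriteLine(target.DeclaringType.Name + "." + target.Name); // prints Vector2.op_UnaryNegation
    }
}
```

Matching on name plus parameter types is enough for static methods like operators; it says nothing yet about instance methods, where the 'this' type differs.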
It works great, and allows me to ensure that the methods are written according to the limitations of my assembly rewriter. For example, I don’t want to inline methods with multiple return statements. It’s way easier just to rewrite such a method by hand to have one return statement than it is to write a program to do it. If I could only inline the actual methods of Vector2, I’d have no way to make their implementations conform to my tool’s limits.
It got tricky when I wanted to inline constructor calls. Instance methods have ‘this’ references and field references. My inlinable method’s ‘this’ isn’t the same type as the called method’s ‘this’. That shows up here and there in the IL, too, like in field references. Well, just before writing this blog entry, I got it working.
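As a rough illustration of the 'this' substitution alone, the sketch below shows a constructor call next to its hand-inlined equivalent, where every 'this.Field' in the constructor body becomes a field of the variable being initialized. The Vector2 is a hypothetical stand-in, and the sketch deliberately leaves out the surrogate-type remapping, since how surrogate constructors are declared isn't shown here.

```csharp
using System;

// Hypothetical stand-in for the XNA Vector2.
public struct Vector2
{
    public float X;
    public float Y;

    public Vector2(float x, float y)
    {
        this.X = x; // 'this' refers to the storage being initialized
        this.Y = y;
    }
}

public static class ConstructorSketch
{
    // Before: an instance-method call that receives a reference to v as 'this'.
    public static Vector2 ViaConstructor(float x, float y)
    {
        Vector2 v = new Vector2(x, y);
        return v;
    }

    // After: the constructor body with 'this' replaced by the destination v.
    public static Vector2 ViaInlinedBody(float x, float y)
    {
        Vector2 v;
        v.X = x; // was this.X = x
        v.Y = y; // was this.Y = y
        return v;
    }

    public static void Main()
    {
        Vector2 a = ViaConstructor(1f, 2f);
        Vector2 b = ViaInlinedBody(1f, 2f);
        Console.WriteLine(a.X == b.X && a.Y == b.Y); // prints True
    }
}
```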
I’m quite pleased with my progress so far. Next up is passing value types by reference (it might already work, but I haven’t tried), and then replacing calls with equivalent but more efficient calls (e.g., replace “m3 = Matrix.Multiply(m1, m2)” with “Matrix.Multiply(ref m1, ref m2, out m3)”).
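Here is a small sketch of why those two call shapes are interchangeable. Matrix2 is a hypothetical 2x2 stand-in I made up to keep the example short; it only mirrors the pair of by-value and by-reference Multiply overloads mentioned above.

```csharp
using System;

// Hypothetical 2x2 stand-in for a matrix type with both overload styles.
public struct Matrix2
{
    public float M11, M12, M21, M22;

    // By-value form: copies both arguments in and the result out.
    public static Matrix2 Multiply(Matrix2 a, Matrix2 b)
    {
        Matrix2 result;
        Multiply(ref a, ref b, out result);
        return result;
    }

    // By-reference form: no struct copies cross the call boundary.
    public static void Multiply(ref Matrix2 a, ref Matrix2 b, out Matrix2 result)
    {
        // Compute into locals first so the call stays correct even if
        // result refers to the same variable as a or b.
        float m11 = a.M11 * b.M11 + a.M12 * b.M21;
        float m12 = a.M11 * b.M12 + a.M12 * b.M22;
        float m21 = a.M21 * b.M11 + a.M22 * b.M21;
        float m22 = a.M21 * b.M12 + a.M22 * b.M22;

        result = new Matrix2();
        result.M11 = m11;
        result.M12 = m12;
        result.M21 = m21;
        result.M22 = m22;
    }
}

public static class CallRewriteSketch
{
    public static void Main()
    {
        Matrix2 m1 = new Matrix2();
        m1.M11 = 1f; m1.M12 = 2f; m1.M21 = 3f; m1.M22 = 4f;
        Matrix2 m2 = new Matrix2();
        m2.M11 = 5f; m2.M12 = 6f; m2.M21 = 7f; m2.M22 = 8f;

        // What the programmer writes.
        Matrix2 byValue = Matrix2.Multiply(m1, m2);

        // What a rewriter could substitute for it.
        Matrix2 byRef;
        Matrix2.Multiply(ref m1, ref m2, out byRef);

        Console.WriteLine(byValue.M11 == byRef.M11 && byValue.M12 == byRef.M12
            && byValue.M21 == byRef.M21 && byValue.M22 == byRef.M22); // prints True
    }
}
```

The substitution is only safe when the by-reference overload really computes the same result, which is something the tool would have to take on trust for library methods it can't inspect.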
All the optimizations this tool is performing could be done by hand in the source code. However, most of the more efficient patterns are more cumbersome to write and more difficult to read. Having a tool do it at the IL level means programmers can keep writing code the most productive way, but still get the benefit of tedious micro-optimizations done everywhere.