Code Injection with MSBuild Inline Tasks

I was recently looking for a way to automate unit tests for a JavaScript-based Windows 8 application. I discovered that activating a Windows 8 application requires use of a COM object that doesn’t support automation (it has no ProgID). Although it’s trivial to create a command-line application to invoke the COM object, I wondered if there was a way to avoid the explicit step of building another executable or DLL. I thought, “My Windows 8 project already depends on msbuild to build my app package. Can I use an MSBuild inline task?”

The short answer is yes, but it isn’t obvious how to do it.

First, let’s have a quick look at how you would activate a Windows 8 application using C#. This link provides sample code illustrating how to do it.

http://stackoverflow.com/questions/12925748/iapplicationactivationmanageractivateapplication-in-c

This code doesn’t appear to be suited to an inline task because it requires class and interface declarations. These are needed because there isn’t already a primary interop assembly containing the required types.

MSBuild inline tasks are source code fragments you define inside an msbuild targets file (or project), typically in C# or VB, that msbuild will implicitly compile into a custom task DLL so that you can use it during your build. Because the code fragment you provide becomes the body of the custom task’s Execute function, it might seem like you can’t declare the types needed for COM interop.
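For comparison, here is what a conventional inline task looks like when no extra types are needed. This is a minimal sketch of my own (the task name and message are made up), using the same CodeTaskFactory machinery as the real example later in this post:

```xml
<UsingTask TaskName="SayHello" TaskFactory="CodeTaskFactory"
           AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.v4.0.dll">
  <Task>
    <Code Type="Fragment" Language="cs">
      <![CDATA[
        // This fragment becomes the body of the generated Execute() method.
        Log.LogMessage("Hello from an inline task!");
      ]]>
    </Code>
  </Task>
</UsingTask>
```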

Enter the standard code injection hack.

The code fragment you supply is used verbatim in a generated source file, and then compiled into the task assembly. Using a “malformed” code fragment, you can add additional functions, or even classes, to your inline task. That makes it possible to, say, define an inline msbuild task that interacts with an arbitrary COM object.

MSBuild uses the code fragment like this:

public class GeneratedTask : Microsoft.Build.Utilities.Task {

    public override bool Execute() {


        // your code fragment is inserted here

        return _Success;
    }

    // other generated code...
}

It’s pretty easy to see that your code fragment is preceded by a function declaration, and is followed by a return statement and closing brace. If you include the return statement and a closing brace in your code fragment, then you can insert nested type declarations after it. To complete the code injection hack, end the snippet with the partial declaration of a Boolean function.

Your code snippet should look like this:

    //TODO: your intended Execute function

    return _Success;
}

//TODO: inject helper classes or functions here

private bool IgnoreThisInjectedFunction() {

THIS IS NOT A SECURITY VULNERABILITY. An MSBuild project already has the ability to execute arbitrary code on your computer. If this code injection technique weren’t possible, I could just have msbuild write out a file containing the code I wanted, compile it using a CSC task, and execute it with an Exec task. It would take me more steps, but you wouldn’t be any safer.
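To illustrate, that multi-step version might look something like the target below. This is only a sketch: the target name, file names, and the @(LauncherSource) item holding the generated code are all hypothetical, but WriteLinesToFile, Csc, and Exec are standard MSBuild tasks.

```xml
<Target Name="LaunchViaTempExe">
  <!-- Write out arbitrary C# source, compile it, and run it. -->
  <WriteLinesToFile File="$(IntermediateOutputPath)Launcher.cs"
                    Lines="@(LauncherSource)" Overwrite="true" />
  <Csc Sources="$(IntermediateOutputPath)Launcher.cs"
       OutputAssembly="$(IntermediateOutputPath)Launcher.exe"
       TargetType="exe" />
  <Exec Command="$(IntermediateOutputPath)Launcher.exe" />
</Target>
```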

So if I could do this in a more classically-supported way anyhow, what’s the point?

The point is that it takes fewer steps: the same reason msbuild has inline tasks in the first place!

Returning to my original problem, an inline task to activate Windows 8 apps looks like this:

<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- This inline task activates the specified Windows 8 application. -->
  <UsingTask
    TaskName="LaunchWin8App"
    TaskFactory="CodeTaskFactory"
    AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.v4.0.dll" >
    <ParameterGroup>
        <AppUserModelId ParameterType="System.String" Required="true" />
        <ProcessId ParameterType="System.UInt32" Output="true" />
    </ParameterGroup>
    <Task>
      <Using Namespace="System" />
      <Using Namespace="System.Runtime.InteropServices" />
      <Using Namespace="System.Runtime.CompilerServices" />
      <Code Type="Fragment" Language="cs">
        <![CDATA[
            ApplicationActivationManager appActiveManager = new ApplicationActivationManager();//Class not registered
            uint pid = 0;
            try {
                appActiveManager.ActivateApplication(AppUserModelId, null, ActivateOptions.None, out pid);
            } catch (COMException e) {
                Log.LogError(e.Message);
            }
            ProcessId = pid;

            return _Success;
        }

        public enum ActivateOptions
        {
            None = 0x00000000,  // No flags set
            DesignMode = 0x00000001,  // The application is being activated for design mode, and thus will not be able
            // to create an immersive window. Window creation must be done by design tools which
            // load the necessary components by communicating with a designer-specified service on
            // the site chain established on the activation manager.  The splash screen normally
            // shown when an application is activated will also not appear.  Most activations
            // will not use this flag.
            NoErrorUI = 0x00000002,  // Do not show an error dialog if the app fails to activate.
            NoSplashScreen = 0x00000004,  // Do not show the splash screen when activating the app.
        }

        [ComImport, Guid("2e941141-7f97-4756-ba1d-9decde894a3d"), InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
        interface IApplicationActivationManager
        {
            // Activates the specified immersive application for the "Launch" contract, passing the provided arguments
            // string into the application.  Callers can obtain the process Id of the application instance fulfilling this contract.
            IntPtr ActivateApplication([In] String appUserModelId, [In] String arguments, [In] ActivateOptions options, [Out] out UInt32 processId);
            IntPtr ActivateForFile([In] String appUserModelId, [In] IntPtr /*IShellItemArray* */ itemArray, [In] String verb, [Out] out UInt32 processId);
            IntPtr ActivateForProtocol([In] String appUserModelId, [In] IntPtr /* IShellItemArray* */itemArray, [Out] out UInt32 processId);
        }

        [ComImport, Guid("45BA127D-10A8-46EA-8AB7-56EA9078943C")]//Application Activation Manager
        class ApplicationActivationManager : IApplicationActivationManager
        {
            [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)/*, PreserveSig*/]
            public extern IntPtr ActivateApplication([In] String appUserModelId, [In] String arguments, [In] ActivateOptions options, [Out] out UInt32 processId);
            [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
            public extern IntPtr ActivateForFile([In] String appUserModelId, [In] IntPtr /*IShellItemArray* */ itemArray, [In] String verb, [Out] out UInt32 processId);
            [MethodImpl(MethodImplOptions.InternalCall, MethodCodeType = MethodCodeType.Runtime)]
            public extern IntPtr ActivateForProtocol([In] String appUserModelId, [In] IntPtr /* IShellItemArray* */itemArray, [Out] out UInt32 processId);
        }

        private bool IgnoreThisInjectedFunction() {
            // msbuild will complete this with:
            //     return _Success;
            // }
        ]]>
      </Code>
    </Task>
  </UsingTask>
</Project>

Assuming the code above is placed in a file named activate.targets, using the inline task would look something like this:

<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="activate.targets" />
  <Target Name="Hello">
    <LaunchWin8App AppUserModelId="Microsoft.BingMaps_8wekyb3d8bbwe!AppexMaps">
        <Output TaskParameter="ProcessId"
                  PropertyName="PID" />
    </LaunchWin8App>
  </Target>
</Project>

Building the sample project above will launch the Win8 Bing Maps app.

That’s it. I’m not posting anything dangerous, or even particularly revealing. But if you ever find yourself wanting to write an inline task to do COM interop, this technique might keep your source tree a wee bit simpler than it otherwise would be.

–Stephen

Posted in MSBuild, Programming, Windows 8

XnaCmd for Xbox 360

Today, I’m releasing a command line utility for people working with XNA Game Studio and targeting Xbox 360. This utility is pretty simple and supports four commands:

  1. Send a local file to an existing title container on Xbox 360.
  2. Delete a remote file from an existing title container on Xbox 360.
  3. Print out the title ID of a game from its CCGAME file.
  4. Launch a game in an existing title container on Xbox 360.

This isn’t groundbreaking, but it enables some useful development scenarios for more advanced developers.

One of the most compelling abilities of xnacmd.exe is that it allows you to deploy a file to a title container while the game is running.

Another compelling ability is that it lets you add files to title containers created from CCGAME files. This is interesting for play testing and peer review scenarios, where all you have is the CCGAME file, and the title ID has been reassigned by Microsoft.

The ability to delete files from a container is really just there for parity, to allow you to remove files that you deployed.

Launching a game from the command line without Visual Studio can also be useful.

So that’s it; it’s just four commands.

A particularly interesting scenario for this utility is content editing and preview. There are numerous samples floating around that illustrate how to host the content pipeline to build your content from an editor, but zero samples that show how to easily deploy that content to your console and see it in-game.

With this tool, you now have the ability to update content without rebuilding your whole game, and without even restarting it. The catch is that you need to add a feature into your game that triggers a content reload. This may not be easy, but it can be worth figuring out if you can simplify life for your designers and artists.
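One way to build such a trigger is to poll the timestamp of a marker file deployed alongside the updated content. The sketch below is my own assumption about how you might do it; the file name, class name, and polling approach are not part of xnacmd.exe:

```csharp
using System;
using System.IO;

// Hypothetical reload trigger: after sending updated content with
// xnacmd.exe, also re-send a small marker file; the game polls the
// marker's timestamp and reloads content when it changes.
class ReloadWatcher
{
    readonly string markerPath;
    DateTime lastStamp = DateTime.MinValue;

    public ReloadWatcher(string markerPath)
    {
        this.markerPath = markerPath;
    }

    // Call once per frame; returns true when content should be reloaded.
    public bool CheckForReload()
    {
        if (!File.Exists(markerPath))
            return false;
        DateTime stamp = File.GetLastWriteTimeUtc(markerPath);
        if (stamp == lastStamp)
            return false;
        lastStamp = stamp;
        return true;
    }
}
```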

Another scenario that I particularly like is to unlock game features or cheats using a specially-named file. You could have your game check for the existence of a file at startup (eg, “unlock.txt”), and expose hidden features only if the file is present. Then, of course, you could put your game into playtest or peer review without the unlock file, but provide it to testers and reviewers to deploy with xnacmd.exe. That way, your game can’t fail for having unreviewable content, but the features will not be unlockable except through legitimate gameplay or secret codes after the game is approved and hits the marketplace.
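A minimal sketch of that startup check (the file name "unlock.txt" and the type name are hypothetical; the real check belongs wherever your game reads its deployed files):

```csharp
using System.IO;

// Hypothetical unlock check: hidden features are exposed only when a
// specially-named file has been deployed into the title container
// (e.g., with xnacmd.exe).
static class Cheats
{
    public static bool Unlocked { get; private set; }

    // Call once at startup, passing the directory your game reads from.
    public static void CheckForUnlockFile(string contentRoot)
    {
        Unlocked = File.Exists(Path.Combine(contentRoot, "unlock.txt"));
    }
}
```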

Now, please be aware that xnacmd.exe only works when XNA Game Studio Connect is running, or when a game launched through XNA Game Studio Connect is running. Also, it will only connect to your default console, because I was lazy and wanted to keep the command line options simple. It will not work with games downloaded from the marketplace.

Finally, you need to have XNA Game Studio 4.0 installed in order for this command line tool to work. The stuff it depends on is not redistributable, so don’t bother asking me for a way to install it stand-alone.

Link: Download xnacmd.exe here (available under MIT license)

A few notes about usage:

XNA games are deployed into virtual file systems called title containers (similar in concept to a ZIP file). When deploying files, title containers are referenced by a “title ID” (you’ll see this mentioned in the VS Output window when you deploy your games). When sending or deleting files from a title container with XnaCmd, you need to specify the title ID as part of the remote file path.

For example, this is a command I used for testing:

xnacmd send test.txt f6c44f88-a295-4b24-9709-80b5d4708bba\test.txt

When deploying from Visual Studio, the title ID is the GUID used in your startup assembly’s GuidAttribute (see your project’s AssemblyInfo.cs). This same GUID is used as the title ID when you build a CCGAME. However, when you submit your CCGAME to XBLIG for play testing or peer review, the CCGAME that everyone else downloads is given a new title ID by Microsoft. Use the “xnacmd titleid <ccgame>” command to see this GUID.

Final Advice

  • Be sure to close/dispose any files that you plan on updating at run time.
  • If something goes wrong and you can’t deploy anything anymore, even from Visual Studio, try killing xnatransx.exe (using Task Manager).
Posted in XNA Game Studio

XNA Game Studio Connect?

It’s been about six months since I last worked on a programming project involving the XNA Game Studio Connect app (required for developing Xbox LIVE indie games). Something must have changed since the last time I used it… How does one launch the damn thing? I can’t find any way to make the application appear in my games list. In fact, the only way I’ve found to launch it is to do the following:

  1. Enter the Games Marketplace
  2. Navigate to Games
  3. Select “Titles A to Z”
  4. Select “X”
  5. Select “XNA Creators Club”
  6. Navigate to Extras
  7. Select “XNA Game Studio Connect”
  8. Select “Download Again”
  9. Select “Play Now”

Is that what everyone else is doing? Please, someone show me that I’ve overlooked a much more convenient way to launch the XNA Game Studio Connect application.

PS: It also does not register in the “recently played” list, or else I would happily use that.

Posted in XNA Game Studio

Double-Precision Arithmetic and XNA Runtime on Xbox 360

I keep working on my hobby project, which is an assembly rewriter whose goal is to optimize assemblies for the XNA runtime on Xbox 360. I probably spend more than half my time debugging obscure errors resulting from invalid CIL in the assemblies. Another big chunk of time is spent on experiments to see what else I can do that might improve runtime performance.

Previously, I wrote about some of the NetCF’s code-gen peculiarities around floating-point variables and arithmetic on Xbox 360. In particular, I noted that single-precision arithmetic uses the double-precision native instructions, and then explicitly rounds the result using frsp (floating-point round to single-precision). I also noted that while double-precision arithmetic doesn’t suffer this inefficiency, the JIT-compiler emits two frsp instructions when casting from double- to single-precision (eg, when you store a result in a float variable). Well, I wondered about the trade-off and ran some experiments.

In one experiment, I found an open-source physics library compatible with the XNA runtime on Xbox 360 called Jitter. This library is meant to be portable, and doesn’t depend on the XNA Framework. In particular, it doesn’t use the XNA Framework math types – Vector2, Vector3, etc. This was interesting because it made it easy to find-and-replace all instances of “float” to “double” in the source code. After fixing a couple hundred compile-time errors, I got the demo running – this time with pure double-precision arithmetic. I had eliminated all float variables and parameters, so there were no float values anywhere (except in the rendering code, which did depend on Vector3 for drawing primitives).

I was ready to be blown away by silky-smooth, realistic physics… But my hopes were dashed when it ran 10-20% slower than before! After looking at the code and thinking real hard, I realized that the slowdown must be due to my effectively doubling the size of all data structures.

The thing that confused me was that the code had already been hand-optimized to avoid passing structures by value, so it wasn’t a matter of copying twice the data. Rather, I believe the problem was twice the distance between data structures passed by reference. Twice the distance means more frequent cache misses. Also, memory operations on Xbox 360 are really slow so doubling the size of any structures that are passed by value didn’t help, either.

So that experiment didn’t teach me anything positive, and I got sad and played some games for a while.

The following week, I was over it and tried a different experiment. I wondered what would happen if I rewrote the arithmetic to work at double precision, without changing the variables. That would eliminate a bunch of rounding on intermediate computations, but would require redundant double-rounding whenever I needed to store a result into a variable. It’s way too tedious to try that by hand on a big code base, so I coded up a new feature in my assembly rewriter.
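In C# source terms, the transformation looks roughly like this (the method names are mine, and the actual rewrite happens at the CIL level, not in source):

```csharp
// Source-level sketch of the CIL rewrite. Both methods compute a * b + c,
// but the second performs the arithmetic at double precision and rounds
// only once, when the result is converted back to float.
static class Precision
{
    // Original: each single-precision operation gets rounded (frsp on Xbox 360).
    public static float MulAddSingle(float a, float b, float c)
    {
        return a * b + c;
    }

    // Rewritten: widen once, compute at double precision, round at the store.
    public static float MulAddDouble(float a, float b, float c)
    {
        return (float)((double)a * b + c);
    }
}
```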

This time, I used my assembly rewriter to rewrite all expressions in an assembly to do floating-point arithmetic at double-precision, but convert the result back to single-precision whenever storing to a float variable. Using Box2D.XNA as my test bed, I saw virtually no change in performance; the results were small and mixed, really.

So then I added another feature to replace local variables of type float with local variables of type double. That eliminated a lot of the double-rounding that occurred when computing intermediate values inside a function. I got a small perf improvement in a few tests, and I didn’t see any case that was slower.

This approach seems promising. The next thing I’m working on is a feature to decompose local structs into variables for each component field (eg, a single Vector2 variable is replaced by two float variables, representing its X and Y fields). Not all local structs are candidates (eg, if they are used in function calls), but after inlining all the Vector2 operations, many variables can be decomposed this way. After breaking them up, those float variables can be rewritten as doubles, and that eliminates more rounding!
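Again in source terms, decomposition plus widening looks roughly like this (a hand-written sketch with a stand-in struct, since the tool operates on CIL and the real type would be Vector2):

```csharp
// Source-level sketch of struct decomposition. Vec2 stands in for a
// Vector2-style type; the decomposed version replaces the struct local
// with per-field locals, which can then be widened to double.
struct Vec2 { public float X, Y; }

static class DecomposeDemo
{
    // Before: a struct local whose fields are read back for arithmetic.
    public static float LengthSquared(float x, float y)
    {
        Vec2 v;
        v.X = x;
        v.Y = y;
        return v.X * v.X + v.Y * v.Y;
    }

    // After: the struct is gone; its fields became double locals,
    // eliminating intermediate rounding.
    public static float LengthSquaredDecomposed(float x, float y)
    {
        double vx = x, vy = y;
        return (float)(vx * vx + vy * vy);
    }
}
```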

I originally began working on decomposing structure variables to make it easier to eliminate single-use variables. This possible secondary use is a bonus, and I think the two together will work nicely. Fingers crossed.

Posted in XNA Game Studio

Avoid ‘float’ Parameters on Xbox 360 (XNA)

Previously, I said I would describe inefficiencies of the NetCF JIT compiler on Xbox if I could provide a reasonable workaround. This tip is pretty reasonable: avoid using float as a parameter type in function signatures. Use double instead. Here’s why…

The NetCF uses a calling convention that passes all arguments on the stack. Furthermore, NetCF also passes all float arguments as doubles. That is, callers push each 4-byte float argument onto the stack as an 8-byte double. The called function then converts each double argument value back into a float, and writes it back onto the stack in the exact same spot. It does this by loading the double value into a register, rounding it to single-precision, and then writing it back to the stack as a single.

Why? I have no idea; but that’s not important. What’s important is that this superfluous data conversion is expensive, and you’re better off avoiding it.

The expense is incurred upon entry to the function, when each float parameter is loaded into a register, rounded, and stored back into the stack.

If you rewrite the method to receive a double parameter instead of a float, then the code generated at the caller will be the same (at the call site, it costs the same to put either a float or a double onto the stack), but the method itself will not include the expensive load-round-store initialization at the beginning of the function body.

Note: There are code examples in the text below, to illustrate.

An important caveat is that float values in complex types (eg, structs) are not converted this way when the containing type is passed by value. So, for example, if you pass a Vector2 instance by value, it will be passed in 8 bytes (2 x 4 bytes for its two float members).

You have to pass the whole structure, though, not just a member.

This implies a big difference between the following functions:

float AddXandY(Vector2 v);

float AddXandY(float x, float y);

The first function will receive an 8-byte Vector2 value on the stack. The second function will receive 2 x 8-byte double values on the stack, which it will have to round to single-precision before executing any of its function body.

Take a look at the native code.

; float AddXandY(Vector2 v)
; (0xBE29F26C) size=92 bytes
mflr    r12
stw    r12, 8(r15)
addi    r1, r1, -44
stw    r15, 0(r1)
addi    r15, r1, 0
addis    r3, r0, 48681
ori    r3, r3, 62060
addi    r5, r0, 0
stw    r5, 20(r15)
stw    r3, 12(r15)
stw    r15, 0(r30)
    addi    r4, r15, 44
    lfs    fr1, 0(r4)
addi    r4, r15, 44
lfs    fr2, 4(r4)
fadd    fr1, fr1, fr2

frsp    fr1, fr1
lwz    r15, 0(r15)
addi    r1, r1, 52
stw    r15, 0(r30)
lwz    r12, 8(r15)
mtlr    r12
blr

; float AddXandY(float x, float y)
; (0xBE29F3CC) size=108 bytes
mflr    r12
stw    r12, 8(r15)
addi    r1, r1, -44
stw    r15, 0(r1)
addi    r15, r1, 0
addis    r3, r0, 48681
ori    r3, r3, 62412
addi    r5, r0, 0
stw    r5, 20(r15)
stw    r3, 12(r15)
stw    r15, 0(r30)
    lfd    fr0, 44(r15)
frsp    fr0, fr0
stfs    fr0, 44(r15)
lfd    fr0, 52(r15)
frsp    fr0, fr0
stfs    fr0, 52(r15)

   lfs    fr1, 52(r15)
    lfs    fr2, 44(r15)
    fadd    fr1, fr1, fr2
frsp    fr1, fr1
lwz    r15, 0(r15)
addi    r1, r1, 60
stw    r15, 0(r30)
lwz    r12, 8(r15)
mtlr    r12
blr

In the two disassembled functions above, the orange highlight shows the code used to convert the double arguments to floats. That conversion is only present when a function signature contains float parameters.

The green highlights above represent loading the single-precision arguments and adding them. The difference in how the values are loaded is due to the difference in parameter type – fields are loaded via an offset from the address of the containing type.

Looking at the disassembled code, there are other obvious inefficiencies. Unfortunately, most of the inefficiencies are unavoidable, or else you end up trading one for another where there isn’t a clear winner in all situations. That’s why I said I would only write about inefficiencies that have practical workarounds. Avoiding float params is one such workaround.

Here is the function declared with double parameters:

; float AddXandY(double x, double y)
; (0xBE68F4DC) size=88 bytes
mflr    r12
stw    r12, 8(r15)
addi    r1, r1, -44
stw    r15, 0(r1)
addi    r15, r1, 0
addis    r3, r0, 48744
ori    r3, r3, 62684
addi    r5, r0, 0
stw    r5, 20(r15)
stw    r3, 12(r15)
stw    r15, 0(r30)
    lfd    fr1, 52(r15)
lfd    fr2, 44(r15)
fadd    fr1, fr1, fr2

frsp    fr1, fr1
    frsp    fr1, fr1
lwz    r15, 0(r15)
addi    r1, r1, 60
stw    r15, 0(r30)
lwz    r12, 8(r15)
mtlr    r12
blr

The main difference between this function and the one using floats is that the arguments are not converted to single-precision and written back to the stack before adding them (green highlight).

A word of caution: before converting all your float parameters to double, take a look at the orange highlight. I’m pretty certain this is a bug in the JIT compiler, as what I’ve highlighted is a redundant rounding operation (frsp is “floating-point round to single-precision”). This happens whenever you cast from double to float. I’m pointing this out because if you need to store the result of floating-point arithmetic in a float variable (like a field in a struct), then using doubles could do more harm than good (depends on how much arithmetic you need to do before storing the result).

It’s worth noting that casting from float to double in an expression incurs no cost. This is because all floating-point values are automatically converted to double-precision when loaded into a register (this is done by the CPU). For this reason, when you mix double- and single-precision values in an expression, it is preferable to perform the arithmetic at double-precision.

For example, prefer this:

float result = (float)(((double)singleValue + doubleValue1) * doubleValue2);

over this:

float result = (singleValue + (float)doubleValue1) * (float)doubleValue2;

In the first case, casting singleValue to double doesn’t use an instruction, but the explicit cast for the result emits two rounding instructions. In the second case, casting doubleValue1 and doubleValue2 to singles causes explicit rounding, as well as an additional rounding before storing the result (required before storing any float value).

Avoiding float parameters is a reasonable perf tip, but if you do it, you must be cautious about mixing float and double values.

Another tip I can provide is to avoid creating very small functions. That should be obvious from the disassembly, but in case it isn’t, there is a lot of painfully expensive overhead in the examples I provided. (Wouldn’t it be great if you could force those little functions to be inlined?)

Happy coding!

PS: My highlights weren’t preserved when I published the article, so I’ve attempted to fix it as best I could. I apologize if the assembly code is hard to read.

Posted in Programming, XNA Game Studio

XNA Code Quality on Xbox 360

More than a year ago, I wrote a tool based on CCI that could inline methods in a CIL assembly. My goal was to provide a means of improving runtime performance of XNA games on Xbox 360, where the JIT compiler doesn’t do many optimizations to begin with, and where the XNA Framework and C# language encourage coding patterns that result in particularly inefficient native code. For various reasons, I put that project aside just as I got it working. Recently, though, I blew the dust off my old source code and tried it again.

My initial experiments were flawed – I had tested the optimization on a debug assembly. Most of the performance improvement I saw could be had by enabling optimizations in the C# compiler (mainly elimination of redundant copying). Running my optimizer on an optimized C# assembly made half as much improvement as I previously reported (<7%). Anyway, that got me doing new experiments.

After two years of working on the code-gen team for the Xbox native compilers, I have a pretty good idea of what the PPC architecture can do, and what good PPC disassembly looks like. I also know what C++ code gets you the best PPC native code. I thought if I pre-optimized the CIL according to what I knew of the PPC architecture, XNA’s Xbox JIT compiler could generate better code even if it didn’t do optimizations of its own. Oh, man, was I ever naïve!

Through blind luck, I managed to eke out a total 15% improvement in Box2D using my IL optimization tool. Half the time, though, my attempts at optimization made the code run more slowly! This ran counter to my understanding of the architecture, and I decided I needed to see the generated machine code to find out what was really going on. To do this, I looked at an old XNA project, called Artemus, written by Justin Holewinski. The code I found wasn’t exactly working, but I put a bit of elbow grease into it and made a new version for XNA Game Studio 4.

Artemus uses unsafe code on the Xbox to read the JIT-compiled native machine instructions out of memory, then it sends them to a client app on the PC which disassembles the instructions. With this tool, I can now see why some of my optimization efforts resulted in slower execution: the JIT compiler is generating code that is largely agnostic of the target architecture. Specifically, it is only using a subset of available instructions and registers, which means its generated code is effectively emulating a much simpler, less capable CPU.

I feel like starting a collection to help all those poor, neglected registers!😦

The good news is that the disassembler is giving me insight I can use to improve my CIL optimizer. Before you ask, I am keeping my tools to myself for now. However, I will try to write up some of the performance pitfalls I discover, when I can also provide practical tips to avoid them. For now, more experimentation is necessary.

Posted in XNA Game Studio

Faster Content Builds That Don’t Lock-up Your IDE

Yesterday, I decided to explore what a sufficiently-motivated developer could do to improve content build times in XNA Game Studio 4.0. In a relatively short time, I was able to hack together a solution that distributes content builds across available processors, and doesn’t lock up the IDE (as described here). This means Visual Studio remains responsive while your content builds, you can track build progress in the Output Window, and the whole thing will finish faster.

Before continuing, let me state clearly that what I’m about to describe requires Visual Studio 2010 Professional. Those of you using Visual C# Express Edition won’t be able to use this technique.

Okay, now I’m going to explain what I did. In my sample solution, I reduced the build time by about 25% on my dual-core laptop (from 38 seconds down to 29 seconds).

For my experiment, I started with the Ship Game starter kit from create.msdn.com. I chose this starter kit because it contained enough content that locking up the IDE on rebuilds would get annoying really fast.

Here’s what I did (short version; longer version at the end):

  1. Split content into multiple content projects, all building to the same output directory.
  2. Add a Game Library project to the solution that references all the content projects.
  3. Add a VC++ Makefile project to the solution that builds the game library from (2) with multi-proc support enabled.

The key here is that I introduced a VC++ Makefile project to initiate the content build. This project type invokes arbitrary, user-defined commands, and it isn’t all broken like the C# projects. Specifically, it doesn’t lock up the UI thread when it executes long-running commands, and it correctly redirects all the command output to the Output Window, and reports errors in the Error List.

To have the makefile project build all my content projects in parallel, I set up the Build command to build the Game Library project using msbuild.exe with multi-processor support enabled.

Example Build Command:

"$(MSBuildBinPath)\msbuild.exe" /m:$(NUMBER_OF_PROCESSORS) "$(SolutionDir)Content_All\Content_All.csproj"

For Rebuild and Clean, I simply added /t:Rebuild and /t:Clean, respectively.
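Spelled out, the three commands on the makefile project’s property pages would be (paths as in my solution; yours may differ):

```text
Build:   "$(MSBuildBinPath)\msbuild.exe" /m:$(NUMBER_OF_PROCESSORS) "$(SolutionDir)Content_All\Content_All.csproj"
Rebuild: "$(MSBuildBinPath)\msbuild.exe" /m:$(NUMBER_OF_PROCESSORS) /t:Rebuild "$(SolutionDir)Content_All\Content_All.csproj"
Clean:   "$(MSBuildBinPath)\msbuild.exe" /m:$(NUMBER_OF_PROCESSORS) /t:Clean "$(SolutionDir)Content_All\Content_All.csproj"
```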

Note: VC# Express doesn’t support VC++ projects, which is why this won’t work for Express users.

So that’s what I did, and it works great. Content builds don’t lock up the IDE, and the Output Window displays the name of each asset as it is compiled (some reassuring feedback during long-running builds).

At this stage, my solution is set up to build content projects referenced by the Game Library project (Content_All) in parallel. When I tried it the first time, I only squeezed 2 seconds off the build time. The bottleneck in my build was the project using the normal mapping processor, so I added another project, and moved some of the models and dependencies over there. That brought more parallelism into the build and the overall build time dropped to 29 seconds (from 38).

I experimented a little with putting the audio in another project, and shuffling some of the models back and forth, but gave up before finding any additional gains. On a machine with more processors (my laptop has two), there is potential for more savings.

I should note that in order to split up the custom model processing into multiple projects, I had to include some of their dependencies twice. For example, both projects build the NormalMapping shader. The cost of the redundancy was offset by a greater savings from parallel processing.

The drawbacks to doing this are:

  1. You basically need two extra projects in your solution to enable it, for each target platform.
  2. You need to manually edit the Solution Configurations to make things build properly.
  3. For Windows projects, the content won’t be recognized by ClickOnce publishing.

The first isn’t so bad, because you can hide the extra projects in a Solution Folder. The second is something many people don’t really understand, but should be part of any advanced developer’s tool kit.

Another thing I want to note is that I separated the assets that use the custom processor from the assets that use standard processors to reduce incremental build times when NormalMappingModelProcessor.dll is modified. The standard content won’t rebuild at all – and thanks to splitting the custom content into two projects, the custom content can rebuild fully in less time as well.

Angry grumbling: I had intended to provide my modified ShipGame solution as a downloadable sample, because the web site said it was under Ms-Pl. However, the actual ZIP file contains a license file that describes the “XNA Premium Content” license, which prohibits redistribution of unmodified software.😦

Before I go, let me leave you with this disclaimer: your mileage may vary. I do not plan to write out a click-by-click tutorial, but I added more detail below.

Here’s what I did (longer version):

  1. Move ShipGame content project up to the solution folder, and rename it Content_Main.
  2. Add new content project, named Content_Custom.
  3. Delete the Content Reference from ShipGameWindows.
  4. Add new Game Library project to the solution, named Content_All.
  5. Change the output folder of Content_All to match the output folder of ShipGameWindows.
  6. In Content_All, add references to Content_Main and Content_Custom.
  7. Change the Content Root Directory of all content projects to “Content”.
  8. Add new C++ Makefile project, named Content_BuildAllParallel, and configure the Build, Rebuild, and Clean commands (details below).
  9. Move all the content that depends on NormalMappingModelProcessor into Content_Custom, along with any additional content it depends on.
  10. In Content_Custom, add a reference to NormalMappingModelProcessor.
  11. In Content_Main, remove the reference to NormalMappingModelProcessor.
  12. Edit the solution configuration so that Content_All and NormalMappingModelProcessor are excluded from all solution builds. Make sure that Content_BuildAllParallel is built when the rest of the game is built.
Posted in XNA Game Studio