The Case of the Corrupt PE Binaries

After installing Windows 7, I also installed the Windows 7 SDK as I wanted to poke around the updated headers and documentation files to see what was new at a low level. Additionally, I wanted to make sure all my code compiled against the new headers and libraries in case someone taking my native debugging class tried it and ran into problems. After many years of the SDK team completely ignoring Visual Studio, the Windows 7 SDK installation now looks for Visual Studio and properly integrates with it (I believe this started with the Vista SDK). After hundreds of thousands of emails over the years from people who couldn’t compile code from my books and columns because they hadn’t gone through the manual gyrations to integrate the latest SDK with the development environment, it’s a huge help.

As I have all my builds automated, I let rip and got a build failure on a few release build x86 and x64 binaries. The failure was like the following in all those cases:

mt.exe : general error c101008d: Failed to write the updated manifest to the resource of file “……..ReleaseFTSimpTest.exe”. The binary is not a valid Windows image.

MT.EXE is the tool used to embed the manifest into your binaries.

As I was running a beta OS, I wondered if this was a problem with anti-virus, so I disabled eTrust and tried the build again but still got the same error.

As I’m using Visual Studio 2008 SP1, CL.EXE and LINK.EXE are doing all the main work of compiling, but MT.EXE comes from the SDK. Since this code compiled correctly on my Vista computer with the Vista SDK installed, my next step was to see if this was possibly a problem with MT.EXE or if CL.EXE and LINK.EXE were exposing a bug in the operating system DLLs they were using. I uninstalled the Windows 7 SDK, which reverted Visual Studio 2008 SP1 to using the Vista SDK. Giving the recompiles a go, I got the same error from MT.EXE. I verified that I was in fact using the MT.EXE from the Vista SDK. One thing that was confusing to me was that MT.EXE from the Vista SDK and Windows 7 SDK both report the same version number, but the binaries are different sizes.

Looking closer at what binaries were getting corrupt, it was only four out of 56 .EXE files in my build. Interestingly, they were all console applications that were unit tests. Firing up Visual Studio, I created the canonical test console application, Hello World, and verified that it compiled. Looking at the BUILD.HTM file, I saw that Hello World had the following in it:

Creating command line “mt.exe @c:JunkcruftHelloWorldReleaseRSP00000840485968.rsp /nologo”
Creating temporary file “c:JunkcruftHelloWorldReleaseBAT00000940485968.bat” with contents
[
@echo Manifest resource last updated at %TIME% on %DATE% > .Releasemt.dep
]

For two seconds, I was a little confused about the temporary file creation because my failing builds didn’t have that. What the output told me was that the temporary batch file, which writes the resource update time file, is created only after MT.EXE runs successfully.

My hypothesis at this point was that either LINK.EXE was creating a corrupt binary before MT.EXE worked on it, or MT.EXE was corrupting the binary itself. Using one of my projects that produced a corrupt binary, I copied out the command lines for CL.EXE, LINK.EXE, and MT.EXE from its BUILD.HTM and ran them directly from a batch file at the command line (properly set up with VCVARS.BAT). I wanted to look at what was in the Portable Executable (PE) data to see if anything was amiss. By the way, you need to read Matt Pietrek’s definitive “An In-Depth Look into the Win32 Portable Executable File Format,” Part 1 and Part 2; you’ll learn a ton about how Windows works.

Running DUMPBIN.EXE /headers on the resulting EXE allowed me to look at the main portions. Because MT.EXE puts the manifest into the resource section of the binary, I paid special attention to it:

SECTION HEADER #4
.rsrc name
0 virtual size
4000 virtual address (00404000 to 00403FFF)
0 size of raw data
0 file pointer to raw data
0 file pointer to relocation table
0 file pointer to line numbers
0 number of relocations
0 number of line numbers
40000040 flags
Initialized Data
Read Only

Do you see the problem? Here’s a good resource section for comparison:

SECTION HEADER #4
.rsrc name
2B0 virtual size
4000 virtual address (00404000 to 004042AF)
400 size of raw data
1800 file pointer to raw data (00001800 to 00001BFF)
0 file pointer to relocation table
0 file pointer to line numbers
0 number of relocations
0 number of line numbers
40000040 flags
Initialized Data
Read Only

Interesting! There’s a resource section, but it is obviously corrupt because something is not filling out the raw data information: the virtual size, the size of raw data, and the file pointer to raw data are all zero. As the manifest goes in the resource section, MT.EXE really can’t add to a section whose raw data starts at zero.
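If you have many binaries to check, eyeballing DUMPBIN output gets tedious. Here is a minimal Python sketch of the same check, assuming the standard PE layout: the e_lfanew field at offset 0x3C, a 20-byte COFF header, and 40-byte section headers. The function names read_sections and suspicious_sections are my own for illustration, and the "all zero" test is an assumption based only on the bad .rsrc header shown above, not a general definition of a corrupt section.

```python
import struct

def read_sections(image: bytes):
    """Return (name, virtual_size, raw_size, raw_pointer) for each section."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF header follows the signature; we only need the
    # section count and the size of the optional header.
    coff = pe_offset + 4
    _machine, count, _stamp, _symptr, _nsyms, opt_size, _flags = \
        struct.unpack_from("<HHIIIHH", image, coff)
    table = coff + 20 + opt_size  # the section table follows the optional header
    sections = []
    for index in range(count):
        # Each section header is 40 bytes; the first 24 hold the fields we want.
        name, vsize, _vaddr, raw_size, raw_ptr = \
            struct.unpack_from("<8sIIII", image, table + index * 40)
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((text, vsize, raw_size, raw_ptr))
    return sections

def suspicious_sections(image: bytes):
    """Names of sections whose virtual size, raw size, and raw pointer are all zero."""
    return [name for name, vsize, raw_size, raw_ptr in read_sections(image)
            if vsize == 0 and raw_size == 0 and raw_ptr == 0]
```

Reading an .EXE in binary mode and passing the bytes to suspicious_sections would return ['.rsrc'] for a header like the bad one above and an empty list for the good one. A section with a zero raw size but a nonzero virtual size (uninitialized data, for example) is deliberately not flagged.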

Since the app wizard generated project is not producing corrupt binaries but I have several projects that are, it’s time to look at the LINK.EXE and MT.EXE switches to see what my projects have set that is tripping either of those tools up. The easy way for me to do that is to look at the .VCPROJ files in the fantastic, and free, SourceGear DiffMerge. (I can’t rave enough about DiffMerge!)

DiffMerge pointed me right to the RandomizedBaseAddress attribute of the VCLinkerTool element in the .VCPROJ as the only real difference between the files. In the app wizard project, the value is 1; for my bad project, the value is 2. A check in the project properties shows a value of 2 maps to the linker switch /DYNAMICBASE:NO. In my bad project, I changed the setting to /DYNAMICBASE and the previously corrupt release build built and worked perfectly.
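A visual diff tool is the quickest route for two files. If you need to compare linker settings across many projects, the comparison can also be scripted. This is a minimal Python sketch, assuming the Visual Studio 2008 .VCPROJ layout in which each Configuration element contains Tool elements identified by a Name attribute. The function names linker_settings and diff_linker_settings are hypothetical, and both take the project XML as text.

```python
import xml.etree.ElementTree as ET

def linker_settings(vcproj_xml: str) -> dict:
    """Map each configuration name to the attributes of its VCLinkerTool element."""
    root = ET.fromstring(vcproj_xml)
    settings = {}
    for config in root.iter("Configuration"):
        for tool in config.iter("Tool"):
            if tool.get("Name") == "VCLinkerTool":
                attrs = dict(tool.attrib)
                attrs.pop("Name", None)  # the tool name itself is not a setting
                settings[config.get("Name")] = attrs
    return settings

def diff_linker_settings(good_xml: str, bad_xml: str) -> list:
    """List (configuration, attribute, good_value, bad_value) for every difference."""
    good, bad = linker_settings(good_xml), linker_settings(bad_xml)
    differences = []
    # Only configurations present in both projects are compared.
    for config in sorted(good.keys() & bad.keys()):
        for key in sorted(good[config].keys() | bad[config].keys()):
            good_value = good[config].get(key)
            bad_value = bad[config].get(key)
            if good_value != bad_value:
                differences.append((config, key, good_value, bad_value))
    return differences
```

An attribute that is missing from one project shows up with None on that side, which helps because a missing attribute means the tool falls back to its default.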

Once I fixed all my broken projects to use /DYNAMICBASE instead of /DYNAMICBASE:NO, all my build problems went away. As I mentioned earlier, the corrupt builds happened to be several of my unit tests, which is where you might see it as well. The exact situation in which I was getting the corrupt binary is as follows:

  1. You are on Windows 7 with Visual Studio 2008 SP1.
  2. There’s no resource file in the binary.
  3. It’s a release build.
  4. You are specifically not passing /DYNAMICBASE to the linker.

After I spent the 20 minutes tracking this problem down, I thought maybe I should read the Windows 7 SDK ReadMe file to see if this was a known issue. Section 5.3.6, titled “Problem Running MT.EXE on Windows 7 Beta,” says MT.EXE fails if the .EXE does not contain a resource (.rsrc) section and that the workaround is to add an empty .RC file to your project.

After smacking my forehead wondering if this was the exact bug I was running into, I took the batch file where I ran all the CL.EXE, LINK.EXE, and MT.EXE commands directly and removed all the /MANIFEST related switches from the LINK.EXE command line. I also commented out the call to MT.EXE and rebuilt. Of course, with MT.EXE completely out of the way, the binary wasn’t corrupt.

I think I’m seeing a manifestation (pun intended!) of the bug mentioned in section 5.3.6 of the ReadMe. A little experimentation compiling and linking with the bare minimum switches necessary suggests that the linker defaults to /DYNAMICBASE:NO, because the optional header DLL characteristics do not show Dynamic base when you dump the binary.
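The Dynamic base line in the dump corresponds to the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the optional header. That field sits 70 bytes into the optional header for both PE32 and PE32+ images. Here is a minimal Python sketch of the check, under those layout assumptions. The function name has_dynamic_base is hypothetical.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040
PE32_MAGIC, PE32_PLUS_MAGIC = 0x10B, 0x20B

def has_dynamic_base(image: bytes) -> bool:
    """Report whether the image's DllCharacteristics has the dynamic base bit set."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header starts after the 4-byte signature and 20-byte COFF header.
    optional = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic not in (PE32_MAGIC, PE32_PLUS_MAGIC):
        raise ValueError("unrecognized optional header magic")
    # DllCharacteristics is at offset 70 in both the PE32 and PE32+ layouts.
    (dll_characteristics,) = struct.unpack_from("<H", image, optional + 70)
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
```

Passing the bytes of a binary linked with /DYNAMICBASE should return True, and one linked with /DYNAMICBASE:NO, or with no switch at all in the situation described above, should return False.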

While I thought I was going to have a nice Case Of… story, it turns out that I should have read the Windows 7 SDK ReadMe first, which is always a very wise idea when dealing with betas. There’s nothing like reproducing a known bug. However, I was able to find an additional workaround to the MT.EXE bug by setting the /DYNAMICBASE switch. In my situation that was better because if I added a .RC file to all the unit tests, I was going to have to add those files to my code installs and patches.
