Advanced PE Image Building
The easiest way to write a .NET module to disk is to call the Write method:
module.Write(@"C:\Path\To\Output\Binary.exe");
Behind the scenes, this creates and invokes a ManagedPEImageBuilder and a
ManagedPEFileBuilder with their default settings. Together, they completely
reconstruct the PE image, serialize it into a PE file, and write that file
to disk.
To get more control over the construction of the new PE image, we can use
and configure our own instance of an IPEImageBuilder
instead:
var imageBuilder = new ManagedPEImageBuilder();
/* ... Configuration of imageBuilder here... */
After configuring, the builder can be passed to the ModuleDefinition::Write
method as the second parameter:
module.Write(@"C:\Path\To\Output\Binary.exe", imageBuilder);
It is also possible to call ModuleDefinition::ToPEImage
to turn
the module into a PEImage
first. This image can then be post-processed
and later transformed into a PEFile
to write it to the disk:
// Turn module into a new PE image.
var image = module.ToPEImage(imageBuilder);
/* ... Post processing of the PE image here ... */
// Construct a new PE file.
var fileBuilder = new ManagedPEFileBuilder();
var file = fileBuilder.CreateFile(image);
/* ... Post processing of the PE file here ... */
// Write PE file to disk.
file.Write(@"C:\Path\To\Output\Binary.exe");
To get access to additional build artifacts, such as new metadata tokens
and builder diagnostics, it is possible to call the CreateImage
method from the image builder directly, and inspect the resulting
PEImageBuildResult
object:
// Construct image.
var result = imageBuilder.CreateImage(module);
/* ... Inspect build result here ... */
// Obtain constructed PE image.
var image = result.ConstructedImage;
/* ... Post processing of the PE image here ... */
// Construct a new PE file.
var fileBuilder = new ManagedPEFileBuilder();
var file = fileBuilder.CreateFile(image);
/* ... Post processing of the PE file here ... */
// Write PE file to disk.
file.Write(@"C:\Path\To\Output\Binary.exe");
This article explores various features of the ManagedPEImageBuilder class.
Token Mappings
Upon constructing a new PE image for a module, members defined in the
module might be re-ordered. This can make post-processing of the PE
image difficult, as metadata members cannot be looked up by their
original metadata token anymore. The PEImageBuildResult object returned
by CreateImage defines a property called TokenMapping. This
object maps all members that were included in the construction of the PE
image to the newly assigned metadata tokens, allowing for new metadata
rows to be looked up easily and efficiently.
var mainMethod = module.ManagedEntrypointMethod;
// Build PE image.
var result = imageBuilder.CreateImage(module);
// Look up the new metadata row assigned to the main method.
var newToken = result.TokenMapping[mainMethod];
var mainMethodRow = result.ConstructedImage.DotNetDirectory.Metadata
.GetStream<TablesStream>()
.GetTable<MethodDefinitionRow>()
.GetByRid(newToken.Rid);
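The lookup above uses only the RID part of the new token. As background, a metadata token packs the table index into its upper 8 bits and the row identifier (RID) into its lower 24 bits, which is why reordering rows changes the tokens of the affected members. The following is a minimal sketch of that layout; TokenSketch and its methods are hypothetical helpers written for illustration and are not part of AsmResolver.

```csharp
using System;

// Hypothetical illustration helper, not part of AsmResolver.
public static class TokenSketch
{
    // Splits a raw metadata token into its table index (upper 8 bits)
    // and its row identifier (lower 24 bits).
    public static (byte Table, uint Rid) Split(uint token)
    {
        return ((byte) (token >> 24), token & 0x00FFFFFF);
    }

    // Combines a table index and a RID back into a raw metadata token.
    public static uint Combine(byte table, uint rid)
    {
        return ((uint) table << 24) | (rid & 0x00FFFFFF);
    }
}
```

For example, the method definition table has index 0x06, so the first method in a module has the raw token 0x06000001. If the builder moves that method to the fifth row, its token becomes 0x06000005, and only the token mapping can relate the two.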
Preserving Raw Metadata Structure
Some .NET modules are carefully crafted and rely on the raw structure of all metadata streams. These kinds of modules often rely on one of the following:
- RIDs of rows within a metadata table.
- Indices of blobs within the #Blob, #Strings, #US or #GUID heaps.
- Unknown or unconventional metadata streams and their order.
The default PE image builder for .NET modules (ManagedPEImageBuilder)
defines a property called DotNetDirectoryFactory. It contains the object
responsible for constructing the .NET data directory, and this object can
be configured to preserve as much of this structure as possible. With the
help of the MetadataBuilderFlags enum, it is possible to indicate which
structures of the metadata directory need to be preserved. The following
table provides an overview of all preservation flags that can be used and
combined:
Flag | Description |
---|---|
PreserveXXXIndices | Preserves all row indices of the original XXX metadata table. |
PreserveTableIndices | Preserves all row indices from all original metadata tables. |
PreserveBlobIndices | Preserves all blob indices in the #Blob stream. |
PreserveGuidIndices | Preserves all GUID indices in the #GUID stream. |
PreserveStringIndices | Preserves all string indices in the #Strings stream. |
PreserveUserStringIndices | Preserves all user-string indices in the #US stream. |
PreserveUnknownStreams | Preserves any of the unknown / unconventional metadata streams. |
PreserveStreamOrder | Preserves the original order of all metadata streams. |
PreserveAll | Preserves as much of the original metadata as possible. |
Below is an example of how to configure the image builder to preserve blob indices and the metadata tokens of all type references:
var factory = new DotNetDirectoryFactory();
factory.MetadataBuilderFlags = MetadataBuilderFlags.PreserveBlobIndices
| MetadataBuilderFlags.PreserveTypeReferenceIndices;
imageBuilder.DotNetDirectoryFactory = factory;
Warning
Preserving heap indices copies over the original contents of the heaps to the new PE image "as-is". While AsmResolver tries to reuse blobs defined in the original heaps as much as possible, this is often not possible without also preserving RIDs in the tables stream. This might result in a significant increase in file size.
Note
Preserving RIDs within metadata tables might require AsmResolver to inject placeholder rows in existing metadata tables that are solely there to fill up space between existing rows.
Warning
Preserving RIDs within metadata tables might require AsmResolver to make
use of the Edit-And-Continue metadata tables (such as the pointer
tables). The resulting tables stream could therefore be renamed from #~
to #-, and the file size might increase.
String Folding in #Strings Stream
Named metadata members (such as types, methods and fields) are assigned
a name by referencing a string in the #Strings stream by its starting
offset. When a metadata member has a name that is a suffix of another
member's name, it is possible to store only the longer name in the
stream, and let the member with the shorter name use an offset that points
into the middle of this longer name. For example, consider two members with
the names ABCDEFG and DEFG. If ABCDEFG is stored at offset 1, then the name
DEFG is implicitly defined at offset 1 + 3 = 4, and can thus be referenced
without appending DEFG to the stream a second time.
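The folding idea can be sketched in a few lines. The code below is a simplified illustration and not AsmResolver's actual implementation: StringFoldingSketch and Fold are hypothetical names, the search for a host string is a plain linear scan, and names are assumed to be well-formed strings that are stored as zero-terminated UTF-8.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical illustration helper, not part of AsmResolver.
public static class StringFoldingSketch
{
    // Builds a zero-terminated string heap and returns the offset of each name.
    // Names that are a suffix of an already stored name reuse that name's tail.
    public static Dictionary<string, int> Fold(IEnumerable<string> names, out byte[] heap)
    {
        var offsets = new Dictionary<string, int>();
        var buffer = new List<byte> { 0 }; // Offset 0 holds the empty string.

        // Visit longer names first, so that every potential host is stored
        // before the shorter names that may be folded into it.
        foreach (var name in names.Distinct().OrderByDescending(n => n.Length))
        {
            // Linear search for a stored name that ends with the current one.
            var host = offsets.Keys.FirstOrDefault(
                k => k.EndsWith(name, StringComparison.Ordinal));

            if (host != null)
            {
                // Point into the tail of the host; its zero terminator is shared.
                int skipped = Encoding.UTF8.GetByteCount(host)
                            - Encoding.UTF8.GetByteCount(name);
                offsets[name] = offsets[host] + skipped;
            }
            else
            {
                // No host found: append the name and its terminator.
                offsets[name] = buffer.Count;
                buffer.AddRange(Encoding.UTF8.GetBytes(name));
                buffer.Add(0);
            }
        }

        heap = buffer.ToArray();
        return offsets;
    }
}
```

Calling Fold with the names ABCDEFG and DEFG places ABCDEFG at offset 1 and DEFG at offset 4, matching the example above. The repeated suffix search in this sketch also hints at why folding can become expensive for inputs with many names.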
By default, the PE image builder folds strings in the #Strings stream as
described above. However, for some input binaries, this might make the
building process take a significant amount of time. To disable string
folding, specify the NoStringsStreamOptimization flag in your
DotNetDirectoryFactory:
factory.MetadataBuilderFlags |= MetadataBuilderFlags.NoStringsStreamOptimization;
Warning
Some obfuscated binaries might include lots of members that have very long but similar names. For these types of binaries, disabling this optimization can result in a significantly larger output file size.
Note
When PreserveStringIndices
is set and string folding is enabled
(NoStringsStreamOptimization
is unset), the PE image builder will not
fold strings from the original #Strings
stream into each other.
However, it will still try to reuse these original strings as much as
possible.
Deduplication of Embedded Resource Data
By default, when adding two embedded resources to a file with identical contents, AsmResolver will not add the second copy of the data to the output file and instead reuse the first blob. This can drastically reduce the size of the final output file, especially for larger applications with many (small) identical resource files (e.g., many Windows Forms Applications).
While most implementations of the .NET runtime support this, some assembly post-processors (e.g., obfuscators) may not handle it well, or may depend on each resource item being present individually.
To stop AsmResolver from performing this optimization, specify the
NoResourceDataDeduplication
metadata builder flag:
factory.MetadataBuilderFlags |= MetadataBuilderFlags.NoResourceDataDeduplication;
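To make the effect of deduplication concrete, here is a minimal sketch of a content-addressed resource data pool. It is an illustration under stated assumptions and not AsmResolver's implementation: ResourceDataPool is a hypothetical class, identical contents are detected with a SHA-256 hash (collisions are ignored), and each entry is stored as a 4-byte length prefix followed by the data, written in the byte order of the host machine.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical illustration class, not part of AsmResolver.
public sealed class ResourceDataPool
{
    private readonly Dictionary<string, uint> _offsetsByHash = new Dictionary<string, uint>();
    private readonly MemoryStream _data = new MemoryStream();

    // Total number of bytes written to the pool so far.
    public long Size => _data.Length;

    // Adds the contents to the pool and returns the offset of the entry.
    // Identical contents are stored once and share the same offset.
    public uint Add(byte[] contents)
    {
        string key;
        using (var sha = SHA256.Create())
            key = Convert.ToBase64String(sha.ComputeHash(contents));

        // Reuse the existing entry if the same contents were added before.
        if (_offsetsByHash.TryGetValue(key, out uint existing))
            return existing;

        // Otherwise append a new entry: length prefix, then the raw data.
        uint offset = (uint) _data.Length;
        _data.Write(BitConverter.GetBytes(contents.Length), 0, 4);
        _data.Write(contents, 0, contents.Length);
        _offsetsByHash[key] = offset;
        return offset;
    }
}
```

Adding the same byte array twice returns the same offset and leaves the pool size unchanged. A tool that expects every resource to have its own distinct data offset would be confused by exactly this sharing.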
Preserving Maximum Stack Depth
CIL method bodies work with a stack, and the stack has a pre-defined
size. This pre-defined size is defined by the MaxStack
property of the
CilMethodBody
class. By default, AsmResolver automatically calculates
the maximum stack depth of a method body upon writing the module to the
disk. However, this is not always desirable.
To override this behaviour, set ComputeMaxStackOnBuild to false on all
method bodies that should be excluded from the maximum stack depth
calculation.
MethodDefinition method = ...;
method.CilMethodBody.ComputeMaxStackOnBuild = false;
Alternatively, if you want to force the maximum stack depths to be either
preserved or recalculated for all methods defined in the target assembly,
it is possible to provide a custom implementation of IMethodBodySerializer,
or to set up a new CilMethodBodySerializer with the
ComputeMaxStackOnBuildOverride property set to the overriding value:
DotNetDirectoryFactory factory = ...;
factory.MethodBodySerializer = new CilMethodBodySerializer
{
    ComputeMaxStackOnBuildOverride = false
};
Warning
Disabling max stack computation may have unexpected side-effects (such as rendering certain CIL method bodies invalid).
Strong Name Signing
Assemblies can be signed with a strong-name signature. Open a strong name private key from a file:
var snk = StrongNamePrivateKey.FromFile(@"C:\Path\To\keyfile.snk");
Prepare the image builder to delay-sign the PE image:
DotNetDirectoryFactory factory = ...;
factory.StrongNamePrivateKey = snk;
After writing the module to an output stream, use the StrongNameSigner
class to sign the image.
using Stream outputStream = ...;
imageBuilder.DotNetDirectoryFactory = factory;
module.Write(outputStream, imageBuilder);
var signer = new StrongNameSigner(snk);
signer.SignImage(outputStream, module.Assembly.HashAlgorithm);
Image Builder Diagnostics
.NET modules that contain invalid metadata and/or method bodies might
cause problems when they are serialized to a PE image or file. To inspect
all errors that occurred during the construction of a PE image, call the
CreateImage method with the ErrorListener property set to an instance of
the DiagnosticBag class. This is an implementation of IErrorListener that
collects all the problems that occurred during the process:
// Set up a diagnostic bag as an error listener.
var diagnosticBag = new DiagnosticBag();
imageBuilder.ErrorListener = diagnosticBag;
// Build image.
var result = imageBuilder.CreateImage(module);
// Print all errors.
Console.WriteLine("Construction finished with {0} errors.", diagnosticBag.Exceptions.Count);
foreach (var error in diagnosticBag.Exceptions)
Console.WriteLine(error.Message);
Whenever a problem is reported, AsmResolver attempts to recover or fill
in default data where corrupted data was encountered. To simply build
the PE image ignoring all diagnostic errors, it is also possible to pass
in EmptyErrorListener.Instance
instead:
imageBuilder.ErrorListener = EmptyErrorListener.Instance;
Warning
Using EmptyErrorListener will suppress any non-critical builder errors.
However, these errors are typically indicative of an invalid executable
being constructed. Therefore, even if an output file is produced, it may
have unexpected side-effects (such as the file not functioning properly).
Note
Setting an instance of IErrorListener
in the image builder will only
affect the building process. If the input module is initialized from a
file containing invalid metadata, you may still experience reader
errors, even if an EmptyErrorListener
is specified to the builder. See
Advanced Module Reading for
handling reader diagnostics.
To test whether any of the errors caused AsmResolver to abort the
construction of the image, use the PEImageBuildResult::HasFailed property.
If this property is set to false, the image stored in the ConstructedImage
property can be written to the disk:
if (!result.HasFailed)
{
var fileBuilder = new ManagedPEFileBuilder();
var file = fileBuilder.CreateFile(result.ConstructedImage);
file.Write("output.exe");
}