Table of Contents

Graph Construction

There are two main strategies for constructing control and data flow graphs from a raw code stream: Static and Symbolic Graph Construction.

Static Graph Construction

The most efficient way to construct a CFG is using a static control flow graph builder. Each architecture that supports static control flow graph building implements the IStaticSuccessorResolver interface, which is able to determine the successors of a single instruction. A graph can then be constructed using the StaticFlowGraphBuilder class:

using Echo.ControlFlow.Construction;

IArchitecture<TInstruction> architecture = ...;
IStaticSuccessorResolver<TInstruction> resolver = ...;

IList<TInstruction> instructions = ...;
var builder = new StaticFlowGraphBuilder<TInstruction>(
    architecture, 
    instructions, 
    resolver
);

var cfg = builder.ConstructFlowGraph();

The control flow graph builder also supports instruction providers that can obtain or decode instructions by an offset instead. This can be useful when dealing with architectures for which the start and end of a function is not well-defined and a recursive descent disassembly is required (such as x86):

class X86InstructionDecoder : IStaticInstructionProvider<X86Instruction>
{
    X86Instruction GetInstructionAtOffset(long offset) 
    {
        return /* ... Decode instruction at provided offset ... */
    }
}

/* ... */ 

var decoder = new X86InstructionDecoder();
var builder = new StaticFlowGraphBuilder<X86Instruction>(
    architecture, 
    decoder, 
    resolver
);

var cfg = builder.ConstructFlowGraph(entryPoint: 0x1234);
Note

Often, a backend platform has this boilerplate already implemented by extension methods. For instance, Echo.Platforms.AsmResolver defines an extension method on CilMethodBody called ConstructStaticFlowGraph.

CilMethodBody methodBody = ...;
var cfg = methodBody.ConstructStaticFlowGraph();

Refer to the platform-specific documentation to see how these graphs can be constructed easily.

Symbolic Graph Construction

A big limitation of static CFG builders is that it cannot work on chunks of code that contain indirect jumps or calls. Symbolic CFG builders attempt to solve this, by keeping track of data flow as the disassembly is going. In contrast to a static successor, the symbolic graph builder requires an instance of IStateTransitioner instead. This interface takes a symbolic input state, and transforms it into a set of all the possible symbolic output states that the instruction may produce.

using Echo.DataFlow.Construction;

IArchitecture<TInstruction> architecture = ...;
StateTransitioner<TInstruction> transitioner = ...;

IList<TInstruction> instructions = ...;
var builder = new SymbolicFlowGraphBuilder<TInstruction>(
    architecture, 
    instructions, 
    transitioner
);

var cfg = builder.ConstructFlowGraph();

Most state transitioners produce a data flow graph as a by-product.

// First create the CFG.
var cfg = builder.ConstructFlowGraph();

// After building the CFG, a DFG is populated in the transitioner.
var dfg = transitioner.DataFlowGraph;
Warning

While symbolic graph construction usually is more accurate, it is significantly slower than static graph construction and can take a lot of memory.

Note

Often, a backend platform has this boilerplate already implemented by extension methods. For instance, Echo.Platforms.AsmResolver defines an extension method on CilMethodBody called ConstructStaticFlowGraph.

CilMethodBody methodBody = ...;
var cfg = methodBody.ConstructSymbolicFlowGraph(out var dfg);

Refer to the platform-specific documentation to see how these graphs can be constructed easily.