Module 2: LLVM Compiler Architecture
Understanding the compilation pipeline that makes compiler-level function masking possible.
Module Objective
Learn how the LLVM compiler framework is structured, what intermediate representations it uses, how the X86 backend transforms IR into machine code, and specifically where FunctionPeekaboo’s X86RetModPass hooks into the pipeline to instrument functions at the PreEmit phase.
1. What Is LLVM?
LLVM (originally “Low Level Virtual Machine,” now just a name) is a modular compiler infrastructure used by Clang (C/C++), Rust, Swift, and many other languages. Its key design principle is a three-phase architecture:
LLVM Three-Phase Architecture
| Stage | Component | Role |
|---|---|---|
| Frontend | Clang, rustc, swiftc | Parses source → LLVM IR |
| Middle-end | Optimizer | Transform passes on IR |
| Backend | Target-specific | IR → machine code |
The frontend (e.g., Clang for C/C++) parses source code into LLVM IR (Intermediate Representation). The middle-end runs optimization passes on this IR. The backend converts optimized IR into target-specific machine code (x86, ARM, RISC-V, etc.).
FunctionPeekaboo operates in the backend, specifically in the X86 backend. This matters because by the end of the backend pipeline, all high-level language features have been lowered to concrete machine instructions, so FunctionPeekaboo can inject exact assembly sequences.
2. LLVM Intermediate Representation (IR)
LLVM IR is a typed, SSA-form (Static Single Assignment) intermediate language. It looks like a cross between assembly and a high-level language:
LLVM IR
; A simple function that adds two integers
define i32 @add(i32 %a, i32 %b) {
entry:
  %result = add i32 %a, %b
  ret i32 %result
}

; A function with a conditional branch
define i32 @max(i32 %a, i32 %b) {
entry:
  %cmp = icmp sgt i32 %a, %b
  br i1 %cmp, label %then, label %else
then:
  ret i32 %a
else:
  ret i32 %b
}
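For orientation, the two IR functions above are roughly what a frontend would produce for C++ source like the following. This is only a sketch: the name `max_of` is chosen here to avoid clashing with `std::max`, and the exact IR a real compiler emits depends on its version and optimization level.

```cpp
// Source-level counterpart of the @add function: one add, one return.
int add(int a, int b) {
    return a + b;
}

// Source-level counterpart of the @max function: a signed comparison
// (icmp sgt) followed by a branch to one of two return blocks.
int max_of(int a, int b) {
    if (a > b) {
        return a;   // corresponds to the "then" block
    }
    return b;       // corresponds to the "else" block
}
```

Calling `add(2, 3)` or `max_of(3, 7)` from a small `main` is enough to try it out.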
Key properties of LLVM IR that matter for FunctionPeekaboo:
- SSA Form: Each variable is assigned exactly once, making data flow analysis straightforward
- Typed: Every value has a type (i32, i64, ptr, etc.)
- Target-independent: The same IR can be lowered to x86, ARM, or any other supported backend
- Function-level granularity: Each function is a self-contained unit, which aligns perfectly with per-function masking
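The SSA property in the list above can be illustrated with a small renaming sketch. This is a toy model, not LLVM code: the `Stmt` type and the `toSSA` function are hypothetical, it handles only straight-line code (no branches, so no phi nodes), and the `name.N` versioning scheme is an assumption made for readability.

```cpp
#include <map>
#include <string>
#include <vector>

// One toy statement: dest = f(operands...). The operation itself is irrelevant here.
struct Stmt {
    std::string dest;
    std::vector<std::string> operands;
};

// Rename a straight-line sequence so that every destination is assigned exactly once.
std::vector<Stmt> toSSA(const std::vector<Stmt>& code) {
    std::map<std::string, int> version;  // latest version number per variable
    std::vector<Stmt> out;
    for (const Stmt& s : code) {
        Stmt renamed;
        // Operands refer to the most recent version of each variable.
        // Names never assigned in this sequence (inputs) are kept unchanged.
        for (const std::string& op : s.operands) {
            auto it = version.find(op);
            renamed.operands.push_back(
                it == version.end() ? op : op + "." + std::to_string(it->second));
        }
        // Each assignment creates a fresh version of its destination.
        int v = ++version[s.dest];
        renamed.dest = s.dest + "." + std::to_string(v);
        out.push_back(renamed);
    }
    return out;
}
```

Given `x = a + b; x = x + c; y = x`, the second statement reads `x.1` and defines `x.2`, so every use points at exactly one definition. That is what makes data flow easy to follow in SSA form.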
Why Not Modify IR?
FunctionPeekaboo could theoretically inject masking logic at the IR level, but this would be problematic. At the IR level, there are no concrete machine instructions yet — the injected code would need to survive lowering, instruction selection, register allocation, and scheduling. By modifying at the backend level (after these phases), FunctionPeekaboo injects exact x86 machine instructions that go directly into the output binary.
3. The X86 Backend Pipeline
The LLVM X86 backend converts IR into x86 machine code through a series of passes. Each pass transforms the code further toward final machine code:
X86 Backend Pass Pipeline (Simplified)
| Phase | Pass Category | What Happens |
|---|---|---|
| 1 | Instruction Selection (ISel) | IR instructions are matched to x86 machine instructions using pattern matching (SelectionDAG or GlobalISel) |
| 2 | Machine IR (MIR) Optimization | Machine instructions are optimized: peephole opts, dead code elimination, instruction combining |
| 3 | Register Allocation | Virtual registers are mapped to physical x86 registers (RAX, RCX, etc.); values that do not fit in registers are spilled to the stack |
| 4 | Prologue/Epilogue Insertion | Stack frame setup/teardown code is added (push rbp, sub rsp, etc.) |
| 5 | Post-RA Optimization | Further optimization after register allocation (machine copy propagation, branch folding, post-RA scheduling) |
| 6 | PreEmit | Final passes before code emission — this is where X86RetModPass runs |
| 7 | Code Emission | Machine instructions are serialized to binary (MC layer) and written to the object file |
The PreEmit phase is the last opportunity to modify the machine code before it is finalized. By this point, all register allocation is done, all frame setup is in place, and the instructions are in their final form. This makes it the ideal insertion point for FunctionPeekaboo’s stubs.
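The ordering argument can be sketched with a toy pipeline in plain C++. This is not LLVM's pass manager: `ToyPipeline`, `Code`, `Pass`, and `snapshotAtPreEmit` are hypothetical names, and instructions are modeled as strings purely for illustration. The point is only that a pass registered last observes the output of every earlier pass.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Code = std::vector<std::string>;    // one string per toy instruction
using Pass = std::function<void(Code&)>;  // a pass mutates the code in place

// Toy pipeline: passes run strictly in registration order.
struct ToyPipeline {
    std::vector<std::pair<std::string, Pass>> passes;

    void add(std::string name, Pass pass) {
        passes.emplace_back(std::move(name), std::move(pass));
    }

    Code run(Code code) const {
        for (const auto& entry : passes) entry.second(code);
        return code;
    }
};

// Returns what a pass registered last (standing in for PreEmit) observes.
Code snapshotAtPreEmit(const Code& input) {
    Code seen;
    ToyPipeline pipeline;
    // Earlier phase: frame setup is added to the front of the function.
    pipeline.add("prologue-epilogue", [](Code& c) {
        c.insert(c.begin(), std::string("push rbp"));
    });
    // Last phase: records the code exactly as it will be emitted.
    pipeline.add("pre-emit", [&seen](Code& c) { seen = c; });
    pipeline.run(input);
    return seen;
}
```

In this model the last pass already sees the `push rbp` added by the earlier phase, which mirrors why a PreEmit pass can rely on frame setup and register assignments being in place.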
4. MachineFunction and MachineFunctionPass
In the LLVM backend, each function is represented as a MachineFunction object. A MachineFunctionPass is a pass that operates on one MachineFunction at a time — it receives each function, can inspect and modify its machine instructions, and returns whether it changed anything.
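C++
// Simplified MachineFunctionPass structure
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"

using namespace llvm;

class X86RetModPass : public MachineFunctionPass {
public:
  static char ID;
  X86RetModPass() : MachineFunctionPass(ID) {}

  // Called once per function. MF contains all MachineBasicBlocks,
  // and each MBB contains MachineInstr objects.
  bool runOnMachineFunction(MachineFunction &MF) override {
    // Check if this function should be instrumented
    if (!shouldInstrument(MF))
      return false; // no changes made

    // Instrument the function
    addPrologueStub(MF);
    replaceReturns(MF);
    return true; // function was modified
  }

private:
  // Helper declarations (signatures assumed; bodies omitted in this sketch)
  bool shouldInstrument(const MachineFunction &MF);
  void addPrologueStub(MachineFunction &MF);
  void replaceReturns(MachineFunction &MF);
};

char X86RetModPass::ID = 0; // storage for the pass identifier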
The MachineFunction contains MachineBasicBlock objects, which in turn contain MachineInstr objects. FunctionPeekaboo’s X86RetModPass iterates through these to find all RET instructions and replace them with epilogue stubs, and to prepend prologue stubs to the function entry.
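The function → block → instruction containment can be sketched with a toy model in standard C++. These are not LLVM's classes: `ToyFunction`, `ToyBlock`, `ToyInstr`, `findReturnSites`, and `makeToyMax` are hypothetical stand-ins, and the opcode strings are placeholders. The sketch only shows the nested traversal a pass performs to locate every return.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for MachineInstr, MachineBasicBlock, and MachineFunction.
struct ToyInstr    { std::string opcode; bool isReturn; };
struct ToyBlock    { std::string name; std::vector<ToyInstr> instrs; };
struct ToyFunction { std::string name; std::vector<ToyBlock> blocks; };

// Walk every block and every instruction, collecting the
// (block index, instruction index) position of each return.
std::vector<std::pair<std::size_t, std::size_t>>
findReturnSites(const ToyFunction& fn) {
    std::vector<std::pair<std::size_t, std::size_t>> sites;
    for (std::size_t b = 0; b < fn.blocks.size(); ++b) {
        const ToyBlock& block = fn.blocks[b];
        for (std::size_t i = 0; i < block.instrs.size(); ++i) {
            if (block.instrs[i].isReturn) sites.emplace_back(b, i);
        }
    }
    return sites;
}

// A toy version of the two-way branch function: an entry block
// plus two blocks that each end in a return.
ToyFunction makeToyMax() {
    ToyFunction fn;
    fn.name = "max";
    fn.blocks.push_back({"entry", {{"CMP", false}, {"JCC", false}}});
    fn.blocks.push_back({"then",  {{"MOV", false}, {"RET", true}}});
    fn.blocks.push_back({"else",  {{"MOV", false}, {"RET", true}}});
    return fn;
}
```

A function with several exits has several return sites, so a pass that rewrites returns has to visit all blocks rather than only the last one.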
5. Machine Instructions at the PreEmit Stage
At the PreEmit stage, instructions look like concrete x86 machine instructions, but they are still represented as MachineInstr objects (not yet serialized to bytes). For example:
MIR (Machine IR)
; A simple function at PreEmit stage
bb.0.entry:
  liveins: $edi, $esi
  $eax = LEA32r $edi, 1, $esi, 0, $noreg   ; eax = edi + esi
  RET 0, $eax                               ; return eax

; After X86RetModPass, the RET is replaced:
bb.0.entry:
  liveins: $edi, $esi
  ; ... prologue stub (inline bytes) ...
  $eax = LEA32r $edi, 1, $esi, 0, $noreg
  ; ... epilogue stub replaces the RET ...
  CALL64pcrel32 @handler                    ; call handler to re-encrypt
  RET 0, $eax                               ; then return
Key Advantage of PreEmit
At PreEmit, register allocation is complete, so FunctionPeekaboo knows exactly which physical registers are in use. The prologue stub can safely use registers that are known to be free (or save/restore them on the stack). The epilogue stub similarly knows the register state at each return point. This level of precision is only available in the backend.
6. How FunctionPeekaboo Integrates
FunctionPeekaboo adds a new MachineFunctionPass called X86RetModPass to the X86 backend’s pass pipeline. The integration touches three files in the LLVM source: two existing files are modified and one new file is added.
LLVM Modification Points
| File | Change |
|---|---|
| lib/Target/X86/X86TargetMachine.cpp | Register X86RetModPass in the target pass pipeline at the PreEmit stage |
| lib/Target/X86/X86RetModPass.cpp | The new pass implementation (function detection, stub injection, metadata generation) |
| lib/Target/X86/CMakeLists.txt | Add the new source file to the build |
The pass registration in X86TargetMachine.cpp places it at the PreEmit position:
C++
// In X86TargetMachine.cpp - addPreEmitPass()
void X86PassConfig::addPreEmitPass() {
  // ... existing passes ...
  addPass(new X86RetModPass()); // FunctionPeekaboo's pass
}
7. Function Attributes for Registration
Not every function should be instrumented — only functions explicitly marked by the developer. FunctionPeekaboo uses LLVM function attributes to identify which functions to instrument:
C++
// In the implant source code, mark functions for masking:
__attribute__((annotate("peekaboo")))
void beacon_checkin() {
  // This function will be self-masking
  // The attribute tells X86RetModPass to instrument it
}

// Unmarked functions are left alone:
void helper_function() {
  // This function will NOT be instrumented
  // It stays as normal code
}
The X86RetModPass checks each function for this attribute. If present, the function is registered for instrumentation: its address and size are recorded in the .funcmeta section, and prologue/epilogue stubs are injected.
Registration Granularity
The developer has full control over which functions are masked. Functions that are called extremely frequently (hot loops) might be left unmasked for performance. Functions containing sensitive logic (C2 communication, credential handling, lateral movement) should be masked. The attribute-based approach lets the developer make this trade-off per function.
8. Building LLVM with FunctionPeekaboo
To use FunctionPeekaboo, you must build a custom LLVM/Clang toolchain with the patch applied. The typical workflow is:
Bash
# 1. Clone the LLVM project
git clone https://github.com/llvm/llvm-project.git
cd llvm-project

# 2. Apply FunctionPeekaboo patches
# (copies X86RetModPass.cpp, modifies X86TargetMachine.cpp, etc.)
git apply functionpeekaboo.patch

# 3. Build LLVM + Clang with X86 backend
cmake -S llvm -B build -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS="clang" \
  -DLLVM_TARGETS_TO_BUILD="X86"
ninja -C build

# 4. The resulting clang binary supports FunctionPeekaboo
# Use it to compile your implant:
build/bin/clang --target=x86_64-pc-windows-msvc \
  -O2 implant.c -o implant.exe

# 5. Post-process the PE to set the entry point
python3 modifyEP.py implant.exe
Build Time Consideration
Building LLVM and Clang from source typically takes 30–60 minutes on modern hardware with sufficient RAM (16 GB+ recommended). This is a one-time cost: once the toolchain is built, recompiling the implant is fast. Apart from the modifyEP.py post-processing step, the custom Clang binary is the only tool needed, and no runtime dependencies are added.
9. The Compilation Flow with FunctionPeekaboo
Here is the complete compilation flow from source code to a self-masking binary:
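Full Compilation Pipeline
| Stage | What Happens |
|---|---|
| Source | Functions with peekaboo attribute |
| Frontend | Parse → LLVM IR |
| Optimizer | Standard passes |
| X86 Backend | ISel → RegAlloc → X86RetModPass |
| modifyEP.py | Adjust PE entry to .stub section |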
The key addition is X86RetModPass in the backend and modifyEP.py as a post-build step. The optimizer runs unchanged, meaning all standard optimizations (-O2, -O3, LTO) work normally. FunctionPeekaboo does not interfere with optimization because it runs after all optimization is complete.
Knowledge Check
Q1: At which stage of the LLVM X86 backend does X86RetModPass run?
Q2: Why does FunctionPeekaboo modify the LLVM backend rather than the IR?
Q3: What mechanism does FunctionPeekaboo use to determine which functions to instrument?