Module 8: FOLIAGE Sleep Masking
Encrypting the beacon in memory while it sleeps — the crown jewel of AceLdr's evasion
Advanced
The Big Idea
Inspired by SecIdiot/FOLIAGE
AceLdr's sleep masking is inspired by the FOLIAGE project by SecIdiot. The core insight: Cobalt Strike beacons spend the vast majority of their lifecycle sleeping between check-ins. During sleep, the beacon's code and data sit in memory unencrypted — a sitting target for memory scanners. FOLIAGE solves this by encrypting the beacon's heap and changing its memory permissions before sleeping, then reversing everything upon wakeup.
The challenge is that you can't simply encrypt yourself and then sleep — the code performing the encryption would also be encrypted. FOLIAGE's elegant solution uses Asynchronous Procedure Calls (APCs) queued to the current thread. The APCs execute in sequence when the thread enters an alertable wait state, performing all the masking operations without the beacon's own code needing to be executing.
Sleep_Hook — The Entry Point
When the beacon calls Sleep(dwMilliseconds), AceLdr's IAT hook intercepts it. The hook in hooks/delay.c first checks for short sleeps:
C — hooks/delay.c
VOID WINAPI Sleep_Hook(DWORD dwMilliseconds) {
    // Short sleep bypass: for sleeps under 1000ms, just call the
    // real Sleep directly. FOLIAGE overhead isn't worth it for
    // quick operational pauses (e.g., jitter calculations).
    if (dwMilliseconds < 1000) {
        STUB.Api.Sleep(dwMilliseconds);
        return;
    }

    // For real sleep intervals (>= 1 second):
    //   1. Generate a fresh RC4 encryption key
    //   2. Encrypt the entire private heap
    //   3. Queue the APC chain for sleep masking
    //   4. Enter alertable wait (APCs fire, thread sleeps)
    //   5. On wakeup: the APC chain decrypts the beacon code and
    //      restores its permissions and thread context
    //   6. Decrypt the private heap (second encryptHeap call below)
    USTRING Key = { 0 };
    generateEncryptionKey(&Key);

    // Encrypt the private heap before sleeping
    encryptHeap(&Key);

    // Build and queue the APC chain, then sleep
    foliageSleep(dwMilliseconds, &Key);

    // After waking up, decrypt the heap
    encryptHeap(&Key); // RC4 is symmetric: encrypt again = decrypt
}
The 1000ms Threshold
Short sleeps (under 1 second) bypass the entire FOLIAGE mechanism. This is a performance optimization. The beacon frequently uses short sleeps for jitter, retry logic, and internal timing. Running the full APC chain for every sub-second sleep would add significant overhead and produce suspicious patterns of APC queuing. Only the "real" sleep intervals between C2 check-ins trigger FOLIAGE.
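The routing decision described above can be modelled in a few lines of portable C. This is only an illustrative sketch: `MASK_THRESHOLD_MS`, `should_mask_sleep`, and `count_masked` are hypothetical names that do not appear in AceLdr, and the code only classifies sleep durations without touching any Windows API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant mirroring the 1000 ms cutoff described in the text. */
#define MASK_THRESHOLD_MS 1000u

/* True when a sleep of this length would take the masked path
 * (same comparison as the hook: anything under the cutoff is bypassed). */
static bool should_mask_sleep(uint32_t ms) {
    return ms >= MASK_THRESHOLD_MS;
}

/* Count how many sleeps in a trace would take the masked path. */
static size_t count_masked(const uint32_t *sleeps_ms, size_t n) {
    size_t masked = 0;
    for (size_t i = 0; i < n; i++) {
        if (should_mask_sleep(sleeps_ms[i])) {
            masked++;
        }
    }
    return masked;
}
```

Running `count_masked` over a trace of mostly sub-second sleeps shows how rarely the expensive path would be taken, which is the point of the threshold.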
generateEncryptionKey
A fresh encryption key is generated for each sleep cycle. This prevents key reuse across multiple sleep intervals, which would weaken the encryption:
C — hooks/delay.c
VOID generateEncryptionKey(PUSTRING Key) {
    // Allocate a 16-byte key buffer
    Key->Length = 16;
    Key->MaxLength = 16;
    Key->Buffer = (PVOID)STUB.Api.HeapAlloc(
        STUB.Heap, 0, Key->MaxLength);

    // Fill with random bytes using the system RNG
    STUB.Api.RtlGenRandom(Key->Buffer, Key->Length);
}
The key is a 16-byte random buffer used as the RC4 key for SystemFunction032. Using RtlGenRandom (the underlying function behind CryptGenRandom) ensures cryptographically strong randomness.
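Why key reuse weakens a stream cipher can be shown without any real cipher at all. The sketch below is a generic illustration, not AceLdr code: `xor_bytes` is a hypothetical helper, and the keystream bytes in the usage note are arbitrary values chosen for the example. If two buffers are XORed with the same keystream, XORing the two ciphertexts cancels the keystream and leaves the XOR of the two plaintexts.

```c
#include <stddef.h>
#include <stdint.h>

/* out[i] = a[i] XOR b[i]. Hypothetical helper used for every step below. */
static void xor_bytes(uint8_t *out, const uint8_t *a,
                      const uint8_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(a[i] ^ b[i]);
    }
}

/* Usage idea:
 *   c1 = p1 XOR ks
 *   c2 = p2 XOR ks          (same keystream reused)
 *   c1 XOR c2 == p1 XOR p2  (keystream cancels out)
 */
```

An observer holding snapshots from two sleep cycles that shared a key could therefore learn how the underlying data differs between them without ever recovering the key. A fresh key per cycle removes that relationship.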
encryptHeap — Walking and Encrypting
This function walks every allocated block on AceLdr's private heap and encrypts (or decrypts) it using RC4:
C — hooks/delay.c
VOID encryptHeap(PUSTRING Key) {
    PROCESS_HEAP_ENTRY Entry = { 0 };

    // Lock the heap to prevent concurrent access during encryption
    STUB.Api.RtlLockHeap(STUB.Heap);

    // Walk every block on the private heap
    while (STUB.Api.RtlWalkHeap(STUB.Heap, &Entry) == 0) {
        // Only encrypt BUSY (allocated) blocks
        if (Entry.wFlags & PROCESS_HEAP_ENTRY_BUSY) {
            USTRING Data;
            Data.Length = (USHORT)Entry.cbData;
            Data.MaxLength = (USHORT)Entry.cbData;
            Data.Buffer = Entry.lpData;

            // SystemFunction032 = RC4 encrypt/decrypt
            // Since RC4 is symmetric, calling this twice with the
            // same key encrypts then decrypts (restoring plaintext)
            STUB.Api.SystemFunction032(&Data, Key);
        }
    }

    STUB.Api.RtlUnlockHeap(STUB.Heap);
}
Why RC4 via SystemFunction032?
SystemFunction032 — The Undocumented RC4
SystemFunction032 is an undocumented function exported by advapi32.dll and forwarded to cryptsp.dll, where the implementation lives. It implements the RC4 stream cipher. AceLdr uses it because:
- No custom crypto code needed: it uses a Windows API, keeping the PIC shellcode small
- RC4 is symmetric: encrypting with the same key twice returns the original data, simplifying the encrypt/decrypt flow
- Simple interface: takes just a data buffer and a key, both as USTRING structures
- No initialization overhead: unlike AES, RC4 needs no block padding or mode of operation
RC4 Symmetry Property
RC4 is a stream cipher that generates a keystream from the key and XORs it with the data. Since XOR is its own inverse (A XOR K XOR K = A), calling SystemFunction032 twice with the same key first encrypts, then decrypts. This is why encryptHeap is called both before and after sleeping — the same function handles both directions.
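The symmetry property can be checked with a textbook RC4 in portable C. This sketch is a standalone implementation for illustration only; it is not the code behind SystemFunction032, and `rc4_crypt` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4: key scheduling (KSA), then keystream generation (PRGA)
 * XORed into the buffer in place. key_len must be at least 1. */
static void rc4_crypt(uint8_t *data, size_t data_len,
                      const uint8_t *key, size_t key_len) {
    uint8_t S[256];
    for (int i = 0; i < 256; i++) {
        S[i] = (uint8_t)i;
    }

    /* KSA: permute S using the key bytes. */
    uint8_t j = 0;
    for (int i = 0; i < 256; i++) {
        j = (uint8_t)(j + S[i] + key[(size_t)i % key_len]);
        uint8_t t = S[i];
        S[i] = S[j];
        S[j] = t;
    }

    /* PRGA: produce one keystream byte per data byte and XOR it in. */
    uint8_t a = 0, b = 0;
    for (size_t n = 0; n < data_len; n++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        uint8_t t = S[a];
        S[a] = S[b];
        S[b] = t;
        data[n] ^= S[(uint8_t)(S[a] + S[b])];
    }
}
```

Calling `rc4_crypt` on a buffer and then calling it again with the same key returns the original bytes, because each call rebuilds the same keystream from scratch. That only holds when both calls start from a fresh key schedule, which is the case here and for the two separate encryptHeap passes described above.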
The 10-Step APC Chain
The heart of FOLIAGE is a chain of 10 APCs queued to the current thread. These APCs execute in FIFO (first-in, first-out) order when the thread enters an alertable wait state via NtWaitForSingleObject.
FOLIAGE APC Chain (APCs 0-9)
| # | APC Function | Action | Purpose |
|---|---|---|---|
| 0 | NtWaitForSingleObject | Initial wait (setup) | Setup step that starts the chain; the actual sleep is APC 5 |
| 1 | NtGetContextThread | Capture thread context (via trampoline) | Save RSP, RIP, etc. before masking; used for Patriot evasion |
| 2 | NtSetContextThread | Set RIP to legitimate wait location | Make thread appear to be sleeping inside ntdll, not in beacon code |
| 3 | NtProtectVirtualMemory | Change beacon memory to RW | Remove execute permission; beacon code becomes non-executable data |
| 4 | SystemFunction032 | Encrypt beacon code with RC4 | Beacon code is now encrypted and non-executable |
| 5 | NtWaitForSingleObject | Wait on timer (sleep duration) | The actual sleep happens here with beacon fully masked |
| 6 | SystemFunction032 | Decrypt beacon code with RC4 | Restore beacon code to plaintext |
| 7 | NtProtectVirtualMemory | Change beacon memory back to RX | Restore execute permission so beacon can run again |
| 8 | NtSetContextThread | Restore original thread context (via trampoline) | RIP points back to beacon code; used for Patriot evasion |
| 9 | NtSetEvent | Signal completion event | Signals that the chain is complete so the outer wait can finish |
Visualizing the Chain
APC Execution Flow
Wait (setup) → Get context → Spoof context → RX → RW → Encrypt code → SLEEP (masked) → Decrypt code → RW → RX → Restore context → Signal done
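The FIFO ordering is what makes the chain safe: the encrypt step is guaranteed to run before the masked sleep, and the decrypt step after it. The sketch below models only that ordering with a plain array-backed queue. It is a toy model with hypothetical names (`StepQueue`, `queue_push`, `queue_pop`, `build_chain`), uses no Windows API, and performs none of the listed operations.

```c
#include <stddef.h>

#define CHAIN_MAX 16

/* A minimal FIFO of step labels: push at tail, pop from head. */
typedef struct {
    const char *steps[CHAIN_MAX];
    size_t head;
    size_t tail;
} StepQueue;

/* Append a step; returns 0 if the queue is full. */
static int queue_push(StepQueue *q, const char *name) {
    if (q->tail >= CHAIN_MAX) {
        return 0;
    }
    q->steps[q->tail++] = name;
    return 1;
}

/* Remove and return the oldest step, or NULL when empty. */
static const char *queue_pop(StepQueue *q) {
    return (q->head < q->tail) ? q->steps[q->head++] : NULL;
}

/* Queue the ten steps in the order given in the table (0 through 9). */
static void build_chain(StepQueue *q) {
    static const char *const order[] = {
        "wait (setup)",   "get context",     "spoof context",
        "RX -> RW",       "encrypt code",    "sleep (masked)",
        "decrypt code",   "RW -> RX",        "restore context",
        "signal done"
    };
    q->head = 0;
    q->tail = 0;
    for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
        queue_push(q, order[i]);
    }
}
```

Popping from a queue built by `build_chain` yields the steps in exactly the order they were queued, so the sixth step dequeued is always the masked sleep, with the encrypt step immediately before it and the decrypt step immediately after.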
Why Use APCs Instead of Direct Calls?
The Self-Encryption Problem
Consider what happens if you try to encrypt yourself directly:
- You call SystemFunction032 to encrypt the beacon's memory region
- But the code calling SystemFunction032 is inside that memory region
- After encryption, the next instruction to execute is now encrypted garbage
- The program crashes
APCs solve this elegantly. All APC functions (NtProtectVirtualMemory, SystemFunction032, etc.) live in ntdll.dll or advapi32.dll, system DLLs that are not part of the beacon's memory. When the thread enters an alertable wait, the kernel delivers the queued APCs and the user-mode dispatcher inside ntdll (KiUserApcDispatcher) runs each one in order. The beacon's own code never needs to execute during the masking process.
Patriot Evasion via JMP [RBX] Trampoline
The Patriot tool detects sleeping beacons by inspecting the thread's saved context (particularly RIP) to see if it points to suspicious memory. FOLIAGE counters this with APCs 1, 2, and 8:
The Patriot Problem
When a thread sleeps, its context is saved. If a tool calls NtGetContextThread on a sleeping beacon thread, it sees that RIP points into unbacked memory — proof that the thread was executing beacon code. Even with the beacon encrypted, the saved context reveals its presence.
The Solution: Context Spoofing
APCs 1 and 8 in the chain handle this by using the JMP [RBX] trampoline (the same gadget from Module 7) as an intermediary for NtGetContextThread and NtSetContextThread:
Pseudocode — Patriot evasion
// APC 1: Capture context through trampoline
//   - NtGetContextThread is called through the JMP [RBX] gadget
//   - The captured context shows RIP at the gadget (in a legit DLL)
//   - This context is saved for later restoration

// APC 2: Set spoofed context
//   - NtSetContextThread sets RIP to point inside ntdll
//     (e.g., NtWaitForSingleObject + some offset)
//   - Now if Patriot inspects the thread during sleep,
//     RIP appears to be inside ntdll = legitimate

// APC 8: Restore original context through trampoline
//   - NtSetContextThread restores the saved context from APC 1
//   - RIP is set back to the beacon code, ready to resume
Thread Context During Sleep
RIP = beacon code (suspicious) → Save & spoof thread context → RIP = ntdll.dll (legitimate) → Restore real context → RIP = beacon code (running again)
CFG (Control Flow Guard) Handling
Windows Control Flow Guard (CFG) validates indirect call targets against a bitmap of approved addresses. Since APC dispatch involves indirect calls, AceLdr needs to register the address the APCs call through (its trampoline) as a valid CFG target:
C — cfg handling
// Before queuing APCs, register the trampoline as a valid
// CFG call target. Without this, CFG-enforced processes would
// terminate the process when APC dispatch tries to call through
// an unregistered address.
PVOID CfgAddr = C_PTR(TrampolineFunc);
SIZE_T CfgSize = sizeof(PVOID);   // not used in this excerpt

// NtSetInformationVirtualMemory with VmCfgCallTargetInformation
// adds our trampoline to the CFG bitmap.
// Note: CfgInfo is not declared in this excerpt.
STUB.Api.NtSetInformationVirtualMemory(
    NtCurrentProcess(),
    VmCfgCallTargetInformation,
    1,
    &CfgAddr,
    &CfgInfo,
    sizeof(CfgInfo)
);
Why CFG Matters
Many modern processes run with CFG enabled. If AceLdr queues an APC whose target address isn't in the CFG bitmap, the APC dispatch will trigger a CFG violation — crashing the process. By calling NtSetInformationVirtualMemory with VmCfgCallTargetInformation, AceLdr adds its trampoline function as a valid call target, allowing the APC chain to execute without triggering CFG enforcement.
The Complete Sleep Cycle
Full FOLIAGE Sleep Sequence
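Hook intercepts Sleep(60000) → Generate RC4 key → Encrypt private heap → Queue APCs → Alertable wait → Encrypt code + spoof context → Sleep (fully masked) → Decrypt code + restore context → Decrypt private heap → Beacon resumes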
During the masked sleep window, a memory scanner sees:
- No executable beacon code: memory permissions are RW, not RX or RWX
- No beacon signatures: code and heap data are RC4-encrypted
- No suspicious thread context: RIP points into ntdll.dll
- No beacon strings on the heap: the private heap is separately encrypted
Knowledge Check
Module 8 Quiz
1. When does AceLdr encrypt the private heap relative to the APC chain?
Answer: Before the APC chain is queued, directly in Sleep_Hook via encryptHeap(&Key). The APC chain handles encrypting the beacon's code region (which is a separate concern). After wakeup, encryptHeap is called again with the same key to decrypt (RC4 symmetry).
2. Why can AceLdr use the same encryptHeap function for both encryption and decryption?
Answer: RC4 is a symmetric stream cipher: XORing data with the same keystream twice restores the original. encryptHeap is therefore called with the same key before sleep (encrypt) and after wakeup (decrypt).
3. What does the thread context look like to a scanner during the masked sleep?
Answer: APC 2 uses NtSetContextThread to set RIP to a location inside ntdll.dll (such as an offset within NtWaitForSingleObject). This makes the thread appear to be legitimately waiting inside a system call. Tools like Patriot that inspect sleeping thread contexts will see ntdll as the instruction pointer, not suspicious beacon memory.