Difficulty: Beginner

Module 1: EDR Call Stack Telemetry

Why your call stack is the one thing you can't easily lie about — until now.

Why This Module?

Before we tear apart SilentMoonwalk's spoofing engine, you need to understand what EDRs actually see when they inspect a thread's call stack. Every major endpoint detection product now treats the call stack as a high-confidence telemetry source. This module explains exactly how that telemetry is gathered and why it matters for offensive operations.

The Call Stack as a Forensic Artifact

Every thread in a Windows process maintains a call stack: a region of memory that records the chain of function calls leading to the current point of execution. When function A calls function B, which calls function C, the stack holds return addresses that trace back through C → B → A.

For defenders, this is gold. A legitimate thread sleeping inside NtWaitForSingleObject should show a clean stack trace rooted in known system DLLs. If that same sleeping thread has a return address pointing into unbacked private memory (no file on disk), it screams injected code.

Clean vs Suspicious Call Stack

Legitimate Stack

ntdll!NtWaitForSingleObject

KERNELBASE!WaitForSingleObjectEx

myapp.exe!WorkerThread+0x42

kernel32!BaseThreadInitThunk

ntdll!RtlUserThreadStart

Beacon Stack (Detected)

ntdll!NtWaitForSingleObject

KERNELBASE!WaitForSingleObjectEx

0x000001A2F3E01337 (unbacked RWX!)

0x000001A2F3E00A10 (unbacked RWX!)

kernel32!BaseThreadInitThunk

ntdll!RtlUserThreadStart

How EDRs Collect Stack Telemetry

EDRs use multiple mechanisms to capture call stacks. Understanding each one reveals what must be spoofed and when.

1. Kernel Callbacks (PsSetCreateThreadNotifyRoutine)

EDR kernel drivers register callbacks via PsSetCreateThreadNotifyRoutine and PsSetCreateProcessNotifyRoutine. When a thread is created, the driver is notified and can immediately inspect the thread's start address and initial stack. This catches basic injection but does not continuously monitor the stack.

2. ETW Stack Walking (Event Tracing for Windows)

ETW is the most pervasive telemetry source. The kernel can be configured to capture stack traces alongside events. When a syscall event fires, the kernel walks the stack and attaches the trace to the ETW event record. EDRs consume these events from both kernel-mode and user-mode providers:

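C
// EDR component enabling stack capture for ETW events (illustrative fragment)
// EVENT_TRACE_FLAG_SYSTEMCALL turns on system-call events in the
// classic kernel logger session
EVENT_TRACE_PROPERTIES props = {0};
props.EnableFlags = EVENT_TRACE_FLAG_SYSTEMCALL;
// For manifest-based providers, EVENT_ENABLE_PROPERTY_STACK_TRACE is set in
// ENABLE_TRACE_PARAMETERS and passed to EnableTraceEx2; the kernel then
// walks the stack at event capture time and embeds it in the ETW record
ENABLE_TRACE_PARAMETERS params = {0};
params.Version = ENABLE_TRACE_PARAMETERS_VERSION_2;
params.EnableProperty = EVENT_ENABLE_PROPERTY_STACK_TRACE;
// The stack trace is captured by the kernel at the exact moment
// the event fires, which is hard to fake from user mode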

Key Insight: Kernel-Captured Stacks

When the kernel captures a stack trace via ETW, it reads the actual RSP-based stack frames at that instant. This means any spoofing must be in place before the syscall executes, not after. The stack must look clean at the exact moment the kernel event fires. This is why simple post-call cleanup approaches fail.

3. RtlWalkFrameChain / RtlCaptureStackBackTrace

These are the primary user-mode APIs for stack walking. EDRs call them from hooks, callbacks, or dedicated scanning threads:

C
// RtlCaptureStackBackTrace - captures return addresses from the stack
USHORT RtlCaptureStackBackTrace(
    ULONG  FramesToSkip,    // Skip N frames (usually 0-2 for the hook itself)
    ULONG  FramesToCapture,  // How many frames to collect
    PVOID  *BackTrace,       // Output array of return addresses
    PULONG BackTraceHash     // Optional hash of all addresses
);

// Example: EDR hook on NtAllocateVirtualMemory
NTSTATUS Hook_NtAllocateVirtualMemory(...) {
    PVOID stack[64];
    USHORT frames = RtlCaptureStackBackTrace(0, 64, stack, NULL);

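    // Check each return address; one unbacked frame is enough to flag the call
    for (USHORT i = 0; i < frames; i++) {
        if (!IsAddressInKnownModule(stack[i])) {
            // ALERT: return address points to unbacked memory!
            LogSuspiciousStack(stack, frames);
            break;  // log the stack once, not once per bad frame
        }
    }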
    // Forward to real syscall
    return Original_NtAllocateVirtualMemory(...);
}
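
The hook above leans on an IsAddressInKnownModule helper that the snippet never defines. A minimal sketch of what such a check might look like follows, written as portable C over a caller-supplied table rather than the real loader data. The ModuleRange type and the find_module and count_unbacked names are hypothetical, and how the table gets populated (for example from the loader's module list) is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one file-backed image mapped into the process. */
typedef struct {
    const char *name;  /* e.g. "ntdll.dll" */
    uint64_t    base;  /* first address of the mapped image */
    uint64_t    size;  /* size of the mapping in bytes */
} ModuleRange;

/* Returns the module that contains addr, or NULL if no known image backs it. */
const ModuleRange *find_module(const ModuleRange *mods, size_t count,
                               uint64_t addr)
{
    for (size_t i = 0; i < count; i++) {
        /* Compare the offset instead of base + size so the check cannot overflow. */
        if (addr >= mods[i].base && addr - mods[i].base < mods[i].size)
            return &mods[i];
    }
    return NULL;
}

/* Counts captured return addresses that fall outside every known module. */
size_t count_unbacked(const ModuleRange *mods, size_t count,
                      const uint64_t *stack, size_t frames)
{
    size_t unbacked = 0;
    for (size_t i = 0; i < frames; i++) {
        if (find_module(mods, count, stack[i]) == NULL)
            unbacked++;
    }
    return unbacked;
}
```

Fed the two unbacked addresses from the beacon stack shown earlier plus any addresses inside the table's ranges, count_unbacked reports 2. A linear scan is enough for a sketch; a real scanner would more likely keep the ranges sorted and use a binary search.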

4. Inline Hooking with Stack Inspection

Many EDRs place inline hooks (detours) on sensitive functions like NtAllocateVirtualMemory, NtProtectVirtualMemory, and NtWriteVirtualMemory. Inside the hook handler, the EDR captures the current call stack and analyzes it. If any return address points to suspicious memory, the call is flagged or blocked.

Telemetry Source	Capture Mode	When It Fires	Spoofing Difficulty
ETW Stack Traces	Kernel	Syscall entry	High — kernel reads real RSP
Kernel Callbacks	Kernel	Thread/process creation	Medium — only at creation time
RtlWalkFrameChain	User-mode	On-demand / periodic	Medium — walks actual frames
Inline Hook Stacks	User-mode	On hooked API call	Medium — stack must be clean at call
Thread Scanning	User-mode	Periodic sweeps	Must be clean during sleep

What Makes a Stack "Suspicious"

EDRs apply several heuristics when analyzing captured stack traces:

Detection Heuristics

Unbacked return addresses: Any return address that doesn't map to a loaded module (DLL/EXE) backed by a file on disk is immediately suspicious.
RWX memory regions: Return addresses pointing into memory with PAGE_EXECUTE_READWRITE protection suggest dynamically generated code.
Missing unwind data: On x64, every non-leaf function in a legitimate module has a RUNTIME_FUNCTION entry in the .pdata section. A return address always belongs to a function that made a call, so an address with no covering entry is likely shellcode.
Impossible call chains: A return address lies inside function F, but F contains no call to the function whose frame sits directly above it. Either the instruction before the return address is not a call at all, or it calls something else.
Abnormal stack depth: A sleeping thread with an unusually shallow or deep stack compared to typical wait patterns.
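
The heuristics above can be combined into a simple flag set, sketched below in portable C. This is an illustration under stated assumptions, not how any particular EDR implements its checks: the FrameFacts struct, the STACK_* flag names, and both functions are hypothetical, and the per-frame facts are assumed to have been resolved by earlier scanning steps. The call-site check recognizes only three common x64 call encodings; a real scanner would use a full disassembler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One flag per heuristic so a caller can report every reason a stack tripped. */
enum {
    STACK_UNBACKED       = 1u << 0, /* return address outside any loaded image */
    STACK_RWX            = 1u << 1, /* return address in RWX memory */
    STACK_NO_UNWIND_DATA = 1u << 2, /* no RUNTIME_FUNCTION covers the address */
    STACK_NOT_AFTER_CALL = 1u << 3, /* bytes before the address are not a call */
    STACK_ODD_DEPTH      = 1u << 4  /* frame count outside the expected range */
};

/* Hypothetical per-frame facts resolved by the scanner beforehand. */
typedef struct {
    bool backed_by_image;
    bool page_is_rwx;
    bool has_unwind_entry;
    bool preceded_by_call;
} FrameFacts;

/* Simplified check: does a call instruction end exactly at ret_offset?
 * Byte patterns can match by coincidence, so treat a hit as a hint only. */
bool is_preceded_by_call(const uint8_t *code, size_t ret_offset)
{
    /* E8 rel32: direct near call, 5 bytes */
    if (ret_offset >= 5 && code[ret_offset - 5] == 0xE8)
        return true;
    /* FF 15 disp32: indirect call through a RIP-relative pointer, 6 bytes */
    if (ret_offset >= 6 && code[ret_offset - 6] == 0xFF &&
        code[ret_offset - 5] == 0x15)
        return true;
    /* FF /2 with a register operand (FF D0..D7, e.g. call rax), 2 bytes */
    if (ret_offset >= 2 && code[ret_offset - 2] == 0xFF &&
        (code[ret_offset - 1] & 0xF8) == 0xD0)
        return true;
    return false;
}

/* Returns the union of all heuristics that any frame in the stack violates. */
unsigned evaluate_stack(const FrameFacts *frames, size_t count,
                        size_t min_depth, size_t max_depth)
{
    unsigned flags = 0;
    for (size_t i = 0; i < count; i++) {
        if (!frames[i].backed_by_image)  flags |= STACK_UNBACKED;
        if (frames[i].page_is_rwx)       flags |= STACK_RWX;
        if (!frames[i].has_unwind_entry) flags |= STACK_NO_UNWIND_DATA;
        if (!frames[i].preceded_by_call) flags |= STACK_NOT_AFTER_CALL;
    }
    if (count < min_depth || count > max_depth)
        flags |= STACK_ODD_DEPTH;
    return flags;
}
```

Returning a bitmask rather than a single verdict lets the caller weigh the heuristics differently, since some (such as depth) are far noisier than others (such as an unbacked return address).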

The Evolution of Stack-Based Evasion

Stack spoofing techniques evolved through several generations, each addressing new detection capabilities:

Generation	Technique	Tool	Limitation / Key Property
Gen 1	Zeroing return address	ThreadStackSpoofer	Stack unwind fails — NULL frames are suspicious
Gen 2	Overwriting with legitimate addr	CallStackSpoofingPOC	Static frame, unwind codes don't match, single-frame only
Gen 3	ROP-based frame fabrication	SilentMoonwalk	Builds entire synthetic chain that passes RtlVirtualUnwind
Gen 4	Synthetic unwind metadata & unwinder emulation	Draugr (NtDallas) / Unwinder (Kudaes)	Draugr: JMP [RBX] chaining with fake RUNTIME_FUNCTION/UNWIND_INFO for BOFs. Unwinder: computes stack mathematically from unwind metadata

Where SilentMoonwalk Fits

SilentMoonwalk was a breakthrough because it was the first public tool to construct entire synthetic call chains that survive structured exception handling (SEH) unwinding via RtlVirtualUnwind. Rather than just replacing a single return address, it fabricates multiple stack frames, each with proper alignment and return addresses that correspond to real functions with valid RUNTIME_FUNCTION metadata in legitimate DLLs.

Why Simple Spoofs Fail

To understand why SilentMoonwalk's approach is necessary, consider what happens when you simply overwrite a return address on the stack:

C++
// Naive approach: just overwrite the return address before sleeping
// Needs <windows.h> and <intrin.h> (MSVC intrinsics); kernel32_SleepEx_addr
// is a placeholder for an address resolved elsewhere
void NaiveSpoof() {
    // Save real return address
    PVOID realRetAddr = _ReturnAddress();

    // Overwrite with an address inside kernel32
    // The return address slot is on the stack, so we can modify it
    *(PVOID*)(_AddressOfReturnAddress()) = (PVOID)kernel32_SleepEx_addr;

    // Now sleep - a stack walker sees kernel32!SleepEx as NaiveSpoof's caller
    SleepEx(5000, FALSE);

    // Restore the real return address; without this the function would
    // return into SleepEx instead of its actual caller
    *(PVOID*)(_AddressOfReturnAddress()) = realRetAddr;

    // PROBLEM: while the thread sleeps, RtlVirtualUnwind will try to unwind
    // from kernel32!SleepEx, and the unwind codes won't match the actual
    // stack frame layout. The stack pointer deltas will be wrong, and the
    // unwind will either fail or produce garbage frames above this point.
}

The problem is that RtlVirtualUnwind uses UNWIND_INFO metadata to determine how much stack space a function allocated. If you claim the return address is inside SleepEx, the unwinder will apply SleepEx's unwind codes to the stack, computing wrong RSP values and following garbage pointers for every subsequent frame. The entire unwind chain collapses.
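
The arithmetic behind that collapse can be shown with a toy model. The sketch below is not RtlVirtualUnwind and does not parse real UNWIND_INFO: it assumes each function has one fixed frame size, measured in 8-byte slots, and that the return address sits directly above those slots. FuncInfo, WalkStatus, walk_stack, and all addresses and frame sizes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for unwind metadata: an address range plus a fixed frame size. */
typedef struct {
    uint64_t begin;        /* first address of the function */
    uint64_t end;          /* one past the last address */
    size_t   frame_slots;  /* 8-byte slots between RSP and the return address */
} FuncInfo;

typedef enum { WALK_COMPLETE, WALK_BROKEN } WalkStatus;

static const FuncInfo *lookup(const FuncInfo *funcs, size_t nfuncs, uint64_t ip)
{
    for (size_t i = 0; i < nfuncs; i++) {
        if (ip >= funcs[i].begin && ip < funcs[i].end)
            return &funcs[i];
    }
    return NULL;
}

/* Walks the simulated stack from (ip, sp) until a zero return address.
 * Every step trusts the metadata of the function that ip claims to be in. */
WalkStatus walk_stack(const FuncInfo *funcs, size_t nfuncs,
                      const uint64_t *stack, size_t nslots,
                      uint64_t ip, size_t sp,
                      uint64_t *out, size_t max_out, size_t *out_count)
{
    *out_count = 0;
    while (ip != 0) {
        const FuncInfo *f = lookup(funcs, nfuncs, ip);
        if (f == NULL)
            return WALK_BROKEN;            /* no metadata covers this address */
        size_t ret_slot = sp + f->frame_slots;
        if (ret_slot >= nslots)
            return WALK_BROKEN;            /* computed RSP ran off the stack */
        ip = stack[ret_slot];              /* return address per the metadata */
        sp = ret_slot + 1;                 /* caller's RSP after the return */
        if (ip != 0 && *out_count < max_out)
            out[(*out_count)++] = ip;
    }
    return WALK_COMPLETE;
}
```

With an honest stack the walk recovers every return address and stops at the zero terminator. If one return address is swapped for an address in a function whose frame size differs from the real caller's, the next read lands on a local variable instead of a return address, the lookup fails, and the walk ends as WALK_BROKEN. That mismatch between claimed metadata and actual layout is the failure the paragraph above describes.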

The SilentMoonwalk Promise

SilentMoonwalk solves this by building stack frames where the return address, RSP offsets, and frame layout all agree with the unwind metadata of the spoofed function. When RtlVirtualUnwind processes each frame, the math checks out: the frame size matches the unwind codes, the return address is at the expected offset, and the unwinder smoothly transitions to the next frame in the chain. The result is a complete, walkable call stack that looks indistinguishable from a legitimate one.
