Difficulty: Intermediate

Module 4: ROP Gadgets & NtContinue

Redirecting execution by overwriting the thread context — one CONTEXT at a time.

Module Objective

Understand what Return-Oriented Programming (ROP) is conceptually, how NtContinue provides a clean mechanism for setting the thread context to execute arbitrary functions, and why Ekko uses NtContinue as the callback for every timer instead of calling API functions directly.

1. What Is Return-Oriented Programming?

Return-Oriented Programming (ROP) is a code-reuse exploitation technique that chains together small sequences of existing code (called "gadgets") that each end with a RET instruction. By controlling the stack, an attacker can make the CPU "return" from one gadget into the next, executing a sequence of operations without injecting any new code.

In classical ROP exploitation, the attacker corrupts the stack so that each RET pops the address of the next gadget from the stack, forming a chain of tiny code snippets that together perform arbitrary operations (like calling VirtualProtect to mark memory as executable, then jumping to shellcode).

Classical ROP Chain on the Stack

Gadget 1 (pop rcx; ret) -> Gadget 2 (pop rdx; ret) -> Gadget 3 (pop r8; ret) -> Target Function (VirtualProtect)
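To make the chaining idea concrete, here is a minimal sketch that simulates a chain on an ordinary array instead of a real stack. Everything in it is an assumption made for illustration: the gadget identifiers, the ToyRegs struct, and run_toy_chain are hypothetical names, and no real addresses or instructions are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gadget identifiers standing in for real code addresses. */
enum { G_POP_RCX = 1, G_POP_RDX, G_POP_R8, G_TARGET };

/* Simulated registers that the toy gadgets can load. */
typedef struct { uint64_t rcx, rdx, r8; } ToyRegs;

/* Walks a simulated stack the way a chain of RET instructions would.
   Layout: [gadget][value][gadget][value]...[target].
   Returns 1 if the chain reached the target, 0 otherwise. */
int run_toy_chain(const uint64_t *stack, size_t len, ToyRegs *regs)
{
    assert(stack != NULL && regs != NULL);
    size_t sp = 0;
    while (sp < len) {
        uint64_t gadget = stack[sp++];   /* "ret": pop the next gadget */
        if (gadget == G_TARGET)
            return 1;                    /* chain ends at the target */
        if (sp >= len)
            return 0;                    /* operand missing */
        uint64_t value = stack[sp++];    /* "pop reg": consume one slot */
        switch (gadget) {
        case G_POP_RCX: regs->rcx = value; break;
        case G_POP_RDX: regs->rdx = value; break;
        case G_POP_R8:  regs->r8  = value; break;
        default:        return 0;        /* unknown gadget */
        }
    }
    return 0;
}
```

Calling run_toy_chain on the layout { G_POP_RCX, 0x1000, G_POP_RDX, 0x2000, G_POP_R8, 0x04, G_TARGET } loads the three registers in order and then reports that the target was reached, which mirrors the diagram above.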

Ekko does not use classical stack-based ROP chains. Instead, it uses a more elegant mechanism: NtContinue. Rather than manipulating the stack to chain gadgets via RET instructions, Ekko sets up complete CONTEXT structures for each operation and uses NtContinue to atomically load them into the CPU. This is sometimes called "context-oriented programming" or a "ROP-like" approach because the concept is similar (redirect execution to existing code), but the mechanism is different (CONTEXT replacement instead of stack manipulation).

2. NtContinue — The Context Restorer

NtContinue is an undocumented ntdll function that restores a thread's execution context from a CONTEXT structure. It is normally used internally by Windows for exception handling — after an exception handler runs, NtContinue restores the thread to continue execution from where the exception occurred (or a modified location).

// NtContinue - restores thread context
// Exported by ntdll.dll
//
// NTSTATUS NtContinue(
//     PCONTEXT ThreadContext,   // Context to restore
//     BOOLEAN  RaiseAlert       // Whether to test for alert delivery
// );

// Ekko resolves it at runtime:
PVOID NtContinue = GetProcAddress(
    GetModuleHandleA("Ntdll"),
    "NtContinue"
);

What NtContinue Actually Does

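When called, NtContinue takes the CONTEXT structure pointed to by its first argument and replaces the thread's state with the values in that structure. This includes the instruction pointer (RIP), the stack pointer (RSP), the processor flags, and the general-purpose registers, among them the ones that carry function arguments (RCX, RDX, R8, R9).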

NtContinue does not return to its caller. Execution continues at whatever address RIP was set to in the CONTEXT structure. This is what makes it so powerful for Ekko — you can redirect execution to any function with any arguments without constructing a traditional call frame.
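A loose portable analogy for "restore a saved state and never return to the caller" is the standard C pair setjmp/longjmp. The sketch below is only an analogy and says nothing about how NtContinue is implemented: longjmp always resumes at the matching setjmp, whereas NtContinue resumes wherever the supplied CONTEXT says. The names saved_state, restore_state, and demo_context_restore are hypothetical.

```c
#include <setjmp.h>

/* Plays the role of a captured context in this analogy. */
static jmp_buf saved_state;

/* Restores the saved state. Control leaves this function at longjmp
   and never comes back to the statement after it. */
static void restore_state(void)
{
    longjmp(saved_state, 1);
}

/* Returns 1 when control arrived through the restored state,
   0 if restore_state had returned normally (which it never does). */
int demo_context_restore(void)
{
    if (setjmp(saved_state) == 0) {
        restore_state();
        return 0;   /* not reached */
    }
    return 1;       /* reached via longjmp */
}
```

The point of the analogy is the control flow: the function that performs the restore does not return, and execution carries on from the restored state instead.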

3. NtContinue as a Timer Callback

Recall from Module 2 that CreateTimerQueueTimer expects a callback with the signature:

VOID CALLBACK WaitOrTimerCallback(
    PVOID lpParameter,
    BOOLEAN TimerOrWaitFired
);

Ekko exploits a critical alignment between this callback signature and NtContinue's signature:

Parameter Position | Register (x64) | Callback Signature       | NtContinue Signature
1st argument       | RCX            | PVOID lpParameter        | PCONTEXT ThreadContext
2nd argument       | RDX            | BOOLEAN TimerOrWaitFired | BOOLEAN RaiseAlert

The Signature Match

Both the timer callback and NtContinue take a pointer as their first argument (in RCX). When Ekko registers NtContinue as the timer callback and passes a CONTEXT* as the parameter, the timer infrastructure calls NtContinue(&CtxStruct, TimerFired). NtContinue reads the CONTEXT from RCX, replaces the thread state with those register values, and execution jumps to whatever RIP was set to in that context. The second argument (RaiseAlert/TimerOrWaitFired) is effectively ignored for Ekko's purposes — it just needs to be a valid boolean value, which it is.

// Ekko's pattern: NtContinue as callback, CONTEXT* as parameter
CreateTimerQueueTimer(
    &hNewTimer,
    hTimerQueue,
    NtContinue,      // Callback = NtContinue
    &RopProtRW,      // Parameter = CONTEXT* (loaded into RCX)
    100, 0,
    WT_EXECUTEINTIMERTHREAD );

// When this timer fires:
// 1. Timer system calls: NtContinue(&RopProtRW, TRUE)
// 2. NtContinue reads RopProtRW context
// 3. Sets RIP = VirtualProtect, RCX = ImageBase, etc.
// 4. Execution jumps to VirtualProtect with correct arguments

4. The CONTEXT Structure

The CONTEXT structure is a large Windows structure that holds the complete CPU register state for a thread. On x64, it is defined in winnt.h and contains over 50 fields:

// Simplified view of CONTEXT (x64) - key fields for Ekko
// Field order here is illustrative; see winnt.h for the real layout
typedef struct _CONTEXT {
    // Control registers
    DWORD64 Rip;    // Instruction pointer
    DWORD64 Rsp;    // Stack pointer
    DWORD   EFlags; // Processor flags

    // Segment registers
    WORD SegCs, SegDs, SegEs, SegFs, SegGs, SegSs;

    // General-purpose registers
    DWORD64 Rax, Rcx, Rdx, Rbx;
    DWORD64 Rbp, Rsi, Rdi;
    DWORD64 R8, R9, R10, R11, R12, R13, R14, R15;

    // Floating-point / SSE state
    M128A Xmm0, Xmm1, /* ... */ Xmm15;

    // ... additional fields (debug registers, vector registers, etc.)
    DWORD ContextFlags;  // Which parts of the context are valid

} CONTEXT;

The full CONTEXT structure on x64 is 1232 bytes. Ekko allocates six of these on the stack plus the initial capture context, for seven structures in total. That comes to 8,624 bytes (about 8.4 KiB) of stack space for the context structures alone.
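The stack cost can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated above (1232 bytes per structure, six chained contexts, one captured context); the constant and function names are hypothetical.

```c
#include <stddef.h>

enum {
    CONTEXT_SIZE_X64 = 1232, /* bytes per CONTEXT, as stated above */
    ROP_CONTEXTS     = 6,    /* one per chained operation */
    CAPTURE_CONTEXTS = 1     /* the initial captured context */
};

/* Total stack bytes taken by the context structures alone. */
size_t context_stack_bytes(void)
{
    return (size_t)CONTEXT_SIZE_X64 * (ROP_CONTEXTS + CAPTURE_CONTEXTS);
}
```

With these inputs the function returns 1232 * 7 = 8624 bytes.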

5. Building a CONTEXT for an API Call

Each CONTEXT in Ekko's chain is constructed by copying the captured baseline context and modifying the registers needed for a specific function call. Here is how Ekko constructs the VirtualProtect(RW) context:

// Step 1: Copy the baseline context (captured from timer thread)
memcpy( &RopProtRW, &CtxThread, sizeof(CONTEXT) );

// Step 2: Set RIP to the target function
RopProtRW.Rip = (DWORD64)VirtualProtect;

// Step 3: Set arguments per x64 calling convention
RopProtRW.Rcx = (DWORD64)ImageBase;         // arg1: lpAddress
RopProtRW.Rdx = (DWORD64)ImageSize;         // arg2: dwSize
RopProtRW.R8  = PAGE_READWRITE;             // arg3: flNewProtect
RopProtRW.R9  = (DWORD64)&OldProtect;       // arg4: lpflOldProtect

// Step 4: Adjust RSP for stack alignment
RopProtRW.Rsp -= 8;

The x64 Calling Convention (Microsoft)

On x64 Windows, the first four integer/pointer arguments are passed in registers RCX, RDX, R8, R9 (in that order). Additional arguments go on the stack. The caller must also provide 32 bytes of "shadow space" on the stack above the return address for the callee to use. The RSP adjustment (Rsp -= 8) in Ekko is related to stack alignment, which is covered in detail in Module 6.
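The copy-then-modify pattern from the four steps above can be modeled without any Windows headers. The sketch below assumes a hypothetical ToyCallContext with only the six fields the steps touch; it is not the real CONTEXT layout, and toy_build_call is an illustrative helper rather than anything Ekko defines.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the handful of CONTEXT fields used above. */
typedef struct {
    uint64_t Rip, Rsp, Rcx, Rdx, R8, R9;
} ToyCallContext;

/* Builds a context for target(a1, a2, a3, a4) from a baseline,
   leaving the baseline itself untouched. */
ToyCallContext toy_build_call(const ToyCallContext *baseline, uint64_t target,
                              uint64_t a1, uint64_t a2,
                              uint64_t a3, uint64_t a4)
{
    ToyCallContext ctx;
    memcpy(&ctx, baseline, sizeof ctx); /* step 1: copy the baseline */
    ctx.Rip = target;                   /* step 2: where execution resumes */
    ctx.Rcx = a1;                       /* step 3: arguments 1 to 4 in */
    ctx.Rdx = a2;                       /*         RCX, RDX, R8, R9    */
    ctx.R8  = a3;
    ctx.R9  = a4;
    ctx.Rsp -= 8;                       /* step 4: same adjustment as above */
    return ctx;
}
```

Because each context starts as a copy, every operation in the chain inherits the same baseline register state and differs only in RIP, the four argument registers, and the adjusted RSP.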

6. Why Not Call Functions Directly?

A natural question is: why go through all this complexity with CONTEXT structures and NtContinue instead of just calling VirtualProtect and SystemFunction032 directly?

The Self-Encryption Paradox (Revisited)

Direct function calls cannot work because:

  1. VirtualProtect(PAGE_READWRITE) would remove execute permission from the page containing the calling code, causing an immediate access violation
  2. SystemFunction032 would encrypt the code that is currently executing, turning the next instruction into encrypted garbage
  3. The calling code is inside the image region being protected and encrypted

NtContinue solves this because it lives in ntdll.dll, which is outside the encrypted region. When the timer fires and calls NtContinue, execution is inside ntdll's code, not the implant's code. NtContinue then sets RIP to VirtualProtect (in kernel32.dll) or SystemFunction032 (in advapi32.dll). Both targets live in system DLLs that Ekko never encrypts.

7. NtContinue vs. SetThreadContext

Windows also provides SetThreadContext / NtSetContextThread for modifying a thread's registers. Why does Ekko use NtContinue instead?

Aspect | NtContinue | NtSetContextThread
Self-modification | Can modify the calling thread's own context | Requires the thread to be suspended (cannot modify self while running)
Return behavior | Does not return; execution continues at new RIP | Returns to caller after setting context
Thread requirement | Operates on current thread | Requires a thread handle
Use in callbacks | Natural fit: callback calls NtContinue, context is replaced | Would need a second thread to suspend and modify the first

NtContinue is the natural choice for timer callbacks because it allows the timer thread to modify its own context. The timer infrastructure calls NtContinue, which replaces the timer thread's registers, and execution seamlessly continues at the target function. No thread suspension or cross-thread manipulation is needed.

8. Security Implications of NtContinue

From a defensive perspective, NtContinue is a powerful primitive because it allows arbitrary code execution without creating new threads or injecting shellcode. The execution happens within the context of an existing, legitimate thread (the timer thread), using code that already exists in loaded system DLLs.

Detection Challenges

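NtContinue-based execution is difficult to detect because nothing about it stands out at the thread or memory level: no new thread is created, no shellcode is injected, and the code that runs already exists in loaded system DLLs.

The primary detection vector is monitoring for CreateTimerQueueTimer calls where the callback is set to NtContinue with a CONTEXT* parameter, a highly unusual pattern that legitimate software has little reason to use.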
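The check itself is simple once a monitor has the data. The sketch below assumes a hypothetical ObservedTimer record holding the callback and parameter of each observed timer registration; how a real product collects those records, and how it resolves the address of NtContinue, is outside the scope of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record a monitor might keep per timer registration. */
typedef struct {
    uintptr_t callback;  /* callback address given to the timer API */
    uintptr_t parameter; /* parameter pointer given alongside it */
} ObservedTimer;

/* Counts registrations whose callback is NtContinue and whose
   parameter is a non-null pointer (a candidate CONTEXT*). */
size_t count_suspicious_timers(const ObservedTimer *timers, size_t n,
                               uintptr_t ntcontinue_addr)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (timers[i].callback == ntcontinue_addr &&
            timers[i].parameter != 0) {
            hits++;
        }
    }
    return hits;
}
```

Several hits in quick succession within one process would match the timer chain described in this module, since each operation registers its own timer with the same callback.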

9. Putting It Together: The NtContinue Execution Model

NtContinue Execution Flow (Single Timer)

Timer Fires (DueTime reached) -> Timer Dispatch (calls callback(param)) -> NtContinue(ctx) (loads CONTEXT) -> Target API (RIP = VirtualProtect)

Each timer follows the same pattern: the timer infrastructure calls NtContinue with a CONTEXT pointer, NtContinue replaces all registers, and execution jumps to the target API with the correct arguments already in registers. The target API executes normally and returns — but since the RSP and return address were set up by the context, the return flows back into the timer infrastructure, which then waits for the next timer to fire.
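The single-timer flow can be modeled in portable C with function pointers. This is a toy model under stated assumptions: FlowContext, toy_continue, toy_timer_fire, and toy_protect are hypothetical stand-ins, and an ordinary C call takes the place of the register load and jump that the real mechanism performs.

```c
#include <stdint.h>

/* Toy "API" signature: four integer arguments, as in RCX, RDX, R8, R9. */
typedef uint64_t (*ToyApi)(uint64_t, uint64_t, uint64_t, uint64_t);

/* Hypothetical stand-in for the CONTEXT fields that matter here. */
typedef struct {
    ToyApi   Rip;             /* where execution should continue */
    uint64_t Rcx, Rdx, R8, R9; /* the first four arguments */
} FlowContext;

/* Same shape as a timer callback: (parameter, fired flag). */
typedef uint64_t (*ToyCallback)(void *param, unsigned char fired);

/* Stand-in for NtContinue: reads the context and transfers control.
   A plain call models the jump; the real function does not return. */
uint64_t toy_continue(void *param, unsigned char fired)
{
    const FlowContext *ctx = param;
    (void)fired; /* the second argument plays no role in the model */
    return ctx->Rip(ctx->Rcx, ctx->Rdx, ctx->R8, ctx->R9);
}

/* Stand-in for the timer dispatcher: calls callback(param, TRUE). */
uint64_t toy_timer_fire(ToyCallback callback, void *param)
{
    return callback(param, 1);
}

/* Stand-in target API that reports what it received. */
uint64_t toy_protect(uint64_t addr, uint64_t size,
                     uint64_t prot, uint64_t old)
{
    (void)old;
    return addr + size + prot;
}
```

Firing the toy timer with toy_continue as the callback and a FlowContext whose Rip is toy_protect ends up running toy_protect with the four values stored in the context, which is the dispatcher, callback, target sequence from the flow above.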

Knowledge Check

Q1: What does NtContinue do when called?

A) Creates a new thread with the specified context
B) Replaces the current thread's entire register state with the provided CONTEXT and resumes execution at the new RIP
C) Suspends the current thread and saves its context
D) Copies the current context into the provided CONTEXT structure

Q2: Why does Ekko use NtContinue as the timer callback instead of calling target APIs directly?

A) Direct API calls are slower than NtContinue
B) Windows does not allow VirtualProtect to be called from timer callbacks
C) Direct calls would execute from within the region being encrypted/protected, causing crashes
D) NtContinue provides built-in encryption functionality

Q3: On x64 Windows, which registers hold the first four function arguments?

A) RCX, RDX, R8, R9
B) RAX, RBX, RCX, RDX
C) RDI, RSI, RDX, RCX
D) R8, R9, R10, R11