Difficulty: Beginner

Module 3: Target Function Selection

Choosing the right function to hook determines whether your injection fires in milliseconds or never.

Why Function Choice Matters

ThreadlessInject works by hooking a function inside the target process so that when an existing thread calls that function, your shellcode executes. The choice of which function to hook is critical. Hook a function that is called frequently and you get near-instant execution. Hook a function that is never called and your shellcode sits dormant forever. Hook a function that is currently being executed by multiple threads and you risk crashing the process.

Requirements for a Good Target Function

A suitable target function for ThreadlessInject must satisfy several criteria simultaneously. Finding a function that meets all of these requirements is the art of target selection:

| Requirement | Reason | Risk if Violated |
| --- | --- | --- |
| Called frequently | Shellcode needs to execute promptly | Payload never fires or fires too late |
| Exported by a loaded DLL | You need to resolve its address from your injector process | Cannot find the function to hook |
| Prologue ≥ hook size | Need space for the hook instruction (5 bytes for a relative CALL/JMP, or 14 for an absolute JMP) | Overwrite spills into next function, crash |
| Not currently executing | Thread safety during hook installation | Thread hits partially-written hook, crash |
| Not hooked by EDR | Avoid conflicts with existing EDR hooks | Double-hook conflict, detection, crash |
| Safe to detour briefly | Your shellcode adds latency to the call | Timeouts, UI freezes, deadlocks |

Sleep-Based vs. Event-Driven Triggers

Target functions broadly fall into two categories based on when and why they are called:

Sleep-Based Targets

These are functions called on a timer or periodic basis. The most common example is a thread that calls Sleep() or WaitForSingleObject() in a loop. Many Windows services and background processes have worker threads that sleep for a fixed interval, wake up, do some work, and sleep again. A hook on the sleep function fires each time such a thread enters the call, so your code runs once per cycle, at most one sleep interval after the hook is installed.

C++
// Example: a typical worker thread pattern in many Windows services
// This thread calls Sleep() every 1000ms - a reliable hook target
DWORD WINAPI WorkerThread(LPVOID param) {
    while (running) {
        DoPeriodicWork();
        Sleep(1000);  // <-- If we hook Sleep(), our code runs every ~1 second
    }
    return 0;
}

// ThreadlessInject's default approach: hook an export the process regularly calls
// Common choice: functions in ntdll.dll or kernel32.dll that service loops call

Event-Driven Targets

These functions are called in response to external events: network packets, user input, file system changes, etc. They may fire unpredictably but often fire quickly when the system is under normal operation. Examples include message dispatch functions, I/O completion routines, and network receive handlers.

ThreadlessInject's Default Approach

The ThreadlessInject tool allows the operator to specify which function to hook via command-line arguments. The user provides the DLL name and the export name. This flexibility means the operator can pick the best function for their specific target process. In practice, commonly chosen targets include exports from ntdll.dll, kernel32.dll, or application-specific DLLs that the target process is known to call frequently.

Resolving the Target Function Address

To hook a function in a remote process, you need its virtual address in that process. System DLLs such as ntdll.dll and kernel32.dll have a useful property: ASLR randomizes their base once at boot, and every process of the same bitness then maps them at that same address. You can therefore resolve the function address in your own process and use it in the target, provided both processes are 64-bit (or both 32-bit). A 32-bit WOW64 process loads a different copy of these DLLs at different addresses.

C++
// Resolving the target function address
// For system DLLs (ntdll, kernel32), the address is the same in all processes

// Method 1: GetProcAddress in our own process (works for system DLLs)
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");
FARPROC pTarget = GetProcAddress(hNtdll, "NtWaitForSingleObject");
// pTarget is valid in both our process AND the target process

// Method 2: For non-system DLLs, enumerate remote process modules
// Use NtQueryInformationProcess or CreateToolhelp32Snapshot to find
// the DLL base in the remote process, then parse its export table
MODULEENTRY32 me32;
me32.dwSize = sizeof(MODULEENTRY32);
HANDLE hSnap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, targetPid);
// Walk modules to find target DLL base address...
// Then parse PE export directory to find function RVA
// remoteAddr = remoteDllBase + exportRVA

Non-System DLLs: Different Base Addresses

For application-specific DLLs, ASLR may randomize the base address independently for each process. You cannot assume the DLL is at the same address in the target as in your process. You need to enumerate the remote process's module list (via CreateToolhelp32Snapshot with TH32CS_SNAPMODULE, or by reading the remote PEB) to find the actual base address of the DLL, then parse its export table to find the target function's RVA and compute its absolute address.
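The address computation described above is plain arithmetic, and a minimal sketch may make it concrete. The function names `exportRva` and `rebaseExport` and all numeric values are hypothetical and chosen only for illustration. The sketch assumes the export is not a forwarder and that both mappings come from the same file version.

```cpp
#include <cstdint>

// An export's RVA is its offset from the image base. It is fixed by the
// file on disk, so it is identical in every process that maps that file.
std::uint64_t exportRva(std::uint64_t imageBase, std::uint64_t exportAddress) {
    return exportAddress - imageBase;
}

// Rebasing: add the same RVA to the base address observed in another mapping.
std::uint64_t rebaseExport(std::uint64_t otherBase, std::uint64_t rva) {
    return otherBase + rva;
}
```

For example, with a hypothetical base of 0x180000000 and an export at 0x180001230, the RVA is 0x1230, and the same export in a mapping based at 0x7FF000000000 sits at 0x7FF000001230.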

Prologue Size Analysis

You need enough prologue bytes to accommodate your hook instruction: 14 bytes for an absolute JMP, or 5 bytes for the relative CALL that the actual ThreadlessInject tool uses. The overwrite rarely ends exactly on an instruction boundary, so every instruction it touches, including one that is only partially overwritten, must be saved whole and restored or replayed. Otherwise execution resumes in the middle of a damaged instruction and crashes. If the whole function is shorter than the hook, the overwrite spills into whatever follows it. You must analyze the target function's disassembly to confirm it has enough room.

x86-64 ASM
; Example: simplified NtWaitForSingleObject stub (ntdll.dll)
; The whole stub is only 11 bytes: enough for a 5-byte hook, not for a 14-byte one
4C 8B D1          mov r10, rcx      ; 3 bytes
B8 04 00 00 00    mov eax, 0x4      ; 5 bytes  (syscall number)
0F 05             syscall            ; 2 bytes
C3                ret                ; 1 byte
; Total stub: 11 bytes
; A 5-byte hook ends inside mov eax, so the first 8 bytes (both movs) must be saved
; A 14-byte hook would run 3 bytes past ret into whatever follows the stub
; Either way, every overwritten byte must be saved and replayed or restored

; Better example: a larger function prologue
48 89 5C 24 08    mov [rsp+8], rbx   ; 5 bytes
48 89 6C 24 10    mov [rsp+16], rbp  ; 5 bytes
48 89 74 24 18    mov [rsp+24], rsi  ; 5 bytes
57                push rdi            ; 1 byte
; 16 bytes - room for the 14-byte hook
; The hook ends inside the third mov, so the first 15 bytes must be saved

In practice, most non-trivial functions have prologues well over 14 bytes because they save multiple registers and allocate stack space. Small stub functions (like the ntdll syscall stubs shown above) require more care, but they are still hookable if you save all the overwritten bytes correctly in the trampoline.
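The boundary rule behind both examples can be written as a small calculation. The sketch below is illustrative only: `bytesToRelocate` is a hypothetical helper, and the instruction lengths are assumed to come from a disassembler rather than being decoded here.

```cpp
#include <cstddef>
#include <vector>

// Given the lengths of consecutive instructions starting at the function
// entry, return how many bytes must be saved so that a patch of `hookSize`
// bytes ends on an instruction boundary.
// Returns 0 if the listed instructions do not cover the hook at all.
std::size_t bytesToRelocate(const std::vector<std::size_t>& lengths,
                            std::size_t hookSize) {
    std::size_t total = 0;
    for (std::size_t len : lengths) {
        if (total >= hookSize) {
            break;  // the hook is already fully covered by whole instructions
        }
        total += len;
    }
    return total >= hookSize ? total : 0;
}
```

With the larger prologue above (lengths 5, 5, 5, 1), a 14-byte hook needs 15 bytes saved and a 5-byte hook needs 5. With the 11-byte syscall stub (lengths 3, 5, 2, 1), a 5-byte hook needs 8 bytes saved, and a 14-byte hook returns 0 because the stub is too short.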

Common Target Functions

Here are functions commonly considered as hook targets for threadless injection, along with their characteristics:

| Function | DLL | Call Pattern | Notes |
| --- | --- | --- | --- |
| NtWaitForSingleObject | ntdll.dll | Blocking waits (very frequent) | Called by any thread doing synchronization |
| NtClose | ntdll.dll | Handle cleanup (frequent) | Called whenever handles are closed |
| NtQueryInformationFile | ntdll.dll | File operations | Frequent in I/O-heavy processes |
| Sleep | kernel32.dll | Periodic timer loops | Good for services with worker threads |
| GetTickCount | kernel32.dll | Timing checks | Called by many GUI applications |
| MessageBoxW | user32.dll | User interaction | Only fires on dialog display (rare) |

Target Selection Decision Flow

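1. Identify the target process.
2. List its loaded DLLs and their exports.
3. Find a frequently called export.
4. Verify the prologue is at least as large as the hook (14 bytes for an absolute JMP, 5 for a relative CALL).
5. Confirm there is no EDR hook conflict.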

Stability Considerations

Hooking a function introduces latency to every call. If the hooked function is called from a time-sensitive context (a UI message loop, a thread that holds a lock, or a tight real-time loop), the added execution time of your shellcode could cause timeouts, UI freezes, or deadlocks. This is especially important for the one-shot pattern used by ThreadlessInject: the hook only needs to fire once, but while it is installed, every call to the hooked function takes the detour. Module 7 covers how ThreadlessInject solves this with immediate cleanup after the first execution.

ThreadlessInject's Approach

ThreadlessInject lets the operator specify the target DLL and export function on the command line. This design choice reflects the reality that the best target depends on the specific target process. For generic injection into common Windows services, ntdll.dll exports are safe bets. For application-specific injection, the operator can profile the target process to identify the most frequently called functions and choose accordingly.

Pop Quiz: Target Function Selection

Q1: Why can you use GetProcAddress in your own process to find the address of an ntdll.dll function in the target process?

Windows randomizes the base address of system DLLs like ntdll.dll and kernel32.dll once at boot time. After that, every process maps them at the same randomized address. So resolving a function in your own process gives you the same virtual address that is valid in the target process.

Q2: What happens if the target function's prologue is shorter than 14 bytes?

If you overwrite 14 bytes but the prologue instructions only span, say, 10 bytes, the last 4 bytes of your hook overwrite whatever follows the prologue, which may cut an instruction in half or spill into the next function. When the trampoline jumps back to "original function + 14", it can land in the middle of an instruction, causing undefined behavior (usually a crash).

Q3: Why is hooking a function that is currently being executed by multiple threads dangerous?

If a thread's instruction pointer is somewhere within the first 14 bytes of the target function at the exact moment you overwrite those bytes, that thread will try to execute a partially-overwritten instruction sequence. This is a race condition that typically results in an access violation or illegal instruction exception, crashing the thread or the process.