Module 1: The Thread Creation Problem
Every classic injection technique has the same Achilles' heel: creating a thread in a remote process.
Why This Module?
Before understanding ThreadlessInject by CCob (EthicalChaos), you must understand the problem it solves. Traditional process injection techniques rely on creating a new thread in the target process to execute injected code. This single operation generates a cascade of telemetry that modern EDR products exploit ruthlessly. ThreadlessInject exists because thread creation is, from an attacker's perspective, the loudest thing you can do.
The Classic Injection Pattern
Nearly every traditional process injection technique follows the same three-step pattern. First, you allocate memory in the target process. Second, you write your payload (shellcode or a DLL) into that memory. Third, you trigger execution of that payload. It is this third step — triggering execution — that has historically been the most detectable, because the most common approach is to create a new thread in the remote process.
The canonical implementation uses CreateRemoteThread, a Win32 API exported by kernel32.dll. This function asks the kernel to create a new thread in a specified process, starting execution at an address you control:
```cpp
// Classic injection: allocate, write, execute via new thread
HANDLE hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, targetPid);

// Step 1: Allocate memory in target
LPVOID remoteBuf = VirtualAllocEx(hProcess, NULL, shellcodeLen,
    MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

// Step 2: Write shellcode
WriteProcessMemory(hProcess, remoteBuf, shellcode, shellcodeLen, NULL);

// Step 3: Create a remote thread to execute it
HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0,
    (LPTHREAD_START_ROUTINE)remoteBuf, NULL, 0, NULL);
```
This pattern works, but it is trivially detectable. The call to CreateRemoteThread triggers a well-documented chain of kernel events that security products monitor.
Kernel-Level Thread Creation Callbacks
Windows provides a documented mechanism for kernel-mode drivers to receive notifications whenever a thread is created or exits: PsSetCreateThreadNotifyRoutine. EDR drivers commonly register a callback with this function. When your CreateRemoteThread call creates a new thread, the kernel invokes each registered callback, passing the owning process ID, the new thread ID, and a boolean indicating creation (TRUE) or deletion (FALSE).
| Kernel Callback | What It Reports | Used By |
|---|---|---|
| PsSetCreateThreadNotifyRoutine | New thread creation: owning process PID, thread ID, create/delete flag | All major EDRs |
| PsSetCreateProcessNotifyRoutineEx | New process creation with full image path | All major EDRs |
| PsSetLoadImageNotifyRoutine | DLL/image loads into any process | All major EDRs |
| ObRegisterCallbacks | Handle operations (OpenProcess, OpenThread) | Most EDRs |
The critical observation is this: on creation, the callback runs in the context of the thread that requested it, so the driver can compare the current process ID with the owning process ID it was handed. When process A creates a thread in process B, the two differ. A thread created by a process in itself is normal. A thread created by an external process is rare outside of process startup, where the parent creates the child's first thread, so the EDR will flag it for further analysis or block it outright.
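The comparison described above can be sketched as a small portable model. This is not driver code: `ThreadEvent`, its fields, and `isRemoteThreadCreation` are hypothetical names invented for illustration, and a real driver would obtain these values from the callback arguments and its own process-tracking state.

```cpp
#include <cstdint>

// Hypothetical, simplified view of what a thread-notify callback can observe.
struct ThreadEvent {
    std::uint32_t creatorPid;   // process whose thread requested the creation
    std::uint32_t ownerPid;     // process that will own the new thread
    std::uint32_t threadId;     // ID of the new thread
    bool isCreate;              // true = creation, false = exit
    bool isInitialThread;       // first thread of a brand-new process
};

// Returns true when one process creates a thread inside another,
// excluding the benign case of a parent starting a child process.
bool isRemoteThreadCreation(const ThreadEvent& e) {
    if (!e.isCreate) return false;          // exit notifications are not relevant
    if (e.isInitialThread) return false;    // process startup is expected to be cross-process
    return e.creatorPid != e.ownerPid;
}
```

The startup exclusion is the reason the check needs more than a bare PID comparison: without it, every process launch would look like an injection.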
Cross-Process Thread Creation = Immediate Alert
EDR products such as CrowdStrike Falcon, Microsoft Defender for Endpoint, and SentinelOne all receive thread-creation notifications through PsSetCreateThreadNotifyRoutine. When they detect cross-process thread creation (creator PID != target PID), this alone is often sufficient to flag the operation as suspicious and trigger deeper behavioral analysis or termination.
ETW Telemetry: Even More Visibility
Event Tracing for Windows (ETW) provides additional telemetry on thread creation. The events originate in the kernel but can be consumed from user mode. The Microsoft-Windows-Kernel-Process ETW provider emits ThreadStart and ThreadStop events for every thread in the system. EDR agents subscribe to these events and correlate them with their kernel callback data.
The ETW telemetry includes the thread's start address, which is the address passed to CreateRemoteThread. If this start address points into a region that was recently allocated with VirtualAllocEx and has executable permissions, detection is very likely. The sequence "allocate remote RWX memory, write to it, create thread pointing to it" is one of the best-known attack patterns in the Windows ecosystem.
```cpp
// What the ETW event looks like to the EDR:
// Event: ThreadStart/Start
// Fields:
//   ProcessId:    target.exe (PID 1234)
//   ThreadId:     newly created thread
//   StartAddress: 0x00000213A0010000  <-- points to VirtualAllocEx'd memory
//   StackBase:    ...
//   StackLimit:   ...
//
// EDR correlation:
//   - StartAddress is in private, recently-allocated, RWX memory
//   - Creating process (attacker.exe) != target process (target.exe)
//   - VERDICT: Malicious remote thread injection
```
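The start-address check in that correlation can be modeled in a few lines. This is a sketch under stated assumptions: `Region`, `RegionType`, and `startAddressIsSuspicious` are hypothetical names, and the region list stands in for whatever memory map an agent maintains or queries for the target process.

```cpp
#include <cstdint>
#include <vector>

enum class RegionType { Image, Mapped, Private };

// Hypothetical snapshot of one memory region in the target process.
struct Region {
    std::uint64_t base;
    std::uint64_t size;
    RegionType type;     // Image = backed by a module on disk
    bool executable;
};

// True when the thread start address falls inside private, executable memory,
// which is where injected shellcode typically lives.
bool startAddressIsSuspicious(const std::vector<Region>& regions,
                              std::uint64_t startAddress) {
    for (const Region& r : regions) {
        if (startAddress >= r.base && startAddress - r.base < r.size) {
            return r.executable && r.type == RegionType::Private;
        }
    }
    return false;  // no region data for this address, so no verdict from this check
}
```

A legitimate thread normally starts inside an image-backed region (a loaded DLL or the main executable), which is why the region type matters as much as the executable flag.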
Alternatives to CreateRemoteThread (Still Loud)
Attackers have tried many variations to avoid CreateRemoteThread detection, but all share the fundamental problem of creating a new thread or scheduling code execution through monitored mechanisms.
Traditional Injection Execution Methods
Even when attackers use lower-level native API functions like NtCreateThreadEx, the kernel callback still fires because the actual thread creation happens in the kernel. The PsSetCreateThreadNotifyRoutine callback is triggered by the kernel's internal thread creation path, not by the specific userland API that was called. Switching from CreateRemoteThread to NtCreateThreadEx is merely cosmetic from a detection standpoint.
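A toy model makes the layering concrete. The function names below are prefixed with `toy` because they are stand-ins, not the real APIs, and `ToyKernel` is a hypothetical placeholder for the kernel's internal creation path. The only point it demonstrates is that every entry point funnels into the same place where callbacks fire.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using ThreadCallback = std::function<void(std::uint32_t ownerPid)>;

// Toy stand-in for the kernel: every thread creation request ends up here.
struct ToyKernel {
    std::vector<ThreadCallback> callbacks;  // registered notify routines

    void createThread(std::uint32_t ownerPid) {
        for (auto& cb : callbacks) cb(ownerPid);  // fires regardless of the caller's API
    }
};

// Toy stand-ins for the userland layers; each one only forwards downward.
void toyNtCreateThreadEx(ToyKernel& k, std::uint32_t pid) { k.createThread(pid); }
void toyCreateRemoteThreadEx(ToyKernel& k, std::uint32_t pid) { toyNtCreateThreadEx(k, pid); }
void toyCreateRemoteThread(ToyKernel& k, std::uint32_t pid) { toyCreateRemoteThreadEx(k, pid); }
```

Calling `toyCreateRemoteThread` or `toyNtCreateThreadEx` directly produces the same callback invocation, which mirrors why skipping the higher-level wrapper changes nothing for a kernel-side observer.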
APC-Based Injection: Better, But Still Detectable
Asynchronous Procedure Calls (APCs) represent a step forward because they execute code on an existing thread rather than creating a new one. However, APCs have their own problems. The target thread must be in an alertable wait state (meaning it called SleepEx, WaitForSingleObjectEx, or similar with the bAlertable flag set to TRUE). Not all threads are in an alertable state, making APC injection unreliable. Additionally, modern EDR products now monitor NtQueueApcThread calls where the calling process differs from the target process.
```cpp
// APC injection: still requires finding an alertable thread
// and the cross-process QueueUserAPC call is monitored
HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, targetThreadId);
QueueUserAPC((PAPCFUNC)remoteShellcodeAddr, hThread, 0);

// Problem 1: thread must be in alertable wait state
// Problem 2: cross-process APC queuing is now monitored by EDRs
// Problem 3: if thread never enters alertable state, payload never runs
```
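Problems 1 and 3 follow from how user-mode APC delivery works, and a simplified simulation shows why. `ToyThread` is a hypothetical class, not the Windows implementation: it models only the rule that queued APCs run when the thread performs an alertable wait.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy model of a thread's user-mode APC queue.
class ToyThread {
public:
    // Queuing succeeds immediately, but nothing runs yet.
    void queueApc(std::function<void()> fn) { pending_.push(std::move(fn)); }

    // Non-alertable wait (like Sleep): pending APCs stay queued.
    void wait() {}

    // Alertable wait (like SleepEx with bAlertable = TRUE):
    // drains the queue and returns how many APCs ran.
    int alertableWait() {
        int ran = 0;
        while (!pending_.empty()) {
            auto fn = std::move(pending_.front());
            pending_.pop();
            fn();
            ++ran;
        }
        return ran;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};
```

A thread that only ever calls the non-alertable wait leaves the APC queued indefinitely, which is the unreliability the section describes.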
The ThreadlessInject Insight
What if you could make the target process execute your code without creating a thread and without queuing an APC? What if, instead of telling the target process to run something new, you modified something the target is already doing so that it runs your code as part of its normal operation? This is the core insight behind ThreadlessInject: hook a function that the target process already calls regularly, so that the next time an existing thread calls that function, your code runs.
The Detection Surface Summary
To appreciate why ThreadlessInject is significant, consider the full detection surface of traditional injection:
| Detection Layer | What It Catches | ThreadlessInject Avoids? |
|---|---|---|
| Kernel thread callbacks | Cross-process thread creation | Yes: no new thread |
| ETW thread events | Thread start address in suspicious memory | Yes: no new thread |
| Handle monitoring | OpenProcess with suspicious access rights | Partial: still needs handle |
| Memory scanning | RWX memory regions, shellcode patterns | Partial: still allocates memory |
| API hooking (userland) | Calls to CreateRemoteThread, WriteProcessMemory | Partial: no CreateRemoteThread call, but hooks placed on the Nt* stubs in ntdll can still see the memory operations |
| Behavioral analysis | Pattern: alloc + write + thread creation | Yes: no thread in pattern |
ThreadlessInject eliminates the single most detectable component of the injection chain — the thread creation event — while accepting that some detection surface (like cross-process memory allocation and writing) remains. The trade-off is heavily in the attacker's favor because thread creation was by far the strongest signal available to defenders.
Pop Quiz: The Thread Creation Problem
Q1: Why does switching from CreateRemoteThread to NtCreateThreadEx not evade kernel-level detection?
Q2: What is the primary limitation of APC-based injection?
Q3: What is the core innovation of ThreadlessInject?