Difficulty: Beginner

Module 2: Windows Thread Pool Architecture

The user-mode thread pool API, its core data structures, and the I/O completion port that drives everything.

Module Objective

Understand the Windows Thread Pool API surface, the relationship between TP_POOL, TP_WORK, TP_TIMER, TP_WAIT, TP_IO, and TP_ALPC objects, how worker threads are managed, and how I/O completion ports (IOCP) serve as the central dispatch mechanism. This is essential groundwork for understanding all 8 PoolParty variants.

1. Why Thread Pools Exist

Creating and destroying threads is expensive. Each thread requires kernel stack allocation, context initialization, and scheduler registration. Thread pools solve this by maintaining a set of pre-created worker threads that wait for work items to process.

The fact that every Windows process has a thread pool is critical for PoolParty — it means every process has injectable infrastructure already running.
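The idea of pre-created workers blocking on a shared queue can be sketched in portable C++. This is a teaching model only, not how Windows implements its pool: the names `MiniPool`, `Submit`, and `RunIncrements` are hypothetical, and a mutex-protected queue stands in for the kernel machinery described later in this module.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical teaching class: a fixed set of workers sharing one queue.
class MiniPool {
public:
    explicit MiniPool(std::size_t workers) {
        // Threads are created once, up front, and then reused for every item.
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { WorkerLoop(); });
    }

    // Signals shutdown, lets workers drain the queue, then joins them.
    ~MiniPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& t : threads_) t.join();
    }

    void Submit(std::function<void()> item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(item));
        }
        wake_.notify_one();
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> item;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // only reached when stopping
                item = std::move(queue_.front());
                queue_.pop();
            }
            item();  // run the work item outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};

// Submits `count` increments to a pool of `workers` threads and returns the total.
int RunIncrements(int count, std::size_t workers) {
    std::atomic<int> total{0};
    {
        MiniPool pool(workers);
        for (int i = 0; i < count; ++i)
            pool.Submit([&total] { total.fetch_add(1); });
    }  // the destructor drains the queue and joins the workers
    return total.load();
}
```

The cost of thread creation is paid once in the constructor; each submitted item only pays for a queue push and a wakeup.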

2. The Thread Pool API

Windows exposes the thread pool through both the public Win32 API and the internal ntdll.dll Tp* functions. The public API wraps the internal functions:

| Public API | Internal (ntdll) | Purpose |
| --- | --- | --- |
| CreateThreadpoolWork | TpAllocWork | Create a work callback item |
| SubmitThreadpoolWork | TpPostWork | Submit work to the pool queue |
| CreateThreadpoolTimer | TpAllocTimer | Create a timer callback item |
| SetThreadpoolTimer | TpSetTimer | Arm the timer with a due time |
| CreateThreadpoolWait | TpAllocWait | Create a wait callback item |
| SetThreadpoolWait | TpSetWait | Associate a handle to wait on |
| CreateThreadpoolIo | TpAllocIoCompletion | Create an I/O callback item |
| StartThreadpoolIo | TpStartAsyncIoOperation | Begin async I/O tracking |

PoolParty works at the internal Tp* level, directly manipulating the structures these APIs create and manage.
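For orientation, a minimal sketch of the public API side of that table might look like the following. It runs one work item on the calling process's own default pool and waits for it. The helper names `DemoWorkCallback` and `RunOneWorkItem` are hypothetical, and the non-Windows branch exists only so the sketch compiles elsewhere; it does no thread pool work.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>

// Runs on a pool worker thread; writes a marker through the context pointer.
static VOID CALLBACK DemoWorkCallback(PTP_CALLBACK_INSTANCE /*instance*/,
                                      PVOID context,
                                      PTP_WORK /*work*/) {
    *static_cast<int*>(context) = 42;
}

// Returns 42 if the callback ran, or -1 if the work item could not be created.
int RunOneWorkItem() {
    int result = 0;

    // A null callback environment selects the process's default pool.
    PTP_WORK work = CreateThreadpoolWork(DemoWorkCallback, &result, nullptr);
    if (work == nullptr) return -1;

    SubmitThreadpoolWork(work);                   // queue the item
    WaitForThreadpoolWorkCallbacks(work, FALSE);  // block until the callback finishes
    CloseThreadpoolWork(work);                    // release the work object
    return result;
}
#else
// Placeholder for non-Windows builds: the thread pool API is unavailable here.
int RunOneWorkItem() { return -1; }
#endif
```

Per the table above, the create and submit calls are the public wrappers over TpAllocWork and TpPostWork.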

3. Core Data Structures

3.1 TP_POOL

The central structure representing a thread pool instance. Every process has at least one default TP_POOL. Key fields include:

C++ (Reconstructed)
// Simplified TP_POOL structure (undocumented, reverse-engineered)
struct TP_POOL {
    TP_TASK_CALLBACKS  TaskCallbacks;
    LIST_ENTRY         WorkerListHead;      // Linked list of worker threads
    ULONG              NumWorkers;           // Current worker count
    ULONG              MinWorkers;           // Minimum worker threads
    ULONG              MaxWorkers;           // Maximum worker threads
    HANDLE             CompletionPort;       // IOCP handle - THE dispatch mechanism
    TP_QUEUE           TaskQueue[3];         // High, Normal, Low priority queues
    HANDLE             WorkerFactory;        // Worker factory kernel object
    // ... additional fields
};
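The three-slot TaskQueue array suggests one queue per priority level. The sketch below models that layout in portable C++. Note the assumption: it drains strictly in High, Normal, Low order, which is an illustrative policy and not a confirmed description of how the real pool schedules. All names (`PriorityTaskQueues`, `DrainOrder`) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Index order mirrors the comment on TaskQueue[3]: High, Normal, Low.
enum Priority { kHigh = 0, kNormal = 1, kLow = 2 };

class PriorityTaskQueues {
public:
    void Push(Priority p, std::string name) {
        queues_[static_cast<std::size_t>(p)].push_back(std::move(name));
    }

    // Pops from the highest-priority non-empty queue (assumed policy).
    // Returns false when every queue is empty.
    bool PopNext(std::string& out) {
        for (auto& q : queues_) {
            if (!q.empty()) {
                out = std::move(q.front());
                q.pop_front();
                return true;
            }
        }
        return false;
    }

private:
    std::array<std::deque<std::string>, 3> queues_;
};

// Pushes one item per priority in reverse order and reports the drain order.
std::string DrainOrder() {
    PriorityTaskQueues queues;
    queues.Push(kLow, "low");
    queues.Push(kNormal, "normal");
    queues.Push(kHigh, "high");

    std::string order, name;
    while (queues.PopNext(name)) {
        if (!order.empty()) order += ",";
        order += name;
    }
    return order;
}
```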

The IOCP Is Everything

The CompletionPort field is the I/O completion port that all worker threads wait on. Every work item, timer expiration, wait satisfaction, and I/O completion is dispatched through this single IOCP. If you can post a completion packet to this IOCP, you can trigger code execution on a worker thread. This is the insight behind several PoolParty variants.

3.2 TP_WORK

Represents a work callback — a function pointer and context to be executed by a worker thread:

C++ (Reconstructed)
// TP_TASK is defined first because TP_WORK embeds it by value
struct TP_TASK {
    TP_CALLBACK_ENVIRON_V3 *CallbackEnviron;
    PTP_WORK_CALLBACK       WorkCallback;    // Function to call
    PVOID                   Context;          // Argument to callback
    // ...
};

struct TP_WORK {
    TP_TASK        Task;           // Contains the callback + context
    TP_POOL       *Pool;           // Owning pool
    LIST_ENTRY     ListEntry;      // Queue linkage
    ULONG          Flags;
    // ...
};
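Because the task is embedded by value inside the work object, code that holds only a pointer to the task can recover the enclosing object with offset arithmetic. Windows headers provide the CONTAINING_RECORD macro for this idiom. The portable sketch below shows the same arithmetic on hypothetical `DemoTask` and `DemoWork` types; the embedded member is deliberately not first here so that the subtraction is visible, whereas in the reconstructed TP_WORK above it is the first field.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the reconstructed structures.
struct DemoTask {
    void (*Callback)(void*);
    void* Context;
};

struct DemoWork {
    unsigned Flags;
    DemoTask Task;   // embedded by value
    int      PoolId; // stands in for the owning-pool pointer
};

// Portable analogue of CONTAINING_RECORD(task, DemoWork, Task).
inline DemoWork* WorkFromTask(DemoTask* task) {
    return reinterpret_cast<DemoWork*>(
        reinterpret_cast<char*>(task) - offsetof(DemoWork, Task));
}

// Returns true if the enclosing object is recovered from the embedded task.
bool RoundTrip() {
    DemoWork work{};
    work.PoolId = 7;
    DemoTask* task = &work.Task;
    DemoWork* recovered = WorkFromTask(task);
    return recovered == &work && recovered->PoolId == 7;
}
```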

3.3 TP_TIMER

A timer-based callback item. When the timer expires, the callback is dispatched through the IOCP:

C++ (Reconstructed)
struct TP_TIMER {
    TP_TASK        Task;            // Callback + context
    TP_POOL       *Pool;
    LARGE_INTEGER  DueTime;         // When to fire
    ULONG          Period;          // Recurring interval (0 = one-shot)
    ULONG          Window;          // Coalescing window
    LIST_ENTRY     WindowListEntry; // Timer window list
    // ...
};
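At the public API level, SetThreadpoolTimer takes its due time as a FILETIME in 100-nanosecond units, where a negative value means "relative to now", while the period and window are given in milliseconds. The sketch below shows that unit conversion in portable C++. The helper names are hypothetical, `DueTimeParts` only mirrors the two 32-bit halves of a FILETIME, and how the internal DueTime field stores the value is not something this sketch claims to know.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors FILETIME's layout: low and high 32-bit halves of a 64-bit value.
struct DueTimeParts {
    std::uint32_t low;
    std::uint32_t high;
};

// Converts a delay in milliseconds to a relative due time:
// a negative count of 100-ns intervals (1 ms = 10,000 intervals).
constexpr std::int64_t RelativeDueTime100ns(std::uint32_t delayMs) {
    return -static_cast<std::int64_t>(delayMs) * 10000;
}

// Splits the signed 64-bit value into the two unsigned halves.
inline DueTimeParts Split(std::int64_t value) {
    const auto bits = static_cast<std::uint64_t>(value);
    return { static_cast<std::uint32_t>(bits & 0xFFFFFFFFu),
             static_cast<std::uint32_t>(bits >> 32) };
}
```

For example, a 1.5 second delay becomes -15,000,000 intervals, and a zero delay stays zero.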

3.4 TP_WAIT

Waits on a kernel object (event, mutex, semaphore, process). When the object is signaled, the callback fires:

C++ (Reconstructed)
struct TP_WAIT {
    TP_TASK        Task;         // Callback + context
    TP_POOL       *Pool;
    HANDLE         WaitObject;   // Object to wait on
    HANDLE         WaitHandle;   // Registered wait handle
    // ...
};

3.5 TP_IO

I/O completion callback. When an async I/O operation completes on a file handle bound to the thread pool, the callback fires:

C++ (Reconstructed)
struct TP_IO {
    TP_TASK        Task;            // Callback + context
    TP_POOL       *Pool;
    HANDLE         FileHandle;      // The I/O target
    ULONG          PendingCount;    // Outstanding operations
    // ...
};

3.6 TP_ALPC

ALPC (Advanced Local Procedure Call) port callback. When a message arrives on a bound ALPC port, the callback fires. This is used internally by Windows for RPC dispatch:

C++ (Reconstructed)
struct TP_ALPC {
    TP_TASK        Task;          // Callback + context
    TP_POOL       *Pool;
    HANDLE         AlpcPort;      // ALPC port handle
    // ...
};

4. I/O Completion Ports (IOCP)

The IOCP is the backbone of the Windows thread pool. It is a kernel object that acts as a queue of completion packets and a thread wakeup mechanism:

IOCP as Central Dispatcher

[Diagram: TP_WORK (work callback), TP_TIMER (timer expiry), and TP_IO (I/O completion) each feed completion packets into the IOCP queue, which are consumed by a worker thread (TppWorkerThread).]

Worker threads call NtRemoveIoCompletion (the native equivalent of GetQueuedCompletionStatus) to block until a completion packet arrives. When one does, the worker thread extracts the work item from the packet and calls the registered callback.

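C++
// How worker threads consume work (simplified pseudocode)
while (true) {
    PVOID completionKey;
    PVOID overlapped;
    IO_STATUS_BLOCK ioStatusBlock;

    // Block until work is available
    NtRemoveIoCompletion(pool->CompletionPort,
                         &completionKey,
                         &overlapped,
                         &ioStatusBlock,
                         NULL);  // NULL timeout = wait indefinitely

    // The completion key and overlapped encode the work item type.
    // DecodeWorkItem is a hypothetical helper standing in for that logic.
    TP_TASK* task = DecodeWorkItem(completionKey, overlapped);

    // PTP_WORK_CALLBACK takes (Instance, Context, Work); the instance and
    // work arguments are elided here and shown as placeholders.
    task->WorkCallback(/* Instance */ nullptr, task->Context, /* Work */ nullptr);
}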
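The dequeue-decode-dispatch pattern of the loop above can be modeled in portable C++. This is a simulation under stated assumptions, not the kernel object: `FakeCompletionPort`, `Packet`, `DemoItem`, and the two key constants are hypothetical, and a mutex-protected deque stands in for the IOCP. The key selects how the packet is interpreted, and the pointer carries the item to run.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical completion packet: a key plus an opaque pointer.
struct Packet {
    std::uintptr_t key;
    void*          context;
};

// Stand-in for an IOCP: a blocking queue of packets.
class FakeCompletionPort {
public:
    void Post(Packet p) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            packets_.push_back(p);
        }
        ready_.notify_one();
    }

    // Blocks until a packet is available, then removes and returns it.
    Packet Remove() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !packets_.empty(); });
        Packet p = packets_.front();
        packets_.pop_front();
        return p;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Packet> packets_;
};

constexpr std::uintptr_t kKeyShutdown = 0;  // hypothetical: tells the worker to exit
constexpr std::uintptr_t kKeyWork     = 1;  // hypothetical: context is a DemoItem*

struct DemoItem {
    void (*callback)(void*);
    void* context;
};

// Worker loop: dequeue a packet, decode it by key, invoke the callback.
void WorkerLoop(FakeCompletionPort& port) {
    for (;;) {
        Packet p = port.Remove();
        if (p.key == kKeyShutdown) return;
        auto* item = static_cast<DemoItem*>(p.context);
        item->callback(item->context);
    }
}

static void AddTen(void* ctx) { *static_cast<int*>(ctx) += 10; }

// Posts two work packets and a shutdown packet to a single worker.
int RunDispatchDemo() {
    FakeCompletionPort port;
    int value = 0;
    DemoItem item{AddTen, &value};

    std::thread worker(WorkerLoop, std::ref(port));
    port.Post({kKeyWork, &item});
    port.Post({kKeyWork, &item});
    port.Post({kKeyShutdown, nullptr});
    worker.join();
    return value;
}
```

The worker has no idea who posted a packet; it runs whatever the packet decodes to, which is why the queue is the single point that decides what executes.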

5. Worker Threads and the Worker Factory

Worker threads are not created with CreateThread. Instead, Windows uses a Worker Factory kernel object that manages thread creation and destruction:

| Component | Description |
| --- | --- |
| Worker Factory | Kernel object (TpWorkerFactory) that creates and manages worker threads for a pool |
| Start Routine | The function new worker threads begin executing, typically TppWorkerThread |
| Min/Max Threads | The factory automatically scales between min and max based on load |
| NtCreateWorkerFactory | Creates the factory, specifying the IOCP and start routine |

C++
// Worker factory creation (internal, during pool init)
NTSTATUS NtCreateWorkerFactory(
    PHANDLE WorkerFactoryHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    HANDLE CompletionPortHandle,    // The pool's IOCP
    HANDLE WorkerProcessHandle,     // Process the threads run in
    PVOID StartRoutine,             // TppWorkerThread
    PVOID StartParameter,           // Pool context
    ULONG MaxThreadCount,
    SIZE_T StackReserve,
    SIZE_T StackCommit
);

Variant 1 Preview

The Worker Factory's StartRoutine determines what function new worker threads execute. PoolParty Variant 1 uses NtQueryInformationWorkerFactory to read the StartRoutine address, overwrites the code at that address with shellcode, and then uses NtSetInformationWorkerFactory to raise the minimum thread count, which triggers creation of a new worker thread. The new thread starts executing the shellcode directly.

6. The Default Thread Pool

Every Windows process automatically has a default thread pool. It is created lazily when any thread pool API is first called.

PoolParty’s first challenge in each variant is to locate the target process’s TP_POOL and extract the relevant handles (IOCP, Worker Factory) from it.

7. Object Relationships

How the Pieces Fit Together

| Object | Lives In | Connected To |
| --- | --- | --- |
| TP_POOL | User-mode heap | IOCP handle, Worker Factory handle, task queues |
| TP_WORK | User-mode heap | Back-pointer to TP_POOL, callback + context |
| TP_TIMER | User-mode heap | Back-pointer to TP_POOL, timer parameters |
| TP_WAIT | User-mode heap | Back-pointer to TP_POOL, wait object handle |
| TP_IO | User-mode heap | Back-pointer to TP_POOL, file handle |
| TP_ALPC | User-mode heap | Back-pointer to TP_POOL, ALPC port handle |
| Worker Factory | Kernel | IOCP handle, StartRoutine, thread counts |
| IOCP | Kernel | Completion packet queue, waiting thread list |

The key takeaway: all roads lead to the IOCP. Whether you inject a work item, fire a timer, signal a wait, complete an I/O, or send an ALPC message, the callback dispatch goes through the I/O completion port. Understanding this central role is essential for understanding every PoolParty variant.

Knowledge Check

Q1: What kernel object serves as the central dispatch mechanism for all thread pool callbacks?

A) I/O Completion Port (IOCP)
B) Event object
C) Mutex
D) Semaphore

Q2: What internal ntdll function do worker threads call to wait for work items?

A) NtWaitForSingleObject
B) NtDelayExecution
C) NtRemoveIoCompletion
D) NtWaitForMultipleObjects

Q3: Why is the default thread pool significant for PoolParty?

A) It has no security restrictions
B) Every Windows process has one, providing a universal injection target
C) It runs with SYSTEM privileges
D) It cannot be monitored by EDRs