Difficulty: Beginner

Module 3: Thread Pool Internals

TppWorkerThread, work item lifecycle, how the kernel dispatches callbacks, and TP_POOL deep dive.

Module Objective

Understand the internal execution flow of a thread pool worker thread from the moment it is created by the worker factory, through its main loop waiting on the IOCP, to the point where it dispatches a callback. This understanding is critical for knowing exactly where PoolParty inserts its hooks into the dispatch chain.

1. TppWorkerThread — The Worker Main Loop

Every worker thread in a Windows thread pool begins execution at ntdll!TppWorkerThread. This function contains the main dispatch loop that all work items pass through:

C++ (Pseudocode)
// ntdll!TppWorkerThread - simplified pseudocode
VOID TppWorkerThread(PTP_POOL Pool)
{
    // Register this thread with the worker factory
    TppWorkerThreadInit(Pool);

    while (TRUE)
    {
        IO_STATUS_BLOCK iosb;
        PVOID completionKey;
        PVOID overlapped;

        // Wait on the pool's IOCP for the next work item
        NTSTATUS status = NtRemoveIoCompletion(
            Pool->CompletionPort,
            &completionKey,
            &overlapped,
            &iosb,
            NULL  // infinite wait
        );

        if (!NT_SUCCESS(status))
            break;

        // Determine work item type from the completion packet
        TP_TASK_TYPE type = DecodeTaskType(completionKey, overlapped);

        // The completion key carries the item pointer (TpPostWork in
        // section 2.2 posts the work item as the key, not as the overlapped value)
        switch (type)
        {
            case TP_TASK_WORK:
                TppWorkpExecuteCallback((PTP_WORK)completionKey);
                break;
            case TP_TASK_TIMER:
                TppTimerpExecuteCallback((PTP_TIMER)completionKey);
                break;
            case TP_TASK_WAIT:
                TppWaitpExecuteCallback((PTP_WAIT)completionKey);
                break;
            case TP_TASK_IO:
                TppIopExecuteCallback((PTP_IO)completionKey);
                break;
            case TP_TASK_ALPC:
                TppAlpcpExecuteCallback((PTP_ALPC)completionKey);
                break;
            case TP_TASK_DIRECT:
                TppDirectpExecuteCallback((PTP_DIRECT)completionKey);
                break;
        }

        // Notify the worker factory this thread is available again
        TppWorkerThreadReady(Pool);
    }
}
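The loop above can be tried out in a portable model that needs no Windows headers. The sketch below is an illustration under stated assumptions, not the ntdll implementation: ModelPort, ModelPacket, ModelTaskType, HandlerName, and RunWorkerLoop are hypothetical names, the port is a plain FIFO, and an empty port ends the loop where the real NtRemoveIoCompletion call would block.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical task tags mirroring the cases in the pseudocode above.
enum class ModelTaskType { Work, Timer, Wait, Io, Alpc, Direct };

// Stand-in for a completion packet: a type tag plus an opaque label.
struct ModelPacket {
    ModelTaskType type;
    std::string label;
};

// Stand-in for the pool's IOCP: a simple FIFO of packets.
class ModelPort {
public:
    void Post(ModelPacket packet) { queue_.push_back(std::move(packet)); }

    // Moves the next packet into `out` and returns true, or returns false
    // when the port is empty. The real NtRemoveIoCompletion blocks instead.
    bool Remove(ModelPacket& out) {
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::deque<ModelPacket> queue_;
};

// Name of the handler that the dispatch switch selects for a task type.
const char* HandlerName(ModelTaskType type) {
    switch (type) {
        case ModelTaskType::Work:   return "work";
        case ModelTaskType::Timer:  return "timer";
        case ModelTaskType::Wait:   return "wait";
        case ModelTaskType::Io:     return "io";
        case ModelTaskType::Alpc:   return "alpc";
        case ModelTaskType::Direct: return "direct";
    }
    return "unknown";
}

// Dequeue, decode, dispatch, and repeat until the port is empty.
// Returns a log of "handler:label" entries in dispatch order.
std::vector<std::string> RunWorkerLoop(ModelPort& port) {
    std::vector<std::string> log;
    ModelPacket packet{ModelTaskType::Work, ""};
    while (port.Remove(packet)) {
        log.push_back(std::string(HandlerName(packet.type)) + ":" + packet.label);
    }
    return log;
}
```

Posting a Timer packet followed by a Work packet and then calling RunWorkerLoop yields the two entries in posting order. The model, like the pseudocode, never asks who posted a packet.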

Key Insight for PoolParty

The worker thread does not validate the origin of completion packets. If a completion packet arrives on the IOCP with a properly formatted work item, the worker thread will dispatch it. This is the fundamental trust assumption PoolParty exploits — completion packets are trusted implicitly.

2. Work Item Lifecycle

A normal work item goes through these stages from creation to callback execution:

Work Item Lifecycle

TpAllocWork (allocate TP_WORK) -> TpPostWork (queue to pool) -> NtSetIoCompletion (post to IOCP) -> Callback (worker executes)

2.1 Allocation (TpAllocWork)

TpAllocWork allocates a TP_WORK structure on the heap, initializes the callback pointer and context, and associates it with a TP_POOL:

C++
// Creating a work item (normal usage)
PTP_WORK workItem = NULL;
TpAllocWork(&workItem, MyCallback, myContext, NULL);
// workItem->Task.WorkCallback = MyCallback
// workItem->Task.Context = myContext
// workItem->Pool = default pool (or specified pool)

2.2 Submission (TpPostWork)

TpPostWork inserts the work item into the pool’s task queue and posts a completion packet to the IOCP to wake a worker thread:

C++ (Pseudocode)
VOID TpPostWork(PTP_WORK Work)
{
    // Insert into the pool's pending task queue
    TppWorkInsertQueue(Work->Pool, Work, TP_PRIORITY_NORMAL);

    // Post a completion packet to wake a worker thread
    NtSetIoCompletion(
        Work->Pool->CompletionPort,
        (PVOID)Work,           // KeyContext = work item pointer
        NULL,                  // ApcContext
        STATUS_SUCCESS,        // IoStatus
        0                      // IoStatusInformation
    );
}

2.3 Dispatch

A sleeping worker thread is woken by the IOCP, dequeues the completion packet, extracts the TP_WORK pointer, and calls the registered callback:

C++ (Pseudocode)
VOID TppWorkpExecuteCallback(PTP_WORK Work)
{
    PTP_CALLBACK_INSTANCE instance;
    TppInitCallbackInstance(&instance, Work);

    // Call the user's callback function
    Work->Task.WorkCallback(
        instance,
        Work->Task.Context,
        Work
    );

    TppCleanupCallbackInstance(&instance);
}
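The three stages in sections 2.1 to 2.3 can be followed end to end in a small portable model. This is a sketch under stated assumptions rather than the ntdll code: LifecyclePool, LifecycleWork, AllocWork, PostWork, DispatchPending, and CountingCallback are hypothetical names, and a plain counter stands in for the completion packets posted to the IOCP.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>

struct LifecycleWork;
using LifecycleCallback = void (*)(void* context, LifecycleWork* work);

// Hypothetical stand-in for TP_POOL: a task queue plus a wake-up counter
// that plays the role of packets posted to the IOCP.
struct LifecyclePool {
    std::deque<LifecycleWork*> taskQueue;
    std::size_t pendingWakeups = 0;
};

// Hypothetical stand-in for TP_WORK: callback, context, and owning pool.
struct LifecycleWork {
    LifecycleCallback callback;
    void* context;
    LifecyclePool* pool;
};

// Stage 1 (allocation): build the item and bind it to a pool.
std::unique_ptr<LifecycleWork> AllocWork(LifecyclePool& pool,
                                         LifecycleCallback callback,
                                         void* context) {
    return std::unique_ptr<LifecycleWork>(
        new LifecycleWork{callback, context, &pool});
}

// Stage 2 (submission): queue the item, then post a wake-up.
void PostWork(LifecycleWork& work) {
    work.pool->taskQueue.push_back(&work);
    ++work.pool->pendingWakeups;
}

// Stage 3 (dispatch): for each wake-up, take the next item and call it.
// Returns the number of callbacks executed.
std::size_t DispatchPending(LifecyclePool& pool) {
    std::size_t executed = 0;
    while (pool.pendingWakeups > 0 && !pool.taskQueue.empty()) {
        --pool.pendingWakeups;
        LifecycleWork* work = pool.taskQueue.front();
        pool.taskQueue.pop_front();
        work->callback(work->context, work);
        ++executed;
    }
    return executed;
}

// Example callback: treats the context as an int counter and increments it.
void CountingCallback(void* context, LifecycleWork*) {
    ++*static_cast<int*>(context);
}
```

Allocating one item with CountingCallback, posting it twice, and calling DispatchPending runs the callback twice. Allocation alone schedules nothing, because only PostWork produces a wake-up.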

3. The Worker Factory Kernel Object

The worker factory is a kernel-mode object type (TpWorkerFactory) that manages the lifecycle of worker threads. Key properties:

Property | Description | Relevant To
StartRoutine | Function pointer for new threads (TppWorkerThread) | Variant 1
StartParameter | Parameter passed to StartRoutine (TP_POOL pointer) | Variant 1
MinThreadCount | Minimum number of worker threads | Thread creation trigger
MaxThreadCount | Maximum number of worker threads | Scaling limits
CompletionPort | The IOCP that workers wait on | All IOCP variants
WorkerCount | Current number of active workers | Scaling decisions

3.1 Querying the Worker Factory

The NtQueryInformationWorkerFactory syscall retrieves information about a worker factory, including its start routine and thread counts:

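C++
// Query worker factory information.
// The layout below is simplified for illustration; the real structure is
// undocumented, has additional fields, and may differ between Windows builds.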
typedef struct _WORKER_FACTORY_BASIC_INFORMATION {
    LARGE_INTEGER Timeout;
    LARGE_INTEGER RetryDelay;
    ULONG         IdleWorkerCount;
    ULONG         TotalWorkerCount;
    ULONG         ActiveWorkerCount;
    ULONG         WaitingWorkerCount;
    ULONG         PendingWorkerCount;
    PVOID         StartRoutine;        // TppWorkerThread normally
    PVOID         StartParameter;      // TP_POOL pointer
    HANDLE        CompletionPort;      // Pool IOCP
    NTSTATUS      LastThreadCreationStatus;
    HANDLE        ProcessId;
} WORKER_FACTORY_BASIC_INFORMATION;

WORKER_FACTORY_BASIC_INFORMATION wfInfo;
NtQueryInformationWorkerFactory(
    hWorkerFactory,
    WorkerFactoryBasicInformation,
    &wfInfo,
    sizeof(wfInfo),
    NULL
);

3.2 Modifying the Worker Factory

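NtSetInformationWorkerFactory can modify some worker factory properties, such as the minimum thread count. It does not change StartRoutine, which is fixed when the factory is created. Variant 1 uses this API to raise the minimum thread count, which makes the factory create a new worker thread that begins executing at StartRoutine:

C++
// Raise the minimum thread count (used by Variant 1)
NtSetInformationWorkerFactory(
    hWorkerFactory,
    WorkerFactoryThreadMinimum,  // Information class
    &newMinimum,                  // Trigger new thread creation
    sizeof(ULONG)
);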
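The comment on newMinimum above notes that raising the minimum triggers thread creation. A toy model of that relationship is sketched below. ModelWorkerFactory and SetThreadMinimum are hypothetical names, and two behaviors are assumptions of the model rather than documented kernel behavior: the minimum is clamped to the maximum, and the missing workers are created immediately.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of the thread-count bookkeeping in a worker factory.
struct ModelWorkerFactory {
    unsigned minThreads;
    unsigned maxThreads;
    unsigned workerCount;
};

// Sets a new minimum and creates workers until the count reaches it.
// Assumption: the minimum is clamped to maxThreads, and creation is
// immediate; the real kernel may create threads asynchronously.
// Returns the number of workers created by this call.
unsigned SetThreadMinimum(ModelWorkerFactory& factory, unsigned newMinimum) {
    factory.minThreads = std::min(newMinimum, factory.maxThreads);
    unsigned created = 0;
    while (factory.workerCount < factory.minThreads) {
        ++factory.workerCount;  // each new worker would start at StartRoutine
        ++created;
    }
    return created;
}
```

In this model, a minimum at or below the current worker count creates nothing, and each step above it creates exactly one worker.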

4. How the Kernel Dispatches Callbacks

The flow from kernel to user-mode callback involves several layers:

Kernel-to-User Callback Dispatch

Source Event (timer, I/O, signal) -> Kernel IOCP (IoSetIoCompletion) -> Worker Wakes (NtRemoveIoCompletion) -> User Callback (WorkCallback())

  1. Event occurs: a timer fires, I/O completes, or a wait object is signaled
  2. Kernel posts to IOCP: the kernel calls IoSetIoCompletion (internal) to post a completion packet to the pool’s IOCP
  3. Worker thread wakes: NtRemoveIoCompletion returns with the completion packet data
  4. Dispatch: TppWorkerThread decodes the packet type and calls the appropriate Tpp*ExecuteCallback function
  5. Callback runs: the user’s callback function executes in the context of the worker thread

5. Locating the Target’s TP_POOL

PoolParty needs to find the target process’s TP_POOL to extract handle values (IOCP, Worker Factory). The approach used by PoolParty involves:

5.1 Handle Enumeration

C++
// Enumerate handles in the target process using NtQueryInformationProcess
// or NtQuerySystemInformation(SystemHandleInformation)
typedef struct _SYSTEM_HANDLE_TABLE_ENTRY_INFO {
    USHORT UniqueProcessId;
    USHORT CreatorBackTraceIndex;
    UCHAR  ObjectTypeIndex;
    UCHAR  HandleAttributes;
    USHORT HandleValue;
    PVOID  Object;
    ULONG  GrantedAccess;
} SYSTEM_HANDLE_TABLE_ENTRY_INFO;

// Filter for IoCompletion (IOCP) and TpWorkerFactory object types
// Duplicate the handles into our process for inspection
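HANDLE dupHandle = NULL;
// Widen the 16-bit handle value before casting it to HANDLE
if (!DuplicateHandle(hTargetProcess, (HANDLE)(ULONG_PTR)entry.HandleValue,
                     GetCurrentProcess(), &dupHandle,
                     0, FALSE, DUPLICATE_SAME_ACCESS)) {
    // Duplication failed (for example, access denied); skip this entry
}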
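The filtering step named in the comments above comes down to matching two fields per entry. The portable sketch below mirrors the entry layout with fixed-width types so that it compiles without Windows headers. HandleEntry and FilterHandles are hypothetical names, and the object type index is passed in by the caller because its value for a given object type is build-dependent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable mirror of the SYSTEM_HANDLE_TABLE_ENTRY_INFO layout shown above.
struct HandleEntry {
    std::uint16_t uniqueProcessId;
    std::uint16_t creatorBackTraceIndex;
    std::uint8_t  objectTypeIndex;
    std::uint8_t  handleAttributes;
    std::uint16_t handleValue;
    void*         object;
    std::uint32_t grantedAccess;
};

// Returns the handle values owned by `pid` whose object type index matches.
// The type index is an input because it varies between Windows builds.
std::vector<std::uint16_t> FilterHandles(const std::vector<HandleEntry>& entries,
                                         std::uint16_t pid,
                                         std::uint8_t typeIndex) {
    std::vector<std::uint16_t> matches;
    for (const HandleEntry& entry : entries) {
        if (entry.uniqueProcessId == pid && entry.objectTypeIndex == typeIndex) {
            matches.push_back(entry.handleValue);
        }
    }
    return matches;
}
```

The 16-bit process ID in this legacy layout cannot represent larger process IDs, so a match on uniqueProcessId alone should be treated with care.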

5.2 Identifying the Thread Pool IOCP

Not every IOCP in a process belongs to the thread pool. PoolParty identifies the correct one by querying worker factory objects and checking which IOCP they reference:

C++
// For each worker factory handle found:
WORKER_FACTORY_BASIC_INFORMATION wfbi;
NtQueryInformationWorkerFactory(dupWfHandle,
    WorkerFactoryBasicInformation,
    &wfbi, sizeof(wfbi), NULL);

// wfbi.StartRoutine should be ntdll!TppWorkerThread
// wfbi.CompletionPort is the thread pool IOCP
// wfbi.StartParameter is the TP_POOL pointer in the target

6. TP_POOL Deep Dive

The TP_POOL structure (reconstructed from reverse engineering) contains fields that each PoolParty variant targets:

Field | Offset (x64) | Used By Variant | Purpose
TaskQueue[High] | +0x... | 2 | High-priority work item linked list
TaskQueue[Normal] | +0x... | 2 | Normal-priority work item linked list
TaskQueue[Low] | +0x... | 2 | Low-priority work item linked list
CompletionPort | +0x... | 4, 5, 7 | IOCP handle value
TimerQueue | +0x... | 8 | Timer item ordered list
WaitObjectList | +0x... | 3 | Wait item list

Offsets Are Version-Dependent

The exact field offsets within TP_POOL and other internal structures vary between Windows versions. PoolParty uses pattern matching and heuristics rather than hardcoded offsets to maintain compatibility across Windows 10 and 11 builds.

7. The TP_DIRECT Fast Path

In addition to the standard callback types, there is a special TP_DIRECT structure that provides a fast path for execution. Unlike TP_WORK or TP_TIMER, a TP_DIRECT item is dispatched directly from the IOCP completion packet without going through the standard task queue:

C++ (Reconstructed)
struct TP_DIRECT {
    TP_TASK Task;      // Callback and context
    // Minimal structure - no pool linkage needed
};

// TP_DIRECT dispatch in TppWorkerThread:
// If the completion packet has a specific signature,
// treat the overlapped pointer as a TP_DIRECT and call
// its callback immediately.

Variant 7 exploits this fast path. We will cover it in detail in Module 6.
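The difference between the fast path and the queued path can be seen in a small portable model. This is a sketch under stated assumptions, not the real dispatch logic: ModelDirect, FastPathPacket, FastPathPool, DrainPort, DirectCallback, and QueuedCallback are hypothetical names. A packet that points at a ModelDirect is called immediately, while any other packet only tells the worker to look in the task queue.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

using FastPathLog = std::vector<std::string>;
using FastPathCallback = void (*)(FastPathLog& log);

// Hypothetical stand-in for TP_DIRECT: just a callback.
struct ModelDirect {
    FastPathCallback callback;
};

// A packet either points at a ModelDirect (fast path) or is a plain
// wake-up telling the worker to check the task queue.
struct FastPathPacket {
    ModelDirect* direct;  // nullptr means "check the task queue"
};

// Hypothetical pool: a queue of pending callbacks plus a packet FIFO.
struct FastPathPool {
    std::deque<FastPathCallback> taskQueue;
    std::deque<FastPathPacket> port;
};

// Processes every packet on the port in arrival order.
void DrainPort(FastPathPool& pool, FastPathLog& log) {
    while (!pool.port.empty()) {
        FastPathPacket packet = pool.port.front();
        pool.port.pop_front();
        if (packet.direct != nullptr) {
            // Fast path: the packet itself identifies the callback.
            packet.direct->callback(log);
        } else if (!pool.taskQueue.empty()) {
            // Queued path: the callback comes from the task queue.
            FastPathCallback queued = pool.taskQueue.front();
            pool.taskQueue.pop_front();
            queued(log);
        }
    }
}

// Example callbacks that record which path ran them.
void DirectCallback(FastPathLog& log) { log.push_back("direct"); }
void QueuedCallback(FastPathLog& log) { log.push_back("queued"); }
```

In this model a direct packet runs its callback even when the task queue is empty, which is the property that makes the path "fast": nothing has to be linked into the pool first.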

Knowledge Check

Q1: What function do all worker threads begin executing at?

A) NtCreateThreadEx
B) TppWorkerThread
C) RtlUserThreadStart
D) CreateThread callback

Q2: Why does the worker thread trust completion packets from the IOCP?

A) Each packet is digitally signed by the kernel
B) The IOCP validates packet origins
C) Worker threads verify callback pointers before calling them
D) There is no validation - packets are trusted implicitly

Q3: How does PoolParty identify which IOCP handle belongs to the thread pool?

A) By querying worker factory objects and checking which IOCP they reference
B) The thread pool IOCP is always handle value 4
C) By reading the PEB directly
D) The IOCP has a special name attribute