Module 2: PE Loader Fundamentals
Everything a PE loader must do to transform a flat file into a running image in memory — the foundation of Donut’s inmem_pe.c.
Module Objective
Understand the five core steps every PE loader must perform, in the order this module covers them: mapping sections, applying base relocations, resolving imports, setting section permissions, and running TLS callbacks before calling the entry point. These are the steps Donut’s PIC loader implements in inmem_pe.c for native EXE/DLL payloads.
1. PE Header Anatomy
Every PE file begins with a DOS header (IMAGE_DOS_HEADER) whose e_lfanew field points to the NT headers. The NT headers contain the actual PE metadata:
// Navigating from a raw PE buffer to the critical headers
PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)pe_buffer;
PIMAGE_NT_HEADERS nt = (PIMAGE_NT_HEADERS)((BYTE*)dos + dos->e_lfanew);
// Optional header contains the key fields for loading
// ImageBase is a DWORD in PE32 but a ULONGLONG in PE32+, so use a pointer-sized type
ULONG_PTR image_base = nt->OptionalHeader.ImageBase;
DWORD size_of_image = nt->OptionalHeader.SizeOfImage;
DWORD entry_point = nt->OptionalHeader.AddressOfEntryPoint;
WORD num_sections = nt->FileHeader.NumberOfSections;
DWORD section_align = nt->OptionalHeader.SectionAlignment;
DWORD file_align = nt->OptionalHeader.FileAlignment;
| Field | Purpose | Typical Value |
|---|---|---|
| ImageBase | Preferred virtual address for the image | 0x00400000 (EXE) / 0x10000000 (DLL) for 32-bit images |
| SizeOfImage | Total virtual size when fully mapped | Varies by binary |
| AddressOfEntryPoint | RVA of the entry point function | Offset from image base |
| SectionAlignment | Alignment of sections in memory | 0x1000 (4 KB page) |
| FileAlignment | Alignment of sections in the file on disk | 0x200 (512 bytes) |
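Before trusting any of these fields, a loader usually confirms that the buffer really is a PE image. The following is a minimal, portable sketch of that check. It reads the well-known offsets directly from a byte buffer instead of using the Windows structures, and the helper name `pe_check_headers` is hypothetical, not something taken from Donut.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values byte by byte to avoid alignment issues. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 32 for PE32, 64 for PE32+, 0 if invalid. */
int pe_check_headers(const uint8_t *buf, size_t len)
{
    if (len < 0x40 || rd16(buf) != 0x5A4D)        /* e_magic == "MZ" */
        return 0;
    uint32_t lfanew = rd32(buf + 0x3C);           /* e_lfanew */
    /* Need the signature (4), file header (20) and optional header magic (2). */
    if (lfanew > len || len - lfanew < 4 + 20 + 2)
        return 0;
    if (rd32(buf + lfanew) != 0x00004550)         /* "PE\0\0" */
        return 0;
    uint16_t magic = rd16(buf + lfanew + 4 + 20); /* OptionalHeader.Magic */
    if (magic == 0x10B) return 32;                /* PE32  */
    if (magic == 0x20B) return 64;                /* PE32+ */
    return 0;
}
```

The return value tells the caller whether ImageBase is 4 or 8 bytes wide, which matters for every later step that touches absolute addresses.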
2. Step 1: Allocate and Map Sections
The first job of a PE loader is to allocate memory for the entire image and copy each section to its correct virtual offset. Sections in a PE file have different offsets on disk (file alignment) versus in memory (section alignment).
// Allocate memory for the entire image
LPVOID base = VirtualAlloc(
    NULL,
    nt->OptionalHeader.SizeOfImage,
    MEM_COMMIT | MEM_RESERVE,
    PAGE_READWRITE // Start as RW, fix permissions later
);
// Copy PE headers (everything before the first section)
memcpy(base, pe_buffer, nt->OptionalHeader.SizeOfHeaders);
// Map each section to its virtual address
PIMAGE_SECTION_HEADER sec = IMAGE_FIRST_SECTION(nt);
for (WORD i = 0; i < nt->FileHeader.NumberOfSections; i++) {
    if (sec[i].SizeOfRawData > 0) {
        memcpy(
            (BYTE*)base + sec[i].VirtualAddress, // Destination: VA offset
            pe_buffer + sec[i].PointerToRawData, // Source: file offset
            sec[i].SizeOfRawData                 // Size on disk
        );
    }
}
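The copy loop above trusts every size and offset in the section table. As a hedged illustration of what a more defensive version might look like, the sketch below copies one section with bounds checks against both the file buffer and the allocated image. The `section_t` struct is a trimmed-down, hypothetical stand-in for IMAGE_SECTION_HEADER, and `copy_section` is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down view of IMAGE_SECTION_HEADER. */
typedef struct {
    uint32_t virtual_address; /* VirtualAddress (RVA in memory)    */
    uint32_t virtual_size;    /* Misc.VirtualSize                  */
    uint32_t raw_offset;      /* PointerToRawData (offset in file) */
    uint32_t raw_size;        /* SizeOfRawData                     */
} section_t;

/* Copies one section into the image. Returns bytes copied, or -1 if the
 * section would read past the file or write past the image. */
long copy_section(uint8_t *image, size_t image_size,
                  const uint8_t *file, size_t file_size, const section_t *s)
{
    /* Raw data is padded to FileAlignment; copy no more than the section
     * actually occupies in memory. */
    uint32_t n = s->raw_size;
    if (s->virtual_size != 0 && s->virtual_size < n)
        n = s->virtual_size;
    if (s->raw_offset > file_size || n > file_size - s->raw_offset)
        return -1;
    if (s->virtual_address > image_size || n > image_size - s->virtual_address)
        return -1;
    memcpy(image + s->virtual_address, file + s->raw_offset, n);
    return (long)n;
}
```

Bytes between the end of the copied data and the end of the section stay zero, provided the image buffer was zero-initialized (as memory from VirtualAlloc is).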
Virtual vs. File Alignment
A section might start at file offset 0x400 but virtual address 0x1000. The loader must use VirtualAddress for the destination (memory layout) and PointerToRawData for the source (file layout). Mixing the two up is a common source of PE loader bugs, because the image then contains valid bytes at the wrong addresses.
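To make the two address spaces concrete, here is a small sketch that rounds sizes up to an alignment and translates an RVA back to a file offset. The `section_t` struct and both function names are hypothetical and only mirror the relevant IMAGE_SECTION_HEADER fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of IMAGE_SECTION_HEADER. */
typedef struct {
    uint32_t virtual_address; /* VirtualAddress (RVA in memory)    */
    uint32_t virtual_size;    /* Misc.VirtualSize                  */
    uint32_t raw_offset;      /* PointerToRawData (offset in file) */
    uint32_t raw_size;        /* SizeOfRawData                     */
} section_t;

/* Round value up to a power-of-two alignment. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Translate an RVA to a file offset.
 * Returns UINT32_MAX when the RVA is not backed by file data. */
uint32_t rva_to_file_offset(const section_t *secs, size_t n, uint32_t rva)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t start = secs[i].virtual_address;
        if (rva >= start && rva - start < secs[i].raw_size)
            return secs[i].raw_offset + (rva - start);
    }
    return UINT32_MAX;
}
```

With a section at file offset 0x400 and virtual address 0x1000, RVA 0x1010 maps to file offset 0x410. A section holding 0x1F0 bytes occupies 0x200 bytes on disk (FileAlignment 0x200) and 0x1000 bytes in memory (SectionAlignment 0x1000).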
3. Step 2: Apply Base Relocations
If the image cannot be loaded at its preferred ImageBase (which is almost always the case during injection), all absolute addresses in the code must be adjusted. The relocation table (IMAGE_DIRECTORY_ENTRY_BASERELOC) lists every location that needs patching.
// Calculate the delta between actual and preferred base
ULONG_PTR delta = (ULONG_PTR)base - nt->OptionalHeader.ImageBase;
PIMAGE_DATA_DIRECTORY reloc_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];
// Nothing to do at the preferred base or when the image has no relocation table
if (delta != 0 && reloc_dir->Size != 0) {
    PIMAGE_BASE_RELOCATION reloc =
        (PIMAGE_BASE_RELOCATION)((BYTE*)base + reloc_dir->VirtualAddress);
    // The directory size bounds the table; do not rely on a zero terminator
    BYTE *reloc_end = (BYTE*)reloc + reloc_dir->Size;
    while ((BYTE*)reloc < reloc_end && reloc->SizeOfBlock) {
        DWORD count = (reloc->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / 2;
        WORD *entry = (WORD*)(reloc + 1);
        for (DWORD i = 0; i < count; i++) {
            WORD type = entry[i] >> 12;      // High 4 bits: relocation type
            WORD offset = entry[i] & 0x0FFF; // Low 12 bits: offset within the page
            if (type == IMAGE_REL_BASED_DIR64) {
                // 64-bit relocation: add delta to the 8-byte value
                *(ULONG_PTR*)((BYTE*)base + reloc->VirtualAddress + offset) += delta;
            } else if (type == IMAGE_REL_BASED_HIGHLOW) {
                // 32-bit relocation: add delta to the 4-byte value
                *(DWORD*)((BYTE*)base + reloc->VirtualAddress + offset) += (DWORD)delta;
            }
            // IMAGE_REL_BASED_ABSOLUTE (type 0) = padding, skip
        }
        reloc = (PIMAGE_BASE_RELOCATION)((BYTE*)reloc + reloc->SizeOfBlock);
    }
}
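Relocation Process
Preferred ImageBase 0x00400000, actual base 0x01A80000, delta 0x01680000 (0x01A80000 - 0x00400000). Each address listed in the relocation table is patched with += delta.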
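The same arithmetic can be exercised outside Windows. The sketch below applies the entries of a single relocation block to a plain byte buffer. The function name and the shortened constant names are hypothetical; the numeric type values (0, 3, 10) are the standard IMAGE_REL_BASED_* values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_ABSOLUTE 0  /* IMAGE_REL_BASED_ABSOLUTE: padding, ignored */
#define REL_HIGHLOW  3  /* IMAGE_REL_BASED_HIGHLOW: 32-bit address    */
#define REL_DIR64    10 /* IMAGE_REL_BASED_DIR64: 64-bit address      */

/* Hypothetical helper: applies one block's entries to a mapped image.
 * Returns the number of locations that were patched. */
int apply_reloc_entries(uint8_t *image, uint32_t page_rva,
                        const uint16_t *entries, size_t count, uint64_t delta)
{
    int patched = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned type = entries[i] >> 12;    /* high 4 bits  */
        uint32_t off = entries[i] & 0x0FFF;  /* low 12 bits  */
        uint8_t *target = image + page_rva + off;
        if (type == REL_DIR64) {
            uint64_t v;
            memcpy(&v, target, sizeof v);    /* memcpy avoids unaligned access */
            v += delta;
            memcpy(target, &v, sizeof v);
            patched++;
        } else if (type == REL_HIGHLOW) {
            uint32_t v;
            memcpy(&v, target, sizeof v);
            v += (uint32_t)delta;
            memcpy(target, &v, sizeof v);
            patched++;
        }
        /* REL_ABSOLUTE and unknown types are skipped in this sketch. */
    }
    return patched;
}
```

Using the numbers above: a stored pointer 0x00401234 at RVA 0x1010 is described by the entry 0x3010 (type 3, offset 0x010) in the block for page 0x1000, and becomes 0x01A81234 after patching.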
4. Step 3: Resolve Imports
The import table lists every external function the PE calls. The loader must walk the Import Directory, load each required DLL, and write the resolved function addresses into the Import Address Table (IAT).
PIMAGE_DATA_DIRECTORY imp_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
PIMAGE_IMPORT_DESCRIPTOR imp =
    (PIMAGE_IMPORT_DESCRIPTOR)((BYTE*)base + imp_dir->VirtualAddress);
// Skip the walk entirely when the image has no import directory
while (imp_dir->Size && imp->Name) {
    // Load the required DLL
    char *dll_name = (char*)((BYTE*)base + imp->Name);
    HMODULE hDll = LoadLibraryA(dll_name);
    if (hDll == NULL) break; // Missing dependency: the caller should treat the load as failed
    // Walk the thunk arrays; some linkers leave OriginalFirstThunk at 0,
    // in which case FirstThunk holds the lookup entries as well
    DWORD lookup_rva = imp->OriginalFirstThunk ? imp->OriginalFirstThunk : imp->FirstThunk;
    PIMAGE_THUNK_DATA orig = (PIMAGE_THUNK_DATA)((BYTE*)base + lookup_rva);
    PIMAGE_THUNK_DATA first = (PIMAGE_THUNK_DATA)((BYTE*)base + imp->FirstThunk);
    while (orig->u1.AddressOfData) {
        if (IMAGE_SNAP_BY_ORDINAL(orig->u1.Ordinal)) {
            // Import by ordinal
            first->u1.Function = (ULONG_PTR)GetProcAddress(
                hDll, MAKEINTRESOURCEA(IMAGE_ORDINAL(orig->u1.Ordinal)));
        } else {
            // Import by name
            PIMAGE_IMPORT_BY_NAME name =
                (PIMAGE_IMPORT_BY_NAME)((BYTE*)base + orig->u1.AddressOfData);
            first->u1.Function = (ULONG_PTR)GetProcAddress(hDll, name->Name);
        }
        orig++;
        first++;
    }
    imp++;
}
Import by Name vs. Ordinal
Most imports are by name (a string like "CreateFileW"). Some imports use ordinals (numeric IDs). The high bit of the thunk value distinguishes the two cases: if set, it is an ordinal import; otherwise, it points to an IMAGE_IMPORT_BY_NAME structure containing the function name string.
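The high-bit test can be shown in isolation. This sketch decodes a 64-bit (PE32+) thunk value into either an ordinal or the RVA of the name structure. The struct and function names are hypothetical; the flag value matches IMAGE_ORDINAL_FLAG64.

```c
#include <stdint.h>

#define ORDINAL_FLAG64 0x8000000000000000ULL /* IMAGE_ORDINAL_FLAG64 */

/* Hypothetical result type for a decoded import lookup entry. */
typedef struct {
    int      by_ordinal; /* 1 if the import is by ordinal             */
    uint16_t ordinal;    /* valid when by_ordinal is 1                */
    uint32_t name_rva;   /* RVA of IMAGE_IMPORT_BY_NAME when it is 0  */
} thunk_info_t;

/* Decode one 64-bit thunk value from the import lookup table. */
thunk_info_t decode_thunk64(uint64_t thunk)
{
    thunk_info_t info = {0, 0, 0};
    if (thunk & ORDINAL_FLAG64) {
        info.by_ordinal = 1;
        info.ordinal = (uint16_t)(thunk & 0xFFFF);        /* low 16 bits */
    } else {
        info.name_rva = (uint32_t)(thunk & 0x7FFFFFFF);   /* low 31 bits */
    }
    return info;
}
```

For a 32-bit image the same logic applies with a 4-byte thunk and the flag 0x80000000.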
5. Step 4: Set Section Permissions
After loading, each section must have its memory protection set according to its characteristics flags. The .text section needs PAGE_EXECUTE_READ, .rdata needs PAGE_READONLY, and .data needs PAGE_READWRITE.
for (WORD i = 0; i < nt->FileHeader.NumberOfSections; i++) {
    DWORD protect = PAGE_READONLY;
    DWORD chars = sec[i].Characteristics;
    BOOL is_exec = (chars & IMAGE_SCN_MEM_EXECUTE) != 0;
    BOOL is_write = (chars & IMAGE_SCN_MEM_WRITE) != 0;
    BOOL is_read = (chars & IMAGE_SCN_MEM_READ) != 0;
    if (is_exec && is_write) protect = PAGE_EXECUTE_READWRITE;
    else if (is_exec && is_read) protect = PAGE_EXECUTE_READ;
    else if (is_exec) protect = PAGE_EXECUTE;
    else if (is_write) protect = PAGE_READWRITE;
    else if (is_read) protect = PAGE_READONLY;
    DWORD old;
    VirtualProtect(
        (BYTE*)base + sec[i].VirtualAddress,
        sec[i].Misc.VirtualSize,
        protect, &old
    );
}
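The if/else chain can also be written as a lookup table indexed by the three permission bits, which makes every combination visible at a glance. This is an alternative sketch, not the method the loop above or Donut uses. The constants are redefined locally with shortened, hypothetical names so the block compiles without windows.h; their values match the Windows SDK.

```c
#include <stdint.h>

/* Section characteristics flags (values of IMAGE_SCN_MEM_*). */
#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Memory protection constants (values of PAGE_*). */
#define PG_NOACCESS          0x01u
#define PG_READONLY          0x02u
#define PG_READWRITE         0x04u
#define PG_EXECUTE           0x10u
#define PG_EXECUTE_READ      0x20u
#define PG_EXECUTE_READWRITE 0x40u

/* Map section characteristics to a page protection value. */
uint32_t section_protection(uint32_t characteristics)
{
    /* Index bits: 4 = execute, 2 = write, 1 = read. */
    static const uint32_t table[8] = {
        PG_NOACCESS,          /* ---                                   */
        PG_READONLY,          /* --R                                   */
        PG_READWRITE,         /* -W- (there is no write-only page)     */
        PG_READWRITE,         /* -WR                                   */
        PG_EXECUTE,           /* X--                                   */
        PG_EXECUTE_READ,      /* X-R                                   */
        PG_EXECUTE_READWRITE, /* XW-                                   */
        PG_EXECUTE_READWRITE  /* XWR                                   */
    };
    unsigned idx = ((characteristics & SCN_MEM_READ)    ? 1u : 0u) |
                   ((characteristics & SCN_MEM_WRITE)   ? 2u : 0u) |
                   ((characteristics & SCN_MEM_EXECUTE) ? 4u : 0u);
    return table[idx];
}
```

One deliberate difference from the loop above: a section with none of the three flags maps to PAGE_NOACCESS here rather than defaulting to PAGE_READONLY. Either choice is defensible for a teaching loader; the table just makes the decision explicit.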
6. Step 5: TLS Callbacks and Entry Point
Before calling the main entry point, the loader must execute any Thread Local Storage (TLS) callbacks registered in the TLS directory. Then it calls the entry point, which differs based on the PE type:
| PE Type | Entry Point Signature | How Called |
|---|---|---|
| EXE | CRT startup stub (e.g. mainCRTStartup / WinMainCRTStartup), which in turn calls main() / WinMain() | Direct call, no arguments needed for basic execution |
| DLL | BOOL DllMain(HINSTANCE, DWORD, LPVOID) | Called with DLL_PROCESS_ATTACH reason |
// Execute TLS callbacks if present
PIMAGE_DATA_DIRECTORY tls_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS];
if (tls_dir->Size) {
    PIMAGE_TLS_DIRECTORY tls =
        (PIMAGE_TLS_DIRECTORY)((BYTE*)base + tls_dir->VirtualAddress);
    // AddressOfCallBacks is a VA, not an RVA; the relocation pass has already adjusted it
    PIMAGE_TLS_CALLBACK *callback =
        (PIMAGE_TLS_CALLBACK*)tls->AddressOfCallBacks;
    while (callback && *callback) {
        (*callback)((PVOID)base, DLL_PROCESS_ATTACH, NULL);
        callback++;
    }
}
// Call the entry point; the DLL flag in the file header selects the signature
BOOL is_dll = (nt->FileHeader.Characteristics & IMAGE_FILE_DLL) != 0;
DWORD_PTR ep = (DWORD_PTR)base + nt->OptionalHeader.AddressOfEntryPoint;
if (is_dll) {
    typedef BOOL (WINAPI *DllMainFunc)(HINSTANCE, DWORD, LPVOID);
    ((DllMainFunc)ep)((HINSTANCE)base, DLL_PROCESS_ATTACH, NULL);
} else {
    typedef int (*ExeMainFunc)(void);
    ((ExeMainFunc)ep)();
}
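The callback walk itself is just iteration over a NULL-terminated array of function pointers. The portable sketch below shows that pattern together with the DLL-flag test. All names are hypothetical, and the callback type omits the NTAPI calling convention that real TLS callbacks use on Windows, so treat it as an illustration of the control flow only.

```c
#include <stddef.h>
#include <stdint.h>

#define FILE_DLL              0x2000u /* value of IMAGE_FILE_DLL     */
#define REASON_PROCESS_ATTACH 1u      /* value of DLL_PROCESS_ATTACH */

/* Simplified stand-in for PIMAGE_TLS_CALLBACK (no calling convention). */
typedef void (*tls_callback_t)(void *image, uint32_t reason, void *reserved);

/* EXE or DLL? Decided by FileHeader.Characteristics. */
int is_dll_image(uint16_t file_characteristics)
{
    return (file_characteristics & FILE_DLL) != 0;
}

/* Invoke every callback in a NULL-terminated array; returns how many ran. */
size_t run_tls_callbacks(const tls_callback_t *callbacks, void *image,
                         uint32_t reason)
{
    size_t n = 0;
    if (callbacks == NULL)
        return 0;
    for (; callbacks[n] != NULL; n++)
        callbacks[n](image, reason, NULL);
    return n;
}

/* Demo callback for illustration only: it treats `image` as a pointer to
 * an int counter, which a real callback would never do. */
void demo_callback(void *image, uint32_t reason, void *reserved)
{
    (void)reserved;
    if (reason == REASON_PROCESS_ATTACH)
        ++*(int *)image;
}
```

Calling `run_tls_callbacks` with an array holding `demo_callback` twice followed by NULL runs both entries in order, mirroring how the loader drains the TLS callback list before it transfers control to the entry point.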
7. How Donut Differs from a Standard Loader
Donut’s PE loader in inmem_pe.c follows the same five steps, but with critical differences that make it work as PIC shellcode:
PIC-Specific Adaptations
- No LoadLibraryA at first: the loader must resolve LoadLibraryA itself via PEB walking before it can use it to load import DLLs
- Hash-based API resolution: instead of calling GetProcAddress by string, Donut uses API name hashes to find function addresses by walking DLL export tables directly
- No global variables: all state is passed through the DONUT_INSTANCE structure pointer, making the code position-independent
- Decryption first: the PE payload is Chaskey-encrypted inside DONUT_MODULE; the loader must decrypt and decompress before mapping
- Optional export calling: for DLLs, Donut can call a specific exported function (not just DllMain) with user-supplied arguments
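To illustrate the idea behind hash-based resolution, the sketch below matches a precomputed hash against a list of export names. The hash is a simple rotate-and-add function chosen as a stand-in; it is not Donut's actual hash algorithm. The `export_t` array is a hypothetical substitute for walking a real DLL export directory.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash (rotate right by 13, then add each byte).
 * This is NOT Donut's hash function; it only illustrates the technique. */
uint32_t api_hash(const char *name)
{
    uint32_t h = 0;
    while (*name) {
        h = (h >> 13) | (h << 19);
        h += (uint8_t)*name++;
    }
    return h;
}

/* Hypothetical stand-in for one entry of a DLL's export name table. */
typedef struct {
    const char *name;
    void       *address;
} export_t;

/* Return the address whose name hashes to `wanted`, or NULL if none does. */
void *find_export_by_hash(const export_t *exports, size_t n, uint32_t wanted)
{
    for (size_t i = 0; i < n; i++)
        if (api_hash(exports[i].name) == wanted)
            return exports[i].address;
    return NULL;
}
```

The point of the design is that the shellcode only has to carry a 32-bit constant per API instead of the name string, and it never needs GetProcAddress to already be available.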
8. The Complete Loading Flow
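PE Loading Pipeline (Donut inmem_pe.c): decrypt the payload (Chaskey CTR), decompress it (aPLib / LZNT1), then map sections, apply base relocations, resolve imports, set section permissions, run TLS callbacks, and call the entry point.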
Knowledge Check
1. Why must base relocations be applied when loading a PE at a non-preferred address?
2. What is the correct order of PE loading operations?
3. How does Donut’s PIC loader resolve API functions without calling GetProcAddress?