Difficulty: Beginner

Module 2: PE Loader Fundamentals

Everything a PE loader must do to transform a flat file into a running image in memory — the foundation of Donut’s inmem_pe.c.

Module Objective

Understand the five core tasks every PE loader must perform: mapping sections, applying base relocations, resolving imports, setting section permissions, and running TLS callbacks before calling the entry point. These are the steps Donut’s PIC loader implements in inmem_pe.c for native EXE/DLL payloads.

1. PE Header Anatomy

Every PE file begins with a DOS header (IMAGE_DOS_HEADER) whose e_lfanew field points to the NT headers. The NT headers contain the actual PE metadata:

// Navigating from a raw PE buffer to the critical headers
PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)pe_buffer;
PIMAGE_NT_HEADERS nt  = (PIMAGE_NT_HEADERS)((BYTE*)dos + dos->e_lfanew);

// Optional header contains the key fields for loading
// ImageBase is a ULONGLONG in PE32+ (64-bit) images, so a DWORD would truncate it
ULONG_PTR image_base = nt->OptionalHeader.ImageBase;
DWORD size_of_image  = nt->OptionalHeader.SizeOfImage;
DWORD entry_point    = nt->OptionalHeader.AddressOfEntryPoint;
WORD  num_sections   = nt->FileHeader.NumberOfSections;
DWORD section_align  = nt->OptionalHeader.SectionAlignment;
DWORD file_align     = nt->OptionalHeader.FileAlignment;

| Field | Purpose | Typical Value |
| --- | --- | --- |
| ImageBase | Preferred virtual address for the image | 0x00400000 (EXE) / 0x10000000 (DLL) |
| SizeOfImage | Total virtual size when fully mapped | Varies by binary |
| AddressOfEntryPoint | RVA of the entry point function | Offset from image base |
| SectionAlignment | Alignment of sections in memory | 0x1000 (4 KB page) |
| FileAlignment | Alignment of sections in the file on disk | 0x200 (512 bytes) |
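Before trusting any of these fields, a loader normally checks that the buffer really is a PE file. The following is a minimal, portable sketch of that check. It reads the fields by their fixed offsets instead of using the Windows structures, and the helper names (pe_find_nt_headers, pe_section_count) are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Read little-endian values byte by byte, so alignment does not matter. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the file offset of the NT headers (the value of e_lfanew),
   or -1 if the buffer does not look like a PE file.
   Hypothetical helper, not part of any real API. */
static long pe_find_nt_headers(const uint8_t *buf, size_t len)
{
    if (buf == NULL || len < 0x40) return -1;     /* IMAGE_DOS_HEADER is 64 bytes */
    if (rd16(buf) != 0x5A4D) return -1;           /* "MZ" */

    uint32_t e_lfanew = rd32(buf + 0x3C);         /* e_lfanew lives at offset 0x3C */

    /* The 4-byte signature plus the 20-byte IMAGE_FILE_HEADER must fit. */
    if (e_lfanew > len || len - e_lfanew < 24) return -1;
    if (rd32(buf + e_lfanew) != 0x00004550) return -1;   /* "PE\0\0" */

    return (long)e_lfanew;
}

/* NumberOfSections is the second WORD of IMAGE_FILE_HEADER,
   which starts right after the 4-byte signature. */
static int pe_section_count(const uint8_t *buf, size_t len)
{
    long nt = pe_find_nt_headers(buf, len);
    return nt < 0 ? -1 : (int)rd16(buf + nt + 4 + 2);
}
```

A production loader would go on to validate the optional header magic and the section table bounds as well; this sketch stops at the two signatures.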

2. Step 1: Allocate and Map Sections

The first job of a PE loader is to allocate memory for the entire image and copy each section to its correct virtual offset. Sections in a PE file have different offsets on disk (file alignment) versus in memory (section alignment).

// Allocate memory for the entire image
LPVOID base = VirtualAlloc(
    NULL,
    nt->OptionalHeader.SizeOfImage,
    MEM_COMMIT | MEM_RESERVE,
    PAGE_READWRITE  // Start as RW, fix permissions later
);

// Copy PE headers (everything before the first section)
memcpy(base, pe_buffer, nt->OptionalHeader.SizeOfHeaders);

// Map each section to its virtual address
PIMAGE_SECTION_HEADER sec = IMAGE_FIRST_SECTION(nt);
for (WORD i = 0; i < nt->FileHeader.NumberOfSections; i++) {
    if (sec[i].SizeOfRawData > 0) {
        memcpy(
            (BYTE*)base + sec[i].VirtualAddress,   // Destination: VA offset
            pe_buffer   + sec[i].PointerToRawData,  // Source: file offset
            sec[i].SizeOfRawData                     // Size on disk
        );
    }
}

Virtual vs. File Alignment

A section might start at file offset 0x400 but virtual address 0x1000. The loader must use VirtualAddress for the destination (memory layout) and PointerToRawData for the source (file layout). Mixing the two up is a common PE loader bug.
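The same mapping can be run in reverse: given an RVA, find the file offset that backs it. The sketch below assumes a trimmed-down section record (section_t, a hypothetical stand-in for IMAGE_SECTION_HEADER) and a hypothetical helper rva_to_file_offset; it is meant to illustrate the two address spaces, not to replace a full parser.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for IMAGE_SECTION_HEADER. */
typedef struct {
    uint32_t virtual_address;      /* RVA where the section starts in memory */
    uint32_t virtual_size;         /* size of the section in memory */
    uint32_t pointer_to_raw_data;  /* file offset of the section data */
    uint32_t size_of_raw_data;     /* size of the section data on disk */
} section_t;

/* Translate an RVA into a file offset.
   Returns 1 and stores the offset in *out on success.
   Returns 0 when no section has file data for that RVA, for example
   the zero-filled tail of a section whose virtual size exceeds its raw size. */
static int rva_to_file_offset(const section_t *secs, size_t n,
                              uint32_t rva, uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t start = secs[i].virtual_address;
        if (rva >= start && rva - start < secs[i].size_of_raw_data) {
            *out = secs[i].pointer_to_raw_data + (rva - start);
            return 1;
        }
    }
    return 0;
}
```

With a section at virtual address 0x1000 and file offset 0x400, the RVA 0x1010 maps to file offset 0x410: the distance from the start of the section is the same in both layouts, only the starting point differs.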

3. Step 2: Apply Base Relocations

If the image cannot be loaded at its preferred ImageBase (which is almost always the case during injection), all absolute addresses in the code must be adjusted. The relocation table (IMAGE_DIRECTORY_ENTRY_BASERELOC) lists every location that needs patching.

// Calculate the delta between actual and preferred base
ULONG_PTR delta = (ULONG_PTR)base - nt->OptionalHeader.ImageBase;
PIMAGE_DATA_DIRECTORY reloc_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC];

// Nothing to do at the preferred base. An image without a relocation
// table (Size == 0) cannot be patched and must not be walked.
if (delta != 0 && reloc_dir->Size != 0) {
    BYTE *cur = (BYTE*)base + reloc_dir->VirtualAddress;
    BYTE *end = cur + reloc_dir->Size;

    // Bound the walk by the directory size instead of relying on a zero terminator
    while (cur + sizeof(IMAGE_BASE_RELOCATION) <= end) {
        PIMAGE_BASE_RELOCATION reloc = (PIMAGE_BASE_RELOCATION)cur;
        if (reloc->SizeOfBlock < sizeof(IMAGE_BASE_RELOCATION)) break;  // terminator or malformed block

        DWORD count = (reloc->SizeOfBlock - sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);
        WORD *entry = (WORD*)(reloc + 1);

        for (DWORD i = 0; i < count; i++) {
            WORD type   = entry[i] >> 12;      // high 4 bits: relocation type
            WORD offset = entry[i] & 0x0FFF;   // low 12 bits: offset within the 4 KB page
            BYTE *target = (BYTE*)base + reloc->VirtualAddress + offset;

            if (type == IMAGE_REL_BASED_DIR64) {
                // 64-bit relocation: add delta to the 8-byte value
                *(ULONGLONG*)target += delta;
            } else if (type == IMAGE_REL_BASED_HIGHLOW) {
                // 32-bit relocation: add delta to the 4-byte value
                *(DWORD*)target += (DWORD)delta;
            }
            // IMAGE_REL_BASED_ABSOLUTE (type 0) = padding, skip
        }
        cur += reloc->SizeOfBlock;
    }
}

Relocation Process

Preferred Base: 0x00400000 -> Actual Base: 0x01A80000 -> Delta: 0x01680000 -> Patch all entries: += delta
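To make the arithmetic concrete, here is a small portable sketch that applies the entries of one relocation block to an in-memory image. The function name apply_reloc_entries and its parameters are assumptions made for this example; the type values 0, 3 and 10 are the documented values of IMAGE_REL_BASED_ABSOLUTE, IMAGE_REL_BASED_HIGHLOW and IMAGE_REL_BASED_DIR64. It also assumes a little-endian host, which matches the PE format.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define REL_ABSOLUTE 0   /* IMAGE_REL_BASED_ABSOLUTE: padding entry */
#define REL_HIGHLOW  3   /* IMAGE_REL_BASED_HIGHLOW: 32-bit address */
#define REL_DIR64    10  /* IMAGE_REL_BASED_DIR64: 64-bit address */

/* Apply the entries of one relocation block to `image`.
   page_rva is the block's VirtualAddress field.
   Returns the number of patched locations, or -1 if an entry points
   outside the image or uses a type this sketch does not handle. */
static int apply_reloc_entries(uint8_t *image, size_t image_size,
                               uint32_t page_rva,
                               const uint16_t *entries, size_t count,
                               uint64_t delta)
{
    int patched = 0;

    for (size_t i = 0; i < count; i++) {
        unsigned type = entries[i] >> 12;                        /* high 4 bits */
        size_t   at   = (size_t)page_rva + (entries[i] & 0x0FFF); /* low 12 bits */

        if (type == REL_ABSOLUTE) continue;                      /* padding */

        if (type == REL_HIGHLOW) {
            uint32_t v;
            if (at > image_size || image_size - at < sizeof v) return -1;
            memcpy(&v, image + at, sizeof v);   /* memcpy avoids unaligned access */
            v += (uint32_t)delta;
            memcpy(image + at, &v, sizeof v);
        } else if (type == REL_DIR64) {
            uint64_t v;
            if (at > image_size || image_size - at < sizeof v) return -1;
            memcpy(&v, image + at, sizeof v);
            v += delta;
            memcpy(image + at, &v, sizeof v);
        } else {
            return -1;                          /* unsupported type */
        }
        patched++;
    }
    return patched;
}
```

With the numbers from the diagram, a stored pointer 0x00401000 at a HIGHLOW entry becomes 0x00401000 + 0x01680000 = 0x01A81000, which is the same offset 0x1000 relative to the new base 0x01A80000.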

4. Step 3: Resolve Imports

The import table lists every external function the PE calls. The loader must walk the Import Directory, load each required DLL, and write the resolved function addresses into the Import Address Table (IAT).

PIMAGE_DATA_DIRECTORY imp_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];

if (imp_dir->Size != 0) {   // an image without imports has an empty directory
    PIMAGE_IMPORT_DESCRIPTOR imp =
        (PIMAGE_IMPORT_DESCRIPTOR)((BYTE*)base + imp_dir->VirtualAddress);

    for (; imp->Name; imp++) {
        // Load the required DLL
        char *dll_name = (char*)((BYTE*)base + imp->Name);
        HMODULE hDll = LoadLibraryA(dll_name);
        if (hDll == NULL) break;   // missing dependency: the image cannot run

        // Walk the thunk arrays. Some linkers leave OriginalFirstThunk at 0,
        // in which case FirstThunk also holds the lookup entries.
        DWORD lookup_rva = imp->OriginalFirstThunk ? imp->OriginalFirstThunk
                                                   : imp->FirstThunk;
        PIMAGE_THUNK_DATA orig  = (PIMAGE_THUNK_DATA)((BYTE*)base + lookup_rva);
        PIMAGE_THUNK_DATA first = (PIMAGE_THUNK_DATA)((BYTE*)base + imp->FirstThunk);

        for (; orig->u1.AddressOfData; orig++, first++) {
            if (IMAGE_SNAP_BY_ORDINAL(orig->u1.Ordinal)) {
                // Import by ordinal
                first->u1.Function = (ULONG_PTR)GetProcAddress(
                    hDll, MAKEINTRESOURCEA(IMAGE_ORDINAL(orig->u1.Ordinal)));
            } else {
                // Import by name
                PIMAGE_IMPORT_BY_NAME name =
                    (PIMAGE_IMPORT_BY_NAME)((BYTE*)base + orig->u1.AddressOfData);
                first->u1.Function = (ULONG_PTR)GetProcAddress(hDll, (LPCSTR)name->Name);
            }
        }
    }
}

Import by Name vs. Ordinal

Most imports are by name (a string like "CreateFileW"). Some imports use ordinals (numeric IDs). The high bit of the thunk value distinguishes the two cases: if set, it is an ordinal import; otherwise, it points to an IMAGE_IMPORT_BY_NAME structure containing the function name string.
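The high-bit test can be shown in isolation. The sketch below decodes a raw thunk value for both PE32 and PE32+ images; thunk_info_t, decode_thunk32 and decode_thunk64 are hypothetical names for this example, while the flag values match IMAGE_ORDINAL_FLAG32 and IMAGE_ORDINAL_FLAG64.

```c
#include <stdint.h>

#define ORDINAL_FLAG32 0x80000000u            /* IMAGE_ORDINAL_FLAG32 */
#define ORDINAL_FLAG64 0x8000000000000000ULL  /* IMAGE_ORDINAL_FLAG64 */

/* Hypothetical decoded view of one import lookup entry. */
typedef struct {
    int      by_ordinal;  /* 1 = import by ordinal, 0 = import by name */
    uint16_t ordinal;     /* valid when by_ordinal is 1 */
    uint32_t name_rva;    /* RVA of IMAGE_IMPORT_BY_NAME when by_ordinal is 0 */
} thunk_info_t;

/* PE32: the flag is bit 31. */
static thunk_info_t decode_thunk32(uint32_t thunk)
{
    thunk_info_t info = { 0, 0, 0 };
    if (thunk & ORDINAL_FLAG32) {
        info.by_ordinal = 1;
        info.ordinal = (uint16_t)(thunk & 0xFFFFu);   /* ordinal is the low 16 bits */
    } else {
        info.name_rva = thunk & 0x7FFFFFFFu;          /* remaining 31 bits are an RVA */
    }
    return info;
}

/* PE32+: the flag is bit 63; the name RVA still fits in the low 31 bits. */
static thunk_info_t decode_thunk64(uint64_t thunk)
{
    thunk_info_t info = { 0, 0, 0 };
    if (thunk & ORDINAL_FLAG64) {
        info.by_ordinal = 1;
        info.ordinal = (uint16_t)(thunk & 0xFFFFu);
    } else {
        info.name_rva = (uint32_t)(thunk & 0x7FFFFFFFu);
    }
    return info;
}
```

This is the same decision the IMAGE_SNAP_BY_ORDINAL and IMAGE_ORDINAL macros make in the loader code above, written out without the Windows headers.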

5. Step 4: Set Section Permissions

After loading, each section must have its memory protection set according to the Characteristics flags in its section header. Typically, the .text section needs PAGE_EXECUTE_READ, .rdata needs PAGE_READONLY, and .data needs PAGE_READWRITE.

for (WORD i = 0; i < nt->FileHeader.NumberOfSections; i++) {
    DWORD protect = PAGE_READONLY;
    DWORD chars   = sec[i].Characteristics;

    BOOL is_exec  = (chars & IMAGE_SCN_MEM_EXECUTE) != 0;
    BOOL is_write = (chars & IMAGE_SCN_MEM_WRITE)   != 0;
    BOOL is_read  = (chars & IMAGE_SCN_MEM_READ)    != 0;

    if (is_exec && is_write)      protect = PAGE_EXECUTE_READWRITE;
    else if (is_exec && is_read)   protect = PAGE_EXECUTE_READ;
    else if (is_exec)               protect = PAGE_EXECUTE;
    else if (is_write)              protect = PAGE_READWRITE;
    else if (is_read)               protect = PAGE_READONLY;

    // VirtualSize can be 0 in some images; fall back to the size on disk.
    // VirtualProtect fails on a zero-length range, so skip empty sections.
    SIZE_T size = sec[i].Misc.VirtualSize ? sec[i].Misc.VirtualSize
                                          : sec[i].SizeOfRawData;
    if (size == 0) continue;

    DWORD old;
    VirtualProtect(
        (BYTE*)base + sec[i].VirtualAddress,
        size,
        protect, &old
    );
}
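The flag-to-protection mapping can also be written as a lookup table, which makes every combination explicit. This is a sketch under stated assumptions: the constants are redefined locally with the documented Windows values so the code compiles anywhere, the function name section_protection is hypothetical, and a section with none of the three flags maps to PAGE_NOACCESS here, which is a design choice that differs from the PAGE_READONLY default in the loop above.

```c
#include <stdint.h>

/* Section characteristics (values of the IMAGE_SCN_MEM_* flags). */
#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Page protections (values of the PAGE_* constants). */
#define PG_NOACCESS          0x01u
#define PG_READONLY          0x02u
#define PG_READWRITE         0x04u
#define PG_EXECUTE           0x10u
#define PG_EXECUTE_READ      0x20u
#define PG_EXECUTE_READWRITE 0x40u

/* Map section characteristics to a page protection.
   Indexed as table[exec][read][write]. Windows has no write-only
   protection, so a writable section always gets a read-write variant. */
static uint32_t section_protection(uint32_t characteristics)
{
    static const uint32_t table[2][2][2] = {
        { { PG_NOACCESS, PG_READWRITE         },     /* no exec, no read */
          { PG_READONLY, PG_READWRITE         } },   /* no exec, read    */
        { { PG_EXECUTE,      PG_EXECUTE_READWRITE },     /* exec, no read */
          { PG_EXECUTE_READ, PG_EXECUTE_READWRITE } },   /* exec, read    */
    };

    int x = (characteristics & SCN_MEM_EXECUTE) != 0;
    int r = (characteristics & SCN_MEM_READ)    != 0;
    int w = (characteristics & SCN_MEM_WRITE)   != 0;

    return table[x][r][w];
}
```

For example, a typical .text section with characteristics 0x60000020 (code, execute, read) maps to 0x20, which is PAGE_EXECUTE_READ.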

6. Step 5: TLS Callbacks and Entry Point

Before calling the main entry point, the loader must execute any Thread Local Storage (TLS) callbacks registered in the TLS directory. Then it calls the entry point, which differs based on the PE type:

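| PE Type | Entry Point Signature | How Called |
| --- | --- | --- |
| EXE | int main() / WinMain(), normally reached through the CRT startup stub that AddressOfEntryPoint points to | Direct call, no arguments needed for basic execution |
| DLL | BOOL DllMain(HINSTANCE, DWORD, LPVOID) | Called with DLL_PROCESS_ATTACH reason |

// Execute TLS callbacks if present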
PIMAGE_DATA_DIRECTORY tls_dir =
    &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS];
if (tls_dir->Size) {
    PIMAGE_TLS_DIRECTORY tls =
        (PIMAGE_TLS_DIRECTORY)((BYTE*)base + tls_dir->VirtualAddress);
    // AddressOfCallBacks is an absolute VA, not an RVA. It is only valid
    // once the relocation step (if one was needed) has adjusted it.
    PIMAGE_TLS_CALLBACK *callback =
        (PIMAGE_TLS_CALLBACK*)tls->AddressOfCallBacks;
    while (callback && *callback) {
        (*callback)((PVOID)base, DLL_PROCESS_ATTACH, NULL);
        callback++;
    }
}

// Call the entry point
// The file header tells us whether this image is a DLL or an EXE
BOOL is_dll = (nt->FileHeader.Characteristics & IMAGE_FILE_DLL) != 0;
DWORD_PTR ep = (DWORD_PTR)base + nt->OptionalHeader.AddressOfEntryPoint;
if (is_dll) {
    typedef BOOL (WINAPI *DllMainFunc)(HINSTANCE, DWORD, LPVOID);
    ((DllMainFunc)ep)((HINSTANCE)base, DLL_PROCESS_ATTACH, NULL);
} else {
    typedef int (*ExeMainFunc)(void);
    ((ExeMainFunc)ep)();
}
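The callback walk is simply a loop over a NULL-terminated array of function pointers. The portable sketch below isolates that pattern and the DLL check. All names (tls_callback_t, run_tls_callbacks, image_is_dll, demo_callback) are hypothetical; on Windows the real callback type is PIMAGE_TLS_CALLBACK with the NTAPI calling convention, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

#define REASON_PROCESS_ATTACH 1u   /* value of DLL_PROCESS_ATTACH */
#define FILE_DLL 0x2000u           /* value of IMAGE_FILE_DLL */

/* Simplified callback type: (image base, reason, reserved). */
typedef void (*tls_callback_t)(void *image_base, uint32_t reason, void *reserved);

/* Call every entry of a NULL-terminated callback array.
   Returns how many callbacks were invoked. */
static size_t run_tls_callbacks(const tls_callback_t *callbacks, void *image_base)
{
    size_t n = 0;
    if (callbacks == NULL) return 0;   /* no TLS callbacks registered */
    for (; callbacks[n] != NULL; n++)
        callbacks[n](image_base, REASON_PROCESS_ATTACH, NULL);
    return n;
}

/* The Characteristics field of the file header decides which
   entry point signature to use. */
static int image_is_dll(uint16_t file_characteristics)
{
    return (file_characteristics & FILE_DLL) != 0;
}

/* Demo callback for illustration only: counts how often it was called. */
static int demo_calls = 0;
static void demo_callback(void *image_base, uint32_t reason, void *reserved)
{
    (void)image_base; (void)reason; (void)reserved;
    demo_calls++;
}
```

Passing an array such as { demo_callback, demo_callback, NULL } to run_tls_callbacks invokes the callback twice and then stops at the terminator, mirroring the while loop in the loader code.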

7. How Donut Differs from a Standard Loader

Donut’s PE loader in inmem_pe.c follows the same five steps, but with critical differences that make it work as PIC shellcode:

PIC-Specific Adaptations

8. The Complete Loading Flow

PE Loading Pipeline (Donut inmem_pe.c)

Decrypt (Chaskey CTR) -> Decompress (aPLib / LZNT1) -> Map Sections -> Relocations -> Imports -> Permissions -> Entry Point

Knowledge Check

1. Why must base relocations be applied when loading a PE at a non-preferred address?

When a PE is compiled, the linker embeds absolute addresses assuming the image will load at ImageBase. If it loads elsewhere, the delta between actual and preferred base must be added to every relocated address.

2. What is the correct order of PE loading operations?

Sections must be mapped first (so the image exists in memory), then relocations fix absolute addresses, then imports are resolved and written into the IAT, then permissions are set, then any TLS callbacks run, and finally the entry point is called.

3. How does Donut’s PIC loader resolve API functions without calling GetProcAddress?

The PIC loader walks the PEB’s InMemoryOrderModuleList to find loaded DLLs (like kernel32.dll and ntdll.dll), then traverses their export tables comparing function name hashes to pre-computed values stored in the DONUT_INSTANCE.
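The hash comparison at the heart of that lookup can be sketched without any Windows structures. The example below is only an illustration of the idea: it uses FNV-1a as a stand-in hash and is not claimed to be the hash function Donut uses, and export_entry_t and find_export_by_hash are hypothetical names standing in for a walk over a real export directory.

```c
#include <stdint.h>
#include <stddef.h>

/* FNV-1a (32-bit), used purely as a stand-in hash for this sketch. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hypothetical flattened view of one export: name plus resolved address. */
typedef struct {
    const char *name;
    uintptr_t   address;
} export_entry_t;

/* Return the address of the export whose name hashes to `wanted`,
   or 0 if no export matches. No name string for the wanted function
   has to be stored by the caller, only its precomputed hash. */
static uintptr_t find_export_by_hash(const export_entry_t *exports, size_t n,
                                     uint32_t wanted)
{
    for (size_t i = 0; i < n; i++) {
        if (fnv1a(exports[i].name) == wanted)
            return exports[i].address;
    }
    return 0;
}
```

The caller computes the hash of a name such as "VirtualAlloc" ahead of time and passes only that value at run time, so the function name never appears as a string in the shellcode.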