Difficulty: Beginner

Module 3: PEB Walking & API Hashing

How to find any Windows API without a single suspicious import.

The Problem

Normal programs #include <windows.h> and call functions directly. The linker adds them to the import table, and anyone running strings on the binary can see exactly what APIs it uses. A binary importing NtAllocateVirtualMemory, NtProtectVirtualMemory, and NtQueueApcThread screams "shellcode loader." AceLdr imports zero functions. It finds everything at runtime.

The PEB: Your Window Into the Process

Every Windows thread has a TEB (Thread Environment Block). The TEB contains a pointer to the PEB (Process Environment Block). The PEB contains, among other things, a linked list of all loaded DLLs. AceLdr walks this list to find modules like ntdll.dll.

PEB Module Discovery Chain

TEB (gs:[0x60] on x64 holds the PEB pointer) -> PEB -> PEB_LDR_DATA -> InLoadOrderModuleList (linked list of LDR_DATA_TABLE_ENTRY)

TEB and PEB Access on x64

On x64 Windows, the gs segment base points at the current thread's TEB. The PEB pointer is at offset 0x60 within the TEB, so reading gs:[0x60] gives you the PEB address directly. In C code, this is equivalent to NtCurrentTeb()->ProcessEnvironmentBlock. The PEB then contains an Ldr field (a PEB_LDR_DATA pointer), which holds the linked lists of loaded modules.

Walking the Module List

The PEB_LDR_DATA structure contains three doubly-linked lists of loaded modules: InLoadOrderModuleList, InMemoryOrderModuleList, and InInitializationOrderModuleList. AceLdr uses the InLoadOrderModuleList. Each entry in the list is an LDR_DATA_TABLE_ENTRY containing the DLL's base address, name, and size.

C - from util.c FindModule()
PVOID FindModule( ULONG hash, PPEB peb, PULONG size )
{
    PLIST_ENTRY             Hdr = NULL;
    PLIST_ENTRY             Ent = NULL;
    PLDR_DATA_TABLE_ENTRY   Ldr = NULL;

    // Get head of the doubly-linked list
    Hdr = & peb->Ldr->InLoadOrderModuleList;
    Ent = Hdr->Flink;  // First entry

    // Walk the list until we wrap around to the head
    for( ; Hdr != Ent; Ent = Ent->Flink )
    {
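        // InLoadOrderLinks is the first member of LDR_DATA_TABLE_ENTRY,
        // so the list pointer is also a pointer to the entry itself
        Ldr = C_PTR( Ent );
        // Compare hash of this module's name to our target hash.
        // BaseDllName is a UNICODE_STRING: Buffer is UTF-16, Length is in bytes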
        if( HashString( Ldr->BaseDllName.Buffer,
                        Ldr->BaseDllName.Length ) == hash )
        {
            if( size != NULL )
                *size = Ldr->SizeOfImage;
            return Ldr->DllBase;  // Return base address of the DLL
        }
    }
    return NULL;
}
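The loop's termination rule (stop when the walk wraps around to the list head) is easy to miss, so here is a small simulation of it. This is only a sketch: the Entry class, build_list, find_module, and all base addresses and sizes are hypothetical stand-ins, and it compares names directly where the C code compares hashes.

```python
# Simulation only: plain Python objects stand in for LIST_ENTRY and
# LDR_DATA_TABLE_ENTRY. No real process memory is touched.
class Entry:
    """Hypothetical stand-in for one LDR_DATA_TABLE_ENTRY."""

    def __init__(self, name=None, base=None, size=None):
        self.name = name    # plays the role of BaseDllName
        self.base = base    # plays the role of DllBase
        self.size = size    # plays the role of SizeOfImage
        self.flink = self   # forward link; a lone node points at itself


def build_list(modules):
    """Return a sentinel head linked in a circle through all modules."""
    head = Entry()          # the list head carries no module data
    tail = head
    for name, base, size in modules:
        node = Entry(name, base, size)
        tail.flink = node
        tail = node
    tail.flink = head       # the last entry links back to the head
    return head


def find_module(head, wanted_name):
    """Walk forward from the head until the walk wraps around to it."""
    ent = head.flink
    while ent is not head:
        if ent.name.upper() == wanted_name.upper():
            return ent.base, ent.size
        ent = ent.flink
    return None


# Made-up base addresses and sizes, for illustration only.
head = build_list([
    ("example.exe", 0x140000000, 0x20000),
    ("ntdll.dll", 0x7FF800000000, 0x1F0000),
    ("KERNEL32.DLL", 0x7FF700000000, 0xC0000),
])

print(find_module(head, "NTDLL.DLL"))
print(find_module(head, "missing.dll"))
```

Because the head is a sentinel rather than a module entry, an empty list is simply a head whose forward link points at itself, and the loop body never runs.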

Why InLoadOrderModuleList?

In a normally started process, the first two modules in the load-order list are the executable itself and ntdll.dll. Since AceLdr primarily needs ntdll.dll (for native API functions), using the load-order list is predictable and reliable. The hash comparison adds flexibility: AceLdr can find any module by name without hardcoding positions in the list.

DJB2 Hashing: Names Without Strings

Instead of storing the string "NtAllocateVirtualMemory" (which defenders can search for), AceLdr stores its DJB2 hash: 0xf783b8ec. At runtime, it hashes every export name until it finds a match.

The DJB2 algorithm (created by Daniel J. Bernstein) is simple and fast:

  1. Start with the magic value 5381
  2. For each character: hash = hash * 33 + character
  3. The multiplication by 33 is done as (hash << 5) + hash for speed
  4. AceLdr converts to uppercase first for case-insensitive matching
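Steps 1 to 3 can be checked by hand with a couple of rounds. The sketch below is illustrative only: djb2_step and MASK32 are names chosen here, and the uppercase conversion from step 4 is left out because the sample input is already uppercase.

```python
MASK32 = 0xFFFFFFFF  # the C code uses a 32-bit ULONG, so results wrap


def djb2_step(h, ch):
    """One round: hash * 33 + character, truncated to 32 bits."""
    return (((h << 5) + h) + ord(ch)) & MASK32


# The shift-and-add form is the same as multiplying by 33.
assert (5381 << 5) + 5381 == 5381 * 33

h = 5381
for ch in "AB":
    h = djb2_step(h, ch)
    print(ch, h)
# prints:
# A 177638
# B 5862120
```

The first round is 5381 * 33 + 65 = 177638, and the second is 177638 * 33 + 66 = 5862120. Longer names overflow 32 bits quickly, which is why the mask matters in Python, where integers do not wrap on their own.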
C - from util.c HashString()
UINT32 HashString( PVOID buffer, ULONG size )
{
    UCHAR  Cur = 0;
    ULONG  Djb = 5381;   // DJB2 magic starting value
    PUCHAR Ptr = buffer;

    while ( TRUE )
    {
        Cur = *Ptr;
        if( !size ) { if( !*Ptr ) break; }
        else { if( (ULONG)(Ptr - (PUCHAR)buffer) >= size ) break; }

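        // Uppercase: subtracts 0x20 from every byte >= 'a'. Bytes above 'z' are
        // shifted too, which is harmless for typical API and DLL names
        if( Cur >= 'a' ) Cur -= 0x20;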

        Djb = (( Djb << 5 ) + Djb ) + Cur;  // hash * 33 + char
        ++Ptr;
    }
    return Djb;
}
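The size parameter gives HashString two modes, and that matters because FindModule passes a UTF-16 buffer with a byte length. The following is a byte-for-byte Python port of the excerpt above, written as a sketch for experimenting; hash_bytes is a name chosen here and is not part of AceLdr. One deliberate difference: Python stops at the end of the bytes object, whereas the C loop relies on a terminator or a correct size.

```python
def hash_bytes(buffer: bytes, size: int = 0) -> int:
    """Illustrative port of the HashString excerpt, operating on raw bytes."""
    djb = 5381
    for offset, cur in enumerate(buffer):
        if size == 0:
            if cur == 0:        # size 0: stop at the NUL terminator
                break
        elif offset >= size:    # size > 0: stop after exactly `size` bytes
            break
        if cur >= ord("a"):
            cur -= 0x20         # same unconditional shift as the C code
        djb = ((djb << 5) + djb + cur) & 0xFFFFFFFF
    return djb


# Export names are NUL-terminated ASCII, hashed with size 0.
print(hash_bytes(b"a\x00"))                   # 177638, same as b"A\x00"

# Module names are UTF-16LE with a byte length, so the zero high bytes
# are hashed too and the result differs from the ASCII hash.
wide = "a".encode("utf-16-le")                # b"a\x00"
print(hash_bytes(wide, len(wide)))            # 5862054

# Quirk of the `>= 'a'` test: '{' (0x7B) is shifted down to '[' (0x5B).
print(hash_bytes(b"{\x00") == hash_bytes(b"[\x00"))   # True
```

As the excerpt is written, a module hash computed this way will generally not equal the hash of the same name taken as a plain ASCII string, so it is worth checking which form a given constant was generated from.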

Python Companion Script

AceLdr includes a Python script to pre-compute hashes at build time. This is how the #define constants in include.h are generated:

Python - scripts/hashstring.py
#!/usr/bin/env python3
# The same algorithm in Python - used to pre-compute hashes
def hash_string( string ):
    hash = 5381
    for x in string.upper():
        hash = (( hash << 5 ) + hash ) + ord(x)
    return hash & 0xFFFFFFFF

# Example: python hashstring.py "NtAllocateVirtualMemory"
# Output:  0xf783b8ec

Why DJB2 Specifically?

DJB2 is popular in shellcode for several reasons: it's trivial to implement (a few instructions), it produces few collisions for typical Windows API names, and the 32-bit output is compact. The tradeoff is that since DJB2 is well-known, defenders can pre-compute hash tables for all Windows API names and reverse-lookup suspicious constants. Tools like HashDB do exactly this. AceLdr accepts this risk because the hashes are compiled into position-independent code that's harder to statically analyze than a normal import table.
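The reverse-lookup idea from the defender's side can be sketched in a few lines. This is a toy illustration of the approach, not how HashDB itself is implemented; djb2_upper, KNOWN_EXPORTS, LOOKUP, and identify are hypothetical names, and a real table would be built from every export of the system DLLs rather than five names.

```python
def djb2_upper(name: str) -> int:
    """Uppercased DJB2, matching the hash_string script shown earlier."""
    h = 5381
    for ch in name.upper():
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h


# Hypothetical, tiny name list for illustration.
KNOWN_EXPORTS = [
    "NtAllocateVirtualMemory",
    "NtProtectVirtualMemory",
    "RtlCreateHeap",
    "GetProcessHeap",
    "Sleep",
]

# Precompute hash -> name once; each lookup is then a dictionary hit.
LOOKUP = {djb2_upper(name): name for name in KNOWN_EXPORTS}


def identify(constant: int) -> str:
    """Map a suspicious 32-bit constant back to an API name, if known."""
    return LOOKUP.get(constant, "<unknown>")


# 0x0e07cd7e is the H_API_SLEEP constant listed in include.h.
print(identify(0x0E07CD7E))   # Sleep
```

Because the algorithm and seed are public, the hashes hide names from a casual strings search but not from an analyst who has such a table.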

Finding Exported Functions

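Once we have a module's base address (from FindModule), we walk its export table to find specific functions. The export directory points to three arrays. The name array and the name-ordinal array are parallel (one entry per named export), and each name ordinal is an index into the function address array: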

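C - from util.c FindFunction()
PVOID FindFunction( PVOID image, ULONG hash )
{
    // Declarations are omitted from the excerpt; restored here with the standard PE types
    ULONG                   Idx = 0;
    PUINT16                 Aoo = NULL;
    PUINT32                 Aof = NULL;
    PUINT32                 Aon = NULL;
    PIMAGE_DOS_HEADER       Hdr = NULL;
    PIMAGE_NT_HEADERS       Nth = NULL;
    PIMAGE_DATA_DIRECTORY   Dir = NULL;
    PIMAGE_EXPORT_DIRECTORY Exp = NULL;

    // Navigate PE headers to the export directory
    // (byte arithmetic on a PVOID relies on the GCC void-pointer extension)
    Hdr = image;
    Nth = image + Hdr->e_lfanew;
    Dir = &Nth->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
    Exp = image + Dir->VirtualAddress;

    // Names and name ordinals are parallel; each ordinal indexes the function array
    Aon = image + Exp->AddressOfNames;
    Aof = image + Exp->AddressOfFunctions;
    Aoo = image + Exp->AddressOfNameOrdinals;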

    for( Idx = 0; Idx < Exp->NumberOfNames; ++Idx )
    {
        // Hash each export name, compare to our target
        if( HashString( image + Aon[Idx], 0 ) == hash )
            return image + Aof[ Aoo[Idx] ];  // Found it!
    }
    return NULL;
}
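The double index Aof[ Aoo[Idx] ] is the step that most often trips people up, so here is a miniature model of it. Everything in this sketch is made up for illustration: the three lists, the export names, the RVAs, and the helper names. In a real PE file the name array holds RVAs of strings rather than the strings themselves.

```python
def djb2_upper(name: str) -> int:
    """Uppercased DJB2 over an export name."""
    h = 5381
    for ch in name.upper():
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h


# Hypothetical miniature export directory.
address_of_functions = [0x1000, 0x2000, 0x3000]  # RVAs, indexed by ordinal index
address_of_names = ["Alpha", "Gamma"]            # one entry per *named* export
address_of_name_ordinals = [0, 2]                # parallel to the names array

# The function at index 1 (RVA 0x2000) has no name: it is exported by
# ordinal only, so the names array is shorter than the functions array.


def find_function(image_base: int, wanted_hash: int):
    """Hash each name; on a match, go through the ordinal to reach the RVA."""
    for idx, name in enumerate(address_of_names):
        if djb2_upper(name) == wanted_hash:
            ordinal_index = address_of_name_ordinals[idx]
            return image_base + address_of_functions[ordinal_index]
    return None


base = 0x10000
print(hex(find_function(base, djb2_upper("Gamma"))))   # 0x13000
print(find_function(base, djb2_upper("Delta")))        # None
```

Indexing the function array directly with the name index would return 0x12000 for "Gamma", which is the unnamed export. That is why the name-ordinal array sits between the two.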

Complete API Resolution Chain

TEB -> PEB -> Walk module list (FindModule) -> Walk export table (FindFunction) -> Function pointer, ready to call!

AceLdr's Hash Constants

These constants are pre-computed with the Python script, stored in include.h, and matched at runtime:

C - from include.h (selected hashes)
#define H_LIB_NTDLL                  0x1edab0ed  // "ntdll.dll"
#define H_LIB_KERNEL32               0x6ddb9555  // "kernel32.dll"
#define H_API_NTALLOCATEVIRTUALMEMORY 0xf783b8ec
#define H_API_NTPROTECTVIRTUALMEMORY  0x50e92888
#define H_API_RTLCREATEHEAP           0xe1af6849
#define H_API_SLEEP                   0x0e07cd7e
#define H_API_GETPROCESSHEAP          0x36c007a2
// ... 50+ more hashes

The Full Resolution Flow in Practice

When AceLdr starts, resolveLoaderFunctions() calls FindModule(H_LIB_NTDLL, peb, NULL) to get ntdll.dll's base, then calls FindFunction(ntdllBase, H_API_NTALLOCATEVIRTUALMEMORY) to get the actual function pointer. This is stored in an API structure and called throughout AceLdr's loading sequence. The entire process happens without a single import table entry.

Pop Quiz: PEB & Hashing

Q1: Why use DJB2 hashing instead of storing API name strings?

Static analysis tools such as YARA rules and strings look for suspicious API names. With hashes, the binary contains only opaque 32-bit integers that reveal nothing at a glance, although an analyst with a precomputed hash table can still reverse them. The other tradeoff is the runtime cost of hashing export names until a match is found.

Q2: The PEB is accessed through which structure?

Through the TEB. On x64 Windows, the TEB is accessible via the gs segment register, and the PEB pointer is at offset 0x60 within the TEB. So gs:[0x60] gives you the PEB address, which then provides access to the loaded module lists and other process information.

Q3: The DJB2 hash starts with a "magic number." What is it?

DJB2 uses 5381 as its initial hash value. The constant comes from Daniel J. Bernstein's original algorithm, and in practice it distributes typical string inputs well, with few collisions.