Difficulty: Intermediate

Module 5: Symbol Resolution & Linking

From unresolved names to live function pointers: the runtime linker inside COFFLoader.

Why This Module?

After section data is loaded into memory (Module 4), the code still contains unresolved references. Every call to BeaconPrintf, every reference to KERNEL32$GetCurrentProcessId -- these are just symbol names. COFFLoader must resolve each symbol to an actual memory address. This module covers the process_symbol() function and the three categories of symbols it handles.

Three Categories of Symbols

When COFFLoader encounters a symbol during relocation processing, it must determine what kind of symbol it is and resolve it accordingly. There are three categories:

Category | How to Identify | Resolution Method
--- | --- | ---
Internal (section-defined) | SectionNumber > 0 | sectionMapping[SectionNumber - 1] + Value
Beacon API | Name starts with __imp_Beacon or matches InternalFunctions table | Look up in the InternalFunctions[30] table
DLL Import | Name contains $ separator (LIBRARY$Function) | LoadLibraryA + GetProcAddress
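
The decision table above can be sketched as a small classifier. This is a minimal, portable illustration rather than COFFLoader's actual code: the enum, the name classify_symbol, and the simplified Beacon check (x64 prefix only, no InternalFunctions lookup) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical categories mirroring the table above. */
typedef enum {
    SYM_INTERNAL,    /* defined in one of the BOF's own sections */
    SYM_BEACON_API,  /* resolved from the InternalFunctions table */
    SYM_DLL_IMPORT,  /* LIBRARY$Function, resolved via LoadLibraryA/GetProcAddress */
    SYM_UNKNOWN      /* none of the above; a loader should report an error */
} sym_category_t;

/* Classify a symbol from its SectionNumber and name (x64 naming assumed). */
static sym_category_t classify_symbol(int section_number, const char *name)
{
    assert(name != NULL);

    if (section_number > 0)
        return SYM_INTERNAL;

    if (strncmp(name, "__imp_Beacon", strlen("__imp_Beacon")) == 0)
        return SYM_BEACON_API;

    if (strncmp(name, "__imp_", strlen("__imp_")) == 0 && strchr(name, '$') != NULL)
        return SYM_DLL_IMPORT;

    return SYM_UNKNOWN;
}
```

The order of the checks matters: the section test comes first because a section-defined symbol needs no name parsing at all.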

Symbol Name Retrieval

Before processing a symbol, COFFLoader must retrieve its name. Recall from Module 2 that names can be stored inline (up to 8 chars) or in the string table:

// Getting the symbol name from a coff_sym_t entry
char* get_symbol_name(coff_sym_t* sym, char* string_table) {
    if (sym->first.value[0] != 0) {
        // Short name: stored inline in the 8-byte Name field
        // Note: may NOT be null-terminated if exactly 8 chars
        return sym->first.Name;  // up to 8 characters
    } else {
        // Long name: first.value[0]==0 means first.value[1] is string table offset
        return string_table + sym->first.value[1];
    }
}
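
Because an inline name that is exactly 8 characters long has no terminator, passing the returned pointer to strcmp or strlen can read past the field. One way to avoid that is to copy the name into a caller-supplied buffer. The sketch below assumes a name layout like the one in the snippet above; sym_name_t and copy_symbol_name are hypothetical names, not part of COFFLoader.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the name field of coff_sym_t (layout assumed from the snippet above). */
typedef union {
    char     Name[8];    /* inline short name, not necessarily NUL-terminated */
    uint32_t value[2];   /* value[0] == 0 means value[1] is a string table offset */
} sym_name_t;

/* Copy the symbol name into 'out' and always NUL-terminate it. */
static void copy_symbol_name(const sym_name_t *n, const char *string_table,
                             char *out, size_t out_size)
{
    assert(n != NULL && out != NULL && out_size > 0);

    if (n->value[0] != 0) {
        /* Short name: at most 8 bytes, padded with NULs when shorter. */
        size_t len = 0;
        while (len < 8 && n->Name[len] != '\0')
            len++;
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, n->Name, len);
        out[len] = '\0';
    } else {
        /* Long name: NUL-terminated string inside the string table. */
        assert(string_table != NULL);
        strncpy(out, string_table + n->value[1], out_size - 1);
        out[out_size - 1] = '\0';
    }
}
```

A 9-byte buffer is enough for any short name; long names need a buffer sized for the longest symbol the loader expects, and anything longer is truncated.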

The __imp_ Prefix Convention

This is one of the most critical details in COFF loading. When a BOF declares an imported function with DECLSPEC_IMPORT (__declspec(dllimport)), the compiler generates a symbol with the __imp_ prefix:

// BOF source declares:
DECLSPEC_IMPORT DWORD WINAPI KERNEL32$GetCurrentProcessId(void);

// Compiler generates symbol: __imp_KERNEL32$GetCurrentProcessId  (x64)
// On x86, it would be:       __imp__KERNEL32$GetCurrentProcessId (extra underscore;
//                            WINAPI/stdcall symbols also gain an @N suffix, omitted here)

// The __imp_ prefix tells the loader: "this symbol is an INDIRECT reference"
// The BOF code does NOT call the function directly.
// Instead, it reads a function pointer from a known address and calls through it.

Direct vs. Indirect Calls

Without __declspec(dllimport), the compiler would generate a direct CALL to the symbol. With it, the compiler generates an indirect call through a pointer: CALL [rip + offset_to___imp_symbol]. The __imp_ symbol resolves to a memory location that contains the function address (a pointer-to-function), not the function itself. This is why COFFLoader stores resolved addresses in the functionMapping table -- the code reads the pointer from that table.

How __imp_ works at the machine code level:

Without dllimport:
  E8 xx xx xx xx    CALL function_address    ; direct call (REL32)

With dllimport (__imp_ prefix):
  FF 15 xx xx xx xx CALL [rip + offset]      ; indirect call through pointer

The [rip + offset] points to a slot in functionMapping that contains
the actual address of the function. The loader fills this slot during
symbol resolution.
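
The same indirection can be modelled in plain C with a function pointer variable standing in for one functionMapping slot. This is only an analogy for what the machine code does; fake_get_pid, import_slot, and the two helper functions are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an imported function such as KERNEL32$GetCurrentProcessId. */
static uint32_t fake_get_pid(void) { return 1234; }

typedef uint32_t (*pid_fn_t)(void);

/* One pointer-sized slot, playing the role of an entry in functionMapping. */
static pid_fn_t import_slot = NULL;

/* What the loader does: write the resolved address into the slot. */
static void loader_fill_slot(pid_fn_t resolved)
{
    import_slot = resolved;
}

/* What the BOF code does: load the pointer from the slot, then call through it.
 * On x64 this corresponds to FF 15 disp32, i.e. CALL [rip + disp32]. */
static uint32_t bof_call_import(void)
{
    assert(import_slot != NULL);
    return import_slot();
}
```

The calling code never needs to know where the real function lives. It only needs the address of the slot, which is why the loader can fill the slot in at load time.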

The process_symbol() Function

COFFLoader's process_symbol() handles all three symbol categories. Here is its logic flow:

// Simplified process_symbol() logic (needs <windows.h> and <string.h>)
void* process_symbol(char* symbolName) {

    // 1. Check if it is a Beacon internal function
    //    Strip the __imp_ prefix first, then check the InternalFunctions table
    char* cleanName = symbolName;
    if (starts_with(symbolName, "__imp_")) {
        cleanName = symbolName + 6;  // skip "__imp_"
    }
    // On x86: skip "__imp__" (7 chars) due to extra underscore

    // Check against InternalFunctions[30] table
    for (int i = 0; i < 30; i++) {
        if (InternalFunctions[i][0] != NULL) {
            if (strcmp(cleanName, (char*)InternalFunctions[i][0]) == 0) {
                // Found it -- return the function pointer
                return (void*)InternalFunctions[i][1];
            }
        }
    }

    // 2. Not a Beacon function -- must be a DLL import
    //    Parse the LIBRARY$Function format
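    char  libraryName[256];
    char  functionName[256];
    // Split cleanName on the '$' character
    // e.g., "KERNEL32$GetCurrentProcessId" -> library="KERNEL32", function="GetCurrentProcessId"
    const char* sep = strchr(cleanName, '$');
    if (sep == NULL) return NULL;                  // not in LIBRARY$Function form
    size_t libLen = (size_t)(sep - cleanName);
    if (libLen == 0 || libLen >= sizeof(libraryName)) return NULL;
    memcpy(libraryName, cleanName, libLen);
    libraryName[libLen] = '\0';
    if (strlen(sep + 1) >= sizeof(functionName)) return NULL;
    strcpy(functionName, sep + 1);

    HMODULE hLib = LoadLibraryA(libraryName);
    if (hLib == NULL) return NULL;

    void* addr = (void*)GetProcAddress(hLib, functionName);
    return addr;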
}

The InternalFunctions Table

COFFLoader maintains a static array of 30 entries mapping Beacon API function names to their implementation addresses. This table is populated before RunCOFF() processes any symbols:

// Declared in beacon_compatibility.h:
extern unsigned char* InternalFunctions[30][2];

// Each entry is: { "FunctionName", function_pointer }
// Populated in RunCOFF() before relocation processing:

InternalFunctions[0][0] = (unsigned char*)"BeaconDataParse";
InternalFunctions[0][1] = (unsigned char*)&BeaconDataParse;

InternalFunctions[1][0] = (unsigned char*)"BeaconDataInt";
InternalFunctions[1][1] = (unsigned char*)&BeaconDataInt;

InternalFunctions[2][0] = (unsigned char*)"BeaconDataShort";
InternalFunctions[2][1] = (unsigned char*)&BeaconDataShort;

InternalFunctions[3][0] = (unsigned char*)"BeaconDataLength";
InternalFunctions[3][1] = (unsigned char*)&BeaconDataLength;

InternalFunctions[4][0] = (unsigned char*)"BeaconDataExtract";
InternalFunctions[4][1] = (unsigned char*)&BeaconDataExtract;

InternalFunctions[5][0] = (unsigned char*)"BeaconFormatAlloc";
InternalFunctions[5][1] = (unsigned char*)&BeaconFormatAlloc;

InternalFunctions[6][0] = (unsigned char*)"BeaconFormatReset";
InternalFunctions[6][1] = (unsigned char*)&BeaconFormatReset;

InternalFunctions[7][0] = (unsigned char*)"BeaconFormatFree";
InternalFunctions[7][1] = (unsigned char*)&BeaconFormatFree;

InternalFunctions[8][0] = (unsigned char*)"BeaconFormatAppend";
InternalFunctions[8][1] = (unsigned char*)&BeaconFormatAppend;

InternalFunctions[9][0] = (unsigned char*)"BeaconFormatPrintf";
InternalFunctions[9][1] = (unsigned char*)&BeaconFormatPrintf;

InternalFunctions[10][0] = (unsigned char*)"BeaconFormatToString";
InternalFunctions[10][1] = (unsigned char*)&BeaconFormatToString;

InternalFunctions[11][0] = (unsigned char*)"BeaconFormatInt";
InternalFunctions[11][1] = (unsigned char*)&BeaconFormatInt;

InternalFunctions[12][0] = (unsigned char*)"BeaconPrintf";
InternalFunctions[12][1] = (unsigned char*)&BeaconPrintf;

InternalFunctions[13][0] = (unsigned char*)"BeaconOutput";
InternalFunctions[13][1] = (unsigned char*)&BeaconOutput;

// ... additional entries for BeaconUseToken, BeaconRevertToken,
//     BeaconIsAdmin, BeaconGetSpawnTo, BeaconSpawnTemporaryProcess,
//     BeaconInjectProcess, BeaconInjectTemporaryProcess,
//     BeaconCleanupProcess, toWideChar, etc.
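
The name-to-pointer lookup against such a table can be sketched with a typed struct array. This is an illustrative alternative, not COFFLoader's layout: internal_fn_t, lookup_internal, and the two stub functions are assumptions, and the stubs merely stand in for the real Beacon API implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn_t)(void);

/* Hypothetical typed alternative to unsigned char* InternalFunctions[30][2]. */
typedef struct {
    const char  *name;
    generic_fn_t fn;
} internal_fn_t;

/* Stubs standing in for the real Beacon API implementations. */
static void stub_BeaconPrintf(void) {}
static void stub_BeaconOutput(void) {}

static const internal_fn_t internal_functions[] = {
    { "BeaconPrintf", stub_BeaconPrintf },
    { "BeaconOutput", stub_BeaconOutput },
};

/* Return the implementation for 'name', or NULL if it is not in the table. */
static generic_fn_t lookup_internal(const char *name)
{
    assert(name != NULL);
    size_t count = sizeof(internal_functions) / sizeof(internal_functions[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, internal_functions[i].name) == 0)
            return internal_functions[i].fn;
    }
    return NULL;
}
```

Keeping the name and the pointer in one struct avoids casting function pointers to unsigned char*, and deriving the count from the array removes the hard-coded 30.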

DLL Import Resolution

For symbols that are not Beacon API functions, COFFLoader parses the LIBRARY$Function naming convention:

Symbol Name Parsing:

Input:  "__imp_KERNEL32$GetCurrentProcessId"
Step 1: Strip __imp_ prefix  -> "KERNEL32$GetCurrentProcessId"
Step 2: Split on '$'          -> library = "KERNEL32", function = "GetCurrentProcessId"
Step 3: LoadLibraryA("KERNEL32")
Step 4: GetProcAddress(hModule, "GetCurrentProcessId")
Result: 0x00007FFA1A2B3C4D (address of GetCurrentProcessId in kernel32.dll)

Input:  "__imp_NTDLL$NtQuerySystemInformation"
Step 1: Strip __imp_         -> "NTDLL$NtQuerySystemInformation"
Step 2: Split on '$'          -> library = "NTDLL", function = "NtQuerySystemInformation"
Step 3: LoadLibraryA("NTDLL")
Step 4: GetProcAddress(hModule, "NtQuerySystemInformation")
Result: 0x00007FFA1B2C3D4E

Ordinal-Based Imports

Some DLL functions are exported by ordinal (a numeric identifier) rather than by name. COFFLoader supports ordinal-based resolution using the LIBRARY$Function@ordinal format. When the symbol contains an @ after the function name, COFFLoader extracts the ordinal number and passes it to GetProcAddress in place of a name: the ordinal goes in the low-order word of the name parameter and the high-order word must be zero. This is rare in BOFs but supported for completeness.
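
A minimal parsing sketch for the LIBRARY$Function@ordinal form described above follows. It is portable C with no Windows calls; import_ref_t and parse_import are hypothetical names, and the buffer sizes are arbitrary. Note that on x86, stdcall name decoration also appends @N (the argument byte count), so a real loader has to decide which meaning applies; this sketch simply follows the ordinal interpretation given in this section.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char          library[64];
    char          function[128];
    int           has_ordinal;
    unsigned long ordinal;
} import_ref_t;

/* Parse "LIBRARY$Function" or "LIBRARY$Function@N" (prefix already stripped).
 * Returns 0 on success, -1 if the name is malformed or does not fit. */
static int parse_import(const char *name, import_ref_t *out)
{
    assert(name != NULL && out != NULL);
    memset(out, 0, sizeof(*out));   /* also guarantees NUL termination */

    const char *dollar = strchr(name, '$');
    if (dollar == NULL || dollar == name)
        return -1;
    size_t lib_len = (size_t)(dollar - name);
    if (lib_len >= sizeof(out->library))
        return -1;
    memcpy(out->library, name, lib_len);

    const char *func = dollar + 1;
    const char *at = strchr(func, '@');
    size_t func_len = at ? (size_t)(at - func) : strlen(func);
    if (func_len == 0 || func_len >= sizeof(out->function))
        return -1;
    memcpy(out->function, func, func_len);

    if (at != NULL) {
        char *end = NULL;
        unsigned long n = strtoul(at + 1, &end, 10);
        if (end == at + 1 || *end != '\0' || n > 0xFFFF)
            return -1;              /* ordinals must fit in 16 bits */
        out->has_ordinal = 1;
        out->ordinal = n;
    }
    return 0;
}
```

On Windows, a parsed ordinal would typically be handed to GetProcAddress via the MAKEINTRESOURCEA macro, which places the value in the low-order word of the pointer argument.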

Internal Symbol Resolution

Not all symbols require external resolution. Symbols defined within the BOF itself (local functions, static variables, section names) have SectionNumber > 0. These are resolved directly from the sectionMapping array:

// For a symbol with SectionNumber > 0:
// The symbol is defined in the COFF file itself.
// Its address = base of its section + its Value offset.

if (coff_symbol_is_defined(&symbols[symIdx])) {
    int sectionIndex = symbols[symIdx].SectionNumber - 1;  // 0-based
    void* address = sectionMapping[sectionIndex] + symbols[symIdx].Value;
    // 'address' now points to the symbol in loaded memory
}

Example: Resolving the "go" function symbol

Symbol table entry:
  Name = "go"
  Value = 0x00        (offset 0 within its section -- it's the first function)
  SectionNumber = 1   (defined in section 1, which is .text)
  StorageClass = 2    (EXTERNAL -- globally visible)

Resolution:
  sectionIndex = 1 - 1 = 0
  address = sectionMapping[0] + 0x00
  address = 0x00007FF8A1230000   (base of .text allocation)

This is the entry point address that COFFLoader will call.
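
The arithmetic in this example is simple, but a malformed COFF file can carry a SectionNumber that points past the end of sectionMapping. The sketch below adds that check. It assumes sectionMapping is an array of char* section bases; resolve_internal and the section_count parameter are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Resolve a section-defined symbol: base of its section plus Value.
 * Returns NULL if the section number is not a valid 1-based index
 * or the section was never mapped. */
static void *resolve_internal(char **section_mapping, size_t section_count,
                              int section_number, uint32_t value)
{
    assert(section_mapping != NULL);

    if (section_number <= 0 || (size_t)section_number > section_count)
        return NULL;
    char *base = section_mapping[section_number - 1];  /* 1-based to 0-based */
    if (base == NULL)
        return NULL;
    return base + value;
}
```

A fuller check would also compare Value against the size of the section, which requires the section headers covered in earlier modules.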

Storing Resolved Addresses

For external symbols (Beacon API and DLL imports) with the __imp_ prefix, the resolved function address is stored in the functionMapping table during relocation processing (Module 6). The table is indexed by a sequential counter that increments for each external function call relocation processed -- not by the symbol table index:

// Store the resolved address in the function pointer table
// functionMapping is indexed by a sequential counter (functionMappingCount)
// that increments for each external function call relocation processed
uint64_t resolved_addr = (uint64_t)process_symbol(symbolName);

// Write the address into the next available function pointer slot
*(uint64_t*)(functionMapping + functionMappingCount * sizeof(uint64_t)) = resolved_addr;
functionMappingCount++;  // advance to next slot
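
The snippet above writes to the next slot unconditionally. A slightly more defensive sketch, assuming a fixed-capacity table, is shown here. function_mapping_t, store_resolved, and SLOT_COUNT are hypothetical names and values chosen for the example, not COFFLoader's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 4  /* hypothetical capacity, chosen for the example */

typedef struct {
    unsigned char bytes[SLOT_COUNT * sizeof(uint64_t)]; /* raw table memory */
    size_t        count;                                /* slots in use */
} function_mapping_t;

/* Append a resolved address; returns a pointer to the slot that now holds it,
 * or NULL when the table is full. */
static unsigned char *store_resolved(function_mapping_t *fm, uint64_t resolved_addr)
{
    assert(fm != NULL);
    if (fm->count >= SLOT_COUNT)
        return NULL;
    unsigned char *slot = fm->bytes + fm->count * sizeof(uint64_t);
    memcpy(slot, &resolved_addr, sizeof(resolved_addr)); /* alignment-safe write */
    fm->count++;
    return slot;
}
```

Returning the slot address is convenient because that address, not the function address, is what the relocation step needs to reference.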

Symbol Resolution Flow

Symbol name (from symbol table) -> process_symbol() (classify & resolve) -> functionMapping[i] (store pointer)

Architecture Differences: x64 vs x86

COFFLoader handles both architectures with a preprocessor-based prefix:

// The prefix before symbol names varies by architecture:
#ifdef _WIN64
    #define PREPENDSYMBOLVALUE "__imp_"    // x64: __imp_FunctionName
#else
    #define PREPENDSYMBOLVALUE "__imp__"   // x86: __imp__FunctionName (extra _)
#endif

// x86 C calling convention prepends an underscore to all symbol names.
// Combined with __declspec(dllimport), x86 gets __imp__ (double underscore)
// while x64 gets __imp_ (single underscore after imp).

Architecture | Prefix | Example Symbol
--- | --- | ---
x64 (AMD64) | __imp_ | __imp_KERNEL32$GetCurrentProcessId
x86 (i386) | __imp__ | __imp__KERNEL32$GetCurrentProcessId
x64 entry | (none) | go
x86 entry | _ | _go
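
Prefix stripping for both architectures can be sketched as a single helper. To keep the example testable in one build, it takes the architecture as a runtime flag instead of using the #ifdef shown above; strip_import_prefix and the is_x86 parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the name with the import prefix removed, or NULL if the symbol does
 * not carry the prefix for the given architecture. The is_x86 flag replaces
 * the compile-time #ifdef so both paths can be exercised in one build. */
static const char *strip_import_prefix(const char *symbol, int is_x86)
{
    assert(symbol != NULL);
    const char *prefix = is_x86 ? "__imp__" : "__imp_";
    size_t len = strlen(prefix);
    if (strncmp(symbol, prefix, len) != 0)
        return NULL;
    return symbol + len;
}
```

The returned pointer aliases the input string, so the caller must keep the original symbol name alive for as long as the stripped name is in use.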

The Complete Resolution Path

When COFFLoader processes relocations, it encounters a symbol index. It looks up the symbol in the symbol table, retrieves its name, and calls process_symbol(). For external symbols, the name is checked against the InternalFunctions table (Beacon API), and if not found, parsed as LIBRARY$Function for DLL resolution. The resolved address is stored in functionMapping, and the relocation engine patches the BOF code to reference the correct slot. This is essentially runtime linking -- what ld.exe or link.exe would do at build time, COFFLoader does at load time.

Pop Quiz: Symbol Resolution

Q1: A symbol is named "__imp_ADVAPI32$OpenProcessToken". How does COFFLoader resolve it?

COFFLoader first strips the __imp_ prefix to get "ADVAPI32$OpenProcessToken". It checks the InternalFunctions table (no match -- this is not a Beacon API function). Then it splits on $ to get library="ADVAPI32" and function="OpenProcessToken", calls LoadLibraryA to get the DLL handle, and GetProcAddress to get the function address.

Q2: Why does the compiler generate an __imp_ prefix for dllimport symbols?

The __imp_ prefix is the Microsoft convention for indirect import references. With __declspec(dllimport), the compiler generates code that reads a function pointer from a known location (CALL [rip + offset]) rather than a direct CALL. The __imp_ symbol resolves to the slot containing the pointer, not to the function itself. COFFLoader fills this slot in the functionMapping table.

Q3: How is a symbol defined within the BOF (SectionNumber=1, Value=0x20) resolved?

A symbol with SectionNumber > 0 is defined internally. SectionNumber is 1-based, so section 1 maps to sectionMapping[0]. The Value field (0x20) is the offset within that section. The resolved address is sectionMapping[0] + 0x20, which points directly into the loaded section memory.