Difficulty: Beginner

Module 2: COFF File Format Deep Dive

Every byte matters: headers, sections, symbols, strings, and relocations in raw binary.

Why This Module?

COFFLoader is fundamentally a COFF parser. To understand how it loads and executes BOFs, you must first understand the binary layout it parses. This module walks through the five major components of a COFF object file: the file header, section table, symbol table, string table, and relocation entries. These are the exact structures defined in COFFLoader.h that the loader reads.

COFF Binary Layout

A COFF object file has a well-defined binary layout. Unlike a PE, there is no DOS header, no PE signature, and no optional header. The file begins immediately with the COFF file header.

COFF Object File Layout:

Offset 0x00:  +--------------------------+
              | COFF File Header         |  20 bytes
              | (coff_file_header_t)     |
              +--------------------------+
              | Section Header 1         |  40 bytes each
              | (coff_sect_t)            |
              +--------------------------+
              | Section Header 2         |
              +--------------------------+
              | ...                      |
              +--------------------------+
              | Section Header N         |
              +--------------------------+
              | Section 1 Raw Data       |  variable size
              +--------------------------+
              | Section 1 Relocations    |  10 bytes each
              +--------------------------+
              | Section 2 Raw Data       |
              +--------------------------+
              | Section 2 Relocations    |
              +--------------------------+
              | ...                      |
              +--------------------------+
              | Symbol Table             |  18 bytes per symbol
              +--------------------------+
              | String Table             |  variable size
              +--------------------------+

The COFF File Header

The file header is always 20 bytes and sits at offset 0. COFFLoader defines it as coff_file_header_t:

typedef struct coff_file_header {
    uint16_t Machine;              // 0x8664 = AMD64, 0x14C = i386
    uint16_t NumberOfSections;     // how many section headers follow
    uint32_t TimeDateStamp;        // compilation timestamp (often zero)
    uint32_t PointerToSymbolTable; // file offset to the symbol table
    uint32_t NumberOfSymbols;      // total entries in symbol table
    uint16_t SizeOfOptionalHeader; // always 0 for object files
    uint16_t Characteristics;      // flags (usually 0 for .obj)
} coff_file_header_t;

Field                 Offset  Size  Purpose
Machine               0x00    2     Target architecture. COFFLoader checks for 0x8664 (AMD64)
NumberOfSections      0x02    2     Count of section headers immediately following this header
TimeDateStamp         0x04    4     Unix timestamp of compilation. Not used by the loader
PointerToSymbolTable  0x08    4     File offset to the symbol table. Critical for symbol resolution
NumberOfSymbols       0x0C    4     Number of entries (including aux symbols). Used to locate the string table
SizeOfOptionalHeader  0x10    2     Always 0 for object files (no optional header)
Characteristics       0x12    2     Flags. Usually 0 for unlinked objects
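
As a rough illustration of the offsets in the table above, the following sketch reads the header fields from a raw byte buffer. The names rd16, rd32, coff_hdr_view_t, and parse_coff_header are hypothetical and do not come from COFFLoader; the sketch assembles little-endian values byte by byte instead of casting to a struct, so it does not depend on host byte order or struct packing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: assemble little-endian integers byte by byte. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical view holding only the fields the text calls out. */
typedef struct {
    uint16_t machine;
    uint16_t number_of_sections;
    uint32_t pointer_to_symbol_table;
    uint32_t number_of_symbols;
} coff_hdr_view_t;

/* Returns 0 on success, -1 if the buffer is shorter than the 20-byte
 * header or the Machine field is not AMD64 (0x8664). */
static int parse_coff_header(const uint8_t *buf, size_t len, coff_hdr_view_t *out) {
    assert(buf != NULL && out != NULL);
    if (len < 20) return -1;
    out->machine                 = rd16(buf + 0x00);
    out->number_of_sections      = rd16(buf + 0x02);
    out->pointer_to_symbol_table = rd32(buf + 0x08);
    out->number_of_symbols       = rd32(buf + 0x0C);
    return (out->machine == 0x8664) ? 0 : -1;
}
```

A caller would typically run a check like this before touching any other part of the file, since every later offset is derived from these four fields.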

Key Insight: Finding the String Table

The string table immediately follows the symbol table. Since each symbol entry is exactly 18 bytes, the string table starts at: PointerToSymbolTable + (NumberOfSymbols * 18). The first 4 bytes of the string table are a uint32_t giving the total size of the string table (including those 4 bytes). Symbol names longer than 8 characters are stored here and referenced by offset.
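
The offset arithmetic above can be sketched as follows. This is a minimal example under stated assumptions: locate_string_table and rd32 are hypothetical names, and the bounds checks are an illustrative addition, not a description of what COFFLoader itself validates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: little-endian 32-bit read. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Computes PointerToSymbolTable + NumberOfSymbols * 18 and reads the
 * 4-byte size field found there. Returns 0 on success, -1 if the table
 * (or its declared size) would fall outside the file. */
static int locate_string_table(const uint8_t *file, size_t len,
                               uint32_t ptr_to_symtab, uint32_t num_syms,
                               size_t *offset, uint32_t *size) {
    assert(file != NULL && offset != NULL && size != NULL);
    uint64_t off = (uint64_t)ptr_to_symtab + (uint64_t)num_syms * 18u;
    if (off + 4 > len) return -1;               /* no room for the size field */
    uint32_t declared = rd32(file + off);
    if (declared < 4 || off + declared > len) return -1;
    *offset = (size_t)off;
    *size = declared;
    return 0;
}
```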

The Section Table

Immediately following the 20-byte file header is an array of section headers. Each header is 40 bytes, defined as coff_sect_t:

#pragma pack(push, 1)
typedef struct coff_sect {
    char     Name[8];                // section name (e.g., ".text\0\0\0")
    uint32_t VirtualSize;            // 0 for object files
    uint32_t VirtualAddress;         // 0 for object files
    uint32_t SizeOfRawData;          // size of section data in the file
    uint32_t PointerToRawData;       // file offset to the raw data
    uint32_t PointerToRelocations;   // file offset to relocation entries
    uint32_t PointerToLineNumbers;   // file offset to line numbers (usually 0)
    uint16_t NumberOfRelocations;    // count of relocation entries for this section
    uint16_t NumberOfLinenumbers;    // count of line number entries (usually 0)
    uint32_t Characteristics;        // flags: executable, readable, writable, etc.
} coff_sect_t;
#pragma pack(pop)
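
Because the section table is a fixed-stride array starting at offset 20, locating header i is simple arithmetic. The sketch below shows one way to do it; section_header_at, section_name, and section_raw_size are hypothetical helpers, not COFFLoader functions. Note that Name is not NUL-terminated when the name is exactly 8 characters long, so the example copies it into a 9-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COFF_HEADER_SIZE  20u
#define COFF_SECTION_SIZE 40u

/* Hypothetical helper: little-endian 32-bit read. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns a pointer to the 0-based index-th section header, or NULL if
 * the index is out of range or the header would extend past the file. */
static const uint8_t *section_header_at(const uint8_t *file, size_t len,
                                        uint16_t num_sections, uint16_t index) {
    assert(file != NULL);
    size_t off = COFF_HEADER_SIZE + (size_t)index * COFF_SECTION_SIZE;
    if (index >= num_sections || off + COFF_SECTION_SIZE > len) return NULL;
    return file + off;
}

/* Copies the 8-byte Name field and guarantees termination. */
static void section_name(const uint8_t *sect, char out[9]) {
    memcpy(out, sect, 8);
    out[8] = '\0';
}

/* SizeOfRawData sits 16 bytes into the section header. */
static uint32_t section_raw_size(const uint8_t *sect) {
    return rd32(sect + 16);
}
```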

Common Sections in a BOF

Section  Characteristics               Content
.text    CODE | EXECUTE | READ         Compiled machine code (the go() function and helpers)
.data    INITIALIZED | READ | WRITE    Initialized global/static variables
.rdata   INITIALIZED | READ            Read-only data: string literals, constant tables
.bss     UNINITIALIZED | READ | WRITE  Zero-initialized globals. SizeOfRawData is 0 (no file data)
.xdata   INITIALIZED | READ            Exception handling unwind data (x64)
.pdata   INITIALIZED | READ            Function table for structured exception handling

Section Characteristics Flags

The Characteristics field is a bitmask. COFFLoader defines the relevant flags:

#define IMAGE_SCN_CNT_CODE               0x00000020  // section contains code
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080  // section contains uninitialized data (.bss)
#define IMAGE_SCN_MEM_EXECUTE            0x20000000  // section is executable
#define IMAGE_SCN_MEM_READ               0x40000000  // section is readable
#define IMAGE_SCN_MEM_WRITE              0x80000000  // section is writable
#define IMAGE_SCN_MEM_DISCARDABLE        0x02000000  // section can be discarded
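
To make the bitmask concrete, here is a small sketch that turns a Characteristics value into an "RWX"-style permission string. section_perms and section_is_bss are hypothetical names chosen for this example. Real section headers usually carry extra bits such as alignment flags, which the masks below simply ignore.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080u
#define IMAGE_SCN_MEM_EXECUTE            0x20000000u
#define IMAGE_SCN_MEM_READ               0x40000000u
#define IMAGE_SCN_MEM_WRITE              0x80000000u

/* Writes a 3-character permission string such as "R-X" into out. */
static void section_perms(uint32_t characteristics, char out[4]) {
    assert(out != NULL);
    out[0] = (characteristics & IMAGE_SCN_MEM_READ)    ? 'R' : '-';
    out[1] = (characteristics & IMAGE_SCN_MEM_WRITE)   ? 'W' : '-';
    out[2] = (characteristics & IMAGE_SCN_MEM_EXECUTE) ? 'X' : '-';
    out[3] = '\0';
}

/* Nonzero when the section holds uninitialized data (.bss style). */
static int section_is_bss(uint32_t characteristics) {
    return (characteristics & IMAGE_SCN_CNT_UNINITIALIZED_DATA) != 0;
}
```

A loader could use a mapping like this to decide which memory protection to request for each section after copying it.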

The Symbol Table

The symbol table is an array of 18-byte entries located at the file offset specified by PointerToSymbolTable. Each entry is defined as coff_sym_t:

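// Needs 1-byte packing (e.g., #pragma pack(push, 1)); otherwise the compiler pads sizeof(coff_sym_t) from 18 to 20
typedef struct coff_sym {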
    union {
        char     Name[8];      // short name (if <= 8 chars)
        uint32_t value[2];     // value[0]==0 means value[1] is string table offset
    } first;
    uint32_t Value;            // value depends on StorageClass and SectionNumber
    uint16_t SectionNumber;    // 1-based index of the section, or special values
    uint16_t Type;             // symbol type (0x20 = function)
    uint8_t  StorageClass;     // IMAGE_SYM_CLASS_EXTERNAL (2), STATIC (3), etc.
    uint8_t  NumberOfAuxSymbols; // number of auxiliary symbol entries that follow
} coff_sym_t;

Symbol Name Resolution

Symbol names can be stored in two ways, depending on length:

If the name is 8 characters or shorter:
  first.Name[0..7] contains the name directly (null-padded)

If the name is longer than 8 characters:
  first.value[0] == 0x00000000   (sentinel: first 4 bytes are zero)
  first.value[1] == offset into string table

Example: Symbol name "__imp_KERNEL32$GetCurrentProcessId"
  first.value[0] = 0x00000000
  first.value[1] = 0x0000004A  --> string table offset 0x4A
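
The two storage forms above can be handled by a single lookup routine. The following is a minimal sketch operating on raw bytes; symbol_name and rd32 are hypothetical names, and the offset validation is an illustrative safety check and not a claim about COFFLoader's behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: little-endian 32-bit read. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* sym points at an 18-byte symbol entry; strtab points at the start of
 * the string table (its size field). Short names are copied into
 * short_buf because an 8-character name has no terminator in the file.
 * Returns NULL if a long-name offset is invalid. */
static const char *symbol_name(const uint8_t *sym, const uint8_t *strtab,
                               uint32_t strtab_size, char short_buf[9]) {
    assert(sym != NULL && strtab != NULL && short_buf != NULL);
    if (rd32(sym) != 0) {                 /* first 4 bytes nonzero: inline name */
        memcpy(short_buf, sym, 8);
        short_buf[8] = '\0';
        return short_buf;
    }
    uint32_t off = rd32(sym + 4);         /* offset from the start of the table */
    if (off < 4 || off >= strtab_size) return NULL;
    if (memchr(strtab + off, 0, strtab_size - off) == NULL) return NULL;
    return (const char *)(strtab + off);
}
```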

Important Symbol Fields

Field               Key Values    Meaning
SectionNumber       1, 2, 3...    1-based index of the section containing this symbol
SectionNumber       0             IMAGE_SYM_UNDEFINED -- external symbol, must be resolved
StorageClass        2 (EXTERNAL)  Symbol is globally visible or needs to be imported
StorageClass        3 (STATIC)    Symbol is local to the section (e.g., section name)
Value               (offset)      For defined symbols: offset within the section. For undefined: 0
NumberOfAuxSymbols  0 or 1        Auxiliary entries follow (e.g., section definition aux records)

How COFFLoader Classifies Symbols

COFFLoader uses two helper functions to classify symbols. A symbol is defined if its SectionNumber is greater than 0 (it exists in a section). A symbol is external if its StorageClass is IMAGE_SYM_CLASS_EXTERNAL (2). An external symbol with SectionNumber == 0 is an unresolved import that must be linked at load time.

// From COFFLoader -- symbol classification helpers
int coff_symbol_is_defined(coff_sym_t* sym) {
    return (sym->SectionNumber > 0);
}

int coff_symbol_is_external(coff_sym_t* sym) {
    return (sym->StorageClass == IMAGE_SYM_CLASS_EXTERNAL);  // StorageClass == 2
}
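
One practical detail when applying these checks across the whole table: auxiliary records occupy symbol slots but are not symbols, so a walker has to skip them using NumberOfAuxSymbols. The sketch below counts unresolved imports that way. count_unresolved_imports and rd16 are hypothetical names, and the byte offsets (12 for SectionNumber, 16 for StorageClass, 17 for NumberOfAuxSymbols) follow from the 18-byte layout shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: little-endian 16-bit read. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Walks num_syms 18-byte entries and counts symbols that are external
 * (StorageClass == 2) and undefined (SectionNumber == 0). */
static uint32_t count_unresolved_imports(const uint8_t *symtab, uint32_t num_syms) {
    assert(symtab != NULL);
    uint32_t count = 0;
    for (uint32_t i = 0; i < num_syms; i++) {
        const uint8_t *s = symtab + (size_t)i * 18;
        uint16_t section_number = rd16(s + 12);
        uint8_t storage_class = s[16];
        uint8_t aux_count = s[17];
        if (section_number == 0 && storage_class == 2) count++;
        i += aux_count;   /* skip aux records so they are never misread */
    }
    return count;
}
```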

The String Table

The string table immediately follows the symbol table. Its structure is simple:

String Table Layout:
  Offset 0: uint32_t Size;          // total size of string table (including this field)
  Offset 4: char[] strings;         // null-terminated strings packed sequentially

Example (size field 0x31 = 49 bytes, assuming the table holds only the three strings listed below):
  31 00 00 00  2E 74 65 78  74 00 5F 67  6F 00 5F 5F   1....text._go.__
  69 6D 70 5F  4B 45 52 4E  45 4C 33 32  24 47 65 74   imp_KERNEL32$Get
  ...

Symbols reference strings by offset from the START of the string table.
  value[1] = 4  --> ".text"
  value[1] = 10 --> "_go"
  value[1] = 14 --> "__imp_KERNEL32$GetCurrentProcessId"

The first 4 bytes are the size field itself, so valid string offsets start at 4. If the string table only contains the size field (size == 4), there are no long symbol names.

Relocation Entries

Each section can have its own relocation table. The relocation entries tell the loader which bytes in the section need to be patched once the final addresses of symbols are known. Each entry is 10 bytes:

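// Needs 1-byte packing (e.g., #pragma pack(push, 1)); otherwise the compiler pads sizeof(coff_reloc_t) from 10 to 12
typedef struct coff_reloc {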
    uint32_t VirtualAddress;      // offset within the section to patch
    uint32_t SymbolTableIndex;    // index into the symbol table
    uint16_t Type;                // relocation type (architecture-specific)
} coff_reloc_t;

Field             Size  Purpose
VirtualAddress    4     Byte offset within the section where the fixup must be applied
SymbolTableIndex  4     Index into the symbol table identifying the target symbol
Type              2     How to compute the fixup value (architecture-dependent)
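
A section's relocations form an array of 10-byte entries starting at PointerToRelocations, so entry i lives at PointerToRelocations + i * 10. The sketch below decodes one entry from the raw file; reloc_view_t and read_reloc are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: little-endian reads. */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical decoded form of one relocation entry. */
typedef struct {
    uint32_t virtual_address;
    uint32_t symbol_table_index;
    uint16_t type;
} reloc_view_t;

/* Decodes the 0-based index-th relocation of a section.
 * Returns 0 on success, -1 if the entry is out of range. */
static int read_reloc(const uint8_t *file, size_t len, uint32_t ptr_to_relocs,
                      uint16_t num_relocs, uint16_t index, reloc_view_t *out) {
    assert(file != NULL && out != NULL);
    uint64_t off = (uint64_t)ptr_to_relocs + (uint64_t)index * 10u;
    if (index >= num_relocs || off + 10 > len) return -1;
    const uint8_t *p = file + off;
    out->virtual_address    = rd32(p + 0);
    out->symbol_table_index = rd32(p + 4);
    out->type               = rd16(p + 8);
    return 0;
}
```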

AMD64 Relocation Types (from COFFLoader.h)

#define IMAGE_REL_AMD64_ADDR64    0x0001  // 64-bit absolute address
#define IMAGE_REL_AMD64_ADDR32NB  0x0003  // 32-bit address without image base (RVA)
#define IMAGE_REL_AMD64_REL32     0x0004  // 32-bit relative (RIP-relative)
#define IMAGE_REL_AMD64_REL32_1   0x0005  // REL32 + 1 byte displacement
#define IMAGE_REL_AMD64_REL32_2   0x0006  // REL32 + 2 byte displacement
#define IMAGE_REL_AMD64_REL32_3   0x0007  // REL32 + 3 bytes displacement
#define IMAGE_REL_AMD64_REL32_4   0x0008  // REL32 + 4 bytes displacement
#define IMAGE_REL_AMD64_REL32_5   0x0009  // REL32 + 5 bytes displacement

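The REL32 type is the most common in x64 BOFs. The value written at the fixup is target_address - (fixup_address + 4); the + 4 accounts for the 4-byte field itself, because RIP already points past it when the instruction executes. The variants REL32_1 through REL32_5 cover instructions in which 1 to 5 further bytes (typically an immediate operand) follow the relocated field, so for REL32_n the formula becomes target_address - (fixup_address + 4 + n).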
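
The REL32 arithmetic can be sketched in a few lines. This is an illustrative example and not COFFLoader's code: apply_rel32 is a hypothetical function, section_base stands for the address where the section was loaded, and the sketch assumes the usual COFF convention that any value already stored at the fixup location is an addend to be included in the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMAGE_REL_AMD64_REL32   0x0004
#define IMAGE_REL_AMD64_REL32_5 0x0009

/* Patches the 4 bytes at section[fixup_offset] so that they hold
 * target - (fixup_address + 4 + n) + addend, where n is 0 for REL32 and
 * 1..5 for REL32_1..REL32_5. Returns 0 on success, -1 on an unsupported
 * type, an out-of-bounds fixup, or a result that does not fit in 32 bits. */
static int apply_rel32(uint8_t *section, size_t section_len, uint64_t section_base,
                       uint32_t fixup_offset, uint64_t target, uint16_t type) {
    assert(section != NULL);
    if (type < IMAGE_REL_AMD64_REL32 || type > IMAGE_REL_AMD64_REL32_5) return -1;
    if ((uint64_t)fixup_offset + 4 > section_len) return -1;

    uint8_t *p = section + fixup_offset;
    uint32_t raw = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t addend = (int32_t)raw;                    /* assumed in-place addend */
    int64_t extra = type - IMAGE_REL_AMD64_REL32;     /* n in REL32_n */
    int64_t next_ip = (int64_t)(section_base + fixup_offset) + 4 + extra;
    int64_t delta = (int64_t)target - next_ip + addend;
    if (delta < INT32_MIN || delta > INT32_MAX) return -1;

    uint32_t v = (uint32_t)(int32_t)delta;            /* store little-endian */
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
    return 0;
}
```

The range check matters in practice: a 32-bit relative fixup can only reach targets within about 2 GB of the patched instruction.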

Putting It All Together

When the compiler generates a call to BeaconPrintf, it emits a CALL instruction with a placeholder 32-bit offset, a symbol table entry for __imp_BeaconPrintf (or the architecture-specific variant), and a relocation entry pointing from the CALL instruction to the symbol. At load time, COFFLoader resolves the symbol to an actual memory address and patches the CALL instruction's offset to reach that address.

Visualizing a Real BOF

Here is what a minimal BOF looks like at the binary level after compilation:

Source: void go(char* a, int l) { BeaconPrintf(0, "hello"); }

After compilation (x86_64-w64-mingw32-gcc -c):

COFF Header:       Machine=0x8664, Sections=4, Symbols=12
Section 1: .text   Size=0x2A, 2 relocations (the go() code)
Section 2: .data   Size=0x00, 0 relocations (empty)
Section 3: .rdata  Size=0x06, 0 relocations ("hello\0")
Section 4: .xdata  Size=0x08, 0 relocations (unwind info)

Symbol Table:
  [0] .text     Section=1, Class=STATIC, Value=0
  [2] .data     Section=2, Class=STATIC, Value=0
  [4] .rdata    Section=3, Class=STATIC, Value=0
  [6] go        Section=1, Class=EXTERNAL, Value=0  <-- entry point
  [7] __imp_BeaconPrintf  Section=0, Class=EXTERNAL  <-- unresolved import

Relocations for .text:
  Offset=0x0F, Symbol=4 (.rdata), Type=REL32   <-- reference to "hello" string
  Offset=0x1A, Symbol=7 (__imp_BeaconPrintf), Type=REL32  <-- call to BeaconPrintf

Pop Quiz: COFF Format

Q1: How does COFFLoader find the string table in a COFF file?

The string table immediately follows the symbol table. Since each symbol entry is exactly 18 bytes, the string table offset is PointerToSymbolTable + (NumberOfSymbols * sizeof(coff_sym_t)). The first 4 bytes of the string table give its total size.

Q2: A symbol has SectionNumber=0 and StorageClass=2. What does this mean?

StorageClass 2 is IMAGE_SYM_CLASS_EXTERNAL. SectionNumber 0 means IMAGE_SYM_UNDEFINED -- the symbol is not defined in any section of this object file. Combined, this means it is an external import that the loader must resolve (e.g., a DLL function or Beacon API call).

Q3: How are symbol names longer than 8 characters stored in the COFF symbol table?

In the coff_sym_t union, if first.value[0] is zero, then first.value[1] is an offset into the string table where the full null-terminated name is stored. This is the standard COFF convention for long symbol names, and COFFLoader uses this to look up names like "__imp_KERNEL32$GetProcAddress".