Module 2: COFF File Format Deep Dive
Every byte matters: headers, sections, symbols, strings, and relocations in raw binary.
Why This Module?
COFFLoader is fundamentally a COFF parser. To understand how it loads and executes BOFs, you must first understand the binary layout it parses. This module walks through the five major components of a COFF object file: the file header, section table, symbol table, string table, and relocation entries. These are the exact structures defined in COFFLoader.h that the loader reads.
COFF Binary Layout
A COFF object file has a well-defined binary layout. Unlike a PE, there is no DOS header, no PE signature, and no optional header. The file begins immediately with the COFF file header.
COFF Object File Layout:
Offset 0x00: +--------------------------+
| COFF File Header | 20 bytes
| (coff_file_header_t) |
+--------------------------+
| Section Header 1 | 40 bytes each
| (coff_sect_t) |
+--------------------------+
| Section Header 2 |
+--------------------------+
| ... |
+--------------------------+
| Section Header N |
+--------------------------+
| Section 1 Raw Data | variable size
+--------------------------+
| Section 1 Relocations | 10 bytes each
+--------------------------+
| Section 2 Raw Data |
+--------------------------+
| Section 2 Relocations |
+--------------------------+
| ... |
+--------------------------+
| Symbol Table | 18 bytes per symbol
+--------------------------+
| String Table | variable size
+--------------------------+
The COFF File Header
The file header is always 20 bytes and sits at offset 0. COFFLoader defines it as coff_file_header_t:
typedef struct coff_file_header {
    uint16_t Machine;              // 0x8664 = AMD64, 0x14C = i386
    uint16_t NumberOfSections;     // how many section headers follow
    uint32_t TimeDateStamp;        // compilation timestamp (often zero)
    uint32_t PointerToSymbolTable; // file offset to the symbol table
    uint32_t NumberOfSymbols;      // total entries in symbol table
    uint16_t SizeOfOptionalHeader; // always 0 for object files
    uint16_t Characteristics;      // flags (usually 0 for .obj)
} coff_file_header_t;
| Field | Offset | Size | Purpose |
|---|---|---|---|
| Machine | 0x00 | 2 | Target architecture. COFFLoader checks for 0x8664 (AMD64) |
| NumberOfSections | 0x02 | 2 | Count of section headers immediately following this header |
| TimeDateStamp | 0x04 | 4 | Unix timestamp of compilation. Not used by the loader |
| PointerToSymbolTable | 0x08 | 4 | File offset to the symbol table. Critical for symbol resolution |
| NumberOfSymbols | 0x0C | 4 | Number of entries (including aux symbols). Used to locate the string table |
| SizeOfOptionalHeader | 0x10 | 2 | Always 0 for object files (no optional header) |
| Characteristics | 0x12 | 2 | Flags. Usually 0 for unlinked objects |
Key Insight: Finding the String Table
The string table immediately follows the symbol table. Since each symbol entry is exactly 18 bytes, the string table starts at: PointerToSymbolTable + (NumberOfSymbols * 18). The first 4 bytes of the string table are a uint32_t giving the total size of the string table (including those 4 bytes). Symbol names longer than 8 characters are stored here and referenced by offset.
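The offset arithmetic above can be sketched in C. This is a minimal illustration rather than COFFLoader's own code: the helper name coff_string_table_offset and the COFF_SYMBOL_SIZE constant are hypothetical, and the struct simply mirrors the header layout shown earlier.

```c
#include <assert.h>
#include <stdint.h>

#define COFF_SYMBOL_SIZE 18u /* one symbol table entry, per the layout above */

typedef struct coff_file_header {
    uint16_t Machine;
    uint16_t NumberOfSections;
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable;
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader;
    uint16_t Characteristics;
} coff_file_header_t;

/* Every field is naturally aligned, so the struct should be 20 bytes unpacked. */
static_assert(sizeof(coff_file_header_t) == 20, "COFF file header must be 20 bytes");

/* Hypothetical helper: file offset of the string table.
 * Computed in 64 bits so a malformed NumberOfSymbols cannot wrap around. */
static uint64_t coff_string_table_offset(const coff_file_header_t *hdr)
{
    return (uint64_t)hdr->PointerToSymbolTable +
           (uint64_t)hdr->NumberOfSymbols * COFF_SYMBOL_SIZE;
}
```

A caller would still need to confirm that the returned offset, plus the 4-byte size field, lies inside the file buffer before reading from it.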
The Section Table
Immediately following the 20-byte file header is an array of section headers. Each header is 40 bytes, defined as coff_sect_t:
#pragma pack(push, 1)
typedef struct coff_sect {
    char     Name[8];              // section name (e.g., ".text\0\0\0")
    uint32_t VirtualSize;          // 0 for object files
    uint32_t VirtualAddress;       // 0 for object files
    uint32_t SizeOfRawData;        // size of section data in the file
    uint32_t PointerToRawData;     // file offset to the raw data
    uint32_t PointerToRelocations; // file offset to relocation entries
    uint32_t PointerToLineNumbers; // file offset to line numbers (usually 0)
    uint16_t NumberOfRelocations;  // count of relocation entries for this section
    uint16_t NumberOfLinenumbers;  // count of line number entries (usually 0)
    uint32_t Characteristics;      // flags: executable, readable, writable, etc.
} coff_sect_t;
#pragma pack(pop)
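Because the headers are fixed-size and start right after the file header, section header i (0-based) sits at file offset 20 + i * 40. The sketch below reads one header with a bounds check and extracts its name. The helper names coff_read_section and coff_section_name are hypothetical, and the code assumes a little-endian host, matching the on-disk byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COFF_FILE_HEADER_SIZE 20u

#pragma pack(push, 1)
typedef struct coff_sect {
    char     Name[8];
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
    uint32_t PointerToRelocations;
    uint32_t PointerToLineNumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} coff_sect_t;
#pragma pack(pop)

static_assert(sizeof(coff_sect_t) == 40, "section header must be 40 bytes");

/* Hypothetical helper: copy the index-th section header (0-based) into *out.
 * Returns 1 on success, 0 if the index or the file length is out of range. */
static int coff_read_section(const uint8_t *file, size_t file_len,
                             uint16_t nsections, uint16_t index, coff_sect_t *out)
{
    size_t off = COFF_FILE_HEADER_SIZE + (size_t)index * sizeof(coff_sect_t);
    if (index >= nsections || off + sizeof(coff_sect_t) > file_len)
        return 0;
    memcpy(out, file + off, sizeof *out); /* memcpy avoids unaligned/aliased access */
    return 1;
}

/* Name[8] has no terminator when the name is exactly 8 characters long,
 * so copy it into a 9-byte buffer and terminate it explicitly. */
static void coff_section_name(const coff_sect_t *sect, char out[9])
{
    memcpy(out, sect->Name, 8);
    out[8] = '\0';
}
```

Copying the header out of the buffer, instead of casting a pointer into it, keeps the example independent of how the file buffer is aligned.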
Common Sections in a BOF
| Section | Characteristics | Content |
|---|---|---|
| .text | CODE, EXECUTE, READ | Compiled machine code (the go() function and helpers) |
| .data | INITIALIZED, READ, WRITE | Initialized global/static variables |
| .rdata | INITIALIZED, READ | Read-only data: string literals, constant tables |
| .bss | UNINITIALIZED, READ, WRITE | Zero-initialized globals. Occupies no raw data in the file |
| .xdata | INITIALIZED, READ | Exception handling unwind data (x64) |
| .pdata | INITIALIZED, READ | Function table for structured exception handling |
Section Characteristics Flags
The Characteristics field is a bitmask. COFFLoader defines the relevant flags:
#define IMAGE_SCN_CNT_CODE 0x00000020 // section contains code
#define IMAGE_SCN_CNT_UNINITIALIZED_DATA 0x00000080 // section contains uninitialized data (.bss)
#define IMAGE_SCN_MEM_EXECUTE 0x20000000 // section is executable
#define IMAGE_SCN_MEM_READ 0x40000000 // section is readable
#define IMAGE_SCN_MEM_WRITE 0x80000000 // section is writable
#define IMAGE_SCN_MEM_DISCARDABLE 0x02000000 // section can be discarded
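As a small illustration of testing the bitmask, the sketch below renders the three memory flags as an "RWX"-style string. The helper coff_section_perms is hypothetical and is not part of COFFLoader; a real loader would more likely map these bits to page-protection constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_READ    0x40000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u

/* Hypothetical helper: write "R", "W", "X" or "-" for each memory flag
 * into a caller-supplied 4-byte buffer (3 characters plus terminator). */
static void coff_section_perms(uint32_t characteristics, char out[4])
{
    assert(out != NULL);
    out[0] = (characteristics & IMAGE_SCN_MEM_READ)    ? 'R' : '-';
    out[1] = (characteristics & IMAGE_SCN_MEM_WRITE)   ? 'W' : '-';
    out[2] = (characteristics & IMAGE_SCN_MEM_EXECUTE) ? 'X' : '-';
    out[3] = '\0';
}
```

For example, a .text section with CODE, EXECUTE and READ set (0x60000020) would render as "R-X".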
The Symbol Table
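The symbol table is an array of 18-byte entries located at the file offset specified by PointerToSymbolTable. Each entry is defined as coff_sym_t, which must be byte-packed (for example with #pragma pack(push, 1)) because default alignment would pad it to 20 bytes: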
typedef struct coff_sym {
    union {
        char     Name[8];        // short name (if <= 8 chars)
        uint32_t value[2];       // value[0]==0 means value[1] is string table offset
    } first;
    uint32_t Value;              // value depends on StorageClass and SectionNumber
    uint16_t SectionNumber;      // 1-based index of the section, or special values
    uint16_t Type;               // symbol type (0x20 = function)
    uint8_t  StorageClass;       // IMAGE_SYM_CLASS_EXTERNAL (2), STATIC (3), etc.
    uint8_t  NumberOfAuxSymbols; // number of auxiliary symbol entries that follow
} coff_sym_t;
Symbol Name Resolution
Symbol names can be stored in two ways, depending on length:
If the name is 8 characters or shorter:
    first.Name[0..7] contains the name directly (null-padded, but not null-terminated when exactly 8 characters long)
If the name is longer than 8 characters:
    first.value[0] == 0x00000000 (sentinel: first 4 bytes are zero)
    first.value[1] == offset into string table
Example: Symbol name "__imp_KERNEL32$GetCurrentProcessId"
    first.value[0] = 0x00000000
    first.value[1] = 0x0000004A --> string table offset 0x4A
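The two cases can be handled by one small function. The following is a sketch under stated assumptions, not COFFLoader's implementation: coff_symbol_name is a hypothetical helper, strtab is assumed to point at the start of the string table (its size field), and bounds checking of the offset is omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct coff_sym {
    union {
        char     Name[8];
        uint32_t value[2];
    } first;
    uint32_t Value;
    uint16_t SectionNumber;
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} coff_sym_t;
#pragma pack(pop)

static_assert(sizeof(coff_sym_t) == 18, "symbol entry must be 18 bytes");

/* Hypothetical helper: return the symbol's name as a C string.
 * Long names point straight into the string table. Short names are copied
 * into the caller's 9-byte scratch buffer, because an 8-character name
 * fills Name[] completely and has no terminator of its own. */
static const char *coff_symbol_name(const coff_sym_t *sym, const char *strtab,
                                    char scratch[9])
{
    if (sym->first.value[0] == 0)
        return strtab + sym->first.value[1];
    memcpy(scratch, sym->first.Name, 8);
    scratch[8] = '\0';
    return scratch;
}
```

The scratch buffer keeps the function free of static state, so the returned pointer for a short name is only valid while the caller's buffer is.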
Important Symbol Fields
| Field | Key Values | Meaning |
|---|---|---|
| SectionNumber | 1, 2, 3... | 1-based index of the section containing this symbol |
| SectionNumber | 0 | IMAGE_SYM_UNDEFINED -- external symbol, must be resolved |
| StorageClass | 2 (EXTERNAL) | Symbol is globally visible or needs to be imported |
| StorageClass | 3 (STATIC) | Symbol is local to the section (e.g., section name) |
| Value | (offset) | For defined symbols: offset within the section. For undefined: 0 |
| NumberOfAuxSymbols | 0 or 1 | Auxiliary entries follow (e.g., section definition aux records) |
How COFFLoader Classifies Symbols
COFFLoader uses two helper functions to classify symbols. A symbol is defined if its SectionNumber is greater than 0 (it exists in a section). A symbol is external if its StorageClass is IMAGE_SYM_CLASS_EXTERNAL (2). An external symbol with SectionNumber == 0 is an unresolved import that must be linked at load time.
// From COFFLoader -- symbol classification helpers
int coff_symbol_is_defined(coff_sym_t* sym) {
    return (sym->SectionNumber > 0);
}
int coff_symbol_is_external(coff_sym_t* sym) {
    return (sym->StorageClass == IMAGE_SYM_CLASS_EXTERNAL); // StorageClass == 2
}
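To show how these two tests combine when walking the table, here is a minimal sketch that counts unresolved imports. The function coff_count_unresolved is hypothetical. The detail it illustrates is that auxiliary records occupy symbol slots but are not symbols, so the loop must step over NumberOfAuxSymbols entries after each real symbol.

```c
#include <assert.h>
#include <stdint.h>

#define IMAGE_SYM_CLASS_EXTERNAL 2

#pragma pack(push, 1)
typedef struct coff_sym {
    union {
        char     Name[8];
        uint32_t value[2];
    } first;
    uint32_t Value;
    uint16_t SectionNumber;
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} coff_sym_t;
#pragma pack(pop)

static_assert(sizeof(coff_sym_t) == 18, "symbol entry must be 18 bytes");

/* Hypothetical helper: count symbols that are external and undefined
 * (StorageClass == 2, SectionNumber == 0), skipping auxiliary records. */
static uint32_t coff_count_unresolved(const coff_sym_t *symtab, uint32_t nsyms)
{
    uint32_t count = 0;
    for (uint32_t i = 0; i < nsyms; i++) {
        const coff_sym_t *sym = &symtab[i];
        if (sym->StorageClass == IMAGE_SYM_CLASS_EXTERNAL && sym->SectionNumber == 0)
            count++;
        /* Aux records hold arbitrary data; never interpret them as symbols. */
        i += sym->NumberOfAuxSymbols;
    }
    return count;
}
```

Without the skip, the bytes of an auxiliary record could be misread as a symbol and, by coincidence, look like an import.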
The String Table
The string table immediately follows the symbol table. Its structure is simple:
String Table Layout:
    Offset 0: uint32_t Size;   // total size of string table (including this field)
    Offset 4: char[] strings;  // null-terminated strings packed sequentially
Example (illustrative: in a real object file, names of 8 characters or fewer stay inline in the symbol entry):
    31 00 00 00 2E 74 65 78 74 00 5F 67 6F 00 5F 5F   1....text._go.__
    69 6D 70 5F 4B 45 52 4E 45 4C 33 32 24 47 65 74   imp_KERNEL32$Get
    ...
The Size field is 0x31 (49) here, assuming the table holds only these three strings: 4 + 6 + 4 + 35 bytes.
Symbols reference strings by offset from the START of the string table.
    value[1] = 4 --> ".text"
    value[1] = 10 --> "_go"
    value[1] = 14 --> "__imp_KERNEL32$GetCurrentProcessId"
The first 4 bytes are the size field itself, so valid string offsets start at 4. If the string table only contains the size field (size == 4), there are no long symbol names.
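Those rules translate into a bounds-checked lookup. The sketch below is one possible shape for it, not COFFLoader's code: coff_strtab_get is a hypothetical helper, avail is assumed to be the number of bytes left in the file buffer from the start of the string table, and a little-endian host is assumed when reading the size field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the string at `offset` in the string table,
 * or NULL if the table or the offset is invalid, or if the string is not
 * terminated before the end of the table. */
static const char *coff_strtab_get(const uint8_t *strtab, size_t avail, uint32_t offset)
{
    uint32_t size;

    assert(strtab != NULL);
    if (avail < sizeof size)
        return NULL;
    memcpy(&size, strtab, sizeof size); /* size includes its own 4 bytes */
    if (size < 4 || size > avail)
        return NULL;
    if (offset < 4 || offset >= size)   /* offsets 0..3 are the size field */
        return NULL;
    if (memchr(strtab + offset, '\0', size - offset) == NULL)
        return NULL;
    return (const char *)(strtab + offset);
}
```

A table whose size field is exactly 4 rejects every offset, which matches the "no long symbol names" case described above.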
Relocation Entries
Each section can have its own relocation table. The relocation entries tell the loader which bytes in the section need to be patched once the final addresses of symbols are known. Each entry is 10 bytes:
typedef struct coff_reloc {
    uint32_t VirtualAddress;   // offset within the section to patch
    uint32_t SymbolTableIndex; // index into the symbol table
    uint16_t Type;             // relocation type (architecture-specific)
} coff_reloc_t;
| Field | Size | Purpose |
|---|---|---|
| VirtualAddress | 4 | Byte offset within the section where the fixup must be applied |
| SymbolTableIndex | 4 | Index into the symbol table identifying the target symbol |
| Type | 2 | How to compute the fixup value (architecture-dependent) |
AMD64 Relocation Types (from COFFLoader.h)
#define IMAGE_REL_AMD64_ADDR64 0x0001 // 64-bit absolute address
#define IMAGE_REL_AMD64_ADDR32NB 0x0003 // 32-bit address without image base (RVA)
#define IMAGE_REL_AMD64_REL32 0x0004 // 32-bit relative (RIP-relative)
#define IMAGE_REL_AMD64_REL32_1 0x0005 // REL32 + 1 byte displacement
#define IMAGE_REL_AMD64_REL32_2 0x0006 // REL32 + 2 byte displacement
#define IMAGE_REL_AMD64_REL32_3 0x0007 // REL32 + 3 bytes displacement
#define IMAGE_REL_AMD64_REL32_4 0x0008 // REL32 + 4 bytes displacement
#define IMAGE_REL_AMD64_REL32_5 0x0009 // REL32 + 5 bytes displacement
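The REL32 type is the most common in x64 BOFs. It computes: target_address - (fixup_address + 4), that is, the distance from the end of the 4-byte relocated field to the target. The variants REL32_1 through REL32_5 subtract a further 1 to 5 bytes, which accounts for instruction encodings where the relocated field is not the last part of the instruction and that many bytes (typically an immediate operand) follow it.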
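The REL32 family can be sketched as a single patch function. This is an illustration under assumptions, not COFFLoader's implementation: coff_apply_rel32 is a hypothetical helper, the runtime address of the fixup is passed separately from the pointer to its bytes so the arithmetic can be exercised in isolation, and the sketch also adds the 32-bit value already stored at the fixup location (the addend), which the formula above leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_REL_AMD64_REL32   0x0004
#define IMAGE_REL_AMD64_REL32_5 0x0009

/* Hypothetical helper: patch the 4 bytes at `fixup` with the signed distance
 * from the end of the relocated field (plus 0..5 trailing bytes) to the target.
 * Returns 0 on success, -1 for an unsupported type or a distance that does
 * not fit in 32 bits. */
static int coff_apply_rel32(uint8_t *fixup, uint64_t fixup_addr,
                            uint64_t target_addr, uint16_t type)
{
    int32_t addend, value;
    uint64_t extra;
    int64_t delta;

    assert(fixup != NULL);
    if (type < IMAGE_REL_AMD64_REL32 || type > IMAGE_REL_AMD64_REL32_5)
        return -1;

    extra = (uint64_t)(type - IMAGE_REL_AMD64_REL32); /* 0 for REL32, n for REL32_n */
    memcpy(&addend, fixup, sizeof addend);            /* value the compiler left in place */
    delta = (int64_t)(target_addr - (fixup_addr + 4 + extra)) + addend;
    if (delta < INT32_MIN || delta > INT32_MAX)
        return -1;                                    /* target is more than 2 GiB away */

    value = (int32_t)delta;
    memcpy(fixup, &value, sizeof value);
    return 0;
}
```

The range check matters in practice: a 32-bit relative field cannot reach a target more than 2 GiB away, so a loader has to place whatever the relocation points at close enough to the section being patched.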
Putting It All Together
When the compiler generates a call to BeaconPrintf, it emits three things: a CALL instruction with a placeholder 32-bit offset, a symbol table entry for __imp_BeaconPrintf (or the architecture-specific variant), and a relocation entry that ties the CALL's offset field to that symbol. At load time, COFFLoader resolves the symbol to an actual memory address and patches the offset so the CALL reaches it. Because an __imp_ symbol names a pointer to the function rather than the function itself, the CALL is typically indirect, and the patched offset points at a slot holding the resolved address.
Visualizing a Real BOF
Here is what a minimal BOF looks like at the binary level after compilation:
Source: void go(char* a, int l) { BeaconPrintf(0, "hello"); }
After compilation (x86_64-w64-mingw32-gcc -c):
COFF Header: Machine=0x8664, Sections=4, Symbols=12
    Section 1: .text  Size=0x2A, 2 relocations (the go() code)
    Section 2: .data  Size=0x00, 0 relocations (empty)
    Section 3: .rdata Size=0x06, 0 relocations ("hello\0")
    Section 4: .xdata Size=0x08, 0 relocations (unwind info)
Symbol Table (skipped indices are auxiliary records; not all 12 entries are shown):
    [0] .text              Section=1, Class=STATIC, Value=0
    [2] .data              Section=2, Class=STATIC, Value=0
    [4] .rdata             Section=3, Class=STATIC, Value=0
    [6] go                 Section=1, Class=EXTERNAL, Value=0 <-- entry point
    [7] __imp_BeaconPrintf Section=0, Class=EXTERNAL <-- unresolved import
Relocations for .text:
    Offset=0x0F, Symbol=4 (.rdata), Type=REL32 <-- reference to "hello" string
    Offset=0x1A, Symbol=7 (__imp_BeaconPrintf), Type=REL32 <-- call to BeaconPrintf
Pop Quiz: COFF Format
Q1: How does COFFLoader find the string table in a COFF file?
Q2: A symbol has SectionNumber=0 and StorageClass=2. What does this mean?
Q3: How are symbol names longer than 8 characters stored in the COFF symbol table?