Module 7: The symbol<T> Template
Position-independent string access — how shellcode references data without absolute addresses.
Important Distinction
The symbol<T> template is for position-independent string access. It is not part of the API resolution system (which uses resolve::module, resolve::_api, and RESOLVE_IMPORT as covered in Module 6). symbol<T> solves a different problem: how does shellcode reference its own embedded string data when it doesn't know what address it was loaded at?
The String Problem
When you write a string literal in C/C++, the compiler places it in the .rdata (read-only data) section and generates code that references it using an absolute address. For a normal executable, this works fine — the loader maps the binary at its preferred base address, and everything lines up.
But shellcode has no loader. It gets injected at an arbitrary address, so those absolute references now point to garbage or to unmapped memory, and dereferencing them will most likely cause an access violation.
C++
// Normal code - compiler generates absolute address reference
const char* msg = "Hello";
// Compiled to something like: lea rax, [0x140003000]
// If shellcode is loaded at 0x200000 instead of 0x140000000... CRASH

// Shellcode needs: calculate the ACTUAL address of "Hello" at runtime
// regardless of where in memory the shellcode was loaded
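The failure mode can be reproduced in an ordinary program without any injection. The sketch below is only an illustration: Blob, build, resolve_relative, and absolute_survives_copy are hypothetical names, and the struct merely stands in for a shellcode blob that is copied byte-for-byte to a new address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a shellcode blob: embedded data plus two ways
// of referring to it.
struct Blob {
    char        text[8];   // embedded string data
    const char* absolute;  // absolute pointer, fixed when the blob was built
    std::size_t offset;    // distance from the start of the blob to text
};

// Build the blob at its original location.
inline void build(Blob& b) {
    std::memcpy(b.text, "Hello", 6);
    b.absolute = b.text;                // only valid at this address
    b.offset   = offsetof(Blob, text);  // valid wherever the blob ends up
}

// Resolve the string relative to wherever the blob currently lives.
inline const char* resolve_relative(const Blob& b) {
    assert(b.offset < sizeof(Blob));
    return reinterpret_cast<const char*>(&b) + b.offset;
}

// Copy the blob byte-for-byte, as an injector would, and report whether the
// absolute pointer inside the copy refers to the copy's own data.
inline bool absolute_survives_copy(const Blob& original, Blob& copy) {
    std::memcpy(&copy, &original, sizeof(Blob));
    return copy.absolute == copy.text;
}
```

After a copy, the absolute pointer still refers to the original blob's memory, while resolve_relative follows the data to its new home. In this demo the stale pointer is still readable because the original object remains alive; in injected shellcode there is no original mapping to fall back on.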
The symbol<T> Implementation
Stardust's solution lives in common.h. The symbol<T> struct relies on a function called RipData(), whose own address serves as a known reference point. Because the code can obtain the address of RipData() through RIP-relative addressing, that address is correct at any load address. The distance between RipData() and the string data is fixed when the binary is built and never changes, regardless of where the shellcode is loaded.
C++
// common.h - symbol<T> (simplified)
template<typename T>
struct symbol {
    // 's' holds the compile-time distance from RipData to the string
    uintptr_t s;

    // RipData() returns its own runtime address
    // The address of RipData itself is the anchor point
    static auto RipData() -> uintptr_t {
        return (uintptr_t)&RipData;
    }

    // To get the runtime string address:
    // runtime_string_addr = RipData_runtime_addr - compile_time_distance
    // Which is: &RipData - s
    auto get() -> T {
        return (T)( RipData() - s );
    }
};
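To make the arithmetic concrete, here is a minimal sketch that compiles as part of an ordinary hosted program. The names demo_anchor, demo_symbol, from, and demo_symbol_round_trip are hypothetical and are not Stardust's. The important assumption is that the distance s is computed at startup here, standing in for the value that would already be fixed at build time in real shellcode.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical anchor standing in for RipData(): returns its own address.
inline std::uintptr_t demo_anchor() {
    return reinterpret_cast<std::uintptr_t>(&demo_anchor);
}

template <typename T>
struct demo_symbol {
    static_assert(std::is_pointer<T>::value, "T must be a pointer type");

    // Anchor address minus target address. Unsigned arithmetic wraps, so
    // this also works when the target lies above the anchor.
    std::uintptr_t s;

    // Assumption: computed at runtime here; real shellcode would have this
    // distance fixed when the binary is built.
    static demo_symbol from(T target) {
        return demo_symbol{ demo_anchor() - reinterpret_cast<std::uintptr_t>(target) };
    }

    // Runtime address of the target: anchor - s.
    T get() const {
        return reinterpret_cast<T>(demo_anchor() - s);
    }
};

// Usage: the pointer recovered through the anchor matches the original.
inline void demo_symbol_round_trip() {
    static const char message[] = "Hello";
    const demo_symbol<const char*> sym = demo_symbol<const char*>::from(message);
    assert(sym.get() == message);
}
```

In a hosted program the round trip is trivially an identity, because the anchor does not move between computing s and calling get(). The technique only becomes useful when s is fixed at build time and the whole image is later placed at a different base.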
Step-by-Step: The Math
Here's what happens when Stardust accesses a string via symbol<T>:
- At compile time: The linker calculates s = &RipData - &string_data. This is a fixed distance; it depends only on the relative layout of code and data in the binary, not on any absolute address.
- At runtime: The shellcode is loaded at some unknown base address. RipData() uses RIP-relative addressing to return its own actual address.
- The calculation: RipData() - s gives the actual runtime address of the string.
Compile-Time vs Runtime Address Translation
At Compile Time (base 0x1000): RipData at 0x1500, string at 0x1200
s = 0x1500 - 0x1200 = 0x300
At Runtime (loaded at 0x7000): RipData at 0x7500
RipData() - s = 0x7500 - 0x300 = 0x7200 ✔
The distance (0x300) stays constant — only the base address changes.
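The same numbers can be checked mechanically. The sketch below encodes the worked example as constant expressions; the helper names (kAnchorOffset, kStringOffset, link_time_distance, runtime_string_address, check_other_bases) are hypothetical, and the additional base 0x200000 is only an extra test value.

```cpp
#include <cassert>
#include <cstdint>

// Offsets inside the image, taken from the worked example.
constexpr std::uintptr_t kAnchorOffset = 0x500;  // RipData at base + 0x500
constexpr std::uintptr_t kStringOffset = 0x200;  // string at base + 0x200

// Build-time step: s = &RipData - &string_data at the assumed base.
constexpr std::uintptr_t link_time_distance(std::uintptr_t base) {
    return (base + kAnchorOffset) - (base + kStringOffset);
}

// Runtime step: RipData() - s at the actual load base.
constexpr std::uintptr_t runtime_string_address(std::uintptr_t load_base,
                                                std::uintptr_t s) {
    return (load_base + kAnchorOffset) - s;
}

constexpr std::uintptr_t kS = link_time_distance(0x1000);

static_assert(kS == 0x300, "distance matches the worked example");
static_assert(runtime_string_address(0x7000, kS) == 0x7200,
              "string resolves correctly at the new base");

// The result tracks the load base for any base, not just the two above.
inline void check_other_bases() {
    const std::uintptr_t bases[] = { 0x1000, 0x7000, 0x200000 };
    for (std::uintptr_t base : bases) {
        assert(runtime_string_address(base, kS) == base + kStringOffset);
    }
}
```

Note that the base cancels out of link_time_distance entirely, which is why s can be computed without knowing where the shellcode will eventually run.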
The G_SYM Macro
For convenience, Stardust provides a G_SYM macro that wraps common uses of symbol<T>. Instead of manually constructing a symbol<T> and calling .get(), you can use G_SYM as a shorthand to access global symbol data in a position-independent way.
C++
// Instead of manual symbol<T> usage:
auto str = symbol<const char*>{ offset_value }.get();

// G_SYM provides a cleaner interface:
auto str = G_SYM( my_string );
Comparison: AceLdr's OFFSET Macro
AceLdr solves the same problem with its OFFSET macro. Despite the different syntax (C macro vs C++ template), the underlying principle is identical:
Stardust: symbol<T>
- C++ template struct
- Uses RipData() function address as anchor
- Formula: &RipData - s
- Type-safe via template parameter
AceLdr: OFFSET Macro
- C preprocessor macro
- Uses a known code label as anchor
- Formula: known_addr - compile_time_distance
- Cast manually by the caller
Both approaches rely on the same fundamental insight: the relative distance between two points in the binary is fixed at compile time. If you can determine the runtime address of one point (via RIP-relative addressing), you can calculate the runtime address of any other point by subtracting the known distance.
Key Takeaway
symbol<T> exists because shellcode cannot use absolute addresses for data access. It translates compile-time-known relative offsets into runtime-correct pointers using a simple anchor-point calculation. This is critical for accessing any embedded strings or data structures within the shellcode blob.
Knowledge Check
Q1: Why can't shellcode use string literals like normal programs?
Q2: In the symbol<T> calculation, what value stays constant regardless of where the shellcode is loaded?