Module 1: Memory Scanner Evasion Goals
Why decrypting your entire payload is a death sentence — and how ShellGhost avoids it.
Module Objective
Understand the fundamental problem that ShellGhost solves: memory scanners can detect fully decrypted shellcode in process memory. This module explains why traditional decrypt-then-execute approaches fail against modern EDR memory scanning, introduces the concept of minimal decryption surface, and frames the threat model that drives the design of ShellGhost, a technique by lem0nSec.
1. The Shellcode Lifecycle in Memory
Every shellcode loader must eventually place raw executable machine code into memory and run it. Regardless of how cleverly the shellcode is encrypted on disk, there comes a moment when the decrypted bytes exist in a readable, executable memory region. This is the decryption window — and it is the single most exploitable moment in any loader's lifecycle.
[Figure: Traditional Loader: The Decryption Window. Stage captions: safe from static scan; full payload exposed; running decrypted; payload detected.]
In a traditional loader, the full shellcode is decrypted into a contiguous memory region before execution begins. From the moment decryption completes until the shellcode finishes running (which could be minutes, hours, or indefinitely for a C2 beacon), the entire payload sits in plaintext in memory. Any memory scan during this window will find it.
2. How Memory Scanners Work
Modern EDR solutions perform periodic and event-triggered scans of process memory. Understanding their capabilities is essential to appreciating why ShellGhost's approach matters.
| Scanner Type | Trigger | What It Examines | Detection Method |
|---|---|---|---|
| Periodic Scan | Timer-based (every N seconds) | All committed private memory pages | Signature matching, YARA rules |
| Event-Triggered | Suspicious API call detected | Memory around the suspicious allocation | Pattern matching, entropy analysis |
| ETW-Based | Allocation events via ETW providers | Newly allocated executable pages | Heuristic analysis of page contents |
| On-Demand | Analyst requests scan | Full process memory dump | Offline YARA, custom signatures |
The Fundamental Problem
If a memory scanner examines your process at any point after decryption, it will find the complete shellcode payload. Encryption only protects the payload before execution. Once decrypted, every byte of shellcode — including easily-signatured sequences like the Metasploit framework's cld; and rsp, 0xFFFFFFFFFFFFFFF0 prologue — is visible to any scanner.
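The scanner's side of this problem can be sketched in a few lines. The following is a toy model, not a real EDR engine: the "payload" is the five-byte prologue followed by harmless NOP filler, and the single-byte XOR in `xor_bytes` is an assumed stand-in for whatever cipher a loader would use. The names `scan` and `xor_bytes` are hypothetical.

```python
# Toy illustration of signature scanning; not a real EDR engine.
# Five-byte x86-64 encoding of `cld; and rsp, 0xFFFFFFFFFFFFFFF0`.
SIGNATURE = bytes([0xFC, 0x48, 0x83, 0xE4, 0xF0])


def xor_bytes(data: bytes, key: int) -> bytes:
    """Stand-in cipher (assumption): single-byte XOR, used only to hide bytes."""
    return bytes(b ^ key for b in data)


def scan(memory: bytes, signature: bytes = SIGNATURE) -> int:
    """Return the offset of the first signature match, or -1 if absent."""
    return memory.find(signature)


# Hypothetical "payload": the signature followed by harmless NOP filler.
plaintext = SIGNATURE + bytes([0x90] * 27)
encrypted = xor_bytes(plaintext, 0x5A)

print(scan(encrypted))   # -1: encrypted at rest, no match
print(scan(plaintext))   # 0: decrypted in memory, match at offset 0
```

The same byte search that fails on the encrypted buffer succeeds the moment the plaintext exists anywhere in scanned memory, which is the exposure the rest of this module is concerned with.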
3. Existing Approaches and Their Limits
Several techniques attempt to reduce the decryption window. Each has trade-offs:
Sleep Encryption (e.g., Ekko, Foliage)
Encrypt the shellcode region during sleep, decrypt before resuming. The payload is still fully decrypted during active execution. A scan during the active window catches everything. Also, the encrypt/decrypt transitions themselves create detectable patterns (VirtualProtect calls, timer objects).
Module Stomping / Phantom DLL Hollowing
Overwrite a legitimate DLL's .text section with shellcode. The memory appears backed by a legitimate file on disk, which avoids unbacked-memory heuristics. However, the shellcode content itself is still fully readable if scanned, and mismatches between the file on disk and the memory contents can be detected.
Page Guard / No-Access Tricks
Mark shellcode pages as PAGE_NOACCESS when not executing, flip to PAGE_EXECUTE_READ on access. Creates detectable VirtualProtect call patterns and still requires full decryption of the page being executed.
ShellGhost's Approach: Minimal Decryption Surface
ShellGhost takes a radically different approach. Instead of decrypting the entire payload and protecting it during sleep, ShellGhost never decrypts more than a single instruction at a time. The decrypted instruction exists in memory only for the duration of that instruction's execution. Before and after that instant, every byte of the shellcode region contains 0xCC (INT3 breakpoint opcodes) or encrypted data. A memory scan at any point will see breakpoints and, at most, the one instruction currently being executed.
4. The Minimal Decryption Surface Concept
The decryption surface is the number of shellcode bytes that exist in plaintext in memory at any given instant. Traditional loaders have a decryption surface equal to the entire shellcode size. ShellGhost reduces this to effectively one instruction (1–15 bytes).
| Technique | Decryption Surface | Exposure Duration |
|---|---|---|
| Traditional decrypt-then-execute | Entire payload (thousands of bytes) | Entire execution lifetime |
| Sleep encryption (Ekko-style) | Entire payload during active phase | Active execution periods |
| Page-level toggling | One memory page (4096 bytes) | Per-page execution time |
| ShellGhost | 1 instruction (1–15 bytes) | Single instruction execution |
By minimizing the decryption surface to a single instruction, ShellGhost ensures that at no point does a recognizable shellcode pattern exist in memory. Even if a scanner reads the shellcode region mid-execution, it sees a sea of 0xCC bytes with at most a few bytes of one instruction that look different — meaningless without the surrounding context.
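To make the comparison concrete, the decryption surface can be approximated by counting the bytes in a region that are not INT3 filler. This is a rough model under stated assumptions: the buffers are plain Python byte strings, the "fully decrypted payload" is NOP filler, and the helper names are hypothetical. A real instruction could itself contain a 0xCC byte, so the count is a proxy rather than an exact measure.

```python
INT3 = 0xCC


def plaintext_surface(region: bytes) -> int:
    """Count bytes that are not INT3 filler, a rough proxy for the decryption surface."""
    return sum(1 for b in region if b != INT3)


def region_with_one_instruction(size: int, offset: int, instr: bytes) -> bytes:
    """Model a region of INT3 filler with a single instruction revealed at `offset`."""
    region = bytearray([INT3] * size)
    region[offset:offset + len(instr)] = instr
    return bytes(region)


payload_size = 4096
full = bytes([0x90] * payload_size)  # hypothetical fully decrypted payload (NOP filler)
# 48 89 E5 encodes `mov rbp, rsp`, used here as an arbitrary 3-byte instruction.
single = region_with_one_instruction(payload_size, 128, bytes([0x48, 0x89, 0xE5]))

print(plaintext_surface(full))    # 4096
print(plaintext_surface(single))  # 3
```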
5. Threat Model
ShellGhost is designed to defeat a specific set of threats. Understanding what it protects against (and what it does not) is critical for realistic expectations.
What ShellGhost Defeats
- Memory signature scanning: YARA rules, pattern matching, and byte-sequence signatures cannot match against a region filled with 0xCC
- Periodic memory dumps: a dump at any point shows INT3 opcodes and at most one decrypted instruction in the shellcode region
- Post-mortem forensics: if the process is suspended for analysis, the shellcode region contains no meaningful content
- Entropy-based heuristics on payload: a region of identical 0xCC bytes has zero entropy
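The zero-entropy claim in the last item can be checked directly. The sketch below computes Shannon entropy in bits per byte; `os.urandom` is used as an assumed stand-in for encrypted data, which typically looks close to uniformly random.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1 / p) over every byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


int3_region = bytes([0xCC] * 4096)
random_like = os.urandom(4096)  # stand-in for encrypted data (assumption)

print(shannon_entropy(int3_region))  # 0.0
print(shannon_entropy(random_like))  # close to 8.0
```

A heuristic that flags high-entropy private memory sees nothing unusual in the 0xCC region, although a large run of identical INT3 bytes is itself a pattern a defender could choose to look for.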
What ShellGhost Does NOT Defeat
- Behavioral monitoring: the shellcode's actions (network connections, file operations) are still visible
- VEH registration detection: calling AddVectoredExceptionHandler is observable via API hooks or ETW
- Excessive exception monitoring: thousands of EXCEPTION_BREAKPOINT events per second is anomalous
- Instruction-level tracing: a debugger stepping through the VEH handler can observe each decrypted instruction
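The exception-rate weakness lends itself to a simple defender-side check. The sketch below is a minimal sliding-window counter over event timestamps; the function name, the threshold of 1000 events, and the one-second window are all assumptions chosen for illustration, and the telemetry is synthetic rather than taken from any real sensor.

```python
from collections import deque


def exceeds_breakpoint_rate(timestamps, threshold=1000, window=1.0):
    """Return True if more than `threshold` events fall within any `window` seconds.

    `timestamps` is an ascending iterable of event times in seconds.
    """
    recent = deque()
    for t in timestamps:
        recent.append(t)
        # Drop events that have fallen out of the sliding window.
        while t - recent[0] >= window:
            recent.popleft()
        if len(recent) > threshold:
            return True
    return False


# Synthetic telemetry: a manual debugging session vs. one breakpoint per instruction.
debug_session = [i * 0.5 for i in range(20)]     # 2 events per second
int3_storm = [i * 0.0001 for i in range(5000)]   # 10,000 events per second

print(exceeds_breakpoint_rate(debug_session))  # False
print(exceeds_breakpoint_rate(int3_storm))     # True
```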
6. The High-Level ShellGhost Flow
Before diving into implementation details in later modules, here is the conceptual overview of how ShellGhost operates:
[Figure: ShellGhost Execution Model. Stage captions: fill with 0xCC bytes; exception handler; entry at .text end; decrypt current instr; hits next 0xCC; cycle repeats.]
- Preprocessing: A Python script (ShellGhost_mapping.py) disassembles the shellcode, encrypts each instruction independently using RC4 via SystemFunction032, and generates C arrays of CRYPT_BYTES_QUOTA structs containing each instruction's RVA (offset) and byte count (quota).
- Allocation: A memory region is allocated with PAGE_READWRITE (RW). The region is filled entirely with 0xCC (INT3) bytes. The encrypted instruction data and mapping structs are compiled into the binary.
- VEH Registration: A Vectored Exception Handler is registered. This handler will intercept all breakpoint exceptions.
- Thread Creation: A new thread is created via CreateThread() with its entry point set to null bytes at the end of the .text segment (found by ResolveEndofTextSegment()). This avoids the IoC of a thread entry point in private memory. The first 0xCC triggers EXCEPTION_BREAKPOINT.
- Decrypt & Execute: The VEH handler catches each breakpoint, re-encrypts the previous instruction (if any), decrypts the current instruction using SystemFunction032, toggles the page to PAGE_EXECUTE_READ (RX) via VirtualProtect, and resumes execution. One EXCEPTION_BREAKPOINT per instruction.
- Next Instruction: After the decrypted instruction executes, the CPU hits the next 0xCC, triggering another EXCEPTION_BREAKPOINT, and the cycle repeats.
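The mapping described in the preprocessing step is, at its core, bookkeeping: each instruction gets an offset and a length. The sketch below models only that bookkeeping in Python and involves no disassembly, encryption, or execution. The class is a stand-in for the CRYPT_BYTES_QUOTA struct; the field names `rva` and `quota`, the helper names, and the instruction lengths are assumptions for illustration, not the actual layout used by ShellGhost.

```python
from typing import List, NamedTuple


class CryptBytesQuota(NamedTuple):
    """Python stand-in for one mapping entry: instruction offset and length."""
    rva: int
    quota: int


def build_mapping(instruction_lengths: List[int]) -> List[CryptBytesQuota]:
    """Assign each instruction its offset (running sum) and byte count."""
    mapping, offset = [], 0
    for length in instruction_lengths:
        if not 1 <= length <= 15:
            raise ValueError("x86 instructions are 1 to 15 bytes long")
        mapping.append(CryptBytesQuota(rva=offset, quota=length))
        offset += length
    return mapping


def lookup(mapping: List[CryptBytesQuota], rva: int) -> CryptBytesQuota:
    """Find the entry whose instruction starts at `rva` (where a breakpoint fired)."""
    for entry in mapping:
        if entry.rva == rva:
            return entry
    raise KeyError(f"no instruction starts at offset {rva}")


# Hypothetical lengths, as a disassembler might report them.
mapping = build_mapping([1, 4, 3, 5])
print(mapping[2])          # CryptBytesQuota(rva=5, quota=3)
print(lookup(mapping, 8))  # CryptBytesQuota(rva=8, quota=5)
```

Given the offset at which a breakpoint fired, a table like this tells a handler how many bytes belong to the instruction at that position; later modules cover how the real handler uses it.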
7. Why This Course Exists
ShellGhost combines several advanced Windows internals concepts: software breakpoints, vectored exception handling, CONTEXT structure manipulation, shellcode mapping preprocessing, per-instruction encryption via SystemFunction032, and RW/RX memory toggling. Each concept is well-documented individually, but their combination into a coherent evasion technique requires understanding how they interact.
Course Structure
| Module | Topic | Why It Matters |
|---|---|---|
| 1 (this) | Memory Scanner Evasion Goals | Understand the problem and threat model |
| 2 | Software Breakpoints & INT3 | The mechanism that triggers per-instruction handling |
| 3 | Vectored Exception Handling | The interception mechanism for breakpoint events |
| 4 | The ShellGhost Concept | How all pieces combine into the evasion technique |
| 5 | SystemFunction032 & Shellcode Mapping | Per-instruction encryption and the preprocessing pipeline |
| 6 | VEH Handler Implementation | The actual C code that makes it work |
| 7 | Background: Trap Flag & Single-Stepping | General x86 knowledge for context (not used by ShellGhost) |
| 8 | Full Chain & Detection | Complete flow, performance, and detection analysis |
Knowledge Check
Q1: What is the primary weakness of traditional decrypt-then-execute shellcode loaders?
Q2: What is the "decryption surface" of ShellGhost at any given instant?
Q3: Which of the following is NOT defeated by ShellGhost's technique?