Difficulty: Beginner

Module 1: Memory Scanner Evasion Goals

Why decrypting your entire payload is a death sentence — and how ShellGhost avoids it.

Module Objective

Understand the fundamental problem that ShellGhost solves: memory scanners can detect fully decrypted shellcode in process memory. This module explains why traditional decrypt-then-execute approaches fail against modern EDR memory scanning, introduces the concept of a minimal decryption surface, and frames the threat model that drives the design of ShellGhost, a tool written by lem0nSec.

1. The Shellcode Lifecycle in Memory

Every shellcode loader must eventually place raw executable machine code into memory and run it. Regardless of how cleverly the shellcode is encrypted on disk, there comes a moment when the decrypted bytes exist in a readable, executable memory region. This is the decryption window — and it is the single most exploitable moment in any loader's lifecycle.

Traditional Loader: The Decryption Window

Encrypted on Disk (safe from static scan) → Decrypt to Memory (full payload exposed) → Execute Shellcode (running decrypted) → Memory Scan (payload detected)

In a traditional loader, the full shellcode is decrypted into a contiguous memory region before execution begins. From the moment decryption completes until the shellcode finishes running (which could be minutes, hours, or indefinitely for a C2 beacon), the entire payload sits in plaintext in memory. Any memory scan during this window will find it.

2. How Memory Scanners Work

Modern EDR solutions perform periodic and event-triggered scans of process memory. Understanding their capabilities is essential to appreciating why ShellGhost's approach matters.

| Scanner Type | Trigger | What It Examines | Detection Method |
|---|---|---|---|
| Periodic Scan | Timer-based (every N seconds) | All committed private memory pages | Signature matching, YARA rules |
| Event-Triggered | Suspicious API call detected | Memory around the suspicious allocation | Pattern matching, entropy analysis |
| ETW-Based | Allocation events via ETW providers | Newly allocated executable pages | Heuristic analysis of page contents |
| On-Demand | Analyst requests scan | Full process memory dump | Offline YARA, custom signatures |

The Fundamental Problem

If a memory scanner examines your process at any point after decryption, it will find the complete shellcode payload. Encryption only protects the payload before execution. Once decrypted, every byte of shellcode — including easily-signatured sequences like the Metasploit framework's cld; and rsp, 0xFFFFFFFFFFFFFFF0 prologue — is visible to any scanner.
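Seen from the scanner's side, the problem above reduces to a byte search. The following is a minimal sketch of signature matching, assuming the scanner already holds a copy of the region as a Python `bytes` object. The function name `find_signature` and both sample buffers are hypothetical; a real EDR uses far richer rule engines such as YARA.

```python
# Encoding of the prologue named above: cld; and rsp, 0xFFFFFFFFFFFFFFF0
PROLOGUE_SIGNATURE = bytes.fromhex("fc4883e4f0")


def find_signature(region: bytes, signature: bytes = PROLOGUE_SIGNATURE) -> list[int]:
    """Return every offset in `region` where `signature` starts."""
    offsets = []
    start = region.find(signature)
    while start != -1:
        offsets.append(start)
        start = region.find(signature, start + 1)
    return offsets


# Hypothetical buffers standing in for a scanned memory region.
decrypted_region = b"\x00" * 16 + PROLOGUE_SIGNATURE + b"\x90" * 8
breakpoint_region = b"\xcc" * len(decrypted_region)

print(find_signature(decrypted_region))   # [16]
print(find_signature(breakpoint_region))  # []
```

The first buffer models a fully decrypted payload and is flagged at the offset where the prologue begins. The second models a region that holds only 0xCC bytes and gives the same rule nothing to match.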

3. Existing Approaches and Their Limits

Several techniques attempt to reduce the decryption window. Each has trade-offs:

Sleep Encryption (e.g., Ekko, Foliage)

Encrypt the shellcode region during sleep, decrypt before resuming. The payload is still fully decrypted during active execution. A scan during the active window catches everything. Also, the encrypt/decrypt transitions themselves create detectable patterns (VirtualProtect calls, timer objects).

Module Stomping / Phantom DLL Hollowing

Overwrite a legitimate DLL's .text section with shellcode. The memory appears backed by a legitimate file on disk, which avoids unbacked-memory heuristics. However, the shellcode content itself is still fully readable if scanned, and mismatches between the file on disk and the memory contents can be detected.

Page Guard / No-Access Tricks

Mark shellcode pages as PAGE_NOACCESS when not executing, and flip them to PAGE_EXECUTE_READ on access. This creates detectable VirtualProtect call patterns and still requires full decryption of the page being executed.

ShellGhost's Approach: Minimal Decryption Surface

ShellGhost takes a different approach. Instead of decrypting the entire payload and protecting it during sleep, it never decrypts more than a single instruction at a time. A decrypted instruction stays in memory only until the next breakpoint fires and the handler re-encrypts it. Every other byte of the shellcode region contains 0xCC (the INT3 breakpoint opcode) or encrypted data, so a memory scan at any point sees breakpoints plus, at most, one plaintext instruction.

4. The Minimal Decryption Surface Concept

The decryption surface is the number of shellcode bytes that exist in plaintext in memory at any given instant. Traditional loaders have a decryption surface equal to the entire shellcode size. ShellGhost reduces this to effectively one instruction (1–15 bytes).

| Technique | Decryption Surface | Exposure Duration |
|---|---|---|
| Traditional decrypt-then-execute | Entire payload (thousands of bytes) | Entire execution lifetime |
| Sleep encryption (Ekko-style) | Entire payload during active phase | Active execution periods |
| Page-level toggling | One memory page (4096 bytes) | Per-page execution time |
| ShellGhost | 1 instruction (1–15 bytes) | Single instruction execution |

By minimizing the decryption surface to a single instruction, ShellGhost ensures that at no point does a recognizable shellcode pattern exist in memory. Even if a scanner reads the shellcode region mid-execution, it sees a sea of 0xCC bytes with at most a few bytes of one instruction that look different — meaningless without the surrounding context.
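To put rough numbers on the comparison, the sketch below computes the fraction of a payload that is in plaintext at one instant for each technique. The 8192-byte payload size is an assumption chosen for illustration, and `exposed_fraction` is a hypothetical helper, not part of ShellGhost.

```python
MAX_X86_INSTRUCTION_BYTES = 15  # upper bound on one x86 instruction
PAGE_SIZE = 4096                # page size used in the table above


def exposed_fraction(surface_bytes: int, payload_bytes: int) -> float:
    """Fraction of the payload that is plaintext at one instant."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    # The surface can never exceed the payload itself.
    return min(surface_bytes, payload_bytes) / payload_bytes


payload_bytes = 8192  # assumed payload size, for illustration only

surfaces = {
    "Traditional decrypt-then-execute": payload_bytes,
    "Sleep encryption (active phase)": payload_bytes,
    "Page-level toggling": PAGE_SIZE,
    "ShellGhost (worst case)": MAX_X86_INSTRUCTION_BYTES,
}

for technique, surface in surfaces.items():
    fraction = exposed_fraction(surface, payload_bytes)
    print(f"{technique:<34} {surface:>5} bytes  {fraction:.2%}")
```

Under these assumptions, page-level toggling still exposes half of the payload at once, while the single-instruction worst case stays below a fifth of a percent.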

5. Threat Model

ShellGhost is designed to defeat a specific set of threats. Understanding what it protects against (and what it does not) is critical for realistic expectations.

What ShellGhost Defeats

What ShellGhost Does NOT Defeat

6. The High-Level ShellGhost Flow

Before diving into implementation details in later modules, here is the conceptual overview of how ShellGhost operates:

ShellGhost Execution Model

Alloc RW, map SC (fill with 0xCC bytes) → Register VEH (exception handler) → CreateThread (entry at .text end) → VEH: re-encrypt previous, decrypt current instruction → RW to RX, execute (hits next 0xCC) → Next BP fires (cycle repeats)

  1. Preprocessing: A Python script (ShellGhost_mapping.py) disassembles the shellcode, encrypts each instruction independently using RC4 via SystemFunction032, and generates C arrays of CRYPT_BYTES_QUOTA structs containing each instruction's RVA (offset) and byte count (quota).
  2. Allocation: A memory region is allocated with PAGE_READWRITE (RW). The region is filled entirely with 0xCC (INT3) bytes. The encrypted instruction data and mapping structs are compiled into the binary.
  3. VEH Registration: A Vectored Exception Handler is registered. This handler will intercept all breakpoint exceptions.
  4. Thread Creation: A new thread is created via CreateThread() with its entry point set to null bytes at the end of the .text segment (found by ResolveEndofTextSegment()). This avoids the IoC of a thread entry point in private memory. The first 0xCC triggers EXCEPTION_BREAKPOINT.
  5. Decrypt & Execute: The VEH handler catches each breakpoint, re-encrypts the previous instruction (if any), decrypts the current instruction using SystemFunction032, toggles the page to PAGE_EXECUTE_READ (RX) via VirtualProtect, and resumes execution. One EXCEPTION_BREAKPOINT per instruction.
  6. Next Instruction: After the decrypted instruction executes, the CPU hits the next 0xCC, triggering another EXCEPTION_BREAKPOINT, and the cycle repeats.
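The effect of this cycle on what a scanner can observe can be modeled on a plain byte buffer. The sketch below is a toy model only: nothing is executed, no Windows API is involved, a single-byte XOR (`toy_crypt`) stands in for the RC4 encryption the source describes, and control flow is assumed to be a straight line. The names `InstructionSlot`, `build_slots`, and `simulate` are hypothetical; the `offset` and `quota` fields loosely mirror the RVA and byte count described in step 1.

```python
from dataclasses import dataclass

INT3 = 0xCC
TOY_KEY = 0x5A  # assumption: XOR stand-in, NOT the RC4 cipher named in the source


@dataclass
class InstructionSlot:
    offset: int  # where the instruction starts inside the region
    quota: int   # how many bytes the instruction occupies


def build_slots(lengths: list[int]) -> list[InstructionSlot]:
    """Turn a list of instruction lengths into consecutive slots."""
    slots, offset = [], 0
    for length in lengths:
        slots.append(InstructionSlot(offset, length))
        offset += length
    return slots


def toy_crypt(data: bytes) -> bytes:
    """Symmetric toy cipher: applying it twice returns the input."""
    return bytes(b ^ TOY_KEY for b in data)


def simulate(plaintext: bytes, lengths: list[int]) -> list[int]:
    """Return the number of plaintext bytes a scan would find in the
    region after each modeled breakpoint has been handled."""
    if sum(lengths) != len(plaintext):
        raise ValueError("lengths must cover the plaintext exactly")
    slots = build_slots(lengths)
    encrypted = [toy_crypt(plaintext[s.offset:s.offset + s.quota]) for s in slots]
    region = bytearray([INT3] * len(plaintext))  # region starts as all 0xCC
    exposed_per_step = []
    previous = None
    for index, slot in enumerate(slots):
        if previous is not None:
            # Hide the previously exposed instruction again.
            prev = slots[previous]
            region[prev.offset:prev.offset + prev.quota] = encrypted[previous]
        # Expose only the current instruction.
        region[slot.offset:slot.offset + slot.quota] = toy_crypt(encrypted[index])
        previous = index
        # Scanner's view: total bytes of slots that currently hold plaintext.
        exposed = sum(
            s.quota for s in slots
            if bytes(region[s.offset:s.offset + s.quota])
            == plaintext[s.offset:s.offset + s.quota]
        )
        exposed_per_step.append(exposed)
    return exposed_per_step


sample = bytes(range(1, 11))       # 10 placeholder bytes, not real code
steps = simulate(sample, [1, 4, 2, 3])
print(steps)                        # [1, 4, 2, 3]
print(max(steps))                   # 4
```

In this model the plaintext visible after any step equals the length of the current instruction and never accumulates, which is the property the six steps above are designed to maintain.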

7. Why This Course Exists

ShellGhost combines several advanced Windows internals concepts: software breakpoints, vectored exception handling, CONTEXT structure manipulation, shellcode mapping preprocessing, per-instruction encryption via SystemFunction032, and RW/RX memory toggling. Each concept is well-documented individually, but their combination into a coherent evasion technique requires understanding how they interact.

Course Structure

| Module | Topic | Why It Matters |
|---|---|---|
| 1 (this) | Memory Scanner Evasion Goals | Understand the problem and threat model |
| 2 | Software Breakpoints & INT3 | The mechanism that triggers per-instruction handling |
| 3 | Vectored Exception Handling | The interception mechanism for breakpoint events |
| 4 | The ShellGhost Concept | How all pieces combine into the evasion technique |
| 5 | SystemFunction032 & Shellcode Mapping | Per-instruction encryption and the preprocessing pipeline |
| 6 | VEH Handler Implementation | The actual C code that makes it work |
| 7 | Background: Trap Flag & Single-Stepping | General x86 knowledge for context (not used by ShellGhost) |
| 8 | Full Chain & Detection | Complete flow, performance, and detection analysis |

Knowledge Check

Q1: What is the primary weakness of traditional decrypt-then-execute shellcode loaders?

A) They cannot encrypt shellcode on disk
B) They require kernel-mode drivers
C) The entire decrypted payload is visible in memory during execution
D) They cannot execute position-independent code

Q2: What is the "decryption surface" of ShellGhost at any given instant?

A) One instruction (1–15 bytes)
B) One memory page (4096 bytes)
C) The entire shellcode
D) One function at a time

Q3: Which of the following is NOT defeated by ShellGhost's technique?

A) YARA-based memory signature scanning
B) Post-mortem process memory dumps
C) Entropy analysis of the shellcode region
D) Behavioral monitoring of shellcode network connections