Difficulty: Beginner

Module 1: Memory Scanner Evasion Goals

Why decrypting your entire payload is a death sentence — and how ShellGhost avoids it.

Module Objective

Understand the fundamental problem that ShellGhost solves: memory scanners can detect fully decrypted shellcode in process memory. This module explains why traditional decrypt-then-execute approaches fail against modern EDR memory scanning, introduces the concept of minimal decryption surface, and frames the threat model that drives the design of ShellGhost, a tool written by lem0nSec.

1. The Shellcode Lifecycle in Memory

Every shellcode loader must eventually place raw executable machine code into memory and run it. Regardless of how cleverly the shellcode is encrypted on disk, there comes a moment when the decrypted bytes exist in a readable, executable memory region. This is the decryption window — and it is the single most exploitable moment in any loader's lifecycle.

Traditional Loader: The Decryption Window

Encrypted on Disk (safe from static scan) → Decrypt to Memory (full payload exposed) → Execute Shellcode (running decrypted) → Memory Scan (payload detected)

In a traditional loader, the full shellcode is decrypted into a contiguous memory region before execution begins. From the moment decryption completes until the shellcode finishes running (minutes, hours, or indefinitely for a C2 beacon), the entire payload sits in plaintext in memory. Any memory scan during this window has the complete plaintext payload available to match against.

2. How Memory Scanners Work

Modern EDR solutions perform periodic and event-triggered scans of process memory. Understanding their capabilities is essential to appreciating why ShellGhost's approach matters.

Scanner Type	Trigger	What It Examines	Detection Method
Periodic Scan	Timer-based (every N seconds)	All committed private memory pages	Signature matching, YARA rules
Event-Triggered	Suspicious API call detected	Memory around the suspicious allocation	Pattern matching, entropy analysis
ETW-Based	Allocation events via ETW providers	Newly allocated executable pages	Heuristic analysis of page contents
On-Demand	Analyst requests scan	Full process memory dump	Offline YARA, custom signatures

The Fundamental Problem

If a memory scanner examines your process at any point after decryption, the complete shellcode payload is there for it to find. Encryption only protects the payload before execution. Once decrypted, every byte of shellcode is visible to any scanner, including easily signatured sequences such as the cld; and rsp, 0xFFFFFFFFFFFFFFF0 prologue common to Metasploit x64 payloads.
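To make the scanner's side of this concrete, here is a minimal sketch of byte-sequence signature matching over a buffer that stands in for a memory region. The function name find_signature and the XOR "encryption" are illustrative assumptions, not part of any real EDR or of ShellGhost; the signature bytes FC 48 83 E4 F0 are the encoding of the cld; and rsp, 0xFFFFFFFFFFFFFFF0 prologue mentioned above.

```python
# Toy signature scan over an in-process byte buffer (defender's view).
# SIGNATURE encodes: cld ; and rsp, 0xFFFFFFFFFFFFFFF0
SIGNATURE = bytes.fromhex("fc4883e4f0")


def find_signature(region: bytes, signature: bytes = SIGNATURE) -> list:
    """Return every offset at which `signature` occurs in `region`."""
    offsets = []
    start = region.find(signature)
    while start != -1:
        offsets.append(start)
        start = region.find(signature, start + 1)
    return offsets


# A stand-in for a decrypted region: padding, the prologue, then filler.
plaintext_region = b"\x00" * 16 + SIGNATURE + b"\x90" * 8

# A stand-in for the same region while still encrypted (single-byte XOR is
# used purely for illustration; it is not the cipher ShellGhost uses).
encrypted_region = bytes(b ^ 0x5A for b in plaintext_region)

if __name__ == "__main__":
    print("plaintext hits:", find_signature(plaintext_region))
    print("encrypted hits:", find_signature(encrypted_region))
```

The signature matches only while the bytes are in plaintext, which is why the length of the decryption window matters more than the strength of the on-disk encryption.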

3. Existing Approaches and Their Limits

Several techniques attempt to reduce the decryption window. Each has trade-offs:

Sleep Encryption (e.g., Ekko, Foliage)

Encrypt the shellcode region during sleep, decrypt before resuming. The payload is still fully decrypted during active execution. A scan during the active window catches everything. Also, the encrypt/decrypt transitions themselves create detectable patterns (VirtualProtect calls, timer objects).

Module Stomping / Phantom DLL Hollowing

Overwrite a legitimate DLL's .text section with shellcode. The memory appears backed by a legitimate file on disk, which avoids unbacked-memory heuristics. However, the shellcode content itself is still fully readable if scanned, and mismatches between the file on disk and the memory contents can be detected.
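The disk-versus-memory mismatch check can be sketched as a plain byte comparison. This is a simplified illustration under stated assumptions: both inputs are assumed to be the same section already read into byte strings, and the helper name differing_ranges is hypothetical. A real comparison would also have to account for relocations and other legitimate load-time changes, which this sketch ignores.

```python
def differing_ranges(on_disk: bytes, in_memory: bytes) -> list:
    """Return half-open (start, end) ranges where the two images differ.

    Only the overlapping prefix of the two inputs is compared.
    """
    ranges = []
    start = None
    length = min(len(on_disk), len(in_memory))
    for i in range(length):
        if on_disk[i] != in_memory[i]:
            if start is None:
                start = i  # a new run of mismatching bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, length))
    return ranges


if __name__ == "__main__":
    disk_section = bytes(32)                # stand-in for the file's .text
    memory_section = bytearray(disk_section)
    memory_section[8:12] = b"\x90" * 4      # simulate overwritten bytes
    print(differing_ranges(disk_section, bytes(memory_section)))
```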

Page Guard / No-Access Tricks

Mark shellcode pages as PAGE_NOACCESS when not executing, flip to PAGE_EXECUTE_READ on access. Creates detectable VirtualProtect call patterns and still requires full decryption of the page being executed.

ShellGhost's Approach: Minimal Decryption Surface

ShellGhost takes a different approach. Instead of decrypting the entire payload and protecting it during sleep, ShellGhost never holds more than a single decrypted instruction in the shellcode region at a time. That instruction stays in plaintext only until the next breakpoint fires, at which point it is re-encrypted. Every other byte of the region contains 0xCC (INT3 breakpoint opcodes) or encrypted data. A memory scan at any point sees breakpoints plus, at most, the bytes of one instruction.

4. The Minimal Decryption Surface Concept

The decryption surface is the number of shellcode bytes that exist in plaintext in memory at any given instant. Traditional loaders have a decryption surface equal to the entire shellcode size. ShellGhost reduces this to effectively one instruction (1–15 bytes).

Technique	Decryption Surface	Exposure Duration
Traditional decrypt-then-execute	Entire payload (thousands of bytes)	Entire execution lifetime
Sleep encryption (Ekko-style)	Entire payload during active phase	Active execution periods
Page-level toggling	One memory page (4096 bytes)	Per-page execution time
ShellGhost	1 instruction (1–15 bytes)	Single instruction execution

By minimizing the decryption surface to a single instruction, ShellGhost ensures that at no point does a recognizable shellcode pattern exist in memory. Even if a scanner reads the shellcode region mid-execution, it sees a sea of 0xCC bytes with at most a few bytes of one instruction that look different — meaningless without the surrounding context.
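The comparison in the table can be expressed as simple arithmetic. The sketch below is illustrative only: the technique labels and the 200,000-byte payload size are assumptions chosen for the example, and the per-instruction figure uses the 15-byte architectural maximum for an x86 instruction as an upper bound.

```python
PAGE_SIZE = 4096          # typical x86/x64 page size in bytes
MAX_INSTRUCTION_LEN = 15  # longest legal x86 instruction in bytes


def decryption_surface(technique: str, payload_size: int) -> int:
    """Upper bound on plaintext shellcode bytes present at one instant."""
    limits = {
        "decrypt-then-execute": payload_size,
        "sleep-encryption": payload_size,      # during the active phase
        "page-toggling": PAGE_SIZE,
        "per-instruction": MAX_INSTRUCTION_LEN,
    }
    # The surface can never exceed the payload itself.
    return min(limits[technique], payload_size)


def exposed_fraction(technique: str, payload_size: int) -> float:
    """Fraction of the payload that may be in plaintext at one instant."""
    return decryption_surface(technique, payload_size) / payload_size


if __name__ == "__main__":
    size = 200_000  # hypothetical payload size in bytes
    for name in ("decrypt-then-execute", "sleep-encryption",
                 "page-toggling", "per-instruction"):
        print(f"{name:22s} {decryption_surface(name, size):7d} bytes "
              f"({exposed_fraction(name, size):.4%})")
```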

5. Threat Model

ShellGhost is designed to defeat a specific set of threats. Understanding what it protects against (and what it does not) is critical for realistic expectations.

What ShellGhost Defeats

Memory signature scanning — YARA rules, pattern matching, and byte-sequence signatures cannot match against a region filled with 0xCC
Periodic memory dumps — a dump at any point shows only INT3 opcodes in the shellcode region
Post-mortem forensics — if the process is suspended for analysis, the shellcode region contains no meaningful content
Entropy-based heuristics on payload — a region of identical 0xCC bytes has zero entropy
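The zero-entropy claim can be checked directly with the Shannon entropy formula. This is a small sketch of the calculation, not of any particular scanner's heuristic; the function name shannon_entropy is chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


if __name__ == "__main__":
    int3_region = b"\xCC" * 4096           # a page filled with INT3 opcodes
    uniform_region = bytes(range(256)) * 16  # every byte value equally often
    print("0xCC page:   ", shannon_entropy(int3_region))
    print("uniform page:", shannon_entropy(uniform_region))
```

A page of identical bytes scores 0 bits per byte, while a page in which every byte value is equally likely scores the maximum of 8, which is the range that high-entropy heuristics key on.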

What ShellGhost Does NOT Defeat

Behavioral monitoring — the shellcode's actions (network connections, file operations) are still visible
VEH registration detection — calling AddVectoredExceptionHandler is observable via API hooks or ETW
Excessive exception monitoring — thousands of EXCEPTION_BREAKPOINT events per second is anomalous
Instruction-level tracing — a debugger stepping through the VEH handler can observe each decrypted instruction
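From the defender's side, the excessive-exception signal can be sketched as a sliding-window rate check. Everything here is hypothetical: the class name BreakpointRateMonitor, the threshold, and the assumption that some telemetry source delivers a timestamp for each breakpoint exception are illustrative and do not describe any specific product.

```python
from collections import deque


class BreakpointRateMonitor:
    """Flags a process when breakpoint exceptions exceed a rate threshold."""

    def __init__(self, threshold: int = 1000, window_seconds: float = 1.0):
        self.threshold = threshold    # events allowed per window
        self.window = window_seconds  # sliding window length in seconds
        self._events = deque()        # timestamps inside the current window

    def record(self, timestamp: float) -> bool:
        """Record one exception; return True if the rate is anomalous."""
        self._events.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while timestamp - self._events[0] > self.window:
            self._events.popleft()
        return len(self._events) > self.threshold


if __name__ == "__main__":
    monitor = BreakpointRateMonitor(threshold=5)
    # Ten exceptions within 0.1 s: far above 5 per second.
    burst = [monitor.record(i * 0.01) for i in range(10)]
    print("burst flagged:", any(burst))
```

An occasional breakpoint, such as one raised under a debugger, stays under the threshold, whereas one exception per executed instruction exceeds it almost immediately.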

6. The High-Level ShellGhost Flow

Before diving into implementation details in later modules, here is the conceptual overview of how ShellGhost operates:

ShellGhost Execution Model

1. Alloc RW, Map SC (fill with 0xCC bytes) → 2. Register VEH (exception handler) → 3. CreateThread (entry at .text end) → 4. VEH: re-encrypt prev, decrypt current instr → 5. RW→RX, Execute (hits next 0xCC) → 6. Next BP fires (cycle repeats)

Preprocessing: A Python script (ShellGhost_mapping.py) disassembles the shellcode, encrypts each instruction independently using RC4 via SystemFunction032, and generates C arrays of CRYPT_BYTES_QUOTA structs containing each instruction's RVA (offset) and byte count (quota).
Allocation: A memory region is allocated with PAGE_READWRITE (RW). The region is filled entirely with 0xCC (INT3) bytes. The encrypted instruction data and mapping structs are compiled into the binary.
VEH Registration: A Vectored Exception Handler is registered. This handler will intercept all breakpoint exceptions.
Thread Creation: A new thread is created via CreateThread() with its entry point set to null bytes at the end of the .text segment (found by ResolveEndofTextSegment()). This avoids the IoC of a thread entry point in private memory. The first 0xCC triggers EXCEPTION_BREAKPOINT.
Decrypt & Execute: The VEH handler catches each breakpoint, re-encrypts the previous instruction (if any), decrypts the current instruction using SystemFunction032, toggles the page to PAGE_EXECUTE_READ (RX) via VirtualProtect, and resumes execution. One EXCEPTION_BREAKPOINT per instruction.
Next Instruction: After the decrypted instruction executes, the CPU hits the next 0xCC, triggering another EXCEPTION_BREAKPOINT, and the cycle repeats.

7. Why This Course Exists

ShellGhost combines several advanced Windows internals concepts: software breakpoints, vectored exception handling, CONTEXT structure manipulation, shellcode mapping preprocessing, per-instruction encryption via SystemFunction032, and RW/RX memory toggling. Each concept is well-documented individually, but their combination into a coherent evasion technique requires understanding how they interact.

Course Structure

Module	Topic	Why It Matters
1 (this)	Memory Scanner Evasion Goals	Understand the problem and threat model
2	Software Breakpoints & INT3	The mechanism that triggers per-instruction handling
3	Vectored Exception Handling	The interception mechanism for breakpoint events
4	The ShellGhost Concept	How all pieces combine into the evasion technique
5	SystemFunction032 & Shellcode Mapping	Per-instruction encryption and the preprocessing pipeline
6	VEH Handler Implementation	The actual C code that makes it work
7	Background: Trap Flag & Single-Stepping	General x86 knowledge for context (not used by ShellGhost)
8	Full Chain & Detection	Complete flow, performance, and detection analysis

Knowledge Check

Q1: What is the primary weakness of traditional decrypt-then-execute shellcode loaders?

A) They cannot encrypt shellcode on disk

B) They require kernel-mode drivers

C) The entire decrypted payload is visible in memory during execution

D) They cannot execute position-independent code

Q2: What is the "decryption surface" of ShellGhost at any given instant?

A) One instruction (1–15 bytes)

B) One memory page (4096 bytes)

C) The entire shellcode

D) One function at a time

Q3: Which of the following is NOT defeated by ShellGhost's technique?

A) YARA-based memory signature scanning

B) Post-mortem process memory dumps

C) Entropy analysis of the shellcode region

D) Behavioral monitoring of shellcode network connections
