Difficulty: Beginner

Module 2: Software Breakpoints & INT3

The single-byte opcode that stops the CPU — and the foundation of ShellGhost's per-instruction control.

Module Objective

This module explains the INT3 instruction (opcode 0xCC), how the CPU processes it, how debuggers use software breakpoints, and how the exception dispatch mechanism delivers EXCEPTION_BREAKPOINT to user-mode handlers. This is the trigger mechanism that ShellGhost exploits to gain control before each shellcode instruction executes.

1. The INT3 Instruction

The x86/x64 architecture defines a special one-byte instruction specifically for debugger breakpoints: INT3, encoded as the single byte 0xCC. When the CPU executes this byte, it immediately raises a breakpoint exception (interrupt vector 3) without executing any further instructions.

Why 0xCC Is Special

The generic software interrupt instruction INT n is two bytes: 0xCD followed by the interrupt number. For example, INT 3 encoded generically would be 0xCD 0x03. However, Intel specifically assigned the single-byte opcode 0xCC to INT 3 because debuggers need to replace a single byte in the instruction stream to set a breakpoint without disturbing the alignment of subsequent instructions.

| Encoding | Bytes | Size | Purpose |
|---|---|---|---|
| INT3 (compact) | 0xCC | 1 byte | Debugger breakpoint, designed for byte-level patching |
| INT 3 (generic) | 0xCD 0x03 | 2 bytes | Same interrupt, but takes 2 bytes, so it is less useful for debugging |
| INT 0x2E (legacy syscall) | 0xCD 0x2E | 2 bytes | Legacy Windows system call mechanism (pre-SYSENTER) |

x86 ASM
; A simple function with a breakpoint inserted
original_code:
    push rbp            ; 0x55
    mov rbp, rsp        ; 0x48 0x89 0xE5
    sub rsp, 0x20       ; 0x48 0x83 0xEC 0x20
    ...

; After debugger sets breakpoint at the 'mov' instruction:
patched_code:
    push rbp            ; 0x55
    int3                ; 0xCC  (replaced first byte of 'mov rbp, rsp')
    db 0x89, 0xE5       ; remaining bytes of original instruction
    sub rsp, 0x20       ; 0x48 0x83 0xEC 0x20
    ...
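
The size argument can be checked on a plain byte array, without executing anything. The sketch below is illustrative only: `apply_patch`, `spill_with_cc`, and `spill_with_cd03` are hypothetical helper names, and the buffer is ordinary data holding the prologue bytes from the listing. It patches the one-byte `push rbp` with each encoding and reports how many patch bytes spill into the following instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes of the example prologue from the listing above. */
static const uint8_t PROLOGUE[] = {
    0x55,                   /* push rbp      (1 byte)  */
    0x48, 0x89, 0xE5,       /* mov rbp, rsp  (3 bytes) */
    0x48, 0x83, 0xEC, 0x20  /* sub rsp, 0x20 (4 bytes) */
};

/* Overwrite patch_len bytes at offset. Returns how many patch bytes landed
 * outside the instruction that starts at offset and is insn_len bytes long. */
static size_t apply_patch(uint8_t *code, size_t code_len, size_t offset,
                          size_t insn_len, const uint8_t *patch,
                          size_t patch_len)
{
    assert(offset + patch_len <= code_len);
    memcpy(code + offset, patch, patch_len);
    return patch_len > insn_len ? patch_len - insn_len : 0;
}

/* Patch 'push rbp' (offset 0, length 1) with the one-byte INT3. */
static size_t spill_with_cc(uint8_t *code)
{
    static const uint8_t int3[] = { 0xCC };
    memcpy(code, PROLOGUE, sizeof PROLOGUE);
    return apply_patch(code, sizeof PROLOGUE, 0, 1, int3, sizeof int3);
}

/* Patch the same instruction with the two-byte INT 3 encoding. */
static size_t spill_with_cd03(uint8_t *code)
{
    static const uint8_t int_3[] = { 0xCD, 0x03 };
    memcpy(code, PROLOGUE, sizeof PROLOGUE);
    return apply_patch(code, sizeof PROLOGUE, 0, 1, int_3, sizeof int_3);
}
```

Called on an 8-byte scratch buffer, `spill_with_cc` leaves the first byte of `mov rbp, rsp` intact, while `spill_with_cd03` overwrites it with 0x03. That overwrite is the corruption the single-byte opcode avoids.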

2. How Debuggers Use INT3

When a debugger (WinDbg, x64dbg, Visual Studio) sets a software breakpoint, it performs a simple three-step operation:

  1. Save the original byte at the target address
  2. Write 0xCC to that address
  3. Wait for the CPU to hit it and raise EXCEPTION_BREAKPOINT

When the breakpoint fires, the debugger:

  1. Catches the EXCEPTION_BREAKPOINT exception
  2. Restores the original byte to the target address
  3. Decrements the instruction pointer (RIP) by 1 (because the CPU advanced past the 0xCC byte)
  4. Allows the user to inspect state, then resumes execution from the restored instruction

The RIP Adjustment

The CPU delivers the breakpoint as a trap, so the RIP it saves points one byte past the 0xCC. On Windows, the kernel's exception dispatch logic (KiDispatchException) subtracts 1 from RIP for EXCEPTION_BREAKPOINT before it delivers the exception record and context to user-mode handlers. By the time a VEH handler receives the exception, ExceptionAddress and ContextRecord->Rip both point at the 0xCC byte, which is why ShellGhost uses ContextRecord->Rip directly without any further adjustment. An out-of-process debugger is in a different position: the thread context it reads with GetThreadContext typically still holds the address after the 0xCC, so the debugger rewinds RIP to the breakpoint address itself (step 3 above).

C
// How a debugger sets a software breakpoint (simplified; error checks omitted)
BYTE original_byte;
LPVOID breakpoint_addr = (LPVOID)0x00007FF6A0001000;

// Save and patch
ReadProcessMemory(hProcess, breakpoint_addr, &original_byte, 1, NULL);
BYTE int3 = 0xCC;
WriteProcessMemory(hProcess, breakpoint_addr, &int3, 1, NULL);
FlushInstructionCache(hProcess, breakpoint_addr, 1);

// ... CPU hits 0xCC, debugger receives EXCEPTION_BREAKPOINT as a debug event ...

// In debug event handler: restore the original byte
WriteProcessMemory(hProcess, breakpoint_addr, &original_byte, 1, NULL);
FlushInstructionCache(hProcess, breakpoint_addr, 1);

// Point RIP back at the restored instruction. Assigning the breakpoint
// address is correct whether or not RIP was already rewound.
CONTEXT context = {0};
context.ContextFlags = CONTEXT_CONTROL;
GetThreadContext(hThread, &context);
context.Rip = (DWORD64)breakpoint_addr;
SetThreadContext(hThread, &context);
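
The save, patch, and restore cycle can also be modeled on an ordinary byte array, which makes the bookkeeping easy to see without a target process. This is a minimal sketch under stated assumptions: `breakpoint_t`, `bp_arm`, and `bp_disarm` are hypothetical names, not part of any debugger's API, and the "code" is plain data that is never executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Hypothetical bookkeeping record for one software breakpoint. */
typedef struct {
    size_t  offset;        /* where the 0xCC was written          */
    uint8_t original_byte; /* byte that was there before patching */
    int     armed;         /* nonzero while the 0xCC is in place  */
} breakpoint_t;

/* Save the original byte, then write 0xCC over it. */
static void bp_arm(breakpoint_t *bp, uint8_t *code, size_t code_len,
                   size_t offset)
{
    assert(offset < code_len);
    bp->offset = offset;
    bp->original_byte = code[offset];
    code[offset] = INT3_OPCODE;
    bp->armed = 1;
}

/* On a hit: put the original byte back and return the offset at which
 * execution should resume (the start of the restored instruction). */
static size_t bp_disarm(breakpoint_t *bp, uint8_t *code)
{
    assert(bp->armed);
    code[bp->offset] = bp->original_byte;
    bp->armed = 0;
    return bp->offset;
}
```

Arming at offset 1 of the bytes `55 48 89 E5` turns them into `55 CC 89 E5`. Disarming restores the 0x48 and returns offset 1, which corresponds to rewinding RIP to the breakpoint address.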

3. CPU Exception Flow for INT3

When the CPU encounters 0xCC, a precise sequence of events unfolds at the hardware and OS levels:

INT3 Exception Dispatch Chain

CPU: Execute 0xCC (trap to kernel)
  -> KiBreakpointTrap (IDT vector #3)
  -> KiDispatchException (kernel dispatcher)
  -> KiUserExceptionDispatcher (transition to user mode)
  -> RtlDispatchException (user-mode dispatch)

| Step | Component | Action |
|---|---|---|
| 1 | CPU Hardware | Detects the 0xCC opcode, saves context (RIP, RSP, RFLAGS, etc.) on the kernel stack, and transitions to Ring 0 via the IDT entry for vector 3 |
| 2 | KiBreakpointTrap | Kernel trap handler (ntoskrnl) receives control and builds a KTRAP_FRAME from the saved registers |
| 3 | KiDispatchException | Handles the EXCEPTION_RECORD carrying ExceptionCode = 0x80000003 (EXCEPTION_BREAKPOINT). Offers the exception to a kernel debugger first, then to any user-mode debugger attached to the process (first-chance notification) |
| 4 | KiUserExceptionDispatcher | If no debugger handles it, the kernel transitions back to user mode at ntdll!KiUserExceptionDispatcher |
| 5 | RtlDispatchException | Walks the Vectored Exception Handler (VEH) list first, then the Structured Exception Handler (SEH) chain. The first handler that returns EXCEPTION_CONTINUE_EXECUTION wins |
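
The "first handler wins" rule in step 5 can be illustrated with a toy model. The sketch below is not how ntdll implements dispatch. It only walks an array of function pointers in order, with vectored handlers assumed to sit ahead of frame-based ones. The names `dispatch`, `handler_fn`, and the three sample handlers are hypothetical. The two return values use the same numbers as the Windows constants EXCEPTION_CONTINUE_SEARCH (0) and EXCEPTION_CONTINUE_EXECUTION (-1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as the similarly named Windows constants. */
enum { CONTINUE_SEARCH = 0, CONTINUE_EXECUTION = -1 };

typedef int (*handler_fn)(uint32_t exception_code);

/* Walk the handlers in order. Returns the index of the first handler that
 * returns CONTINUE_EXECUTION, or -1 if every handler passes. */
static int dispatch(const handler_fn *handlers, size_t count, uint32_t code)
{
    assert(handlers != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (handlers[i](code) == CONTINUE_EXECUTION)
            return (int)i;
    }
    return -1;
}

/* Sample handlers for the model. */
static int ignores_everything(uint32_t code)
{
    (void)code;
    return CONTINUE_SEARCH;
}

static int handles_breakpoint(uint32_t code)
{
    return code == 0x80000003u ? CONTINUE_EXECUTION : CONTINUE_SEARCH;
}

static int handles_everything(uint32_t code)
{
    (void)code;
    return CONTINUE_EXECUTION;
}
```

With the chain `{ ignores_everything, handles_breakpoint, handles_everything }`, a breakpoint code stops at index 1 and never reaches the catch-all. An access violation (0xC0000005) falls through to index 2.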

Key Insight for ShellGhost

The exception dispatch mechanism guarantees that Vectored Exception Handlers are called before frame-based SEH handlers. By registering a VEH handler at the front of the list, ShellGhost intercepts every 0xCC breakpoint exception before any SEH handler in the process sees it. An attached debugger is the exception: it receives first-chance notification before user-mode dispatch begins. Within the process, this gives ShellGhost first-priority control over the exception and the ability to modify the thread context (including RIP and RFLAGS) before execution resumes.

4. The EXCEPTION_BREAKPOINT Code

The exception code for an INT3 breakpoint is defined in the Windows headers as:

C
// Simplified from the Windows SDK headers (winnt.h, minwinbase.h, ntstatus.h)
#define STATUS_BREAKPOINT       ((DWORD)0x80000003L)
#define EXCEPTION_BREAKPOINT    STATUS_BREAKPOINT

// The exception record delivered to handlers:
typedef struct _EXCEPTION_RECORD {
    DWORD ExceptionCode;        // 0x80000003 for INT3
    DWORD ExceptionFlags;       // 0 = continuable (EXCEPTION_NONCONTINUABLE not set)
    struct _EXCEPTION_RECORD *ExceptionRecord;
    PVOID ExceptionAddress;     // Address of the 0xCC byte
    DWORD NumberParameters;     // Typically 1 for a breakpoint on x64
    ULONG_PTR ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];
} EXCEPTION_RECORD;

ExceptionAddress Detail

For EXCEPTION_BREAKPOINT on Windows, the kernel (KiDispatchException) automatically decrements RIP by 1 before dispatching to user-mode handlers. This means by the time a VEH handler sees the exception, ExceptionAddress and ContextRecord->Rip already point to the address of the 0xCC byte itself. No manual subtraction is needed. ShellGhost uses ContextRecord->Rip directly to identify which instruction to decrypt.

5. INT3 vs Other Exception Mechanisms

ShellGhost specifically uses INT3 because of its unique properties compared to other exception-generating mechanisms:

| Mechanism | Opcode / Trigger | Exception Code | Size | ShellGhost Suitability |
|---|---|---|---|---|
| INT3 | 0xCC | 0x80000003 | 1 byte | Ideal: replaces any byte without alignment issues |
| INT 1 (ICEBP) | 0xF1 | 0x80000004 | 1 byte | Could work, but raises SINGLE_STEP and so conflates with the trap flag |
| UD2 | 0x0F 0x0B | 0xC000001D | 2 bytes | Too large: cannot replace a single byte |
| Access violation | PAGE_NOACCESS | 0xC0000005 | N/A | Page-granular only, not byte-granular |
| Hardware breakpoint | DR0-DR3 | 0x80000004 | N/A | Limited to 4 breakpoints per thread |
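
The exception codes in this comparison are NTSTATUS values, which pack a 2-bit severity (bits 31-30), a 12-bit facility (bits 27-16), and a 16-bit code (bits 15-0) into one 32-bit number. The sketch below splits a code into those fields. `decode_status`, `status_fields_t`, and `severity_name` are hypothetical helper names introduced for illustration, not Windows APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an NTSTATUS value (customer and reserved bits omitted). */
typedef struct {
    unsigned severity; /* 0 success, 1 informational, 2 warning, 3 error */
    unsigned facility; /* 12-bit facility number                         */
    unsigned code;     /* 16-bit status code                             */
} status_fields_t;

/* Split a 32-bit status value into its severity, facility, and code. */
static status_fields_t decode_status(uint32_t status)
{
    status_fields_t f;
    f.severity = (status >> 30) & 0x3u;
    f.facility = (status >> 16) & 0xFFFu;
    f.code     = status & 0xFFFFu;
    return f;
}

/* Human-readable name for a severity value in the range 0..3. */
static const char *severity_name(unsigned severity)
{
    static const char *names[] = {
        "success", "informational", "warning", "error"
    };
    assert(severity < 4);
    return names[severity];
}
```

Decoding 0x80000003 gives severity 2 (warning) and code 3, while 0xC0000005 and 0xC000001D both decode to severity 3 (error). The breakpoint and single-step codes share the warning severity, which fits their role as debugging events rather than faults.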

Why Exactly 0xCC?

The 0xCC opcode has two critical properties for ShellGhost: (1) it is exactly one byte, meaning it can replace the first byte of any x86/x64 instruction without corrupting adjacent instructions, and (2) it generates a distinct exception code (0x80000003) that is easily identifiable by the VEH handler. ShellGhost uses a one-exception cycle: each EXCEPTION_BREAKPOINT handler both re-encrypts the previous instruction and decrypts the current one. The single-byte size of 0xCC is what makes per-instruction stepping through the buffer possible.

6. Filling a Region with 0xCC

ShellGhost's preparation step fills the entire shellcode execution region with 0xCC. Here is what that looks like in practice:

C
// Allocate writable memory and fill with INT3
SIZE_T shellcode_size = 512;  // size of actual shellcode

LPVOID exec_mem = VirtualAlloc(
    NULL,
    shellcode_size,
    MEM_COMMIT | MEM_RESERVE,
    PAGE_READWRITE       // RW initially, toggled to RX for execution
);

// Fill entire region with 0xCC (INT3 breakpoints)
memset(exec_mem, 0xCC, shellcode_size);

// Memory dump at this point:
// 00007FF6`A0000000  CC CC CC CC CC CC CC CC
// 00007FF6`A0000008  CC CC CC CC CC CC CC CC
// 00007FF6`A0000010  CC CC CC CC CC CC CC CC
// ... (all 0xCC for shellcode_size bytes)

A memory scanner examining this region sees nothing but INT3 opcodes. There is no encrypted data, no entropy anomaly, and no signatured byte sequence. The real shellcode bytes (encrypted with RC4) are stored in a separate data buffer that is never made executable.

Two Separate Buffers

ShellGhost maintains two buffers: (1) the execution buffer, which is allocated as RW and toggled to RX for execution (filled with 0xCC), and (2) the encrypted shellcode data, containing per-instruction encrypted bytes generated by the preprocessing script. The VEH handler reads from the encrypted data, decrypts one full instruction using SystemFunction032, writes it into the execution buffer, toggles to RX, and after execution, re-encrypts it in the next handler invocation. The encrypted data is never executable, and the execution buffer never contains more than one decrypted instruction at a time.

7. From Breakpoint to Handler: The ShellGhost Connection

Putting it all together, here is how INT3 connects to ShellGhost's execution model:

  1. A new thread starts at the first 0xCC in the execution buffer
  2. CPU raises EXCEPTION_BREAKPOINT (0x80000003)
  3. With no debugger attached, Windows dispatches to VEH handlers before any SEH handler (ShellGhost's handler is registered at the front of the VEH list)
  4. ShellGhost's handler sees ExceptionCode == 0x80000003
  5. Handler re-encrypts the previously executed instruction (if any) back to 0xCC
  6. Handler uses ContextRecord->Rip (already pointing at the 0xCC, adjusted by the kernel) to identify the current instruction
  7. Handler decrypts the full instruction from the encrypted data using SystemFunction032 (RC4)
  8. Handler writes the decrypted instruction bytes to the execution buffer
  9. Handler toggles the memory page from RW to RX via VirtualProtect
  10. Handler returns EXCEPTION_CONTINUE_EXECUTION — the CPU resumes and executes the real instruction, then hits the next 0xCC

The next module covers Vectored Exception Handling in detail — the mechanism that makes step 3 possible.

Knowledge Check

Q1: What is the opcode encoding for the single-byte INT3 breakpoint instruction?

A) 0xCD 0x03
B) 0xCC
C) 0x0F 0x0B
D) 0xF1

Q2: What exception code does INT3 generate?

A) 0x80000003 (EXCEPTION_BREAKPOINT)
B) 0x80000004 (EXCEPTION_SINGLE_STEP)
C) 0xC0000005 (EXCEPTION_ACCESS_VIOLATION)
D) 0xC000001D (EXCEPTION_ILLEGAL_INSTRUCTION)

Q3: Why does ShellGhost use INT3 (0xCC) instead of UD2 (0x0F 0x0B)?

A) UD2 cannot be caught by exception handlers
B) UD2 requires kernel-mode handling
C) INT3 is exactly one byte, so it can replace any instruction's first byte without corrupting adjacent instructions
D) UD2 is slower to execute