Module 2: Software Breakpoints & INT3
The single-byte opcode that stops the CPU — and the foundation of ShellGhost's per-instruction control.
Module Objective
This module explains the INT3 instruction (opcode 0xCC), how the CPU processes it, how debuggers use software breakpoints, and how the exception dispatch mechanism delivers EXCEPTION_BREAKPOINT to user-mode handlers. This is the trigger mechanism that ShellGhost exploits to gain control before each shellcode instruction executes.
1. The INT3 Instruction
The x86/x64 architecture defines a special one-byte instruction specifically for debugger breakpoints: INT3, encoded as the single byte 0xCC. When the CPU executes this byte, it immediately raises a breakpoint exception (interrupt vector 3) without executing any further instructions.
Why 0xCC Is Special
The generic software interrupt instruction INT n is two bytes: 0xCD followed by the interrupt number. For example, INT 3 encoded generically would be 0xCD 0x03. However, Intel specifically assigned the single-byte opcode 0xCC to INT 3 because debuggers need to replace a single byte in the instruction stream to set a breakpoint without disturbing the alignment of subsequent instructions.
| Encoding | Bytes | Size | Purpose |
|---|---|---|---|
| INT3 (compact) | 0xCC | 1 byte | Debugger breakpoint; designed for byte-level patching |
| INT 3 (generic) | 0xCD 0x03 | 2 bytes | Same interrupt, but takes 2 bytes, so it is less useful for debugging |
| INT 0x2E (legacy syscall) | 0xCD 0x2E | 2 bytes | Legacy Windows system call mechanism (pre-SYSENTER) |
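The encodings in the table above can be told apart by looking at one or two bytes. The following is a minimal sketch, not part of ShellGhost; the names `int_encoding` and `classify_int` are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types for illustration only. */
typedef enum {
    ENC_NOT_INT,     /* neither INT3 nor INT n */
    ENC_INT3_SHORT,  /* 0xCC: one-byte breakpoint form */
    ENC_INT_N        /* 0xCD imm8: two-byte generic form */
} int_encoding;

/* Classify the interrupt instruction at the start of `code`.
   On a match, stores the interrupt vector in *vector and the
   instruction length in bytes in *length. */
static int_encoding classify_int(const unsigned char *code, size_t len,
                                 unsigned *vector, size_t *length)
{
    assert(code != NULL && vector != NULL && length != NULL);

    if (len >= 1 && code[0] == 0xCC) {
        *vector = 3;
        *length = 1;
        return ENC_INT3_SHORT;
    }
    if (len >= 2 && code[0] == 0xCD) {
        *vector = code[1];   /* the imm8 operand is the vector number */
        *length = 2;
        return ENC_INT_N;
    }
    return ENC_NOT_INT;
}
```

Both `0xCC` and `0xCD 0x03` report vector 3, but only the first fits in a single byte, which is the property the rest of this module depends on.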
; A simple function with a breakpoint inserted
original_code:
push rbp ; 0x55
mov rbp, rsp ; 0x48 0x89 0xE5
sub rsp, 0x20 ; 0x48 0x83 0xEC 0x20
...
; After debugger sets breakpoint at the 'mov' instruction:
patched_code:
push rbp ; 0x55
int3 ; 0xCC (replaced first byte of 'mov rbp, rsp')
db 0x89, 0xE5 ; remaining bytes of original instruction
sub rsp, 0x20 ; 0x48 0x83 0xEC 0x20
...
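The patch shown in the listing above can be modelled on an ordinary byte array, which makes the save/overwrite/restore bookkeeping easy to see without touching another process. This is a sketch under that assumption; `soft_breakpoint`, `bp_set`, and `bp_clear` are hypothetical names, not debugger APIs.

```c
#include <assert.h>
#include <stddef.h>

#define INT3_OPCODE 0xCC

/* Hypothetical bookkeeping for one software breakpoint. */
typedef struct {
    size_t offset;        /* where the 0xCC was written */
    unsigned char saved;  /* original byte at that offset */
    int armed;            /* nonzero while the 0xCC is in place */
} soft_breakpoint;

/* Save the original byte at `offset` and overwrite it with 0xCC. */
static void bp_set(unsigned char *code, size_t len, size_t offset,
                   soft_breakpoint *bp)
{
    assert(code != NULL && bp != NULL && offset < len);
    bp->offset = offset;
    bp->saved = code[offset];
    code[offset] = INT3_OPCODE;
    bp->armed = 1;
}

/* Put the original byte back; does nothing if the breakpoint is not armed. */
static void bp_clear(unsigned char *code, soft_breakpoint *bp)
{
    assert(code != NULL && bp != NULL);
    if (bp->armed) {
        code[bp->offset] = bp->saved;
        bp->armed = 0;
    }
}
```

Applied to the bytes `55 48 89 E5 ...` at offset 1, `bp_set` yields `55 CC 89 E5 ...`: only the first byte of `mov rbp, rsp` changes, and the neighbouring instructions keep their bytes.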
2. How Debuggers Use INT3
When a debugger (WinDbg, x64dbg, Visual Studio) sets a software breakpoint, it performs a simple three-step operation:
- Save the original byte at the target address
- Write 0xCC to that address
- Wait for the CPU to hit it and raise EXCEPTION_BREAKPOINT
When the breakpoint fires, the debugger:
- Catches the EXCEPTION_BREAKPOINT exception
- Restores the original byte to the target address
- Makes sure the instruction pointer (RIP) points back at the breakpoint address, decrementing it by 1 if the captured context still points past the 0xCC byte
- Allows the user to inspect state, then resumes execution from the restored instruction
The RIP Adjustment
When the CPU executes 0xCC, it increments RIP past the one-byte instruction before raising the exception. However, on Windows, the kernel's exception dispatch logic (KiDispatchException) automatically decrements RIP by 1 for EXCEPTION_BREAKPOINT exceptions before delivering the exception to user-mode handlers. This means that by the time a VEH handler receives the exception, ContextRecord->Rip already points back to the address of the 0xCC byte. Debuggers rely on this kernel behavior, and ShellGhost uses ContextRecord->Rip directly without any further adjustment.
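The arithmetic in the paragraph above is small enough to write down directly. The sketch below only restates what the text describes; the function names are hypothetical and nothing here calls into Windows.

```c
#include <assert.h>
#include <stdint.h>

/* INT3 is a trap: the RIP the CPU saves is the byte after the 0xCC. */
static uint64_t rip_saved_by_cpu(uint64_t breakpoint_addr)
{
    return breakpoint_addr + 1;
}

/* Value the text says a VEH handler finds in ContextRecord->Rip,
   after the kernel's one-byte adjustment for EXCEPTION_BREAKPOINT. */
static uint64_t rip_seen_by_handler(uint64_t saved_rip)
{
    assert(saved_rip > 0);
    return saved_rip - 1;
}
```

Composing the two functions returns the original breakpoint address, which is why the handler can use the reported value as a lookup key without subtracting anything itself.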
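// How a debugger sets a software breakpoint (simplified; error checks omitted)
// Assumes hProcess and hThread are handles to the debuggee with suitable access
BYTE original_byte;
LPVOID breakpoint_addr = (LPVOID)0x00007FF6A0001000;
// Save the original byte, then patch in INT3
ReadProcessMemory(hProcess, breakpoint_addr, &original_byte, 1, NULL);
BYTE int3 = 0xCC;
WriteProcessMemory(hProcess, breakpoint_addr, &int3, 1, NULL);
FlushInstructionCache(hProcess, breakpoint_addr, 1);
// ... CPU hits 0xCC, raises EXCEPTION_BREAKPOINT ...
// In the debug event handler: restore the original byte
WriteProcessMemory(hProcess, breakpoint_addr, &original_byte, 1, NULL);
FlushInstructionCache(hProcess, breakpoint_addr, 1);
// Fetch the thread context and point RIP at the restored instruction.
// Setting RIP explicitly is correct whether or not the captured value
// was already adjusted back to the breakpoint address.
CONTEXT context = {0};
context.ContextFlags = CONTEXT_CONTROL;
GetThreadContext(hThread, &context);
context.Rip = (DWORD64)breakpoint_addr;
SetThreadContext(hThread, &context);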
3. CPU Exception Flow for INT3
When the CPU encounters 0xCC, a precise sequence of events unfolds at the hardware and OS levels:
INT3 Exception Dispatch Chain: Trap to kernel → IDT vector #3 → Kernel dispatcher → Transition to user-mode → User-mode dispatch
| Step | Component | Action |
|---|---|---|
| 1 | CPU Hardware | Detects 0xCC opcode, saves context (RIP, RSP, RFLAGS, etc.) on the kernel stack, transitions to Ring 0 via IDT entry for vector 3 |
| 2 | KiBreakpointTrap | Kernel trap handler (ntoskrnl) receives control. Builds a KTRAP_FRAME from saved registers |
| 3 | KiDispatchException | Creates an EXCEPTION_RECORD with ExceptionCode = 0x80000003 (EXCEPTION_BREAKPOINT). Checks if a kernel debugger wants it first |
| 4 | KiUserExceptionDispatcher | If no kernel debugger handles it, the kernel transitions back to user mode, calling ntdll!KiUserExceptionDispatcher |
| 5 | RtlDispatchException | Walks the Vectored Exception Handler (VEH) list first, then Structured Exception Handler (SEH) chain. The first handler that returns EXCEPTION_CONTINUE_EXECUTION wins |
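Step 5 in the table above is an ordered search: vectored handlers are consulted first, then frame-based handlers, and the search stops at the first handler that asks to continue execution. The sketch below models only that ordering with plain function pointers. It is an assumption-laden simplification (real SEH filters can also return EXCEPTION_EXECUTE_HANDLER and involve unwinding), and `dispatch`, `handler_fn`, and the two sample handlers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as the Windows constants they stand in for. */
#define DISP_CONTINUE_EXECUTION (-1)  /* EXCEPTION_CONTINUE_EXECUTION */
#define DISP_CONTINUE_SEARCH      0   /* EXCEPTION_CONTINUE_SEARCH */

typedef int (*handler_fn)(uint32_t exception_code);

/* Walk the vectored list first, then the frame-based list.
   Returns 1 if some handler resumed execution, 0 if nobody handled it.
   *handled_by_vectored is set to 1 or 0 to say which list won. */
static int dispatch(uint32_t code,
                    const handler_fn *veh, size_t veh_count,
                    const handler_fn *seh, size_t seh_count,
                    int *handled_by_vectored)
{
    assert(handled_by_vectored != NULL);

    for (size_t i = 0; i < veh_count; i++) {
        if (veh[i](code) == DISP_CONTINUE_EXECUTION) {
            *handled_by_vectored = 1;
            return 1;
        }
    }
    for (size_t i = 0; i < seh_count; i++) {
        if (seh[i](code) == DISP_CONTINUE_EXECUTION) {
            *handled_by_vectored = 0;
            return 1;
        }
    }
    return 0;
}

/* Sample handler: claims only breakpoint exceptions. */
static int breakpoint_handler(uint32_t code)
{
    return code == 0x80000003u ? DISP_CONTINUE_EXECUTION
                               : DISP_CONTINUE_SEARCH;
}

/* Sample handler: claims everything it is offered. */
static int catch_all_handler(uint32_t code)
{
    (void)code;
    return DISP_CONTINUE_EXECUTION;
}
```

With `breakpoint_handler` in the vectored list and `catch_all_handler` in the frame-based list, a breakpoint never reaches the second list, while any other exception code falls through to it.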
Key Insight for ShellGhost
The exception dispatch mechanism guarantees that Vectored Exception Handlers are called before SEH handlers. By registering a VEH handler, ShellGhost intercepts every 0xCC breakpoint exception before any frame-based (SEH) handler in the process sees it. An attached debugger is the exception: the kernel offers it the first-chance exception before user-mode dispatch begins. Within the process, this gives ShellGhost first-priority control over the exception, including the ability to modify the thread context (RIP and RFLAGS among other registers) before execution resumes.
4. The EXCEPTION_BREAKPOINT Code
The exception code for an INT3 breakpoint is defined in the Windows headers as:
// From winnt.h / ntstatus.h
#define EXCEPTION_BREAKPOINT 0x80000003L
#define STATUS_BREAKPOINT 0x80000003L
// The exception record delivered to handlers:
typedef struct _EXCEPTION_RECORD {
DWORD ExceptionCode; // 0x80000003 for INT3
DWORD ExceptionFlags; // 0 for first-chance
struct _EXCEPTION_RECORD *ExceptionRecord;
PVOID ExceptionAddress; // Address of the 0xCC byte
DWORD NumberParameters; // 1 for breakpoint
ULONG_PTR ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];
} EXCEPTION_RECORD;
ExceptionAddress Detail
For EXCEPTION_BREAKPOINT on Windows, the kernel (KiDispatchException) automatically decrements RIP by 1 before dispatching to user-mode handlers. This means by the time a VEH handler sees the exception, ExceptionAddress and ContextRecord->Rip already point to the address of the 0xCC byte itself. No manual subtraction is needed. ShellGhost uses ContextRecord->Rip directly to identify which instruction to decrypt.
5. INT3 vs Other Exception Mechanisms
ShellGhost specifically uses INT3 because of its unique properties compared to other exception-generating mechanisms:
| Mechanism | Opcode / Trigger | Exception Code | Size | ShellGhost Suitability |
|---|---|---|---|---|
| INT3 | 0xCC | 0x80000003 | 1 byte | Ideal — replaces any byte without alignment issues |
| INT 1 (ICEBP) | 0xF1 | 0x80000004 | 1 byte | Could work but raises SINGLE_STEP — conflates with trap flag |
| UD2 | 0x0F 0x0B | 0xC000001D | 2 bytes | Too large — cannot replace a single byte |
| Access violation | PAGE_NOACCESS | 0xC0000005 | N/A | Page-granular only, not byte-granular |
| Hardware breakpoint | DR0-DR3 | 0x80000004 | N/A | Limited to 4 breakpoints per thread |
Why Exactly 0xCC?
The 0xCC opcode has two critical properties for ShellGhost: (1) it is exactly one byte, meaning it can replace the first byte of any x86/x64 instruction without corrupting adjacent instructions, and (2) it generates a distinct exception code (0x80000003) that is easily identifiable by the VEH handler. ShellGhost uses a one-exception cycle: each EXCEPTION_BREAKPOINT handler both re-encrypts the previous instruction and decrypts the current one. The single-byte size of 0xCC is what makes per-instruction stepping through the buffer possible.
6. Filling a Region with 0xCC
ShellGhost's preparation step fills the entire shellcode execution region with 0xCC. Here is what that looks like in practice:
// Allocate writable memory and fill with INT3
SIZE_T shellcode_size = 512; // size of actual shellcode
LPVOID exec_mem = VirtualAlloc(
NULL,
shellcode_size,
MEM_COMMIT | MEM_RESERVE,
PAGE_READWRITE // RW initially, toggled to RX for execution
);
// Fill entire region with 0xCC (INT3 breakpoints)
memset(exec_mem, 0xCC, shellcode_size);
// Memory dump at this point:
// 00007FF6`A0000000 CC CC CC CC CC CC CC CC
// 00007FF6`A0000008 CC CC CC CC CC CC CC CC
// 00007FF6`A0000010 CC CC CC CC CC CC CC CC
// ... (all 0xCC for shellcode_size bytes)
A memory scanner examining this region sees nothing but INT3 opcodes. There is no encrypted data, no entropy anomaly, and no signatured byte sequence. The real shellcode bytes (encrypted with RC4) are stored in a separate data buffer that is never made executable.
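What a byte-level scan of such a region would observe can be checked on a plain array. The sketch below is illustrative only and uses hypothetical helper names; it fills a buffer the same way the listing above does and then counts how many different byte values occur.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill a buffer with INT3 opcodes, as the preparation step does. */
static void fill_int3(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    memset(buf, 0xCC, len);
}

/* Returns 1 if the buffer is non-empty and every byte is 0xCC. */
static int is_all_int3(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xCC) {
            return 0;
        }
    }
    return len > 0;
}

/* Count how many different byte values occur in the buffer. */
static size_t distinct_byte_values(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    unsigned char seen[256] = {0};
    size_t distinct = 0;
    for (size_t i = 0; i < len; i++) {
        if (!seen[buf[i]]) {
            seen[buf[i]] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

A freshly filled region has exactly one distinct byte value, so its byte distribution carries no information. Writing even a single other byte changes both results.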
Two Separate Buffers
ShellGhost maintains two buffers: (1) the execution buffer, which is allocated as RW and toggled to RX for execution (filled with 0xCC), and (2) the encrypted shellcode data, containing per-instruction encrypted bytes generated by the preprocessing script. The VEH handler reads from the encrypted data, decrypts one full instruction using SystemFunction032, writes it into the execution buffer, toggles to RX, and after execution, re-encrypts it in the next handler invocation. The encrypted data is never executable, and the execution buffer never contains more than one decrypted instruction at a time.
7. From Breakpoint to Handler: The ShellGhost Connection
Putting it all together, here is how INT3 connects to ShellGhost's execution model:
- A new thread starts at the first 0xCC in the execution buffer
- CPU raises EXCEPTION_BREAKPOINT (0x80000003)
- Windows dispatches to VEH handlers first (ShellGhost's handler is registered with highest priority)
- ShellGhost's handler sees ExceptionCode == 0x80000003
- Handler re-encrypts the previously executed instruction (if any) back to 0xCC
- Handler uses ContextRecord->Rip (already pointing at the 0xCC, adjusted by the kernel) to identify the current instruction
- Handler decrypts the full instruction from the encrypted data using SystemFunction032 (RC4)
- Handler writes the decrypted instruction bytes to the execution buffer
- Handler toggles the memory page from RW to RX via VirtualProtect
- Handler returns EXCEPTION_CONTINUE_EXECUTION. The CPU resumes and executes the real instruction, then hits the next 0xCC
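The buffer bookkeeping in the steps above can be modelled on plain arrays to check the invariant that at most one instruction is in the clear at any time. This is a toy model under loud assumptions: nothing is executed, no exception handling or page protection is involved, a one-byte XOR stands in for RC4 purely as a placeholder, and `instr_slot`, `ghost_model`, and `model_step` are hypothetical names rather than ShellGhost's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3 0xCC
#define NO_PREVIOUS ((size_t)-1)

/* Hypothetical per-instruction record: start offset and length. */
typedef struct {
    size_t offset;
    size_t length;
} instr_slot;

typedef struct {
    unsigned char *exec_buf;       /* starts as all 0xCC */
    const unsigned char *encoded;  /* same layout, encoded bytes */
    const instr_slot *slots;
    size_t slot_count;
    size_t previous;               /* slot decoded on the last step */
    unsigned char key;             /* placeholder XOR key, NOT RC4 */
} ghost_model;

/* Model one breakpoint hit at rip_offset. Returns 0 on success, or -1
   if rip_offset is not the start of a known instruction. */
static int model_step(ghost_model *m, size_t rip_offset)
{
    assert(m != NULL);

    /* Cover the instruction decoded on the previous step with 0xCC. */
    if (m->previous != NO_PREVIOUS) {
        const instr_slot *p = &m->slots[m->previous];
        memset(m->exec_buf + p->offset, INT3, p->length);
        m->previous = NO_PREVIOUS;
    }

    /* Find the instruction that starts at rip_offset and decode it. */
    for (size_t i = 0; i < m->slot_count; i++) {
        const instr_slot *s = &m->slots[i];
        if (s->offset == rip_offset) {
            for (size_t j = 0; j < s->length; j++) {
                m->exec_buf[s->offset + j] =
                    (unsigned char)(m->encoded[s->offset + j] ^ m->key);
            }
            m->previous = i;
            return 0;
        }
    }
    return -1;
}
```

Stepping through two slots in order leaves the first one covered by 0xCC again as soon as the second is decoded, which matches the one-exception cycle described earlier in this module.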
The next module covers Vectored Exception Handling in detail — the mechanism that makes step 3 possible.
Knowledge Check
Q1: What is the opcode encoding for the single-byte INT3 breakpoint instruction?
Q2: What exception code does INT3 generate?
Q3: Why does ShellGhost use INT3 (0xCC) instead of UD2 (0x0F 0x0B)?