Module 7: Background: x86 Trap Flag & Single-Stepping
General x86 knowledge for context — understanding the EFLAGS TF bit and EXCEPTION_SINGLE_STEP.
Important: ShellGhost Does NOT Use the Trap Flag
This module covers the x86/x64 trap flag (TF) as general background knowledge. While single-stepping is a well-known technique used by debuggers and some other evasion tools, ShellGhost itself does not use the trap flag or EXCEPTION_SINGLE_STEP. ShellGhost's one-exception model relies solely on EXCEPTION_BREAKPOINT: each breakpoint handler both re-encrypts the previous instruction and decrypts the current one. The next 0xCC in the buffer naturally signals instruction completion. This module is included because understanding TF is valuable x86 knowledge that helps you appreciate why ShellGhost's simpler approach works and how it compares to trap-flag-based alternatives.
1. The EFLAGS Register
The EFLAGS register (RFLAGS on x64, though only the lower 32 bits are used for flags) is a 32-bit register containing status flags, control flags, and system flags. The trap flag is a system flag at bit position 8:
EFLAGS Register Layout (selected bits):
Bit Name Description
0 CF Carry Flag
2 PF Parity Flag
4 AF Auxiliary Carry Flag
6 ZF Zero Flag
7 SF Sign Flag
8 TF Trap Flag <-- Used by debuggers (NOT by ShellGhost)
9 IF Interrupt Enable Flag
10 DF Direction Flag
11 OF Overflow Flag
12-13 IOPL I/O Privilege Level
14 NT Nested Task
16 RF Resume Flag <-- Relevant to hardware breakpoints (see section 5)
17 VM Virtual-8086 Mode
18 AC Alignment Check
21 ID CPUID Available
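The layout above can be turned into a small decoder. The following is a minimal sketch in portable C; the names `describe_eflags` and `eflags_iopl` are hypothetical helpers introduced here for illustration, not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Single-bit flags from the layout table (IOPL is a 2-bit field, handled below). */
struct flag_name { unsigned bit; const char *name; };

static const struct flag_name FLAG_NAMES[] = {
    {0, "CF"}, {2, "PF"}, {4, "AF"}, {6, "ZF"}, {7, "SF"}, {8, "TF"},
    {9, "IF"}, {10, "DF"}, {11, "OF"}, {14, "NT"}, {16, "RF"},
    {17, "VM"}, {18, "AC"}, {21, "ID"},
};

/* Writes the names of all set flags into out, space-separated, and returns out.
   Stops early (leaving a valid string) if out is too small. */
static char *describe_eflags(uint32_t eflags, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof FLAG_NAMES / sizeof FLAG_NAMES[0]; i++) {
        if ((eflags & (UINT32_C(1) << FLAG_NAMES[i].bit)) == 0)
            continue;
        size_t len = strlen(FLAG_NAMES[i].name);
        size_t sep = used ? 1 : 0;
        if (used + sep + len + 1 > out_size)
            break;
        if (sep)
            out[used++] = ' ';
        memcpy(out + used, FLAG_NAMES[i].name, len + 1); /* includes the NUL */
        used += len;
    }
    return out;
}

/* IOPL occupies bits 12-13. */
static unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags >> 12) & 3u;
}
```

For example, an EFLAGS value of 0x346 decodes to "PF ZF TF IF": bit 1 is a reserved always-one bit and is not listed, and bit 8 shows that single-stepping is armed.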
TF (Bit 8) Behavior
When TF is set to 1 in EFLAGS, the CPU executes exactly one instruction and then raises a debug exception (#DB, interrupt vector 1). The CPU clears TF before the exception handler runs, which keeps the handler itself from being single-stepped, and under Windows the CONTEXT delivered to a user-mode handler likewise has TF clear. After the handler returns, execution continues normally (without single-stepping) unless the handler sets TF again.
// Setting the trap flag via CONTEXT manipulation in a VEH handler
// (ctx is the PCONTEXT taken from ExceptionInfo->ContextRecord)
#define TRAP_FLAG 0x100 // Bit 8 = 0x100 in hex
// Set TF: enable single-step
ctx->EFlags |= TRAP_FLAG;
// Clear TF: disable single-step
ctx->EFlags &= ~TRAP_FLAG;
// Check TF: is single-stepping active?
BOOL is_stepping = (ctx->EFlags & TRAP_FLAG) != 0;
2. The Single-Step Exception Flow
When TF is set, the following sequence occurs, starting in hardware and ending in the user-mode exception handler:
Trap Flag Exception Flow (diagram): TF set via CONTEXT restore → one full instruction executes → TF auto-cleared → #DB (vector 1) → kernel handler → 0x80000004 delivered to VEH
| Step | What Happens | Detail |
|---|---|---|
| 1 | CONTEXT restored with TF=1 | The kernel loads the modified CONTEXT (from VEH handler return) into the CPU registers, including EFLAGS with TF set |
| 2 | CPU executes one instruction | The instruction at RIP executes fully (all bytes decoded and executed, RIP advanced past the instruction) |
| 3 | CPU clears TF | Before raising the exception, the CPU clears bit 8 of EFLAGS. This is a hardware behavior, not OS behavior. |
| 4 | #DB exception raised | The CPU traps to the kernel via IDT entry 1 (KiDebugTrapOrFault) |
| 5 | Kernel dispatches to user mode | The kernel creates an EXCEPTION_RECORD with ExceptionCode = 0x80000004 (EXCEPTION_SINGLE_STEP) |
| 6 | VEH handler called | A TF-based tool's handler receives the single-step exception and performs re-encryption (ShellGhost never reaches this step because it does not set TF) |
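The sequence in the table can be modeled without any Windows dependency. The sketch below is a toy model only: `toy_cpu`, `toy_event`, and `toy_step` are hypothetical names, instruction lengths are supplied by the caller rather than decoded, and the kernel dispatch steps are collapsed into a returned event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRAP_FLAG UINT32_C(0x100)
#define SINGLE_STEP_CODE UINT32_C(0x80000004) /* EXCEPTION_SINGLE_STEP */

struct toy_cpu   { uint64_t rip; uint32_t eflags; };
struct toy_event { bool raised; uint32_t code; uint64_t rip; uint32_t eflags; };

/* Executes one instruction of insn_len bytes. If TF was set when the
   instruction started, reports a single-step event whose flags have TF clear. */
static struct toy_event toy_step(struct toy_cpu *cpu, unsigned insn_len)
{
    assert(cpu != NULL && insn_len >= 1 && insn_len <= 15);
    bool tf_at_start = (cpu->eflags & TRAP_FLAG) != 0;
    struct toy_event ev = {0};

    cpu->rip += insn_len;               /* step 2: the instruction runs fully */
    if (tf_at_start) {
        cpu->eflags &= ~TRAP_FLAG;      /* step 3: TF is cleared */
        ev.raised = true;               /* steps 4-6: #DB reaches the handler */
        ev.code   = SINGLE_STEP_CODE;
        ev.rip    = cpu->rip;           /* address of the NEXT instruction */
        ev.eflags = cpu->eflags;
    }
    return ev;
}
```

Calling `toy_step` a second time without re-arming TF produces no event, which mirrors the point that a handler must set TF again on every stop if it wants continuous single-stepping.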
3. EXCEPTION_SINGLE_STEP Details
// From winnt.h / ntstatus.h
#define EXCEPTION_SINGLE_STEP 0x80000004L
#define STATUS_SINGLE_STEP 0x80000004L
// When a single-step exception is delivered:
// ExceptionRecord->ExceptionCode = 0x80000004
// ExceptionRecord->ExceptionAddress = RIP AFTER the executed instruction
// ExceptionRecord->ExceptionFlags = 0 (continuable; seen here on first-chance delivery)
// ContextRecord->Rip = address of NEXT instruction
// ContextRecord->EFlags = TF already cleared by CPU
RIP Points to the Next Instruction
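For EXCEPTION_BREAKPOINT, the CPU saves a RIP only 1 byte past the INT3 (Windows then reports the address of the 0xCC byte itself in the exception record). EXCEPTION_SINGLE_STEP, by contrast, is delivered with RIP pointing to the next instruction that would execute, because the CPU fully executed the previous instruction (advancing RIP by the instruction's length) before raising the exception. A hypothetical TF-based tool could use the difference between the new RIP and the previous instruction address to determine how many bytes were consumed, and thus how many bytes need to be re-encrypted to 0xCC. That difference equals the instruction length only for straight-line execution; after a taken branch, call, or return, the new RIP is the branch target instead. ShellGhost never needs this calculation because it does not receive EXCEPTION_SINGLE_STEP.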
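The RIP-difference idea can be sketched as follows. This is a hypothetical helper pair for a TF-based tool, not ShellGhost code: `stepped_length` treats any delta outside 1 to 15 bytes (the x86 maximum instruction length) as a control transfer, and `refill_with_int3` simply overwrites the consumed bytes with 0xCC. A short forward branch of 15 bytes or fewer would be indistinguishable from a long instruction, so a real tool would need an instruction map or a decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_MAX_INSN_LEN 15u

/* Returns the length of the instruction that just executed, or 0 if the
   RIP delta cannot be an instruction length (branch, call, return). */
static unsigned stepped_length(uint64_t prev_rip, uint64_t new_rip)
{
    uint64_t delta = new_rip - prev_rip; /* wraps to a huge value if new_rip < prev_rip */
    return (delta >= 1 && delta <= X86_MAX_INSN_LEN) ? (unsigned)delta : 0u;
}

/* Overwrites len bytes at offset with INT3 (0xCC), with bounds checking. */
static void refill_with_int3(uint8_t *buf, size_t buf_len, size_t offset, unsigned len)
{
    assert(buf != NULL && offset <= buf_len && len <= buf_len - offset);
    memset(buf + offset, 0xCC, len);
}
```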
4. TF vs Debugger Single-Step
Debuggers like WinDbg use this same mechanism for their "step into" (F11 / t) command. The comparison below shows how a TF-based design relates to debugger stepping and how ShellGhost differs from both:
| Aspect | Debugger Step-Into | Hypothetical TF-Based Tool | ShellGhost (Actual) |
|---|---|---|---|
| How TF is set | Debugger modifies CONTEXT via SetThreadContext | VEH handler sets TF in CONTEXT | TF is not used |
| How instruction end is detected | EXCEPTION_SINGLE_STEP | EXCEPTION_SINGLE_STEP | Next 0xCC triggers EXCEPTION_BREAKPOINT |
| Exceptions per instruction | 1 (single-step) | 2 (breakpoint + single-step) | 1 (breakpoint only) |
| Performance | User-controlled | 2 kernel transitions per instruction | 1 kernel transition per instruction |
Debugger Interference
If a debugger is attached to a process running ShellGhost, the debugger receives each first-chance EXCEPTION_BREAKPOINT via the debug port before any user-mode VEH handler runs. Unless the debugger passes the exception back to the process as unhandled, ShellGhost's VEH handler never sees it. This is both a limitation (ShellGhost cannot be debugged easily) and an incidental anti-debug behavior. Since ShellGhost does not use TF, there is no single-step conflict with the debugger.
5. The Resume Flag (RF)
Bit 16 of EFLAGS is the Resume Flag (RF). It matters mainly when hardware breakpoints are in use, which is worth knowing when evaluating TF-based designs:
#define RESUME_FLAG 0x10000 // Bit 16
// The Resume Flag prevents repeated #DB exceptions on the same
// instruction when hardware breakpoints (DR0-DR3) are in use.
// When RF=1, the CPU suppresses instruction-breakpoint #DB for one instruction.
RF and ShellGhost
RF exists so that execution can resume at an address matched by a hardware instruction breakpoint without immediately raising #DB again: the resumed EFLAGS carries RF=1, the CPU ignores instruction breakpoints for that one instruction, and RF is cleared once the instruction completes. A single-step trap is reported after the instruction has already finished, so it does not need RF in order to make progress. ShellGhost uses neither the trap flag nor hardware breakpoints (DR0-DR3), so RF is not a concern for its normal execution cycle. If an EDR tool has set hardware breakpoints on the process, those also raise #DB, and Windows reports them with the same 0x80000004 code; a TF-based tool's handler would have to tell them apart from its own single-steps, for example by inspecting DR6.
6. Edge Cases with the Trap Flag
Several x86/x64 instructions interact with TF in special ways that a TF-based tool must handle (ShellGhost, which never sets TF, is unaffected):
| Instruction | TF Behavior | Impact on a TF-Based Tool |
|---|---|---|
| STI (Set Interrupt Flag) | Single-step is delayed until after the instruction following STI | Rare in shellcode (STI is privileged at CPL 3 under Windows); minimal impact |
| MOV SS / POP SS | Single-step is suppressed for one instruction after SS load | Very rare in x64 shellcode; minimal impact |
| REP-prefixed instructions | Single-step fires after each iteration, not after the entire REP loop | The handler gets a single-step per iteration, so its re-encryption logic runs each time |
| IRET | Loads new EFLAGS from stack, potentially changing TF | Rare in user-mode shellcode |
| POPF / POPFQ | Can set or clear TF from stack value | If shellcode uses POPF, it can overwrite TF, and the tool may lose its single-step notification |
| PUSHF / PUSHFQ | Pushes EFLAGS with VM and RF cleared (bits 16-17 masked), but TF is preserved | If shellcode reads EFLAGS via PUSHF while TF=1, it would see TF=1 in the pushed value |
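A TF-based tool could flag the instructions in the table before executing them. The classifier below is a rough sketch under several assumptions: 64-bit mode (so POP SS, opcode 0x17, is omitted because it is invalid there), only the first opcode byte is examined after legacy and REX prefixes, and `classify_tf_hazard` and the `tf_hazard` enum are hypothetical names. It is not a full decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum tf_hazard {
    TF_HAZARD_NONE,
    TF_HAZARD_REP,      /* F2/F3 + string op: one trap per iteration */
    TF_HAZARD_PUSHF,    /* 9C: pushed image exposes TF */
    TF_HAZARD_POPF,     /* 9D: can overwrite TF */
    TF_HAZARD_STI,      /* FB */
    TF_HAZARD_IRET,     /* CF */
    TF_HAZARD_SS_LOAD   /* 8E /2: MOV SS, r/m16 */
};

/* MOVS, CMPS, STOS, LODS, SCAS, INS, OUTS */
static bool is_string_op(uint8_t op)
{
    return (op >= 0xA4 && op <= 0xA7) || (op >= 0xAA && op <= 0xAF) ||
           (op >= 0x6C && op <= 0x6F);
}

static enum tf_hazard classify_tf_hazard(const uint8_t *code, size_t len)
{
    assert(code != NULL);
    size_t i = 0;
    bool rep_seen = false;

    /* Skip legacy prefixes, remembering F2/F3. */
    for (; i < len; i++) {
        uint8_t b = code[i];
        if (b == 0xF2 || b == 0xF3) { rep_seen = true; continue; }
        if (b == 0x66 || b == 0x67 || b == 0xF0 || b == 0x2E || b == 0x36 ||
            b == 0x3E || b == 0x26 || b == 0x64 || b == 0x65)
            continue;
        break;
    }
    if (i < len && (code[i] & 0xF0) == 0x40)   /* optional REX prefix */
        i++;
    if (i >= len)
        return TF_HAZARD_NONE;

    uint8_t op = code[i];
    switch (op) {
    case 0x9C: return TF_HAZARD_PUSHF;
    case 0x9D: return TF_HAZARD_POPF;
    case 0xFB: return TF_HAZARD_STI;
    case 0xCF: return TF_HAZARD_IRET;
    case 0x8E: /* MOV Sreg, r/m16: ModRM.reg == 2 selects SS */
        if (i + 1 < len && ((code[i + 1] >> 3) & 7) == 2)
            return TF_HAZARD_SS_LOAD;
        return TF_HAZARD_NONE;
    default:
        break;
    }
    /* F3 is only REP on string instructions (F3 90 is PAUSE, for example). */
    return (rep_seen && is_string_op(op)) ? TF_HAZARD_REP : TF_HAZARD_NONE;
}
```

ShellGhost's breakpoint-only design needs no such check, which is part of why these edge cases are listed as background rather than as requirements.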
The POPFQ Risk
If the shellcode contains a POPFQ instruction that loads an EFLAGS value with TF=0, it overwrites the trap flag that a TF-based tool set in its breakpoint handler, and the tool can no longer rely on receiving EXCEPTION_SINGLE_STEP for that instruction. In that case the re-encryption step is skipped. However, the next instruction byte is still 0xCC (triggering a breakpoint), so the tool recovers there, at the cost of one instruction that is not re-encrypted immediately. In practice, most shellcode does not use POPFQ, and ShellGhost, which never sets TF, is not exposed to this risk at all.
7. PUSHF and Anti-Debug Detection
The PUSHF/PUSHFQ instructions have an important security consideration. Per the Intel SDM, PUSHF applies a mask of 00FCFFFFH to the pushed EFLAGS value, which clears the VM (bit 17) and RF (bit 16) flags. However, TF (bit 8) is NOT cleared by PUSHF; it is preserved in the pushed value:
; Anti-debug technique (CAN detect single-stepping)
pushfq ; Push EFLAGS onto stack
pop rax ; RAX = EFLAGS value
test rax, 0x100 ; Test trap flag bit
jnz being_debugged ; If TF=1, someone is single-stepping us
; Per Intel SDM: PUSHF clears VM and RF (bits 16-17) via mask 00FCFFFFH
; but TF (bit 8) is PRESERVED in the pushed value.
; This means PUSHF CAN reveal active single-stepping.
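The masking arithmetic can be checked in a few lines of C. This is a minimal sketch of the mask described above; `pushf_image` and `pushf_reveals_single_step` are hypothetical helper names that model what PUSHF stores, not functions from any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRAP_FLAG   UINT32_C(0x00000100) /* bit 8  */
#define RESUME_FLAG UINT32_C(0x00010000) /* bit 16 */
#define VM_FLAG     UINT32_C(0x00020000) /* bit 17 */
#define PUSHF_MASK  UINT32_C(0x00FCFFFF) /* mask applied by PUSHF per the Intel SDM */

/* The mask keeps TF but drops RF and VM. */
static_assert((PUSHF_MASK & TRAP_FLAG) == TRAP_FLAG, "PUSHF preserves TF");
static_assert((PUSHF_MASK & (RESUME_FLAG | VM_FLAG)) == 0, "PUSHF clears RF and VM");

/* Value that PUSHF would store for a given live EFLAGS. */
static uint32_t pushf_image(uint32_t eflags)
{
    return eflags & PUSHF_MASK;
}

/* True if code reading the pushed image would observe TF=1. */
static bool pushf_reveals_single_step(uint32_t eflags)
{
    return (pushf_image(eflags) & TRAP_FLAG) != 0;
}
```

With a live value of 0x00030346 (RF, VM, and TF all set), the pushed image is 0x00000346: RF and VM are gone, but bit 8 survives, which is exactly what the assembly check above tests for.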
ShellGhost Avoids This Entirely
Since ShellGhost does not use the trap flag at all (it relies solely on EXCEPTION_BREAKPOINT from 0xCC bytes), the PUSHF anti-debug detection is irrelevant. If shellcode contains PUSHF-based anti-debug checks, they will see TF=0 because ShellGhost never sets it. This is one of the advantages of ShellGhost's one-exception model over a trap-flag-based approach.
8. TF Lifecycle in a Hypothetical Two-Exception Model
For educational context, here is how the trap flag lifecycle would work in a hypothetical two-exception-per-instruction model (breakpoint + single-step). Note: ShellGhost does NOT use this approach.
Hypothetical two-exception model (NOT how ShellGhost works):
Phase 1: EXCEPTION_BREAKPOINT handler
+--> Handler sets: ctx->EFlags |= 0x100 (TF = 1)
+--> Handler returns EXCEPTION_CONTINUE_EXECUTION
+--> ntdll calls NtContinue; the kernel loads CONTEXT into CPU registers
+--> EFLAGS now has TF = 1
Phase 2: CPU executes one instruction
+--> CPU fully executes the instruction
+--> CPU auto-clears TF (EFLAGS bit 8 = 0)
+--> CPU raises #DB (debug exception, vector 1)
Phase 3: EXCEPTION_SINGLE_STEP handler
+--> Handler re-encrypts the previous instruction
+--> Handler returns EXCEPTION_CONTINUE_EXECUTION
Phase 4: CPU resumes at next 0xCC -> cycle repeats
This approach generates TWO exceptions per instruction.
ShellGhost instead uses a ONE-exception model where
each EXCEPTION_BREAKPOINT handler does BOTH the re-encryption
of the previous instruction AND decryption of the current one.
ShellGhost's Simpler Approach
ShellGhost avoids the trap flag entirely. After executing a decrypted instruction, the CPU naturally hits the next 0xCC and raises EXCEPTION_BREAKPOINT. That breakpoint handler re-encrypts the previous instruction and decrypts the current one. This one-exception model is simpler, generates half the exceptions, and avoids all TF-related edge cases and detection vectors. Understanding the TF mechanism is still valuable because it is used by debuggers and other tools, but ShellGhost's approach is more elegant for this specific use case.
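The one-exception cycle can be illustrated with a toy simulation. Everything here is an assumption made for illustration: a single-byte XOR key stands in for the real cipher, the instruction map (`toy_insn`) is supplied by hand, execution is strictly straight-line, and the "breakpoint handler" is an ordinary function call rather than a VEH callback. It is a sketch of the idea described above, not ShellGhost's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3    0xCC
#define TOY_KEY 0x5A /* hypothetical XOR key standing in for a real cipher */

struct toy_insn { size_t offset; size_t length; };

struct toy_state {
    uint8_t *exec_buf;               /* all 0xCC except the live instruction */
    const uint8_t *encrypted;        /* XOR-encrypted original bytes */
    const struct toy_insn *map;      /* hypothetical instruction map */
    size_t map_count;
    const struct toy_insn *previous; /* instruction revealed by the last handler call */
    unsigned exceptions;             /* number of simulated breakpoints */
};

static const struct toy_insn *find_insn(const struct toy_state *s, size_t offset)
{
    for (size_t i = 0; i < s->map_count; i++)
        if (s->map[i].offset == offset)
            return &s->map[i];
    return NULL;
}

/* One simulated EXCEPTION_BREAKPOINT: hide the previous instruction, reveal
   the current one. Returns the revealed length, or 0 if offset is unmapped. */
static size_t on_breakpoint(struct toy_state *s, size_t offset)
{
    s->exceptions++;
    if (s->previous != NULL)
        memset(s->exec_buf + s->previous->offset, INT3, s->previous->length);
    const struct toy_insn *cur = find_insn(s, offset);
    s->previous = cur;
    if (cur == NULL)
        return 0;
    for (size_t i = 0; i < cur->length; i++)
        s->exec_buf[cur->offset + i] = (uint8_t)(s->encrypted[cur->offset + i] ^ TOY_KEY);
    return cur->length;
}

/* Straight-line run over total_len bytes; returns the number of exceptions. */
static unsigned run_straight_line(struct toy_state *s, size_t total_len)
{
    size_t rip = 0;
    while (rip < total_len) {
        assert(s->exec_buf[rip] == INT3); /* the next byte is always a breakpoint */
        size_t len = on_breakpoint(s, rip);
        if (len == 0)
            break;
        rip += len;                       /* "execute" the revealed instruction */
    }
    if (s->previous != NULL) {            /* toy cleanup of the final instruction */
        memset(s->exec_buf + s->previous->offset, INT3, s->previous->length);
        s->previous = NULL;
    }
    return s->exceptions;
}
```

For a three-instruction buffer the simulation records three exceptions, one per instruction, and at most one instruction is ever in plaintext at a time. A two-exception design would record six for the same buffer.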
Knowledge Check
Q1: At which bit position in the EFLAGS register is the Trap Flag (TF) located?
Q2: What exception code does the CPU generate when TF triggers a single-step?
Q3: What does the CPU do with the Trap Flag when it delivers the single-step exception?