Difficulty: Advanced

Module 7: Background: x86 Trap Flag & Single-Stepping

General x86 knowledge for context — understanding the EFLAGS TF bit and EXCEPTION_SINGLE_STEP.

Important: ShellGhost Does NOT Use the Trap Flag

This module covers the x86/x64 trap flag (TF) as general background knowledge. While single-stepping is a well-known technique used by debuggers and some other evasion tools, ShellGhost itself does not use the trap flag or EXCEPTION_SINGLE_STEP. ShellGhost's one-exception model relies solely on EXCEPTION_BREAKPOINT: each breakpoint handler both re-encrypts the previous instruction and decrypts the current one. The next 0xCC in the buffer naturally signals instruction completion. This module is included because understanding TF is valuable x86 knowledge that helps you appreciate why ShellGhost's simpler approach works and how it compares to trap-flag-based alternatives.

1. The EFLAGS Register

The EFLAGS register (RFLAGS on x64, though only the lower 32 bits are used for flags) is a 32-bit register containing status flags, control flags, and system flags. The trap flag is a system flag at bit position 8:

EFLAGS Register Layout (selected bits):

Bit  Name   Description
 0   CF     Carry Flag
 2   PF     Parity Flag
 4   AF     Auxiliary Carry Flag
 6   ZF     Zero Flag
 7   SF     Sign Flag
 8   TF     Trap Flag          <-- Used by debuggers (NOT by ShellGhost)
 9   IF     Interrupt Enable Flag
10   DF     Direction Flag
11   OF     Overflow Flag
12-13 IOPL  I/O Privilege Level
14   NT     Nested Task
16   RF     Resume Flag         <-- Important for single-step
17   VM     Virtual-8086 Mode
18   AC     Alignment Check
21   ID     CPUID Available
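As an illustration of the layout above, the following sketch decodes a raw EFLAGS value into flag names. The names `flag_name`, `describe_eflags`, and `eflags_iopl` are hypothetical helpers invented for this example; they are not part of ShellGhost or of any Windows header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per single-bit flag from the layout table (hypothetical helper type). */
typedef struct {
    uint32_t    mask;
    const char *name;
} flag_name;

static const flag_name k_flag_names[] = {
    { 1u << 0,  "CF" }, { 1u << 2,  "PF" }, { 1u << 4,  "AF" },
    { 1u << 6,  "ZF" }, { 1u << 7,  "SF" }, { 1u << 8,  "TF" },
    { 1u << 9,  "IF" }, { 1u << 10, "DF" }, { 1u << 11, "OF" },
    { 1u << 14, "NT" }, { 1u << 16, "RF" }, { 1u << 17, "VM" },
    { 1u << 18, "AC" }, { 1u << 21, "ID" },
};

/* Writes the names of all set flags into out, space-separated and in bit
 * order. Returns the number of names written. */
int describe_eflags(uint32_t eflags, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    size_t used = 0;
    int count = 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof k_flag_names / sizeof k_flag_names[0]; i++) {
        if ((eflags & k_flag_names[i].mask) == 0)
            continue;
        int n = snprintf(out + used, out_size - used, "%s%s",
                         count ? " " : "", k_flag_names[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';   /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}

/* IOPL is a two-bit field (bits 12-13), so it is extracted rather than tested. */
unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags >> 12) & 0x3u;
}
```

For example, passing `0x346` (bits 1, 2, 6, 8, and 9 set) yields the four names PF, ZF, TF, and IF; bit 1 is a reserved bit and has no entry in the table.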

TF (Bit 8) Behavior

When TF is set to 1 in EFLAGS, the CPU executes exactly one instruction and then raises a debug exception (interrupt vector 1, #DB). The CPU automatically clears TF before delivering the exception, preventing infinite single-step loops. After the exception handler runs and returns, execution continues normally (without single-stepping) unless the handler sets TF again.

// Setting the trap flag via CONTEXT manipulation in a VEH handler
#define TRAP_FLAG  0x100   // Bit 8 = 0x100 in hex

// Set TF: enable single-step
ctx->EFlags |= TRAP_FLAG;

// Clear TF: disable single-step
ctx->EFlags &= ~TRAP_FLAG;

// Check TF: is single-stepping active?
BOOL is_stepping = (ctx->EFlags & TRAP_FLAG) != 0;

2. The Single-Step Exception Flow

When the CPU encounters a set TF, the following sequence occurs at the hardware level:

Trap Flag Exception Flow

TF set in EFLAGS (via CONTEXT restore) → CPU executes 1 instruction (full instruction) → CPU clears TF (auto-cleared) → #DB exception (vector 1) → KiDebugTrapOrFault (kernel handler) → EXCEPTION_SINGLE_STEP (0x80000004, delivered to VEH)

Step	What Happens	Detail
1	CONTEXT restored with TF=1	The kernel loads the modified CONTEXT (from VEH handler return) into the CPU registers, including EFLAGS with TF set
2	CPU executes one instruction	The instruction at RIP executes fully (all bytes decoded and executed, RIP advanced past the instruction)
3	CPU clears TF	Before raising the exception, the CPU clears bit 8 of EFLAGS. This is a hardware behavior, not OS behavior.
4	#DB exception raised	The CPU traps to the kernel via IDT entry 1 (`KiDebugTrapOrFault`)
5	Kernel dispatches to user mode	The kernel creates an EXCEPTION_RECORD with `ExceptionCode = 0x80000004` (EXCEPTION_SINGLE_STEP)
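6	VEH handler called	A registered VEH handler receives the single-step exception. A TF-based tool would re-encrypt the executed instruction here; ShellGhost never reaches this step because it never sets TF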
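The sequence in the table can be modeled in a few lines of portable C. This is a toy model under stated assumptions, not real exception handling: `toy_context`, `toy_execute_one`, and the `TOY_` constants are hypothetical names, and the "instruction" is reduced to a byte length supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRAP_FLAG                 0x100u       /* EFLAGS bit 8 */
#define TOY_EXCEPTION_NONE        0u
#define TOY_EXCEPTION_SINGLE_STEP 0x80000004u  /* same value as EXCEPTION_SINGLE_STEP */

/* Minimal stand-in for the parts of CONTEXT that matter here. */
typedef struct {
    uint64_t rip;
    uint32_t eflags;
} toy_context;

/* Simulates steps 2-5: run one instruction of insn_len bytes, then report
 * a single-step exception if TF was set when the instruction started. */
uint32_t toy_execute_one(toy_context *ctx, uint32_t insn_len)
{
    assert(ctx != NULL && insn_len >= 1 && insn_len <= 15);

    int stepping = (ctx->eflags & TRAP_FLAG) != 0;

    ctx->rip += insn_len;            /* step 2: instruction fully executed */
    if (!stepping)
        return TOY_EXCEPTION_NONE;

    ctx->eflags &= ~TRAP_FLAG;       /* step 3: TF cleared before delivery */
    return TOY_EXCEPTION_SINGLE_STEP; /* steps 4-5: #DB becomes 0x80000004 */
}
```

Starting from `rip = 0x1000` with TF set, one call with a 3-byte instruction reports the single-step code, leaves `rip` at `0x1003`, and leaves TF clear, so a second call runs without raising anything unless a handler sets TF again.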

3. EXCEPTION_SINGLE_STEP Details

// From winnt.h / ntstatus.h
#define EXCEPTION_SINGLE_STEP   0x80000004L
#define STATUS_SINGLE_STEP      0x80000004L

// When a single-step exception is delivered:
// ExceptionRecord->ExceptionCode    = 0x80000004
// ExceptionRecord->ExceptionAddress = RIP AFTER the executed instruction
// ExceptionRecord->ExceptionFlags   = 0 (first-chance)
// ContextRecord->Rip               = address of NEXT instruction
// ContextRecord->EFlags            = TF already cleared by CPU

RIP Points to the Next Instruction

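EXCEPTION_SINGLE_STEP is delivered with RIP pointing to the next instruction that would execute, because the CPU fully executed the previous instruction (advancing RIP by its length, or to the branch target) before raising the exception. EXCEPTION_BREAKPOINT differs: the CPU stops just 1 byte past the 0xCC, and Windows reports the address of the 0xCC itself as ExceptionAddress. A TF-based tool could use the difference between the new RIP and the previous instruction address to determine how many bytes were consumed, and thus how many bytes to overwrite with 0xCC again; this only works for instructions that fall through, not for taken branches. ShellGhost never handles EXCEPTION_SINGLE_STEP, so it does not rely on this calculation.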
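The length calculation described above can be sketched as follows. This is a minimal illustration using buffer offsets instead of real addresses; `refill_with_int3` and `MAX_X86_INSN_LEN` are hypothetical names, and the range check is a crude heuristic that would still misread a short forward jump as a fall-through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_X86_INSN_LEN 15u   /* architectural upper bound on instruction length */

/* Given the offset of the instruction that just ran (prev_off) and the
 * offset RIP now points to (next_off), overwrite the executed bytes with
 * 0xCC. Returns the number of bytes overwritten, or 0 if the difference
 * cannot be a fall-through instruction length. */
size_t refill_with_int3(uint8_t *buf, size_t buf_len,
                        size_t prev_off, size_t next_off)
{
    assert(buf != NULL);

    if (next_off <= prev_off)
        return 0;                        /* backward branch or no progress */

    size_t consumed = next_off - prev_off;
    if (consumed > MAX_X86_INSN_LEN || next_off > buf_len)
        return 0;                        /* far jump or out of bounds */

    memset(buf + prev_off, 0xCC, consumed);
    return consumed;
}
```

With a buffer that begins `48 31 C0 90`, calling the function with `prev_off = 0` and `next_off = 3` overwrites the three bytes of `xor rax, rax` and leaves the following `nop` untouched.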

4. TF vs Debugger Single-Step

Debuggers like WinDbg use the exact same mechanism for their "step into" (F11 / t) command. Comparing a debugger, a hypothetical TF-based tool, and ShellGhost side by side shows where ShellGhost differs:

Aspect	Debugger Step-Into	Hypothetical TF-Based Tool	ShellGhost (Actual)
How TF is set	Debugger modifies CONTEXT via `SetThreadContext`	VEH handler sets TF in CONTEXT	TF is not used
How instruction end is detected	EXCEPTION_SINGLE_STEP	EXCEPTION_SINGLE_STEP	Next `0xCC` triggers EXCEPTION_BREAKPOINT
Exceptions per instruction	1 (single-step)	2 (breakpoint + single-step)	1 (breakpoint only)
Performance	User-controlled	2 kernel transitions per instruction	1 kernel transition per instruction

Debugger Interference

If a debugger is attached to a process running ShellGhost, the debugger intercepts EXCEPTION_BREAKPOINT events via the debug port before they reach user-mode VEH handlers. This means ShellGhost's VEH handler may never receive the breakpoint exceptions if a debugger is attached. This is both a limitation (cannot debug ShellGhost easily) and an incidental anti-debug behavior. Note that since ShellGhost does not use TF, there is no single-step conflict with the debugger.

5. The Resume Flag (RF)

Bit 16 of EFLAGS is the Resume Flag (RF). It matters in single-step and hardware-breakpoint scenarios, so debuggers and any TF-based tool have to account for it:

#define RESUME_FLAG  0x10000  // Bit 16

// The Resume Flag prevents repeated #DB exceptions on the same
// instruction when hardware breakpoints (DR0-DR3) are in use.
// When RF=1, the CPU suppresses instruction-breakpoint #DB for one instruction.

RF and ShellGhost

RF exists mainly for instruction (execute) hardware breakpoints. Those are reported as faults before the instruction runs, so resuming with RF=1 lets the instruction execute once without immediately raising #DB again; the CPU clears RF after that instruction completes. ShellGhost relies on neither hardware breakpoints (DR0-DR3) nor TF, so RF plays no part in its normal execution cycle. RF becomes relevant only if something else, such as an EDR tool or a debugger, has set hardware breakpoints on the process; a TF-based tool would then share #DB with those breakpoints, while ShellGhost's 0xCC breakpoints are unaffected.

6. Edge Cases with the Trap Flag

Several x86/x64 instructions interact with TF in special ways that a TF-based tool would have to handle (ShellGhost, which never sets TF, is unaffected):

Instruction	TF Behavior	Impact on a TF-Based Tool
`STI` (Set Interrupt Flag)	Single-step is delayed until after the instruction following STI	Rare in shellcode; minimal impact
`MOV SS` / `POP SS`	Single-step is suppressed for one instruction after SS load	Very rare in x64 shellcode; minimal impact
`REP`-prefixed instructions	Single-step fires after each iteration, not after the entire REP loop	The tool gets a single-step per iteration, so re-encryption would run each time
`IRET`	Loads new EFLAGS from stack, potentially changing TF	Not used in user-mode shellcode
`POPF` / `POPFQ`	Can set or clear TF from stack value	If shellcode uses POPF, it could clear TF and the tool loses control
`PUSHF` / `PUSHFQ`	Pushes EFLAGS with VM and RF cleared (bits 16-17 masked), but TF is preserved	If shellcode reads EFLAGS via PUSHF while TF=1, it would see TF=1 in the pushed value

The POPFQ Risk

If the shellcode contains a POPFQ instruction that loads an EFLAGS value with TF=0, the trap flag is cleared and a TF-based tool loses its single-step notification. The next instruction executes without triggering EXCEPTION_SINGLE_STEP, and the re-encryption step is skipped. In a hybrid design that also keeps 0xCC bytes in the buffer, the instruction after that is still 0xCC and triggers a breakpoint, so the tool recovers at the cost of one instruction that is not re-encrypted immediately. In practice, most shellcode does not use POPFQ, and ShellGhost is unaffected because it never sets TF.

7. PUSHF and Anti-Debug Detection

The PUSHF/PUSHFQ instructions have an important security consideration. Per the Intel SDM, PUSHF applies a mask of 00FCFFFFH to the pushed EFLAGS value, which clears the VM (bit 17) and RF (bit 16) flags. However, TF (bit 8) is NOT cleared by PUSHF — it is preserved in the pushed value:

; Anti-debug technique (CAN detect single-stepping)
pushfq                   ; Push EFLAGS onto stack
pop rax                  ; RAX = EFLAGS value
test rax, 0x100          ; Test trap flag bit
jnz being_debugged       ; If TF=1, someone is single-stepping us

; Per Intel SDM: PUSHF clears VM and RF (bits 16-17) via mask 00FCFFFFH
; but TF (bit 8) is PRESERVED in the pushed value.
; This means PUSHF CAN reveal active single-stepping.
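The masking rule quoted above can be checked with plain integer arithmetic. In this sketch, `pushf_image` and `pushf_reveals_single_step` are hypothetical helper names; the only facts taken from the text are the mask value 00FCFFFFH and the TF bit position.

```c
#include <assert.h>
#include <stdint.h>

#define TRAP_FLAG   0x00000100u   /* EFLAGS bit 8 */
#define RESUME_FLAG 0x00010000u   /* EFLAGS bit 16 */
#define VM_FLAG     0x00020000u   /* EFLAGS bit 17 */
#define PUSHF_MASK  0x00FCFFFFu   /* mask applied to the pushed image */

/* Compile-time checks: the mask drops RF and VM but keeps TF. */
static_assert((PUSHF_MASK & (RESUME_FLAG | VM_FLAG)) == 0, "RF and VM are masked");
static_assert((PUSHF_MASK & TRAP_FLAG) != 0, "TF survives the mask");

/* Value that PUSHF/PUSHFQ would place on the stack for a given EFLAGS. */
uint32_t pushf_image(uint32_t eflags)
{
    return eflags & PUSHF_MASK;
}

/* Mirrors the assembly check: non-zero if the pushed image has TF set. */
int pushf_reveals_single_step(uint32_t eflags)
{
    return (pushf_image(eflags) & TRAP_FLAG) != 0;
}
```

An EFLAGS value of `0x00030302` (RF, VM, TF, IF, and the reserved bit 1) produces a pushed image of `0x00000302`: RF and VM are gone, and TF is still visible to the code doing the check.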

ShellGhost Avoids This Entirely

Since ShellGhost does not use the trap flag at all (it relies solely on EXCEPTION_BREAKPOINT from 0xCC bytes), the PUSHF anti-debug detection is irrelevant. If shellcode contains PUSHF-based anti-debug checks, they will see TF=0 because ShellGhost never sets it. This is one of the advantages of ShellGhost's one-exception model over a trap-flag-based approach.

8. TF Lifecycle in a Hypothetical Two-Exception Model

For educational context, here is how the trap flag lifecycle would work in a hypothetical two-exception-per-instruction model (breakpoint + single-step). Note: ShellGhost does NOT use this approach.

Hypothetical two-exception model (NOT how ShellGhost works):

Phase 1: EXCEPTION_BREAKPOINT handler
  +--> Handler sets:  ctx->EFlags |= 0x100   (TF = 1)
  +--> Handler returns EXCEPTION_CONTINUE_EXECUTION
  +--> ntdll resumes the thread via NtContinue; the kernel loads CONTEXT into CPU registers
  +--> EFLAGS now has TF = 1

Phase 2: CPU executes one instruction
  +--> CPU fully executes the instruction
  +--> CPU auto-clears TF (EFLAGS bit 8 = 0)
  +--> CPU raises #DB (debug exception, vector 1)

Phase 3: EXCEPTION_SINGLE_STEP handler
  +--> Handler re-encrypts the previous instruction
  +--> Handler returns EXCEPTION_CONTINUE_EXECUTION

Phase 4: CPU resumes at next 0xCC -> cycle repeats

This approach generates TWO exceptions per instruction.
ShellGhost instead uses a ONE-exception model where
each EXCEPTION_BREAKPOINT handler does BOTH the re-encryption
of the previous instruction AND decryption of the current one.
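The difference between the two models can be made concrete with a small simulation. This is a sketch under loud assumptions: there is no real exception handling, "decryption" is replaced by a plain `memcpy` from a cleartext copy, control flow is strictly straight-line, and the function names `run_two_exception_model` and `run_one_exception_model` are hypothetical. It only counts how many handler invocations each model needs and shows which bytes remain exposed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3 0xCC

/* Two-exception model: the breakpoint handler decrypts and sets TF,
 * then the single-step handler re-encrypts the same instruction. */
unsigned run_two_exception_model(uint8_t *exec_buf, const uint8_t *plain,
                                 const uint8_t *lens, size_t count)
{
    assert(exec_buf != NULL && plain != NULL && lens != NULL);
    unsigned exceptions = 0;
    size_t off = 0;
    for (size_t i = 0; i < count; i++) {
        exceptions++;                                  /* EXCEPTION_BREAKPOINT  */
        memcpy(exec_buf + off, plain + off, lens[i]);  /* decrypt (stand-in)    */
        /* ... the CPU would execute the instruction here ... */
        exceptions++;                                  /* EXCEPTION_SINGLE_STEP */
        memset(exec_buf + off, INT3, lens[i]);         /* re-encrypt            */
        off += lens[i];
    }
    return exceptions;
}

/* One-exception model: each breakpoint handler re-encrypts the previous
 * instruction and decrypts the current one. */
unsigned run_one_exception_model(uint8_t *exec_buf, const uint8_t *plain,
                                 const uint8_t *lens, size_t count)
{
    assert(exec_buf != NULL && plain != NULL && lens != NULL);
    unsigned exceptions = 0;
    size_t off = 0, prev_off = 0, prev_len = 0;
    for (size_t i = 0; i < count; i++) {
        exceptions++;                                  /* EXCEPTION_BREAKPOINT  */
        memset(exec_buf + prev_off, INT3, prev_len);   /* no-op on first call   */
        memcpy(exec_buf + off, plain + off, lens[i]);  /* decrypt (stand-in)    */
        prev_off = off;
        prev_len = lens[i];
        off += lens[i];
    }
    return exceptions;
}
```

For three instructions, the two-exception model reports six handler invocations and leaves the buffer entirely 0xCC, while the one-exception model reports three and leaves only the final instruction decrypted, since no later breakpoint arrives to cover it in this toy setup.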

ShellGhost's Simpler Approach

ShellGhost avoids the trap flag entirely. After executing a decrypted instruction, the CPU naturally hits the next 0xCC and raises EXCEPTION_BREAKPOINT. That breakpoint handler re-encrypts the previous instruction and decrypts the current one. This one-exception model is simpler, generates half the exceptions, and avoids all TF-related edge cases and detection vectors. Understanding the TF mechanism is still valuable because it is used by debuggers and other tools, but ShellGhost's approach is more elegant for this specific use case.

Knowledge Check

Q1: At which bit position in the EFLAGS register is the Trap Flag (TF) located?

A) Bit 0 (0x001)

B) Bit 16 (0x10000)

C) Bit 8 (0x100)

D) Bit 9 (0x200)

Q2: What exception code does Windows report when TF triggers a single-step?

A) 0x80000004 (EXCEPTION_SINGLE_STEP)

B) 0x80000003 (EXCEPTION_BREAKPOINT)

C) 0xC0000005 (EXCEPTION_ACCESS_VIOLATION)

D) 0xC000001D (EXCEPTION_ILLEGAL_INSTRUCTION)

Q3: What does the CPU do with the Trap Flag after delivering the single-step exception?

A) Keeps it set for the next instruction

B) Sets it to an undefined value

C) Toggles it to the opposite state

D) Automatically clears it (sets TF = 0)
