Module 7: Junk Code & Anti-Analysis
Dead code insertion strategies, opaque predicates, register shuffling, control flow obfuscation, and how they defeat static analysis and slow emulation.
Module Objective
Understand the categories of junk instructions Shoggoth generates, how recursive garbage generation works, the role of opaque predicates in control flow obfuscation, why junk code is essential beyond mere padding, and how these techniques impact static analysis, disassembly, and emulation-based detection.
1. Why Junk Code Matters
Register randomization and operation chain variation make each Shoggoth output structurally different, but the logical flow of the decoder is still recognizable: locate data, loop over blocks, apply operations, advance pointer. An analyst or automated tool could identify this pattern through control flow analysis or symbolic execution even if the exact bytes differ.
Junk code serves multiple purposes beyond simple padding:
| Purpose | Mechanism | Impact on Detection |
|---|---|---|
| Break pattern continuity | Insert meaningless instructions between real decoder operations | Long runs of real decoder opcodes rarely appear consecutively, which weakens byte-pattern signatures |
| Inflate code size variably | Random amount of junk per insertion point | Output size varies unpredictably, defeating size-based heuristics |
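| Confuse disassemblers | Jump-over blocks create unreachable bytes that linear-sweep disassemblers misinterpret | Linear-sweep tools such as objdump may produce incorrect listings; recursive-traversal tools such as IDA and Ghidra follow the jump and are mainly affected when the skipping branch is conditional |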
| Slow emulation | Many additional instructions must be emulated with no functional benefit | AV emulators with time/instruction budgets may exhaust limits before reaching the decrypted payload |
| Increase analysis cost | Analysts must separate junk from real instructions — a tedious, error-prone task | Significantly increases manual reverse engineering time |
2. Categories of Junk Instructions
Shoggoth generates several categories of garbage instructions, each with distinct characteristics:
2.1 Side-Effect-Free Single Instructions
The simplest form: individual instructions that execute but have no net effect on the decoder’s functional state. These are selected to avoid modifying any register or flag that the real decoder depends on:
ASM; Side-effect-free junk instruction examples
nop ; explicit no-op
xchg rax, rax ; 2-byte NOP encoding (REX.W + 0x90, i.e. 48 90)
lea rax, [rax] ; load effective address of itself
mov rbx, rbx ; register self-move
pushfq ; push/pop flags (net zero)
popfq ;
push rcx ; push/pop a register (net zero)
pop rcx ;
test r8, r8 ; sets flags but decoder doesn't use them here
cmp r9, 0 ; sets flags, no data effect
Register Safety
Junk instructions must not clobber registers that hold live decoder state (data pointer, loop counter, block value, keys). Shoggoth tracks which registers are “in use” by the decoder at each insertion point and only generates junk that uses free registers or self-canceling operations on in-use registers (like push reg; pop reg).
2.2 Self-Canceling Instruction Pairs
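Pairs of instructions that undo each other, leaving register and memory contents unchanged. The arithmetic, logic, and rotate pairs still modify flags, so they are only safe where the decoder does not depend on the current flag state: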
ASM; Self-canceling pairs
push rbx ; save rbx
pop rbx ; restore rbx (net effect: none)
add r12, 0x1337 ; modify r12
sub r12, 0x1337 ; restore r12
xor r10, 0xDEAD ; flip bits
xor r10, 0xDEAD ; flip them back
inc r15 ; increment
dec r15 ; decrement back
rol r8, 13 ; rotate left
ror r8, 13 ; rotate right (undo)
These pairs are particularly effective because individually each instruction does something meaningful — it modifies a register or flags. An analyst cannot simply filter out NOP-like instructions; they must trace the data flow to realize the operations cancel out.
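From the analyst's side, the data-flow reasoning described above can be approximated with a small peephole pass. The following is a minimal sketch, not part of Shoggoth or any real tool: the `Insn` record, `cancels`, and `stripCancelingPairs` are hypothetical names, and the model assumes register-plus-immediate operands only and ignores flag effects.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified instruction record. A real tool would use the
// decoded-instruction type of its disassembler instead.
struct Insn {
    std::string mnemonic;  // e.g. "add", "push"
    std::string reg;       // register operand, e.g. "r12"
    std::uint64_t imm;     // immediate operand, 0 if unused
};

// True when `b` undoes `a` on the same register with the same immediate.
// Flag side effects are deliberately ignored in this sketch.
bool cancels(const Insn& a, const Insn& b) {
    if (a.reg != b.reg || a.imm != b.imm) return false;
    const std::string& m = a.mnemonic;
    const std::string& n = b.mnemonic;
    return (m == "push" && n == "pop") ||
           (m == "add" && n == "sub") || (m == "sub" && n == "add") ||
           (m == "inc" && n == "dec") || (m == "dec" && n == "inc") ||
           (m == "rol" && n == "ror") || (m == "ror" && n == "rol") ||
           (m == "xor" && n == "xor");
}

// Drops adjacent canceling pairs. Treating the output as a stack also
// collapses nested pairs such as push / inc / dec / pop.
std::vector<Insn> stripCancelingPairs(const std::vector<Insn>& in) {
    std::vector<Insn> out;
    for (const Insn& i : in) {
        if (!out.empty() && cancels(out.back(), i)) {
            out.pop_back();
        } else {
            out.push_back(i);
        }
    }
    return out;
}
```

A pass like this only catches pairs that end up adjacent after collapsing; pairs separated by unrelated instructions need real data-flow analysis, which is the extra cost the technique imposes.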
2.3 Jump-Over Blocks
One of the most disruptive junk code categories: a short JMP instruction that skips over a block of random bytes. The skipped bytes never execute but are present in the binary:
ASM; Jump-over block: JMP skips random bytes
jmp short skip_garbage ; EB XX (short jump)
db 0x48, 0x89, 0xE5 ; random bytes (look like "mov rbp, rsp")
db 0xFF, 0x15, 0x00 ; random bytes (look like the start of "call [rip+disp32]", which needs a 4-byte displacement)
db 0x8B, 0x45, 0xFC ; random bytes (look like "mov eax, [rbp-4]")
skip_garbage:
; real decoder instructions continue here
Disassembler Confusion
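Linear disassemblers (those that decode bytes sequentially) will attempt to decode the random bytes as instructions, producing nonsensical disassembly. If the last junk "instruction" is decoded as longer than the remaining junk, it swallows the first bytes of the real code after the label, and the listing can stay desynchronized for several instructions. Recursive-traversal disassemblers follow the unconditional jump and normally skip the bytes, but they can still be misled when the skipping branch is conditional (as with an opaque predicate) and both paths are explored. The filler bytes are sometimes chosen to resemble valid instruction sequences, which maximizes the confusion.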
2.4 Fake Function Calls
Shoggoth can generate patterns that mimic function call conventions but are actually no-ops:
ASM; Fake function call pattern
push r11 ; “save caller-saved register”
push r10 ; “save another register”
call fake_func ; call that immediately returns
pop r10 ; “restore registers”
pop r11
jmp continue
fake_func:
ret ; immediately returns
continue:
; real code resumes
To an analyst or automated tool, this looks like a legitimate function call with register preservation. Determining that fake_func does nothing requires following the call target and analyzing its behavior. This adds branching complexity to the control flow graph and increases analysis time.
3. Recursive Garbage Generation
Shoggoth does not simply insert a flat list of junk instructions. The garbage generator is recursive — junk blocks can contain nested junk blocks, creating multi-level obfuscation structures:
C++// Conceptual recursive garbage generator
void insertGarbage(x86::Assembler& a, std::mt19937& rng, int depth) {
    if (depth <= 0) return;
    std::uniform_int_distribution<int> typeDist(0, 4);
    std::uniform_int_distribution<int> countDist(1, 3);
    int count = countDist(rng);
    for (int i = 0; i < count; i++) {
        int type = typeDist(rng);
        switch (type) {
            case 0: // Single NOP-equivalent
                emitSingleJunk(a, rng);
                break;
            case 1: // Self-canceling pair
                emitCancelingPair(a, rng);
                break;
            case 2: // Jump-over with random bytes
                emitJumpOver(a, rng);
                break;
            case 3: // Fake function call
                emitFakeCall(a, rng);
                break;
            case 4: // Nested: recursively generate more junk
                insertGarbage(a, rng, depth - 1);
                break;
        }
    }
}
The recursion depth is bounded to prevent the output from growing unboundedly, but even at depth 2-3, the nesting creates complex control flow patterns that significantly complicate analysis.
Figure: Recursive Junk Code Structure. Junk blocks interleaved with real decoder instructions, in order: jump-over block; xor [ptr], key; fake call + self-cancel; add ptr, 8; NOP + nested junk.
4. Opaque Predicates
Implementation Note
The following opaque predicate techniques are presented as general polymorphic engine concepts. Shoggoth’s current implementation uses simpler conditional jump obfuscation with immediately-bound labels rather than these advanced predicate forms. These techniques are included because they are educationally valuable and represent the broader state of the art in polymorphic engine design.
An opaque predicate is a conditional branch whose outcome is known at generation time (always true or always false) but is difficult for an analyst or automated tool to determine statically. They inject fake control flow edges into the program’s control flow graph:
ASM; Opaque predicate: always-true condition
push rax
mov rax, 7
imul rax, rax ; rax = 49
and rax, 1 ; 49 is odd, so rax = 1
test rax, rax ; ZF = 0 (always)
pop rax
jnz real_path ; ALWAYS taken (49 & 1 = 1)
; Dead code: never reached but present in binary
db 0xCC, 0xCC, 0xCC ; fake INT3 breakpoints
jmp some_decoy
real_path:
; Real decoder continues here
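The key insight is that 7 * 7 = 49 and 49 & 1 = 1 (odd numbers always have bit 0 set), so the branch is always taken. Because this example uses only constants, a tool that performs constant propagation can also resolve it; practical opaque predicates compute the condition from values that are not known statically. In that setting automated tools face the opaque predicate problem: determining whether an arbitrary computation always produces the same boolean result is undecidable in the general case.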
| Opaque Predicate Type | Example | Why It’s Hard to Resolve |
|---|---|---|
| Arithmetic invariant | x*x + x is always even | Requires algebraic reasoning about integer properties |
| Pointer aliasing | Two pointers that always differ but require alias analysis to prove | Precise alias analysis is undecidable in general and NP-hard even in restricted settings |
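| Number theory | x*(x+1)*(x+2) is always divisible by 6, provided the product does not overflow the register width | Requires reasoning that three consecutive integers contain a multiple of 2 and a multiple of 3 |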
| Correlated branches | Branch on X at point A, branch on same X at point B | Requires tracking value flow across basic blocks |
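The two arithmetic rows in the table behave differently under fixed-width machine arithmetic, which matters when such an invariant is used as a branch condition. The sketch below checks both exhaustively at 8 bits so the loop stays small; the function names are hypothetical, and the same divisibility argument carries over to 64-bit registers.

```cpp
#include <cstdint>
#include <vector>

// x*x + x = x*(x+1). Parity survives wraparound because 2 divides 2^8,
// so the result is even for every 8-bit x.
bool squarePlusSelfAlwaysEven() {
    for (unsigned x = 0; x < 256; ++x) {
        std::uint8_t v = static_cast<std::uint8_t>(x * x + x);
        if (v % 2 != 0) return false;
    }
    return true;
}

// x*(x+1)*(x+2) is divisible by 6 over the integers, but not necessarily
// after 8-bit wraparound, because 3 does not divide 2^8.
std::vector<unsigned> divisibleBySixCounterexamples() {
    std::vector<unsigned> bad;
    for (unsigned x = 0; x < 256; ++x) {
        std::uint8_t v = static_cast<std::uint8_t>(x * (x + 1) * (x + 2));
        if (v % 6 != 0) bad.push_back(x);
    }
    return bad;
}
```

The first counterexample is x = 6: 6*7*8 = 336, which wraps to 80 at 8 bits, and 80 is not a multiple of 6. A predicate built on the second invariant is only constant if the inputs are kept small enough to avoid overflow.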
Impact on Control Flow Graphs
Each opaque predicate adds one or two fake edges to the control flow graph (CFG). The “dead” path can contain arbitrary bytes — fake instructions, partial instruction encodings, or even bytes that look like important data. This pollutes the CFG with unreachable nodes and false cross-references, making automated analysis tools produce larger, more confusing output.
5. Register Shuffling
Implementation Note
In Shoggoth’s actual implementation, register randomization occurs at code generation time via a Fisher-Yates shuffle (the MixupArrayRegs function), not through mid-execution XCHG instructions. Each generation randomly assigns registers before any code is emitted. The XCHG-based mid-decoder shuffling shown below is a theoretical enhancement that would add an additional layer of anti-analysis capability.
Beyond the initial random register assignment, a polymorphic engine could shuffle register contents mid-decoder. This would mean the register holding the data pointer might change from R12 to RBX partway through the decoder, so an analyst can no longer follow a single register through the code:
ASM; Register shuffle mid-decoder (theoretical enhancement)
; Before shuffle: R12 = data pointer, R14 = counter
xchg r12, rbx ; Now RBX = data pointer
xchg r14, rdi ; Now RDI = counter
; All subsequent code uses RBX and RDI instead
; Analyst tracking R12 loses the trail
This approach would be particularly effective against automated analysis that tracks “tainted” data through register assignments. After a shuffle, the analysis must propagate taint through the xchg and update all subsequent references — a step that simple pattern-matching tools often miss.
6. Anti-Emulation Properties
While junk code’s primary purpose is anti-signature and anti-analysis, it also provides anti-emulation benefits:
| Anti-Emulation Effect | How Junk Code Achieves It |
|---|---|
| Instruction budget exhaustion | AV emulators often limit execution to N instructions (e.g., 10,000). Junk instructions consume this budget without doing useful work, so the emulator times out before the payload is decrypted. |
| Memory access patterns | Junk instructions that push to and pop from the stack add memory accesses that the emulator must faithfully simulate. |
| Flag state complexity | Junk instructions that modify FLAGS create complex flag state that the emulator must track through every instruction to maintain correctness. |
| Branch evaluation overhead | Jump-over blocks and opaque predicates create branches that the emulator must evaluate, adding to the instruction count and analysis complexity. |
Emulation Budgets Are Real
AV emulators operate under strict time and instruction budgets because they must scan every file that is accessed. A typical budget might be 5-50 milliseconds or 10,000-100,000 emulated instructions per file. If the decoder executes more junk instructions than the budget allows before reaching the first real decryption operation (for example, 50,000 junk instructions against a 10,000-instruction budget), the emulator exhausts its budget and gives up without ever seeing the decrypted payload.
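The budget argument is simple arithmetic, sketched below. All names and figures are hypothetical and only illustrate how per-iteration junk multiplies across a decryption loop; real emulators use more elaborate limits than a single instruction counter.

```cpp
#include <cstdint>

// Instructions an emulator must execute before the decoder finishes:
// a one-time junk preamble plus real and junk work in every loop iteration.
std::uint64_t instructionsToPayload(std::uint64_t preambleJunk,
                                    std::uint64_t iterations,
                                    std::uint64_t realPerIteration,
                                    std::uint64_t junkPerIteration) {
    return preambleJunk + iterations * (realPerIteration + junkPerIteration);
}

// The emulator observes the decrypted payload only if its instruction
// budget covers the whole decoder run.
bool emulatorReachesPayload(std::uint64_t budget, std::uint64_t needed) {
    return needed <= budget;
}
```

With an assumed 200-instruction preamble and 1,000 loop iterations of 6 real and 60 junk instructions each, the decoder needs 66,200 instructions: beyond a 10,000-instruction budget but within a 100,000-instruction one. This is why junk placed inside the loop has far more effect on the count than junk placed once at the start.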
7. Junk Code Insertion Points in Shoggoth
Shoggoth strategically places junk code at every point where it will not break the decoder logic:
Insertion Points Map
- Before any decoder code — junk preamble at the very start of execution
- Between register setup instructions — between the LEA, MOV instructions that initialize the decoder
- Inside the decryption loop, between each operation — between XOR, ADD, SUB, etc. within the per-block processing
- Around the loop control — between the pointer advance, counter decrement, and conditional branch
- Between Stage 2 and Stage 1 decoders — junk interlude between the two decoder stubs
- Inside the RC4 KSA loop — between S-box permutation steps
- Inside the RC4 PRGA loop — between keystream generation and XOR application
- After all decryption — before the final jump to the decrypted payload
At each insertion point, the recursive garbage generator is called with a random depth and count, producing a variable amount of junk. The total junk volume per output is unpredictable, which means the output size, instruction count, and basic block count all vary between generations.
Knowledge Check
Q1: Why are jump-over blocks particularly effective at confusing disassemblers?
Q2: What is an opaque predicate?
7*7 = 49 is odd, so a branch on its parity is always taken, but proving this statically requires reasoning about the computed value that many tools do not perform.
Q3: How does junk code help defeat AV emulation?