Module 2: Polymorphic Engines — Concepts
What polymorphism means in the malware context, how it differs from metamorphism, and the historical evolution of self-mutating code.
Module Objective
Understand the formal definition of polymorphism in executable code, distinguish it from metamorphism, learn the historical lineage from early DOS viruses to modern tools like Shoggoth, and grasp the core components every polymorphic engine must implement.
1. Defining Polymorphism
In biology, polymorphism refers to organisms of the same species existing in multiple forms. In the context of executable code, polymorphism means that a program can change its own binary representation while preserving its original functionality. Each instance of a polymorphic program looks different at the byte level, but they all perform the same computation when executed.
More precisely, a polymorphic engine takes an input payload P and produces an output E(P) such that:
- Functional equivalence: Executing
E(P)eventually produces the same behavior as executingP - Structural variance: Given two outputs
E1(P)andE2(P)from different runs, the byte content ofE1andE2differs substantially - Self-contained decryption:
E(P)includes all code necessary to reverse the transformation and recoverPat runtime
The output typically consists of two parts: a decoder stub (variable code that performs decryption) and an encrypted payload (the original code, transformed to be unreadable without the decoder). The polymorphic property applies primarily to the decoder stub — it is the part that must change between generations to avoid signature detection.
2. Polymorphic vs Metamorphic
These two terms are often confused. While both involve code mutation, they operate at fundamentally different levels:
| Property | Polymorphic | Metamorphic |
|---|---|---|
| What changes | The decoder stub (wrapper around encrypted payload) | The entire program body, including functional code |
| Payload state | Encrypted at rest, decrypted at runtime | Always in cleartext — the code itself is rewritten |
| Mutation timing | At generation time (before delivery) | At propagation time (virus rewrites itself when copying) |
| Complexity | Moderate — generate variable decoder + encrypt payload | Very high — must rewrite arbitrary code while preserving semantics |
| Detection difficulty | Hard (no stable decoder signature) but emulation can reveal payload | Very hard (no encrypted payload to recover; code is always “valid”) |
| Historical examples | MtE, SMEG, Shoggoth | Win32.Simile, Win32.ZMist, Regswap |
Shoggoth Is Polymorphic, Not Metamorphic
Shoggoth encrypts the payload and generates a variable decoder stub. It does not rewrite the payload’s own instructions. The payload is treated as an opaque blob of bytes that gets encrypted and later decrypted at runtime. The polymorphism applies to the decoder stub and junk code surrounding it.
3. Historical Evolution
Polymorphic engines have a four-decade lineage in the virus world. Understanding this history contextualizes Shoggoth as a modern implementation of well-established principles.
3.1 The Early Era (1990–1992)
The concept of self-encrypting code appeared in the late 1980s, but the first true polymorphic engine was the Mutation Engine (MtE) released in 1992 by the virus author known as Dark Avenger. MtE was distributed as a linkable object file — any virus author could incorporate it. It generated random decryption loops using different registers, instruction orderings, and interspersed junk instructions.
MtE demonstrated that polymorphism could be packaged as a reusable library rather than being tightly coupled to a specific virus. This modular approach directly parallels how Shoggoth functions today: a standalone tool that takes any payload and wraps it in polymorphic packaging.
3.2 The Arms Race (1993–2000)
After MtE, virus authors and AV vendors entered an escalating arms race. Engines like SMEG (Simulated Metamorphic Encryption Generator) and TPE (Trident Polymorphic Engine) pushed the boundaries with more registers, more junk instruction types, and more complex control flow in the decoder. AV vendors responded with generic decryption (GD) — running the code in an emulator until the payload was decrypted, then scanning the cleartext result.
3.3 Metamorphic Leap (2000–2010)
Some virus authors moved beyond polymorphism to full metamorphism. Win32.Simile (2002) could rewrite its own code using instruction substitution, register reassignment, and code transposition — all without encrypting anything. Win32.ZMist (by the author Z0mbie) took this further with code integration, embedding itself within the host program’s control flow graph.
3.4 Modern Offensive Tools (2010–Present)
The red team and offensive security community adopted polymorphic techniques for payload generation. Tools like msfvenom (Metasploit) offer encoder chains (shikata_ga_nai, xor_dynamic), Veil generates polymorphic payloads in various languages, and Shoggoth brings true asmjit-based x86-64 polymorphic encoding with multi-stage encryption and sophisticated junk code generation.
Historical Timeline of Polymorphic Engines
MtE
SMEG / TPE
Simile (metamorphic)
shikata_ga_nai
Shoggoth
4. Core Components of a Polymorphic Engine
Every polymorphic engine, from MtE to Shoggoth, must implement a set of core components. The sophistication of each component determines the engine’s evasion capability:
Component Architecture
| Component | Purpose | Shoggoth Implementation |
|---|---|---|
| Random Number Generator | Source of entropy for all decisions | C++ <random> with optional seed for reproducibility |
| Encryption Engine | Transforms payload bytes into unreadable ciphertext | Two-stage: RC4 + random block cipher (XOR/ADD/SUB/ROL/ROR/NOT/NEG/INC/DEC) |
| Decoder Generator | Emits machine code that reverses the encryption at runtime | asmjit x86::Assembler with CodeHolder and JitRuntime |
| Register Allocator | Randomly assigns CPU registers to decoder variables | Random selection from available GPR pool (RAX-R15, excluding RSP and RBP) |
| Junk Code Generator | Inserts meaningless instructions to break pattern matching | Recursive generation: jump-over blocks, side-effect-free ops, fake calls |
| Output Assembler | Combines decoder + encrypted payload into final output | Flat PIC binary blob, optionally with PE/COFF loader prepended |
5. The Decoder Stub Anatomy
The decoder stub is the heart of any polymorphic output. It must accomplish a fixed set of tasks, but the way it accomplishes them varies. Here is a conceptual pseudocode for a polymorphic decoder:
Pseudocode// Conceptual decoder stub structure (varies each generation)
//
// 1. Locate encrypted payload (RIP-relative addressing)
// 2. Set up decryption loop counter
// 3. For each block/byte of encrypted data:
// a. Apply inverse operations (reverse order of encryption)
// b. Advance pointer
// 4. Transfer control to decrypted payload
//
// What changes between generations:
// - Which registers hold the pointer, counter, key
// - Whether the loop counts up or down
// - Which instructions implement each operation
// - Where junk instructions are inserted
// - Whether operations are inlined or use subroutines
The invariant is the algorithm (decrypt the payload, then jump to it). Everything else — the registers, the instruction encodings, the ordering of setup operations, the junk code — is variable. This separation of algorithm from implementation is the fundamental principle of polymorphism.
6. Equivalent Instruction Substitution
A key technique in polymorphic engines is replacing instructions with functionally equivalent alternatives. The x86-64 instruction set is rich with redundancies that engines can exploit:
| Original Instruction | Equivalent Alternatives |
|---|---|
mov rax, 0 | xor rax, rax / sub rax, rax / and rax, 0 / push 0; pop rax |
mov rax, rbx | push rbx; pop rax / lea rax, [rbx] / xchg rax, rbx (if rbx no longer needed) |
add rax, 1 | inc rax / sub rax, -1 / lea rax, [rax+1] |
test rax, rax | cmp rax, 0 / or rax, rax / and rax, rax |
xor [ptr], key | mov tmp, [ptr]; xor tmp, key; mov [ptr], tmp |
Each substitution changes the binary encoding (different opcodes, different lengths) while preserving the computational result. When combined with register randomization and junk code insertion, the resulting machine code looks completely different between generations.
Side-Effect Awareness
Not all substitutions are truly equivalent. Some instructions affect CPU flags differently. For example, lea rax, [rax+1] does not modify the FLAGS register, while inc rax modifies OF, SF, ZF, AF, PF (but not CF), and add rax, 1 modifies all arithmetic flags. A correct polymorphic engine must track flag dependencies to avoid breaking the decoder logic.
7. Why Shoggoth Chose asmjit
Historical polymorphic engines often assembled machine code manually — writing raw bytes into a buffer, computing jump offsets by hand, and managing relocations manually. This is error-prone and difficult to maintain. Shoggoth uses asmjit, a mature C++ library for runtime x86/x64 code generation, which provides:
- Assembler-level abstraction: Write
a.xor_(rax, rbx)instead of computing opcode bytes manually - Automatic encoding: asmjit handles REX prefixes, ModR/M bytes, SIB bytes, and displacement encoding
- Label management: Forward references and branch target resolution are handled automatically
- CodeHolder: A container that accumulates generated code, manages sections, and handles relocations
- JitRuntime: Allocates executable memory and copies the generated code for execution (used in testing)
This allows Shoggoth to express its polymorphic transformations at a high level while asmjit handles the low-level encoding details. We will explore asmjit in depth in Module 3.
Knowledge Check
Q1: What is the key difference between polymorphic and metamorphic code?
Q2: What was historically significant about MtE (Mutation Engine)?
Q3: Why must a polymorphic engine track CPU flag side effects during instruction substitution?
lea rax,[rax+1] does not modify FLAGS while add rax,1 modifies CF, OF, SF, ZF, AF, PF. If the decoder uses a conditional branch that depends on a flag, substituting the wrong instruction can break the decryption logic entirely.