Module 1: Signature-Based Detection & Why Polymorphism
Understanding how static analysis catches malware and why simple encoding schemes fail to provide lasting evasion.
Module Objective
Learn the mechanics behind signature-based detection, understand how YARA rules and pattern-matching engines work, see why XOR encoding and single-key encryption are trivially defeated, and build the conceptual foundation for why polymorphic engines like Shoggoth exist.
1. How Signature-Based Detection Works
Antivirus and EDR products rely heavily on signature-based detection as a first line of defense. A signature is a sequence of bytes, a pattern, or a set of conditions that uniquely identifies a known piece of malicious code. When a file is scanned, the detection engine searches for these known patterns within the binary data.
At its core, signature matching is a string search problem. The scanner maintains a database of thousands (sometimes millions) of signatures and compares every scanned file against this database. If a match is found, the file is flagged as malicious. The key advantage of this approach is its speed and low false-positive rate — a byte-exact match on a known malware sample is almost always a true positive.
| Detection Method | How It Works | Strengths | Weaknesses |
|---|---|---|---|
| Byte Signatures | Exact byte sequences extracted from known malware samples | Extremely fast, near-zero false positives | Trivially defeated by changing a single byte |
| Wildcard Signatures | Patterns with wildcards (e.g., EB ?? 90 90 ?? FF) allowing variable bytes | Tolerates minor variations | Wider patterns increase scan time and false positives |
| Heuristic Signatures | Behavioral patterns like “allocates RWX memory then writes MZ header” | Catches unknown variants | Higher false-positive rate, computationally expensive |
| YARA Rules | Flexible pattern language combining byte patterns, strings, conditions, and metadata | Highly expressive, community-maintained rulesets | Rules must be written and maintained; evasion is possible |
2. YARA Rules in Practice
YARA is the de facto standard for writing malware signatures in the security industry. A YARA rule consists of strings to search for (hex patterns, text strings, or regular expressions) and a condition that determines when the rule matches. Understanding YARA is essential to understanding what polymorphic engines must defeat.
Consider a simple YARA rule that targets a hypothetical shellcode loader:
YARArule ShellcodeLoader_Generic {
meta:
author = "Analyst"
description = "Detects generic shellcode loader pattern"
strings:
$api1 = "VirtualAlloc" ascii
$api2 = "VirtualProtect" ascii
$stub = { 48 89 5C 24 08 48 89 6C 24 10 48 89 74 24 18 }
$xor_loop = { 80 34 ?? ?? 48 FF C? 48 3B ?? 75 }
condition:
uint16(0) == 0x5A4D and
all of ($api*) and
($stub or $xor_loop)
}
This rule looks for a PE file containing API import strings, a known function prologue byte pattern, and a characteristic XOR decryption loop. If all conditions match, the file is flagged. The rule is effective against static payloads but falls apart against polymorphic output because the byte patterns, register choices, and instruction sequences change every time.
Key Insight
YARA rules target invariants — byte sequences that remain the same across samples. A polymorphic engine eliminates invariants by ensuring that no two outputs share the same byte sequences in their decoder stubs, encryption keys, or instruction layouts.
3. Why Simple XOR Encoding Fails
The most basic attempt at evading signatures is XOR encoding: take the payload, XOR every byte with a single key, and prepend a small decoder loop. This was sufficient in the 1990s but is trivially defeated today for several reasons:
3.1 The Decoder Stub Is Static
Even though the encrypted payload changes when the key changes, the decoder loop itself remains identical. A YARA rule can simply target the decoder stub rather than the payload:
ASM; Classic single-byte XOR decoder - always the same bytes
jmp short get_address ; EB XX
get_address:
pop rsi ; 5E
xor rcx, rcx ; 48 31 C9
mov cl, PAYLOAD_LEN ; B1 XX
decode_loop:
xor byte [rsi + rcx], KEY ; 80 74 0E XX
loop decode_loop ; E2 FA
jmp rsi ; FF E6
The opcodes EB, 5E, 48 31 C9, 80 74 0E, E2 FA, FF E6 form a reliable signature. Changing the XOR key changes the operand byte but not the instruction opcodes.
3.2 Statistical Analysis Breaks XOR
Single-byte XOR is vulnerable to frequency analysis. In most payloads, the null byte (0x00) appears frequently. When XORed with key K, null bytes become K. An analyst can find the most frequent byte in the encrypted blob, assume it corresponds to 0x00, and recover the key instantly.
Brute Force Is Trivial
A single-byte XOR key has only 256 possible values. Even without frequency analysis, an automated tool can try all 256 keys in microseconds, check if the decrypted output contains known strings or valid instructions, and recover the payload. Multi-byte XOR with a short repeating key is only marginally better — Kasiski examination and index-of-coincidence analysis can determine the key length and break it.
4. Why Multi-Layer Static Encryption Still Fails
A natural progression from single-byte XOR is to layer multiple encryption operations: first XOR with key A, then ADD with key B, then ROL by 3 bits. This makes frequency analysis harder, but it does not solve the fundamental problem:
- The decoder is still static. No matter how complex the encryption, the decoder stub that reverses the operations is a fixed sequence of instructions. Analysts write signatures for the decoder, not the encrypted blob.
- The structure is predictable. Decoder stub + encrypted payload is a recognizable pattern. The entropy profile (low-entropy decoder followed by high-entropy encrypted data) is itself a detection signal.
- Emulation defeats it. Modern AV engines include CPU emulators that can execute the decoder stub in a sandbox, observe the decrypted output, and scan that for signatures. The encryption becomes irrelevant — the emulator peels it off automatically.
Why Static Encryption Fails Against Emulation
5. The Case for Polymorphism
Polymorphism solves the fundamental weakness of static encoding by ensuring that the decoder stub itself changes with every generation. A polymorphic engine does not merely change the encryption key — it changes the instructions used to perform decryption, the registers allocated, the order of operations, and even inserts meaningless junk code between real instructions.
The goal is to eliminate all static invariants from the output. Two runs of a polymorphic engine with the same input payload produce outputs that:
- Use different encryption keys and algorithms
- Have different decoder stub instructions (even though they perform the same logical operation)
- Use different CPU registers
- Contain different amounts and types of junk code
- Have different binary sizes and layouts
- Share zero common byte sequences longer than a few bytes
| Property | Static Encoder | Polymorphic Engine |
|---|---|---|
| Decoder stub | Identical every time | Different instructions, registers, layout each time |
| Encryption | Same algorithm, different key | Different algorithm chain, different keys, different order |
| Junk code | None | Randomly generated NOP-equivalent instructions inserted throughout |
| Register usage | Hardcoded registers | Randomly selected from available pool |
| Signature resilience | Trivially signatured on decoder | No stable byte pattern to signature |
| Emulation resilience | None | Junk code + opaque predicates slow emulation; can exceed time budgets |
6. What Shoggoth Brings to the Table
Shoggoth by frkngksl is a modern polymorphic encryptor implemented in C++ using the asmjit library for runtime x86-64 code generation. It takes three input types (raw shellcode, PE files, COFF/BOF files) and produces position-independent encrypted output that self-decrypts at runtime.
What makes Shoggoth a true polymorphic engine rather than a simple encoder:
Shoggoth’s Polymorphic Properties
- Two-stage encryption — RC4 stream cipher plus a random block cipher using operations chosen from XOR, ADD, SUB, ROL, ROR, NOT, NEG, INC, DEC
- Dynamic stub assembly — asmjit generates x86-64 machine code at runtime, choosing different registers and instruction sequences each time
- Recursive junk code — garbage instructions are inserted between real decoder operations, including jump-over blocks, side-effect-free computations, and fake function calls
- PIC output — all output is position-independent code that runs from any memory address without fixups
- Deterministic replay — an optional seed parameter allows reproducible generation for testing while still producing polymorphic output by default
Over the next seven modules, we will dissect every component of this engine: the asmjit code generation library, the encryption pipeline, the decoder stub construction, the junk code strategies, and the final output packaging for each supported format.
Knowledge Check
Q1: Why does a single-byte XOR encoder fail against modern detection?
Q2: What is the primary advantage of a polymorphic engine over a static encoder?
Q3: How does an AV emulator defeat static encryption?