Module 4: Shoggoth Architecture Overview
The end-to-end encoder pipeline: from raw input to polymorphic position-independent output, covering the three operational modes and the role of each component.
Module Objective
Understand Shoggoth’s complete architecture: how input payloads are processed, how the PIC loaders are merged for PE and COFF modes, the two encryption stages, the decoder stub generation pipeline, garbage code insertion points, and the final output assembly. By the end of this module, you will have a mental model of the entire data flow.
1. The Three Operational Modes
Shoggoth supports three input modes, each handling a different payload type. The mode determines how the input is pre-processed before encryption:
| Mode | Flag | Input | Pre-Processing | Output |
|---|---|---|---|---|
| Raw | --mode raw | Shellcode (.bin) | None — input is used as-is | PIC blob (decoder stub + encrypted shellcode) |
| PE | --mode pe | x64 PE executable (.exe) | PIC PE loader prepended to the PE file | PIC blob (decoder stub + encrypted [loader + PE]) |
| COFF | --mode coff | x64 COFF/BOF (.o) | PIC COFF loader prepended to the COFF object | PIC blob (decoder stub + encrypted [loader + COFF]) |
In raw mode, the input shellcode is treated as an opaque byte sequence. Shoggoth encrypts it and generates a decoder stub that decrypts and jumps to it. No loader is needed because the input is already position-independent shellcode.
In PE mode, the input is a standard x64 PE executable. Since PE files require a loader to process imports, relocations, and sections, Shoggoth prepends a PIC PE loader — a self-contained piece of shellcode that can parse the PE headers, map sections, resolve imports by walking the PEB/LDR structures, apply relocations, and transfer control to the entry point.
In COFF mode, the input is a COFF object file (commonly used as Beacon Object Files / BOFs in Cobalt Strike). A PIC COFF loader is prepended that handles symbol resolution and section mapping for COFF objects.
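The mode selection described above can be sketched as a small dispatch step. This is a minimal illustration, not Shoggoth's actual source: the names `Mode`, `parseMode`, and `needsLoader` are hypothetical and exist only for this example.

```cpp
#include <string>

// The three operational modes from the table above.
enum class Mode { Raw, PE, COFF };

// Hypothetical helper: map the --mode argument to a Mode.
// Returns false (and leaves `out` untouched) for unknown values.
bool parseMode(const std::string& arg, Mode& out) {
    if (arg == "raw")  { out = Mode::Raw;  return true; }
    if (arg == "pe")   { out = Mode::PE;   return true; }
    if (arg == "coff") { out = Mode::COFF; return true; }
    return false;
}

// Only PE and COFF inputs get a PIC loader prepended;
// raw shellcode is used as-is.
bool needsLoader(Mode mode) {
    return mode != Mode::Raw;
}
```

Keeping the pre-processing decision in one predicate means the later encryption stages can treat the payload as an opaque byte buffer regardless of mode.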
2. PIC Loaders
The PE and COFF loaders are critical components that enable Shoggoth to handle non-shellcode inputs. They are compiled from C source code using MinGW with specific constraints to ensure position-independence:
PIC Loader Constraints
- No global variables: all state is kept on the stack or in registers
- No C runtime: compiled with -nostdlib -nostartfiles
- Only the .text section is extracted: the compiled output is stripped down to the executable code
- Runtime API resolution: Windows API addresses are resolved at runtime by walking the PEB → LDR → InMemoryOrderModuleList to find kernel32.dll, then parsing its export table
- No absolute addresses: all references are RIP-relative or stack-relative
The loaders are pre-compiled and stored in the stub/ directory of the Shoggoth source tree. At encryption time, the appropriate loader binary is read, the input payload is appended to it, and the combined blob becomes the data that gets encrypted.
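```cpp
// Conceptual: PE mode payload assembly
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Illustrative helper (not from the Shoggoth source): read a file into bytes
std::vector<uint8_t> readFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<uint8_t>(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}

std::vector<uint8_t> buildPePayload(const std::string& inputPath) {
    // 1. Read the PIC PE loader stub
    std::vector<uint8_t> payload = readFile("stub/PELoader.bin");
    // 2. Read the input PE file
    const std::vector<uint8_t> peFile = readFile(inputPath);
    // 3. Concatenate: loader + PE file = combined payload
    payload.insert(payload.end(), peFile.begin(), peFile.end());
    // 4. The caller encrypts this combined payload. The decoder stub decrypts it,
    //    then execution starts at the loader, which parses and maps the PE file
    return payload;
}
```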
3. The Encryption Pipeline
After the payload is assembled (raw shellcode, or loader + PE/COFF), Shoggoth applies a two-stage encryption pipeline. Each stage uses a different algorithm with randomly generated parameters:
Shoggoth Encryption Pipeline (diagram): input payload (shellcode / loader+PE / loader+COFF) → Stage 1: RC4 (random key, stream cipher) → Stage 2: block cipher (random ops on 8-byte blocks)
3.1 Stage 1: RC4 Stream Cipher
The first encryption stage applies the RC4 stream cipher with a randomly generated key. RC4 suits this role for several reasons: its output is the same length as its input (no padding), its implementation is simple enough to express in a few x86 instructions, and every key byte feeds into the whole keystream, so changing one key byte changes the entire output.
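For reference, the textbook RC4 algorithm (key scheduling followed by keystream generation) can be sketched as below. This is the standard published algorithm written for illustration, not code taken from Shoggoth; the function name `rc4` and the choice to throw on an empty key are assumptions of this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Textbook RC4. Encryption and decryption are the same operation:
// both XOR the data with the keystream derived from `key`.
std::vector<uint8_t> rc4(const std::vector<uint8_t>& key,
                         const std::vector<uint8_t>& data) {
    if (key.empty()) throw std::invalid_argument("RC4 key must not be empty");

    // KSA: initialise the permutation S and mix in the key.
    uint8_t S[256];
    for (int i = 0; i < 256; ++i) S[i] = static_cast<uint8_t>(i);
    uint8_t j = 0;
    for (int i = 0; i < 256; ++i) {
        j = static_cast<uint8_t>(j + S[i] + key[i % key.size()]);
        std::swap(S[i], S[j]);
    }

    // PRGA: generate one keystream byte per input byte and XOR it in.
    std::vector<uint8_t> out(data.size());
    uint8_t a = 0, b = 0;
    for (std::size_t n = 0; n < data.size(); ++n) {
        a = static_cast<uint8_t>(a + 1);
        b = static_cast<uint8_t>(b + S[a]);
        std::swap(S[a], S[b]);
        const uint8_t k = S[static_cast<uint8_t>(S[a] + S[b])];
        out[n] = static_cast<uint8_t>(data[n] ^ k);
    }
    return out;
}
```

Because the cipher is symmetric, calling `rc4(key, rc4(key, data))` returns the original `data`, and the output is always exactly as long as the input.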
3.2 Stage 2: Random Block Cipher
The second stage divides the data into 8-byte blocks and applies a randomly selected chain of arithmetic and bitwise operations to each block. For each encryption run, Shoggoth randomly selects which operations to apply, and in what order, from the pool ADD, SUB, XOR, NOT, NEG, INC, DEC, ROL, ROR. Operations that take an operand (ADD, SUB, XOR, ROL, ROR) use a randomly generated key or shift value; NOT, NEG, INC, and DEC take none.
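The arithmetic behind such a chain can be sketched on plain 64-bit integers. This is a minimal model under stated assumptions, not Shoggoth's implementation: the types `Op` and `Step`, the helper names, the use of `std::mt19937_64`, and representing the data as already-aligned `uint64_t` blocks are all choices made for this example. The key point it shows is that every operation in the pool is invertible, and a chain is undone by applying the inverses in reverse order.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

enum class Op { Add, Sub, Xor, Not, Neg, Inc, Dec, Rol, Ror };

struct Step {
    Op op;
    uint64_t operand; // key for Add/Sub/Xor, shift for Rol/Ror, unused otherwise
};

static uint64_t rotl(uint64_t v, unsigned s) { s &= 63; return s ? (v << s) | (v >> (64 - s)) : v; }
static uint64_t rotr(uint64_t v, unsigned s) { s &= 63; return s ? (v >> s) | (v << (64 - s)) : v; }

// Apply one operation to one 8-byte block (arithmetic wraps modulo 2^64).
uint64_t apply(const Step& st, uint64_t v) {
    switch (st.op) {
        case Op::Add: return v + st.operand;
        case Op::Sub: return v - st.operand;
        case Op::Xor: return v ^ st.operand;
        case Op::Not: return ~v;
        case Op::Neg: return 0 - v;
        case Op::Inc: return v + 1;
        case Op::Dec: return v - 1;
        case Op::Rol: return rotl(v, static_cast<unsigned>(st.operand));
        case Op::Ror: return rotr(v, static_cast<unsigned>(st.operand));
    }
    return v;
}

// The operation that undoes `st`.
Step inverse(const Step& st) {
    switch (st.op) {
        case Op::Add: return {Op::Sub, st.operand};
        case Op::Sub: return {Op::Add, st.operand};
        case Op::Inc: return {Op::Dec, 0};
        case Op::Dec: return {Op::Inc, 0};
        case Op::Rol: return {Op::Ror, st.operand};
        case Op::Ror: return {Op::Rol, st.operand};
        default:      return st; // Xor, Not, and Neg are their own inverses
    }
}

// Pick `length` random operations with random operands.
std::vector<Step> randomChain(std::mt19937_64& rng, std::size_t length) {
    std::vector<Step> chain;
    for (std::size_t i = 0; i < length; ++i) {
        const Op op = static_cast<Op>(rng() % 9);
        uint64_t operand = rng();
        if (op == Op::Rol || op == Op::Ror) operand = 1 + operand % 63; // shift in 1..63
        chain.push_back({op, operand});
    }
    return chain;
}

void encryptBlocks(std::vector<uint64_t>& blocks, const std::vector<Step>& chain) {
    for (auto& b : blocks)
        for (const auto& st : chain) b = apply(st, b);
}

// Undo the chain: inverse operations, applied last-to-first.
void decryptBlocks(std::vector<uint64_t>& blocks, const std::vector<Step>& chain) {
    for (auto& b : blocks)
        for (auto it = chain.rbegin(); it != chain.rend(); ++it) b = apply(inverse(*it), b);
}
```

For example, a chain of ADD then XOR is reversed by XOR then SUB. How the real tool handles inputs whose length is not a multiple of 8 is not covered here.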
Optional Stage Control
Shoggoth provides flags to skip either encryption stage: --dont-do-first-encryption skips RC4, and --dont-do-second-encryption skips the block cipher. A third flag, --encrypt-only-decryptor, applies the second stage only to the RC4 decoder stub rather than the entire payload. These options are useful for testing or when layering Shoggoth with other tools.
4. Decoder Stub Generation
For each encryption stage, Shoggoth uses asmjit to generate a corresponding decoder stub — position-independent x86-64 machine code that reverses the encryption at runtime. The decoder stubs are the polymorphic heart of the system:
| Stub | Decrypts | Algorithm | Polymorphic Properties |
|---|---|---|---|
| Block Cipher Decoder | Stage 2 encryption | Inverse operations on 8-byte blocks (reverse order: if encrypted with ADD then XOR, decoder does XOR then SUB) | Random registers, random junk code, random operation sequence (matches encryption) |
| RC4 Decoder | Stage 1 encryption | RC4 KSA + PRGA implementation in x86-64 assembly | Random registers, junk code insertion between RC4 steps |
Each decoder stub is generated fresh using asmjit, with random register assignments and junk code inserted at multiple points. The stubs include RIP-relative addressing to locate the encrypted data that follows them in memory.
5. Garbage Code Insertion Points
Shoggoth inserts garbage (junk) instructions at multiple points in the pipeline to further break pattern matching. Junk code is inserted:
- Before the block cipher decoder — garbage instructions at the very start of execution
- Within the block cipher decoder loop — between real decryption instructions
- Between the two decoder stubs — after block cipher decryption, before RC4 decryption
- Within the RC4 decoder — between KSA and PRGA steps
- After the final decoder — before the jump to the decrypted payload
The junk code generator recursively produces instructions that have no net effect on the decoder’s functional state. These include jump-over blocks (a short JMP that skips random bytes), self-cancelling operations (push/pop pairs, or XORing a register with the same value twice), and fake function call patterns. Module 7 covers this in detail.
6. Final Output Assembly
The final output is assembled by concatenating the generated components in execution order:
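Final PIC Output Structure (diagram): decoder stubs for Stage 2 (block cipher) and Stage 1 (RC4), interleaved with junk code and followed by the encrypted payload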
When executed, the flow is:
- The CPU executes the junk preamble (effectively no-ops)
- The Stage 2 decoder runs: it removes the block cipher layer, revealing the RC4 decoder stub and, unless --encrypt-only-decryptor was used, the RC4-encrypted payload
- The junk interlude executes (more no-ops)
- The Stage 1 decoder runs: it removes the RC4 layer, revealing the cleartext payload
- Control transfers to the decrypted payload (shellcode, or the PIC PE/COFF loader)
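The reason Stage 2 is undone before Stage 1 is that layered transforms must be peeled off in the reverse of the order they were applied. The sketch below demonstrates this with two toy byte-level layers (a fixed XOR and a fixed ADD). These are deliberately simple stand-ins chosen for this example, not the real RC4 or block cipher stages, and all names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Toy stand-in for Stage 1: XOR every byte with a constant (self-inverse).
Bytes stage1Apply(Bytes d) { for (auto& b : d) b = static_cast<uint8_t>(b ^ 0x5A); return d; }
Bytes stage1Undo(Bytes d)  { return stage1Apply(d); }

// Toy stand-in for Stage 2: add a constant to every byte (mod 256).
Bytes stage2Apply(Bytes d) { for (auto& b : d) b = static_cast<uint8_t>(b + 0x13); return d; }
Bytes stage2Undo(Bytes d)  { for (auto& b : d) b = static_cast<uint8_t>(b - 0x13); return d; }

// Build time: Stage 1 first, then Stage 2 on top of it.
Bytes encryptLayers(const Bytes& plain) { return stage2Apply(stage1Apply(plain)); }

// Run time: remove the outer layer (Stage 2) first, then Stage 1.
Bytes decryptLayers(const Bytes& cipher) { return stage1Undo(stage2Undo(cipher)); }
```

Undoing the layers in the wrong order, as in `stage2Undo(stage1Undo(cipher))`, does not recover the input, because XOR and modular addition do not commute.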
7. Command-Line Interface
Shoggoth’s CLI exposes control over every stage of the pipeline:
```shell
# Basic usage: encrypt raw shellcode
Shoggoth.exe -i payload.bin -o encrypted.bin -m raw
# Encrypt a PE file with a specific seed for reproducibility
Shoggoth.exe -i implant.exe -o encrypted.bin -m pe -s 12345
# Encrypt a COFF/BOF with custom RC4 key
Shoggoth.exe -i beacon.o -o encrypted.bin -m coff -k AABBCCDD
# Skip RC4 stage, only use block cipher
Shoggoth.exe -i payload.bin -o encrypted.bin -m raw --dont-do-first-encryption
# Encrypt COFF with BOF arguments
Shoggoth.exe -i beacon.o -o encrypted.bin -m coff --coff-arg 0x00000001...
```
| Flag | Required | Description |
|---|---|---|
| -i / --input | Yes | Path to input payload file |
| -o / --output | Yes | Path for encrypted output file |
| -m / --mode | Yes | Encryption mode: raw, pe, or coff |
| -s / --seed | No | RNG seed for deterministic output (useful for testing) |
| -k / --key | No | Custom RC4 key in hex (default: randomly generated) |
| --coff-arg | No | BOF arguments in beacon_generate.py format |
| --dont-do-first-encryption | No | Skip Stage 1 (RC4) |
| --dont-do-second-encryption | No | Skip Stage 2 (block cipher) |
| --encrypt-only-decryptor | No | Stage 2 encrypts only the RC4 decoder, not the full payload |
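The effect of the -s / --seed row can be illustrated with a seeded generator: when every random choice is drawn from one RNG, fixing the seed fixes the whole output. The sketch below is an assumption-laden illustration, not Shoggoth's code. It uses `std::mt19937` as a stand-in for whatever generator the tool actually uses, and `keyFromSeed` and `entropySeed` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Derive `length` key bytes from a generator seeded with `seed`.
// The same seed always yields the same bytes.
std::vector<uint8_t> keyFromSeed(uint32_t seed, std::size_t length) {
    std::mt19937 rng(seed);
    std::vector<uint8_t> key(length);
    for (auto& b : key) b = static_cast<uint8_t>(rng() & 0xFF);
    return key;
}

// When no seed is supplied, draw one from the platform's entropy source
// so that each run differs.
uint32_t entropySeed() {
    return std::random_device{}();
}
```

Calling `keyFromSeed(12345, 16)` twice gives identical keys, which is what makes seeded runs reproducible for debugging, while `keyFromSeed(entropySeed(), 16)` models the default behaviour.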
8. Source Code Organization
The Shoggoth repository is organized into distinct directories, each handling a specific concern:
Repository Structure
| Directory | Contents | Role |
|---|---|---|
| src/ | Main encryptor C++ source | Core engine: encryption, asmjit stub generation, junk code, CLI |
| PELoader/ | PIC PE loader C source | Compiled to position-independent shellcode that loads PE files from memory |
| COFFLoader/ | PIC COFF loader C source | Compiled to position-independent shellcode that loads COFF/BOF files |
| stub/ | Pre-compiled loader binaries | Ready-to-use .bin files for PE and COFF loaders |
| COFFArgGenerator/ | Python script | beacon_generate.py for formatting BOF arguments |
Knowledge Check
Q1: In PE mode, what does Shoggoth prepend to the PE file before encryption?
A PIC PE loader compiled with -nostdlib. This loader resolves API addresses by walking the PEB/LDR structures, maps PE sections, applies relocations, and calls the PE entry point. It contains no global variables and uses only RIP-relative or stack-relative addressing.
Q2: What is the correct order of decryption when the output executes?
The Stage 2 (block cipher) decoder runs first, then the Stage 1 (RC4) decoder, which is the reverse of the encryption order.
Q3: What does the --seed flag do?
The --seed flag initializes the C++ random number generator with a fixed value. Since all random decisions (key generation, register selection, junk code placement, operation selection) flow from this RNG, the same seed with the same input produces byte-identical output. This is invaluable for debugging and testing. Without the flag, the seed comes from system entropy, so output remains fully polymorphic by default.