zhongwei/gh-cyberkaida-reverse-engineering-assistant-reva

Files

Zhongwei Li 31df8711f2 Initial commit

2025-11-29 18:17:15 +08:00

18 KiB

Raw Blame History

Cryptographic Pattern Recognition Reference

This document provides pattern recognition guides for identifying and analyzing cryptographic implementations in compiled binaries. These patterns are generic and apply across different algorithms - focus on conceptual characteristics, not specific implementations.

General Crypto Recognition

Crypto Presence Indicators

High-confidence indicators:

Crypto-related strings: "encrypt", "decrypt", "cipher", "AES", "RSA", "key", "hash"
Crypto library imports: CryptEncrypt, OpenSSL functions, libcrypto
Large constant arrays (256+ bytes) with seemingly random data
Heavy use of XOR operations
Bitwise rotation patterns: (x << n) | (x >> (32-n))
Fixed iteration counts (rounds): 8, 10, 12, 14, 16, 32, 64, 80
Modular arithmetic on large integers

Medium-confidence indicators:

Nested loops with array manipulations
Byte-level array indexing patterns
S-box style lookups: output = table[input]
State transformation in fixed-size blocks

What to check:

search-strings-regex pattern="(encrypt|decrypt|crypto|cipher|AES|DES|RSA|RC4|key|hash|salt|iv)"
get-symbols includeExternal=true → Look for crypto API imports
search-decompilation pattern="(xor|sbox|round|permut)"

Block Cipher Patterns

Conceptual Characteristics

Core concept: Transform fixed-size data blocks through multiple rounds of substitution and permutation.

Key identifying features:

Fixed block size: Data processed in chunks (64 bits, 128 bits, etc.)
Round structure: Outer loop with fixed iteration count
Substitution: Table lookups (S-boxes) replacing input bytes
Permutation: Bit shuffling, rotation, mixing operations
Key schedule: Function deriving per-round keys from master key

Generic code structure:

// Simplified conceptual pattern
void block_cipher_encrypt(uint8_t* data, uint8_t* key) {
    uint8_t round_keys[NUM_ROUNDS][KEY_SIZE];
    generate_round_keys(key, round_keys);

    for (int round = 0; round < NUM_ROUNDS; round++) {
        substitute_bytes(data);      // S-box lookups
        permute_bits(data);          // Bit shuffling
        mix_columns(data);           // Linear transformation
        add_round_key(data, round_keys[round]);  // XOR with round key
    }
}

Substitution-Permutation Network (SPN)

What it is: Most modern block ciphers (AES, PRESENT, etc.)

Recognition pattern:

Loop structure:
  for round in 0..NUM_ROUNDS:
    1. SubBytes (S-box lookup)
    2. ShiftRows/PermuteBits (positional change)
    3. MixColumns (linear transformation)
    4. AddRoundKey (XOR with round key)

Characteristics:
  - Large constant arrays (S-boxes, typically 256 bytes)
  - Heavy XOR usage
  - Byte/word array indexing
  - State array (16+ bytes)

AES-specific signatures:

S-box starting: 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5...
Round counts: 10 (AES-128), 12 (AES-192), 14 (AES-256)
128-bit state (16 bytes, often as 4x4 matrix)
Rcon (round constant) array for key expansion

DES-specific signatures:

64-bit blocks (8 bytes)
16 rounds
Permutation tables (IP, FP)
S-box arrays (8 boxes of 64 entries each)
Feistel structure (see below)

Feistel Network

What it is: Older block cipher design (DES, Blowfish, TEA)

Recognition pattern:

Loop structure:
  Split data into left and right halves
  for round in 0..NUM_ROUNDS:
    temp = right
    right = left XOR F(right, round_key[round])
    left = temp
  Swap and combine halves

Characteristics:
  - Data split in half
  - Swap operation each round
  - F-function (round function) operating on half the data
  - Other half XORed with F-function output

Telltale code patterns:

// Feistel structure
uint32_t left = data[0];
uint32_t right = data[1];

for (int i = 0; i < rounds; i++) {
    uint32_t temp = right;
    right = left ^ f_function(right, key[i]);
    left = temp;
}

Block Cipher Investigation Strategy

Count rounds: Outer loop iterations → Indicates cipher type and key size
Measure block size: How much data processed per iteration → 64-bit (DES) or 128-bit (AES)
Identify S-boxes: Large constant arrays → read-memory and compare to known S-boxes
Check key schedule: Look for function deriving multiple round keys from master key
Recognize structure: SPN (parallel operations) vs Feistel (swap pattern)

Useful tools:

get-decompilation limit=50 includeIncomingReferences=true
read-memory at constant array addresses
rename-variables: var_1 → sbox, var_2 → state, var_3 → round_key

Stream Cipher Patterns

Conceptual Characteristics

Core concept: Generate pseudo-random keystream from key, then XOR with plaintext.

Key identifying features:

State-based generation: Internal state evolves to produce keystream bytes
Simple combination: ciphertext = plaintext XOR keystream
No fixed blocks: Can encrypt arbitrary lengths
Smaller code: Less complex than block ciphers (no large S-boxes)
Initialization: State setup from key (KSA - Key Scheduling Algorithm)

Generic code structure:

// Simplified conceptual pattern
void stream_cipher(uint8_t* data, size_t len, uint8_t* key) {
    uint8_t state[STATE_SIZE];
    initialize_state(state, key);  // KSA

    for (size_t i = 0; i < len; i++) {
        uint8_t keystream_byte = generate_next_byte(state);  // PRGA
        data[i] ^= keystream_byte;
    }
}

RC4 Pattern (Most Common in CTFs)

Recognition pattern:

Initialization (KSA):
  state = [0, 1, 2, ..., 255]  // 256-byte array
  j = 0
  for i in 0..255:
    j = (j + state[i] + key[i % key_len]) % 256
    swap(state[i], state[j])

Keystream generation (PRGA):
  i = 0; j = 0
  for each byte:
    i = (i + 1) % 256
    j = (j + state[i]) % 256
    swap(state[i], state[j])
    keystream_byte = state[(state[i] + state[j]) % 256]
    output ^= keystream_byte

Telltale signs:

256-byte state array
Swap operations: temp = a[i]; a[i] = a[j]; a[j] = temp
Modulo 256 (% 256 or & 0xFF)
Index computations with running totals
Two-phase structure (init, then generate)

ChaCha/Salsa Pattern

Recognition pattern:

512-bit state (16 words of 32 bits)
Quarter-round function (ARX: Add-Rotate-XOR)
Magic constants: "expand 32-byte k" or "expand 16-byte k"
20 rounds (10 double-rounds) for ChaCha20
Heavy use of 32-bit rotation

Stream Cipher Investigation Strategy

Find state initialization: Look for array setup from key
Identify update function: How state evolves (swap, ARX, LCG, etc.)
Locate XOR operation: Simple output = input ^ keystream
Check for reuse: Is same keystream used multiple times? (weakness)
Analyze state size: 256 bytes (RC4), 64 bytes (ChaCha), variable (custom)

Useful tools:

search-decompilation pattern="swap|xor"
get-decompilation to see state evolution loop
rename-variables: var_1 → state, var_2 → keystream, var_3 → index

Public Key Cryptography Patterns

Conceptual Characteristics

Core concept: Asymmetric encryption using mathematical trapdoor functions.

Key identifying features:

Large integer arithmetic: Numbers hundreds or thousands of bits
Modular exponentiation: result = base^exponent mod modulus
Very slow: Orders of magnitude slower than symmetric crypto
Multi-precision arithmetic: Arrays representing big integers

Generic code structure:

// Simplified modular exponentiation (square-and-multiply)
bigint modexp(bigint base, bigint exponent, bigint modulus) {
    bigint result = 1;
    while (exponent > 0) {
        if (exponent & 1) {
            result = (result * base) % modulus;  // Multiply
        }
        base = (base * base) % modulus;  // Square
        exponent >>= 1;
    }
    return result;
}

RSA Pattern

Recognition pattern:

Key components:
  - Large modulus N (1024, 2048, 4096+ bits)
  - Public exponent e (often 65537 = 0x10001)
  - Private exponent d

Encryption: c = m^e mod N
Decryption: m = c^d mod N

Operations:
  - Modular exponentiation (square-and-multiply)
  - Multi-precision multiplication
  - Barrett or Montgomery reduction for modulo

Telltale signs:

Very large buffers (128, 256, 512 bytes+)
Magic constant 0x10001 (common RSA public exponent)
Bit-by-bit processing of exponent
Slow execution (many iterations)
Functions for add/subtract/multiply on arrays

Elliptic Curve Pattern

Recognition pattern:

Point addition/doubling operations
Affine or projective coordinates (x, y) or (x, y, z)
Field arithmetic (modular arithmetic over prime field)
Curve parameters (a, b, p, G, n)
Scalar multiplication (point added to itself k times)

Public Key Investigation Strategy

Identify big integer operations: Look for array-based arithmetic
Find exponentiation pattern: Square-and-multiply loop
Extract parameters: Modulus, exponent values from constants
Check key size: Buffer sizes indicate security level
Look for weak parameters: Small exponents, factorable moduli (CTF tricks)

CTF-specific weaknesses:

Small modulus (factorizable)
Small private exponent (Wiener's attack)
Reused primes across multiple keys
Textbook RSA (no padding, malleable)

Hash Function Patterns

Conceptual Characteristics

Core concept: One-way compression of arbitrary data to fixed-size digest.

Key identifying features:

Initialization constants: Fixed magic numbers unique to algorithm
Block processing: Data processed in chunks (512 bits typical)
State accumulation: Running state updated with each block
Padding: Append bits to make input multiple of block size
Heavy mixing: Lots of bitwise operations (irreversible)

Generic code structure:

// Simplified hash structure
void hash(uint8_t* data, size_t len, uint8_t* digest) {
    uint32_t state[STATE_SIZE];
    initialize_state(state);  // Magic constants

    // Process each block
    for (each block in data) {
        process_block(state, block);  // Compression function
    }

    finalize(state, digest);  // Output transformation
}

MD5/SHA Recognition

MD5 initialization constants:

state[0] = 0x67452301;
state[1] = 0xefcdab89;
state[2] = 0x98badcfe;
state[3] = 0x10325476;

SHA-1 initialization constants:

state[0] = 0x67452301;
state[1] = 0xefcdab89;
state[2] = 0x98badcfe;
state[3] = 0x10325476;
state[4] = 0xc3d2e1f0;

SHA-256 initialization constants:

// First 32 bits of fractional parts of square roots of first 8 primes
state[0] = 0x6a09e667;
state[1] = 0xbb67ae85;
state[2] = 0x3c6ef372;
// ... 5 more

Telltale signs:

Characteristic initialization constants (search for these!)
Fixed round counts: 64 (MD5, SHA-256), 80 (SHA-1, SHA-512)
Bitwise rotations: (x << n) | (x >> (32-n))
Message schedule expansion (W array)
Mixing functions (F, G, H functions in MD5)

Hash Investigation Strategy

Search for magic constants: Hash functions have unique initializers
Count rounds: 64 or 80 iterations → Specific hash function
Check block size: 512 bits (MD5, SHA-1, SHA-256) or 1024 bits (SHA-512)
Identify mixing operations: AND, OR, XOR, NOT, rotation patterns
Find padding logic: Append 0x80, then zeros, then length

Useful tools:

search-decompilation pattern="0x67452301|0xefcdab89|0x98badcfe"
get-decompilation to see round structure
read-memory at initialization constants

Simple Obfuscation Patterns

XOR Cipher

What it is: Trivial encryption used for obfuscation, not security.

Recognition pattern:

Single-byte key:
  for (i = 0; i < len; i++)
    data[i] ^= 0x42;  // Fixed constant

Multi-byte key:
  for (i = 0; i < len; i++)
    data[i] ^= key[i % keylen];  // Repeating key

Rolling key (LCG-based):
  key = seed;
  for (i = 0; i < len; i++) {
    data[i] ^= key;
    key = (key * A + C) % M;  // Linear congruential generator
  }

Telltale signs:

Very short functions (5-10 lines)
XOR with constants or simple patterns
Often applied to strings or config data
No complex state or multiple rounds

Breaking approach:

Single-byte: Brute-force (256 possibilities)
Multi-byte: Frequency analysis or known-plaintext
Rolling key: If LCG parameters known, reproduce sequence

Substitution Cipher

Recognition pattern:

Simple substitution:
  for (i = 0; i < len; i++)
    output[i] = substitution_table[input[i]];

Caesar cipher (special case):
  for (i = 0; i < len; i++)
    output[i] = (input[i] + shift) % 256;

Breaking approach:

Frequency analysis (if sufficient ciphertext)
Known-plaintext attack
Brute-force substitution table

Custom Cipher Pattern

What it is: Challenge-specific encryption scheme not based on standards.

Recognition indicators:

No match to known crypto patterns
Unusual operations or data flow
Mix of arithmetic, XOR, bit shifts in non-standard way
Often simpler than real crypto (for solvability)

Investigation strategy:

Document operations: What transformations are applied, in what order?
Identify invertibility: Can operations be reversed?
Look for weaknesses:
- Reduced keyspace (brute-forceable)
- Linear operations (algebraically solvable)
- Repeated patterns (exploitable structure)
Known-plaintext: If you have plaintext-ciphertext pairs, work backwards
Replicate in Python: Reproduce encryption logic, then reverse it

Common CTF custom cipher weaknesses:

Insufficient mixing (partially recoverable plaintext)
Weak key derivation (predictable)
Reversible operations (decrypt by inverting)
Small state space (brute-forceable)

Recognition Workflow

Step 1: Initial Detection

1. Search for crypto strings
   search-strings-regex pattern="(encrypt|decrypt|aes|rsa|md5|sha|key)"

2. Check for crypto API imports
   get-symbols includeExternal=true → Look for OpenSSL, Windows Crypto API

3. Search for crypto patterns in code
   search-decompilation pattern="(xor|sbox|round)"

Step 2: Pattern Matching

4. Get decompilation of suspected function
   get-decompilation includeIncomingReferences=true

5. Compare to pattern categories:
   - Block cipher? (rounds, S-boxes, fixed blocks)
   - Stream cipher? (state, swap, XOR)
   - Hash? (magic constants, compression)
   - Public key? (big integers, modexp)
   - Simple obfuscation? (short, simple XOR)

Step 3: Detailed Analysis

6. Read constant arrays
   read-memory at suspected S-box/constant locations

7. Compare to known values
   - AES S-box: 63 7c 77 7b...
   - MD5 init: 67452301 efcdab89...
   - RSA exponent: 0x10001

8. Count iterations
   - 10/12/14 rounds → AES
   - 16 rounds → DES
   - 64/80 rounds → Hash function

Step 4: Verification

9. Rename variables for clarity
   rename-variables: var_1 → sbox, var_2 → key, var_3 → state

10. Document findings
    set-bookmark type="Analysis" category="Crypto"
    set-decompilation-comment line=N "AES encryption round"

11. Cross-check with usage
    find-cross-references → See where crypto is called, what data it processes

CTF-Specific Patterns

Key Management Anti-Patterns

Hardcoded keys (most common):

uint8_t key[] = {0x41, 0x42, 0x43, ...};  // Key in .data section
encrypt(data, key);

Finding: read-memory at key array address

Weak derivation:

// Time-based (predictable)
srand(time(NULL));
for (i = 0; i < keylen; i++)
    key[i] = rand() % 256;

// Constant seed (always same key)
srand(12345);
...

Finding: Analyze RNG initialization, predict or replicate

User input as key (brute-force candidate):

scanf("%s", key);  // Short password
if (strlen(key) < 8) ...

Finding: Small keyspace, brute-forceable

Implementation Bugs to Exploit

ECB mode (block patterns visible):

for (i = 0; i < len; i += BLOCK_SIZE)
    encrypt_block(data + i, key);  // No chaining

Weakness: Identical plaintext blocks → identical ciphertext blocks

IV reuse or zero IV:

uint8_t iv[16] = {0};  // Should be random!

Weakness: Breaks CBC security, enables attacks

Reduced rounds (weak variant):

#define ROUNDS 4  // Should be 10+ for AES

Weakness: May be breakable with cryptanalysis tools

Debug backdoor:

if (strcmp(password, "DEBUG") == 0)
    return decrypt_without_key(data);

Finding: Search for debug strings, test/admin backdoors

Using This Reference

Quick Lookup Process

Identify general category: Block/stream/hash/public-key/simple
Match to specific pattern: Compare code structure to examples
Verify with evidence: Check constants, round counts, operations
Document in Ghidra: Rename, retype, comment for clarity
Investigate weaknesses: Look for CTF-specific anti-patterns

Example Investigation Flow

Observation: Function with loop, array access, XOR

1. Compare to patterns:
   - Block cipher? (Check for S-boxes, rounds)
   - Stream cipher? (Check for swap, state evolution)
   - Simple XOR? (Check function length)

2. Verify:
   - Read memory at constant array (if exists)
   - Count loop iterations
   - Check for characteristic operations

3. Identify:
   - Found 256-byte array with specific pattern
   - Swap operations in initialization
   - Simple XOR in second phase

4. Conclude: RC4 stream cipher

5. Improve:
   rename-variables: state, keystream, plaintext
   set-comment: "RC4 encryption with hardcoded key"

6. Exploit:
   Extract key from initialization
   Replicate RC4 in Python to decrypt

Progressive Refinement

First pass: "This looks like crypto (XOR, loops, constants)" Second pass: "Probably a block cipher (rounds, S-box pattern)" Third pass: "Matches AES pattern (S-box signature, 10/12/14 rounds)" Fourth pass: "AES-128 with hardcoded key at 0x405000" Fifth pass: "Extracted key, successfully decrypted flag"

Each pass narrows down understanding and guides next investigation steps.

Remember

Patterns are guidelines, not rigid rules - CTF challenges may have variations
Constants are your friends - Magic numbers uniquely identify algorithms
Structure reveals intent - Loop patterns indicate algorithm type
CTF crypto is about implementation - Look for weaknesses, not mathematical breaks
Document as you learn - Rename variables to reflect your understanding
Verify with evidence - Don't guess - compare constants, count rounds, check operations

Use this reference alongside the conceptual framework in SKILL.md to systematically identify and analyze cryptographic implementations.

18 KiB Raw Blame History

Cryptographic Pattern Recognition Reference

General Crypto Recognition

Crypto Presence Indicators

Block Cipher Patterns

Conceptual Characteristics

Substitution-Permutation Network (SPN)

Feistel Network

Block Cipher Investigation Strategy

Stream Cipher Patterns

Conceptual Characteristics

RC4 Pattern (Most Common in CTFs)

ChaCha/Salsa Pattern

Stream Cipher Investigation Strategy

Public Key Cryptography Patterns

Conceptual Characteristics

RSA Pattern

Elliptic Curve Pattern

Public Key Investigation Strategy

Hash Function Patterns

Conceptual Characteristics

MD5/SHA Recognition

Hash Investigation Strategy

Simple Obfuscation Patterns

XOR Cipher

Substitution Cipher

Custom Cipher Pattern

Recognition Workflow

Step 1: Initial Detection

Step 2: Pattern Matching

Step 3: Detailed Analysis

Step 4: Verification

CTF-Specific Patterns

Key Management Anti-Patterns

Implementation Bugs to Exploit

Using This Reference

Quick Lookup Process

Example Investigation Flow

Progressive Refinement

Remember

18 KiB

Raw Blame History