# CTF Reverse Engineering Pattern Recognition

This document provides pattern recognition guides for common CTF reverse engineering challenges. Focus on **identifying patterns quickly** to guide your solution strategy.

## Cryptographic Patterns

### Simple XOR Patterns

**Recognition Signature:**
```
Single-byte XOR:
  for (i = 0; i < len; i++)
    output[i] = input[i] ^ 0xKEY;

Multi-byte XOR (repeating key):
  for (i = 0; i < len; i++)
    output[i] = input[i] ^ key[i % keylen];

Rolling XOR:
  xor_val = seed;
  for (i = 0; i < len; i++) {
    output[i] = input[i] ^ xor_val;
    xor_val = next_value(xor_val);  // Linear congruential or similar
  }
```

**What to look for:**
- Very short functions (5-15 lines decompiled)
- XOR operation in loop
- Constant value or small array
- Modulo operation for key index (`i % keylen`)

**ReVa detection:**
```
search-decompilation pattern="\\^" caseSensitive=false
→ Find XOR operations

get-decompilation of suspicious function
→ Look for loop with XOR

read-memory at key location
→ Extract XOR key
```

**Solution approach:**
- XOR is self-inverse: `decrypt(x) = encrypt(x)`
- If you have ciphertext + key: plaintext = ciphertext XOR key
- If you have plaintext + ciphertext: key = plaintext XOR ciphertext
- If you have partial known plaintext: derive key, decrypt rest
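
The known-plaintext case can be sketched in a few lines of Python. This is a minimal illustration, assuming a repeating-key XOR whose key length is already known and a flag that starts with `flag{`; `secret_key` and the sample flag are made up for the demonstration.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key(ciphertext: bytes, known_prefix: bytes, keylen: int) -> bytes:
    """Derive a repeating key from a known plaintext prefix.

    Needs len(known_prefix) >= keylen so that every key byte is covered.
    """
    if len(known_prefix) < keylen:
        raise ValueError("known plaintext is shorter than the key")
    return bytes(c ^ p for c, p in zip(ciphertext[:keylen], known_prefix[:keylen]))


# Hypothetical challenge data: a flag encrypted with a 4-byte key.
secret_key = b"\x13\x37\xc0\xde"
ciphertext = xor_bytes(b"flag{xor_is_self_inverse}", secret_key)

# The flag format gives 5 known bytes, enough to cover a 4-byte key.
key = recover_key(ciphertext, b"flag{", keylen=4)
plaintext = xor_bytes(ciphertext, key)
print(key.hex(), plaintext)
```

If the key length is unknown, try each plausible length and keep the candidates that decrypt to printable text.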

### Base64 and Variants

**Recognition Signature:**
```
Character lookup table (64-character alphabet):
  Standard: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
  Custom: May use different alphabet

Bit manipulation:
  3 bytes → 4 encoded characters
  Shifting and masking: (data >> 18) & 0x3F

Padding:
  '=' characters or custom padding
```

**What to look for:**
- 64-character string constant (lookup table)
- Bit shifting: `>> 6`, `>> 12`, `>> 18`
- Masking: `& 0x3F` (6 bits)
- 3-to-4 or 4-to-3 byte conversion ratio
- Padding logic

**ReVa detection:**
```
search-strings-regex pattern="[A-Za-z0-9+/]{64}"
→ Find base64 alphabet

search-decompilation pattern="& 0x3f"
→ Find 6-bit masking (base64 characteristic)

get-decompilation of encoding function
→ Confirm 3→4 byte transformation
```

**Solution approach:**
- If standard base64: use standard decoder
- If custom alphabet: map custom → standard, then decode
- Reverse engineering: identify alphabet, implement decoder
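
Mapping a custom alphabet back to the standard one can be sketched as follows. The rotated alphabet in `CUSTOM` is an assumption for illustration, and the sketch assumes the challenge still pads with `=`; substitute the 64-character table extracted from the binary.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_custom_base64(encoded: bytes, custom_alphabet: bytes) -> bytes:
    """Translate a custom alphabet to the standard one, then decode normally."""
    if len(custom_alphabet) != 64 or len(set(custom_alphabet)) != 64:
        raise ValueError("alphabet must contain 64 distinct characters")
    table = bytes.maketrans(custom_alphabet, STANDARD)
    # '=' is not part of the alphabet, so padding passes through unchanged.
    return base64.b64decode(encoded.translate(table))


# Hypothetical custom alphabet: the standard alphabet rotated by 13 positions.
CUSTOM = STANDARD[13:] + STANDARD[:13]

# Simulate what the challenge binary would emit.
encoded = base64.b64encode(b"flag{custom_b64}").translate(
    bytes.maketrans(STANDARD, CUSTOM)
)
decoded = decode_custom_base64(encoded, CUSTOM)
print(decoded)
```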

### Block Cipher Patterns (AES, DES, etc.)

**Recognition Signature:**
```
AES characteristics:
  - 128-bit (16-byte) blocks
  - 10, 12, or 14 rounds (for 128, 192, 256-bit keys)
  - S-box: 256-byte constant array starting 63 7c 77 7b f2 6b 6f c5...
  - Mix columns, shift rows operations
  - Key schedule expansion

DES characteristics:
  - 64-bit (8-byte) blocks
  - 16 rounds
  - Permutation tables (IP, FP, E, P, S-boxes)
  - Feistel structure (split, swap, repeat)
```

**What to look for:**
```
Nested loops:
  for (round = 0; round < NUM_ROUNDS; round++)
    for (i = 0; i < BLOCK_SIZE; i++)
      state[i] = transform(state[i], key[round]);

Large constant arrays:
  uint8_t sbox[256] = {0x63, 0x7c, 0x77, ...};

Block processing:
  Fixed-size chunks (16 bytes for AES, 8 for DES)

Key schedule:
  Function deriving round keys from master key
```

**ReVa detection:**
```
search-decompilation pattern="(for.*round|for.*0x10)"
→ Find round loops

read-memory at constant arrays
→ Compare first bytes to known S-boxes:
   AES: 63 7c 77 7b f2 6b 6f c5
   DES S1: 0e 04 0d 01 02 0f 0b 08

get-decompilation with focus on nested loops
→ Count iterations (round count indicates key size)
```

**Solution approach:**
- Identify algorithm by S-box or constants
- Extract key from memory or key schedule
- Use standard implementation to decrypt
- For custom implementations, replicate in Python/C
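
Identifying an algorithm by its constants can be automated on a dumped data section. The sketch below assumes the tables are stored one value per byte, which is common but not guaranteed (DES S-boxes in particular are often packed or reordered); `SIGNATURES` holds only the two byte prefixes listed above, and `blob` is a made-up stand-in for dumped data.

```python
# Byte prefixes of well-known tables, as listed in the detection notes above.
SIGNATURES = {
    "AES S-box": bytes.fromhex("637c777bf26b6fc5"),
    "DES S-box S1": bytes.fromhex("0e040d01020f0b08"),
}


def identify_constants(blob: bytes) -> list[tuple[str, int]]:
    """Return (name, offset) for each known signature found in blob."""
    hits = []
    for name, signature in SIGNATURES.items():
        offset = blob.find(signature)
        if offset != -1:
            hits.append((name, offset))
    return hits


# Hypothetical data section: padding, then the start of the AES S-box.
blob = b"\x00" * 32 + bytes.fromhex("637c777bf26b6fc5") + b"\xff" * 8
hits = identify_constants(blob)
print(hits)
```

A hit only suggests the algorithm. Confirm it with block size and round count before committing to a standard implementation.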

### Stream Cipher Patterns (RC4, etc.)

**Recognition Signature:**
```
RC4 characteristics:
  KSA (Key Scheduling Algorithm):
    for i = 0 to 255: S[i] = i
    j = 0
    for i = 0 to 255:
      j = (j + S[i] + key[i % keylen]) % 256
      swap S[i] with S[j]

  PRGA (Pseudo-Random Generation Algorithm):
    i = 0, j = 0
    while generating:
      i = (i + 1) % 256
      j = (j + S[i]) % 256
      swap(S[i], S[j])
      output = S[(S[i] + S[j]) % 256]
```

**What to look for:**
```
State array initialization:
  for (i = 0; i < 256; i++) state[i] = i;

Swap operations:
  temp = arr[i];
  arr[i] = arr[j];
  arr[j] = temp;

Modulo arithmetic:
  (i + 1) % 256
  index & 0xFF  (equivalent to % 256)

Simple XOR with keystream:
  output[i] = input[i] ^ keystream[i];
```

**ReVa detection:**
```
search-decompilation pattern="(swap|temp.*=.*\\[)"
→ Find array swap operations

get-decompilation of initialization
→ Look for 0-255 loop filling array

find-cross-references to state array
→ Trace usage through KSA and PRGA
```

**Solution approach:**
- Extract key from initialization
- Replicate KSA to generate initial state
- Replicate PRGA to generate keystream
- XOR ciphertext with keystream to decrypt
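
The KSA, PRGA, and XOR steps fit in one short function. This is a sketch of textbook RC4; CTF binaries often tweak it (different state size, extra XOR, dropped keystream bytes), so compare each step against the decompilation before trusting the output.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4; the same call encrypts and decrypts."""
    # KSA: permute the identity state under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # PRGA: generate one keystream byte per input byte and XOR it in.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


# Hypothetical key and flag, only to show the round trip.
ciphertext = rc4(b"ctf_key", b"flag{rc4_round_trip}")
plaintext = rc4(b"ctf_key", ciphertext)
print(plaintext)
```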

### Hash Function Patterns

**Recognition Signature:**
```
MD5/SHA characteristics:
  - Fixed initialization vectors (magic constants)
  - Block processing (512 bits / 64 bytes)
  - Multiple rounds (64 for MD5/SHA-256, 80 for SHA-1)
  - Bitwise operations: rotations, XOR, AND, OR, NOT
  - Padding: append 0x80, then zeros, then length

Magic constants:
  MD5: 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
  SHA-1: the four MD5 words plus a fifth, 0xc3d2e1f0
  SHA-256: Eight 32-bit words derived from the square roots of the first eight primes (first word 0x6a09e667)
```

**What to look for:**
```
Characteristic constants:
  Search for 0x67452301 (MD5/SHA-1 IV)

Fixed round counts:
  for (round = 0; round < 64; round++)  // MD5, SHA-256
  for (round = 0; round < 80; round++)  // SHA-1

Bitwise rotation macros:
  ROTL(x, n) = (x << n) | (x >> (32-n))

Message schedule (W array):
  Expands 16 input words to 64/80 words

Padding logic:
  Append 0x80, zeros, then 64-bit length
```

**ReVa detection:**
```
search-decompilation pattern="0x67452301"
→ Find MD5/SHA-1 initialization (SHA-256 uses different constants)

read-memory at round constants
→ Identify specific hash variant

get-decompilation of hash function
→ Count rounds, identify structure
```

**Solution approach:**
- Hash functions are one-way (cannot decrypt)
- If you find hash of flag: need to brute force or use known input
- If you find comparison: extract expected hash, try common flags
- Check for weak hash (MD5, SHA-1) or short input (brute-forceable)
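
When the unknown part of the input is short, brute force is straightforward. The sketch below assumes an MD5 comparison, a known `flag{...}` wrapper, and a lowercase-only body of at most three characters; all of these are illustrative choices, and `target` is computed locally rather than taken from a real binary.

```python
import hashlib
import string
from itertools import product


def brute_force_hash(target_hex, alphabet, max_len, prefix=b"", suffix=b"", algo="md5"):
    """Try every candidate up to max_len and return the one matching target_hex."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = prefix + bytes(combo) + suffix
            if hashlib.new(algo, candidate).hexdigest() == target_hex:
                return candidate
    return None  # search space exhausted without a match


# Hypothetical expected hash, standing in for one extracted from the binary.
target = hashlib.md5(b"flag{abc}").hexdigest()

found = brute_force_hash(
    target, string.ascii_lowercase.encode(), max_len=3, prefix=b"flag{", suffix=b"}"
)
print(found)
```

The search space grows as `len(alphabet) ** max_len`, so this only works for short bodies; longer inputs need a wordlist or a dedicated cracking tool.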

## Encoding Patterns

### Character Substitution

**Recognition Signature:**
```
Lookup table mapping:
  output[i] = table[input[i]];

Caesar cipher (shift):
  output[i] = (input[i] - 'A' + shift) % 26 + 'A';

Custom alphabet:
  const char* alphabet = "ZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjihgfedcba";
  output[i] = alphabet[input[i] - 'A'];  // uppercase input only; lowercase needs 26 + (input[i] - 'a')
```

**What to look for:**
- Character array constants (alphabets, substitution tables)
- Character-by-character processing loops
- Range checks: `if (c >= 'A' && c <= 'Z')`
- Arithmetic on character codes: `c - 'A'`, `c + shift`

**ReVa detection:**
```
search-strings-regex pattern="[A-Z]{26}"
→ Find alphabet strings

search-decompilation pattern="(- 'A'|% 26)"
→ Find character arithmetic

get-decompilation of encoding function
→ Identify substitution pattern
```

**Solution approach:**
- Extract substitution table or shift value
- Build reverse mapping
- Apply to encoded data
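
Building the reverse mapping for a lookup-table substitution can be sketched as below. The 256-entry `table` is a hypothetical permutation used only for the demonstration; in practice it would be the array read from the binary.

```python
def invert_table(table: bytes) -> dict[int, int]:
    """Build the reverse mapping for output[i] = table[input[i]]."""
    inverse = {}
    for plain, cipher in enumerate(table):
        if cipher in inverse:
            # Two inputs map to the same output, so decoding is ambiguous.
            raise ValueError(f"table is not a bijection at value {cipher:#x}")
        inverse[cipher] = plain
    return inverse


# Hypothetical substitution table (a permutation of 0..255, since 7 is odd).
table = bytes((i * 7 + 3) % 256 for i in range(256))

encoded = bytes(table[b] for b in b"flag{subst}")
inverse = invert_table(table)
decoded = bytes(inverse[b] for b in encoded)
print(decoded)
```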

### Binary-to-Text Encodings

**Recognition Signature:**
```
Hex encoding:
  "0123456789abcdef"
  nibble_high = (byte >> 4) & 0xF;
  nibble_low = byte & 0xF;

Binary/ASCII:
  Converting to "01011010" strings

Custom encodings:
  Mapping bytes to multi-character sequences
```

**What to look for:**
- Hex digit strings
- Bit extraction: `>> 4`, `& 0xF`, `& 1`
- Character code generation loops
- 1-to-2 or 1-to-8 byte expansion

**ReVa detection:**
```
search-decompilation pattern="(>> 4|& 0xf)"
→ Find nibble extraction (hex encoding)

get-strings to find encoding alphabets
→ Check for hex, binary digit strings
```

**Solution approach:**
- Identify encoding scheme
- Implement decoder
- Apply to encoded flag
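
For the two most common schemes, hex and binary digit strings, a decoder takes only a few lines. This sketch assumes 8 bits per byte with the most significant bit first; custom multi-character encodings need their own table.

```python
def decode_hex(text: str) -> bytes:
    """Decode a hex digit string (whitespace between bytes is allowed)."""
    return bytes.fromhex(text)


def decode_binary(text: str) -> bytes:
    """Decode a string of '0'/'1' characters, 8 bits per byte, MSB first."""
    bits = "".join(text.split())  # drop any separators between groups
    if len(bits) % 8:
        raise ValueError("bit string length is not a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


from_hex = decode_hex("666c6167")
from_bits = decode_binary("01100110 01101100 01100001 01100111")
print(from_hex, from_bits)
```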

## Input Validation Patterns

### Character-by-Character Comparison

**Recognition Signature:**
```
Direct comparison:
  for (i = 0; i < len; i++)
    if (input[i] != expected[i])
      return 0;
  return 1;

Comparison with transformation:
  for (i = 0; i < len; i++)
    if (transform(input[i]) != expected[i])
      return 0;
```

**What to look for:**
- Loop over input length
- Comparison inside loop: `!=`, `==`
- Early return on mismatch
- Success after full loop completion

**ReVa detection:**
```
search-decompilation pattern="(if.*!=|if.*==)"
→ Find comparison operations

get-decompilation of validation function
→ Identify loop structure

read-memory at expected value array
→ Extract expected bytes
```

**Solution approach:**
- If direct comparison: read expected array, that's the flag
- If transformed comparison: reverse transformation
- If complex transformation: trace each character
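
When each input byte is transformed independently, the transformation does not need to be inverted algebraically: tabulating all 256 inputs is enough. In this sketch `transform` is a hypothetical stand-in for the routine recovered from the binary.

```python
def transform(b: int) -> int:
    """Hypothetical per-byte transformation standing in for the binary's."""
    return ((b ^ 0x5A) + 7) & 0xFF


def recover_input(expected: bytes) -> bytes:
    """Invert a per-byte transform by tabulating all 256 possible inputs."""
    preimages = {}
    for b in range(256):
        preimages.setdefault(transform(b), []).append(b)

    result = bytearray()
    for pos, target in enumerate(expected):
        candidates = preimages.get(target, [])
        if len(candidates) != 1:
            # Zero or several preimages: the transform is not a bijection here.
            raise ValueError(f"position {pos}: {len(candidates)} candidates")
        result.append(candidates[0])
    return bytes(result)


# Simulate the expected array that would be read from the binary.
expected = bytes(transform(b) for b in b"flag{bytewise}")
recovered = recover_input(expected)
print(recovered)
```

If the transformation also depends on the position `i`, build the table per position instead of once.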

### Checksum Validation

**Recognition Signature:**
```
Sum check:
  sum = 0;
  for (i = 0; i < len; i++)
    sum += input[i];
  return (sum == EXPECTED_SUM);

XOR check:
  xor = 0;
  for (i = 0; i < len; i++)
    xor ^= input[i];
  return (xor == EXPECTED_XOR);

Custom accumulation:
  result = SEED;
  for (i = 0; i < len; i++)
    result = (result * MULT + input[i]) % MOD;
  return (result == EXPECTED);
```

**What to look for:**
- Accumulator variable (sum, product, xor)
- Loop updating accumulator
- Final comparison to constant
- May be combined with other checks

**ReVa detection:**
```
search-decompilation pattern="(\\+=|\\*=|\\^=)"
→ Find accumulator updates

get-decompilation of validation
→ Identify accumulation pattern

read-memory at expected value
→ Extract target checksum
```

**Solution approach:**
- Single checksum: underconstrained (many solutions)
- Multiple checksums: may uniquely identify input
- Extract all constraints, solve as system of equations

### Constraint-Based Validation

**Recognition Signature:**
```
Multiple independent checks:
  if (input[0] + input[1] != 0x64) return 0;
  if (input[0] - input[1] != 0x14) return 0;
  if ((input[2] ^ 0x42) != 0x33) return 0;
  if (input[3] * 2 != input[4]) return 0;
  return 1;

Relational constraints:
  if (input[i] != input[j] + 5) return 0;
```

**What to look for:**
- Multiple if-statements with comparisons
- Arithmetic operations on input elements
- Relationships between different input positions
- Constants in comparisons

**ReVa detection:**
```
get-decompilation of validation function
→ Identify all comparison statements

set-decompilation-comment on each constraint
→ Document relationships

Extract to external solver:
→ List all constraints, solve with z3 or similar
```

**Solution approach:**
- Extract all constraints
- Frame as system of equations
- Solve using constraint solver (z3, SMT)
- Verify solution satisfies all constraints
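
For a handful of byte-sized unknowns, plain enumeration is enough to extract, solve, and verify. The sketch below uses pure Python and three constraints of the same shape as the sum, difference, and XOR checks in the signature; it assumes printable ASCII input. For larger systems, the same constraints would be handed to an SMT solver such as z3 instead.

```python
from itertools import product

PRINTABLE = range(0x20, 0x7F)  # assumption: input is printable ASCII

# Constraints transcribed from a hypothetical validation function.
constraints = [
    lambda s: s[0] + s[1] == 0x64,
    lambda s: s[0] - s[1] == 0x14,
    lambda s: (s[2] ^ 0x42) == 0x33,
]


def solve(length, constraints, domain=PRINTABLE):
    """Enumerate every candidate and keep those passing all constraints."""
    return [
        candidate
        for candidate in product(domain, repeat=length)
        if all(check(candidate) for check in constraints)
    ]


solutions = solve(3, constraints)
print([bytes(s) for s in solutions])
```

Enumeration costs `len(domain) ** length` checks, so split independent groups of positions and solve them separately where possible.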

## Algorithm Patterns

### Mathematical Sequences

**Recognition Signature:**
```
Fibonacci:
  a = 0, b = 1;
  while (...) {
    next = a + b;
    a = b;
    b = next;
  }

Factorial:
  result = 1;
  for (i = 1; i <= n; i++)
    result *= i;

Prime checking:
  if (n < 2) return 0;
  for (i = 2; i * i <= n; i++)
    if (n % i == 0) return 0;
  return 1;
```

**What to look for:**
- Iterative or recursive patterns
- Arithmetic progressions
- Number theory operations (modulo, divisibility)
- Known sequence generation

**ReVa detection:**
```
search-decompilation pattern="(fibonacci|factorial|prime)"
→ Find named functions (if not stripped)

get-decompilation of suspicious function
→ Identify mathematical pattern

Recognize by structure:
→ Two-variable update (Fibonacci)
→ Multiplication accumulator (factorial)
→ Modulo divisibility (prime check)
```

**Solution approach:**
- Recognize the algorithm
- Understand how it validates input
- Derive required input or replicate logic

### Matrix Operations

**Recognition Signature:**
```
Matrix multiplication:
  for (i = 0; i < rows; i++)
    for (j = 0; j < cols; j++)
      for (k = 0; k < inner; k++)
        result[i][j] += a[i][k] * b[k][j];

Linear transformations:
  output[i] = matrix[i][0] * input[0] + matrix[i][1] * input[1] + ...;
```

**What to look for:**
- Triple-nested loops (matrix multiply)
- 2D array indexing: `array[i][j]` or `array[i * width + j]`
- Accumulator in inner loop
- Linear combinations of input

**ReVa detection:**
```
search-decompilation pattern="\\[.*\\]\\[.*\\]"
→ Find 2D array access

get-decompilation showing nested loops
→ Count loop depth (3 = likely matrix multiply)

read-memory at matrix constants
→ Extract transformation matrix
```

**Solution approach:**
- Extract matrix
- Invert matrix (if square and invertible)
- Apply inverse to expected output to get required input
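
Recovering the input from a linear transformation amounts to solving `matrix * input = output`. The sketch below uses Gauss-Jordan elimination over exact fractions and assumes the arithmetic in the binary does not wrap (no truncation to bytes, no modulus); the 3x3 `matrix` is hypothetical. If the code reduces results mod 256 or mod a prime, a modular inverse is needed instead.

```python
from fractions import Fraction


def solve_linear(matrix, output):
    """Solve matrix * x = output exactly with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augmented matrix [matrix | output] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(out)]
           for row, out in zip(matrix, output)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical transformation matrix and the output it produces for b"abc".
matrix = [[2, 1, 3], [1, 4, 1], [5, 2, 2]]
secret = list(b"abc")
output = [sum(m * x for m, x in zip(row, secret)) for row in matrix]

recovered = bytes(int(v) for v in solve_linear(matrix, output))
print(recovered)
```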

### State Machine Patterns

**Recognition Signature:**
```
Explicit state variable:
  int state = STATE_INIT;
  while (running) {
    switch (state) {
      case STATE_INIT: /* ... */ state = STATE_READY; break;
      case STATE_READY: /* ... */ state = STATE_PROCESS; break;
      case STATE_PROCESS: /* ... */ state = STATE_DONE; break;
    }
  }

Implicit state (position in input):
  for (i = 0; i < len; i++) {
    if (/* condition based on i and input */)
      /* different processing for different positions */
  }
```

**What to look for:**
- State variable with multiple values
- Large switch statement on state
- State transitions (state = NEW_STATE)
- Different behavior based on current state

**ReVa detection:**
```
search-decompilation pattern="(case|switch)"
→ Find switch statements

get-decompilation of state machine
→ Map state transitions

rename-variables to clarify states
→ current_state, next_state, etc.
```

**Solution approach:**
- Map state transition graph
- Identify accepting states (success)
- Determine input sequence that reaches accepting state
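
Once the transition graph is mapped, a breadth-first search finds the shortest input that reaches an accepting state. The `transitions` table here is a hypothetical graph written out by hand from a decompiled switch statement; the state names follow the signature above.

```python
from collections import deque


def find_accepting_input(transitions, start, accepting):
    """Breadth-first search for the shortest input reaching an accepting state."""
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state in accepting:
            return path
        for symbol, next_state in sorted(transitions.get(state, {}).items()):
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, path + symbol))
    return None  # no accepting state is reachable


# Hypothetical transition table: state -> {input character: next state}.
transitions = {
    "INIT": {"k": "READY", "x": "INIT"},
    "READY": {"e": "PROCESS", "x": "INIT"},
    "PROCESS": {"y": "DONE", "x": "INIT"},
}

result = find_accepting_input(transitions, "INIT", {"DONE"})
print(result)
```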

## Obfuscation Patterns

### Control Flow Obfuscation

**Recognition Signature:**
```
Opaque predicates (always true/false):
  if (x * x >= 0)  // Always true mathematically (signed overflow aside)
    real_code();
  else
    never_executed();

Dispatcher loops:
  while (1) {
    switch (dispatch_value) {
      case 0: /* block A */; dispatch_value = 5; break;
      case 5: /* block B */; dispatch_value = 2; break;
      case 2: /* block C */; dispatch_value = -1; break;
      case -1: return;
    }
  }
```

**What to look for:**
- Unnecessary conditionals
- Complex control flow with simple logic
- Dispatcher-based execution (case jumps)
- Dead code branches

**ReVa detection:**
```
get-decompilation of obfuscated function
→ Look for unusual control flow

set-bookmark type="Warning" for suspicious patterns
→ Mark opaque predicates, dispatchers

Focus on data flow, ignore control flow complexity
→ Track input transformation regardless of jumps
```

**Solution approach:**
- Ignore obfuscation, trace data flow
- Use dynamic analysis to observe actual execution path
- Simplify manually or with deobfuscation tools

### String Obfuscation

**Recognition Signature:**
```
Stack strings (character-by-character):
  str[0] = 'f'; str[1] = 'l'; str[2] = 'a'; str[3] = 'g';

Encrypted strings (decrypted at runtime):
  decrypt_string(encrypted_data, key, output);

Computed strings:
  for (i = 0; i < len; i++)
    str[i] = base[i] ^ key;
```

**What to look for:**
- Character assignments to array
- String decryption functions
- XOR or arithmetic on character arrays
- Strings not visible in static string list

**ReVa detection:**
```
get-strings may not show obfuscated strings
→ Use decompilation to find construction

search-decompilation pattern="\\[0\\] = "
→ Find character-by-character assignments

find-cross-references to decryption functions
→ Locate where strings are revealed
```

**Solution approach:**
- Identify deobfuscation routine
- Extract encrypted data and key
- Decrypt manually or use dynamic analysis to observe decrypted string

### Anti-Debugging (CTF Context)

**Recognition Signature:**
```
Debugger detection:
  if (ptrace(PTRACE_TRACEME, 0, 1, 0) < 0) exit(1);  // Linux
  if (IsDebuggerPresent()) exit(1);  // Windows

Timing checks:
  start = time();
  /* short operation */
  end = time();
  if (end - start > THRESHOLD) exit(1);  // Detected breakpoint delay

Self-modification:
  Decrypt code section at runtime
  Execute decrypted code
  Re-encrypt afterwards
```

**What to look for:**
- Debugger detection APIs
- Timing measurements
- Memory protection changes
- Code modification at runtime

**ReVa detection:**
```
get-symbols includeExternal=true
→ Look for: ptrace, IsDebuggerPresent, time, gettimeofday

search-decompilation pattern="(ptrace|IsDebugger|time)"
→ Find anti-debug checks

find-cross-references to VirtualProtect, mprotect
→ Identify self-modifying code
```

**Solution approach:**
- Patch out anti-debug checks (NOP the exit)
- Use anti-anti-debugging tools
- Analyze in sandbox that hides debugger
- For CTF, often acceptable to patch binary

## Common CTF Tricks

### Flag Format Validation

**Pattern:**
```
Check prefix:
  if (strncmp(input, "flag{", 5) != 0) return 0;

Check suffix:
  if (input[len-1] != '}') return 0;

Check length:
  if (strlen(input) != EXPECTED_LEN) return 0;
```

**What to look for:**
- String comparison with literal "flag{" or "CTF{"
- Bracket/brace checks
- Length validation

**ReVa detection:**
```
search-strings-regex pattern="(flag\\{|CTF\\{)"
→ Find flag format strings

get-decompilation of validation
→ Extract format requirements
```

**Solution approach:**
- Note format requirements
- Focus on solving for content between delimiters
- Reconstruct full flag with proper format

### Multi-Stage Validation

**Pattern:**
```
Stage 1: Check format (flag{...})
Stage 2: Check length (must be 32 characters)
Stage 3: Check checksum (sum must equal X)
Stage 4: Check encryption (encrypted content matches Y)
```

**What to look for:**
- Multiple validation functions called in sequence
- Early exits on failure
- Progressive constraints

**ReVa detection:**
```
find-cross-references to validation function
→ See if called from multi-stage validator

get-decompilation of main validator
→ Identify call sequence

Analyze each stage separately
→ Understand cumulative constraints
```

**Solution approach:**
- Solve each stage's constraints
- Combine solutions (flag must satisfy ALL stages)
- Work backwards from most constrained to least

### Hidden Success Path

**Pattern:**
```
Obvious failure message:
  printf("Wrong!\n");

Hidden success logic:
  if (/* complex condition */)
    system("cat /flag.txt");  // No message, just action
```

**What to look for:**
- Success action without visible message
- File access (cat flag, open flag.txt)
- Network communication of flag
- Success indicated by lack of "Wrong" message

**ReVa detection:**
```
search-strings-regex pattern="(flag|/flag|flag\\.txt)"
→ Find flag file references

find-cross-references to flag file
→ Locate success path

get-decompilation of success condition
→ Understand requirements
```

**Solution approach:**
- Don't rely on "Correct!" message
- Look for flag output actions
- Check for file reads, network sends
- Success may be silent

## Using These Patterns

### Pattern Matching Workflow

1. **Observe code structure**
   - Loops, conditionals, function calls
   - Data types, array sizes
   - Constants and literals

2. **Compare to pattern catalog**
   - Does this match a crypto pattern?
   - Is this an encoding scheme?
   - Does it look like input validation?

3. **Verify with specific checks**
   ```
   Hypothesis: This is AES
   Check 1: read-memory at constant array → Matches AES S-box? ✓
   Check 2: Count loop iterations → 10, 12, or 14? ✓
   Check 3: Block size 16 bytes? ✓
   Conclusion: AES confirmed
   ```

4. **Apply pattern-specific solution**
   - AES → Extract key, decrypt
   - XOR → Extract key, XOR again
   - Constraint validation → Extract constraints, solve

### Quick Reference Decision Tree

```
Does it have loops with XOR?
  → Check Simple XOR Patterns

Does it have large constant arrays?
  → Check Block Cipher or Hash Patterns

Does it have swap operations and modulo?
  → Check Stream Cipher Patterns

Does it have character-by-character comparison?
  → Check Input Validation Patterns

Does it have 64-character lookup table?
  → Check Base64 Pattern

Does it have mathematical operations (factorial, fibonacci)?
  → Check Algorithm Patterns

Is control flow overly complex?
  → Check Obfuscation Patterns
```

### Combining Patterns

Real challenges often combine multiple patterns:

**Example: Crypto + Validation**
```
Input → Format Check (flag{...}) → XOR Decode → AES Decrypt → Compare to Expected
```

**Solve:**
1. Extract format requirements
2. Identify XOR key
3. Identify AES key
4. Extract expected value
5. Work backwards, undoing the last stage first: input = XOR_encode(AES_encrypt(expected)) with known keys (XOR is self-inverse)

**Example: Encoding + Constraint**
```
Input → Base64 Decode → Constraint Check (sum == X, product == Y)
```

**Solve:**
1. Extract constraints on decoded values
2. Solve constraints
3. Base64 encode solution

## Remember

Patterns are **recognition shortcuts**, not rigid rules:
- Use patterns to quickly identify challenge type
- Adapt pattern solutions to specific implementation
- If pattern doesn't fit, analyze from first principles
- Document your pattern matches with bookmarks/comments
- Build your own pattern library from experience

Recognizing a pattern early can save hours of analysis and point you straight at a solution strategy.