Files
gh-cyberkaida-reverse-engin…/skills/ctf-rev/patterns.md
2025-11-29 18:17:15 +08:00

21 KiB

CTF Reverse Engineering Pattern Recognition

This document provides pattern recognition guides for common CTF reverse engineering challenges. Focus on identifying patterns quickly to guide your solution strategy.

Cryptographic Patterns

Simple XOR Patterns

Recognition Signature:

Single-byte XOR:
  for (i = 0; i < len; i++)
    output[i] = input[i] ^ 0xKEY;

Multi-byte XOR (repeating key):
  for (i = 0; i < len; i++)
    output[i] = input[i] ^ key[i % keylen];

Rolling XOR:
  xor_val = seed;
  for (i = 0; i < len; i++) {
    output[i] = input[i] ^ xor_val;
    xor_val = next_value(xor_val);  // Linear congruential or similar
  }

What to look for:

  • Very short functions (5-15 lines decompiled)
  • XOR operation in loop
  • Constant value or small array
  • Modulo operation for key index (i % keylen)

ReVa detection:

search-decompilation pattern="\\^" caseSensitive=false
→ Find XOR operations

get-decompilation of suspicious function
→ Look for loop with XOR

read-memory at key location
→ Extract XOR key

Solution approach:

  • XOR is self-inverse: decrypt(x) = encrypt(x)
  • If you have ciphertext + key: plaintext = ciphertext XOR key
  • If you have plaintext + ciphertext: key = plaintext XOR ciphertext
  • If you have partial known plaintext: derive key, decrypt rest

Base64 and Variants

Recognition Signature:

Character lookup table (64-character alphabet):
  Standard: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
  Custom: May use different alphabet

Bit manipulation:
  3 bytes → 4 encoded characters
  Shifting and masking: (data >> 18) & 0x3F

Padding:
  '=' characters or custom padding

What to look for:

  • 64-character string constant (lookup table)
  • Bit shifting: >> 6, >> 12, >> 18
  • Masking: & 0x3F (6 bits)
  • 3-to-4 or 4-to-3 byte conversion ratio
  • Padding logic

ReVa detection:

search-strings-regex pattern="[A-Za-z0-9+/]{64}"
→ Find base64 alphabet

search-decompilation pattern="& 0x3f"
→ Find 6-bit masking (base64 characteristic)

get-decompilation of encoding function
→ Confirm 3→4 byte transformation

Solution approach:

  • If standard base64: use standard decoder
  • If custom alphabet: map custom → standard, then decode
  • Reverse engineering: identify alphabet, implement decoder

Block Cipher Patterns (AES, DES, etc.)

Recognition Signature:

AES characteristics:
  - 128-bit (16-byte) blocks
  - 10, 12, or 14 rounds (for 128, 192, 256-bit keys)
  - S-box: 256-byte constant array starting 63 7c 77 7b f2 6b 6f c5...
  - Mix columns, shift rows operations
  - Key schedule expansion

DES characteristics:
  - 64-bit (8-byte) blocks
  - 16 rounds
  - Permutation tables (IP, FP, E, P, S-boxes)
  - Feistel structure (split, swap, repeat)

What to look for:

Nested loops:
  for (round = 0; round < NUM_ROUNDS; round++)
    for (i = 0; i < BLOCK_SIZE; i++)
      state[i] = transform(state[i], key[round]);

Large constant arrays:
  uint8_t sbox[256] = {0x63, 0x7c, 0x77, ...};

Block processing:
  Fixed-size chunks (16 bytes for AES, 8 for DES)

Key schedule:
  Function deriving round keys from master key

ReVa detection:

search-decompilation pattern="(for.*round|for.*0x10)"
→ Find round loops

read-memory at constant arrays
→ Compare first bytes to known S-boxes:
   AES: 63 7c 77 7b f2 6b 6f c5
   DES S1: 0e 04 0d 01 02 0f 0b 08

get-decompilation with focus on nested loops
→ Count iterations (round count indicates key size)

Solution approach:

  • Identify algorithm by S-box or constants
  • Extract key from memory or key schedule
  • Use standard implementation to decrypt
  • For custom implementations, replicate in Python/C

Stream Cipher Patterns (RC4, etc.)

Recognition Signature:

RC4 characteristics:
  KSA (Key Scheduling Algorithm):
    for i = 0 to 255: S[i] = i
    for i = 0 to 255: swap S[i] with S[(S[i] + key[i % keylen]) % 256]

  PRGA (Pseudo-Random Generation Algorithm):
    i = 0, j = 0
    while generating:
      i = (i + 1) % 256
      j = (j + S[i]) % 256
      swap(S[i], S[j])
      output = S[(S[i] + S[j]) % 256]

What to look for:

State array initialization:
  for (i = 0; i < 256; i++) state[i] = i;

Swap operations:
  temp = arr[i];
  arr[i] = arr[j];
  arr[j] = temp;

Modulo arithmetic:
  (i + 1) % 256
  index & 0xFF  (equivalent to % 256)

Simple XOR with keystream:
  output[i] = input[i] ^ keystream[i];

ReVa detection:

search-decompilation pattern="(swap|temp.*=.*\\[)"
→ Find array swap operations

get-decompilation of initialization
→ Look for 0-255 loop filling array

find-cross-references to state array
→ Trace usage through KSA and PRGA

Solution approach:

  • Extract key from initialization
  • Replicate KSA to generate initial state
  • Replicate PRGA to generate keystream
  • XOR ciphertext with keystream to decrypt

Hash Function Patterns

Recognition Signature:

MD5/SHA characteristics:
  - Fixed initialization vectors (magic constants)
  - Block processing (512 bits / 64 bytes)
  - Multiple rounds (64 for MD5/SHA-256, 80 for SHA-1)
  - Bitwise operations: rotations, XOR, AND, OR, NOT
  - Padding: append 0x80, then zeros, then length

Magic constants:
  MD5: 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
  SHA-1: adds 0xc3d2e1f0
  SHA-256: Eight 32-bit constants derived from square roots

What to look for:

Characteristic constants:
  Search for 0x67452301 (MD5/SHA-1 IV)

Fixed round counts:
  for (round = 0; round < 64; round++)  // MD5, SHA-256
  for (round = 0; round < 80; round++)  // SHA-1

Bitwise rotation macros:
  ROTL(x, n) = (x << n) | (x >> (32-n))

Message schedule (W array):
  Expands 16 input words to 64/80 words

Padding logic:
  Append 0x80, zeros, then 64-bit length

ReVa detection:

search-decompilation pattern="0x67452301"
→ Find MD5/SHA initialization

read-memory at round constants
→ Identify specific hash variant

get-decompilation of hash function
→ Count rounds, identify structure

Solution approach:

  • Hash functions are one-way (cannot decrypt)
  • If you find hash of flag: need to brute force or use known input
  • If you find comparison: extract expected hash, try common flags
  • Check for weak hash (MD5, SHA-1) or short input (brute-forceable)

Encoding Patterns

Character Substitution

Recognition Signature:

Lookup table mapping:
  output[i] = table[input[i]];

Caesar cipher (shift):
  output[i] = (input[i] - 'A' + shift) % 26 + 'A';

Custom alphabet:
  const char* alphabet = "ZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjihgfedcba";
  output[i] = alphabet[input[i] - 'A'];

What to look for:

  • Character array constants (alphabets, substitution tables)
  • Character-by-character processing loops
  • Range checks: if (c >= 'A' && c <= 'Z')
  • Arithmetic on character codes: c - 'A', c + shift

ReVa detection:

search-strings-regex pattern="[A-Z]{26}"
→ Find alphabet strings

search-decompilation pattern="(- 'A'|% 26)"
→ Find character arithmetic

get-decompilation of encoding function
→ Identify substitution pattern

Solution approach:

  • Extract substitution table or shift value
  • Build reverse mapping
  • Apply to encoded data

Binary-to-Text Encodings

Recognition Signature:

Hex encoding:
  "0123456789abcdef"
  nibble_high = (byte >> 4) & 0xF;
  nibble_low = byte & 0xF;

Binary/ASCII:
  Converting to "01011010" strings

Custom encodings:
  Mapping bytes to multi-character sequences

What to look for:

  • Hex digit strings
  • Bit extraction: >> 4, & 0xF, & 1
  • Character code generation loops
  • 1-to-2 or 1-to-8 byte expansion

ReVa detection:

search-decompilation pattern="(>> 4|& 0xf)"
→ Find nibble extraction (hex encoding)

get-strings to find encoding alphabets
→ Check for hex, binary digit strings

Solution approach:

  • Identify encoding scheme
  • Implement decoder
  • Apply to encoded flag

Input Validation Patterns

Character-by-Character Comparison

Recognition Signature:

Direct comparison:
  for (i = 0; i < len; i++)
    if (input[i] != expected[i])
      return 0;
  return 1;

Comparison with transformation:
  for (i = 0; i < len; i++)
    if (transform(input[i]) != expected[i])
      return 0;

What to look for:

  • Loop over input length
  • Comparison inside loop: !=, ==
  • Early return on mismatch
  • Success after full loop completion

ReVa detection:

search-decompilation pattern="(if.*!=|if.*==)"
→ Find comparison operations

get-decompilation of validation function
→ Identify loop structure

read-memory at expected value array
→ Extract expected bytes

Solution approach:

  • If direct comparison: read expected array, that's the flag
  • If transformed comparison: reverse transformation
  • If complex transformation: trace each character

Checksum Validation

Recognition Signature:

Sum check:
  sum = 0;
  for (i = 0; i < len; i++)
    sum += input[i];
  return (sum == EXPECTED_SUM);

XOR check:
  xor = 0;
  for (i = 0; i < len; i++)
    xor ^= input[i];
  return (xor == EXPECTED_XOR);

Custom accumulation:
  result = SEED;
  for (i = 0; i < len; i++)
    result = (result * MULT + input[i]) % MOD;
  return (result == EXPECTED);

What to look for:

  • Accumulator variable (sum, product, xor)
  • Loop updating accumulator
  • Final comparison to constant
  • May be combined with other checks

ReVa detection:

search-decompilation pattern="(\\+=|\\*=|\\^=)"
→ Find accumulator updates

get-decompilation of validation
→ Identify accumulation pattern

read-memory at expected value
→ Extract target checksum

Solution approach:

  • Single checksum: underconstrained (many solutions)
  • Multiple checksums: may uniquely identify input
  • Extract all constraints, solve as system of equations

Constraint-Based Validation

Recognition Signature:

Multiple independent checks:
  if (input[0] + input[1] != 0x64) return 0;
  if (input[0] - input[1] != 0x14) return 0;
  if (input[2] ^ 0x42 != 0x33) return 0;
  if (input[3] * 2 == input[4]) return 0;
  return 1;

Relational constraints:
  if (input[i] != input[j] + 5) return 0;

What to look for:

  • Multiple if-statements with comparisons
  • Arithmetic operations on input elements
  • Relationships between different input positions
  • Constants in comparisons

ReVa detection:

get-decompilation of validation function
→ Identify all comparison statements

set-decompilation-comment on each constraint
→ Document relationships

Extract to external solver:
→ List all constraints, solve with z3 or similar

Solution approach:

  • Extract all constraints
  • Frame as system of equations
  • Solve using constraint solver (z3, SMT)
  • Verify solution satisfies all constraints

Algorithm Patterns

Mathematical Sequences

Recognition Signature:

Fibonacci:
  a = 0, b = 1;
  while (...) {
    next = a + b;
    a = b;
    b = next;
  }

Factorial:
  result = 1;
  for (i = 1; i <= n; i++)
    result *= i;

Prime checking:
  for (i = 2; i < sqrt(n); i++)
    if (n % i == 0) return 0;
  return 1;

What to look for:

  • Iterative or recursive patterns
  • Arithmetic progressions
  • Number theory operations (modulo, divisibility)
  • Known sequence generation

ReVa detection:

search-decompilation pattern="(fibonacci|factorial|prime)"
→ Find named functions (if not stripped)

get-decompilation of suspicious function
→ Identify mathematical pattern

Recognize by structure:
→ Two-variable update (Fibonacci)
→ Multiplication accumulator (factorial)
→ Modulo divisibility (prime check)

Solution approach:

  • Recognize the algorithm
  • Understand how it validates input
  • Derive required input or replicate logic

Matrix Operations

Recognition Signature:

Matrix multiplication:
  for (i = 0; i < rows; i++)
    for (j = 0; j < cols; j++)
      for (k = 0; k < inner; k++)
        result[i][j] += a[i][k] * b[k][j];

Linear transformations:
  output[i] = matrix[i][0] * input[0] + matrix[i][1] * input[1] + ...;

What to look for:

  • Triple-nested loops (matrix multiply)
  • 2D array indexing: array[i][j] or array[i * width + j]
  • Accumulator in inner loop
  • Linear combinations of input

ReVa detection:

search-decompilation pattern="\\[.*\\]\\[.*\\]"
→ Find 2D array access

get-decompilation showing nested loops
→ Count loop depth (3 = likely matrix multiply)

read-memory at matrix constants
→ Extract transformation matrix

Solution approach:

  • Extract matrix
  • Invert matrix (if square and invertible)
  • Apply inverse to expected output to get required input

State Machine Patterns

Recognition Signature:

Explicit state variable:
  int state = STATE_INIT;
  while (running) {
    switch (state) {
      case STATE_INIT: /* ... */ state = STATE_READY; break;
      case STATE_READY: /* ... */ state = STATE_PROCESS; break;
      case STATE_PROCESS: /* ... */ state = STATE_DONE; break;
    }
  }

Implicit state (position in input):
  for (i = 0; i < len; i++) {
    if (/* condition based on i and input */)
      /* different processing for different positions */
  }

What to look for:

  • State variable with multiple values
  • Large switch statement on state
  • State transitions (state = NEW_STATE)
  • Different behavior based on current state

ReVa detection:

search-decompilation pattern="(case|switch)"
→ Find switch statements

get-decompilation of state machine
→ Map state transitions

rename-variables to clarify states
→ current_state, next_state, etc.

Solution approach:

  • Map state transition graph
  • Identify accepting states (success)
  • Determine input sequence that reaches accepting state

Obfuscation Patterns

Control Flow Obfuscation

Recognition Signature:

Opaque predicates (always true/false):
  if (x * x >= 0)  // Always true
    real_code();
  else
    never_executed();

Dispatcher loops:
  while (1) {
    switch (dispatch_value) {
      case 0: /* block A */; dispatch_value = 5; break;
      case 5: /* block B */; dispatch_value = 2; break;
      case 2: /* block C */; dispatch_value = -1; break;
      case -1: return;
    }
  }

What to look for:

  • Unnecessary conditionals
  • Complex control flow with simple logic
  • Dispatcher-based execution (case jumps)
  • Dead code branches

ReVa detection:

get-decompilation of obfuscated function
→ Look for unusual control flow

set-bookmark type="Warning" for suspicious patterns
→ Mark opaque predicates, dispatchers

Focus on data flow, ignore control flow complexity
→ Track input transformation regardless of jumps

Solution approach:

  • Ignore obfuscation, trace data flow
  • Use dynamic analysis to observe actual execution path
  • Simplify manually or with deobfuscation tools

String Obfuscation

Recognition Signature:

Stack strings (character-by-character):
  str[0] = 'f'; str[1] = 'l'; str[2] = 'a'; str[3] = 'g';

Encrypted strings (decrypted at runtime):
  decrypt_string(encrypted_data, key, output);

Computed strings:
  for (i = 0; i < len; i++)
    str[i] = base[i] ^ key;

What to look for:

  • Character assignments to array
  • String decryption functions
  • XOR or arithmetic on character arrays
  • Strings not visible in static string list

ReVa detection:

get-strings may not show obfuscated strings
→ Use decompilation to find construction

search-decompilation pattern="\\[0\\] = "
→ Find character-by-character assignments

find-cross-references to decryption functions
→ Locate where strings are revealed

Solution approach:

  • Identify deobfuscation routine
  • Extract encrypted data and key
  • Decrypt manually or use dynamic analysis to observe decrypted string

Anti-Debugging (CTF Context)

Recognition Signature:

Debugger detection:
  if (ptrace(PTRACE_TRACEME, 0, 1, 0) < 0) exit(1);  // Linux
  if (IsDebuggerPresent()) exit(1);  // Windows

Timing checks:
  start = time();
  /* short operation */
  end = time();
  if (end - start > THRESHOLD) exit(1);  // Detected breakpoint delay

Self-modification:
  Decrypt code section at runtime
  Execute decrypted code
  Re-encrypt afterwards

What to look for:

  • Debugger detection APIs
  • Timing measurements
  • Memory protection changes
  • Code modification at runtime

ReVa detection:

get-symbols includeExternal=true
→ Look for: ptrace, IsDebuggerPresent, time, gettimeofday

search-decompilation pattern="(ptrace|IsDebugger|time)"
→ Find anti-debug checks

find-cross-references to VirtualProtect, mprotect
→ Identify self-modifying code

Solution approach:

  • Patch out anti-debug checks (NOP the exit)
  • Use anti-anti-debugging tools
  • Analyze in sandbox that hides debugger
  • For CTF, often acceptable to patch binary

Common CTF Tricks

Flag Format Validation

Pattern:

Check prefix:
  if (strncmp(input, "flag{", 5) != 0) return 0;

Check suffix:
  if (input[len-1] != '}') return 0;

Check length:
  if (strlen(input) != EXPECTED_LEN) return 0;

What to look for:

  • String comparison with literal "flag{" or "CTF{"
  • Bracket/brace checks
  • Length validation

ReVa detection:

search-strings-regex pattern="(flag\\{|CTF\\{)"
→ Find flag format strings

get-decompilation of validation
→ Extract format requirements

Solution approach:

  • Note format requirements
  • Focus on solving for content between delimiters
  • Reconstruct full flag with proper format

Multi-Stage Validation

Pattern:

Stage 1: Check format (flag{...})
Stage 2: Check length (must be 32 characters)
Stage 3: Check checksum (sum must equal X)
Stage 4: Check encryption (encrypted content matches Y)

What to look for:

  • Multiple validation functions called in sequence
  • Early exits on failure
  • Progressive constraints

ReVa detection:

find-cross-references to validation function
→ See if called from multi-stage validator

get-decompilation of main validator
→ Identify call sequence

Analyze each stage separately
→ Understand cumulative constraints

Solution approach:

  • Solve each stage's constraints
  • Combine solutions (flag must satisfy ALL stages)
  • Work backwards from most constrained to least

Hidden Success Path

Pattern:

Obvious failure message:
  printf("Wrong!\n");

Hidden success logic:
  if (/* complex condition */)
    system("cat /flag.txt");  // No message, just action

What to look for:

  • Success action without visible message
  • File access (cat flag, open flag.txt)
  • Network communication of flag
  • Success indicated by lack of "Wrong" message

ReVa detection:

search-strings-regex pattern="(flag|/flag|flag\\.txt)"
→ Find flag file references

find-cross-references to flag file
→ Locate success path

get-decompilation of success condition
→ Understand requirements

Solution approach:

  • Don't rely on "Correct!" message
  • Look for flag output actions
  • Check for file reads, network sends
  • Success may be silent

Using These Patterns

Pattern Matching Workflow

  1. Observe code structure

    • Loops, conditionals, function calls
    • Data types, array sizes
    • Constants and literals
  2. Compare to pattern catalog

    • Does this match a crypto pattern?
    • Is this an encoding scheme?
    • Looks like input validation?
  3. Verify with specific checks

    Hypothesis: This is AES
    Check 1: read-memory at constant array → Matches AES S-box? ✓
    Check 2: Count loop iterations → 10, 12, or 14? ✓
    Check 3: Block size 16 bytes? ✓
    Conclusion: AES confirmed
    
  4. Apply pattern-specific solution

    • AES → Extract key, decrypt
    • XOR → Extract key, XOR again
    • Constraint validation → Extract constraints, solve

Quick Reference Decision Tree

Does it have loops with XOR?
  → Check Simple XOR Patterns

Does it have large constant arrays?
  → Check Block Cipher or Hash Patterns

Does it have swap operations and modulo?
  → Check Stream Cipher Patterns

Does it have character-by-character comparison?
  → Check Input Validation Patterns

Does it have 64-character lookup table?
  → Check Base64 Pattern

Does it have mathematical operations (factorial, fibonacci)?
  → Check Algorithm Patterns

Is control flow overly complex?
  → Check Obfuscation Patterns

Combining Patterns

Real challenges often combine multiple patterns:

Example: Crypto + Validation

Input → Format Check (flag{...}) → XOR Decode → AES Decrypt → Compare to Expected

Solve:

  1. Extract format requirements
  2. Identify XOR key
  3. Identify AES key
  4. Extract expected value
  5. Work backwards: AES_decrypt(XOR_decode(expected)) with known keys

Example: Encoding + Constraint

Input → Base64 Decode → Constraint Check (sum == X, product == Y)

Solve:

  1. Extract constraints on decoded values
  2. Solve constraints
  3. Base64 encode solution

Remember

Patterns are recognition shortcuts, not rigid rules:

  • Use patterns to quickly identify challenge type
  • Adapt pattern solutions to specific implementation
  • If pattern doesn't fit, analyze from first principles
  • Document your pattern matches with bookmarks/comments
  • Build your own pattern library from experience

When you recognize a pattern, you skip hours of analysis and jump directly to solution strategy.