CTF Reverse Engineering Pattern Recognition
This document provides pattern recognition guides for common CTF reverse engineering challenges. Focus on identifying patterns quickly to guide your solution strategy.
Cryptographic Patterns
Simple XOR Patterns
Recognition Signature:
Single-byte XOR:
for (i = 0; i < len; i++)
output[i] = input[i] ^ 0xKEY;
Multi-byte XOR (repeating key):
for (i = 0; i < len; i++)
output[i] = input[i] ^ key[i % keylen];
Rolling XOR:
xor_val = seed;
for (i = 0; i < len; i++) {
output[i] = input[i] ^ xor_val;
xor_val = next_value(xor_val); // Linear congruential or similar
}
What to look for:
- Very short functions (5-15 lines decompiled)
- XOR operation in loop
- Constant value or small array
- Modulo operation for key index (i % keylen)
ReVa detection:
search-decompilation pattern="\\^" caseSensitive=false
→ Find XOR operations
get-decompilation of suspicious function
→ Look for loop with XOR
read-memory at key location
→ Extract XOR key
Solution approach:
- XOR is self-inverse: decrypt(x) = encrypt(x)
- If you have ciphertext + key: plaintext = ciphertext XOR key
- If you have plaintext + ciphertext: key = plaintext XOR ciphertext
- If you have partial known plaintext: derive key, decrypt rest
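The known-plaintext case above can be sketched as follows. This is a minimal illustration, assuming a repeating key whose length is already known; the sample flag, the 4-byte key, and the helper names `xor_bytes` and `recover_key` are hypothetical.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (a 1-byte key covers the single-byte case)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_key(ciphertext: bytes, known_plaintext: bytes, keylen: int) -> bytes:
    """Derive a repeating key from a known plaintext prefix.

    Needs at least keylen known bytes so that every key byte is covered.
    """
    if len(known_plaintext) < keylen:
        raise ValueError("known plaintext is shorter than the key")
    return bytes(c ^ p for c, p in zip(ciphertext[:keylen], known_plaintext))


# Hypothetical example data: a flag encrypted with a 4-byte repeating key.
secret = xor_bytes(b"flag{xor_is_self_inverse}", b"\x13\x37\xc0\xde")
key = recover_key(secret, b"flag", keylen=4)
print(xor_bytes(secret, key))
```

Because the flag prefix is usually known, the first few key bytes often fall out immediately; the same `xor_bytes` call then decrypts the rest.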
Base64 and Variants
Recognition Signature:
Character lookup table (64-character alphabet):
Standard: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
Custom: May use different alphabet
Bit manipulation:
3 bytes → 4 encoded characters
Shifting and masking: (data >> 18) & 0x3F
Padding:
'=' characters or custom padding
What to look for:
- 64-character string constant (lookup table)
- Bit shifting: >> 6, >> 12, >> 18
- Masking: & 0x3F (6 bits)
- 3-to-4 or 4-to-3 byte conversion ratio
- Padding logic
ReVa detection:
search-strings-regex pattern="[A-Za-z0-9+/]{64}"
→ Find base64 alphabet
search-decompilation pattern="& 0x3f"
→ Find 6-bit masking (base64 characteristic)
get-decompilation of encoding function
→ Confirm 3→4 byte transformation
Solution approach:
- If standard base64: use standard decoder
- If custom alphabet: map custom → standard, then decode
- Reverse engineering: identify alphabet, implement decoder
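For the custom-alphabet case, one possible sketch is to translate the custom alphabet back to the standard one and hand the result to a stock decoder. It assumes the variant keeps standard `=` padding and 3-to-4 grouping; the rotated alphabet below is a hypothetical example, not one taken from a real challenge.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_custom_base64(encoded: bytes, custom_alphabet: bytes) -> bytes:
    """Map a custom 64-character alphabet onto the standard one, then decode."""
    if len(custom_alphabet) != 64 or len(set(custom_alphabet)) != 64:
        raise ValueError("alphabet must contain 64 distinct characters")
    table = bytes.maketrans(custom_alphabet, STANDARD)
    # '=' is not in either alphabet, so padding passes through unchanged.
    return base64.b64decode(encoded.translate(table))


# Hypothetical custom alphabet: the standard one rotated by 13 positions.
CUSTOM = STANDARD[13:] + STANDARD[:13]
encoded = base64.b64encode(b"flag{custom_b64}").translate(
    bytes.maketrans(STANDARD, CUSTOM)
)
print(decode_custom_base64(encoded, CUSTOM))
```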
Block Cipher Patterns (AES, DES, etc.)
Recognition Signature:
AES characteristics:
- 128-bit (16-byte) blocks
- 10, 12, or 14 rounds (for 128, 192, 256-bit keys)
- S-box: 256-byte constant array starting 63 7c 77 7b f2 6b 6f c5...
- Mix columns, shift rows operations
- Key schedule expansion
DES characteristics:
- 64-bit (8-byte) blocks
- 16 rounds
- Permutation tables (IP, FP, E, P, S-boxes)
- Feistel structure (split, swap, repeat)
What to look for:
Nested loops:
for (round = 0; round < NUM_ROUNDS; round++)
for (i = 0; i < BLOCK_SIZE; i++)
state[i] = transform(state[i], key[round]);
Large constant arrays:
uint8_t sbox[256] = {0x63, 0x7c, 0x77, ...};
Block processing:
Fixed-size chunks (16 bytes for AES, 8 for DES)
Key schedule:
Function deriving round keys from master key
ReVa detection:
search-decompilation pattern="(for.*round|for.*0x10)"
→ Find round loops
read-memory at constant arrays
→ Compare first bytes to known S-boxes:
AES: 63 7c 77 7b f2 6b 6f c5
DES S1: 0e 04 0d 01 02 0f 0b 08
get-decompilation with focus on nested loops
→ Count iterations (round count indicates key size)
Solution approach:
- Identify algorithm by S-box or constants
- Extract key from memory or key schedule
- Use standard implementation to decrypt
- For custom implementations, replicate in Python/C
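Identifying the algorithm by its constants can be automated once the bytes are dumped. The sketch below scans a byte blob for the two S-box prefixes listed above; it assumes tables are stored one entry per byte (DES tables are sometimes packed or widened, which this would miss), and the `dump` contents are made up for illustration.

```python
# Leading bytes of well-known tables, as listed in the detection notes above.
SIGNATURES = {
    "AES S-box": bytes.fromhex("637c777bf26b6fc5"),
    "DES S1": bytes.fromhex("0e040d01020f0b08"),
}


def identify_tables(blob: bytes) -> list[tuple[str, int]]:
    """Return (table name, offset) for every signature found in the blob."""
    hits = []
    for name, signature in SIGNATURES.items():
        offset = blob.find(signature)
        while offset != -1:
            hits.append((name, offset))
            offset = blob.find(signature, offset + 1)
    return hits


# Hypothetical memory dump: padding, then the first 16 AES S-box bytes.
dump = b"\x00" * 32 + bytes.fromhex("637c777bf26b6fc53001672bfed7ab76") + b"\xff" * 8
print(identify_tables(dump))
```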
Stream Cipher Patterns (RC4, etc.)
Recognition Signature:
RC4 characteristics:
KSA (Key Scheduling Algorithm):
for i = 0 to 255: S[i] = i
j = 0
for i = 0 to 255:
j = (j + S[i] + key[i % keylen]) % 256
swap(S[i], S[j])
PRGA (Pseudo-Random Generation Algorithm):
i = 0, j = 0
while generating:
i = (i + 1) % 256
j = (j + S[i]) % 256
swap(S[i], S[j])
output = S[(S[i] + S[j]) % 256]
What to look for:
State array initialization:
for (i = 0; i < 256; i++) state[i] = i;
Swap operations:
temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
Modulo arithmetic:
(i + 1) % 256
index & 0xFF (equivalent to % 256)
Simple XOR with keystream:
output[i] = input[i] ^ keystream[i];
ReVa detection:
search-decompilation pattern="(swap|temp.*=.*\\[)"
→ Find array swap operations
get-decompilation of initialization
→ Look for 0-255 loop filling array
find-cross-references to state array
→ Trace usage through KSA and PRGA
Solution approach:
- Extract key from initialization
- Replicate KSA to generate initial state
- Replicate PRGA to generate keystream
- XOR ciphertext with keystream to decrypt
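The replicate-and-XOR steps above fit in a few lines. This is a sketch of textbook RC4, in which the KSA index `j` accumulates across iterations; CTF binaries frequently tweak the algorithm (different state size, extra XOR, dropped keystream bytes), so treat it as a baseline to compare against the decompilation.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: encrypts or decrypts, since the keystream is XORed in."""
    # KSA: start from the identity permutation and mix in the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # PRGA: generate one keystream byte per data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Widely published test vector: key "Key", plaintext "Plaintext".
print(rc4(b"Key", b"Plaintext").hex())
```

If the output for a known test vector differs from what the binary produces, diff the two loops line by line; the deviation is usually the intended twist.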
Hash Function Patterns
Recognition Signature:
MD5/SHA characteristics:
- Fixed initialization vectors (magic constants)
- Block processing (512 bits / 64 bytes)
- Multiple rounds (64 for MD5/SHA-256, 80 for SHA-1)
- Bitwise operations: rotations, XOR, AND, OR, NOT
- Padding: append 0x80, then zeros, then length
Magic constants:
MD5: 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
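SHA-1: the same four words as MD5, plus a fifth: 0xc3d2e1f0
SHA-256: eight 32-bit words taken from the fractional parts of the square roots of the first eight primes (0x6a09e667, 0xbb67ae85, ...)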
What to look for:
Characteristic constants:
Search for 0x67452301 (MD5/SHA-1 IV)
Fixed round counts:
for (round = 0; round < 64; round++) // MD5, SHA-256
for (round = 0; round < 80; round++) // SHA-1
Bitwise rotation macros:
ROTL(x, n) = (x << n) | (x >> (32-n))
Message schedule (W array):
Expands 16 input words to 64/80 words
Padding logic:
Append 0x80, zeros, then 64-bit length
ReVa detection:
search-decompilation pattern="0x67452301"
→ Find MD5/SHA initialization
read-memory at round constants
→ Identify specific hash variant
get-decompilation of hash function
→ Count rounds, identify structure
Solution approach:
- Hash functions are one-way (cannot decrypt)
- If you find hash of flag: need to brute force or use known input
- If you find comparison: extract expected hash, try common flags
- Check for weak hash (MD5, SHA-1) or short input (brute-forceable)
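When the input space is small, brute force is practical. The following sketch assumes an MD5 comparison over short lowercase input; the alphabet, the length bound, and the target word are illustrative assumptions, and `brute_force_md5` is a hypothetical helper name.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def brute_force_md5(target_hex: str, alphabet: str = ascii_lowercase, max_len: int = 4):
    """Try every string up to max_len over the alphabet; return the match or None."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.md5(candidate.encode()).hexdigest() == target_hex:
                return candidate
    return None


# Hypothetical target: in a real challenge this hex digest comes from the binary.
target = hashlib.md5(b"ctf").hexdigest()
print(brute_force_md5(target))
```

The search grows as len(alphabet) ** max_len, so this only helps when the binary constrains length and character set tightly.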
Encoding Patterns
Character Substitution
Recognition Signature:
Lookup table mapping:
output[i] = table[input[i]];
Caesar cipher (shift):
output[i] = (input[i] - 'A' + shift) % 26 + 'A';
Custom alphabet:
const char* alphabet = "ZYXWVUTSRQPONMLKJIHGFEDCBAzyxwvutsrqponmlkjihgfedcba";
output[i] = alphabet[input[i] - 'A']; // uppercase input only; lowercase would index with input[i] - 'a' + 26
What to look for:
- Character array constants (alphabets, substitution tables)
- Character-by-character processing loops
- Range checks: if (c >= 'A' && c <= 'Z')
- Arithmetic on character codes: c - 'A', c + shift
ReVa detection:
search-strings-regex pattern="[A-Z]{26}"
→ Find alphabet strings
search-decompilation pattern="(- 'A'|% 26)"
→ Find character arithmetic
get-decompilation of encoding function
→ Identify substitution pattern
Solution approach:
- Extract substitution table or shift value
- Build reverse mapping
- Apply to encoded data
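Building the reverse mapping is mechanical once the forward table is known. The sketch below assumes the substitution can be expressed as a 256-entry byte table that is a permutation; `caesar_table` is a hypothetical helper that builds such a table for an uppercase-only shift.

```python
def invert_table(table: bytes) -> bytes:
    """Invert a 256-entry substitution table (must be a permutation)."""
    if len(table) != 256 or len(set(table)) != 256:
        raise ValueError("table must be a permutation of all 256 byte values")
    inverse = bytearray(256)
    for plain, cipher in enumerate(table):
        inverse[cipher] = plain
    return bytes(inverse)


def caesar_table(shift: int) -> bytes:
    """Forward table for an uppercase-only Caesar shift; other bytes map to themselves."""
    table = bytearray(range(256))
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = (code - ord("A") + shift) % 26 + ord("A")
    return bytes(table)


forward = caesar_table(3)
encoded = b"FLAG".translate(forward)
print(encoded, encoded.translate(invert_table(forward)))
```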
Binary-to-Text Encodings
Recognition Signature:
Hex encoding:
"0123456789abcdef"
nibble_high = (byte >> 4) & 0xF;
nibble_low = byte & 0xF;
Binary/ASCII:
Converting to "01011010" strings
Custom encodings:
Mapping bytes to multi-character sequences
What to look for:
- Hex digit strings
- Bit extraction: >> 4, & 0xF, & 1
- Character code generation loops
- 1-to-2 or 1-to-8 byte expansion
ReVa detection:
search-decompilation pattern="(>> 4|& 0xf)"
→ Find nibble extraction (hex encoding)
get-strings to find encoding alphabets
→ Check for hex, binary digit strings
Solution approach:
- Identify encoding scheme
- Implement decoder
- Apply to encoded flag
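For the two standard cases, decoders are short. This sketch assumes plain hex digits and 8-bit, most-significant-bit-first binary strings; custom multi-character encodings need their own mapping table.

```python
def decode_hex(text: str) -> bytes:
    """Decode a hex string such as '666c6167' (whitespace is tolerated)."""
    return bytes.fromhex(text)


def decode_binary(text: str) -> bytes:
    """Decode a string of '0'/'1' characters, 8 bits per byte, MSB first."""
    bits = "".join(text.split())
    if len(bits) % 8:
        raise ValueError("bit string length is not a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(decode_hex("666c6167"))
print(decode_binary("01100110 01101100 01100001 01100111"))
```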
Input Validation Patterns
Character-by-Character Comparison
Recognition Signature:
Direct comparison:
for (i = 0; i < len; i++)
if (input[i] != expected[i])
return 0;
return 1;
Comparison with transformation:
for (i = 0; i < len; i++)
if (transform(input[i]) != expected[i])
return 0;
What to look for:
- Loop over input length
- Comparison inside loop: !=, ==
- Early return on mismatch
- Success after full loop completion
ReVa detection:
search-decompilation pattern="(if.*!=|if.*==)"
→ Find comparison operations
get-decompilation of validation function
→ Identify loop structure
read-memory at expected value array
→ Extract expected bytes
Solution approach:
- If direct comparison: read expected array, that's the flag
- If transformed comparison: reverse transformation
- If complex transformation: trace each character
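When each input byte is transformed independently, there is no need to invert the transform algebraically: trying all 256 values per position is enough. In this sketch `transform` is a hypothetical stand-in for whatever the binary does per byte, and `expected` stands in for the array read from memory.

```python
def transform(byte: int) -> int:
    """Hypothetical stand-in for the binary's per-byte transform."""
    return ((byte ^ 0x5A) + 7) & 0xFF


def recover_input(expected: bytes, transform_fn) -> bytes:
    """Brute-force each position independently against the expected array."""
    recovered = bytearray()
    for position, target in enumerate(expected):
        candidates = [b for b in range(256) if transform_fn(b) == target]
        if len(candidates) != 1:
            raise ValueError(f"position {position}: {len(candidates)} candidates")
        recovered.append(candidates[0])
    return bytes(recovered)


# Hypothetical expected array, generated here so the example is self-contained.
expected = bytes(transform(b) for b in b"flag{bytewise}")
print(recover_input(expected, transform))
```

If a position yields several candidates, the transform is not injective and other constraints (printable range, flag format) are needed to pick one.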
Checksum Validation
Recognition Signature:
Sum check:
sum = 0;
for (i = 0; i < len; i++)
sum += input[i];
return (sum == EXPECTED_SUM);
XOR check:
xor = 0;
for (i = 0; i < len; i++)
xor ^= input[i];
return (xor == EXPECTED_XOR);
Custom accumulation:
result = SEED;
for (i = 0; i < len; i++)
result = (result * MULT + input[i]) % MOD;
return (result == EXPECTED);
What to look for:
- Accumulator variable (sum, product, xor)
- Loop updating accumulator
- Final comparison to constant
- May be combined with other checks
ReVa detection:
search-decompilation pattern="(\\+=|\\*=|\\^=)"
→ Find accumulator updates
get-decompilation of validation
→ Identify accumulation pattern
read-memory at expected value
→ Extract target checksum
Solution approach:
- Single checksum: underconstrained (many solutions)
- Multiple checksums: may uniquely identify input
- Extract all constraints, solve as system of equations
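The difference between one checksum and several can be seen by enumeration. This sketch assumes a 3-character lowercase input and the sum and XOR checks shown above; the target word is hypothetical and exists only to generate the expected values.

```python
from functools import reduce
from itertools import product
from operator import xor
from string import ascii_lowercase


def candidates(length, expected_sum=None, expected_xor=None, alphabet=ascii_lowercase):
    """Enumerate inputs of the given length that satisfy the supplied checks."""
    results = []
    for combo in product(alphabet.encode(), repeat=length):
        if expected_sum is not None and sum(combo) != expected_sum:
            continue
        if expected_xor is not None and reduce(xor, combo, 0) != expected_xor:
            continue
        results.append(bytes(combo))
    return results


target = b"key"  # hypothetical input used to derive the expected checksums
by_sum = candidates(3, expected_sum=sum(target))
by_both = candidates(3, expected_sum=sum(target), expected_xor=reduce(xor, target, 0))
print(len(by_sum), len(by_both))
```

Neither check depends on byte order, so every permutation of a solution is also a solution; position-dependent checks are what narrow the set to one input.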
Constraint-Based Validation
Recognition Signature:
Multiple independent checks:
if (input[0] + input[1] != 0x64) return 0;
if (input[0] - input[1] != 0x14) return 0;
if ((input[2] ^ 0x42) != 0x33) return 0;
if (input[3] * 2 != input[4]) return 0;
return 1;
Relational constraints:
if (input[i] != input[j] + 5) return 0;
What to look for:
- Multiple if-statements with comparisons
- Arithmetic operations on input elements
- Relationships between different input positions
- Constants in comparisons
ReVa detection:
get-decompilation of validation function
→ Identify all comparison statements
set-decompilation-comment on each constraint
→ Document relationships
Extract to external solver:
→ List all constraints, solve with z3 or similar
Solution approach:
- Extract all constraints
- Frame as system of equations
- Solve using constraint solver (z3, SMT)
- Verify solution satisfies all constraints
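A solver such as z3 is the usual tool for this, but for a handful of byte-sized unknowns a plain search works and needs no dependencies. The sketch below encodes three of the sample constraints above as predicates and assumes the input bytes are printable ASCII; that domain restriction is an assumption made to keep the search small.

```python
from itertools import product

# Each constraint is a predicate over a candidate tuple of input bytes.
CONSTRAINTS = [
    lambda v: v[0] + v[1] == 0x64,
    lambda v: v[0] - v[1] == 0x14,
    lambda v: (v[2] ^ 0x42) == 0x33,
]


def solve(num_bytes, constraints, domain):
    """Return every tuple over the domain that satisfies all constraints."""
    return [
        candidate
        for candidate in product(domain, repeat=num_bytes)
        if all(check(candidate) for check in constraints)
    ]


# Assumption: inputs are printable ASCII (0x20..0x7E).
solutions = solve(3, CONSTRAINTS, range(0x20, 0x7F))
print([bytes(s) for s in solutions])
```

The search is exponential in the number of unknowns, so beyond four or five coupled bytes an SMT solver is the better choice.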
Algorithm Patterns
Mathematical Sequences
Recognition Signature:
Fibonacci:
a = 0, b = 1;
while (...) {
next = a + b;
a = b;
b = next;
}
Factorial:
result = 1;
for (i = 1; i <= n; i++)
result *= i;
Prime checking:
for (i = 2; i * i <= n; i++)
if (n % i == 0) return 0;
return 1;
What to look for:
- Iterative or recursive patterns
- Arithmetic progressions
- Number theory operations (modulo, divisibility)
- Known sequence generation
ReVa detection:
search-decompilation pattern="(fibonacci|factorial|prime)"
→ Find named functions (if not stripped)
get-decompilation of suspicious function
→ Identify mathematical pattern
Recognize by structure:
→ Two-variable update (Fibonacci)
→ Multiplication accumulator (factorial)
→ Modulo divisibility (prime check)
Solution approach:
- Recognize the algorithm
- Understand how it validates input
- Derive required input or replicate logic
Matrix Operations
Recognition Signature:
Matrix multiplication:
for (i = 0; i < rows; i++)
for (j = 0; j < cols; j++)
for (k = 0; k < inner; k++)
result[i][j] += a[i][k] * b[k][j];
Linear transformations:
output[i] = matrix[i][0] * input[0] + matrix[i][1] * input[1] + ...;
What to look for:
- Triple-nested loops (matrix multiply)
- 2D array indexing: array[i][j] or array[i * width + j]
- Accumulator in inner loop
- Linear combinations of input
ReVa detection:
search-decompilation pattern="\\[.*\\]\\[.*\\]"
→ Find 2D array access
get-decompilation showing nested loops
→ Count loop depth (3 = likely matrix multiply)
read-memory at matrix constants
→ Extract transformation matrix
Solution approach:
- Extract matrix
- Invert matrix (if square and invertible)
- Apply inverse to expected output to get required input
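Applying the inverse does not require computing it explicitly; solving the linear system directly gives the same result. This sketch uses exact rational arithmetic from the standard library and assumes the binary computes in plain integers without wraparound. If results are reduced mod 256 or similar, modular inverses are needed instead. The matrix and secret are hypothetical.

```python
from fractions import Fraction


def solve_linear(matrix, output):
    """Solve matrix * x = output by Gauss-Jordan elimination over the rationals."""
    n = len(matrix)
    aug = [
        [Fraction(v) for v in row] + [Fraction(out)]
        for row, out in zip(matrix, output)
    ]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Hypothetical transformation matrix and the output it produces for b"key".
MATRIX = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
expected = [sum(m * s for m, s in zip(row, b"key")) for row in MATRIX]
recovered = solve_linear(MATRIX, expected)
print(bytes(int(v) for v in recovered))
```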
State Machine Patterns
Recognition Signature:
Explicit state variable:
int state = STATE_INIT;
while (running) {
switch (state) {
case STATE_INIT: /* ... */ state = STATE_READY; break;
case STATE_READY: /* ... */ state = STATE_PROCESS; break;
case STATE_PROCESS: /* ... */ state = STATE_DONE; break;
}
}
Implicit state (position in input):
for (i = 0; i < len; i++) {
if (/* condition based on i and input */)
/* different processing for different positions */
}
What to look for:
- State variable with multiple values
- Large switch statement on state
- State transitions (state = NEW_STATE)
- Different behavior based on current state
ReVa detection:
search-decompilation pattern="(case|switch)"
→ Find switch statements
get-decompilation of state machine
→ Map state transitions
rename-variables to clarify states
→ current_state, next_state, etc.
Solution approach:
- Map state transition graph
- Identify accepting states (success)
- Determine input sequence that reaches accepting state
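Once the transitions are written down as a table, finding an accepting input is a graph search. The sketch below runs a breadth-first search over a hypothetical transition table keyed by (state, input character); the state names echo the example above, but the input symbols are invented.

```python
from collections import deque

# Hypothetical transition table: (state, input character) -> next state.
TRANSITIONS = {
    ("INIT", "a"): "READY",
    ("READY", "a"): "INIT",
    ("READY", "b"): "PROCESS",
    ("PROCESS", "c"): "DONE",
}


def shortest_accepting_input(transitions, start, accepting):
    """Breadth-first search for the shortest input that reaches an accepting state."""
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state in accepting:
            return path
        for (source, symbol), target in transitions.items():
            if source == state and target not in seen:
                seen.add(target)
                queue.append((target, path + symbol))
    return None


print(shortest_accepting_input(TRANSITIONS, "INIT", {"DONE"}))
```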
Obfuscation Patterns
Control Flow Obfuscation
Recognition Signature:
Opaque predicates (always true/false):
if (x * x >= 0) // Always true (barring signed overflow)
real_code();
else
never_executed();
Dispatcher loops:
while (1) {
switch (dispatch_value) {
case 0: /* block A */; dispatch_value = 5; break;
case 5: /* block B */; dispatch_value = 2; break;
case 2: /* block C */; dispatch_value = -1; break;
case -1: return;
}
}
What to look for:
- Unnecessary conditionals
- Complex control flow with simple logic
- Dispatcher-based execution (case jumps)
- Dead code branches
ReVa detection:
get-decompilation of obfuscated function
→ Look for unusual control flow
set-bookmark type="Warning" for suspicious patterns
→ Mark opaque predicates, dispatchers
Focus on data flow, ignore control flow complexity
→ Track input transformation regardless of jumps
Solution approach:
- Ignore obfuscation, trace data flow
- Use dynamic analysis to observe actual execution path
- Simplify manually or with deobfuscation tools
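Manual simplification of a dispatcher loop often amounts to following the dispatch values in order. The sketch below linearizes the sample dispatcher shown above; it assumes every block sets a single, unconditional next value, which real flattened code does not always do. The table format is a hypothetical way of recording what the decompilation shows.

```python
# Dispatch value -> (block label, next dispatch value), copied from the switch cases.
DISPATCH = {
    0: ("block A", 5),
    5: ("block B", 2),
    2: ("block C", -1),
}


def linearize(dispatch, entry, exit_value):
    """Follow unconditional dispatch transitions and return the real block order."""
    order = []
    visited = set()
    value = entry
    while value != exit_value:
        if value in visited:
            raise ValueError("cycle found; blocks are not a straight-line sequence")
        visited.add(value)
        label, value = dispatch[value]
        order.append(label)
    return order


print(linearize(DISPATCH, entry=0, exit_value=-1))
```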
String Obfuscation
Recognition Signature:
Stack strings (character-by-character):
str[0] = 'f'; str[1] = 'l'; str[2] = 'a'; str[3] = 'g';
Encrypted strings (decrypted at runtime):
decrypt_string(encrypted_data, key, output);
Computed strings:
for (i = 0; i < len; i++)
str[i] = base[i] ^ key;
What to look for:
- Character assignments to array
- String decryption functions
- XOR or arithmetic on character arrays
- Strings not visible in static string list
ReVa detection:
get-strings may not show obfuscated strings
→ Use decompilation to find construction
search-decompilation pattern="\\[0\\] = "
→ Find character-by-character assignments
find-cross-references to decryption functions
→ Locate where strings are revealed
Solution approach:
- Identify deobfuscation routine
- Extract encrypted data and key
- Decrypt manually or use dynamic analysis to observe decrypted string
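Stack strings can be reassembled straight from decompiler output. This sketch assumes assignments appear as single-quoted character literals with decimal indices, as in the example above; decompilers that emit hex constants or escape sequences would need a broader pattern.

```python
import re

# Matches assignments such as: str[0] = 'f';
ASSIGNMENT = re.compile(r"(\w+)\[(\d+)\]\s*=\s*'(.)'")


def rebuild_stack_strings(decompiled: str) -> dict[str, str]:
    """Group character assignments by array name and join them in index order."""
    cells = {}
    for name, index, char in ASSIGNMENT.findall(decompiled):
        cells.setdefault(name, {})[int(index)] = char
    return {
        name: "".join(chars[i] for i in sorted(chars))
        for name, chars in cells.items()
    }


snippet = "str[0] = 'f'; str[1] = 'l'; str[2] = 'a'; str[3] = 'g';"
print(rebuild_stack_strings(snippet))
```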
Anti-Debugging (CTF Context)
Recognition Signature:
Debugger detection:
if (ptrace(PTRACE_TRACEME, 0, 1, 0) < 0) exit(1); // Linux
if (IsDebuggerPresent()) exit(1); // Windows
Timing checks:
start = time();
/* short operation */
end = time();
if (end - start > THRESHOLD) exit(1); // Detected breakpoint delay
Self-modification:
Decrypt code section at runtime
Execute decrypted code
Re-encrypt afterwards
What to look for:
- Debugger detection APIs
- Timing measurements
- Memory protection changes
- Code modification at runtime
ReVa detection:
get-symbols includeExternal=true
→ Look for: ptrace, IsDebuggerPresent, time, gettimeofday
search-decompilation pattern="(ptrace|IsDebugger|time)"
→ Find anti-debug checks
find-cross-references to VirtualProtect, mprotect
→ Identify self-modifying code
Solution approach:
- Patch out anti-debug checks (NOP the exit)
- Use anti-anti-debugging tools
- Analyze in sandbox that hides debugger
- For CTF, often acceptable to patch binary
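Patching out a check usually means overwriting the call or branch with NOPs. The sketch below works on an in-memory copy of the bytes and assumes x86/x86-64, where 0x90 is the single-byte NOP; the sample bytes and the offset are hypothetical, and in practice the offset comes from the disassembly.

```python
NOP = 0x90  # x86/x86-64 single-byte NOP; other architectures use different encodings


def nop_out(image: bytes, offset: int, length: int) -> bytes:
    """Return a copy of the image with `length` bytes at `offset` replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(image):
        raise ValueError("patch range is outside the image")
    patched = bytearray(image)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical code: push rbp; mov rbp, rsp; call <anti-debug check>; ret
code = bytes.fromhex("55" "4889e5" "e810000000" "c3")
patched = nop_out(code, offset=4, length=5)  # overwrite the 5-byte call
print(patched.hex())
```

Overwriting the whole instruction keeps the following instructions aligned; patching only part of it would corrupt the instruction stream.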
Common CTF Tricks
Flag Format Validation
Pattern:
Check prefix:
if (strncmp(input, "flag{", 5) != 0) return 0;
Check suffix:
if (input[len-1] != '}') return 0;
Check length:
if (strlen(input) != EXPECTED_LEN) return 0;
What to look for:
- String comparison with literal "flag{" or "CTF{"
- Bracket/brace checks
- Length validation
ReVa detection:
search-strings-regex pattern="(flag\\{|CTF\\{)"
→ Find flag format strings
get-decompilation of validation
→ Extract format requirements
Solution approach:
- Note format requirements
- Focus on solving for content between delimiters
- Reconstruct full flag with proper format
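Separating the wrapper from the content keeps the solving code focused on the inner part. A minimal sketch, assuming the two prefixes mentioned above and content that contains no braces; `split_flag` and `wrap_flag` are hypothetical helper names.

```python
import re

# Accepts flag{...} or CTF{...} with non-empty, brace-free content.
FLAG_RE = re.compile(r"(?:flag|CTF)\{([^{}]+)\}")


def split_flag(candidate: str):
    """Return the content between the delimiters, or None if the format is wrong."""
    match = FLAG_RE.fullmatch(candidate)
    return match.group(1) if match else None


def wrap_flag(content: str, prefix: str = "flag") -> str:
    """Rebuild the full flag from solved content."""
    return f"{prefix}{{{content}}}"


print(split_flag("flag{inner_part}"), wrap_flag("inner_part"))
```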
Multi-Stage Validation
Pattern:
Stage 1: Check format (flag{...})
Stage 2: Check length (must be 32 characters)
Stage 3: Check checksum (sum must equal X)
Stage 4: Check encryption (encrypted content matches Y)
What to look for:
- Multiple validation functions called in sequence
- Early exits on failure
- Progressive constraints
ReVa detection:
find-cross-references to validation function
→ See if called from multi-stage validator
get-decompilation of main validator
→ Identify call sequence
Analyze each stage separately
→ Understand cumulative constraints
Solution approach:
- Solve each stage's constraints
- Combine solutions (flag must satisfy ALL stages)
- Work backwards from most constrained to least
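While working through the stages, it helps to mirror them in a small harness that reports which stage a candidate fails first. Everything in this sketch is hypothetical: the stage functions, the required length of 12, and the checksum target (derived here from a made-up flag so the example is self-contained).

```python
# Hypothetical target checksum; in a real challenge this constant comes from the binary.
EXPECTED_CHECKSUM = sum(b"flag{stages}") % 256


def check_format(flag: str) -> bool:
    return flag.startswith("flag{") and flag.endswith("}")


def check_length(flag: str) -> bool:
    return len(flag) == 12


def check_checksum(flag: str) -> bool:
    return sum(flag.encode()) % 256 == EXPECTED_CHECKSUM


# Stages in the order the validator calls them.
STAGES = [
    ("format", check_format),
    ("length", check_length),
    ("checksum", check_checksum),
]


def first_failure(flag: str):
    """Return the name of the first failing stage, or None if all stages pass."""
    for name, check in STAGES:
        if not check(flag):
            return name
    return None


print(first_failure("flag{stagez}"))
```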
Hidden Success Path
Pattern:
Obvious failure message:
printf("Wrong!\n");
Hidden success logic:
if (/* complex condition */)
system("cat /flag.txt"); // No message, just action
What to look for:
- Success action without visible message
- File access (cat flag, open flag.txt)
- Network communication of flag
- Success indicated by lack of "Wrong" message
ReVa detection:
search-strings-regex pattern="(flag|/flag|flag\\.txt)"
→ Find flag file references
find-cross-references to flag file
→ Locate success path
get-decompilation of success condition
→ Understand requirements
Solution approach:
- Don't rely on "Correct!" message
- Look for flag output actions
- Check for file reads, network sends
- Success may be silent
Using These Patterns
Pattern Matching Workflow
- Observe code structure
  - Loops, conditionals, function calls
  - Data types, array sizes
  - Constants and literals
- Compare to pattern catalog
  - Does this match a crypto pattern?
  - Is this an encoding scheme?
  - Does it look like input validation?
- Verify with specific checks
  Hypothesis: This is AES
  Check 1: read-memory at constant array → Matches AES S-box? ✓
  Check 2: Count loop iterations → 10, 12, or 14? ✓
  Check 3: Block size 16 bytes? ✓
  Conclusion: AES confirmed
- Apply pattern-specific solution
  - AES → Extract key, decrypt
  - XOR → Extract key, XOR again
  - Constraint validation → Extract constraints, solve
Quick Reference Decision Tree
Does it have loops with XOR?
→ Check Simple XOR Patterns
Does it have large constant arrays?
→ Check Block Cipher or Hash Patterns
Does it have swap operations and modulo?
→ Check Stream Cipher Patterns
Does it have character-by-character comparison?
→ Check Input Validation Patterns
Does it have 64-character lookup table?
→ Check Base64 Pattern
Does it have mathematical operations (factorial, fibonacci)?
→ Check Algorithm Patterns
Is control flow overly complex?
→ Check Obfuscation Patterns
Combining Patterns
Real challenges often combine multiple patterns:
Example: Crypto + Validation
Input → Format Check (flag{...}) → XOR Decode → AES Decrypt → Compare to Expected
Solve:
- Extract format requirements
- Identify XOR key
- Identify AES key
- Extract expected value
- Work backwards with the known keys: invert the stages in reverse order, so input = XOR_decode(AES_encrypt(expected)); XOR is its own inverse
Example: Encoding + Constraint
Input → Base64 Decode → Constraint Check (sum == X, product == Y)
Solve:
- Extract constraints on decoded values
- Solve constraints
- Base64 encode solution
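The three steps of the encoding-plus-constraint example can be sketched end to end. This assumes two decoded bytes and hypothetical targets for X and Y (sum 209, product 10920); real challenges involve more bytes and usually call for a proper solver.

```python
import base64


def solve_pair(total: int, product_value: int):
    """Find all byte pairs (a, b) with a + b == total and a * b == product_value."""
    return [
        (a, b)
        for a in range(256)
        for b in range(256)
        if a + b == total and a * b == product_value
    ]


# Hypothetical constraints on the decoded bytes: sum == 209, product == 10920.
pairs = solve_pair(209, 10920)

# The binary decodes its input, so the required input is the encoded solution.
inputs = [base64.b64encode(bytes(pair)).decode() for pair in pairs]
print(pairs, inputs)
```

Sum and product are symmetric, so both orderings satisfy the check; either encoded string should pass unless another stage fixes the order.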
Remember
Patterns are recognition shortcuts, not rigid rules:
- Use patterns to quickly identify challenge type
- Adapt pattern solutions to specific implementation
- If pattern doesn't fit, analyze from first principles
- Document your pattern matches with bookmarks/comments
- Build your own pattern library from experience
Recognizing a pattern early can save hours of analysis by pointing you straight at a solution strategy.