Initial commit

2025-11-30 09:05:19 +08:00
commit 09fec2555b
96 changed files with 24269 additions and 0 deletions
--- a/skills/writing-scripts/references/python.md
+++ b/skills/writing-scripts/references/python.md
@@ -0,0 +1,406 @@
+# Python Scripting Reference
+
+Detailed patterns and examples for Python automation scripts.
+
+## Subprocess Patterns
+
+### Two-Stage Subprocess (Avoid Shell Parsing)
+
+**Problem:** With `shell=True`, the shell interprets characters such as `(`, `|`, and `$` inside the command string, so complex patterns break or behave unexpectedly.
+
+**❌ Don't: shell=True with complex patterns**
+```python
+cmd = 'curl -s "url" | grep -oE "pattern(with|parens)"'
+subprocess.run(cmd, shell=True, ...)
+```
+
+**✅ Do: Separate calls with input= piping**
+```python
+curl_result = subprocess.run(['curl', '-s', url],
+                            capture_output=True, text=True)
+grep_result = subprocess.run(['grep', '-oE', pattern],
+                            input=curl_result.stdout,
+                            capture_output=True, text=True)
+```
+
+### Why List Arguments Work
+
+- Python executes the command directly, with no shell interpretation
+- Arguments are passed as literal strings
+- Special characters like `|(){}` are treated as text, not operators
+
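A quick way to see the points above is to have a child process echo back the arguments it received. This is a minimal sketch, not part of the original reference: the `pattern` value and the inline child script are made up for illustration, and `sys.executable` stands in for whatever command you would actually run.

```python
import subprocess
import sys

# Hypothetical argument full of characters a shell would treat as operators.
pattern = "pattern(with|parens) && $HOME *"

# The child process prints each argument it received on its own line.
child_code = "import sys; print('\\n'.join(sys.argv[1:]))"

result = subprocess.run(
    [sys.executable, "-c", child_code, pattern, "second arg"],
    capture_output=True, text=True, check=True,
)

# Each list element arrives as exactly one argument, unmodified.
received = result.stdout.splitlines()
print(received)
```

Because no shell is involved, nothing expands `$HOME` or `*`, and the space inside `"second arg"` does not split it into two arguments.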
+### When shell=True Is Needed
+
+Only use for hard-coded commands that require shell features:
+- `*` wildcards
+- `~` home directory expansion
+- `&&` operators
+- Environment variable expansion
+
+```python
+# Hard-coded command only
+subprocess.run('ls *.txt | wc -l', shell=True, ...)
+```
+
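Many of the shell features listed above have standard-library equivalents, which can remove the need for `shell=True` altogether. The following is a sketch under that assumption; the file pattern, paths, and the two placeholder "steps" are illustrative only.

```python
import glob
import os
import subprocess
import sys

# Wildcards: glob expands the pattern in Python, so no shell is needed.
txt_files = sorted(glob.glob("*.txt"))
print(f"{len(txt_files)} .txt files")  # rough equivalent of: ls *.txt | wc -l

# "~" and "$VAR" expansion without a shell.
home_config = os.path.expanduser("~/.config")
expanded = os.path.expandvars("$HOME/output")

# "&&" chaining: run the second command only if the first succeeded.
first = subprocess.run([sys.executable, "-c", "print('step 1')"])
if first.returncode == 0:
    subprocess.run([sys.executable, "-c", "print('step 2')"])
```

Note that `os.path.expandvars` leaves unknown variables unchanged rather than replacing them with an empty string, which differs from typical shell behavior.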
+## Debugging Subprocess Failures
+
+### Workflow
+
+1. **Test command in bash first** - Verify it works outside Python
+2. **Add debug output:**
+   ```python
+   result = subprocess.run(cmd, ...)
+   print(f"stdout: {result.stdout[:100]}")
+   print(f"stderr: {result.stderr}")
+   print(f"returncode: {result.returncode}")
+   ```
+3. **Check stderr for shell errors** - Syntax errors indicate shell parsing issues
+4. **Rewrite without shell=True** - Use list arguments and two-stage pattern
+
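The debug output from step 2 can be wrapped in a small helper so every call is logged the same way. `run_debug` is a hypothetical name, and the command shown is a placeholder; this is one possible shape, not a prescribed one.

```python
import shlex
import subprocess
import sys

def run_debug(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run cmd without a shell and log what happened to stderr."""
    # shlex.join shows the command as you would type it in a shell (Python 3.8+).
    print(f"$ {shlex.join(cmd)}", file=sys.stderr)
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(f"returncode: {result.returncode}", file=sys.stderr)
    print(f"stdout: {result.stdout[:100]!r}", file=sys.stderr)
    print(f"stderr: {result.stderr!r}", file=sys.stderr)
    return result

result = run_debug([sys.executable, "-c", "print('hello')"])
```

Printing with `!r` makes trailing newlines and other invisible characters visible in the log.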
+### Common Errors
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| `syntax error near unexpected token '('` | Shell parsing regex/parens | Two-stage subprocess |
+| `command not found` | PATH issue or typo | Check that the command exists with `which` (or `shutil.which()` in Python) |
+| Empty stdout | Command construction error | Debug with stderr output |
+
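For the `command not found` row, the check can also be done from inside the script before running anything. A minimal sketch using `shutil.which`; the helper name `command_exists` is an assumption for illustration.

```python
import shutil

def command_exists(name: str) -> bool:
    """Return True if an executable called `name` can be found on PATH."""
    return shutil.which(name) is not None

for tool in ("curl", "grep"):
    status = "found" if command_exists(tool) else "missing"
    print(f"{tool}: {status}")
```

Running this check up front lets the script fail with a clear message instead of a `FileNotFoundError` partway through.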
+### Debugging Invisible Characters
+
+**Problem:** Files with invisible characters (backspace, null bytes) cause mysterious errors.
+
+**Symptoms:**
+- LaTeX: `Unicode character ^^H (U+0008) not set up for use with LaTeX`
+- Commands fail with "invalid character" but file looks normal
+
+**Detection:**
+```bash
+# Show all characters including invisible ones
+od -c file.txt
+
+# Check specific line range
+sed -n '10,20p' file.txt | od -c
+
+# Find backspaces
+grep -P '\x08' file.txt
+```
+
+**Example output:**
+```
+0000000    %   %       f   i   l   e   .  \n  \b   \   b   e   g   i
+                                            ^^^ backspace character
+```
+
+**Fix:**
+```bash
+# Remove all backspace characters
+tr -d '\b' < corrupted.tex > clean.tex
+
+# Keep only printable characters and newlines
+# (also strips tabs; non-ASCII bytes may be removed too, depending on locale)
+tr -cd '[:print:]\n' < file.txt > clean.txt
+```
+
+**Prevention:** Use proper quoting when generating files (see Bash reference for LaTeX string escaping).
+
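The same detection can be done in Python when the check needs to live inside a script. This is a sketch only: `find_control_chars` is a hypothetical helper, and treating tab, newline, and carriage return as acceptable is an assumption you may want to change.

```python
def find_control_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, repr) for each ASCII control character found.

    Tab, newline, and carriage return are treated as legitimate whitespace.
    """
    allowed = {"\t", "\n", "\r"}
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch not in allowed and (ord(ch) < 0x20 or ord(ch) == 0x7F):
                hits.append((line_no, col, repr(ch)))
    return hits

# Sample text with a backspace at the start of line 2.
sample = "%% file.\n\x08\\begin{document}\n"
for line_no, col, shown in find_control_chars(sample):
    print(f"line {line_no}, column {col}: {shown}")
```

Reporting line and column makes it easier to jump to the offending spot in an editor than reading raw `od -c` output.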
+## Error Handling
+
+### Basic Pattern
+
+```python
+import sys
+import subprocess
+
+try:
+    result = subprocess.run(['command'],
+                          capture_output=True,
+                          text=True,
+                          check=True)  # Raises on non-zero exit
+except subprocess.CalledProcessError as e:
+    print(f"Error: Command failed with exit code {e.returncode}", file=sys.stderr)
+    print(f"stderr: {e.stderr}", file=sys.stderr)
+    sys.exit(1)
+except FileNotFoundError:
+    print("Error: Command not found in PATH", file=sys.stderr)
+    sys.exit(1)
+```
+
+### File Operations
+
+```python
+import sys
+
+try:
+    with open(file_path, 'r') as f:
+        content = f.read()
+except FileNotFoundError:
+    print(f"Error: File not found: {file_path}", file=sys.stderr)
+    sys.exit(1)
+except PermissionError:
+    print(f"Error: Permission denied: {file_path}", file=sys.stderr)
+    sys.exit(1)
+except OSError as e:  # IOError is an alias of OSError in Python 3
+    print(f"Error reading file: {e}", file=sys.stderr)
+    sys.exit(1)
+```
+
+## Argparse Patterns
+
+### Multi-Mode Scripts
+
+```python
+import argparse
+
+parser = argparse.ArgumentParser(description='Script description')
+parser.add_argument('input', nargs='?', help='Input file or topic')
+parser.add_argument('--url', help='Direct URL mode')
+parser.add_argument('--verify', action='store_true', help='Verify output')
+args = parser.parse_args()
+
+# Validate combinations
+if not args.input and not args.url:
+    parser.error("Provide either input or --url")
+```
+
+### Common Flag Patterns
+
+```python
+parser.add_argument('-v', '--verbose', action='store_true',
+                   help='Verbose output')
+parser.add_argument('-f', '--force', action='store_true',
+                   help='Force operation')
+parser.add_argument('-o', '--output', default='output.txt',
+                   help='Output file')
+parser.add_argument('--count', type=int, default=5,
+                   help='Number of items')
+parser.add_argument('--config', type=str,
+                   help='Config file path')
+```
+
+### Mutually Exclusive Groups
+
+```python
+group = parser.add_mutually_exclusive_group()
+group.add_argument('--json', action='store_true')
+group.add_argument('--yaml', action='store_true')
+```
+
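To see how a mutually exclusive group behaves, the parser can be exercised with explicit argument lists. A small self-contained sketch; the description string is a placeholder.

```python
import argparse

parser = argparse.ArgumentParser(description="Export data (illustrative)")
group = parser.add_mutually_exclusive_group()
group.add_argument("--json", action="store_true")
group.add_argument("--yaml", action="store_true")

# Passing an explicit list makes the parser easy to exercise in tests.
args = parser.parse_args(["--json"])
print(args.json, args.yaml)

# Both flags together make argparse print a usage error and exit with status 2.
try:
    parser.parse_args(["--json", "--yaml"])
except SystemExit as exc:
    print(f"argparse exited with status {exc.code}")
```

If exactly one of the options must be supplied, `add_mutually_exclusive_group(required=True)` enforces that as well.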
+## Environment Variables
+
+```python
+import os
+import sys
+
+# ✅ Never hardcode credentials
+API_KEY = os.getenv('API_KEY')
+if not API_KEY:
+    print("Error: API_KEY environment variable not set", file=sys.stderr)
+    sys.exit(1)
+
+# ✅ Provide defaults
+LOG_LEVEL = os.getenv('LOG_LEVEL', 'INFO')
+OUTPUT_DIR = os.getenv('OUTPUT_DIR', './output')
+
+# ✅ Type conversion with defaults
+MAX_RETRIES = int(os.getenv('MAX_RETRIES', '3'))
+TIMEOUT = float(os.getenv('TIMEOUT', '30.0'))
+```
+
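The type conversions above raise a bare `ValueError` traceback if the variable holds something like `MAX_RETRIES=three`. One way to give a clearer message is a small wrapper; `env_int` is a hypothetical helper name, and treating an empty value as unset is an assumption.

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an integer from the environment, falling back to `default`."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        # SystemExit with a string prints the message to stderr and exits with status 1.
        raise SystemExit(f"Error: {name} must be an integer, got {raw!r}") from None

MAX_RETRIES = env_int("MAX_RETRIES", 3)
```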
+## File Processing Patterns
+
+### Process Files Matching Pattern
+
+```python
+import glob
+import sys
+
+def process_files(pattern: str) -> list[str]:
+    """Find and process files matching pattern."""
+    files = glob.glob(pattern, recursive=True)
+    results = []
+
+    for file in files:
+        try:
+            with open(file, 'r') as f:
+                content = f.read()
+                results.append(process(content))  # process() is defined elsewhere
+        except OSError as e:
+            print(f"Error reading {file}: {e}", file=sys.stderr)
+
+    return results
+```
+
+### Safe File Writing
+
+```python
+import os
+import tempfile
+
+def safe_write(file_path: str, content: str) -> None:
+    """Write to a temp file first, then atomically replace the target."""
+    # Create the temp file in the target's directory so both share a filesystem
+    dir_name = os.path.dirname(file_path) or '.'
+    with tempfile.NamedTemporaryFile(mode='w', dir=dir_name,
+                                     delete=False) as tmp:
+        tmp.write(content)
+        tmp_path = tmp.name
+
+    # os.replace is atomic when source and destination are on the same filesystem
+    os.replace(tmp_path, file_path)
+```
+
+## URL Verification
+
+```python
+import subprocess
+
+def verify_url(url: str) -> bool:
+    """Verify URL is accessible with HTTP HEAD request."""
+    result = subprocess.run(['curl', '-I', '-s', url],
+                          capture_output=True, text=True)
+
+    if 'HTTP/2 200' in result.stdout or 'HTTP/1.1 200' in result.stdout:
+        if 'content-type:' in result.stdout.lower():
+            return True
+    return False
+```
+
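The string check above only matches `HTTP/2 200` and `HTTP/1.1 200`. If curl is run with `-L` to follow redirects, it prints one header block per hop, so a slightly more general approach is to parse the status lines and look at the last one. A sketch of just the parsing step; `final_status` and the sample header text are illustrative, not taken from a real response.

```python
import re
from typing import Optional

# Matches status lines such as "HTTP/1.1 301 Moved Permanently" or "HTTP/2 200".
STATUS_LINE = re.compile(r"^HTTP/\d(?:\.\d)?\s+(\d{3})", re.MULTILINE)

def final_status(header_text: str) -> Optional[int]:
    """Return the status code of the last response in curl -I output, if any."""
    codes = STATUS_LINE.findall(header_text)
    return int(codes[-1]) if codes else None

# Made-up header text: a redirect followed by the final response.
sample_headers = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: https://example.com/\r\n"
    "\r\n"
    "HTTP/2 200 \r\n"
    "content-type: text/html\r\n"
    "\r\n"
)
print(final_status(sample_headers))
```

Returning the numeric code lets the caller decide which statuses count as success instead of hard-coding `200`.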
+## Automation Script Patterns
+
+### Dry-Run Mode
+
+```python
+import argparse
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--force', action='store_true',
+                   help='Apply changes (dry-run by default)')
+args = parser.parse_args()
+
+dry_run = not args.force
+
+# Use dry_run flag throughout script
+for item in items:
+    change_description = f"rename {item['old']} → {item['new']}"
+
+    if dry_run:
+        print(f"→ Would {change_description}")
+    else:
+        apply_change(item)
+        print(f"✓ Applied: {change_description}")
+```
+
+### Backup-First Pattern
+
+```python
+from datetime import datetime
+import shutil
+
+def backup_before_modify(config_path: str) -> str:
+    """Create timestamped backup before modifications."""
+    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
+    backup_path = f"{config_path}.backup.{timestamp}"
+
+    shutil.copy2(config_path, backup_path)
+    print(f"✓ Backup created: {backup_path}")
+
+    return backup_path
+
+# Use in operations
+if not dry_run:
+    backup_before_modify(config_path)
+    update_config(config_path)
+```
+
+### Self-Documenting Output
+
+```python
+print("=" * 70)
+print("CONFIGURATION MIGRATION")
+print("=" * 70)
+print()
+
+print("Step 1: Analyzing input files")
+print("-" * 70)
+files = find_files()
+print(f"Found: {len(files)} files")
+for f in files[:5]:
+    print(f"  • {f}")
+print()
+
+print("Step 2: Validating configuration")
+print("-" * 70)
+errors = validate_config()
+if errors:
+    print(f"✗ Found {len(errors)} errors")
+    for error in errors:
+        print(f"  • {error}")
+else:
+    print("✓ Configuration valid")
+```
+
+## Common Pitfalls
+
+### ❌ Using shell=True Unnecessarily
+
+```python
+# Vulnerable and error-prone
+subprocess.run(f'rm -rf {user_input}', shell=True)  # DANGER
+```
+
+### ✅ Use List Arguments
+
+```python
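+# No shell injection; '--' keeps a leading '-' in user_input from being read as an option.
+# Still validate the path before deleting anything.
+subprocess.run(['rm', '-rf', '--', user_input])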
+```
+
+### ❌ Not Handling Encoding
+
+```python
+result = subprocess.run(['cmd'], capture_output=True)
+print(result.stdout)  # bytes, not string
+```
+
+### ✅ Specify text=True
+
+```python
+result = subprocess.run(['cmd'], capture_output=True, text=True)
+print(result.stdout)  # string
+```
+
+### ❌ Ignoring Errors
+
+```python
+result = subprocess.run(['cmd'])
+# No error handling
+```
+
+### ✅ Check Exit Code
+
+```python
+result = subprocess.run(['cmd'], capture_output=True, text=True)
+if result.returncode != 0:
+    print(f"Error: {result.stderr}", file=sys.stderr)
+    sys.exit(1)
+```
+
+## Validation Tools
+
+```bash
+# Check syntax
+python3 -m py_compile script.py
+
+# Lint with pylint
+pip install pylint
+pylint script.py
+
+# Format with black
+pip install black
+black script.py
+
+# Type check with mypy
+pip install mypy
+mypy script.py
+```
+
+## References
+
+- Python subprocess docs: https://docs.python.org/3/library/subprocess.html
+- Real Python subprocess guide: https://realpython.com/python-subprocess/
+- Argparse tutorial: https://docs.python.org/3/howto/argparse.html