Initial commit
skills/writing-scripts/references/python.md
# Python Scripting Reference

Detailed patterns and examples for Python automation scripts.

## Subprocess Patterns

### Two-Stage Subprocess (Avoid Shell Parsing)

**Problem:** With `shell=True`, the whole command string is parsed by the shell, so quotes, pipes, and parentheses inside complex patterns can be misread as shell syntax.

**❌ Don't: shell=True with complex patterns**
```python
cmd = 'curl -s "url" | grep -oE "pattern(with|parens)"'
subprocess.run(cmd, shell=True, ...)
```

**✅ Do: Separate calls with input= piping**
```python
# Stage 1: fetch the page; the URL is passed as a single literal argument
curl_result = subprocess.run(['curl', '-s', url],
                             capture_output=True, text=True)
# Stage 2: feed stage 1's stdout to grep through input= instead of a shell pipe
grep_result = subprocess.run(['grep', '-oE', pattern],
                             input=curl_result.stdout,
                             capture_output=True, text=True)
```

### Why List Arguments Work

- Python executes the command directly (no shell interpretation)
- Arguments are passed as literal strings
- Special characters like `|(){}` are treated as text, not operators

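To see these points in action, the sketch below passes shell metacharacters to a child process as list arguments. The helper name `echo_args` is hypothetical, and the child is simply the running Python interpreter, so no external tools are assumed.

```python
import subprocess
import sys


def echo_args(*args: str) -> str:
    """Run a child Python process that prints the arguments it received."""
    child_code = 'import sys; print(sys.argv[1:])'
    result = subprocess.run([sys.executable, '-c', child_code, *args],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Each argument arrives unchanged: no pipe, glob, or variable expansion happens
received = echo_args('pattern(with|parens)', '$HOME', '*.txt')
print(received)
```

The child reports the three strings exactly as they were written, which is why a regex containing `|` or `()` is safe to pass this way.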
### When shell=True Is Needed

Use `shell=True` only for hard-coded commands that require shell features:
- `*` wildcards
- `~` home directory expansion
- `&&` and `|` operators
- Environment variable expansion

```python
# Hard-coded command only
subprocess.run('ls *.txt | wc -l', shell=True, ...)
```

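Several of these shell features have standard-library equivalents, so a shell is often avoidable even here. The following is a minimal sketch, assuming the goal is the same file count as the `ls *.txt | wc -l` example; the function name `count_txt_files` is illustrative.

```python
import glob
import os


def count_txt_files(directory: str) -> int:
    """Count *.txt entries in a directory without invoking a shell."""
    return len(glob.glob(os.path.join(directory, '*.txt')))


# `~` and environment variables can also be expanded without a shell
home_dir = os.path.expanduser('~')
# Variables that are not set are left in the string unchanged
expanded = os.path.expandvars('$HOME/output')

print(count_txt_files('.'))
```

Keeping the expansion in Python means any user-supplied part of the path is never interpreted as shell syntax.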
## Debugging Subprocess Failures

### Workflow

1. **Test command in bash first** - Verify it works outside Python
2. **Add debug output:**
   ```python
   result = subprocess.run(cmd, ...)
   print(f"stdout: {result.stdout[:100]}")
   print(f"stderr: {result.stderr}")
   print(f"returncode: {result.returncode}")
   ```
3. **Check stderr for shell errors** - Syntax errors indicate shell parsing issues
4. **Rewrite without shell=True** - Use list arguments and two-stage pattern

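Steps 2 to 4 can be folded into one reusable helper. This is a sketch under the assumption that diagnostics should go to stderr so they do not mix with the script's real output; `run_debug` is a hypothetical name.

```python
import subprocess
import sys


def run_debug(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run cmd without a shell and report its output streams and exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(f"command: {cmd}", file=sys.stderr)
    # repr() makes trailing newlines and empty strings visible
    print(f"stdout: {result.stdout[:100]!r}", file=sys.stderr)
    print(f"stderr: {result.stderr!r}", file=sys.stderr)
    print(f"returncode: {result.returncode}", file=sys.stderr)
    return result


# Example: a child process that exits with a non-zero status
result = run_debug([sys.executable, '-c', 'import sys; sys.exit(3)'])
```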
### Common Errors

| Error | Cause | Solution |
|-------|-------|----------|
| `syntax error near unexpected token '('` | Shell parsing regex/parens | Two-stage subprocess |
| `command not found` | PATH issue or typo | Check command exists with `which` |
| Empty stdout | Command construction error | Debug with stderr output |

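The `which` check from the table can also be done from inside the script before any subprocess call. A minimal sketch using `shutil.which`; the helper name `find_missing_commands` and the command list are illustrative.

```python
import shutil


def find_missing_commands(commands: list[str]) -> list[str]:
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


missing = find_missing_commands(['curl', 'grep'])
if missing:
    print(f"Missing commands: {', '.join(missing)}")
```

Checking up front gives one clear message instead of a `FileNotFoundError` partway through the run.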
### Debugging Invisible Characters

**Problem:** Files with invisible characters (backspace, null bytes) cause mysterious errors.

**Symptoms:**
- LaTeX: `Unicode character ^^H (U+0008) not set up for use with LaTeX`
- Commands fail with "invalid character" but the file looks normal

**Detection:**
```bash
# Show all characters including invisible ones
od -c file.txt

# Check specific line range
sed -n '10,20p' file.txt | od -c

# Find backspaces (-P requires GNU grep)
grep -P '\x08' file.txt
```

**Example output:**
```
0000000 % % f i l e . \n \b \ b e g i
                         ^^^ backspace character
```

**Fix:**
```bash
# Remove all backspace characters
tr -d '\b' < corrupted.tex > clean.tex

# Remove all control characters (preserves newlines; tabs are removed too)
tr -cd '[:print:]\n' < file.txt > clean.txt
```

**Prevention:** Use proper quoting when generating files (see Bash reference for LaTeX string escaping).

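The same detection can be sketched in Python when the check needs to run inside a script. The function name `find_control_chars` and the choice to allow tab, newline, and carriage return are assumptions for illustration.

```python
def find_control_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for each unexpected control character."""
    allowed = {'\n', '\t', '\r'}
    hits = []
    for line_no, line in enumerate(text.split('\n'), start=1):
        for col_no, char in enumerate(line, start=1):
            is_control = ord(char) < 32 or ord(char) == 127
            if is_control and char not in allowed:
                hits.append((line_no, col_no, f"U+{ord(char):04X}"))
    return hits


# A backspace where the text "\b" of "\begin" was interpreted as an escape
sample = "%% file.\n\x08egin{document}\n"
print(find_control_chars(sample))
```

For this sample the function reports a single hit at line 2, column 1, with code point U+0008.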
## Error Handling

### Basic Pattern

```python
import sys
import subprocess

try:
    result = subprocess.run(['command'],
                            capture_output=True,
                            text=True,
                            check=True)  # Raises on non-zero exit
except subprocess.CalledProcessError as e:
    print(f"Error: Command failed with exit code {e.returncode}", file=sys.stderr)
    print(f"stderr: {e.stderr}", file=sys.stderr)
    sys.exit(1)
except FileNotFoundError:
    print("Error: Command not found in PATH", file=sys.stderr)
    sys.exit(1)
```

### File Operations

```python
import sys

try:
    with open(file_path, 'r') as f:
        content = f.read()
except FileNotFoundError:
    print(f"Error: File not found: {file_path}", file=sys.stderr)
    sys.exit(1)
except PermissionError:
    print(f"Error: Permission denied: {file_path}", file=sys.stderr)
    sys.exit(1)
except OSError as e:  # Any other I/O failure (IOError is an alias of OSError)
    print(f"Error reading file: {e}", file=sys.stderr)
    sys.exit(1)
```

## Argparse Patterns

### Multi-Mode Scripts

```python
import argparse

parser = argparse.ArgumentParser(description='Script description')
parser.add_argument('input', nargs='?', help='Input file or topic')
parser.add_argument('--url', help='Direct URL mode')
parser.add_argument('--verify', action='store_true', help='Verify output')
args = parser.parse_args()

# Validate combinations
if not args.input and not args.url:
    parser.error("Provide either input or --url")
```

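Mode selection like this is easier to check when the parser is built in a function and `parse_args` receives an explicit argument list. The sketch below reuses the arguments from the block above; `build_parser` and `select_mode` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the same parser as above so it can be reused and tested."""
    parser = argparse.ArgumentParser(description='Script description')
    parser.add_argument('input', nargs='?', help='Input file or topic')
    parser.add_argument('--url', help='Direct URL mode')
    parser.add_argument('--verify', action='store_true', help='Verify output')
    return parser


def select_mode(argv: list[str]) -> str:
    """Return 'url' or 'input' depending on which arguments were given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.input and not args.url:
        parser.error("Provide either input or --url")  # exits with status 2
    return 'url' if args.url else 'input'


print(select_mode(['--url', 'https://example.com']))
print(select_mode(['notes.txt', '--verify']))
```

In the real script the call would be `select_mode(sys.argv[1:])`.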
### Common Flag Patterns

```python
parser.add_argument('-v', '--verbose', action='store_true',
                    help='Verbose output')
parser.add_argument('-f', '--force', action='store_true',
                    help='Force operation')
parser.add_argument('-o', '--output', default='output.txt',
                    help='Output file')
parser.add_argument('--count', type=int, default=5,
                    help='Number of items')
parser.add_argument('--config', type=str,
                    help='Config file path')
```

### Mutually Exclusive Groups

```python
group = parser.add_mutually_exclusive_group()
group.add_argument('--json', action='store_true')
group.add_argument('--yaml', action='store_true')
```

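A short sketch of what the group enforces, passing explicit argument lists so the behavior is visible without a command line. The variable `conflict_exit_code` is only there for illustration.

```python
import argparse

parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
group.add_argument('--json', action='store_true')
group.add_argument('--yaml', action='store_true')

# One flag alone is accepted
args = parser.parse_args(['--json'])
print(args.json, args.yaml)

# Both flags together make argparse print a usage error and exit
try:
    parser.parse_args(['--json', '--yaml'])
except SystemExit as exc:
    conflict_exit_code = exc.code
```

Passing `required=True` to `add_mutually_exclusive_group()` additionally rejects the case where neither flag is given.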
## Environment Variables

```python
import os
import sys

# ✅ Never hardcode credentials
API_KEY = os.getenv('API_KEY')
if not API_KEY:
    print("Error: API_KEY environment variable not set", file=sys.stderr)
    sys.exit(1)

# ✅ Provide defaults
LOG_LEVEL = os.getenv('LOG_LEVEL', 'INFO')
OUTPUT_DIR = os.getenv('OUTPUT_DIR', './output')

# ✅ Type conversion with defaults (raises ValueError if the value is not numeric)
MAX_RETRIES = int(os.getenv('MAX_RETRIES', '3'))
TIMEOUT = float(os.getenv('TIMEOUT', '30.0'))
```

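The type conversions above raise a bare `ValueError` when a variable holds a non-numeric value. One possible way to turn that into a readable message is sketched here; the helper name `env_int` is an assumption, not part of the original reference.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to default if unset."""
    raw = os.getenv(name)
    if raw is None or raw.strip() == '':
        return default
    try:
        return int(raw)
    except ValueError:
        # SystemExit with a string prints it to stderr and exits with status 1
        raise SystemExit(f"Error: {name} must be an integer, got {raw!r}") from None


MAX_RETRIES = env_int('MAX_RETRIES', 3)
```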
## File Processing Patterns

### Process Files Matching Pattern

```python
import glob
import sys

def process(content: str) -> str:
    """Placeholder: replace with your own per-file processing."""
    return content

def process_files(pattern: str) -> list[str]:
    """Find and process files matching pattern."""
    # recursive=True lets '**' in the pattern match nested directories
    files = glob.glob(pattern, recursive=True)
    results = []

    for file in files:
        try:
            with open(file, 'r') as f:
                content = f.read()
            results.append(process(content))
        except OSError as e:
            # Report the failure and continue with the remaining files
            print(f"Error reading {file}: {e}", file=sys.stderr)

    return results
```

### Safe File Writing

```python
import os
import tempfile

def safe_write(file_path: str, content: str) -> None:
    """Write to a temp file first, then atomically replace the target."""
    # Write to temp file in same directory so the rename stays on one filesystem
    dir_name = os.path.dirname(file_path) or '.'
    with tempfile.NamedTemporaryFile(mode='w', dir=dir_name,
                                     delete=False) as tmp:
        tmp.write(content)
        tmp_path = tmp.name

    # Atomic move: os.replace is atomic when both paths share a filesystem
    os.replace(tmp_path, file_path)
```

## URL Verification

```python
import subprocess

def verify_url(url: str) -> bool:
    """Verify URL is accessible with HTTP HEAD request."""
    # -I sends a HEAD request; redirects are not followed, so a 3xx counts as failure
    result = subprocess.run(['curl', '-I', '-s', url],
                            capture_output=True, text=True)

    if 'HTTP/2 200' in result.stdout or 'HTTP/1.1 200' in result.stdout:
        if 'content-type:' in result.stdout.lower():
            return True
    return False
```

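The substring checks above only recognize two exact status lines. As a hedged alternative, the status code could be parsed out of the header text with a regular expression; `parse_status_code` is a hypothetical helper and the sample header text is made up for illustration.

```python
import re
from typing import Optional


def parse_status_code(headers: str) -> Optional[int]:
    """Extract the status code from the first HTTP status line, if any."""
    match = re.search(r'^HTTP/\d(?:\.\d)?\s+(\d{3})', headers, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


sample_headers = "HTTP/1.0 200 OK\r\ncontent-type: text/html\r\n"
print(parse_status_code(sample_headers))
```

Note that this returns the first status line only, so if curl were asked to follow redirects the result would be the redirect status rather than the final one.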
## Automation Script Patterns

### Dry-Run Mode

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--force', action='store_true',
                    help='Apply changes (dry-run by default)')
args = parser.parse_args()

dry_run = not args.force

# Use dry_run flag throughout script
# (`items` and `apply_change()` are placeholders for your own data and logic)
for item in items:
    change_description = f"rename {item['old']} → {item['new']}"

    if dry_run:
        print(f"→ Would {change_description}")
    else:
        apply_change(item)
        print(f"✓ Applied: {change_description}")
```

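A self-contained sketch of the same idea, assuming the change being applied is a file rename. The function `rename_files` and the file names are illustrative; the demo runs in a temporary directory so nothing outside it is touched.

```python
import os
import tempfile


def rename_files(renames: list[tuple[str, str]], dry_run: bool = True) -> list[str]:
    """Rename files, or only report what would change when dry_run is True."""
    messages = []
    for old, new in renames:
        if dry_run:
            messages.append(f"→ Would rename {old} → {new}")
        else:
            os.replace(old, new)
            messages.append(f"✓ Renamed {old} → {new}")
    return messages


with tempfile.TemporaryDirectory() as workdir:
    old_path = os.path.join(workdir, 'draft.txt')
    new_path = os.path.join(workdir, 'final.txt')
    with open(old_path, 'w') as f:
        f.write('example')

    # Dry run (the default): the file stays where it is
    preview = rename_files([(old_path, new_path)])
    untouched = os.path.exists(old_path)

    # Forced run: the rename is actually applied
    applied = rename_files([(old_path, new_path)], dry_run=False)
    renamed = os.path.exists(new_path) and not os.path.exists(old_path)
```

Making dry-run the default means a forgotten flag results in a preview rather than an unintended change.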
### Backup-First Pattern

```python
from datetime import datetime
import shutil

def backup_before_modify(config_path: str) -> str:
    """Create timestamped backup before modifications."""
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_path = f"{config_path}.backup.{timestamp}"

    shutil.copy2(config_path, backup_path)
    print(f"✓ Backup created: {backup_path}")

    return backup_path

# Use in operations
if not dry_run:
    backup_before_modify(config_path)
    update_config(config_path)
```

### Self-Documenting Output

```python
print("=" * 70)
print("CONFIGURATION MIGRATION")
print("=" * 70)
print()

print("Step 1: Analyzing input files")
print("-" * 70)
files = find_files()
print(f"Found: {len(files)} files")
for f in files[:5]:
    print(f"  • {f}")
print()

print("Step 2: Validating configuration")
print("-" * 70)
errors = validate_config()
if errors:
    print(f"✗ Found {len(errors)} errors")
    for error in errors:
        print(f"  • {error}")
else:
    print("✓ Configuration valid")
```

## Common Pitfalls

### ❌ Using shell=True Unnecessarily

```python
# Vulnerable and error-prone
subprocess.run(f'rm -rf {user_input}', shell=True)  # DANGER
```

### ✅ Use List Arguments

```python
# No shell injection; `--` stops rm from reading user_input as an option.
# Still validate user_input before deleting anything.
subprocess.run(['rm', '-rf', '--', user_input])
```

### ❌ Not Handling Encoding

```python
result = subprocess.run(['cmd'], capture_output=True)
print(result.stdout)  # bytes, not string
```

### ✅ Specify text=True

```python
result = subprocess.run(['cmd'], capture_output=True, text=True)
print(result.stdout)  # string
```

### ❌ Ignoring Errors

```python
result = subprocess.run(['cmd'])
# No error handling
```

### ✅ Check Exit Code

```python
import subprocess
import sys

result = subprocess.run(['cmd'], capture_output=True, text=True)
if result.returncode != 0:
    print(f"Error: {result.stderr}", file=sys.stderr)
    sys.exit(1)
```

## Validation Tools

```bash
# Check syntax
python3 -m py_compile script.py

# Lint with pylint
pip install pylint
pylint script.py

# Format with black
pip install black
black script.py

# Type check with mypy
pip install mypy
mypy script.py
```

## References

- Python subprocess docs: https://docs.python.org/3/library/subprocess.html
- Real Python subprocess guide: https://realpython.com/python-subprocess/
- Argparse tutorial: https://docs.python.org/3/howto/argparse.html