---
title: Pre-Commit Quality Standards
description: Project type detection, quality checks, and pre-commit configurations for pre-commit validation
tags: [pre-commit, quality, linting, security, project-types, validation]
---

# Pre-Commit Quality Standards

## Metadata

**Purpose**: Define project type detection patterns, pre-commit quality checks, and standard pre-commit configurations for different project types
**Applies to**: Pre-commit validation commands and pre-commit setup workflows
**Version**: 1.0.0

---

## Instructions

### Project Type Detection

Use file system indicators to determine project type(s):

#### Python Projects
**Indicators**:
- `pyproject.toml` (modern Python packaging)
- `setup.py` (traditional Python packaging)
- `requirements.txt` or `requirements/*.txt`
- `Pipfile` (pipenv)
- `*.py` files in `src/` or the project root

**Detection command**:
```bash
# Packaging and dependency manifests (missing files are silently skipped)
ls pyproject.toml setup.py requirements.txt requirements/*.txt Pipfile 2>/dev/null
# Any Python source file in the root or one level down (e.g. src/)
find . -maxdepth 2 -name "*.py" | head -1
```

#### Data Science Projects
**Indicators**:
- `*.ipynb` Jupyter notebooks
- Common data directories: `data/`, `notebooks/`, `models/`
- Data files: `*.csv`, `*.parquet`, `*.pkl`
- ML config files: `mlflow.yaml`, `dvc.yaml`

**Detection command**:
```bash
find . -name "*.ipynb" | head -1
ls -d data/ notebooks/ models/ 2>/dev/null
```

#### Plugin Marketplace Projects
**Indicators**:
- `.claude-plugin/` directory
- `.claude-plugin/plugin.json` or `.claude-plugin/marketplace.json`
- `commands/`, `skills/`, `agents/` directories

**Detection command**:
```bash
ls -d .claude-plugin/ 2>/dev/null
```

#### Mixed Projects
Projects may have multiple types (e.g., Python + Jupyter, Python + Marketplace). Report all detected types.

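As a rough illustration of reporting every detected type, the indicators above could be combined in one function. This is a minimal sketch, not a prescribed implementation: the function name and the type labels (`python`, `data-science`, `plugin-marketplace`) are hypothetical, and only a subset of the listed indicators is checked.

```python
from pathlib import Path

# Indicator names taken from the lists above; the type labels are hypothetical.
PYTHON_FILES = ("pyproject.toml", "setup.py", "requirements.txt", "Pipfile")
DATA_SCIENCE_DIRS = ("data", "notebooks", "models")


def detect_project_types(root: str) -> list[str]:
    """Return every project type whose indicators are found under root."""
    base = Path(root)
    types = []

    has_manifest = any((base / name).is_file() for name in PYTHON_FILES)
    has_requirements_dir = any(base.glob("requirements/*.txt"))
    has_sources = any(base.glob("*.py")) or any(base.glob("src/*.py"))
    if has_manifest or has_requirements_dir or has_sources:
        types.append("python")

    has_notebooks = any(base.rglob("*.ipynb"))
    has_data_dirs = any((base / name).is_dir() for name in DATA_SCIENCE_DIRS)
    if has_notebooks or has_data_dirs:
        types.append("data-science")

    if (base / ".claude-plugin").is_dir():
        types.append("plugin-marketplace")

    return types
```

Calling `detect_project_types(".")` from the repository root returns a list, so a mixed project naturally reports more than one entry.
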
### Universal Quality Checks

These checks apply to **all project types**:

#### 1. Commit Message Validation
Reference: `commit-message-standards` skill
- Format: `<type>(<scope>): <subject>`
- Length: Subject ≤50 characters
- Issue linking: Based on commit type

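The format and length rules above can be checked with a single regular expression. The sketch below is an assumption-laden illustration: the list of commit types is a guess in the style of Conventional Commits (the authoritative list lives in the `commit-message-standards` skill), the scope is treated as required because the format shows it, and issue linking is not checked.

```python
import re

# Assumed type list; replace with the one defined by the
# commit-message-standards skill.
ASSUMED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")
SUBJECT_MAX = 50

# <type>(<scope>): <subject>
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<subject>\S.*)$"
)


def check_commit_header(header: str, types=ASSUMED_TYPES) -> list[str]:
    """Return a list of problems with the first line of a commit message."""
    match = HEADER_RE.match(header)
    if match is None:
        return ["header does not match <type>(<scope>): <subject>"]
    problems = []
    if match["type"] not in types:
        problems.append(f"unknown type: {match['type']}")
    if len(match["subject"]) > SUBJECT_MAX:
        problems.append(f"subject longer than {SUBJECT_MAX} characters")
    return problems
```

An empty list means the header passed both checks.
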
#### 2. Branch Naming Validation
Reference: `github-workflow-patterns` skill
- Format: `<type>/<description>`
- Valid types: feature, fix, hotfix, refactor, docs, experiment, chore
- Lowercase with hyphens

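The branch rules above translate directly into a pattern. This sketch uses the valid types listed here; allowing digits in the description is an assumption, and the function name is hypothetical.

```python
import re

VALID_TYPES = ("feature", "fix", "hotfix", "refactor", "docs", "experiment", "chore")

# <type>/<description>, with the description written as lowercase words
# separated by single hyphens. Digits are allowed as an assumption.
BRANCH_RE = re.compile(
    r"(?:" + "|".join(VALID_TYPES) + r")/[a-z0-9]+(?:-[a-z0-9]+)*"
)


def is_valid_branch(name: str) -> bool:
    """Return True when the whole branch name matches the convention."""
    return BRANCH_RE.fullmatch(name) is not None
```

For example, `feature/add-login` is accepted, while `Feature/add-login` and `bugfix/typo` are rejected.
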
#### 3. Secret Detection
Scan for common secret patterns:
```regex
# API Keys
(api[_-]?key)[\s:=]+["']?[a-zA-Z0-9]{20,}

# AWS Keys
(aws[_-]?access[_-]?key[_-]?id|AKIA[0-9A-Z]{16})

# Private Keys
-----BEGIN (RSA |DSA |EC )?PRIVATE KEY-----

# Tokens
(github[_-]?token|gh[pous]_[a-zA-Z0-9]{36,})
# OpenAI/Anthropic
(sk-[a-zA-Z0-9]{48})

# Passwords
(password|passwd|pwd)[\s:=]+["'][^"']{8,}

# Generic secrets
(secret|token)[\s:=]+["'][a-zA-Z0-9+/=]{20,}
```

**Check command**:
```bash
# Scan only added lines (skip the +++ file headers); match keywords case-insensitively
git diff --staged -U0 | grep -E '^\+' | grep -vE '^\+\+\+ ' \
  | grep -iE '(api[_-]?key|password|secret|token|aws|akia|sk-[a-z0-9])'
```

#### 4. Large File Detection
**Thresholds**:
- Warn: >5MB
- Fail: >10MB (unless using Git LFS)

**Check command**:
```bash
git diff --staged --name-only | python3 -c "
import sys, os
for line in sys.stdin:
    file = line.strip()
    if os.path.isfile(file):
        size = os.path.getsize(file)
        # Thresholds from the list above; Git LFS tracking is not checked here
        if size > 10*1024*1024:
            print(f'FAIL {file}: {size/1024/1024:.2f} MB')
        elif size > 5*1024*1024:
            print(f'WARN {file}: {size/1024/1024:.2f} MB')
"
```

#### 5. Merge Conflict Markers
```bash
# Added lines in a diff start with '+', so the markers follow that prefix
git diff --staged | grep -E '^\+(<{7}|={7}|>{7})'
```

#### 6. Trailing Whitespace
```bash
git diff --staged --check
```

#### 7. Direct Commits to Protected Branches
```bash
branch=$(git branch --show-current)
if [[ "$branch" == "main" || "$branch" == "master" ]]; then
    echo "ERROR: Direct commits to $branch not allowed" >&2
    exit 1
fi
```

### Project-Specific Quality Checks

#### Python Projects

**Syntax Validation**:
```bash
# Check Python syntax of staged files (added, copied, or modified only)
git diff --staged --name-only --diff-filter=ACM | grep '\.py$' | while IFS= read -r file; do
    python -m py_compile "$file"
done
```

**Common Issues to Detect**:
```bash
# Staged Python files (added, copied, or modified); assumes no spaces in paths
files=$(git diff --staged --name-only --diff-filter=ACM -- '*.py')

if [ -n "$files" ]; then
    # Debug statements
    grep -nE 'import pdb|breakpoint\(\)' $files

    # Print debugging (warn only, not fail)
    grep -n 'print(' $files

    # Hardcoded paths (warn)
    grep -nE '(/Users/|/home/|C:\\)' $files
fi
```

**Type Hints** (for production-tier projects):
```bash
# Check if type hints are present
grep -E ': (str|int|float|bool|List|Dict|Optional)' file.py
```

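The grep above only spots annotations that use a handful of type names. A more precise check can walk the syntax tree with the standard `ast` module. This is a sketch under stated assumptions: the function name is hypothetical, `self` and `cls` are exempt, and `*args`/`**kwargs` are ignored.

```python
import ast


def functions_missing_hints(source: str) -> list[str]:
    """Return names of functions with an unannotated parameter or return type."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        unannotated = [
            p.arg for p in params
            if p.annotation is None and p.arg not in ("self", "cls")
        ]
        if unannotated or node.returns is None:
            missing.append(node.name)
    return missing
```

Because missing type hints are a warning for production-tier projects only, the returned names are better reported than used to block a commit.
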
#### Plugin Marketplace Projects

**JSON Validation**:
```bash
# Validate all plugin.json files
find . -path "*/.claude-plugin/plugin.json" | while IFS= read -r file; do
    python -m json.tool "$file" >/dev/null || echo "Invalid JSON: $file"
done

# Validate marketplace.json
python -m json.tool .claude-plugin/marketplace.json >/dev/null
```

**Markdown Quality**:
```bash
# Check for trailing whitespace (POSIX character class; \s is a GNU extension)
grep -n '[[:space:]]$' *.md

# Check for proper heading hierarchy
# (H1 only once, H2-H6 properly nested)
```

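The heading-hierarchy rule is only stated as a comment above, so here is one possible reading of it in code. The sketch assumes ATX-style (`#`) headings, interprets "properly nested" as "never skip a level when going deeper", and ignores anything inside fenced code blocks. The function name is hypothetical.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+\S")
# Opening or closing code fence: three backticks or three tildes.
FENCE_RE = re.compile(r"^\s*(`{3}|~{3})")


def heading_problems(markdown: str) -> list[str]:
    """Report extra H1 headings and skipped heading levels."""
    problems = []
    in_fence = False
    h1_count = 0
    previous = 0
    for number, line in enumerate(markdown.splitlines(), start=1):
        if FENCE_RE.match(line):
            in_fence = not in_fence
            continue
        match = None if in_fence else HEADING_RE.match(line)
        if match is None:
            continue
        level = len(match.group(1))
        if level == 1:
            h1_count += 1
            if h1_count > 1:
                problems.append(f"line {number}: more than one H1")
        if previous and level > previous + 1:
            problems.append(f"line {number}: H{previous} followed by H{level}")
        previous = level
    return problems
```

Skipping fenced blocks matters for documents like this one, where shell comments inside code fences would otherwise be read as H1 headings.
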
**Plugin Structure Validation**:
- Required files: `.claude-plugin/plugin.json`
- Required fields in plugin.json: name, description, version, author
- Commands must have YAML frontmatter with description
- Skills should follow 3-tier structure (Metadata/Instructions/Resources)

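The required-fields rule for `plugin.json` is easy to check once the file parses. The sketch below covers only that rule; the function name is hypothetical, and it takes the file contents as a string so it can be used with any path.

```python
import json

# Required fields as listed above.
REQUIRED_FIELDS = ("name", "description", "version", "author")


def missing_plugin_fields(plugin_json: str) -> list[str]:
    """Return the required fields absent from a plugin.json document."""
    data = json.loads(plugin_json)
    if not isinstance(data, dict):
        # A non-object document cannot contain any of the fields.
        return list(REQUIRED_FIELDS)
    return [field for field in REQUIRED_FIELDS if field not in data]
```

A typical call would pass `Path(".claude-plugin/plugin.json").read_text()`; invalid JSON raises `json.JSONDecodeError`, which corresponds to the JSON validation failure described earlier.
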
#### Data Science Projects

**Notebook Checks**:
```bash
# Large notebooks (>5MB)
find . -name "*.ipynb" -size +5M

# Check for outputs in notebooks (optional - some teams want outputs cleared)
# Non-empty outputs open the list at end of line in standard notebook formatting
grep -l '"outputs": \[$' *.ipynb

# Check for execution count in notebooks (ignores "execution_count": null)
grep -lE '"execution_count": [0-9]+' *.ipynb
```

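The grep checks above depend on how the notebook JSON happens to be formatted. Parsing the file avoids that. This sketch assumes the nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the function name is hypothetical.

```python
import json


def cells_with_outputs(notebook_json: str) -> list[int]:
    """Return indexes of code cells that still carry outputs or an execution count."""
    notebook = json.loads(notebook_json)
    dirty = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        if cell.get("outputs") or cell.get("execution_count") is not None:
            dirty.append(index)
    return dirty
```

An empty result means the notebook has been cleared, which is the state the `nbstripout` hook in the Data Science template produces.
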
**Data File Checks**:
```bash
# Large data files that shouldn't be in git
find data/ -type f -size +10M 2>/dev/null

# Check if Git LFS is configured for data files
git lfs ls-files | grep -E '\.(csv|parquet|pkl|h5)$'
```

**Model File Checks**:
```bash
# Large model files
find models/ -type f -size +100M 2>/dev/null
```

---

## Resources

### Pre-Commit Configuration Templates

#### Python Project Template

```yaml
# .pre-commit-config.yaml for Python projects
repos:
  # Code formatting
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
        language_version: python3.11

  # Linting
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.9
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]

  # Type checking
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0
    hooks:
      - id: mypy
        additional_dependencies: [types-all]

  # Security and general checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-json
      - id: check-added-large-files
        args: ['--maxkb=10000']
      - id: check-merge-conflict
      - id: detect-private-key

  # Secret detection
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
```

#### Plugin Marketplace Project Template

```yaml
# .pre-commit-config.yaml for Plugin Marketplace projects
repos:
  # Markdown linting
  - repo: https://github.com/igorshubovych/markdownlint-cli
    rev: v0.38.0
    hooks:
      - id: markdownlint
        args: [--fix]

  # YAML validation
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-yaml
      - id: check-json
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-merge-conflict

  # JSON formatting
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v3.1.0
    hooks:
      - id: prettier
        types_or: [json, yaml, markdown]
```

#### Data Science Project Template

```yaml
# .pre-commit-config.yaml for Data Science projects
repos:
  # Python formatting and linting (same as Python template)
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.9
    hooks:
      - id: ruff

  # Jupyter notebook handling
  - repo: https://github.com/nbQA-dev/nbQA
    rev: 1.7.1
    hooks:
      - id: nbqa-black
      - id: nbqa-ruff

  # Clear notebook outputs
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout

  # General checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-added-large-files
        args: ['--maxkb=5000']
      - id: detect-private-key
```

### Secret Detection Patterns Reference

**High-confidence patterns** (should fail):
- AWS Access Key: `AKIA[0-9A-Z]{16}`
- GitHub Personal Access Token: `ghp_[a-zA-Z0-9]{36}`
- GitHub OAuth Token: `gho_[a-zA-Z0-9]{36}`
- Private Key: `-----BEGIN.*PRIVATE KEY-----`
- Anthropic/OpenAI Key: `sk-[a-zA-Z0-9]{48}`

**Medium-confidence patterns** (should warn):
- Generic API key assignment: `api_key\s*=\s*["'][^"']+["']`
- Password assignment: `password\s*=\s*["'][^"']+["']`
- Token assignment: `token\s*=\s*["'][^"']+["']`

**Low-confidence patterns** (informational only):
- Environment variable usage: `os.getenv('API_KEY')`
- Config references: `config['secret']`

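The high and medium tiers above can be applied line by line to decide whether a finding should fail or warn. The following sketch copies the patterns from the two lists; the function and helper names are hypothetical, and the low-confidence tier is left out because it is informational only.

```python
import re

# High-confidence patterns, copied from the list above.
HIGH = [
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    re.compile(r"gho_[a-zA-Z0-9]{36}"),
    re.compile(r"-----BEGIN.*PRIVATE KEY-----"),
    re.compile(r"sk-[a-zA-Z0-9]{48}"),
]


def _assignment(name: str) -> re.Pattern:
    """Build the medium-confidence pattern: name = "value" or name = 'value'."""
    return re.compile(name + r"""\s*=\s*["'][^"']+["']""")


MEDIUM = [_assignment(name) for name in ("api_key", "password", "token")]


def classify_line(line: str) -> str:
    """Return 'high', 'medium', or 'none' for one line of added text."""
    if any(pattern.search(line) for pattern in HIGH):
        return "high"
    if any(pattern.search(line) for pattern in MEDIUM):
        return "medium"
    return "none"
```

High-confidence patterns are tested first so that a line matching both tiers is reported at the stricter level.
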
### Quality Check Severity Levels

**FAIL (Blocking)**:
- Secrets detected (high confidence)
- Syntax errors
- Merge conflict markers
- Direct commits to main/master
- Files >10MB without Git LFS

**WARN (Non-blocking)**:
- Print statements in Python code
- Files >5MB
- Debug statements in non-test files
- Missing type hints (production tier only)
- Secrets detected (medium confidence)

**INFO (Informational)**:
- Pre-commit not configured
- Optional checks not run
- Secrets detected (low confidence)
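
One way to turn these levels into a pass/fail decision is to collect findings with a severity attached and block only on FAIL. The names below (`Severity`, `commit_allowed`) are hypothetical and the sketch assumes findings are gathered elsewhere as `(severity, message)` pairs.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the list above, ordered from least to most severe."""
    INFO = 0
    WARN = 1
    FAIL = 2


def commit_allowed(findings: list[tuple[Severity, str]]) -> bool:
    """A commit is blocked only when at least one finding is FAIL."""
    return all(severity < Severity.FAIL for severity, _ in findings)
```

Using an ordered enum keeps WARN and INFO findings reportable without affecting the result.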