Initial commit

Zhongwei Li
2025-11-29 18:03:17 +08:00
commit df4751ce28
11 changed files with 2308 additions and 0 deletions


@@ -0,0 +1,372 @@
---
name: semantic-search
description: Use semantic search to find relevant code and documentation when user asks about specific functionality, features, or implementation patterns. Automatically invoke when user asks "where is...", "how does... work", "find code that...", or similar conceptual queries. More powerful than grep for concept-based searches. Uses odino CLI with BGE embeddings for fully local semantic search.
allowed-tools: Bash, Read
---
# Semantic Search
## Overview
Enable natural language semantic search across codebases and notes using odino CLI with BGE embeddings. Unlike grep (exact text matching) or glob (filename patterns), semantic search finds code by what it does, not what it's called.
## When to Use This Skill
Automatically invoke semantic search when the user:
- Asks "where is [concept]" or "how does [feature] work"
- Wants to find implementation of a concept/pattern
- Needs to understand codebase structure around a topic
- Searches for patterns by meaning, not exact text
- Asks exploratory questions like "show me authentication logic"
**Do not use** for:
- Exact string matching (use grep)
- Filename patterns (use glob)
- Known file paths (use read)
- When the user explicitly requests grep/glob
## Directory Traversal Logic
Odino looks for the `.odino/` config in the directory where the command runs (or in the directory passed with `-p`). To make this transparent (like git), use this helper function:
```bash
# Find the directory containing .odino by walking up the directory tree
find_odino_root() {
  local dir="$PWD"
  while :; do
    if [[ -d "$dir/.odino" ]]; then
      echo "$dir"
      return 0
    fi
    # Stop after checking the filesystem root itself
    [[ "$dir" == "/" ]] && return 1
    dir="$(dirname "$dir")"
  done
}
# Usage in commands
if ODINO_ROOT=$(find_odino_root); then
  echo "Found index at: $ODINO_ROOT"
  (cd "$ODINO_ROOT" && odino query -q "$QUERY")
else
  echo "No .odino index found in current path"
  echo "Suggestion: Run /semq:index to create an index"
fi
```
**Why this matters:**
- User can be in any subdirectory of their project
- Commands automatically find the project root (where `.odino/` lives)
- Mirrors git behavior (works from anywhere in the tree)
## Quick Start
### Check if Directory is Indexed
Before searching, verify an index exists:
```bash
if ODINO_ROOT=$(find_odino_root); then
  (cd "$ODINO_ROOT" && odino status)
else
  echo "No index found. Suggest running /semq:index"
fi
```
### Search Indexed Codebase
```bash
# Basic search
odino query -q "authentication logic"
# With directory traversal
if ODINO_ROOT=$(find_odino_root); then
  (cd "$ODINO_ROOT" && odino query -q "$QUERY")
fi
```
### Parse and Present Results
Odino returns results in a formatted table:
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ File ┃ Score ┃ Content ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ knowledge/Search Algorithms.md │ 0.361 │ 1 --- │
│ │ │ 2 tags: [todo/stub] │
│ │ │ 3 module: CMPU 4010 │
│ │ │ ... │
│ │ │ 7 # Search Algorithms in AI │
│ │ │ ... │
└─────────────────────────────────┴──────────┴─────────────────────────────────┘
Found 2 results
```
**Enhanced workflow:**
1. Parse table to extract file paths, scores, and content previews
2. Read top 2-3 results (score > 0.3) for full context
3. Summarize findings with explanations
4. Use code-pointer to open most relevant file
5. Suggest follow-up queries or related concepts
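Step 1 of the workflow above can be sketched with awk. This is a minimal sketch, assuming the table layout shown in the sample output (data rows delimited by `│`, with the file path and score in the first two columns). `parse_odino_table` is a hypothetical helper name, not part of odino, and paths that the table wraps across several rows are not handled.

```bash
# Hypothetical helper: read an odino result table on stdin and print
# "score<TAB>path" for every row that starts a new result.
parse_odino_table() {
  awk -F'│' '
    NF >= 4 {
      file = $2; score = $3
      # Trim the padding around each cell
      gsub(/^[ \t]+|[ \t]+$/, "", file)
      gsub(/^[ \t]+|[ \t]+$/, "", score)
      # Continuation rows have empty File and Score cells; skip them
      if (file != "" && score ~ /^[0-9.]+$/) print score "\t" file
    }'
}
# Usage (assumes an index exists):
#   odino query -q "$QUERY" | parse_odino_table | sort -rn | head -3
```

Header and border rows use different box-drawing characters, so they never split into enough fields and are ignored.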
## Query Inference
Transform user requests into better semantic queries with realistic output examples.
### Example 1: Conceptual Query
**User asks:** "error handling"
**Inferred query:** `error handling exception management try catch validation`
**Sample odino output:**
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ File ┃ Score ┃ Content ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ knowledge/Error Handling.md │ 0.876 │ 1 --- │
│ │ │ 2 tags: [software-eng, best- │
│ │ │ 3 --- │
│ │ │ 4 # Error Handling │
│ │ │ 5 │
│ │ │ 6 Error handling is the proc │
│ │ │ 7 runtime errors gracefully │
│ │ │ 8 system stability. │
│ │ │ 9 │
│ │ │ 10 ## Key Concepts │
│ │ │ 11 - Try-catch blocks for syn │
│ │ │ 12 - Promise rejection handli │
│ │ │ 13 - Input validation to prev │
│ │ │ 14 - Logging errors for debug │
│ │ │ 15 - User-friendly error mess │
│ │ │ 16 │
│ │ │ 17 ## Best Practices │
│ │ │ 18 1. Fail fast - validate ea │
│ │ │ 19 2. Log with context - incl │
│ │ │ 20 3. Don't swallow errors - │
└─────────────────────────────────┴──────────┴─────────────────────────────────┘
```
### Example 2: Code Query
**User asks:** "DB connection code"
**Inferred query:**
```
database connection pooling setup
import mysql.connector
pool = mysql.connector.pooling.MySQLConnectionPool(
    pool_name="mypool",
    pool_size=5,
    host="localhost",
    database="mydb"
)
connection = pool.get_connection()
```
**Sample odino output:**
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ File ┃ Score ┃ Content ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ src/db/connection.js │ 0.924 │ 1 const mysql = require('mys │
│ │ │ 2 │
│ │ │ 3 // Create connection pool │
│ │ │ 4 const pool = mysql.createP │
│ │ │ 5 host: process.env.DB_HOS │
│ │ │ 6 user: process.env.DB_USE │
│ │ │ 7 password: process.env.DB │
│ │ │ 8 database: process.env.DB │
│ │ │ 9 waitForConnections: true │
│ │ │ 10 connectionLimit: 10, │
│ │ │ 11 queueLimit: 0 │
│ │ │ 12 }); │
│ │ │ 13 │
│ │ │ 14 // Test connection │
│ │ │ 15 pool.getConnection((err, c │
│ │ │ 16 if (err) { │
│ │ │ 17 console.error('DB conn │
│ │ │ 18 process.exit(1); │
│ │ │ 19 } │
│ │ │ 20 console.log('Connected t │
└─────────────────────────────────┴──────────┴─────────────────────────────────┘
```
### Example 3: Algorithm Query (with code)
**User asks:** "BFS algorithm in Python"
**Inferred query:**
```
breadth first search BFS graph traversal queue
def bfs(graph, start):
    visited = set()
    queue = [start]
    while queue:
        node = queue.pop(0)
        if node not in visited:
            visited.add(node)
            queue.extend(graph[node])
    return visited
```
**Sample odino output:**
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ File ┃ Score ┃ Content ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ knowledge/Search Algorithms.md │ 0.891 │ 1 --- │
│ │ │ 2 tags: [ai, algorithms] │
│ │ │ 3 module: CMPU 4010 AI │
│ │ │ 4 --- │
│ │ │ 5 # Search Algorithms in AI │
│ │ │ 6 │
│ │ │ 7 Algorithms for finding sol │
│ │ │ 8 problem spaces. Used in pa │
│ │ │ 9 game AI, and optimization. │
│ │ │ 10 │
│ │ │ 11 ## Types │
│ │ │ 12 │
│ │ │ 13 ### Uninformed Search │
│ │ │ 14 - **BFS**: Explores level │
│ │ │ 15 - **DFS**: Explores deeply │
│ │ │ 16 - **Uniform Cost**: Expand │
│ │ │ 17 │
│ │ │ 18 ### Informed Search │
│ │ │ 19 - **A***: Uses heuristic + │
│ │ │ 20 - **Greedy**: Only conside │
│ │ │ 21 - **Hill Climbing**: Local │
└─────────────────────────────────┴──────────┴─────────────────────────────────┘
```
### Inference Patterns
- **Expand abbreviations:** DB → database, auth → authentication
- **Code queries include sample code:** User asks "connection pooling" → Query includes Python example with `pool.get_connection()`
- **Use specified language:** User mentions "JavaScript" → Use JavaScript syntax in query
- **Default to Python:** No language specified → Use Python code examples
- **Add related concepts:** "search" → include BFS, DFS, A* terminology
- **Add context words:** "handling", "management", "setup", "configuration"
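The abbreviation-expansion pattern could be sketched as a small shell function. This is only an illustration: `expand_query` is a hypothetical helper, and it covers just the two mappings named in the list above (DB and auth).

```bash
# Hypothetical helper: expand common abbreviations before querying.
# Only the mappings mentioned in the list above are included.
expand_query() {
  local -a words=() out=()
  local word
  read -ra words <<< "$1"
  for word in "${words[@]}"; do
    case "$word" in
      db|DB|Db)       out+=("database") ;;
      auth|Auth|AUTH) out+=("authentication") ;;
      *)              out+=("$word") ;;
    esac
  done
  echo "${out[*]}"
}
# Usage:
#   odino query -q "$(expand_query "DB connection code")"
```

Related concepts and context words would still need to be added by hand or by the model, since they depend on the topic of the query.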
## Core Capabilities
### 1. Semantic Search
Find code by describing what it does, not exact text:
**User asks:** "Where is the database connection handling?"
**Workflow:**
1. Check if directory is indexed (use `find_odino_root`)
2. Run `odino query -q "database connection handling"`
3. Parse results and rank by score
4. Read top 2-3 results for context
5. Summarize findings with file paths
6. Suggest using code-pointer to open specific files
**Example:**
```bash
if ODINO_ROOT=$(find_odino_root); then
  RESULTS=$(cd "$ODINO_ROOT" && odino query -q "database connection handling")
  # Parse results, read top files, summarize
else
  echo "No index found. Would you like me to index this directory?"
fi
```
### 2. Index Status Check
Verify indexing status before operations:
```bash
if ODINO_ROOT=$(find_odino_root); then
  (cd "$ODINO_ROOT" && odino status)
  # Shows: indexed files, model, last update
else
  echo "No .odino index found"
fi
```
### 3. Integration with Other Tools
**Semantic search → code-pointer:**
```bash
# After finding relevant file
echo "Found authentication logic in src/auth/middleware.js:42"
echo "Opening file..."
code -g src/auth/middleware.js:42
```
**Semantic search → grep refinement:**
```bash
# Use semantic search to find the area
odino query -q "API endpoint handlers"
# Then use grep for exact matches in those files
grep -n "app.get\|app.post" src/routes/*.js
```
### 4. Handling Edge Cases
**No index found:**
```bash
if ! ODINO_ROOT=$(find_odino_root); then
  echo "No semantic search index found in current path."
  echo ""
  echo "To create an index, run:"
  echo " /semq:index"
  echo ""
  echo "This will index the current directory for semantic search."
fi
```
**Empty results:**
```bash
if [[ -z "$RESULTS" ]]; then
  echo "No results found for query: $QUERY"
  echo ""
  echo "Suggestions:"
  echo "- Try a different query (more general or specific)"
  echo "- Verify the index is up to date (/semq:status)"
  echo "- Consider using grep for exact text matching"
fi
```
## Slash Commands
This skill provides several slash commands for explicit control:
- **`/semq:search <query>`** - Search indexed codebase
- **`/semq:here <query>`** - Search with automatic directory traversal
- **`/semq:index [path]`** - Index directory for semantic search
- **`/semq:status [path]`** - Show indexing status and stats
## Best Practices
1. **Always check for index first** - Use `find_odino_root` before search operations
2. **Parse results clearly** - Show scores, file paths, and context
3. **Combine with other tools** - Use code-pointer for opening files, grep for exact matches
4. **Handle failures gracefully** - Suggest solutions when no index or no results
5. **Read top results** - Provide context by reading the most relevant files
6. **Use directory traversal** - Don't assume user is in project root
## Effective Query Patterns
Good queries are conceptual, not literal:
- ❌ "config.js" → Use glob instead
- ✅ "configuration loading logic"
- ❌ "validateEmail" → Use grep instead
- ✅ "email validation functions"
- ❌ "class AuthService" → Use grep instead
- ✅ "authentication service implementation"
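The distinction above can be approximated with a rough heuristic on the query text. The following is a sketch under stated assumptions, not a rule odino applies: `suggest_tool` is a hypothetical helper, and the regular expressions are illustrative guesses at what "looks like a filename" or "looks like an identifier".

```bash
# Hypothetical helper: guess which tool suits a query string.
suggest_tool() {
  local q="$1"
  local re_file='^[^[:space:]]+\.[A-Za-z0-9]+$'      # e.g. config.js
  local re_ident='^[A-Za-z_][A-Za-z0-9_]*$'          # a single identifier
  local re_camel='[a-z][A-Z]|_'                      # camelCase or snake_case
  local re_decl='^(class|function|def)[[:space:]]'   # e.g. "class AuthService"
  if [[ $q =~ $re_file ]]; then
    echo "glob"
  elif [[ $q =~ $re_ident && $q =~ $re_camel ]]; then
    echo "grep"
  elif [[ $q =~ $re_decl ]]; then
    echo "grep"
  else
    echo "semantic"
  fi
}
# Usage:
#   suggest_tool "validateEmail"               # prints: grep
#   suggest_tool "email validation functions"  # prints: semantic
```

A plain single word such as "validation" falls through to semantic search, because it carries no camelCase or underscore hint that it is a symbol name.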
## Technical Details
**Model:** BAAI/bge-small-en-v1.5 (33M params, ~133MB)
**Vector DB:** ChromaDB (stored in `.odino/chroma_db/`)
**Index location:** `.odino/` directory in project root
**Embedding batch size:** 16 (GPU) or 8 (CPU)
## Reference Documentation
For detailed information, see:
- **`references/cli_basics.md`** - Odino CLI syntax, commands, and options
- **`references/search_patterns.md`** - Effective query examples and tips
- **`references/integration.md`** - Workflows with code-pointer, grep, glob
Load these references as needed for deeper technical details or complex use cases.


@@ -0,0 +1,226 @@
# Odino CLI Basics
Reference guide for odino CLI syntax, commands, and options.
## Installation
```bash
# Install via pipx (recommended)
pipx install odino
# Verify installation
odino --version
```
## Core Commands
### `odino index`
Index a directory for semantic search.
```bash
# Index current directory
odino index
# Index specific directory
odino index /path/to/project
# Specify model (recommended: BGE for efficiency)
odino index --model BAAI/bge-small-en-v1.5
# Force reindex (ignores existing index)
odino index --force
```
**Configuration:**
- Creates `.odino/` directory in indexed location
- Stores config in `.odino/config.json`
- Stores embeddings in `.odino/chroma_db/`
**Default model:** EmmanuelEA/eea-embedding-gemma (600MB)
**Recommended model:** BAAI/bge-small-en-v1.5 (133MB, faster)
### `odino query`
Search indexed directory using natural language.
```bash
# Basic search
odino query -q "authentication logic"
# Search with custom number of results
odino query -q "database connections" -n 10
# Search specific path
odino query -q "error handling" -p /path/to/indexed/dir
```
**Options:**
- `-q, --query <QUERY>` - Natural language search query (required)
- `-n, --num-results <N>` - Number of results to return (default: 5)
- `-p, --path <PATH>` - Path to indexed directory (default: current dir)
**Output format:**
```
Score: 0.85 | Path: src/auth/middleware.js
Score: 0.78 | Path: src/auth/tokens.js
Score: 0.72 | Path: src/utils/validation.js
```
### `odino status`
Show indexing status and statistics.
```bash
# Status for current directory
odino status
# Status for specific path
odino status -p /path/to/project
```
**Output includes:**
- Number of indexed files
- Total chunks generated
- Model name
- Index location
- Last modified date
## Configuration File
Location: `.odino/config.json` in indexed directory
```json
{
  "model_name": "BAAI/bge-small-en-v1.5",
  "embedding_batch_size": 16,
  "chunk_size": 512,
  "chunk_overlap": 50
}
```
**Key settings:**
- `model_name` - Embedding model to use
- `embedding_batch_size` - Batch size for GPU/CPU (16 for GPU, 8 for CPU)
- `chunk_size` - Token length for text chunks
- `chunk_overlap` - Overlap between chunks
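To get a feel for how `chunk_size` and `chunk_overlap` interact, here is a rough estimate of the number of chunks a file produces. It assumes a plain sliding window in which each chunk starts `chunk_size - chunk_overlap` tokens after the previous one; odino's actual chunker may behave differently, and `estimate_chunks` is a hypothetical helper.

```bash
# Hypothetical helper: estimate chunk count for a file of N tokens,
# assuming a sliding window with the given size and overlap.
estimate_chunks() {
  local tokens="$1" size="${2:-512}" overlap="${3:-50}"
  local stride=$(( size - overlap ))   # tokens advanced per chunk
  if (( tokens <= 0 )); then
    echo 0
  elif (( tokens <= size )); then
    echo 1
  else
    # Ceiling of (tokens - overlap) / stride
    echo $(( (tokens - overlap + stride - 1) / stride ))
  fi
}
# Usage:
#   estimate_chunks 2000   # → 5 (with the defaults 512 and 50)
```

With the default settings the window advances 462 tokens at a time, so a 2000-token file needs chunks starting at 0, 462, 924, 1386, and 1848.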
## .odinoignore File
Create `.odinoignore` in project root to exclude files/directories (gitignore syntax):
```
# Build artifacts
build/
dist/
*.pyc
__pycache__/
# Dependencies
node_modules/
venv/
.venv/
# Config files
.env
.env.local
*.secret
```
## Model Comparison
| Model | Size | Params | MTEB Score | Speed |
|-------|------|--------|------------|-------|
| eea-embedding-gemma | 600MB | 308M | 69.67 | Slower |
| bge-small-en-v1.5 | 133MB | 33M | ~62-63 | Faster |
**Recommendation:** Use BGE for most cases (smaller, faster, good quality)
## Common CLI Patterns
**Index with BGE model:**
```bash
odino index --model BAAI/bge-small-en-v1.5
```
**Search from subdirectory:**
```bash
# Requires finding .odino directory first (see SKILL.md)
cd project/src/utils
odino query -q "validation" -p ../..
```
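The relative path in the pattern above has to be worked out by hand. A possible alternative, sketched here, walks up the tree and passes the discovered directory to the documented `-p` option. `odino_query_from_anywhere` is a hypothetical wrapper name, not an odino command.

```bash
# Hypothetical wrapper: locate the nearest .odino directory above $PWD
# and run the query against it with -p.
odino_query_from_anywhere() {
  local query="$1" dir="$PWD"
  while :; do
    if [[ -d "$dir/.odino" ]]; then
      odino query -q "$query" -p "$dir"
      return
    fi
    [[ "$dir" == "/" ]] && break
    dir="$(dirname "$dir")"
  done
  echo "No .odino index found above $PWD" >&2
  return 1
}
# Usage:
#   cd project/src/utils && odino_query_from_anywhere "validation"
```

Unlike the `cd`-based helper in SKILL.md, this keeps the caller's working directory untouched.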
**Reindex after code changes:**
```bash
odino index --force
```
**Check if directory is indexed:**
```bash
if [[ -d .odino ]]; then
  echo "Directory is indexed"
  odino status
else
  echo "Directory is not indexed"
fi
```
## Performance Tips
1. **Use BGE model** - 78% smaller, 90% fewer parameters, only ~7 point MTEB drop
2. **Adjust batch size** - Use 16 for GPU, 8 for CPU
3. **Use .odinoignore** - Exclude build artifacts, dependencies, config files
4. **GPU acceleration** - Much faster indexing if CUDA available
5. **Chunking strategy** - Default 512 tokens works well for most code
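The percentages in tip 1 follow from the Model Comparison table. As a quick check of the arithmetic (wrapped in a hypothetical `model_savings` function so it is easy to rerun):

```bash
# Recompute the savings quoted above from the Model Comparison table.
model_savings() {
  awk 'BEGIN {
    printf "size reduction:  %.1f%%\n", (1 - 133 / 600) * 100
    printf "param reduction: %.1f%%\n", (1 - 33 / 308) * 100
    printf "MTEB difference: %.2f to %.2f points\n", 69.67 - 63, 69.67 - 62
  }'
}
model_savings
# size reduction:  77.8%
# param reduction: 89.3%
# MTEB difference: 6.67 to 7.67 points
```

So "78% smaller" and "~7 point drop" match the table, and "90% fewer parameters" is 89.3% rounded up.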
## Troubleshooting
**"Command not found: odino"**
```bash
# Ensure pipx bin directory is in PATH
export PATH="$HOME/.local/bin:$PATH"
# Or reinstall
pipx install odino
```
**"No index found"**
```bash
# Check for .odino directory
ls -la .odino
# If missing, index first
odino index --model BAAI/bge-small-en-v1.5
```
**GPU out of memory**
```bash
# Reduce the batch size in .odino/config.json, for example:
#   "embedding_batch_size": 8    (or even 4)
# Then reindex
odino index --force
```
**Slow indexing**
```bash
# Use smaller model
odino index --model BAAI/bge-small-en-v1.5
# Reduce batch size if GPU memory limited
# Edit .odino/config.json: "embedding_batch_size": 8
```
## Exit Codes
- `0` - Success
- `1` - General error (no index, invalid path, etc.)
- `2` - Invalid arguments
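A script can branch on these codes to give a more helpful message. The sketch below assumes only the three codes listed above; `explain_odino_exit` is a hypothetical helper and the message wording is illustrative.

```bash
# Hypothetical helper: turn an odino exit code into a hint for the user.
explain_odino_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "general error: check that an index exists and the path is valid" ;;
    2) echo "invalid arguments: check the command-line flags" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}
# Usage:
#   odino query -q "authentication logic"
#   explain_odino_exit $?
```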
## Environment Variables
Odino respects standard environment variables:
- `CUDA_VISIBLE_DEVICES` - GPU selection
- `HF_HOME` - Hugging Face cache directory (for model downloads)


@@ -0,0 +1,449 @@
# Integration with Other Tools
Guide to combining semantic search with other Claude Code tools for powerful workflows.
## Tool Integration Matrix
| Tool | Purpose | When to Use After Semantic Search |
|------|---------|-----------------------------------|
| **code-pointer** | Open files at specific lines | When you want to view/edit found code |
| **grep** | Exact text matching | To find specific strings in sem-search results |
| **glob** | File pattern matching | To filter results by file type/location |
| **read** | Read file contents | To examine top semantic search results |
## Semantic Search → Code-Pointer
**Use case:** Open relevant files in VSCode after semantic search
### Basic Pattern
```bash
# 1. Find relevant code with semantic search
RESULTS=$(odino query -q "authentication middleware")
# Example output:
# Score: 0.89 | Path: src/middleware/auth.js
# Score: 0.82 | Path: src/middleware/jwt.js
# 2. Open top result in VSCode
code -g src/middleware/auth.js
```
### With Line Numbers
When you know the specific section:
```bash
# Find the file
odino query -q "JWT token generation"
# → src/auth/jwt.js
# Read the file to find exact line
grep -n "generateToken" src/auth/jwt.js
# → 42:function generateToken(user) {
# Open at specific line
code -g src/auth/jwt.js:42
```
### Automated Workflow
```bash
# Parse top result and open automatically
# (assumes the "Score: N | Path: FILE" line format shown above)
TOP_FILE=$(odino query -q "password hashing" | head -1 | sed 's/^.*| *Path: *//')
code -g "$TOP_FILE"
```
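Opening the top hit unconditionally can land on an irrelevant file when nothing scores well. A more cautious variant, sketched below, only returns a path when the first result meets a minimum score. It assumes the `Score: N | Path: FILE` line format used in the examples in this guide; `top_path_if_confident` is a hypothetical helper and the 0.80 default is an illustrative threshold.

```bash
# Hypothetical helper: print the path of the first result line if its
# score meets the minimum; otherwise print nothing and return 1.
top_path_if_confident() {
  local min="${1:-0.80}"
  awk -v min="$min" -F'|' '
    /^Score:/ {
      score = $1; sub(/^Score:[ \t]*/, "", score); sub(/[ \t]+$/, "", score)
      path = $2;  sub(/^[ \t]*Path:[ \t]*/, "", path); sub(/[ \t]+$/, "", path)
      if (score + 0 >= min + 0) { print path; found = 1 }
      exit   # only the first result line is considered
    }
    END { exit (found ? 0 : 1) }'
}
# Usage:
#   if TOP_FILE=$(odino query -q "password hashing" | top_path_if_confident 0.80); then
#     code -g "$TOP_FILE"
#   fi
```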
## Semantic Search → Grep
**Use case:** Find exact text in files discovered by semantic search
### Two-Stage Search
```bash
# Stage 1: Find relevant area with semantic search
odino query -q "database connection handling"
# → Found: src/db/connection.js, src/db/pool.js
# Stage 2: Find exact string in those files
grep -n "createConnection\|getConnection" src/db/*.js
```
### Narrowing Results
```bash
# Broad semantic search
odino query -q "API endpoints"
# → Returns 20+ files
# Narrow to specific endpoint with grep
grep -r "app.post('/users')" .
```
### Finding Patterns
```bash
# Find error handling code
odino query -q "error handling patterns"
# → src/middleware/error.js, src/utils/errors.js
# Find specific error types
grep -r "try.*catch\|throw new" src/middleware/error.js src/utils/errors.js
```
## Semantic Search → Glob
**Use case:** Filter semantic search results by file patterns
### File Type Filtering
```bash
# Find configuration code
odino query -q "application configuration"
# → Might return .js, .json, .yaml files
# Focus on just config files
find . -name "*.config.js" -o -name "config.json"
```
### Directory-Specific Search
```bash
# Find test files related to authentication
odino query -q "authentication testing"
# Then narrow to test directory (the ** pattern needs globstar in bash)
shopt -s globstar
ls tests/**/*auth*.test.js
```
## Semantic Search → Read
**Use case:** Examine top results to understand context
### Standard Workflow
```bash
# 1. Semantic search
odino query -q "user registration logic"
# → Top 3 results:
# Score: 0.91 | Path: src/routes/auth.js
# Score: 0.85 | Path: src/services/user.js
# Score: 0.79 | Path: src/validators/user.js
# 2. Read top result
cat src/routes/auth.js
# 3. Read related files for full context
cat src/services/user.js
cat src/validators/user.js
```
### With Summary
```bash
# Find relevant code
odino query -q "payment processing"
# Read top results and summarize
echo "## Payment Processing Implementation"
echo ""
echo "### Main handler:"
head -20 src/routes/payment.js
echo ""
echo "### Service layer:"
head -20 src/services/payment.js
```
## Complete Workflow Examples
### Example 1: Debugging a Feature
**Goal:** Fix bug in user login
```bash
# 1. Find login code with semantic search
odino query -q "user login authentication"
# → src/routes/auth.js (0.92)
# → src/services/auth.js (0.88)
# → src/middleware/passport.js (0.81)
# 2. Read main login handler
grep -A 20 "post.*login" src/routes/auth.js
# 3. Find error handling in that file
grep -n "catch\|error" src/routes/auth.js
# 4. Open at error handling line
code -g src/routes/auth.js:67
```
### Example 2: Understanding a Feature
**Goal:** Understand how caching works
```bash
# 1. Find caching implementation
odino query -q "caching implementation and configuration"
# → src/cache/redis.js (0.94)
# → src/config/cache.js (0.87)
# → src/middleware/cache.js (0.79)
# 2. Read configuration first
cat src/config/cache.js
# 3. Then read main implementation
cat src/cache/redis.js
# 4. Find usage examples
grep -r "cache.get\|cache.set" src/ | head -10
# 5. Open implementation in editor
code -g src/cache/redis.js
```
### Example 3: Refactoring
**Goal:** Replace all database connection code
```bash
# 1. Find all database connection code
odino query -q "database connection creation and management"
# → Returns 5 files
# 2. Find exact connection creation calls
grep -rn "createConnection\|mysql.connect\|pg.Pool" src/
# 3. Check each file with semantic context
odino query -q "connection pooling"
odino query -q "database transaction handling"
# 4. Open files for editing
code -g src/db/connection.js src/db/pool.js src/db/transaction.js
```
### Example 4: Code Review
**Goal:** Review all authentication changes
```bash
# 1. Find authentication code
odino query -q "authentication and authorization logic"
# 2. Find recent changes (combine with git)
git diff main -- src/auth/ src/middleware/auth.js
# 3. Semantic search for related security code
odino query -q "security validation and sanitization"
# 4. Check for vulnerabilities
grep -r "eval\|exec\|innerHTML" src/auth/
```
## Advanced Integration Patterns
### Cascading Search
Start broad, narrow progressively:
```bash
# Level 1: Semantic search (broad)
odino query -q "user management"
# → 15 files
# Level 2: File pattern (medium)
find src/ -path "*/users/*" -name "*.js"
# → 8 files
# Level 3: Grep (narrow)
grep -l "updateUser\|deleteUser" src/users/*.js
# → 3 files
# Level 4: Open specific files
code -g src/users/controller.js src/users/service.js
```
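The grep level of the cascade can be applied directly to a list of candidate paths instead of a hand-picked directory. A minimal sketch, assuming the paths arrive one per line on stdin (`narrow_with_grep` is a hypothetical helper):

```bash
# Hypothetical helper: keep only the candidate files (one path per line
# on stdin) whose contents match an extended regular expression.
narrow_with_grep() {
  local pattern="$1" path
  while IFS= read -r path; do
    if [[ -f "$path" ]] && grep -q -E -- "$pattern" "$path"; then
      printf '%s\n' "$path"
    fi
  done
  return 0
}
# Usage:
#   printf '%s\n' src/users/controller.js src/users/service.js \
#     | narrow_with_grep 'updateUser|deleteUser'
```

Paths that no longer exist are skipped silently, which helps when the index is slightly out of date.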
### Context Building
Build understanding layer by layer:
```bash
# 1. High-level architecture
odino query -q "application architecture and structure"
# 2. Specific subsystem
odino query -q "data access layer implementation"
# 3. Specific functionality
odino query -q "CRUD operations for users"
# 4. Exact implementation
grep -A 30 "function createUser" src/data/users.js
```
### Verification Workflow
Verify semantic search results:
```bash
# 1. Semantic search
odino query -q "input validation"
# → src/validators/user.js (0.87)
# 2. Verify by reading
head -50 src/validators/user.js
# 3. Find all validation usage
grep -r "import.*validator" src/
# 4. Cross-reference with tests
odino query -q "validation testing"
```
## When to Use Each Tool
### Use semantic search when:
- Exploring unfamiliar codebase
- Finding conceptual implementations
- Locating cross-cutting concerns
- Understanding feature architecture
### Use grep when:
- Searching for exact strings
- Finding variable/function usages
- Locating error messages
- Searching for TODOs/FIXMEs
### Use glob when:
- Finding files by name pattern
- Filtering by file type
- Working with specific directories
- Batch operations on matching files
### Use code-pointer when:
- Need to view/edit specific code
- Opening files at exact lines
- Navigating to definitions
- Debugging specific sections
### Use read when:
- Examining file contents
- Understanding context
- Checking configuration
- Reviewing small files
## Tool Selection Decision Tree
```
Need to find code?
├─ Know exact string? → Use grep
├─ Know filename pattern? → Use glob
├─ Know file path? → Use read
└─ Know concept only? → Use semantic search
   └─ Got results? → Use code-pointer to open
      └─ Need exact line? → Use grep in file
```
## Best Practices
1. **Start semantic, end specific**
   - Semantic search → Find area
   - Grep → Find exact code
   - Code-pointer → Open for editing
2. **Read before editing**
   - Semantic search → Find files
   - Read → Understand context
   - Code-pointer → Open in editor
3. **Verify with multiple tools**
   - Semantic search → Find candidates
   - Grep → Verify it's the right code
   - Read → Confirm implementation
4. **Build context progressively**
   - Semantic search → High-level structure
   - Semantic search → Specific subsystem
   - Read/Grep → Detailed implementation
5. **Combine for complex tasks**
   - Semantic search → Find authentication
   - Grep → Find specific function
   - Code-pointer → Open file
   - Read → Check related files
## Performance Tips
1. **Cache semantic results**
   - Run semantic search once
   - Use results for multiple grep/read operations
2. **Narrow scope early**
   - Use semantic search to identify directory
   - Then use grep/glob only in that directory
3. **Batch file operations**
   - Collect file paths from semantic search
   - Read multiple files in one pass
4. **Use appropriate tool**
   - Don't use semantic search for exact strings
   - Don't use grep for conceptual searches
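Caching semantic results could look like the sketch below, which stores each query's output in a file keyed by a checksum of the query text. Everything here is an assumption for illustration: `cached_query` is a hypothetical helper, `ODINO_CACHE_DIR` is a made-up variable (odino does not read it), and the cache is never invalidated when the index changes.

```bash
# Hypothetical helper: run each distinct query once and reuse its output.
# ODINO_CACHE_DIR is an illustrative variable, not something odino reads.
cached_query() {
  local query="$1"
  local cache_dir="${ODINO_CACHE_DIR:-${TMPDIR:-/tmp}/odino-cache}"
  local key file
  key=$(printf '%s' "$query" | cksum | cut -d' ' -f1)
  file="$cache_dir/$key.txt"
  mkdir -p "$cache_dir"
  if [[ ! -s "$file" ]]; then
    # Cache miss: run the real query and keep the output on success
    odino query -q "$query" > "$file" || { rm -f "$file"; return 1; }
  fi
  cat "$file"
}
# Usage:
#   cached_query "payment processing" | head -3
#   cached_query "payment processing" | grep "service"   # served from cache
```

Delete the cache directory after reindexing so stale results are not reused.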
## Common Mistakes to Avoid
**❌ Using semantic search for exact matches:**
```bash
# Wrong
odino query -q "function validateEmail"
# Right
grep -r "function validateEmail" .
```
**❌ Using grep for concepts:**
```bash
# Wrong (might miss variations)
grep -r "auth" .
# Right
odino query -q "authentication implementation"
```
**❌ Not reading semantic results:**
```bash
# Wrong (blindly trusting results)
odino query -q "payment" | head -1 | cut -d'|' -f2
# Right (verify first)
odino query -q "payment"
cat [top-result] # Verify it's actually payment code
```
**❌ Opening too many files:**
```bash
# Wrong
odino query -q "validation" | while read line; do
code -g "$(echo $line | cut -d'|' -f2)"
done # Opens 20+ files
# Right
odino query -q "validation" | head -3
# Review, then open specific files
code -g src/validators/user.js
```
## Summary
**Effective integration means:**
- Using the right tool for each task
- Starting broad (semantic) and narrowing (grep)
- Verifying results before acting
- Building context progressively
- Combining tools for complex workflows
**Remember:**
- Semantic search → Finding concepts
- Grep → Finding exact text
- Glob → Finding files
- Read → Understanding context
- Code-pointer → Editing code


@@ -0,0 +1,346 @@
# Effective Semantic Search Patterns
Guide to crafting effective semantic search queries and interpreting results.
## Query Design Principles
### Be Conceptual, Not Literal
Semantic search works best with conceptual queries that describe **what the code does**, not what it's named.
**❌ Poor queries (too literal):**
- "validateEmail" → Use grep instead
- "config.js" → Use glob instead
- "class AuthService" → Use grep instead
- "TODO" → Use grep instead
**✅ Good queries (conceptual):**
- "email validation logic"
- "configuration loading and parsing"
- "authentication service implementation"
- "incomplete features or pending work"
### Use Natural Language
Write queries as you would explain the concept to a colleague.
**❌ Keyword stuffing:**
- "db connect pool config"
**✅ Natural language:**
- "database connection pooling configuration"
### Be Specific When Needed
Balance specificity with generality based on what you're looking for.
**Too general:**
- "functions" → Will match everything
- "code" → Will match everything
**Too specific:**
- "JWT token validation using bcrypt with salt rounds set to 10" → Too narrow
**Just right:**
- "JWT token validation"
- "password hashing and verification"
## Common Query Patterns
### Finding Implementation
**Pattern:** "[concept] implementation" or "how [feature] works"
```bash
odino query -q "authentication implementation"
odino query -q "how caching works"
odino query -q "error handling implementation"
```
### Finding Configuration
**Pattern:** "[system/feature] configuration" or "[thing] setup"
```bash
odino query -q "database configuration"
odino query -q "API endpoint setup"
odino query -q "logging configuration"
```
### Finding Patterns
**Pattern:** "[pattern/technique] usage" or "examples of [pattern]"
```bash
odino query -q "middleware usage patterns"
odino query -q "dependency injection examples"
odino query -q "async/await error handling"
```
### Finding by Purpose
**Pattern:** "code that [does something]" or "[action] logic"
```bash
odino query -q "code that validates user input"
odino query -q "file upload logic"
odino query -q "payment processing"
```
### Finding Documentation
**Pattern:** "[topic] documentation" or "how to [task]"
```bash
odino query -q "API documentation"
odino query -q "how to deploy the application"
odino query -q "setup instructions"
```
## Result Interpretation
### Understanding Scores
Odino returns results with similarity scores (0.0 to 1.0):
- **0.85-1.0**: Highly relevant, almost certainly what you're looking for
- **0.70-0.84**: Likely relevant, worth checking
- **0.60-0.69**: Possibly relevant, may contain related concepts
- **<0.60**: Weakly related, probably not useful
**Example output:**
```
Score: 0.92 | Path: src/auth/jwt.js # Definitely check this
Score: 0.78 | Path: src/middleware/auth.js # Likely relevant
Score: 0.64 | Path: src/utils/crypto.js # Maybe related
Score: 0.51 | Path: src/config/index.js # Probably not it
```
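The score bands above can be turned into a small labelling function. This sketch simply encodes the thresholds listed in this section; `score_label` is a hypothetical helper, and awk is used because bash arithmetic does not handle decimals.

```bash
# Hypothetical helper: map a similarity score to the bands listed above.
score_label() {
  awk -v s="$1" 'BEGIN {
    if      (s + 0 >= 0.85) print "highly relevant"
    else if (s + 0 >= 0.70) print "likely relevant"
    else if (s + 0 >= 0.60) print "possibly relevant"
    else                    print "weakly related"
  }'
}
# Usage:
#   score_label 0.78   # prints: likely relevant
```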
### When to Read Files
**Always read:** Top 1-2 results (score > 0.80)
**Sometimes read:** Next 2-3 results (score 0.65-0.80) for context
**Rarely read:** Results with score < 0.65
### Combining Results
Often the answer spans multiple files:
```bash
odino query -q "user authentication flow"
# Results might include:
# - Login endpoint (score: 0.89)
# - JWT generation (score: 0.85)
# - Password verification (score: 0.82)
# - Session management (score: 0.76)
```
Read top results to understand the complete picture.
## Refinement Strategies
### Too Many Results
Make query more specific:
```bash
# Too broad
odino query -q "validation"
# Better
odino query -q "email format validation"
```
### Too Few Results
Make query more general:
```bash
# Too narrow
odino query -q "SHA256 password hashing with bcrypt"
# Better
odino query -q "password hashing"
```
### Wrong Results
Try different phrasing:
```bash
# If "API endpoint handlers" doesn't work well
odino query -q "route definitions"
odino query -q "HTTP request handlers"
odino query -q "REST API implementation"
```
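Trying several phrasings is easier to compare when the results are printed side by side. A minimal sketch, using the `-n` option from the CLI reference to keep each result list short (`query_variations` is a hypothetical helper):

```bash
# Hypothetical helper: run several phrasings of the same question and
# print the top three results of each under a heading.
query_variations() {
  local q
  for q in "$@"; do
    printf '== %s ==\n' "$q"
    odino query -q "$q" -n 3
  done
}
# Usage:
#   query_variations "route definitions" "HTTP request handlers" "REST API implementation"
```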
## Advanced Patterns
### Multi-Concept Queries
Combine related concepts for broader coverage:
```bash
odino query -q "authentication and authorization logic"
odino query -q "database queries and ORM usage"
odino query -q "error handling and logging"
```
### Feature-Specific Queries
Target specific features or subsystems:
```bash
odino query -q "user registration feature"
odino query -q "shopping cart functionality"
odino query -q "notification system"
```
### Cross-Cutting Concerns
Find patterns that span the codebase:
```bash
odino query -q "error handling patterns"
odino query -q "input validation across endpoints"
odino query -q "database transaction usage"
```
## Query Examples by Use Case
### Code Exploration
"I'm new to this codebase, where do I start?"
```bash
odino query -q "main application entry point"
odino query -q "core business logic"
odino query -q "primary data models"
```
### Bug Hunting
"There's a bug in feature X, where's the code?"
```bash
odino query -q "user login functionality"
odino query -q "payment processing logic"
odino query -q "email sending implementation"
```
### Refactoring
"I need to change how we do X across the codebase"
```bash
odino query -q "database connection creation"
odino query -q "API key validation"
odino query -q "date formatting and parsing"
```
### Learning Patterns
"How does this codebase handle X?"
```bash
odino query -q "dependency injection patterns"
odino query -q "testing approach and examples"
odino query -q "configuration management"
```
## When Semantic Search Doesn't Help
Use other tools when:
1. **Exact string needed** - Use `grep`
```bash
grep -r "validateEmail" .
```
2. **Filename patterns** - Use `glob` or `find`
```bash
find . -name "*config*.js"
```
3. **Known file location** - Use `read` directly
```bash
# Just read the file
cat src/config/database.js
```
4. **Symbol definitions** - Use language-specific tools
```bash
# For Python
grep -r "class AuthService" .
# For JavaScript
grep -r "export.*AuthService" .
```
## Combining Tools Workflow
**Best practice:** Start semantic, narrow with grep/glob
```bash
# 1. Find the general area with semantic search
odino query -q "database migrations"
# → Found: migrations/2024-01-15-add-users.sql
# 2. Narrow to specific files/patterns
find migrations/ -name "*users*"
# 3. Search for exact strings in those files
grep -n "CREATE TABLE" migrations/*users*.sql
```
## Tips for Better Results
1. **Use verbs and nouns** - "validates user input" not just "validation"
2. **Include context** - "email validation in registration" not just "email"
3. **Think about purpose** - What does the code **do**, not what it's **called**
4. **Try synonyms** - "authentication" vs "login" vs "sign in"
5. **Be patient** - Try 2-3 query variations if first doesn't work
6. **Check top 3-5 results** - Sometimes #3 is the best match
7. **Combine with file reading** - Read top results to confirm relevance
## Anti-Patterns to Avoid
**❌ Searching for variable names:**
```bash
odino query -q "userEmail" # Use grep instead
```
**❌ Searching for exact error messages:**
```bash
odino query -q "Error: Connection refused" # Use grep instead
```
**❌ Searching for file paths:**
```bash
odino query -q "src/utils/validation.js" # Use find/glob instead
```
**❌ Searching for TODOs/comments:**
```bash
odino query -q "TODO fix this" # Use grep instead
```
**❌ Overly generic queries:**
```bash
odino query -q "code" # Way too broad
```
## Summary
**Good semantic queries are:**
- Conceptual, not literal
- Natural language, not keywords
- Focused on purpose/behavior
- Appropriately specific
**After getting results:**
- Check scores (> 0.70 is good)
- Read top 2-3 files for context
- Combine with grep/glob for precision
- Iterate query if needed