Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 09:00:26 +08:00
commit 11bee70e53
13 changed files with 2402 additions and 0 deletions

@@ -0,0 +1,196 @@
# Buildkite Annotation Patterns
Annotations are build-level messages that appear at the top of a build page. They can contain success messages, warnings, errors, or informational content. **Not all projects use annotations consistently.**
## What Are Annotations?
Annotations are created by build steps using the `buildkite-agent annotate` command. They appear prominently at the top of the build page and can be styled with different colors/icons.
### Annotation Styles
- **`success`** - Green, checkmark icon, positive message
- **`info`** - Blue, info icon, informational message (most common)
- **`warning`** - Yellow, warning icon, something to be aware of
- **`error`** - Red, error icon, indicates problems
## Project-Specific Patterns
### Projects That Use Annotations Heavily (e.g., zen-payroll)
These projects surface important information in annotations:
1. **Test Failures**: RSpec, Jest, or other test failures may be summarized in annotations
   - Failed test count
   - Links to failed test files
   - Stack traces or error messages
2. **Coverage Reports**: Code coverage changes or drops below thresholds
3. **Linting Errors**: Rubocop, ESLint violations grouped by severity
4. **Build Resources**: Links to documentation, help channels, or common issues
5. **Security Scans**: Dependency vulnerabilities, security warnings
6. **Performance Issues**: Slow tests, memory issues, or other performance concerns
**When checking status**: Always look at annotations first for these projects. They often contain the most actionable information.
### Projects Without Annotations (e.g., gusto-karafka)
Smaller or simpler projects may not use annotations at all. For these projects:
- **All failure information is in job logs**: Must read individual job output
- **No centralized summary**: Need to check each failed job separately
- **Simpler debugging path**: Less information to parse, but more manual work
## Accessing Annotations
### Via MCP Tools
```javascript
// List all annotations for a build
mcp__MCPProxy__call_tool('buildkite:list_annotations', {
  org_slug: 'gusto',
  pipeline_slug: 'zenpayroll',
  build_number: '1359675',
});
```
Annotation response includes:
- `context`: Unique identifier for the annotation
- `style`: success/info/warning/error
- `body_html`: HTML content of the annotation
- `created_at`: Timestamp
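Because `body_html` is HTML, it is often easier to read after being reduced to plain text. The following is a minimal sketch only: `annotationToText` is a hypothetical helper (not part of any Buildkite tool), and regex-based tag stripping is a rough approximation that assumes simple, well-formed markup.

```javascript
// Hypothetical helper: reduce an annotation's body_html to readable plain text.
// Assumes simple markup; this is not a full HTML parser.
function annotationToText(annotation) {
  const html = annotation.body_html || '';
  const text = html
    .replace(/<br\s*\/?>/gi, '\n') // line breaks become newlines
    .replace(/<\/(p|div|li|h[1-6])>/gi, '\n') // block-level closers become newlines
    .replace(/<[^>]+>/g, '') // drop all remaining tags
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // decode &amp; last to avoid double-decoding
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join('\n');
}
```

For anything beyond quick inspection, a real HTML parser would be a safer choice.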
### Via bktide
```bash
npx bktide annotations gusto/zenpayroll#1359675
```
## Interpreting Annotations
### 1. Start with Error-Styled Annotations
Check for `style: "error"` first - these indicate critical problems:
- Test suite failures
- Build failures
- Security issues
### 2. Check Warning Annotations
`style: "warning"` may indicate:
- Degraded performance
- Coverage drops
- Flaky tests
- Deprecated dependencies
### 3. Info Annotations for Context
`style: "info"` often contains:
- Build metadata
- Links to resources
- Change summaries
- Help information
### 4. Success Annotations
`style: "success"` indicates:
- All tests passed
- Coverage improved
- Performance metrics good
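The reading order above (error, then warning, then info, then success) can be sketched as a sort over the `style` field. `prioritizeAnnotations` and `STYLE_PRIORITY` are illustrative names, not part of any Buildkite tool; the input is assumed to be the array returned by `buildkite:list_annotations`.

```javascript
// Lower number = read first. Mirrors the order described in this section.
const STYLE_PRIORITY = { error: 0, warning: 1, info: 2, success: 3 };

// Hypothetical helper: return a new array with the most urgent annotations first.
// Annotations with an unrecognized style are placed last.
function prioritizeAnnotations(annotations) {
  const fallback = Object.keys(STYLE_PRIORITY).length;
  const rank = (annotation) => STYLE_PRIORITY[annotation.style] ?? fallback;
  return [...annotations].sort((a, b) => rank(a) - rank(b));
}
```

The sort copies the input, so the original response object is left untouched.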
## Common Annotation Patterns
### Test Failure Annotations
Typically include:
```
❌ 15 tests failed
spec/models/user_spec.rb
- validates email format
- validates password strength
spec/controllers/api_controller_spec.rb
- returns 401 when unauthorized
```
**Action**: Read the listed test failures, then examine the job logs for full details.
### Build Resource Annotations
```
Having problems with your build?
- Check build documentation: [link]
- Ask in #build-stability Slack channel
```
**Action**: These are informational - reference them if you're stuck debugging.
### Coverage Annotations
```
⚠️ Code coverage decreased by 2.5%
Current: 85.3% | Previous: 87.8%
```
**Action**: May or may not be actionable depending on project policy.
## When Annotations Are Missing
If a build has no annotations:
1. **Don't assume success**: Check the overall build state
2. **Look at job logs**: All failure information will be in individual jobs
3. **Check job states**: Failed jobs will have `state: "failed"`
4. **Read failed job logs**: Use MCP tools or bktide to get logs
## Inconsistencies Across Projects
Be aware that annotation usage varies wildly:
- **Some projects**: Every failure is annotated
- **Some projects**: Only critical failures annotated
- **Some projects**: No annotations at all
- **Some projects**: Annotations are informational only, not diagnostic
**Never rely solely on annotations.** Always check:
1. Overall build state
2. Job states
3. Annotations (if present)
4. Job logs for failed jobs
## Example Workflows
### Checking a Failed Build With Annotations
1. Get build status → state is `failed`
2. List annotations → find error-styled annotation with test failures
3. Note which tests failed from annotation
4. Get detailed logs for failed job
5. Read stack traces and error messages
### Checking a Failed Build Without Annotations
1. Get build status → state is `failed`
2. Check job summary → identify which jobs failed
3. Get detailed information for each failed job
4. Read logs for each failed job
5. Identify root cause from logs
### Checking a Passing Build
1. Get build status → state is `passed`
2. Optionally check annotations for warnings or info
3. Note any "broken" jobs (may be expected)
4. No need to read logs unless investigating performance

@@ -0,0 +1,107 @@
# Buildkite Build and Job States
Understanding Buildkite states is critical for correctly interpreting build status. Some states are misleading or require additional context.
## Build States
### Terminal States
- **`passed`** - All jobs completed successfully
- **`failed`** - One or more jobs failed
- **`canceled`** - Build was manually canceled
- **`skipped`** - Build was skipped (e.g., due to branch filters)
- **`blocked`** - Build is paused, waiting for manual approval via a block step (it makes no progress until unblocked, but resumes afterwards)
### Active States
- **`running`** - Build is currently executing
- **`scheduled`** - Build is queued and waiting to start
- **`creating`** - Build is being created
## Job States
### Terminal States
- **`passed`** - Job completed successfully
- **`failed`** - Job failed with non-zero exit code
- **`canceled`** - Job was canceled
- **`skipped`** - Job was skipped
- **`timed_out`** - Job exceeded time limit
### Special States (Often Misleading)
- **`broken`** - This is the most misleading state. It can mean:
  - Job was skipped because an earlier job in the pipeline failed
  - Job was skipped due to dependency conditions not being met
  - Job was skipped due to conditional logic in the pipeline config
  - **NOT necessarily a failure of this specific job**

  Example: In the zen-payroll pipeline, many jobs show as "broken" but are actually skipped because their dependencies indicated they weren't needed (e.g., no relevant file changes).
- **`soft_failed`** - Job failed but was marked as "soft fail" (doesn't block pipeline)
  - Shows as failed but doesn't cause overall build failure
  - Often used for optional checks or flaky tests
### Active States
- **`waiting`** - Job is waiting for dependencies
- **`waiting_failed`** - Job was waiting but its dependency failed
- **`assigned`** - Job has been assigned to an agent
- **`accepted`** - Agent has accepted the job
- **`running`** - Job is currently executing
- **`blocked`** - Job is a block step waiting for manual unblock
## Interpreting Build Status
### Progressive Disclosure Pattern
When checking build status, follow this pattern:
1. **Start with overall state**: `passed`, `failed`, `canceled`, `blocked`
2. **If failed, check job summary**: How many jobs failed vs broken vs passed?
3. **Examine failed jobs specifically**: Don't assume "broken" means the job itself failed
4. **Check annotations**: Some projects surface important failures in annotations
5. **Inspect logs**: For actual failures, read the job logs
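Step 2 of the pattern above (failed vs broken vs passed) can be sketched as a small tally over the `jobs` array of a `buildkite:get_build` response. `summarizeJobStates` is a hypothetical helper; it assumes each job has a `state` field and, optionally, a boolean `soft_failed` field, as used elsewhere in this skill.

```javascript
// Hypothetical helper: count jobs per state and pull out the real failures.
// "broken" jobs are counted but deliberately NOT treated as failures.
function summarizeJobStates(jobs) {
  const counts = {};
  for (const job of jobs) {
    counts[job.state] = (counts[job.state] || 0) + 1;
  }
  // A hard failure is a failed job that is not marked as a soft fail.
  const hardFailures = jobs.filter(
    (job) => job.state === 'failed' && job.soft_failed !== true
  );
  return { counts, hardFailures };
}
```

Only `hardFailures` needs log inspection by default; the `counts` object gives the at-a-glance summary.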
### Common Pitfalls
1. **Treating "broken" as "failed"**: A "broken" job is often just skipped due to pipeline logic, not an actual failure.
2. **Ignoring soft fails**: Jobs marked as `soft_failed` may contain important information even though they don't block the build.
3. **Missing blocked builds**: A `blocked` build is waiting for approval and won't progress without manual intervention.
4. **Overlooking job dependencies**: Jobs may be skipped (`broken`) because their dependencies weren't met, which is expected behavior.
## Project-Specific Patterns
### zen-payroll Pipeline
- **Heavy use of conditional execution**: Many jobs are conditionally skipped based on file changes
- **"broken" is normal**: A build with many "broken" jobs may still be perfectly healthy
- **Check annotations**: Important test failures are often surfaced in build annotations
- **Multiple test suites**: Different test types (unit, integration, system) have different failure patterns
### Smaller Pipelines (e.g., gusto-karafka)
- **Fewer conditional jobs**: Most jobs are expected to run
- **"broken" usually indicates a problem**: Less conditional logic means broken jobs are more likely to be actual issues
- **Simpler job graphs**: Easier to trace why a job didn't run
- **May not use annotations**: Failures are usually just in job logs
## When to Investigate
Investigate a build when:
1. Overall build state is `failed`
2. Jobs show `failed` state (not just `broken`)
3. Build is `blocked` and you need to unblock it
4. Annotations contain error messages
5. Job logs show actual errors (red output, stack traces, test failures)
Don't automatically investigate when:
1. Build is `passed` (even if some jobs are `broken`)
2. Jobs are `soft_failed` unless specifically requested
3. Jobs are `broken` due to conditional execution (check pipeline config)
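The two lists above can be approximated as a single check. This is a sketch under assumptions: `shouldInvestigate` is a hypothetical helper, `build` is assumed to be a `buildkite:get_build` response with `state` and `jobs`, and `annotations` is assumed to be a `buildkite:list_annotations` response. It cannot inspect logs or pipeline config, so those criteria are left out.

```javascript
// Hypothetical helper: collect the reasons a build deserves a closer look.
// "broken" and soft-failed jobs are intentionally ignored here.
function shouldInvestigate(build, annotations = []) {
  const reasons = [];
  if (build.state === 'failed') {
    reasons.push('build state is failed');
  }
  if (build.state === 'blocked') {
    reasons.push('build is blocked and may need to be unblocked');
  }
  const failedJobs = (build.jobs || []).filter(
    (job) => job.state === 'failed' && job.soft_failed !== true
  );
  if (failedJobs.length > 0) {
    reasons.push(`${failedJobs.length} job(s) in failed state`);
  }
  if (annotations.some((annotation) => annotation.style === 'error')) {
    reasons.push('error-styled annotation present');
  }
  return { investigate: reasons.length > 0, reasons };
}
```

Returning the reasons alongside the boolean makes it easier to report why a build was flagged.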

@@ -0,0 +1,291 @@
# Tool Capabilities Reference
This document provides complete capability information for all Buildkite status checking tools.
## Overview
Three tool categories exist with different strengths and limitations:
1. **MCP Tools** - Direct Buildkite API access via Model Context Protocol
2. **bktide CLI** - Human-readable command-line tool (npm package)
3. **Bundled Scripts** - Helper wrappers in this skill's `scripts/` directory
## Capability Matrix
| Capability | MCP Tools | bktide | Scripts | Notes |
| --------------------- | ------------------------------- | ----------------------- | -------------------------- | -------------------------- |
| List organizations | ✅ `buildkite:list_orgs` | ❌ | ❌ | |
| List pipelines | ✅ `buildkite:list_pipelines` | ✅ `bktide pipelines` | ❌ | |
| List builds | ✅ `buildkite:list_builds` | ✅ `bktide builds` | ✅ `find-commit-builds.js` | Scripts are specialized |
| Get build details | ✅ `buildkite:get_build` | ✅ `bktide build` | ❌ | |
| Get annotations | ✅ `buildkite:list_annotations` | ✅ `bktide annotations` | ❌ | |
| **Retrieve job logs** | **✅ `buildkite:get_logs`** | **❌ NO** | **`get-build-logs.js`** (planned) | **bktide cannot get logs** |
| Get log metadata | ✅ `buildkite:get_logs_info` | ❌ | ❌ | |
| List artifacts | ✅ `buildkite:list_artifacts` | ❌ | ❌ | |
| Wait for build | ✅ `buildkite:wait_for_build` | ❌ | ✅ `wait-for-build.js` | MCP preferred |
| Unblock jobs | ✅ `buildkite:unblock_job` | ❌ | ❌ | |
| Real-time updates | ✅ | ❌ | ✅ | Via polling |
| Human-readable output | ❌ (JSON) | ✅ | Varies | |
| Works offline | ❌ | ❌ | ❌ | All need network |
| Requires auth | ✅ (MCP config) | ✅ (BK_TOKEN) | ✅ (uses bktide) | |
## Detailed Tool Information
### MCP Tools (Primary)
**Access Method:** `mcp__MCPProxy__call_tool("buildkite:<tool>", {...})`
**Authentication:** Configured in MCP server settings (typically uses `BUILDKITE_API_TOKEN`)
**Pros:**
- Complete API coverage
- Always available (no external dependencies)
- Real-time data
- Structured JSON responses
**Cons:**
- Verbose JSON output
- Requires parsing for human reading
**Key Tools:**
#### `buildkite:get_build`
Get detailed build information including job states, timing, and metadata.
Parameters:
- `org_slug` (required): Organization slug
- `pipeline_slug` (required): Pipeline slug
- `build_number` (required): Build number
- `detail_level` (optional): "summary" | "detailed" | "complete"
- `job_state` (optional): Filter jobs by state ("failed", "passed", etc.)
Returns: Build object with jobs array, state, timing, author, etc.
#### `buildkite:get_logs`
**THE CRITICAL TOOL** - Retrieve actual log output from a job.
Parameters:
- `org_slug` (required): Organization slug
- `pipeline_slug` (required): Pipeline slug
- `build_number` (required): Build number
- `job_id` (required): Job UUID (NOT step ID from URL)
Returns: Log text content
**Common Issues:**
- "job not found" → Using step ID instead of job UUID
- Empty response → Job hasn't started yet, or hasn't finished yet
#### `buildkite:wait_for_build`
Poll build until completion.
Parameters:
- `org_slug` (required): Organization slug
- `pipeline_slug` (required): Pipeline slug
- `build_number` (required): Build number
- `timeout` (optional): Seconds until timeout (default: 1800)
- `poll_interval` (optional): Seconds between checks (default: 30)
Returns: Final build state when complete or timeout
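To illustrate what the `timeout` and `poll_interval` parameters control, here is a minimal polling sketch. It is not the MCP tool's implementation: `waitForBuild`, `fetchState`, and the injectable `sleep` are hypothetical names, and `fetchState` stands in for whatever wraps `buildkite:get_build` and returns the build's `state`. The set of stopping states follows the "Terminal States" list in this skill's state reference.

```javascript
// States at which polling stops (per this skill's build state reference).
const TERMINAL_BUILD_STATES = new Set(['passed', 'failed', 'canceled', 'skipped', 'blocked']);

// Default pause between polls, in seconds.
const defaultSleep = (seconds) =>
  new Promise((resolve) => setTimeout(resolve, seconds * 1000));

// Hypothetical sketch: poll fetchState() until a terminal state or the timeout.
async function waitForBuild(fetchState, options = {}) {
  const { timeout = 1800, pollInterval = 30, sleep = defaultSleep } = options;
  const deadline = Date.now() + timeout * 1000;
  for (;;) {
    const state = await fetchState();
    if (TERMINAL_BUILD_STATES.has(state)) {
      return { state, timedOut: false };
    }
    if (Date.now() >= deadline) {
      return { state, timedOut: true };
    }
    await sleep(pollInterval);
  }
}
```

Injecting `sleep` keeps the loop testable without real delays.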
### bktide CLI (Secondary)
**Access Method:** `npx bktide <command>`
**Authentication:** `BK_TOKEN` environment variable or `~/.bktide/config`
**Pros:**
- Human-readable colored output
- Intuitive command structure
- Good for interactive terminal work
**Cons:**
- External npm dependency
- **CANNOT retrieve job logs** (most critical limitation)
- Limited compared to full API
- Requires npx/node installed
**Key Commands:**
```bash
npx bktide pipelines <org> # List pipelines
npx bktide builds <org>/<pipeline> # List recent builds
npx bktide build <org>/<pipeline>/<number> # Build details
npx bktide build <org>/<pipeline>/<number> --jobs # Show job summary
npx bktide build <org>/<pipeline>/<number> --failed # Show failed jobs only
npx bktide annotations <org>/<pipeline>/<number> # Show annotations
```
**Critical**: bktide has NO command for retrieving logs. The `build` command shows job states and names, but NOT log content.
### Bundled Scripts (Tertiary)
**Access Method:** `~/.claude/skills/buildkite-status/scripts/<script>.js`
**Authentication:** Uses bktide internally (requires `BK_TOKEN`)
**Pros:**
- Purpose-built for specific workflows
- Handle common use cases automatically
- Provide structured output
**Cons:**
- Depend on bktide (external dependency)
- Limited to specific use cases
- May have version compatibility issues
**Available Scripts:**
#### `find-commit-builds.js`
Find builds matching a specific commit SHA.
Usage:
```bash
~/.claude/skills/buildkite-status/scripts/find-commit-builds.js <org> <commit-sha>
```
Returns: JSON array of matching builds
#### `wait-for-build.js`
Monitor build until completion (background-friendly).
Usage:
```bash
~/.claude/skills/buildkite-status/scripts/wait-for-build.js <org> <pipeline> <build> [options]
```
Options:
- `--timeout <seconds>`: Max wait time (default: 1800)
- `--interval <seconds>`: Poll interval (default: 30)
Exit codes:
- 0: Build passed
- 1: Build failed
- 2: Build canceled
- 3: Timeout
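A caller that runs the script can translate these exit codes back into an outcome. The mapping below only restates the table above; `describeWaitExitCode` is a hypothetical helper name.

```javascript
// Exit codes documented for wait-for-build.js.
const WAIT_EXIT_CODES = { 0: 'passed', 1: 'failed', 2: 'canceled', 3: 'timeout' };

// Hypothetical helper: map a process exit code to a readable outcome.
function describeWaitExitCode(code) {
  return WAIT_EXIT_CODES[code] ?? 'unknown';
}
```

Any code outside the documented set is reported as `unknown` rather than guessed at.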
#### `get-build-logs.js` (NEW - to be implemented)
Retrieve logs for a failed job with automatic UUID resolution.
Usage:
```bash
~/.claude/skills/buildkite-status/scripts/get-build-logs.js <org> <pipeline> <build> <job-label-or-uuid>
```
Features:
- Accepts job label or UUID
- Automatically resolves label → UUID
- Handles step ID confusion
- Formats output for readability
## Decision Matrix: Which Tool to Use
### Use MCP Tools When:
- Getting build details
- **Retrieving job logs** (the ONLY option; bktide cannot do this)
- Waiting for builds (preferred over script)
- Unblocking jobs
- Automating workflows
- Need structured data
### Use bktide When:
- Interactive terminal work
- Want human-readable summary
- Listing pipelines/builds
- Getting quick status overview
- **NOT when you need logs** (it can't do this)
### Use Scripts When:
- Need specialized workflow (find commits)
- Want background monitoring
- MCP tools fail (fallback)
- Automating repetitive tasks
## Common Mistakes
### ❌ Trying to get logs with bktide
**Don't**: `npx bktide build <org>/<pipeline>/<number> --logs`
**Why**: This flag doesn't exist. bktide cannot retrieve logs.
**Do**: Use `buildkite:get_logs` MCP tool
### ❌ Using step ID for log retrieval
**Don't**: Extract `sid=019a5f...` from URL and use directly
**Why**: Step IDs ≠ Job UUIDs. MCP tools need job UUIDs.
**Do**: Call `buildkite:get_build` to get job details, extract `uuid` field
### ❌ Abandoning MCP tools when script fails
**Don't**: "Script failed, I'll use GitHub instead"
**Why**: Scripts depend on bktide. MCP tools are independent.
**Do**: Use MCP tools directly when scripts fail
## Troubleshooting
### Issue: "job not found" when calling get_logs
**Diagnosis**: Using step ID instead of job UUID
**Solution**:
1. Call `buildkite:get_build` with `detail_level: "detailed"`
2. Find job by `label` field
3. Extract `uuid` field
4. Use that UUID in `get_logs` call
### Issue: bktide command not found
**Diagnosis**: npm/npx not installed or not in PATH
**Solution**:
1. Use MCP tools instead (preferred)
2. Or install: `npm install -g bktide`
### Issue: Empty logs returned
**Diagnosis**: Job hasn't completed or logs not available yet
**Solution**:
1. Check job `state` - should be terminal (passed/failed/canceled)
2. Wait for job to finish
3. Check job `started_at` and `finished_at` timestamps
## See Also
- [SKILL.md](../SKILL.md) - Main skill documentation
- [troubleshooting.md](troubleshooting.md) - Common errors and solutions
- [url-parsing.md](url-parsing.md) - Understanding Buildkite URLs

@@ -0,0 +1,365 @@
# Buildkite Status Troubleshooting
Common errors when working with Buildkite and how to resolve them.
## MCP Tool Errors
### Error: "job not found"
**When**: Calling `buildkite:get_logs`
**Cause**: Using step ID from URL instead of job UUID from API
**Solution**:
1. Call `buildkite:get_build` with `detail_level: "detailed"`
2. Find job by `label` field
3. Extract `uuid` field (NOT the `id` field)
4. Use that UUID in `get_logs`
**Example**:
```javascript
// ❌ Wrong - using step ID from URL
mcp__MCPProxy__call_tool('buildkite:get_logs', {
  job_id: '019a5f23-8109-4656-a033-bd62a82ca239', // This is a step ID
});

// ✅ Correct - get job UUID from API first
const build = await mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '29627',
  detail_level: 'detailed',
});
const job = build.jobs.find(
  (j) => j.label === 'ste rspec' && j.state === 'failed'
);
await mcp__MCPProxy__call_tool('buildkite:get_logs', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '29627',
  job_id: job.uuid, // This is the correct job UUID
});
```
**See Also**: [url-parsing.md](url-parsing.md) for step ID vs job UUID explanation
---
### Error: "build not found" or "pipeline not found"
**When**: Calling any MCP tool
**Cause**: Incorrect org slug or pipeline slug format
**Common Mistakes**:
- Using repository name instead of pipeline slug
- Including org name in pipeline slug
- Using display name instead of URL slug
**Solution**:
Extract slugs from URL correctly:
```
https://buildkite.com/gusto/payroll-building-blocks/builds/123
                      ^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^
                      org   pipeline slug
```
**Slug Format Rules**:
- All lowercase
- Hyphens instead of underscores
- No spaces
- No special characters
**Example**:
```javascript
// ❌ Wrong
{ org_slug: "Gusto", pipeline_slug: "Payroll Building Blocks" }
// ✅ Correct
{ org_slug: "gusto", pipeline_slug: "payroll-building-blocks" }
```
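The slug format rules above can be checked before making a call. This is a sketch of the rules as stated in this document, not Buildkite's actual slug algorithm; `isValidSlug` and `toSlugGuess` are hypothetical helpers, and the output of `toSlugGuess` is only a guess that should be confirmed against the URL or `buildkite:list_pipelines`.

```javascript
// Hypothetical check: lowercase letters/digits separated by single hyphens.
function isValidSlug(slug) {
  return /^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(slug);
}

// Hypothetical guess at a slug from a display name. Verify before relying on it.
function toSlugGuess(displayName) {
  return displayName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // spaces, underscores, punctuation become hyphens
    .replace(/^-+|-+$/g, ''); // no leading or trailing hyphens
}
```

Validating first gives a clearer error than a "pipeline not found" response from the API.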
---
### Error: Empty logs returned
**When**: Calling `buildkite:get_logs`
**Causes**:
1. Job hasn't started yet
2. Job is still running
3. Job failed before producing output
4. Logs not available yet (eventual consistency)
**Diagnosis**:
Check job state first:
```javascript
const build = await mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '29627',
  detail_level: 'detailed',
});
// jobUuid is the job UUID that was passed to get_logs
const job = build.jobs.find((j) => j.uuid === jobUuid);
console.log(job.state); // Should be terminal: passed/failed/canceled
console.log(job.started_at); // Should not be null
console.log(job.finished_at); // Should not be null for terminal state
```
**Solution**:
- If state is `waiting` or `running`: Wait for job to complete
- If state is terminal but logs empty: Wait a few seconds for eventual consistency
- If still empty: Job may have failed immediately (check exit_status)
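The diagnosis and solution steps above can be sketched as one function over a job object. `diagnoseEmptyLogs` and its return values are hypothetical; the job is assumed to carry the `state` and `started_at` fields used in the snippet above, and the terminal states follow this skill's state reference.

```javascript
// Job states treated as terminal (per this skill's job state reference).
const TERMINAL_JOB_STATES = new Set(['passed', 'failed', 'canceled', 'skipped', 'timed_out']);

// Hypothetical helper: suggest the next step when get_logs returns nothing.
function diagnoseEmptyLogs(job) {
  if (!TERMINAL_JOB_STATES.has(job.state)) {
    // Not finished: either still queued or still producing output.
    return job.started_at ? 'in_progress' : 'not_started';
  }
  if (!job.started_at) {
    // Terminal without ever starting (e.g. skipped or canceled early): no logs expected.
    return 'never_ran';
  }
  // Finished and started: retry shortly, then inspect exit_status.
  return 'retry_then_check_exit_status';
}
```

The first two results mean "wait"; only the last one calls for retrying and then checking `exit_status`.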
---
### Error: "Unauthorized" or "Forbidden"
**When**: Any MCP tool call
**Cause**: Authentication or permission issue
**Diagnosis Steps**:
1. Check MCP server configuration:
```bash
# MCP server should have BUILDKITE_API_TOKEN configured
```
2. Verify token has correct scope:
   - `read_builds` - Required for reading build info
   - `read_build_logs` - Required for log retrieval
   - `read_pipelines` - Required for pipeline listing
3. Check organization access:
   - Token must have access to the specific organization
   - Some orgs require SSO
**Solution**:
- Verify BUILDKITE_API_TOKEN in MCP config
- Generate new token at https://buildkite.com/user/api-access-tokens
- Ensure token has required scopes
- Report to human partner if still failing (may need org admin help)
---
## bktide CLI Errors
### Error: "bktide: command not found"
**Cause**: bktide not installed or not in PATH
**Solution**:
Use MCP tools instead (preferred):
```javascript
// Instead of: npx bktide build gusto/payroll-building-blocks/123
mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '123',
});
```
Or install bktide:
```bash
npm install -g bktide
```
---
### Error: "Cannot read logs with bktide"
**Cause**: bktide does not have log retrieval capability
**Solution**:
Use MCP tools for logs:
```javascript
mcp__MCPProxy__call_tool('buildkite:get_logs', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '123',
  job_id: '<job-uuid>',
});
```
**See Also**: [tool-capabilities.md](tool-capabilities.md) for complete capability matrix
---
## Script Errors
### Error: Script fails with "bktide error"
**Cause**: Scripts depend on bktide internally
**Solution**:
1. Use equivalent MCP tool instead (preferred)
2. Or ensure bktide is installed and configured
3. Or check `BK_TOKEN` environment variable is set
**Example**:
```bash
# Script failing
~/.claude/skills/buildkite-status/scripts/wait-for-build.js gusto payroll-building-blocks 123
```
Use the MCP tool instead:
```javascript
mcp__MCPProxy__call_tool('buildkite:wait_for_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '123',
  timeout: 1800,
  poll_interval: 30,
});
```
---
## Build State Confusion
### Issue: Many jobs show "broken" but build looks healthy
**Cause**: "broken" doesn't mean failed - it usually means skipped
**Explanation**:
Buildkite uses "broken" state for:
- Jobs skipped because dependency failed
- Jobs skipped due to conditional logic
- Jobs skipped because file changes didn't affect them
**Solution**:
Filter for actual failures:
```javascript
mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '29627',
  detail_level: 'detailed',
  job_state: 'failed', // Only show actually failed jobs
});
```
**See Also**: [buildkite-states.md](buildkite-states.md) for complete state explanations
---
### Issue: Build shows "failed" but all jobs passed
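**Cause**: A failure that is easy to miss in the job list. A job with `soft_failed: true` did fail but is marked non-blocking, so it may not stand out. Soft failures normally do not fail the build on their own (see [buildkite-states.md](buildkite-states.md)), so if the build is `failed`, also look for `timed_out` or `canceled` jobs.
**Solution**:
Check for soft failures:
```javascript
const build = await mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  build_number: '29627',
  detail_level: 'detailed',
});
const softFails = build.jobs.filter((j) => j.soft_failed === true);
console.log(softFails); // Jobs that failed but are marked non-blocking
```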
---
## Common Workflow Issues
### Issue: Cannot find recent build for branch
**Cause**: Build may be filtered or pipeline has many builds
**Solution**:
Use branch filter and increase limit:
```javascript
mcp__MCPProxy__call_tool('buildkite:list_builds', {
  org_slug: 'gusto',
  pipeline_slug: 'payroll-building-blocks',
  branch: 'my-feature-branch',
  per_page: 20, // Default may be smaller
});
```
Or find by commit:
```bash
~/.claude/skills/buildkite-status/scripts/find-commit-builds.js gusto <commit-sha>
```
---
### Issue: Multiple jobs have same label, can't tell which failed
**Cause**: Parallelized jobs have same base label
**Solution**:
Jobs with same label are numbered:
- "rspec (1/10)"
- "rspec (2/10)"
Match on full label including partition:
```javascript
const failedJob = build.jobs.find(
  (j) => j.label === 'rspec (2/10)' && j.state === 'failed'
);
```
Or find all failed jobs with that label:
```javascript
const failedRspecJobs = build.jobs.filter(
  (j) => j.label.startsWith('rspec (') && j.state === 'failed'
);
```
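When many partitions fail, it can help to split the label into its base name and partition number instead of matching strings by hand. This sketch assumes labels follow the `name (i/n)` shape shown above; `parseParallelLabel` and `failedPartitions` are hypothetical helpers.

```javascript
// Hypothetical helper: split "rspec (2/10)" into base label and partition info.
// Labels without a "(i/n)" suffix are returned unchanged with null partition fields.
function parseParallelLabel(label) {
  const match = label.match(/^(.*\S)\s*\((\d+)\/(\d+)\)$/);
  if (!match) {
    return { base: label, index: null, total: null };
  }
  return { base: match[1], index: Number(match[2]), total: Number(match[3]) };
}

// Hypothetical helper: partition numbers of failed jobs sharing a base label.
function failedPartitions(jobs, baseLabel) {
  return jobs
    .filter((job) => job.state === 'failed')
    .map((job) => parseParallelLabel(job.label))
    .filter((parsed) => parsed.base === baseLabel && parsed.index !== null)
    .map((parsed) => parsed.index);
}
```

The resulting partition numbers can then be used to fetch logs only for the partitions that actually failed.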
---
## Decision Tree: What to Do When Stuck
```
Unable to investigate build failure?
├─ Can't get build details
│ ├─ Check URL format → [url-parsing.md]
│ ├─ Check org/pipeline slugs → lowercase, hyphenated
│ └─ Check auth → BUILDKITE_API_TOKEN configured
├─ Can't get job logs
│ ├─ Using bktide? → Use MCP tools instead [tool-capabilities.md]
│ ├─ Getting "job not found"? → Using step ID instead of job UUID [url-parsing.md]
│ ├─ Empty logs? → Check job state (started_at, finished_at)
│ └─ Still failing? → Report to human partner (may be auth/permission)
├─ Confused about job states
│ ├─ Many "broken" jobs? → Normal, means skipped [buildkite-states.md]
│ ├─ "soft_failed"? → Failed but non-blocking
│ └─ Can't find failed job? → Filter with job_state: "failed"
└─ Tool not working
   ├─ MCP tool error? → Check auth, verify slugs
   ├─ bktide error? → Use MCP tools instead
   └─ Script error? → Use MCP tools directly
```
## See Also
- [SKILL.md](../SKILL.md) - Main skill documentation
- [tool-capabilities.md](tool-capabilities.md) - What each tool can do
- [url-parsing.md](url-parsing.md) - Understanding URLs and IDs
- [buildkite-states.md](buildkite-states.md) - Build and job states

@@ -0,0 +1,272 @@
# Buildkite URL Parsing Reference
This document explains Buildkite URL formats and how to extract information from them.
## URL Formats
Buildkite uses several URL patterns for builds and jobs:
### Build URL (Most Common)
```
https://buildkite.com/{org}/{pipeline}/builds/{number}
```
Example:
```
https://buildkite.com/gusto/payroll-building-blocks/builds/29627
```
Extracting:
- `org`: "gusto"
- `pipeline`: "payroll-building-blocks"
- `number`: "29627"
### Step/Job URL (From Build Page)
```
https://buildkite.com/{org}/{pipeline}/builds/{number}/steps/{view}?sid={step-id}
```
Example:
```
https://buildkite.com/gusto/payroll-building-blocks/builds/29627/steps/canvas?sid=019a5f23-8109-4656-a033-bd62a82ca239
```
Extracting:
- `org`: "gusto"
- `pipeline`: "payroll-building-blocks"
- `number`: "29627"
- `view`: "canvas" (UI view type)
- `sid`: "019a5f23-8109-4656-a033-bd62a82ca239" (step ID)
**IMPORTANT**: The `sid` (step ID) is NOT the same as job UUID. See "Step IDs vs Job UUIDs" below.
### Job Detail URL
```
https://buildkite.com/{org}/{pipeline}/builds/{number}/jobs/{job-uuid}
```
Example:
```
https://buildkite.com/gusto/payroll-building-blocks/builds/29627/jobs/019a5f20-2d30-4c67-9edd-87fb92e1f487
```
Extracting:
- `org`: "gusto"
- `pipeline`: "payroll-building-blocks"
- `number`: "29627"
- `job-uuid`: "019a5f20-2d30-4c67-9edd-87fb92e1f487"
**NOTE**: This format contains the actual job UUID needed for log retrieval.
## Step IDs vs Job UUIDs
**Critical distinction**: Buildkite has two types of identifiers that are easily confused.
### Step IDs
- **Source**: Query parameter `sid` in step URLs
- **Format**: UUID-style string (e.g., `019a5f23-8109-4656-a033-bd62a82ca239`)
- **Purpose**: Frontend UI routing
- **Use**: Navigating to specific steps in web UI
- **API Usage**: ❌ NOT accepted by MCP tools
### Job UUIDs
- **Source**: `uuid` field in API responses
- **Format**: UUID-style string (e.g., `019a5f20-2d30-4c67-9edd-87fb92e1f487`)
- **Purpose**: Backend job identification
- **Use**: API calls to get logs, job details, etc.
- **API Usage**: ✅ Required by MCP `get_logs` tool
### Why the Confusion?
Both are UUID-style strings and can share a prefix (here `019a5f...`), so they look interchangeable, but:
- Step IDs come from URLs → Web UI routing
- Job UUIDs come from API responses → Backend identification
**You cannot use a step ID for log retrieval.** Always get job UUID from `buildkite:get_build` API.
## Resolving Step ID to Job UUID
When given a step URL with `sid` parameter:
**Step 1: Extract build identifiers**
```javascript
// From: https://buildkite.com/gusto/payroll-building-blocks/builds/29627/steps/canvas?sid=019a5f23...
const org = 'gusto';
const pipeline = 'payroll-building-blocks';
const build = '29627';
// Ignore the sid parameter
```
**Step 2: Get job details from API**
```javascript
mcp__MCPProxy__call_tool('buildkite:get_build', {
  org_slug: org,
  pipeline_slug: pipeline,
  build_number: build,
  detail_level: 'detailed',
  job_state: 'failed', // If investigating failures
});
```
**Step 3: Match job by properties**
The API response includes all jobs. Match by:
- `label` field (e.g., "ste rspec", "Rubocop")
- `state` field (e.g., "failed")
- `type` field (e.g., "script")
- `step_key` field if available
**Step 4: Extract job UUID**
```javascript
// From API response
const job = response.jobs.find(
  (j) => j.label === 'ste rspec' && j.state === 'failed'
);
const jobUuid = job.uuid; // e.g., "019a5f20-2d30-4c67-9edd-87fb92e1f487"
```
**Step 5: Use job UUID for logs**
```javascript
mcp__MCPProxy__call_tool('buildkite:get_logs', {
  org_slug: org,
  pipeline_slug: pipeline,
  build_number: build,
  job_id: jobUuid, // NOT the step ID from URL
});
```
## Parsing Logic
### Simple Regex Approach
```javascript
function parseBuildkiteUrl(url) {
  // Match build URL pattern
  const buildMatch = url.match(
    /buildkite\.com\/([^/]+)\/([^/]+)\/builds\/(\d+)/
  );
  if (!buildMatch) {
    throw new Error('Invalid Buildkite URL');
  }
  return {
    org: buildMatch[1],
    pipeline: buildMatch[2],
    buildNumber: buildMatch[3],
  };
}

// Usage
const info = parseBuildkiteUrl(
  'https://buildkite.com/gusto/payroll-building-blocks/builds/29627'
);
// => { org: "gusto", pipeline: "payroll-building-blocks", buildNumber: "29627" }
```
### Extracting Step ID (If Needed)
```javascript
function parseStepUrl(url) {
  const base = parseBuildkiteUrl(url);
  // Extract step ID from query parameter
  const sidMatch = url.match(/[?&]sid=([^&]+)/);
  return {
    ...base,
    stepId: sidMatch ? sidMatch[1] : null,
  };
}

// Usage
const info = parseStepUrl(
  'https://buildkite.com/gusto/payroll-building-blocks/builds/29627/steps/canvas?sid=019a5f23...'
);
// => { org: "gusto", pipeline: "payroll-building-blocks", buildNumber: "29627", stepId: "019a5f23..." }
```
**Remember**: The `stepId` is useful for debugging but cannot be used for API calls. Always fetch job UUID from the API.
## Common URL Patterns in Practice
### Pattern 1: User Shares Failing Build URL
**URL**: `https://buildkite.com/org/pipeline/builds/123`
**Workflow**:
1. Extract org/pipeline/build
2. Call `get_build` with `detail_level: "summary"`
3. Check build state
4. If failed, call `get_build` with `detail_level: "detailed"` and `job_state: "failed"`
5. Get logs for each failed job
### Pattern 2: User Shares Step URL (Clicked on Specific Job)
**URL**: `https://buildkite.com/org/pipeline/builds/123/steps/canvas?sid=019a5f23...`
**Workflow**:
1. Extract org/pipeline/build (ignore `sid`)
2. Call `get_build` with `detail_level: "detailed"`
3. Find job matching user's intent (often the failed one)
4. Extract job UUID
5. Get logs for that job
The `sid` hints at which job the user was looking at, but you must resolve it via the API.
### Pattern 3: User Provides Job UUID Directly
**URL**: `https://buildkite.com/org/pipeline/builds/123/jobs/019a5f20-...`
**Workflow**:
1. Extract org/pipeline/build/job-uuid
2. Call `get_logs` directly with the job UUID
3. No resolution needed - this is the actual job UUID
This is the ideal format but least common in practice.
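The three patterns above can be told apart up front, which decides whether a job UUID is already in hand or must be resolved through the API. This is a sketch only: `classifyBuildkiteUrl` and its `kind` values are hypothetical, and it assumes the three URL shapes documented in this reference.

```javascript
// Hypothetical helper: classify a Buildkite URL as a build, step, or job URL.
function classifyBuildkiteUrl(rawUrl) {
  const url = new URL(rawUrl);
  const match = url.pathname.match(
    /^\/([^/]+)\/([^/]+)\/builds\/(\d+)(?:\/(steps|jobs)\/([^/]+))?\/?$/
  );
  if (url.hostname !== 'buildkite.com' || !match) {
    throw new Error(`Not a recognized Buildkite build URL: ${rawUrl}`);
  }
  const [, org, pipeline, buildNumber, section, tail] = match;
  const base = { org, pipeline, buildNumber };
  if (section === 'jobs') {
    // Pattern 3: the path segment is the real job UUID, usable with get_logs.
    return { ...base, kind: 'job', jobUuid: tail };
  }
  if (section === 'steps') {
    // Pattern 2: sid is a step ID, which still has to be resolved via get_build.
    return { ...base, kind: 'step', stepId: url.searchParams.get('sid') };
  }
  // Pattern 1: plain build URL.
  return { ...base, kind: 'build' };
}
```

Only the `job` kind can go straight to `get_logs`; the other two still require a `get_build` call first.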
## Edge Cases
### Multiple Jobs with Same Label
Some pipelines parallelize jobs:
- "rspec (1/10)"
- "rspec (2/10)"
- "rspec (3/10)"
When resolving, match the full label string including the partition number.
### Dynamic Pipeline Steps
Some pipelines generate steps dynamically. The step structure may not be predictable from the URL alone. Always query the API to see actual job structure.
### Retried Jobs
When jobs are retried, multiple job UUIDs exist for the same step. The API returns the most recent attempt. Check `retries_count` and `retry_source` fields if investigating retry behavior.
## See Also
- [SKILL.md](../SKILL.md) - Main skill documentation
- [tool-capabilities.md](tool-capabilities.md) - Tool limitations and capabilities
- [troubleshooting.md](troubleshooting.md) - Common errors