Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:37:11 +08:00
commit 20b36ca9b1
56 changed files with 14530 additions and 0 deletions

View File

@@ -0,0 +1,132 @@
---
name: condition-based-waiting
description: |
Flaky test fix pattern - replaces arbitrary timeouts with condition polling
that waits for actual state changes.
trigger: |
- Tests use setTimeout/sleep with arbitrary values
- Tests are flaky (pass sometimes, fail under load)
- Tests timeout when run in parallel
- Waiting for async operations in tests
skip_when: |
- Testing actual timing behavior (debounce, throttle) → timeout is correct
- Synchronous tests → no waiting needed
---
# Condition-Based Waiting
## Overview
Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.
**Core principle:** Wait for the actual condition you care about, not a guess about how long it takes.
## When to Use
```dot
digraph when_to_use {
"Test uses setTimeout/sleep?" [shape=diamond];
"Testing timing behavior?" [shape=diamond];
"Document WHY timeout needed" [shape=box];
"Use condition-based waiting" [shape=box];
"Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"];
"Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"];
"Testing timing behavior?" -> "Use condition-based waiting" [label="no"];
}
```
**Use when:**
- Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
- Tests are flaky (pass sometimes, fail under load)
- Tests timeout when run in parallel
- Waiting for async operations to complete
**Don't use when:**
- Testing actual timing behavior (debounce, throttle intervals)
- Always document WHY if using arbitrary timeout
## Core Pattern
```typescript
// ❌ BEFORE: Guessing at timing
await new Promise(r => setTimeout(r, 50));
const result = getResult();
expect(result).toBeDefined();
// ✅ AFTER: Waiting for condition
await waitFor(() => getResult() !== undefined);
const result = getResult();
expect(result).toBeDefined();
```
## Quick Patterns
| Scenario | Pattern |
|----------|---------|
| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
| Wait for state | `waitFor(() => machine.state === 'ready')` |
| Wait for count | `waitFor(() => items.length >= 5)` |
| Wait for file | `waitFor(() => fs.existsSync(path))` |
| Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |
## Implementation
Generic polling function:
```typescript
async function waitFor<T>(
condition: () => T | undefined | null | false,
description: string,
timeoutMs = 5000
): Promise<T> {
const startTime = Date.now();
while (true) {
const result = condition();
if (result) return result;
if (Date.now() - startTime > timeoutMs) {
throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
}
await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
}
}
```
See @example.ts for complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from actual debugging session.
## Common Mistakes
**❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
**✅ Fix:** Poll every 10ms
**❌ No timeout:** Loop forever if condition never met
**✅ Fix:** Always include timeout with clear error
**❌ Stale data:** Cache state before loop
**✅ Fix:** Call getter inside loop for fresh data
## When Arbitrary Timeout IS Correct
```typescript
// Tool ticks every 100ms - need 2 ticks to verify partial output
await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
await new Promise(r => setTimeout(r, 200)); // Then: wait for timed behavior
// 200ms = 2 ticks at 100ms intervals - documented and justified
```
**Requirements:**
1. First wait for triggering condition
2. Based on known timing (not guessing)
3. Comment explaining WHY
## Real-World Impact
From debugging session (2025-10-03):
- Fixed 15 flaky tests across 3 files
- Pass rate: 60% → 100%
- Execution time: 40% faster
- No more race conditions

View File

@@ -0,0 +1,158 @@
// Complete implementation of condition-based waiting utilities
// From: Lace test infrastructure improvements (2025-10-03)
// Context: Fixed 15 flaky tests by replacing arbitrary timeouts
import type { ThreadManager } from '~/threads/thread-manager';
import type { LaceEvent, LaceEventType } from '~/threads/types';
/**
* Wait for a specific event type to appear in thread
*
* @param threadManager - The thread manager to query
* @param threadId - Thread to check for events
* @param eventType - Type of event to wait for
* @param timeoutMs - Maximum time to wait (default 5000ms)
* @returns Promise resolving to the first matching event
*
* Example:
* await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
*/
export function waitForEvent(
threadManager: ThreadManager,
threadId: string,
eventType: LaceEventType,
timeoutMs = 5000
): Promise<LaceEvent> {
return new Promise((resolve, reject) => {
const startTime = Date.now();
const check = () => {
const events = threadManager.getEvents(threadId);
const event = events.find((e) => e.type === eventType);
if (event) {
resolve(event);
} else if (Date.now() - startTime > timeoutMs) {
reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
} else {
setTimeout(check, 10); // Poll every 10ms for efficiency
}
};
check();
});
}
/**
* Wait for a specific number of events of a given type
*
* @param threadManager - The thread manager to query
* @param threadId - Thread to check for events
* @param eventType - Type of event to wait for
* @param count - Number of events to wait for
* @param timeoutMs - Maximum time to wait (default 5000ms)
* @returns Promise resolving to all matching events once count is reached
*
* Example:
* // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
* await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
*/
export function waitForEventCount(
threadManager: ThreadManager,
threadId: string,
eventType: LaceEventType,
count: number,
timeoutMs = 5000
): Promise<LaceEvent[]> {
return new Promise((resolve, reject) => {
const startTime = Date.now();
const check = () => {
const events = threadManager.getEvents(threadId);
const matchingEvents = events.filter((e) => e.type === eventType);
if (matchingEvents.length >= count) {
resolve(matchingEvents);
} else if (Date.now() - startTime > timeoutMs) {
reject(
new Error(
`Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`
)
);
} else {
setTimeout(check, 10);
}
};
check();
});
}
/**
* Wait for an event matching a custom predicate
* Useful when you need to check event data, not just type
*
* @param threadManager - The thread manager to query
* @param threadId - Thread to check for events
* @param predicate - Function that returns true when event matches
* @param description - Human-readable description for error messages
* @param timeoutMs - Maximum time to wait (default 5000ms)
* @returns Promise resolving to the first matching event
*
* Example:
* // Wait for TOOL_RESULT with specific ID
* await waitForEventMatch(
* threadManager,
* agentThreadId,
* (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
* 'TOOL_RESULT with id=call_123'
* );
*/
export function waitForEventMatch(
threadManager: ThreadManager,
threadId: string,
predicate: (event: LaceEvent) => boolean,
description: string,
timeoutMs = 5000
): Promise<LaceEvent> {
return new Promise((resolve, reject) => {
const startTime = Date.now();
const check = () => {
const events = threadManager.getEvents(threadId);
const event = events.find(predicate);
if (event) {
resolve(event);
} else if (Date.now() - startTime > timeoutMs) {
reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
} else {
setTimeout(check, 10);
}
};
check();
});
}
// Usage example from actual debugging session:
//
// BEFORE (flaky):
// ---------------
// const messagePromise = agent.sendMessage('Execute tools');
// await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms
// agent.abort();
// await messagePromise;
// await new Promise(r => setTimeout(r, 50)); // Hope results arrive in 50ms
// expect(toolResults.length).toBe(2); // Fails randomly
//
// AFTER (reliable):
// ----------------
// const messagePromise = agent.sendMessage('Execute tools');
// await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start
// agent.abort();
// await messagePromise;
// await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results
// expect(toolResults.length).toBe(2); // Always succeeds
//
// Result: 60% pass rate → 100%, 40% faster execution