commit ca9abd0543eefd21563f63a5e622ef0373fdd440
Author: Zhongwei Li
Date:   Sat Nov 29 18:30:46 2025 +0800

    Initial commit

diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
new file mode 100644
index 0000000..30a21f2
--- /dev/null
+++ b/.claude-plugin/plugin.json
@@ -0,0 +1,24 @@
+{
+  "name": "growth-experiments",
+  "description": "Experiment backlog, launch, and learning governance across the funnel",
+  "version": "1.0.0",
+  "author": {
+    "name": "GTM Agents",
+    "email": "opensource@intentgpt.ai"
+  },
+  "skills": [
+    "./skills/hypothesis-library/SKILL.md",
+    "./skills/experiment-design-kit/SKILL.md",
+    "./skills/guardrail-scorecard/SKILL.md"
+  ],
+  "agents": [
+    "./agents/experimentation-strategist.md",
+    "./agents/test-engineer.md",
+    "./agents/insight-analyst.md"
+  ],
+  "commands": [
+    "./commands/prioritize-hypotheses.md",
+    "./commands/launch-experiment.md",
+    "./commands/synthesize-learnings.md"
+  ]
+}
\ No newline at end of file
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..3d02cbe
--- /dev/null
+++ b/README.md
@@ -0,0 +1,3 @@
+# growth-experiments
+
+Experiment backlog, launch, and learning governance across the funnel
diff --git a/agents/experimentation-strategist.md b/agents/experimentation-strategist.md
new file mode 100644
index 0000000..9c734a5
--- /dev/null
+++ b/agents/experimentation-strategist.md
@@ -0,0 +1,30 @@
+---
+name: experimentation-strategist
+description: Prioritizes hypotheses, capacity, and portfolio governance for growth
+  experiments.
+model: sonnet
+---
+
+
+# Experimentation Strategist Agent
+
+## Responsibilities
+- Maintain experiment pipeline with hypotheses, confidence, and projected impact.
+- Align backlog with product, marketing, and lifecycle OKRs.
+- Facilitate governance rituals (triage, launch reviews, readouts).
+- Track guardrails and ensure learnings are codified into playbooks.
+
+## Workflow
+1. **Backlog Intake** – capture experiment ideas, assumptions, and impact estimates.
+2. **Prioritization** – score using ICE/RICE + guardrail requirements, balance portfolio mix.
+3. **Planning** – assign owners, timelines, instrumentation, and dependency map.
+4. **Governance** – run weekly standups, decision logs, and escalation paths.
+5. **Learning Ops** – publish readouts, update playbooks, and trigger follow-on tests.
+
+## Outputs
+- Prioritized experiment roadmap with scoring matrix.
+- Governance calendar + decision/exception logs.
+- Learning digests and next-step recommendations.
+
+---
diff --git a/agents/insight-analyst.md b/agents/insight-analyst.md
new file mode 100644
index 0000000..d6fb363
--- /dev/null
+++ b/agents/insight-analyst.md
@@ -0,0 +1,31 @@
+---
+name: insight-analyst
+description: Synthesizes experiment results, derives insights, and recommends next
+  bets.
+model: sonnet
+---
+
+
+# Insight Analyst Agent
+
+## Responsibilities
+- Define success metrics, guardrails, and statistical power requirements.
+- Analyze interim and final experiment results, including segmentation and interaction effects.
+- Translate findings into action plans and prioritized follow-ups.
+- Maintain experiment knowledge base for reuse across teams.
+
+## Workflow
+1. **Experiment Design Support** – align measurement plans, uplift expectations, and data capture.
+2. **Monitoring** – check guardrails, traffic balance, and anomaly alerts during the run.
+3. **Analysis** – run statistical tests, audience splits, and sensitivity analyses.
+4. **Storytelling** – craft exec-ready readouts with implications and decision recommendations.
+5. **Knowledge Ops** – update centralized repository with learnings, templates, and tags.
+
+## Outputs
+- Interim monitoring dashboards and guardrail alerts.
+- Final readout with insights, decision, and rollout guidance.
+- Learning cards linked to hypothesis library + future test ideas.
+
+---
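The insight-analyst agent owns "statistical power requirements", and the experiment-design-kit below asks for a minimum detectable effect and power calculation. A minimal Python sketch of that arithmetic, using the usual normal approximation for a two-proportion test; the baseline rate, lift, and alpha/power defaults are illustrative values, not settings taken from the plugin.

```python
from statistics import NormalDist

def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-arm sample size for a two-proportion z-test.

    baseline : control conversion rate (e.g. 0.12)
    mde_abs  : minimum detectable effect as an absolute lift (e.g. 0.01)
    """
    p1, p2 = baseline, baseline + mde_abs
    p_bar = (p1 + p2) / 2
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # desired statistical power
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p2 - p1) ** 2) + 1

# Example: 12% baseline activation, +1pp lift, default alpha/power
print(sample_size_per_arm(0.12, 0.01))   # roughly 17,000 users per arm
```

A calculation like this is what turns a hypothesis card into a realistic runtime estimate before it enters the capacity plan.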
diff --git a/agents/test-engineer.md b/agents/test-engineer.md
new file mode 100644
index 0000000..6e9ec7c
--- /dev/null
+++ b/agents/test-engineer.md
@@ -0,0 +1,27 @@
+---
+name: test-engineer
+description: Designs experiment architecture, instrumentation, and QA for growth initiatives.
+model: haiku
+---
+
+# Test Engineer Agent
+
+## Responsibilities
+- Translate hypotheses into technical specs, variants, and routing logic.
+- Configure experimentation platforms, feature flags, and rollout safeguards.
+- Ensure telemetry, conversion events, and attribution are reliable before launch.
+- Manage QA, holdouts, and rollback plans during live tests.
+
+## Workflow
+1. **Technical Scoping** – evaluate feasibility, dependencies, and required integrations.
+2. **Instrumentation Setup** – implement events, guardrail metrics, and data validation checks.
+3. **Variant Build & QA** – configure branches, ensure parity, and run automated/manual QA.
+4. **Launch Management** – monitor ramp rules, performance, and guardrails in real time.
+5. **Post-Test Handoff** – archive configs, document learnings, and prep next iteration.
+
+## Outputs
+- Experiment technical spec with instrumentation checklist.
+- QA + launch readiness report covering guardrails and fallback steps.
+- Post-experiment teardown + tech debt log.
+
+---
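The test-engineer agent's "variants and routing logic" responsibility usually reduces to deterministic bucketing, so a returning user always sees the same arm. A minimal sketch of hash-based assignment under that assumption; the experiment id, arm names, and traffic shares are hypothetical, and a real deployment would take assignment from the experimentation platform rather than this helper.

```python
import hashlib

def assign_variant(user_id: str, experiment_id: str,
                   arms: dict[str, float]) -> str:
    """Deterministically map a user to an arm using weighted buckets.

    arms maps arm name -> traffic share; shares should sum to 1.0.
    """
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF   # stable value in [0, 1]
    cumulative = 0.0
    for arm, share in arms.items():
        cumulative += share
        if bucket <= cumulative:
            return arm
    return list(arms)[-1]   # guard against float rounding at the top edge

# Example: a hypothetical EXP-142 with a control and two variants
arms = {"control": 0.34, "variant_a": 0.33, "variant_b": 0.33}
print(assign_variant("user-8842", "EXP-142", arms))
```

Salting the hash with the experiment id keeps assignments independent across concurrent tests, which matters for the parity checks called out in the QA step.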
diff --git a/commands/launch-experiment.md b/commands/launch-experiment.md
new file mode 100644
index 0000000..6dd4ea9
--- /dev/null
+++ b/commands/launch-experiment.md
@@ -0,0 +1,35 @@
+---
+name: launch-experiment
+description: Converts an approved hypothesis into a fully instrumented test with guardrails and rollout plan.
+usage: /growth-experiments:launch-experiment --id EXP-142 --surface onboarding --variant-count 3 --ramp 5,25,50,100
+---
+
+# Command: launch-experiment
+
+## Inputs
+- **id** – experiment or hypothesis identifier.
+- **surface** – product area/channel (onboarding, pricing page, lifecycle email, in-app).
+- **variant-count** – number of variants/arms including control.
+- **ramp** – comma-separated rollout schedule (%) or JSON file reference.
+- **holdout** – optional holdout/ghost-experiment definition for measurement.
+- **notes** – free text for special approvals or exception handling.
+
+## Workflow
+1. **Readiness Check** – confirm design sign-off, instrumentation coverage, and guardrails.
+2. **Variant Assembly** – pull specs, assets, and targeting rules for each arm.
+3. **Rollout Plan** – configure flag/experimentation platform with ramp schedule + alerts.
+4. **QA & Approvals** – run smoke tests, capture screenshots, and gather stakeholder approval.
+5. **Launch & Monitoring** – activate test, enable telemetry dashboards, and notify channels.
+
+## Outputs
+- Launch packet with specs, QA evidence, approvals, and rollout timeline.
+- Experiment platform configuration export + guardrail monitors.
+- Stakeholder announcement + escalation matrix.
+
+## Agent/Skill Invocations
+- `test-engineer` – builds variants, instrumentation, and QA evidence.
+- `experimentation-strategist` – confirms governance + approvals.
+- `guardrail-scorecard` skill – validates guardrail coverage + thresholds.
+- `experiment-design-kit` skill – ensures templates + best practices are applied.
+
+---
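The `--ramp` input above is a comma-separated rollout schedule such as `5,25,50,100`. A minimal sketch of how such a schedule could be parsed and advanced only while guardrails hold; `guardrails_ok` is a placeholder for whatever monitor the experimentation platform actually exposes, not an API from the plugin.

```python
def parse_ramp(ramp: str) -> list[int]:
    """Turn '5,25,50,100' into an increasing list of rollout percentages."""
    steps = [int(s.strip()) for s in ramp.split(",")]
    if steps != sorted(steps) or not (0 < steps[0] and steps[-1] <= 100):
        raise ValueError(f"invalid ramp schedule: {ramp!r}")
    return steps

def advance_rollout(ramp: str, guardrails_ok) -> int:
    """Step through the ramp, holding at the last safe exposure.

    guardrails_ok is a callable returning False when any guardrail trips.
    """
    exposure = 0
    for step in parse_ramp(ramp):
        if not guardrails_ok():
            break            # hold at the current exposure and escalate
        exposure = step      # e.g. update the feature-flag rollout here
    return exposure

# Example: guardrails trip after the 25% step, so the rollout holds there
checks = iter([True, True, False, True])
print(advance_rollout("5,25,50,100", lambda: next(checks)))   # -> 25
```

Holding rather than rolling back automatically mirrors the escalation-path language in the workflow: the strategist decides whether to abort or resume.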
diff --git a/commands/prioritize-hypotheses.md b/commands/prioritize-hypotheses.md
new file mode 100644
index 0000000..2ab958f
--- /dev/null
+++ b/commands/prioritize-hypotheses.md
@@ -0,0 +1,34 @@
+---
+name: prioritize-hypotheses
+description: Scores experiment backlog using impact, confidence, effort, and guardrail readiness.
+usage: /growth-experiments:prioritize-hypotheses --source backlog.csv --capacity 6 --framework rice
+---
+
+# Command: prioritize-hypotheses
+
+## Inputs
+- **source** – backlog file, experiment tracker, or Notion database ID.
+- **capacity** – number of experiments that can run in the next sprint/cycle.
+- **framework** – ice | rice | custom; determines scoring weights.
+- **guardrails** – optional JSON/CSV for mandatory guardrail requirements.
+- **filters** – tags or OKRs to focus on (acquisition, activation, retention, monetization).
+
+## Workflow
+1. **Data Ingestion** – load backlog, normalize fields, and enrich with latest metrics.
+2. **Scoring Engine** – calculate ICE/RICE/custom scores, factoring in guardrail readiness.
+3. **Portfolio Mix** – ensure balance across funnel stages and surfaces; flag conflicts.
+4. **Capacity Planning** – fit highest-value tests into available slots, accounting for owners + effort.
+5. **Decision Pack** – generate prioritized list, rationale, and trade-off notes for approval.
+
+## Outputs
+- Ranked backlog with scores, dependencies, and guardrail status.
+- Capacity plan showing selected tests plus waitlist.
+- Decision memo summarizing trade-offs and next actions.
+
+## Agent/Skill Invocations
+- `experimentation-strategist` – orchestrates prioritization + governance alignment.
+- `insight-analyst` – validates data quality and metric assumptions.
+- `hypothesis-library` skill – links past learnings to current ideas.
+- `guardrail-scorecard` skill – enforces readiness requirements.
+
+---
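The scoring engine step is conventional RICE arithmetic: reach times impact times confidence, divided by effort, with guardrail-ready items filling the available capacity first. A minimal sketch under those assumptions; the field names and example numbers are illustrative, not the plugin's actual backlog schema.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    id: str
    reach: float        # users touched per cycle
    impact: float       # expected effect, e.g. 0.25 / 0.5 / 1 / 2 / 3
    confidence: float   # 0..1
    effort: float       # person-weeks
    guardrails_ready: bool = True

def rice(h: Hypothesis) -> float:
    return h.reach * h.impact * h.confidence / h.effort

def prioritize(backlog: list[Hypothesis], capacity: int) -> list[Hypothesis]:
    """Rank guardrail-ready hypotheses by RICE and keep the top `capacity`."""
    ready = [h for h in backlog if h.guardrails_ready]
    return sorted(ready, key=rice, reverse=True)[:capacity]

backlog = [
    Hypothesis("EXP-140", reach=40_000, impact=1.0, confidence=0.7, effort=2),
    Hypothesis("EXP-141", reach=5_000, impact=3.0, confidence=0.5, effort=1),
    Hypothesis("EXP-142", reach=60_000, impact=0.5, confidence=0.8, effort=4,
               guardrails_ready=False),
]
for h in prioritize(backlog, capacity=6):
    print(h.id, round(rice(h)))
# EXP-140 scores 14000, EXP-141 scores 7500; EXP-142 waits until guardrails are ready
```

Treating guardrail readiness as a hard filter rather than a score component matches the command's framing of guardrails as mandatory requirements rather than trade-offs.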
diff --git a/commands/synthesize-learnings.md b/commands/synthesize-learnings.md
new file mode 100644
index 0000000..dd36ad4
--- /dev/null
+++ b/commands/synthesize-learnings.md
@@ -0,0 +1,34 @@
+---
+name: synthesize-learnings
+description: Creates experiment readouts, codifies learnings, and routes follow-up actions.
+usage: /growth-experiments:synthesize-learnings --source exp-log.db --scope "Q4 funnel" --audience exec
+---
+
+# Command: synthesize-learnings
+
+## Inputs
+- **source** – experiment tracker, warehouse table, or analytics workspace.
+- **scope** – filters (timeframe, product area, funnel stage, persona).
+- **audience** – exec | pod | growth-guild | async memo; controls fidelity/tone.
+- **format** – deck | memo | dashboard | loom.
+- **follow-ups** – optional CSV/JSON for linking Jira/Asana action items.
+
+## Workflow
+1. **Data Consolidation** – assemble final metrics, guardrail outcomes, and qualitative notes.
+2. **Insight Extraction** – group learnings by hypothesis theme, persona, or funnel stage.
+3. **Decision Encoding** – record win/ship, iterate, or archive outcomes plus rationale.
+4. **Action Routing** – create follow-up stories, backlog items, or automation triggers.
+5. **Knowledge Base Update** – tag learnings in centralized library with attribution + status.
+
+## Outputs
+- Executive-ready readout with insights, decisions, and KPIs.
+- Learning cards mapped to hypothesis taxonomy + next bets.
+- Action log synced to backlog/project tools.
+
+## Agent/Skill Invocations
+- `insight-analyst` – leads analysis and storytelling.
+- `experimentation-strategist` – ensures learnings feed roadmap + governance.
+- `hypothesis-library` skill – indexes learnings against taxonomy.
+- `experiment-design-kit` skill – suggests iteration ideas based on patterns.
+
+---
diff --git a/plugin.lock.json b/plugin.lock.json
new file mode 100644
index 0000000..406d426
--- /dev/null
+++ b/plugin.lock.json
@@ -0,0 +1,77 @@
+{
+  "$schema": "internal://schemas/plugin.lock.v1.json",
+  "pluginId": "gh:gtmagents/gtm-agents:plugins/growth-experiments",
+  "normalized": {
+    "repo": null,
+    "ref": "refs/tags/v20251128.0",
+    "commit": "f71552aec1f22b8c42cf2a780cc173b967f34418",
+    "treeHash": "cb9022c07228292196db7a9a857f93acd3ddbbd1518f2715ff5d98c0c23461b8",
+    "generatedAt": "2025-11-28T10:17:14.617812Z",
+    "toolVersion": "publish_plugins.py@0.2.0"
+  },
+  "origin": {
+    "remote": "git@github.com:zhongweili/42plugin-data.git",
+    "branch": "master",
+    "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
+    "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
+  },
+  "manifest": {
+    "name": "growth-experiments",
+    "description": "Experiment backlog, launch, and learning governance across the funnel",
+    "version": "1.0.0"
+  },
+  "content": {
+    "files": [
+      {
+        "path": "README.md",
+        "sha256": "683f99f26c911cbcafbc37163793072097659e1fce64f81de1c6c9cb66c4936c"
+      },
+      {
+        "path": "agents/insight-analyst.md",
+        "sha256": "d2874eece334595ddfe6b0c383fbfd16d2e378f57d36a3ddd0a86b95248cb774"
+      },
+      {
+        "path": "agents/test-engineer.md",
+        "sha256": "d55a239ae6d577c42afae902509863106866888765bf2ca2f5db0b8178405478"
+      },
+      {
+        "path": "agents/experimentation-strategist.md",
+        "sha256": "6ed1aa05e1f9a60ad9daff451aeb37452b390143f4ac0f55a1242f33cf5993c3"
+      },
+      {
+        "path": ".claude-plugin/plugin.json",
+        "sha256": "1f147e5fd6975d3155505764c74b08e384aecbc0a944bb942ef52455c1822572"
+      },
+      {
+        "path": "commands/synthesize-learnings.md",
+        "sha256": "0b103dbb5165d31b1ab9a5f7065f7e0198477d90be80074c4b8af3aa552ac3f8"
+      },
+      {
+        "path": "commands/prioritize-hypotheses.md",
+        "sha256": "f2edd3494e0b7d153f0408eaba41e84bd0d9fbbcffdb3c568b3eba0b961ff808"
+      },
+      {
+        "path": "commands/launch-experiment.md",
+        "sha256": "f087401e2a9650c086d5b6409a6dd8c031002241d44be2958f0c0ec59fd65cab"
+      },
+      {
+        "path": "skills/hypothesis-library/SKILL.md",
+        "sha256": "89e222dee10d57bbf336fe09345705a6026853614a59d0d29a43e4b94d10fd3a"
+      },
+      {
+        "path": "skills/guardrail-scorecard/SKILL.md",
+        "sha256": "5796941ec8ce4a6f8c29886f8a116d6055d36e41ad4dd27c77ed74f5150fa744"
+      },
+      {
+        "path": "skills/experiment-design-kit/SKILL.md",
+        "sha256": "3ab9b7015c111edae825d24057f08970a544f1766e535c341bcc3575dcb41928"
+      }
+    ],
+    "dirSha256": "cb9022c07228292196db7a9a857f93acd3ddbbd1518f2715ff5d98c0c23461b8"
+  },
+  "security": {
+    "scannedAt": null,
+    "scannerVersion": null,
+    "flags": []
+  }
+}
\ No newline at end of file
diff --git a/skills/experiment-design-kit/SKILL.md b/skills/experiment-design-kit/SKILL.md
new file mode 100644
index 0000000..4dca274
--- /dev/null
+++ b/skills/experiment-design-kit/SKILL.md
@@ -0,0 +1,62 @@
+---
+name: experiment-design-kit
+description: Toolkit for structuring hypotheses, variants, guardrails, and measurement
+  plans.
+---
+
+# Experiment Design Kit Skill
+
+## When to Use
+- Translating raw ideas into testable hypotheses with clear success metrics.
+- Ensuring experiment briefs include guardrails, instrumentation, and rollout details.
+- Coaching pods on best practices for multi-variant or multi-surface tests.
+
+## Framework
+1. **Problem Framing** – define user problem, business impact, and north-star metric.
+2. **Hypothesis Structure** – "If we do X for Y persona, we expect Z change" with assumptions.
+3. **Measurement Plan** – primary metric, guardrails, minimum detectable effect, power calculation.
+4. **Variant Strategy** – control definition, variant catalog, targeting, and exclusion rules.
+5. **Operational Plan** – owners, timeline, dependencies, QA/rollback steps.
+
+## Templates
+- Experiment brief (context, hypothesis, design, metrics, launch checklist).
+- Guardrail register with thresholds + alerting rules.
+- Variant matrix for surfaces, messaging, and states.
+- **GTM Agents Growth Backlog Board** – capture idea → sizing → prioritization scoring (ICE/RICE) @puerto/README.md#183-212.
+- **Weekly Experiment Packet** – includes KPI guardrails, qualitative notes, and next bets for the Marketing Director + Sales Director.
+- **Rollback Playbook** – pre-built checklist tied to lifecycle-mapping rip-cord procedures.
+
+## Tips
+- Pressure-test hypotheses with counter-metrics to avoid local optima.
+- Document data constraints early to avoid rework during build.
+- Pair with `guardrail-scorecard` to ensure sign-off before launch.
+- Apply the GTM Agents cadence: Monday backlog groom, Wednesday build review, Friday learnings sync.
+- Require KPI guardrails per stage (activation, engagement, monetization) before authorizing the build.
+- If a test risks Sales velocity, include the Sales Director in approval routing per GTM Agents governance.
+
+## GTM Agents Experiment Operating Model
+1. **Backlog Intake** – ideas flow from GTM pods; the Growth Marketer tags theme, objective, and expected impact.
+2. **Prioritization** – score with RICE + a qualitative "strategic fit" modifier; surface the top 3 bets weekly.
+3. **Design & Instrumentation** – reference Serena/Context7 to patch code + confirm documentation.
+4. **Launch & Monitor** – use guardrail-scorecard to watch leading indicators (churn, complaints, latency).
+5. **Learning Loop** – run a Sequential Thinking retro; document hypothesis, result, decision, and follow-up in the backlog card.
+
+## KPI Guardrails (GTM Agents Reference)
+- Activation rate change must stay within ±3% of baseline for Tier-1 segments.
+- Revenue per visitor cannot drop more than 2% for more than 48h.
+- Support tickets tied to an experiment variant must remain <5% of total volume.
+
+## Weekly Experiment Packet Outline
+```
+Week Ending:
+
+1. Portfolio Snapshot – tests live, status, KPI trend (guardrail vs actual)
+2. Key Wins – hypothesis, uplift, next action (ship, iterate, expand)
+3. Guardrail Alerts – what tripped, mitigation taken (rollback? scope adjust?)
+4. Pipeline Impact – SQLs, ARR influenced, notable customer anecdotes
+5. Upcoming Launches – dependencies, owners, open questions
+```
+
+Share the packet with Growth, the Marketing Director, the Sales Director, and RevOps to mirror GTM Agents' cross-functional communication rhythm.
+
+---
diff --git a/skills/guardrail-scorecard/SKILL.md b/skills/guardrail-scorecard/SKILL.md
new file mode 100644
index 0000000..7c7676e
--- /dev/null
+++ b/skills/guardrail-scorecard/SKILL.md
@@ -0,0 +1,31 @@
+---
+name: guardrail-scorecard
+description: Framework for defining, monitoring, and enforcing guardrail metrics across
+  experiments.
+---
+
+# Guardrail Scorecard Skill
+
+## When to Use
+- Setting non-negotiable metrics (stability, churn, latency, compliance) before launching tests.
+- Monitoring live experiments to ensure guardrails stay within thresholds.
+- Reporting guardrail status in launch packets and post-test readouts.
+
+## Framework
+1. **Metric Inventory** – list guardrail metrics, owners, data sources, refresh cadence.
+2. **Threshold Matrix** – define warning vs critical bands per metric / persona / region.
+3. **Alerting & Escalation** – map notification channels, DRI, and decision timelines.
+4. **Exception Handling** – document when guardrail overrides are acceptable and the required approvals.
+5. **Retrospective Loop** – log breaches, mitigations, and rule updates for future tests.
+
+## Templates
+- Guardrail register (metric, threshold, owner, alert channel).
+- Live monitoring dashboard layout.
+- Exception memo structure for approvals.
+
+## Tips
+- Tie guardrails to downstream systems (billing, support) to catch second-order impacts.
+- Keep thresholds dynamic for seasonality, but document the logic.
+- Pair with `launch-experiment` to ensure readiness before flipping flags.
+
+---
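The threshold matrix in the framework above boils down to classifying each metric's observed delta against a warning band and a critical band. A minimal sketch of that check; the metric names and thresholds loosely echo the KPI guardrails listed in the experiment-design-kit but should be read as placeholders rather than the plugin's actual register.

```python
from dataclasses import dataclass

@dataclass
class Guardrail:
    metric: str
    warning: float    # absolute change that triggers a warning
    critical: float   # absolute change that forces a rollback review

def evaluate(guardrails: list[Guardrail],
             deltas: dict[str, float]) -> dict[str, str]:
    """Classify each metric's observed delta as ok / warning / critical."""
    status = {}
    for g in guardrails:
        delta = abs(deltas.get(g.metric, 0.0))
        if delta >= g.critical:
            status[g.metric] = "critical"
        elif delta >= g.warning:
            status[g.metric] = "warning"
        else:
            status[g.metric] = "ok"
    return status

guardrails = [
    Guardrail("activation_rate", warning=0.02, critical=0.03),    # ±3% critical band
    Guardrail("revenue_per_visitor", warning=0.01, critical=0.02),
]
print(evaluate(guardrails, {"activation_rate": -0.025,
                            "revenue_per_visitor": 0.004}))
# {'activation_rate': 'warning', 'revenue_per_visitor': 'ok'}
```

Separating warning from critical bands is what makes the escalation step workable: warnings go to the DRI, criticals trigger the rollback playbook.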
diff --git a/skills/hypothesis-library/SKILL.md b/skills/hypothesis-library/SKILL.md
new file mode 100644
index 0000000..bc0e4e4
--- /dev/null
+++ b/skills/hypothesis-library/SKILL.md
@@ -0,0 +1,31 @@
+---
+name: hypothesis-library
+description: Curated repository of experiment hypotheses, assumptions, and historical
+  learnings.
+---
+
+# Hypothesis Library Skill
+
+## When to Use
+- Capturing new experiment ideas with consistent metadata.
+- Referencing past wins/losses before prioritizing the backlog.
+- Sharing reusable learnings across pods and channels.
+
+## Framework
+1. **Metadata Schema** – hypothesis ID, theme, persona, funnel stage, metrics.
+2. **Assumptions Matrix** – belief statements, supporting evidence, confidence rating.
+3. **Status Tracking** – idea → scoped → running → decided → archived.
+4. **Learning Tags** – impact summary, guardrail notes, follow-up ideas.
+5. **Governance Hooks** – approvals, owners, review cadence.
+
+## Templates
+- Intake form for new hypotheses.
+- Learning card format (context, result, recommendation).
+- Portfolio dashboard summarizing mix by theme/metric.
+
+## Tips
+- Require at least one supporting data point before moving to prioritization.
+- Use consistent tagging so search/filtering works across teams.
+- Link to `synthesize-learnings` outputs to keep narratives fresh.
+
+---
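The metadata schema and status tracking described above map naturally onto a small record type. A minimal sketch; the field names follow the skill's framework, while the validation rules (one supporting data point before scoping, strictly forward status moves) are assumptions layered onto the tips rather than behavior specified by the plugin.

```python
from dataclasses import dataclass, field

STATUSES = ["idea", "scoped", "running", "decided", "archived"]

@dataclass
class HypothesisCard:
    hypothesis_id: str
    theme: str
    persona: str
    funnel_stage: str
    metrics: list[str]
    status: str = "idea"
    evidence: list[str] = field(default_factory=list)
    learning_tags: list[str] = field(default_factory=list)

    def advance(self, new_status: str) -> None:
        """Move the card one step along idea -> scoped -> running -> decided -> archived."""
        if STATUSES.index(new_status) != STATUSES.index(self.status) + 1:
            raise ValueError(f"cannot move from {self.status} to {new_status}")
        if new_status == "scoped" and not self.evidence:
            raise ValueError("at least one supporting data point is required before scoping")
        self.status = new_status

card = HypothesisCard("HYP-031", "onboarding", "admin", "activation",
                      metrics=["activation_rate"])
card.evidence.append("funnel report shows drop-off at onboarding step 3")
card.advance("scoped")
print(card.status)   # scoped
```

A schema like this is also what keeps the tagging consistent enough for the cross-team search and filtering the tips call for.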