Initial commit

Zhongwei Li committed 2025-11-29 18:30:46 +08:00
commit ca9abd0543
12 changed files with 419 additions and 0 deletions

.claude-plugin/plugin.json Normal file

@@ -0,0 +1,24 @@
{
"name": "growth-experiments",
"description": "Experiment backlog, launch, and learning governance across the funnel",
"version": "1.0.0",
"author": {
"name": "GTM Agents",
"email": "opensource@intentgpt.ai"
},
"skills": [
"./skills/hypothesis-library/SKILL.md",
"./skills/experiment-design-kit/SKILL.md",
"./skills/guardrail-scorecard/SKILL.md"
],
"agents": [
"./agents/experimentation-strategist.md",
"./agents/test-engineer.md",
"./agents/insight-analyst.md"
],
"commands": [
"./commands/prioritize-hypotheses.md",
"./commands/launch-experiment.md",
"./commands/synthesize-learnings.md"
]
}
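
Before publishing, it is worth confirming that every path in the `skills`, `agents`, and `commands` arrays above actually exists. A minimal sketch, assuming it runs from the plugin root with the manifest at `.claude-plugin/plugin.json` (the path recorded in the lock file below):

```python
import json
from pathlib import Path

# Load the manifest and check each referenced file on disk.
root = Path(".")
manifest = json.loads((root / ".claude-plugin" / "plugin.json").read_text())

for key in ("skills", "agents", "commands"):
    for ref in manifest.get(key, []):
        if not (root / ref).is_file():  # Path() normalizes the leading "./"
            print(f"missing {key} entry: {ref}")
```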

README.md Normal file

@@ -0,0 +1,3 @@
# growth-experiments
Experiment backlog, launch, and learning governance across the funnel

agents/experimentation-strategist.md Normal file

@@ -0,0 +1,30 @@
---
name: experimentation-strategist
description: Prioritizes hypotheses, capacity, and portfolio governance for growth experiments.
model: sonnet
---
# Experimentation Strategist Agent
## Responsibilities
- Maintain experiment pipeline with hypotheses, confidence, and projected impact.
- Align backlog with product, marketing, and lifecycle OKRs.
- Facilitate governance rituals (triage, launch reviews, readouts).
- Track guardrails and ensure learnings are codified into playbooks.
## Workflow
1. **Backlog Intake**: capture experiment ideas, assumptions, and impact estimates.
2. **Prioritization**: score using ICE/RICE + guardrail requirements; balance the portfolio mix.
3. **Planning**: assign owners, timelines, instrumentation, and a dependency map.
4. **Governance**: run weekly standups, decision logs, and escalation paths.
5. **Learning Ops**: publish readouts, update playbooks, and trigger follow-on tests.
## Outputs
- Prioritized experiment roadmap with scoring matrix.
- Governance calendar + decision/exception logs.
- Learning digests and next-step recommendations.
---

agents/insight-analyst.md Normal file

@@ -0,0 +1,31 @@
---
name: insight-analyst
description: Synthesizes experiment results, derives insights, and recommends next bets.
model: sonnet
---
# Insight Analyst Agent
## Responsibilities
- Define success metrics, guardrails, and statistical power requirements.
- Analyze interim and final experiment results, including segmentation and interaction effects.
- Translate findings into action plans and prioritized follow-ups.
- Maintain experiment knowledge base for reuse across teams.
## Workflow
1. **Experiment Design Support**: align measurement plans, uplift expectations, and data capture.
2. **Monitoring**: check guardrails, traffic balance, and anomaly alerts during the run.
3. **Analysis**: run statistical tests, audience splits, and sensitivity analyses (see the sketch after this list).
4. **Storytelling**: craft exec-ready readouts with implications and decision recommendations.
5. **Knowledge Ops**: update the centralized repository with learnings, templates, and tags.
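
For the analysis step, the simplest case is a two-proportion z-test on a conversion metric. A minimal sketch using SciPy; the counts are illustrative:

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return z, p_value

# Illustrative numbers: control converts 480/12000, variant 545/12000.
z, p = two_proportion_ztest(480, 12000, 545, 12000)
print(f"z={z:.2f}, p={p:.4f}")
```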
## Outputs
- Interim monitoring dashboards and guardrail alerts.
- Final readout with insights, decision, and rollout guidance.
- Learning cards linked to hypothesis library + future test ideas.
---

agents/test-engineer.md Normal file

@@ -0,0 +1,27 @@
---
name: test-engineer
description: Designs experiment architecture, instrumentation, and QA for growth initiatives.
model: haiku
---
# Test Engineer Agent
## Responsibilities
- Translate hypotheses into technical specs, variants, and routing logic.
- Configure experimentation platforms, feature flags, and rollout safeguards.
- Ensure telemetry, conversion events, and attribution are reliable before launch.
- Manage QA, holdouts, and rollback plans during live tests.
## Workflow
1. **Technical Scoping**: evaluate feasibility, dependencies, and required integrations.
2. **Instrumentation Setup**: implement events, guardrail metrics, and data validation checks (see the sketch after this list).
3. **Variant Build & QA**: configure branches, ensure parity, and run automated/manual QA.
4. **Launch Management**: monitor ramp rules, performance, and guardrails in real time.
5. **Post-Test Handoff**: archive configs, document learnings, and prep the next iteration.
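
For the instrumentation step, a pre-launch check can confirm that every required event was observed per variant in QA. A minimal sketch; the event names and payload shape are illustrative, not a specific platform's API:

```python
# Required telemetry events that every variant must emit before launch.
REQUIRED_EVENTS = {"exposure_logged", "cta_clicked", "signup_completed"}

def instrumentation_gaps(qa_events: list[dict]) -> dict[str, set[str]]:
    """Return, per variant, the required events never observed in QA."""
    seen: dict[str, set[str]] = {}
    for e in qa_events:
        seen.setdefault(e["variant"], set()).add(e["name"])
    return {variant: REQUIRED_EVENTS - names
            for variant, names in seen.items()
            if REQUIRED_EVENTS - names}

gaps = instrumentation_gaps([
    {"variant": "control", "name": "exposure_logged"},
    {"variant": "treatment", "name": "exposure_logged"},
    {"variant": "treatment", "name": "cta_clicked"},
])
print(gaps)  # control lacks two events; treatment lacks signup_completed
```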
## Outputs
- Experiment technical spec with instrumentation checklist.
- QA + launch readiness report covering guardrails and fallback steps.
- Post-experiment teardown + tech debt log.
---

commands/launch-experiment.md Normal file

@@ -0,0 +1,35 @@
---
name: launch-experiment
description: Converts an approved hypothesis into a fully instrumented test with guardrails and a rollout plan.
usage: /growth-experiments:launch-experiment --id EXP-142 --surface onboarding --variant-count 3 --ramp 5,25,50,100
---
# Command: launch-experiment
## Inputs
- **id**: experiment or hypothesis identifier.
- **surface**: product area/channel (onboarding, pricing page, lifecycle email, in-app).
- **variant-count**: number of variants/arms including control.
- **ramp**: comma-separated rollout schedule (%) or JSON file reference.
- **holdout**: optional holdout/ghost-experiment definition for measurement.
- **notes**: free text for special approvals or exception handling.
## Workflow
1. **Readiness Check**: confirm design sign-off, instrumentation coverage, and guardrails.
2. **Variant Assembly**: pull specs, assets, and targeting rules for each arm.
3. **Rollout Plan**: configure the flag/experimentation platform with the ramp schedule + alerts (see the sketch after this list).
4. **QA & Approvals**: run smoke tests, capture screenshots, and gather stakeholder approval.
5. **Launch & Monitoring**: activate the test, enable telemetry dashboards, and notify channels.
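
A minimal sketch of how a `--ramp 5,25,50,100` schedule could be parsed and stepped; `set_traffic` and `guardrails_ok` are stand-in hooks, not a real platform API:

```python
import time

def run_ramp(ramp_arg: str, hold_seconds: int, set_traffic, guardrails_ok) -> None:
    """Step exposure through each percentage, pausing to watch guardrails."""
    schedule = [int(pct) for pct in ramp_arg.split(",")]  # e.g. "5,25,50,100"
    for pct in schedule:
        set_traffic(pct)              # assumed platform call
        time.sleep(hold_seconds)      # soak period per step
        if not guardrails_ok():       # assumed guardrail monitor hook
            set_traffic(0)            # roll back on a breach
            raise RuntimeError(f"guardrail breach at {pct}% ramp")

run_ramp("5,25,50,100", hold_seconds=0,
         set_traffic=lambda p: print(f"traffic -> {p}%"),
         guardrails_ok=lambda: True)
```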
## Outputs
- Launch packet with specs, QA evidence, approvals, and rollout timeline.
- Experiment platform configuration export + guardrail monitors.
- Stakeholder announcement + escalation matrix.
## Agent/Skill Invocations
- `test-engineer` builds variants, instrumentation, and QA evidence.
- `experimentation-strategist` confirms governance + approvals.
- `guardrail-scorecard` skill validates guardrail coverage + thresholds.
- `experiment-design-kit` skill ensures templates + best practices are applied.
---

commands/prioritize-hypotheses.md Normal file

@@ -0,0 +1,34 @@
---
name: prioritize-hypotheses
description: Scores experiment backlog using impact, confidence, effort, and guardrail readiness.
usage: /growth-experiments:prioritize-hypotheses --source backlog.csv --capacity 6 --framework rice
---
# Command: prioritize-hypotheses
## Inputs
- **source**: backlog file, experiment tracker, or Notion database ID.
- **capacity**: number of experiments that can run in the next sprint/cycle.
- **framework**: ice | rice | custom; determines scoring weights.
- **guardrails**: optional JSON/CSV for mandatory guardrail requirements.
- **filters**: tags or OKRs to focus on (acquisition, activation, retention, monetization).
## Workflow
1. **Data Ingestion**: load the backlog, normalize fields, and enrich with the latest metrics.
2. **Scoring Engine**: calculate ICE/RICE/custom scores, factoring in guardrail readiness (see the sketch after this list).
3. **Portfolio Mix**: ensure balance across funnel stages and surfaces; flag conflicts.
4. **Capacity Planning**: fit the highest-value tests into available slots, accounting for owners + effort.
5. **Decision Pack**: generate the prioritized list, rationale, and trade-off notes for approval.
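
A minimal sketch of the scoring step under the `rice` framework, assuming a `backlog.csv` export with `id`, `reach`, `impact`, `confidence`, `effort`, and `guardrails_ready` columns (all field names are illustrative):

```python
import csv

def rice(row: dict) -> float:
    reach = float(row["reach"])            # users affected per cycle
    impact = float(row["impact"])          # e.g. 0.25 / 0.5 / 1 / 2 / 3 scale
    confidence = float(row["confidence"])  # 0..1
    effort = float(row["effort"])          # person-weeks
    return reach * impact * confidence / effort

with open("backlog.csv") as f:
    # Gate on guardrail readiness before anything is eligible for a slot.
    rows = [r for r in csv.DictReader(f) if r.get("guardrails_ready") == "yes"]

capacity = 6
ranked = sorted(rows, key=rice, reverse=True)
selected, waitlist = ranked[:capacity], ranked[capacity:]
for row in selected:
    print(f"{row['id']}: RICE={rice(row):.1f}")
```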
## Outputs
- Ranked backlog with scores, dependencies, and guardrail status.
- Capacity plan showing selected tests plus waitlist.
- Decision memo summarizing trade-offs and next actions.
## Agent/Skill Invocations
- `experimentation-strategist` orchestrates prioritization + governance alignment.
- `insight-analyst` validates data quality and metric assumptions.
- `hypothesis-library` skill links past learnings to current ideas.
- `guardrail-scorecard` skill enforces readiness requirements.
---

commands/synthesize-learnings.md Normal file

@@ -0,0 +1,34 @@
---
name: synthesize-learnings
description: Creates experiment readouts, codifies learnings, and routes follow-up actions.
usage: /growth-experiments:synthesize-learnings --source exp-log.db --scope "Q4 funnel" --audience exec
---
# Command: synthesize-learnings
## Inputs
- **source**: experiment tracker, warehouse table, or analytics workspace.
- **scope**: filters (timeframe, product area, funnel stage, persona).
- **audience**: exec | pod | growth-guild | async memo; controls fidelity/tone.
- **format**: deck | memo | dashboard | loom.
- **follow-ups**: optional CSV/JSON for linking Jira/Asana action items.
## Workflow
1. **Data Consolidation**: assemble final metrics, guardrail outcomes, and qualitative notes.
2. **Insight Extraction**: group learnings by hypothesis theme, persona, or funnel stage (see the sketch after this list).
3. **Decision Encoding**: record ship, iterate, or archive outcomes plus rationale.
4. **Action Routing**: create follow-up stories, backlog items, or automation triggers.
5. **Knowledge Base Update**: tag learnings in the centralized library with attribution + status.
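
A minimal sketch of the insight-extraction step, grouping decided experiments by hypothesis theme; the record fields are illustrative:

```python
from collections import defaultdict

experiments = [
    {"theme": "onboarding", "decision": "ship", "lift": 0.042},
    {"theme": "onboarding", "decision": "archive", "lift": -0.003},
    {"theme": "pricing", "decision": "iterate", "lift": 0.011},
]

# Bucket decided experiments by theme for the readout.
by_theme: dict[str, list[dict]] = defaultdict(list)
for exp in experiments:
    by_theme[exp["theme"]].append(exp)

for theme, exps in by_theme.items():
    ships = sum(1 for e in exps if e["decision"] == "ship")
    avg_lift = sum(e["lift"] for e in exps) / len(exps)
    print(f"{theme}: {len(exps)} tests, {ships} shipped, avg lift {avg_lift:+.1%}")
```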
## Outputs
- Executive-ready readout with insights, decisions, and KPIs.
- Learning cards mapped to hypothesis taxonomy + next bets.
- Action log synced to backlog/project tools.
## Agent/Skill Invocations
- `insight-analyst` leads analysis and storytelling.
- `experimentation-strategist` ensures learnings feed roadmap + governance.
- `hypothesis-library` skill indexes learnings against taxonomy.
- `experiment-design-kit` skill suggests iteration ideas based on patterns.
---

plugin.lock.json Normal file

@@ -0,0 +1,77 @@
{
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:gtmagents/gtm-agents:plugins/growth-experiments",
"normalized": {
"repo": null,
"ref": "refs/tags/v20251128.0",
"commit": "f71552aec1f22b8c42cf2a780cc173b967f34418",
"treeHash": "cb9022c07228292196db7a9a857f93acd3ddbbd1518f2715ff5d98c0c23461b8",
"generatedAt": "2025-11-28T10:17:14.617812Z",
"toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
"remote": "git@github.com:zhongweili/42plugin-data.git",
"branch": "master",
"commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
"repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
"name": "growth-experiments",
"description": "Experiment backlog, launch, and learning governance across the funnel",
"version": "1.0.0"
},
"content": {
"files": [
{
"path": "README.md",
"sha256": "683f99f26c911cbcafbc37163793072097659e1fce64f81de1c6c9cb66c4936c"
},
{
"path": "agents/insight-analyst.md",
"sha256": "d2874eece334595ddfe6b0c383fbfd16d2e378f57d36a3ddd0a86b95248cb774"
},
{
"path": "agents/test-engineer.md",
"sha256": "d55a239ae6d577c42afae902509863106866888765bf2ca2f5db0b8178405478"
},
{
"path": "agents/experimentation-strategist.md",
"sha256": "6ed1aa05e1f9a60ad9daff451aeb37452b390143f4ac0f55a1242f33cf5993c3"
},
{
"path": ".claude-plugin/plugin.json",
"sha256": "1f147e5fd6975d3155505764c74b08e384aecbc0a944bb942ef52455c1822572"
},
{
"path": "commands/synthesize-learnings.md",
"sha256": "0b103dbb5165d31b1ab9a5f7065f7e0198477d90be80074c4b8af3aa552ac3f8"
},
{
"path": "commands/prioritize-hypotheses.md",
"sha256": "f2edd3494e0b7d153f0408eaba41e84bd0d9fbbcffdb3c568b3eba0b961ff808"
},
{
"path": "commands/launch-experiment.md",
"sha256": "f087401e2a9650c086d5b6409a6dd8c031002241d44be2958f0c0ec59fd65cab"
},
{
"path": "skills/hypothesis-library/SKILL.md",
"sha256": "89e222dee10d57bbf336fe09345705a6026853614a59d0d29a43e4b94d10fd3a"
},
{
"path": "skills/guardrail-scorecard/SKILL.md",
"sha256": "5796941ec8ce4a6f8c29886f8a116d6055d36e41ad4dd27c77ed74f5150fa744"
},
{
"path": "skills/experiment-design-kit/SKILL.md",
"sha256": "3ab9b7015c111edae825d24057f08970a544f1766e535c341bcc3575dcb41928"
}
],
"dirSha256": "cb9022c07228292196db7a9a857f93acd3ddbbd1518f2715ff5d98c0c23461b8"
},
"security": {
"scannedAt": null,
"scannerVersion": null,
"flags": []
}
}
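
The per-file `sha256` entries make the lock verifiable. A minimal sketch that recomputes each digest from disk, run from the plugin root (the `dirSha256` scheme is tool-specific and not reproduced here):

```python
import hashlib
import json
from pathlib import Path

lock = json.loads(Path("plugin.lock.json").read_text())
for entry in lock["content"]["files"]:
    # Hash the file bytes and compare against the recorded digest.
    digest = hashlib.sha256(Path(entry["path"]).read_bytes()).hexdigest()
    status = "ok" if digest == entry["sha256"] else "MISMATCH"
    print(f"{status}  {entry['path']}")
```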

skills/experiment-design-kit/SKILL.md Normal file

@@ -0,0 +1,62 @@
---
name: experiment-design-kit
description: Toolkit for structuring hypotheses, variants, guardrails, and measurement plans.
---
# Experiment Design Kit Skill
## When to Use
- Translating raw ideas into testable hypotheses with clear success metrics.
- Ensuring experiment briefs include guardrails, instrumentation, and rollout details.
- Coaching pods on best practices for multi-variant or multi-surface tests.
## Framework
1. **Problem Framing**: define the user problem, business impact, and north-star metric.
2. **Hypothesis Structure**: "If we do X for Y persona, we expect Z change," with assumptions.
3. **Measurement Plan**: primary metric, guardrails, min detectable effect, power calc (see the sketch after this list).
4. **Variant Strategy**: control definition, variant catalog, targeting, and exclusion rules.
5. **Operational Plan**: owners, timeline, dependencies, QA/rollback steps.
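
For the measurement plan, a back-of-envelope sample size for a conversion metric, using the standard normal approximation for a two-proportion test; the baseline and MDE values are illustrative:

```python
from math import ceil
from scipy.stats import norm

def sample_size_per_arm(p_base, mde_abs, alpha=0.05, power=0.8):
    """n per arm to detect an absolute lift of mde_abs over p_base (two-sided)."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p1, p2 = p_base, p_base + mde_abs
    p_bar = (p1 + p2) / 2
    n = ((z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
          + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2) / mde_abs ** 2
    return ceil(n)

print(sample_size_per_arm(0.04, 0.005))  # 4% baseline, +0.5pt MDE -> ~25.6k per arm
```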
## Templates
- Experiment brief (context, hypothesis, design, metrics, launch checklist).
- Guardrail register with thresholds + alerting rules.
- Variant matrix for surfaces, messaging, and states.
- **GTM Agents Growth Backlog Board**: capture idea → sizing → prioritization scoring (ICE/RICE); see @puerto/README.md#183-212.
- **Weekly Experiment Packet**: includes KPI guardrails, qualitative notes, and next bets for the Marketing Director + Sales Director.
- **Rollback Playbook**: pre-built checklist tied to lifecycle-mapping rip-cord procedures.
## Tips
- Pressure-test hypotheses with counter-metrics to avoid local optima.
- Document data constraints early to avoid rework during build.
- Pair with `guardrail-scorecard` to ensure sign-off before launch.
- Apply GTM Agents cadence: Monday backlog groom, Wednesday build review, Friday learnings sync.
- Require KPI guardrails per stage (activation, engagement, monetization) before authorizing build.
- If a test risks Sales velocity, include Sales Director in approval routing per GTM Agents governance.
## GTM Agents Experiment Operating Model
1. **Backlog Intake**: ideas flow from GTM pods; the Growth Marketer tags theme, objective, and expected impact.
2. **Prioritization**: score with RICE + a qualitative "strategic fit" modifier; surface the top 3 bets weekly.
3. **Design & Instrumentation**: reference Serena/Context7 to patch code + confirm documentation.
4. **Launch & Monitor**: use guardrail-scorecard to watch leading indicators (churn, complaints, latency).
5. **Learning Loop**: run a Sequential Thinking retro; document hypothesis, result, decision, and follow-up in the backlog card.
## KPI Guardrails (GTM Agents Reference)
- Activation rate change must stay within ±3% of baseline for Tier-1 segments.
- Revenue per visitor cannot drop more than 2% for more than 48h.
- Support tickets tied to experiment variant must remain <5% of total volume.
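
A minimal sketch encoding these three reference guardrails as checks; the metric snapshot fields are illustrative:

```python
def check_guardrails(m: dict) -> list[str]:
    """Return the list of breached reference guardrails for one snapshot."""
    breaches = []
    if abs(m["activation_delta"]) > 0.03:                      # ±3% Tier-1 activation band
        breaches.append("activation outside ±3% band")
    if m["rpv_delta"] < -0.02 and m["rpv_breach_hours"] > 48:  # RPV down >2% for >48h
        breaches.append("revenue per visitor below threshold for >48h")
    if m["variant_ticket_share"] >= 0.05:                      # tickets must stay <5% of volume
        breaches.append("variant support tickets >=5% of volume")
    return breaches

print(check_guardrails({"activation_delta": -0.01, "rpv_delta": -0.025,
                        "rpv_breach_hours": 60, "variant_ticket_share": 0.02}))
```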
## Weekly Experiment Packet Outline
```
Week Ending: <Date>
1. Portfolio Snapshot: tests live, status, KPI trend (guardrail vs actual)
2. Key Wins: hypothesis, uplift, next action (ship, iterate, expand)
3. Guardrail Alerts: what tripped, mitigation taken (rollback? scope adjust?)
4. Pipeline Impact: SQLs, ARR influenced, notable customer anecdotes
5. Upcoming Launches: dependencies, owners, open questions
```
Share the packet with Growth, the Marketing Director, the Sales Director, and RevOps to mirror GTM Agents' cross-functional communication rhythm.
---

skills/guardrail-scorecard/SKILL.md Normal file

@@ -0,0 +1,31 @@
---
name: guardrail-scorecard
description: Framework for defining, monitoring, and enforcing guardrail metrics across experiments.
---
# Guardrail Scorecard Skill
## When to Use
- Setting non-negotiable metrics (stability, churn, latency, compliance) before launching tests.
- Monitoring live experiments to ensure guardrails stay within thresholds.
- Reporting guardrail status in launch packets and post-test readouts.
## Framework
1. **Metric Inventory**: list guardrail metrics, owners, data sources, and refresh cadence.
2. **Threshold Matrix**: define warning vs critical bands per metric / persona / region (see the sketch after this list).
3. **Alerting & Escalation**: map notification channels, the DRI, and decision timelines.
4. **Exception Handling**: document when guardrail overrides are acceptable and the required approvals.
5. **Retrospective Loop**: log breaches, mitigations, and rule updates for future tests.
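
A minimal sketch of the threshold matrix with warning and critical bands per metric; the metric names and band values are illustrative:

```python
THRESHOLDS = {
    "checkout_latency_ms": {"warning": 400, "critical": 600},
    "churn_rate_delta":    {"warning": 0.005, "critical": 0.01},
}

def classify(metric: str, value: float) -> str:
    """Map a live metric value onto its guardrail band."""
    bands = THRESHOLDS[metric]
    if value >= bands["critical"]:
        return "critical"   # page the DRI, start the rollback clock
    if value >= bands["warning"]:
        return "warning"    # notify the channel, tighten monitoring
    return "ok"

print(classify("checkout_latency_ms", 450))  # -> warning
```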
## Templates
- Guardrail register (metric, threshold, owner, alert channel).
- Live monitoring dashboard layout.
- Exception memo structure for approvals.
## Tips
- Tie guardrails to downstream systems (billing, support) to catch second-order impacts.
- Keep thresholds dynamic for seasonality but document logic.
- Pair with `launch-experiment` to ensure readiness before flipping flags.
---

skills/hypothesis-library/SKILL.md Normal file

@@ -0,0 +1,31 @@
---
name: hypothesis-library
description: Curated repository of experiment hypotheses, assumptions, and historical learnings.
---
# Hypothesis Library Skill
## When to Use
- Capturing new experiment ideas with consistent metadata.
- Referencing past wins/losses before prioritizing the backlog.
- Sharing reusable learnings across pods and channels.
## Framework
1. **Metadata Schema**: hypothesis ID, theme, persona, funnel stage, metrics (see the sketch after this list).
2. **Assumptions Matrix**: belief statements, supporting evidence, confidence rating.
3. **Status Tracking**: idea → scoped → running → decided → archived.
4. **Learning Tags**: impact summary, guardrail notes, follow-up ideas.
5. **Governance Hooks**: approvals, owners, review cadence.
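
A minimal sketch of the metadata schema and status flow as a dataclass; the field names and example values are illustrative:

```python
from dataclasses import dataclass, field
from enum import Enum

class Status(Enum):
    IDEA = "idea"
    SCOPED = "scoped"
    RUNNING = "running"
    DECIDED = "decided"
    ARCHIVED = "archived"

@dataclass
class Hypothesis:
    hypothesis_id: str       # e.g. "HYP-0042"
    theme: str               # taxonomy tag, e.g. "onboarding"
    persona: str
    funnel_stage: str        # acquisition | activation | retention | monetization
    primary_metric: str
    status: Status = Status.IDEA
    learning_tags: list[str] = field(default_factory=list)

card = Hypothesis("HYP-0042", "onboarding", "self-serve admin",
                  "activation", "day-7 activation rate")
```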
## Templates
- Intake form for new hypotheses.
- Learning card format (context, result, recommendation).
- Portfolio dashboard summarizing mix by theme/metric.
## Tips
- Require at least one supporting data point before moving to prioritization.
- Use consistent tagging so search/filtering works across teams.
- Link to `synthesize-learnings` outputs to keep narratives fresh.
---