Initial commit
This commit is contained in:
458
skills/market-mechanics-betting/SKILL.md
Normal file
@@ -0,0 +1,458 @@
|
||||
---
|
||||
name: market-mechanics-betting
|
||||
description: Use to convert probabilities into decisions (bet/pass/hedge) and optimize scoring. Invoke when you need to calculate edge, size bets optimally (Kelly Criterion), extremize aggregated forecasts, or improve Brier scores. Use when the user mentions betting strategy, Kelly, edge calculation, Brier score, extremizing, or translating belief into action.
|
||||
---
|
||||
|
||||
# Market Mechanics & Betting
|
||||
|
||||
## Table of Contents
|
||||
- [What is Market Mechanics?](#what-is-market-mechanics)
|
||||
- [When to Use This Skill](#when-to-use-this-skill)
|
||||
- [Interactive Menu](#interactive-menu)
|
||||
- [Quick Reference](#quick-reference)
|
||||
- [Resource Files](#resource-files)
|
||||
|
||||
---
|
||||
|
||||
## What is Market Mechanics?
|
||||
|
||||
**Market mechanics** translates beliefs (probabilities) into actions (bets, decisions, resource allocation) using quantitative frameworks.
|
||||
|
||||
**Core Principle:** If you believe something with X% probability, you should be willing to bet at certain odds.
|
||||
|
||||
**Why It Matters:**
|
||||
- Forces intellectual honesty (would you bet on this?)
|
||||
- Optimizes resource allocation (how much to bet?)
|
||||
- Improves calibration (betting reveals true beliefs)
|
||||
- Provides scoring framework (Brier, log score)
|
||||
- Enables aggregation (extremizing, market prices)
|
||||
|
||||
---
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
Use when:
|
||||
- Converting belief to action - Have probability, need decision
|
||||
- Betting decisions - Should I bet? How much?
|
||||
- Resource allocation - How to distribute finite resources?
|
||||
- Scoring forecasts - Measuring accuracy (Brier score)
|
||||
- Aggregating forecasts - Combining multiple predictions
|
||||
- Finding edge - Is my probability better than market?
|
||||
|
||||
Do NOT use when:
|
||||
- No market/betting context exists
|
||||
- Non-quantifiable outcomes
|
||||
- Pure strategic analysis (no probability needed)
|
||||
|
||||
---
|
||||
|
||||
## Interactive Menu
|
||||
|
||||
**What would you like to do?**
|
||||
|
||||
### Core Workflows
|
||||
|
||||
**1. [Calculate Edge](#1-calculate-edge)** - Determine if you have an advantage
|
||||
**2. [Optimize Bet Size (Kelly Criterion)](#2-optimize-bet-size-kelly-criterion)** - How much to bet
|
||||
**3. [Extremize Aggregated Forecasts](#3-extremize-aggregated-forecasts)** - Adjust crowd wisdom
|
||||
**4. [Optimize Brier Score](#4-optimize-brier-score)** - Improve forecast scoring
|
||||
**5. [Hedge and Portfolio Betting](#5-hedge-and-portfolio-betting)** - Manage multiple bets
|
||||
**6. [Learn the Framework](#6-learn-the-framework)** - Deep dive into methodology
|
||||
**7. Exit** - Return to main forecasting workflow
|
||||
|
||||
---
|
||||
|
||||
## 1. Calculate Edge
|
||||
|
||||
**Determine if you have a betting advantage.**
|
||||
|
||||
```
|
||||
Edge Calculation Progress:
|
||||
- [ ] Step 1: Identify market probability
|
||||
- [ ] Step 2: State your probability
|
||||
- [ ] Step 3: Calculate edge
|
||||
- [ ] Step 4: Apply minimum threshold
|
||||
- [ ] Step 5: Make bet/pass decision
|
||||
```
|
||||
|
||||
### Step 1: Identify market probability
|
||||
|
||||
**Sources:** Prediction markets (Polymarket, Kalshi), betting odds, consensus forecasts, base rates
|
||||
|
||||
**Converting betting odds to probability:**
|
||||
```
|
||||
Decimal odds: Probability = 1 / Odds
|
||||
American (+150): Probability = 100 / (150 + 100) = 40%
|
||||
American (-150): Probability = 150 / (150 + 100) = 60%
|
||||
Fractional (3/1): Probability = 1 / (3 + 1) = 25%
|
||||
```
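The conversions above can be sketched in Python. This is a minimal illustration rather than part of the original skill: the function names are hypothetical, and the results are raw implied probabilities that still include any bookmaker margin.

```python
# Hypothetical helpers (names are illustrative, not from the original skill).

def decimal_to_prob(odds: float) -> float:
    """Implied probability from decimal odds, e.g. 1.67 -> ~0.60."""
    if odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / odds


def american_to_prob(odds: float) -> float:
    """Implied probability from American odds, e.g. +150 -> 0.40, -150 -> 0.60."""
    if odds == 0:
        raise ValueError("American odds cannot be 0")
    if odds > 0:
        return 100.0 / (odds + 100.0)
    return -odds / (-odds + 100.0)


def fractional_to_prob(numerator: float, denominator: float) -> float:
    """Implied probability from fractional odds, e.g. 3/1 -> 0.25."""
    return denominator / (numerator + denominator)
```

For example, `american_to_prob(-150)` reproduces the 60% figure shown above.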
|
||||
|
||||
### Step 2: State your probability
|
||||
|
||||
After running your forecasting process, state: **Your probability:** ___%
|
||||
|
||||
### Step 3: Calculate edge
|
||||
|
||||
```
|
||||
Edge = Your Probability - Market Probability
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- **Positive edge:** More bullish than market → Consider betting YES
|
||||
- **Negative edge:** More bearish than market → Consider betting NO
|
||||
- **Zero edge:** Agree with market → Pass
|
||||
|
||||
### Step 4: Apply minimum threshold
|
||||
|
||||
**Minimum Edge Thresholds:**
|
||||
|
||||
| Context | Minimum Edge | Reasoning |
|
||||
|---------|--------------|-----------|
|
||||
| Prediction markets | 5-10% | Fees ~2-5%, need buffer |
|
||||
| Sports betting | 3-5% | Efficient markets |
|
||||
| Private bets | 2-3% | Only model uncertainty |
|
||||
| High conviction | 8-15% | Substantial edge needed |
|
||||
|
||||
### Step 5: Make bet/pass decision
|
||||
|
||||
```
|
||||
If Edge > Minimum Threshold → Calculate bet size (Kelly)
|
||||
If 0 < Edge < Minimum → Pass (edge too small)
|
||||
If Edge < 0 → Consider opposite bet or pass
|
||||
```
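The decision rule above can be sketched as a small function. This is an illustration under stated assumptions: `edge_decision` is a hypothetical name, the 5% default threshold is taken from the prediction-market row of the table, and applying the same threshold to the NO side is an assumption, not something the rule above specifies.

```python
# Hypothetical helper: the symmetric NO-side threshold is an assumption.

def edge_decision(your_prob: float, market_prob: float, min_edge: float = 0.05):
    """Return (edge, action) for a YES contract priced at market_prob."""
    edge = your_prob - market_prob
    if edge > min_edge:
        return edge, "size YES bet with Kelly"
    if -edge > min_edge:
        return edge, "consider NO bet or pass"
    return edge, "pass"


edge, action = edge_decision(0.70, 0.60)
print(f"edge={edge:+.2f} -> {action}")
```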
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu) or continue to Kelly sizing
|
||||
|
||||
---
|
||||
|
||||
## 2. Optimize Bet Size (Kelly Criterion)
|
||||
|
||||
**Calculate optimal bet size to maximize long-term growth.**
|
||||
|
||||
```
|
||||
Kelly Criterion Progress:
|
||||
- [ ] Step 1: Understand Kelly formula
|
||||
- [ ] Step 2: Calculate full Kelly
|
||||
- [ ] Step 3: Apply fractional Kelly
|
||||
- [ ] Step 4: Consider bankroll constraints
|
||||
- [ ] Step 5: Execute bet
|
||||
```
|
||||
|
||||
### Step 1: Understand Kelly formula
|
||||
|
||||
```
|
||||
f* = (bp - q) / b
|
||||
|
||||
Where:
|
||||
f* = Fraction of bankroll to bet
|
||||
b = Net odds received (decimal odds - 1)
|
||||
p = Your probability of winning
|
||||
q = Your probability of losing (1 - p)
|
||||
```
|
||||
|
||||
Maximizes expected logarithm of wealth (long-term growth rate).
|
||||
|
||||
### Step 2: Calculate full Kelly
|
||||
|
||||
**Example:**
|
||||
- Your probability: 70% win
|
||||
- Market odds: 1.67 (decimal) → Net odds (b): 0.67
|
||||
- p = 0.70, q = 0.30
|
||||
|
||||
```
|
||||
f* = (0.67 × 0.70 - 0.30) / 0.67 = 0.252 = 25.2%
|
||||
```
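As a sketch, the same calculation in Python. The name `kelly_fraction` is hypothetical, and clamping a negative result to zero (no bet when there is no edge) is an assumption added for illustration.

```python
# Hypothetical helper: clamping to 0 when there is no edge is an assumption.

def kelly_fraction(p: float, decimal_odds: float) -> float:
    """Full-Kelly fraction f* = (b*p - q) / b for a binary bet."""
    b = decimal_odds - 1.0   # net odds received
    q = 1.0 - p              # probability of losing
    if b <= 0:
        raise ValueError("decimal odds must be greater than 1")
    return max(0.0, (b * p - q) / b)


print(round(kelly_fraction(0.70, 1.67), 3))
```

With the example inputs (p = 0.70, odds 1.67) this reproduces the 25.2% figure.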
|
||||
|
||||
Full Kelly says: **Bet 25.2% of bankroll**
|
||||
|
||||
### Step 3: Apply fractional Kelly
|
||||
|
||||
**Problem with full Kelly:** High variance, model error sensitivity, psychological difficulty
|
||||
|
||||
**Solution: Fractional Kelly**
|
||||
|
||||
```
|
||||
Actual bet = f* × Fraction
|
||||
|
||||
Common fractions:
|
||||
- 1/2 Kelly: f* / 2
|
||||
- 1/3 Kelly: f* / 3
|
||||
- 1/4 Kelly: f* / 4
|
||||
```
|
||||
|
||||
**Recommendation:** Use 1/4 to 1/2 Kelly for most bets.
|
||||
|
||||
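**Why:** Cuts volatility (standard deviation of returns) by 50-75% and variance by roughly 75-94%, still captures roughly half to three-quarters of full-Kelly growth, and is more robust to model error.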
|
||||
|
||||
### Step 4: Consider bankroll constraints
|
||||
|
||||
**Practical considerations:**
|
||||
1. Define dedicated betting bankroll (money you can afford to lose)
|
||||
2. Minimum bet size (market minimums)
|
||||
3. Maximum bet size (market/liquidity limits)
|
||||
4. Round to practical amounts
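One way to combine fractional Kelly with these constraints is sketched below. Everything here is illustrative: `stake_size` and its parameters are hypothetical names, and rounding down to a multiple of `round_to` is an assumed convention chosen so the stake never exceeds the Kelly target.

```python
import math

# Hypothetical helper: parameter names and the round-down convention are assumptions.

def stake_size(bankroll: float, full_kelly: float, fraction: float = 0.5,
               min_bet: float = 1.0, max_bet=None, round_to: float = 5.0) -> float:
    """Fractional-Kelly stake, capped by limits and rounded down."""
    raw = bankroll * full_kelly * fraction
    if max_bet is not None:
        raw = min(raw, max_bet)                       # market or liquidity limit
    rounded = math.floor(raw / round_to) * round_to   # practical amount
    return rounded if rounded >= min_bet else 0.0     # below minimum: no bet


print(stake_size(2000, 0.223))
```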
|
||||
|
||||
### Step 5: Execute bet
|
||||
|
||||
**Final check:**
|
||||
- [ ] Confirmed edge > minimum threshold
|
||||
- [ ] Calculated Kelly size
|
||||
- [ ] Applied fractional Kelly (1/4 to 1/2)
|
||||
- [ ] Checked bankroll constraints
|
||||
- [ ] Verified odds haven't changed
|
||||
|
||||
**Place bet.**
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu)
|
||||
|
||||
---
|
||||
|
||||
## 3. Extremize Aggregated Forecasts
|
||||
|
||||
**Adjust crowd wisdom when aggregating multiple predictions.**
|
||||
|
||||
```
|
||||
Extremizing Progress:
|
||||
- [ ] Step 1: Understand why extremizing works
|
||||
- [ ] Step 2: Collect individual forecasts
|
||||
- [ ] Step 3: Calculate simple average
|
||||
- [ ] Step 4: Apply extremizing formula
|
||||
- [ ] Step 5: Validate and finalize
|
||||
```
|
||||
|
||||
### Step 1: Understand why extremizing works
|
||||
|
||||
**The Problem:** When you average forecasts, you get regression to 50%.
|
||||
|
||||
**The Research:** Good Judgment Project found aggregated forecasts are more accurate than individuals BUT systematically too moderate. Extremizing (pushing away from 50%) improves accuracy because multiple forecasters share common information, and simple averaging "overcounts" shared information.
|
||||
|
||||
### Step 2: Collect individual forecasts
|
||||
|
||||
Gather predictions from multiple sources. Check that the forecasts were made independently, that each forecaster used a sound process, and that they had similar information available.
|
||||
|
||||
### Step 3: Calculate simple average
|
||||
|
||||
```
|
||||
Average = Sum of forecasts / Number of forecasts
|
||||
```
|
||||
|
||||
### Step 4: Apply extremizing formula
|
||||
|
||||
```
|
||||
Extremized = 50% + (Average - 50%) × Factor
|
||||
|
||||
Where Factor typically ranges from 1.2 to 1.5
|
||||
```
|
||||
|
||||
**Example:**
|
||||
- Average: 77.6%
|
||||
- Factor: 1.3
|
||||
|
||||
```
|
||||
Extremized = 50% + (77.6% - 50%) × 1.3 = 85.88% ≈ 86%
|
||||
```
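The averaging and extremizing steps can be sketched together. This uses the linear formula given above; `extremize` is a hypothetical name, and the 1%/99% clamp is taken from the sanity checks in Step 5.

```python
# Hypothetical helper implementing the linear extremizing formula above.

def extremize(forecasts, factor: float = 1.3,
              floor: float = 0.01, cap: float = 0.99) -> float:
    """Average the forecasts, push away from 50%, then clamp to [floor, cap]."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    average = sum(forecasts) / len(forecasts)
    extremized = 0.5 + (average - 0.5) * factor
    return min(cap, max(floor, extremized))


print(round(extremize([0.776], factor=1.3), 4))
```

The clamp matters for strong consensus: an average of 96% with a factor of 1.5 would otherwise exceed 100%.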
|
||||
|
||||
**Choosing the Factor:**
|
||||
|
||||
| Situation | Factor | Reasoning |
|
||||
|-----------|--------|-----------|
|
||||
| Forecasters highly correlated | 1.1-1.2 | Weak extremizing |
|
||||
| Moderately independent | 1.3-1.4 | Moderate extremizing |
|
||||
| Very independent | 1.5+ | Strong extremizing |
|
||||
| High expertise | 1.4-1.6 | Trust the signal |
|
||||
|
||||
**Default: Use 1.3 if unsure.**
|
||||
|
||||
### Step 5: Validate and finalize
|
||||
|
||||
**Sanity checks:**
|
||||
1. **Bounded [0%, 100%]:** Cap at 99%/1% if needed
|
||||
2. **Reasonableness:** Does result "feel" right?
|
||||
3. **Compare to best individual:** Extremized should be close to best forecaster
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu)
|
||||
|
||||
---
|
||||
|
||||
## 4. Optimize Brier Score
|
||||
|
||||
**Improve forecast accuracy scoring.**
|
||||
|
||||
```
|
||||
Brier Score Optimization Progress:
|
||||
- [ ] Step 1: Understand Brier score formula
|
||||
- [ ] Step 2: Calculate your Brier score
|
||||
- [ ] Step 3: Decompose into calibration and resolution
|
||||
- [ ] Step 4: Identify improvement strategies
|
||||
- [ ] Step 5: Avoid gaming the metric
|
||||
```
|
||||
|
||||
### Step 1: Understand Brier score formula
|
||||
|
||||
```
|
||||
Brier Score = (1/N) × Σ(Probability - Outcome)²
|
||||
|
||||
Where:
|
||||
- Probability = Your forecast (0 to 1)
|
||||
- Outcome = Actual result (0 or 1)
|
||||
- N = Number of forecasts
|
||||
```
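A minimal sketch of the formula in Python, assuming binary outcomes coded as 0 or 1. The name `brier_score` is hypothetical.

```python
# Hypothetical helper: mean squared error between forecasts and 0/1 outcomes.

def brier_score(probs, outcomes) -> float:
    """Brier score = (1/N) * sum((p - o)^2). Lower is better."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


print(brier_score([0.5, 0.5], [1, 0]))
```

Forecasting 50% on every question scores 0.25 regardless of the outcomes.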
|
||||
|
||||
**Range:** 0 (perfect) to 1 (worst). **Lower is better.**
|
||||
|
||||
### Step 2: Calculate your Brier score
|
||||
|
||||
**Interpretation:**
|
||||
|
||||
| Brier Score | Quality |
|
||||
|-------------|---------|
|
||||
| < 0.10 | Excellent |
|
||||
| 0.10 - 0.15 | Good |
|
||||
| 0.15 - 0.20 | Average |
|
||||
| 0.20 - 0.25 | Below average |
|
||||
| > 0.25 | Poor |
|
||||
|
||||
**Baseline:** Random guessing (always 50%) gives Brier = 0.25
|
||||
|
||||
### Step 3: Decompose into calibration and resolution
|
||||
|
||||
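**Brier Score = Calibration Error - Resolution + Uncertainty** (the Murphy decomposition)

**Calibration Error (reliability):** Do your 70% predictions happen 70% of the time? (measures bias; lower is better)

**Resolution:** How far do the outcome frequencies in your forecast bins depart from the overall base rate? (measures discrimination; higher is better, which is why it is subtracted)

**Uncertainty:** Base rate × (1 - base rate). Set by the questions themselves, not by the forecaster.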
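A sketch of the standard Murphy decomposition, in which Brier = reliability (calibration error) - resolution + uncertainty. The function name is hypothetical, and grouping forecasts by their exact stated probability is a simplifying assumption; real trackers usually bin forecasts into ranges, in which case the identity holds only approximately.

```python
from collections import defaultdict

# Hypothetical helper: groups forecasts by exact probability value (an assumption).

def murphy_decomposition(probs, outcomes):
    """Return (reliability, resolution, uncertainty) for binary outcomes."""
    n = len(probs)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for p, o in zip(probs, outcomes):
        groups[p].append(o)
    reliability = sum(len(g) * (p - sum(g) / len(g)) ** 2
                      for p, g in groups.items()) / n
    resolution = sum(len(g) * (sum(g) / len(g) - base_rate) ** 2
                     for g in groups.values()) / n
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty


probs = [0.8] * 5 + [0.2] * 5
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
rel, res, unc = murphy_decomposition(probs, outcomes)
print(rel, res, unc)
```

In this toy data the forecaster is perfectly calibrated (reliability 0), so the whole score comes from uncertainty minus resolution.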
|
||||
|
||||
### Step 4: Identify improvement strategies
|
||||
|
||||
**Strategy 1: Fix Calibration**
|
||||
- If overconfident: Widen confidence intervals, be less extreme
|
||||
- If underconfident: Be more extreme when you have strong evidence
|
||||
- Tool: Calibration plot (X: predicted probability, Y: actual frequency)
|
||||
|
||||
**Strategy 2: Improve Resolution**
|
||||
- Avoid being stuck at 50%
|
||||
- Differentiate between easy and hard forecasts
|
||||
- Be bold when evidence is strong
|
||||
|
||||
**Strategy 3: Gather Better Information**
|
||||
- Do more research, use reference classes, decompose with Fermi, update with Bayes
|
||||
|
||||
### Step 5: Avoid gaming the metric
|
||||
|
||||
**Wrong approach:** "Never predict below 10% or above 90%" (gaming)
|
||||
|
||||
**Right approach:** Predict your TRUE belief. If that's 5%, say 5%. Accept that you'll occasionally get large Brier penalties. Over many forecasts, honesty wins.
|
||||
|
||||
**The rule:** Minimize Brier score by being **accurate**, not by being **safe**.
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu)
|
||||
|
||||
---
|
||||
|
||||
## 5. Hedge and Portfolio Betting
|
||||
|
||||
**Manage multiple bets and correlations.**
|
||||
|
||||
```
|
||||
Portfolio Betting Progress:
|
||||
- [ ] Step 1: Identify correlations between bets
|
||||
- [ ] Step 2: Calculate portfolio Kelly
|
||||
- [ ] Step 3: Assess hedging opportunities
|
||||
- [ ] Step 4: Optimize across all positions
|
||||
- [ ] Step 5: Monitor and rebalance
|
||||
```
|
||||
|
||||
### Step 1: Identify correlations between bets
|
||||
|
||||
**The problem:** If bets are correlated, true exposure is higher than sum of individual bets.
|
||||
|
||||
**Correlation examples:**
|
||||
- **Positive:** "Democrats win House" + "Democrats win Senate"
|
||||
- **Negative:** "Team A wins" + "Team B wins" (playing each other)
|
||||
- **Uncorrelated:** "Rain tomorrow" + "Bitcoin price doubles"
|
||||
|
||||
### Step 2: Calculate portfolio Kelly
|
||||
|
||||
**Simplified heuristic:**
|
||||
- If correlation > 0.5: Reduce each bet size by 30-50%
|
||||
- If correlation < -0.5: Can increase total exposure slightly (partial hedge)
|
||||
|
||||
### Step 3: Assess hedging opportunities
|
||||
|
||||
**When to hedge:**
|
||||
1. **Probability changed:** Your estimate moved and the original position no longer has edge
|
||||
2. **Lock in profit:** Event moved in your favor, odds improved
|
||||
3. **Reduce exposure:** Too much capital on one outcome
|
||||
|
||||
**Hedging example:**
|
||||
- Bet $100 on A at 60% (1.67 odds) → Payout: $167
|
||||
- Odds change: A now 70%, B now 30% (3.33 odds)
|
||||
- Hedge: Bet $50 on B at 3.33 → Payout if B wins: $167
|
||||
- **Result:** Guaranteed $17 profit regardless of outcome
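The hedge in this example can be computed directly: size the second bet so both outcomes pay the same amount. This is a sketch with a hypothetical function name, and it ignores fees.

```python
# Hypothetical helper: equal-payout hedge for a two-outcome market, fees ignored.

def hedge_stake(original_stake: float, original_odds: float, hedge_odds: float):
    """Return (hedge stake, locked-in profit) using decimal odds."""
    payout = original_stake * original_odds      # total return if the original bet wins
    hedge = payout / hedge_odds                  # stake whose return equals that payout
    profit = payout - original_stake - hedge     # same in either outcome
    return hedge, profit


hedge, profit = hedge_stake(100, 1.67, 3.33)
print(f"hedge ${hedge:.2f}, locked profit ${profit:.2f}")
```

A negative `profit` means the hedge locks in a loss, which happens when the odds have moved against the original position.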
|
||||
|
||||
### Step 4: Optimize across all positions
|
||||
|
||||
View portfolio holistically. Reduce correlated bets, maintain independence where possible.
|
||||
|
||||
### Step 5: Monitor and rebalance
|
||||
|
||||
**Weekly review:** Check if probabilities changed, assess hedging opportunities, rebalance if needed
|
||||
**After major news:** Update probabilities, consider hedging, recalculate Kelly sizes
|
||||
**Monthly audit:** Portfolio correlation check, bankroll adjustment, performance review
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu)
|
||||
|
||||
---
|
||||
|
||||
## 6. Learn the Framework
|
||||
|
||||
**Deep dive into the methodology.**
|
||||
|
||||
### Resource Files
|
||||
|
||||
📄 **[Betting Theory Fundamentals](resources/betting-theory.md)**
|
||||
- Expected value framework, variance and risk, bankroll management, market efficiency
|
||||
|
||||
📄 **[Kelly Criterion Deep Dive](resources/kelly-criterion.md)**
|
||||
- Mathematical derivation, proof of optimality, extensions and variations, common mistakes
|
||||
|
||||
📄 **[Scoring Rules and Calibration](resources/scoring-rules.md)**
|
||||
- Brier score deep dive, log score, calibration curves, resolution analysis, proper scoring rules
|
||||
|
||||
**Next:** Return to [menu](#interactive-menu)
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### The Market Mechanics Commandments
|
||||
|
||||
1. **Edge > Threshold** - Don't bet small edges (5%+ minimum)
|
||||
2. **Use Fractional Kelly** - Never full Kelly (use 1/4 to 1/2)
|
||||
3. **Extremize aggregates** - Push away from 50% when combining forecasts
|
||||
4. **Minimize Brier honestly** - Be accurate, not safe
|
||||
5. **Watch correlations** - Portfolio risk > sum of individual risks
|
||||
6. **Hedge strategically** - When probabilities change or lock profit
|
||||
7. **Track calibration** - Your 70% should happen 70% of the time
|
||||
|
||||
### One-Sentence Summary
|
||||
|
||||
> Convert beliefs into optimal decisions using edge calculation, Kelly sizing, extremizing, and proper scoring.
|
||||
|
||||
### Integration with Other Skills
|
||||
|
||||
- **Before:** Use after completing forecast (have probability, need action)
|
||||
- **Companion:** Works with `bayesian-reasoning-calibration` for probability updates
|
||||
- **Feeds into:** Portfolio management and adaptive betting strategies
|
||||
|
||||
---
|
||||
|
||||
## Resource Files
|
||||
|
||||
📁 **resources/**
|
||||
- [betting-theory.md](resources/betting-theory.md) - Fundamentals and framework
|
||||
- [kelly-criterion.md](resources/kelly-criterion.md) - Optimal bet sizing
|
||||
- [scoring-rules.md](resources/scoring-rules.md) - Calibration and accuracy measurement
|
||||
|
||||
---
|
||||
|
||||
**Ready to start? Choose a number from the [menu](#interactive-menu) above.**
|
||||
393
skills/market-mechanics-betting/resources/betting-theory.md
Normal file
@@ -0,0 +1,393 @@
|
||||
# Betting Theory Fundamentals
|
||||
|
||||
This resource explains the core theoretical foundations of rational betting, expected value, variance management, and market efficiency.
|
||||
|
||||
**Foundation for:** All betting and forecasting decisions
|
||||
|
||||
---
|
||||
|
||||
## Why Learn Betting Theory
|
||||
|
||||
**Core insight:** Betting theory separates decision quality from outcome quality. Make +EV decisions repeatedly and survive variance.
|
||||
|
||||
**Enables:**
|
||||
- Think probabilistically (convert beliefs to quantifiable edges)
|
||||
- Manage risk rationally (distinguish bad decisions from bad outcomes)
|
||||
- Avoid costly mistakes (identify predictable failure modes)
|
||||
- Optimize long-term growth (balance aggression with preservation)
|
||||
|
||||
**Research foundation:** Kelly (1956), Samuelson (1963), Thorp (1969), behavioral economics (Kahneman & Tversky), market efficiency (Fama).
|
||||
|
||||
---
|
||||
|
||||
## 1. Expected Value Framework
|
||||
|
||||
### Definition and Formula
|
||||
|
||||
**Expected Value (EV):** Probability-weighted average of all possible outcomes.
|
||||
|
||||
```
|
||||
EV = Σ(Probability × Outcome)
|
||||
|
||||
Binary bet:
|
||||
EV = (P_win × Amount_won) - (P_lose × Amount_lost)
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Bet $100 on 60% event at even odds (+100)
|
||||
EV = (0.60 × $100) - (0.40 × $100) = $20
|
||||
EV% = +20% per $100 wagered
|
||||
```
|
||||
|
||||
### Positive vs Negative EV
|
||||
|
||||
**Decision Framework:**
|
||||
- **EV > +5%:** Strong bet (after fees/uncertainty)
|
||||
- **EV = 0% to +5%:** Marginal (consider passing)
|
||||
- **EV < 0%:** Never bet (unless hedging)
|
||||
|
||||
**Critical Rule:** Judge decisions by EV, not outcomes. Good decisions lose sometimes; bad decisions win sometimes. Process matters in small samples, results matter over 100+ trials.
|
||||
|
||||
### Converting Market Odds to EV
|
||||
|
||||
**Step 1: Implied probability**
|
||||
```
|
||||
Decimal odds: P = 1 / Odds
|
||||
Example: 1.67 → 60%
|
||||
|
||||
American (+): P = 100 / (Odds + 100)
|
||||
Example: +150 → 40%
|
||||
|
||||
American (-): P = |Odds| / (|Odds| + 100)
|
||||
Example: -150 → 60%
|
||||
```
|
||||
|
||||
**Step 2: Calculate edge**
|
||||
```
|
||||
Your probability: 70%
|
||||
Market probability: 60%
|
||||
Edge = 70% - 60% = +10%
|
||||
```
|
||||
|
||||
**Step 3: Calculate EV**
|
||||
```
|
||||
Bet $100 at 1.67 odds:
|
||||
EV = (0.70 × $67) - (0.30 × $100) = +$16.90 = +16.9%
|
||||
```
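The three steps reduce to one short function. This is a sketch with a hypothetical name; it assumes a binary bet quoted in decimal odds and ignores fees.

```python
# Hypothetical helper: EV of a binary bet at decimal odds, fees ignored.

def expected_value(p_win: float, decimal_odds: float, stake: float = 100.0) -> float:
    """EV = p_win * amount won - p_lose * amount lost."""
    amount_won = stake * (decimal_odds - 1.0)
    return p_win * amount_won - (1.0 - p_win) * stake


print(round(expected_value(0.70, 1.67), 2))
```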
|
||||
|
||||
### Law of Large Numbers
|
||||
|
||||
**Key principle:** Observed frequency converges to true probability as sample size increases.
|
||||
|
||||
**Practical thresholds:**
|
||||
- 10 bets: High variance, might be down despite +EV
|
||||
- 100 bets: Convergence starting, likely near EV
|
||||
- 1000 bets: Results tightly centered around EV
|
||||
|
||||
**Application:** Don't judge strategy on <30 trials. Variance dominates small samples.
|
||||
|
||||
---
|
||||
|
||||
## 2. Variance and Risk
|
||||
|
||||
### Standard Deviation
|
||||
|
||||
**Measures outcome dispersion around EV.**
|
||||
|
||||
**Formula:**
|
||||
```
|
||||
σ = √(P_win×(Win-EV)² + P_lose×(Loss-EV)²)
|
||||
```
|
||||
|
||||
**Example ($100 bet, 60% win, even odds):**
|
||||
```
|
||||
EV = $20
|
||||
σ = √(0.60×(100-20)² + 0.40×(-100-20)²)
|
||||
σ = √9600 = $98
|
||||
|
||||
Coefficient of Variation: σ/EV = $98/$20 = 4.9
|
||||
```
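A sketch of the same calculation in Python, useful for checking other bets. The name `bet_stats` is hypothetical; `win_amount` and `loss_amount` are both given as positive dollar amounts.

```python
import math

# Hypothetical helper: EV, standard deviation and coefficient of variation of one bet.

def bet_stats(p_win: float, win_amount: float, loss_amount: float):
    """Return (EV, sigma, CV) for a binary bet."""
    p_lose = 1.0 - p_win
    ev = p_win * win_amount - p_lose * loss_amount
    variance = p_win * (win_amount - ev) ** 2 + p_lose * (-loss_amount - ev) ** 2
    sigma = math.sqrt(variance)
    cv = sigma / ev if ev != 0 else math.inf   # CV is undefined at zero EV
    return ev, sigma, cv


ev, sigma, cv = bet_stats(0.60, 100, 100)
print(f"EV=${ev:.0f}, sigma=${sigma:.0f}, CV={cv:.1f}")
```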
|
||||
|
||||
**Interpretation:** Standard deviation ($98) is 5× the EV ($20). Variance dominates signal.
|
||||
|
||||
### Volatility Categories
|
||||
|
||||
**Coefficient of Variation (CV = σ/EV):**
|
||||
- CV < 1: Low volatility (10-30 trials to see EV)
|
||||
- CV = 1-3: Moderate (30-50 trials)
|
||||
- CV = 3-10: High (50-100 trials)
|
||||
- CV > 10: Extreme (100+ trials)
|
||||
|
||||
**Higher CV requires:** Larger bankroll, more patience, stronger discipline.
|
||||
|
||||
### Risk of Ruin
|
||||
|
||||
**Probability of losing entire bankroll before profit.**
|
||||
|
||||
**Practical Guidelines:**
|
||||
|
||||
| Bet Size | Risk of Ruin | Assessment |
|
||||
|----------|--------------|------------|
|
||||
| 50% of bankroll | ~40% | Reckless |
|
||||
| 25% of bankroll | ~20% | Aggressive |
|
||||
| 10% of bankroll | ~5% | Moderate |
|
||||
| 5% of bankroll | ~1% | Conservative |
|
||||
| 2% of bankroll | ~0.1% | Very conservative |
|
||||
|
||||
**Kelly Criterion naturally manages risk of ruin. Never bet >10% of bankroll on single bet.**
|
||||
|
||||
### Managing Volatility
|
||||
|
||||
**1. Fractional Kelly (Primary Tool):**
|
||||
- Full Kelly: 100% variance, 40%+ drawdowns
|
||||
- Half Kelly: 25% variance, ~20% drawdowns
|
||||
- Quarter Kelly: 6% variance, ~10% drawdowns
|
||||
|
||||
**2. Diversification:**
|
||||
- Multiple uncorrelated +EV bets
|
||||
- Requires independence (correlation < 0.3)
|
||||
|
||||
**3. Expected Drawdown:**
|
||||
- Even optimal betting experiences 20-40% drawdowns
|
||||
- Mentally prepare for temporary losses
|
||||
- Don't confuse drawdown with -EV strategy
|
||||
|
||||
---
|
||||
|
||||
## 3. Bankroll Management
|
||||
|
||||
### Defining Your Bankroll
|
||||
|
||||
**Valid:** Money you can afford to lose entirely, separate from emergency fund, investment portfolio, daily expenses. **Starting:** $500-$5000 recreational, $10,000+ serious.
|
||||
|
||||
**NOT valid:** Money needed for bills, emergency fund, retirement, money you'd be devastated to lose.
|
||||
|
||||
### Separation Principle
|
||||
|
||||
**Why:** Prevents scared money and revenge betting. Clear accounting, tax clarity, risk containment.
|
||||
|
||||
**Implementation:** Separate betting account, never add money mid-downswing, withdraw profits periodically, stop if bankroll → $0.
|
||||
|
||||
### Growth vs Preservation
|
||||
|
||||
**Preservation (Default):** 1/4 to 1/2 Kelly, for most bettors and bankrolls <$5000
|
||||
**Growth (Advanced):** 1/2 to full Kelly, for large bankrolls and high variance tolerance (requires 2+ years track record)
|
||||
|
||||
### Dynamic Sizing
|
||||
|
||||
Bet size scales with bankroll. Example: $1000 bankroll at 5% = $50. After wins → $1500 → bet $75. After losses → $600 → bet $30.
|
||||
|
||||
**Recalculate:** Daily if >20% change, weekly (active), monthly (casual).
|
||||
|
||||
### Withdrawal Strategy
|
||||
|
||||
**Recommended:** When bankroll doubles, withdraw original amount, continue with profit (break-even if lose profit).
|
||||
**Conservative:** 50% profit monthly. **Aggressive:** Never withdraw (full compounding).
|
||||
|
||||
---
|
||||
|
||||
## 4. Market Efficiency
|
||||
|
||||
### Efficient Market Hypothesis
|
||||
|
||||
**Core claim:** Prices reflect all available information. **Reality:** Semi-strong efficient in liquid, mature markets.
|
||||
|
||||
**Market knows:** Published polls/news, historical base rates, expert commentary, obvious statistical patterns.
|
||||
|
||||
### Where Edges Exist
|
||||
|
||||
**1. Information Asymmetry:** Local knowledge, domain expertise
|
||||
**2. Model Superiority:** Better statistical model, proper extremizing
|
||||
**3. Lower Transaction Costs:** Market 5% fee vs your 0-1%
|
||||
**4. Behavioral Biases:** Recency bias, base rate neglect, narrative following
|
||||
**5. Market Immaturity:** Low liquidity, niche topics, few informed traders
|
||||
|
||||
**Before betting, ask:** "What information or model do I have that the market doesn't?"
|
||||
- Nothing → Pass | Vague → Pass | Specific → Investigate
|
||||
|
||||
### Trust vs Question Market
|
||||
|
||||
**Trust:** Liquid, mature, objective outcome, many informed participants, low emotion
|
||||
**Question:** Illiquid, new, subjective outcome, few informed participants, high emotion (politics, fandom)
|
||||
|
||||
---
|
||||
|
||||
## 5. Common Betting Mistakes
|
||||
|
||||
### Chasing Losses
|
||||
**What:** Increasing bet size after losses. **Why:** Loss aversion, emotional arousal.
|
||||
**Fix:** Never increase bet size after loss, use bankroll %, take break after 2+ losses.
|
||||
|
||||
### Tilt (Emotional Betting)
|
||||
**Triggers:** Bad beat, streaks, external stress. **Symptoms:** No analysis, ignoring Kelly, revenge betting.
|
||||
**Fix:** Pre-commit no bets when tilted. Checklist: Calm? Calculate EV? Kelly sizing? Betting for +EV not revenge?
|
||||
|
||||
### Overconfidence Bias
|
||||
**What:** Overestimating probability accuracy (90% when true is 70%).
|
||||
**Fix:** Track calibration, log predictions + outcomes, calculate curve quarterly. Do 70% predictions happen 70% of the time?
|
||||
|
||||
### Ignoring Variance
|
||||
**What:** Judging strategy on <30 trials. Example: "Down 15% after 20 bets, strategy sucks" (normal variance).
|
||||
**Fix:** Require 50+ bets minimum, 100+ preferred, 200+ for high confidence.
|
||||
|
||||
### Outcome Bias
|
||||
**What:** Judging by results not process. +15% EV lost = good decision (bad outcome). -10% EV won = bad decision (lucky).
|
||||
**Fix:** Checklist: EV correct? Edge > threshold? Kelly fraction? Followed system? YES = good decision regardless of outcome.
|
||||
|
||||
### Hindsight Bias
|
||||
**What:** After outcome, "I knew it would happen."
|
||||
**Fix:** Pre-commit logging, write probability before event, don't revise after, accept that 40% events happen 40% of the time.
|
||||
|
||||
---
|
||||
|
||||
## 6. Integration with Kelly Criterion
|
||||
|
||||
### EV Drives Kelly
|
||||
|
||||
**Kelly derives from:** Expected value (edge), odds received, bankroll optimization (maximize log wealth).
|
||||
|
||||
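**Key relationship:** `f* = (bp - q) / b`. Edge drives bet size, scaled by the odds: at a market probability m with fair odds, `f* = (p - m) / (1 - m)`. Examples: 10% edge at a 60% market → 25% Kelly, 5% edge at a 50% market → 10% Kelly, 0% edge → 0% bet.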
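The link between edge and Kelly fraction can be made explicit. The sketch below assumes the bet pays the fair odds implied by a market probability $m$ (no fees), with $p$ your own probability; $m$ is notation introduced here for illustration.

```latex
% Assumption: fair decimal odds 1/m, so net odds b = (1 - m)/m; fees ignored.
\begin{aligned}
b &= \frac{1}{m} - 1 = \frac{1 - m}{m}, \qquad q = 1 - p \\
f^* &= \frac{bp - q}{b} = p - \frac{(1 - p)\,m}{1 - m}
     = \frac{p(1 - m) - (1 - p)\,m}{1 - m}
     = \frac{p - m}{1 - m}
\end{aligned}
```

With $p = 0.70$ and $m = 0.60$ this gives $f^* = 0.10 / 0.40 = 25\%$, so the Kelly fraction equals the edge divided by the market's implied probability of losing.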
|
||||
|
||||
### Variance Tolerance
|
||||
|
||||
| Fraction | Variance | Growth | Drawdown |
|
||||
|----------|----------|--------|----------|
|
||||
| Full (1.0) | 100% | 100% | ~40% |
|
||||
| Half (0.5) | 25% | 75% | ~20% |
|
||||
| Quarter (0.25) | 6% | ~44% | ~10% |
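
The variance and growth columns follow from a standard small-edge approximation, sketched here. The symbols $\mu$ (expected excess return per unit staked), $\sigma^2$ (its variance) and $c$ (the Kelly fraction used) are notation introduced for illustration; the drawdown column is not derived here.

```latex
% Assumption: quadratic approximation of the log-growth rate for small edges.
\begin{aligned}
g(f) &\approx \mu f - \tfrac{1}{2}\sigma^2 f^2
  \quad\Rightarrow\quad f^* = \frac{\mu}{\sigma^2}, \qquad
  g(f^*) = \frac{\mu^2}{2\sigma^2} \\
g(c f^*) &\approx (2c - c^2)\, g(f^*), \qquad
  \operatorname{Var}(c f^*) = c^2 \operatorname{Var}(f^*)
\end{aligned}
```

For $c = 1/2$ this gives 75% of the growth at 25% of the variance; for $c = 1/4$ it gives about 44% of the growth at about 6% of the variance.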
|
||||
|
||||
### Bankruptcy Protection
|
||||
|
||||
Kelly never bets 100%: prevents ruin, keeps capital for next bet, scales down as bankroll shrinks. **Practical:** Stop if bankroll drops 80-90%.
|
||||
|
||||
---
|
||||
|
||||
## 7. Practical Examples for Forecasters
|
||||
|
||||
### Example 1: Election Prediction Market
|
||||
|
||||
**Scenario:** Market 55%, your forecast 65%, bankroll $2000
|
||||
|
||||
**Step 1: Edge**
|
||||
```
|
||||
Edge = 65% - 55% = +10%
|
||||
Threshold: 5%
|
||||
Decision: +10% > 5% → Proceed
|
||||
```
|
||||
|
||||
**Step 2: EV**
|
||||
```
|
||||
Bet $100 at 1.82 odds → Win $82
|
||||
EV = (0.65 × $82) - (0.35 × $100) = +$18.30 = +18.3%
|
||||
```
|
||||
|
||||
**Step 3: Kelly**
|
||||
```
|
||||
Full Kelly: 22.3%
|
||||
Half Kelly: 11.2%
|
||||
Bet: $2000 × 11.2% = $224
|
||||
```
|
||||
|
||||
### Example 2: Brier Score Tracking
|
||||
|
||||
**50 forecasts, goal: Brier < 0.15**
|
||||
|
||||
| Forecast | Your P | Outcome | (P-O)² |
|
||||
|----------|--------|---------|--------|
|
||||
| Event A | 80% | YES (1) | 0.04 |
|
||||
| Event B | 30% | NO (0) | 0.09 |
|
||||
| Event C | 90% | YES (1) | 0.01 |
|
||||
| Event D | 60% | NO (0) | 0.36 |
|
||||
| Event E | 70% | YES (1) | 0.09 |
|
||||
|
||||
**Brier (5 forecasts shown):** 0.59 / 5 = 0.118 (Good)
|
||||
|
||||
**Analysis:** Event D's large error is normal: a 60% forecast misses 40% of the time. Don't game the metric by avoiding 60% predictions.
|
||||
|
||||
### Example 3: Extremizing
|
||||
|
||||
**Forecasts:** You 72%, A 68%, B 75%, C 70%, Market 71%
|
||||
**Average:** 71.2%
|
||||
|
||||
**Extremize:**
|
||||
```
|
||||
Factor: 1.3 (moderate)
|
||||
Extremized = 50% + (71.2% - 50%) × 1.3 = 77.6% ≈ 78%
|
||||
|
||||
Edge: 78% - 71% = +7%
|
||||
Full Kelly = (78% - 71%) / (100% - 71%) ≈ 24.1%
Half Kelly ≈ 12.1% of $5000 ≈ $603 bet
|
||||
```
|
||||
|
||||
### Example 4: Correlated Portfolio
|
||||
|
||||
**Scenario:** Democrats House (60% yours, 55% market) + Senate (55% yours, 50% market)
|
||||
**Correlation:** 0.7 (high)
|
||||
|
||||
**Naive (WRONG):**
|
||||
```
|
||||
Bet A: 5% × $10k = $500
|
||||
Bet B: 5% × $10k = $500
|
||||
Total: $1000 (10%)
|
||||
```
|
||||
|
||||
**Correct:**
|
||||
```
|
||||
Adjust for correlation: 1 - (0.7 × 0.5) = 0.65
|
||||
Bet A: $500 × 0.65 = $325
|
||||
Bet B: $500 × 0.65 = $325
|
||||
Total: $650 (6.5%)
|
||||
```
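The adjustment above can be sketched as a function. Note that this is the heuristic used in this example, not a full multivariate Kelly optimization; the name `correlation_adjusted_stakes` and the `haircut` parameter (the 0.5 multiplier in the formula) are hypothetical, and leaving negatively correlated bets unscaled is an assumption.

```python
# Hypothetical helper implementing the heuristic 1 - (correlation * 0.5) scaling.

def correlation_adjusted_stakes(stakes, correlation: float, haircut: float = 0.5):
    """Scale each stake down in proportion to positive correlation."""
    scale = 1.0 - max(0.0, correlation) * haircut   # negative correlation: no reduction
    return [s * scale for s in stakes]


print(correlation_adjusted_stakes([500, 500], correlation=0.7))
```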
|
||||
|
||||
**Reasoning:** Positive correlation amplifies risk. Reduce sizing to maintain tolerance.
|
||||
|
||||
---
|
||||
|
||||
## Key Takeaways
|
||||
|
||||
### The 10 Commandments
|
||||
|
||||
1. **Expected Value is King** - Judge decisions by EV, not outcomes
|
||||
2. **Variance is Inevitable** - Embrace it; don't fight it
|
||||
3. **Bankroll is Sacred** - Protect it above all else
|
||||
4. **Kelly is Your Guide** - But use fractional (1/4 to 1/2)
|
||||
5. **Market is Usually Right** - You need edge to beat it
|
||||
6. **Discipline Over Impulse** - System beats emotion
|
||||
7. **Sample Size Matters** - 50+ bets before judgment
|
||||
8. **Calibration is Honesty** - Track it religiously
|
||||
9. **Correlations Kill** - Adjust for portfolio risk
|
||||
10. **Survival Enables Profit** - Can't win if bankrupt
|
||||
|
||||
### Mental Models
|
||||
|
||||
**Betting = Business**
|
||||
- Bankroll = Working capital
|
||||
- EV = Profit margin
|
||||
- Variance = Market volatility
|
||||
- Kelly = Capital allocation
|
||||
|
||||
**Decision Quality ≠ Outcome Quality**
|
||||
- Good decisions lose sometimes (variance)
|
||||
- Bad decisions win sometimes (luck)
|
||||
- Process > Results (small samples)
|
||||
- Results > Process (large samples 100+)
|
||||
|
||||
### Integration Workflow
|
||||
|
||||
**Before betting:**
|
||||
1. Make forecast (Bayesian, reference class)
|
||||
2. Calculate edge vs market
|
||||
3. Check edge > threshold (5%+)
|
||||
4. Use Kelly for sizing
|
||||
5. Execute and log
|
||||
|
||||
**After betting:**
|
||||
1. Track outcome
|
||||
2. Update calibration
|
||||
3. Calculate Brier score
|
||||
4. Don't judge single bet
|
||||
5. Evaluate after 50+ bets
|
||||
|
||||
---
|
||||
|
||||
**Return to:** [Main Skill](../SKILL.md#interactive-menu)
|
||||
494
skills/market-mechanics-betting/resources/kelly-criterion.md
Normal file
@@ -0,0 +1,494 @@
|
||||
# Kelly Criterion Deep Dive
|
||||
|
||||
Mathematical foundation for optimal bet sizing under uncertainty.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Mathematical Derivation](#1-mathematical-derivation)
|
||||
2. [Formula Variations](#2-formula-variations)
|
||||
3. [Fractional Kelly](#3-fractional-kelly)
|
||||
4. [Extensions](#4-extensions)
|
||||
5. [Common Mistakes](#5-common-mistakes)
|
||||
6. [Practical Implementation](#6-practical-implementation)
|
||||
7. [Historical Examples](#7-historical-examples)
|
||||
8. [Comparison to Other Methods](#8-comparison-to-other-methods)
|
||||
|
||||
---
|
||||
|
||||
## 1. Mathematical Derivation
|
||||
|
||||
### The Core Question
|
||||
|
||||
**Problem**: What fraction of your bankroll maximizes long-term growth?
|
||||
|
||||
**Why it matters**: Bet too little → Leave money on the table. Bet too much → Risk ruin, high variance.
|
||||
|
||||
### Logarithmic Utility Framework
|
||||
|
||||
**Key insight**: Maximize expected logarithm of wealth, not expected wealth.
|
||||
|
||||
**Why log utility?**
|
||||
- Captures diminishing marginal utility ($1 matters more when you have $100 than when you have $1M)
|
||||
- Makes repeated multiplicative bets additive: log(AB) = log(A) + log(B)
|
||||
- Geometric mean emerges naturally (what matters for repeated bets)
|
||||
- Prevents betting 100% (avoids ruin)
|
||||
|
||||
### Derivation for Binary Bet
|
||||
|
||||
**Setup**:
|
||||
- Current bankroll: W
|
||||
- Bet fraction: f
|
||||
- Win probability: p, Loss probability: q = 1 - p
|
||||
- Net odds: b (bet $1, win $b net)
|
||||
|
||||
**Outcomes**:
|
||||
- Win (probability p): New wealth = W(1 + fb)
|
||||
- Lose (probability q): New wealth = W(1 - f)
|
||||
|
||||
**Expected log utility**:
|
||||
```
|
||||
E[log(W_new)] = p × log(1 + fb) + q × log(1 - f) + log(W)
|
||||
```
|
||||
|
||||
**Objective**: Maximize g(f) = p × log(1 + fb) + q × log(1 - f)
|
||||
|
||||
### Finding the Optimum
|
||||
|
||||
**Take derivative**:
|
||||
```
|
||||
dg/df = pb/(1 + fb) - q/(1 - f)
|
||||
```
|
||||
|
||||
**Set equal to zero and solve**:
|
||||
```
|
||||
pb/(1 + fb) = q/(1 - f)
|
||||
pb(1 - f) = q(1 + fb)
|
||||
pb - pbf = q + qfb
|
||||
pb - q = f(pb + qb) = fb(p + q) = fb
|
||||
|
||||
f* = (pb - q) / b = (bp - q) / b
|
||||
```
|
||||
|
||||
**The Kelly Criterion**:
|
||||
```
|
||||
f* = (bp - q) / b
|
||||
|
||||
Where:
|
||||
f* = Optimal fraction to bet
|
||||
b = Net odds received
|
||||
p = Win probability
|
||||
q = 1 - p
|
||||
```
|
||||
|
||||
### Alternative Form
|
||||
|
||||
**Edge** = Expected return per dollar bet = bp - q
|
||||
|
||||
**Kelly formula**: f* = Edge / Odds = (bp - q) / b
|
||||
|
||||
**Example**: p = 60%, b = 1.0 (even money)
|
||||
- Edge = 0.6 × 1 - 0.4 = 0.2
|
||||
- f* = 0.2 / 1 = 20%
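A minimal Python sketch of the formula above. The function name `kelly_fraction` is an illustrative assumption, not part of any library:

```python
def kelly_fraction(p: float, b: float) -> float:
    """Full-Kelly fraction f* = (b*p - q) / b for a binary bet.

    p: win probability (0..1)
    b: net odds (profit per $1 staked)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if b <= 0:
        raise ValueError("net odds b must be positive")
    q = 1.0 - p
    return (b * p - q) / b


# Even-money bet with a 60% win probability, as in the example above.
print(round(kelly_fraction(0.60, 1.0), 4))  # 0.2
```

A negative result means the bet has negative edge at those odds, so the Kelly answer is to pass.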
|
||||
|
||||
### Optimality
|
||||
|
||||
**Second derivative**: d²g/df² < 0 at f = f* → Maximum confirmed
|
||||
|
||||
**Growth rate**: G(f*) maximizes long-run geometric growth
|
||||
|
||||
**Comparison**:
|
||||
- f < f*: Lower growth (too conservative)
|
||||
- f > f*: Lower growth (too aggressive, variance dominates)
|
||||
- f > 2f*: Negative growth (eventual ruin)
|
||||
|
||||
---
|
||||
|
||||
## 2. Formula Variations
|
||||
|
||||
### Converting Market Odds
|
||||
|
||||
**Decimal odds** (e.g., 2.50): b = Decimal - 1 = 1.50
|
||||
|
||||
**American odds**:
|
||||
- Positive (+150): b = 150/100 = 1.50
|
||||
- Negative (-150): b = 100/150 = 0.667
|
||||
|
||||
**Fractional odds** (3/1): b = 3.0
|
||||
|
||||
**Implied probability**: Market p = 1/(b + 1)
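The conversions above can be sketched as small helpers. All function names here are hypothetical, chosen only for illustration:

```python
def net_odds_from_decimal(decimal_odds: float) -> float:
    """Decimal odds include the returned stake, so subtract 1."""
    return decimal_odds - 1.0


def net_odds_from_american(american_odds: float) -> float:
    """+150 means win 150 on 100 staked; -150 means stake 150 to win 100."""
    if american_odds > 0:
        return american_odds / 100.0
    if american_odds < 0:
        return 100.0 / abs(american_odds)
    raise ValueError("American odds cannot be zero")


def net_odds_from_fractional(numerator: float, denominator: float) -> float:
    """Fractional odds 3/1 already express net profit per unit staked."""
    return numerator / denominator


def implied_probability(net_odds: float) -> float:
    """Break-even win probability at net odds b: 1 / (b + 1)."""
    return 1.0 / (net_odds + 1.0)


print(net_odds_from_decimal(2.50))             # 1.5
print(net_odds_from_american(150))             # 1.5
print(round(net_odds_from_american(-150), 3))  # 0.667
print(implied_probability(1.5))                # 0.4
```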
|
||||
|
||||
### Multi-Outcome Bet
|
||||
|
||||
**Horse race**: Multiple options, bet on any with positive Kelly
|
||||
|
||||
**Formula for outcome i**:
|
||||
```
|
||||
f_i* = (p_i(b_i + 1) - 1) / b_i
|
||||
|
||||
If f_i* > 0: Bet f_i* on outcome i
|
||||
If f_i* ≤ 0: Don't bet
|
||||
```
|
||||
|
||||
### Continuous Outcomes (Merton's Formula)
|
||||
|
||||
**Stock market application**:
|
||||
```
|
||||
f* = μ / σ²
|
||||
|
||||
Where:
|
||||
μ = Expected excess return over the risk-free rate (drift)
|
||||
σ² = Variance of returns
|
||||
```
|
||||
|
||||
**Example**: μ = 8%, σ = 20% → f* = 0.08/0.04 = 2.0 (200%, use leverage)
|
||||
|
||||
**Reality**: Full Kelly is too aggressive here because μ and σ are only estimates; with fractional Kelly, a 50-100% allocation is more reasonable
|
||||
|
||||
---
|
||||
|
||||
## 3. Fractional Kelly
|
||||
|
||||
### Why Fractional Kelly?
|
||||
|
||||
**Problems with full Kelly**:
|
||||
1. **Extreme volatility**: Wild swings, can lose 50%+ in bad runs
|
||||
2. **Model error**: If probability estimate wrong, full Kelly overbets dramatically
|
||||
3. **Practical ruin**: Roughly a 1-in-3 chance of halving the bankroll before doubling it (continuous approximation)
|
||||
4. **Finite horizon**: Kelly's growth guarantee is asymptotic, and most bettors never place enough bets for it to take hold
|
||||
|
||||
### Formula
|
||||
|
||||
```
|
||||
Fractional Kelly = f* × Fraction
|
||||
|
||||
Common choices:
|
||||
- Half Kelly: f*/2
|
||||
- Quarter Kelly: f*/4 (recommended)
|
||||
- Third Kelly: f*/3
|
||||
```
|
||||
|
||||
### Growth vs. Variance Trade-off
|
||||
|
||||
| Strategy | Growth Rate | Volatility | Max Drawdown |
|
||||
|----------|-------------|------------|--------------|
|
||||
| Full Kelly | 100% | 100% | -50% |
|
||||
| Half Kelly | ~75% | 50% | -25% |
|
||||
| Quarter Kelly | ~44% | 25% | -12% |
|
||||
|
||||
**Key**: Half Kelly gives 75% of growth with 25% of variance → Better risk-adjusted return
|
||||
|
||||
### Robustness to Error
|
||||
|
||||
**Example**: You think p = 0.60, true p = 0.55, even money bet
|
||||
|
||||
**Full Kelly** (f = 20%):
|
||||
- Growth rate = 0.55×log(1.20) + 0.45×log(0.80) ≈ 0 (breakeven!)
|
||||
|
||||
**Half Kelly** (f = 10%):
|
||||
- Growth rate = 0.55×log(1.10) + 0.45×log(0.90) ≈ 0.005 (still positive)
|
||||
|
||||
**Lesson**: Overbetting much worse than underbetting. Fractional Kelly provides buffer.
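The comparison above can be reproduced with a short growth-rate calculation. This is a sketch using natural logs; `growth_rate` is an illustrative name:

```python
import math


def growth_rate(f: float, p: float, b: float = 1.0) -> float:
    """Expected log growth per bet: g(f) = p*ln(1 + f*b) + (1 - p)*ln(1 - f)."""
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)


true_p = 0.55  # the real win probability, lower than the 0.60 estimate
for label, f in [("full Kelly", 0.20), ("half Kelly", 0.10)]:
    print(f"{label}: g = {growth_rate(f, true_p):+.4f}")
# full Kelly: g = -0.0001
# half Kelly: g = +0.0050
```

With the true probability at 0.55, the correct Kelly fraction is 10%, so the "full Kelly" stake of 20% is really 2×f*, which is why its growth rate collapses to about zero.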
|
||||
|
||||
### Recommended Fractions
|
||||
|
||||
| Situation | Fraction | Reasoning |
|
||||
|-----------|----------|-----------|
|
||||
| Professional gambler | 1/4 to 1/3 | Reduces career risk |
|
||||
| High model uncertainty | 1/4 or less | Error buffer crucial |
|
||||
| High confidence | 1/2 to 2/3 | Can use more aggression |
|
||||
| Institutional | 1/4 to 1/3 | Drawdown = career risk |
|
||||
|
||||
**Default**: Quarter Kelly (1/4) for most real-world situations
|
||||
|
||||
---
|
||||
|
||||
## 4. Extensions
|
||||
|
||||
### Multiple Simultaneous Bets
|
||||
|
||||
**Matrix form** (N assets):
|
||||
```
|
||||
f* = Σ⁻¹ × μ
|
||||
|
||||
Where:
|
||||
Σ = Covariance matrix
|
||||
μ = Expected returns vector
|
||||
```
|
||||
|
||||
**Key insight**: Correlated bets reduce optimal sizing
|
||||
|
||||
**Heuristic**: Adjusted Kelly = Individual Kelly × (1 - ρ/2), where ρ = correlation
|
||||
|
||||
**Example**: ρ = 0.6, Individual Kelly = 15%
|
||||
- Adjusted: 15% × (1 - 0.3) = 10.5%
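For two assets, the matrix form f* = Σ⁻¹ × μ can be solved by hand, which shows how correlation shrinks each position. The sketch below avoids external libraries; the function name and the example inputs are assumptions for illustration only:

```python
def two_asset_kelly(mu1, mu2, sigma1, sigma2, rho):
    """Solve f = inverse(Sigma) * mu for two assets.

    mu1, mu2: expected excess returns
    sigma1, sigma2: return standard deviations
    rho: correlation between the two assets (must satisfy |rho| < 1)
    """
    var1, var2 = sigma1 ** 2, sigma2 ** 2
    cov = rho * sigma1 * sigma2
    det = var1 * var2 - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular (|rho| must be < 1)")
    # Inverse of [[var1, cov], [cov, var2]] is [[var2, -cov], [-cov, var1]] / det
    f1 = (var2 * mu1 - cov * mu2) / det
    f2 = (var1 * mu2 - cov * mu1) / det
    return f1, f2


for rho in (0.0, 0.6):
    f1, f2 = two_asset_kelly(0.08, 0.08, 0.20, 0.20, rho)
    print(f"rho={rho}: f1={f1:.2f}, f2={f2:.2f}")
```

With identical assets, each allocation works out to μ / (σ²(1 + ρ)), so higher positive correlation means smaller positions in both.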
|
||||
|
||||
### Correlated Outcomes
|
||||
|
||||
**Common correlations**:
|
||||
- Political: Presidential + Senate races
|
||||
- Sports: Team championship + Player MVP
|
||||
- Markets: Tech stock A + Tech stock B
|
||||
|
||||
**Extreme cases**:
|
||||
- ρ = 1 (perfect correlation): Only bet on one
|
||||
- ρ = -1 (negative correlation): Bets hedge, can bet more
|
||||
- ρ = 0 (independent): No adjustment
|
||||
|
||||
### Dynamic Kelly
|
||||
|
||||
**Problem**: Probability changes over time (new information)
|
||||
|
||||
**Process**:
|
||||
1. Start with p₀, bet f₀*
|
||||
2. New information → Update to p₁ (Bayesian)
|
||||
3. Recalculate f₁*
|
||||
4. Rebalance (adjust bet size)
|
||||
|
||||
**Consideration**: Transaction costs limit rebalancing frequency
|
||||
|
||||
---
|
||||
|
||||
## 5. Common Mistakes
|
||||
|
||||
### Mistake 1: Full Kelly Overbet
|
||||
|
||||
**The error**: Using full Kelly in practice
|
||||
|
||||
**Why wrong**: Assumes perfect probability estimate (never true)
|
||||
|
||||
**Impact**: If your estimate is off and you end up betting 2×f*, growth falls to roughly zero; beyond 2×f* it turns negative
|
||||
|
||||
**Fix**: Always use fractional Kelly (1/4 to 1/2)
|
||||
|
||||
### Mistake 2: Ignoring Model Error
|
||||
|
||||
**The error**: Treating probability as certain
|
||||
|
||||
**Adjustment**:
|
||||
```
|
||||
Uncertain Kelly = f* × Confidence
|
||||
|
||||
Example: f* = 20%, 80% confident → Bet 16%
|
||||
```
|
||||
|
||||
**Better**: Use fractional Kelly (implicitly adjusts)
|
||||
|
||||
### Mistake 3: Neglecting Bankruptcy
|
||||
|
||||
**Reality**: Finite games + estimation error → real ruin risk
|
||||
|
||||
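**Drawdown stats** (full Kelly, continuous approximation): the chance of ever falling to a fraction x of the current bankroll is about x
- ~60% chance of a -40% drawdown at some point
- ~50% chance of a -50% drawdown at some point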
|
||||
|
||||
**Practical bankruptcy**: Client fires you, forced liquidation, can't maintain discipline
|
||||
|
||||
**Fix**: Use fractional Kelly, set stop-loss (if down 25%, pause)
|
||||
|
||||
### Mistake 4: Ignoring Correlation
|
||||
|
||||
**Example disaster**:
|
||||
- 10 bets, each Kelly 10%
|
||||
- All highly correlated (same theme)
|
||||
- Bet 100% total → Single adverse event → Large loss
|
||||
|
||||
**Fix**: Measure correlations, use portfolio Kelly, diversify themes
|
||||
|
||||
### Mistake 5: Misestimating Odds
|
||||
|
||||
**Common confusion**:
|
||||
- Decimal 2.0: b = 1.0 (not 2.0)
|
||||
- "3-to-1": b = 3.0 ✓
|
||||
- American +200: b = 2.0 (not 200)
|
||||
|
||||
**Fix**: Always convert to NET payout (b = total return per $1 staked - 1)
|
||||
|
||||
### Mistake 6: Static Bankroll
|
||||
|
||||
**Problem**: Calculate once, never update
|
||||
|
||||
**Fix**: Recalculate before each bet using current bankroll
|
||||
|
||||
---
|
||||
|
||||
## 6. Practical Implementation
|
||||
|
||||
### Step-by-Step Process
|
||||
|
||||
**1. Convert odds to net odds (b)**:
|
||||
```python
|
||||
# Decimal odds: b = decimal - 1
|
||||
# American +150: b = 150/100 = 1.50
|
||||
# American -150: b = 100/150 = 0.667
|
||||
# Fractional 3/1: b = 3.0
|
||||
```
|
||||
|
||||
**2. Determine probability**: Use forecasting process (base rates, Bayesian updating, etc.)
|
||||
|
||||
**3. Calculate edge**:
|
||||
```python
|
||||
edge = net_odds * probability - (1 - probability)
|
||||
```
|
||||
|
||||
**4. Calculate Kelly**:
|
||||
```python
|
||||
kelly_fraction = edge / net_odds
|
||||
```
|
||||
|
||||
**5. Apply fractional Kelly**:
|
||||
```python
|
||||
fraction = 0.25 # Quarter Kelly recommended
|
||||
adjusted_kelly = kelly_fraction * fraction
|
||||
```
|
||||
|
||||
**6. Calculate bet size**:
|
||||
```python
|
||||
bet_size = current_bankroll * adjusted_kelly
|
||||
```
|
||||
|
||||
**7. Execute and track**:
|
||||
- Record: Date, event, probability, odds, edge, Kelly%, bet
|
||||
- Set reminder for resolution
|
||||
- Note new information
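Steps 1 and 3 through 6 can be combined into one function. This is a sketch under stated assumptions: it takes decimal odds as input, floors negative-edge bets at zero, and uses the hypothetical name `size_bet`; the input values are illustrative.

```python
def size_bet(probability, decimal_odds, bankroll, kelly_multiplier=0.25):
    """Return the sizing breakdown for one binary bet.

    kelly_multiplier=0.25 is the quarter-Kelly default recommended above.
    """
    net_odds = decimal_odds - 1.0                           # step 1
    edge = net_odds * probability - (1.0 - probability)     # step 3
    full_kelly = edge / net_odds                            # step 4
    stake_fraction = max(0.0, full_kelly * kelly_multiplier)  # step 5, never bet a negative edge
    return {
        "net_odds": net_odds,
        "edge": edge,
        "full_kelly": full_kelly,
        "stake_fraction": stake_fraction,
        "bet_size": bankroll * stake_fraction,              # step 6
    }


result = size_bet(probability=0.65, decimal_odds=2.20, bankroll=10_000)
print(f"edge={result['edge']:.2f}, "
      f"full Kelly={result['full_kelly']:.1%}, "
      f"bet=${result['bet_size']:.0f}")
# edge=0.43, full Kelly=35.8%, bet=$896
```

Pass the current bankroll on every call rather than a fixed starting figure, so the stake scales with wins and losses.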
|
||||
|
||||
### Position Tracking Template
|
||||
|
||||
```
|
||||
Date: 2024-01-15
|
||||
Event: Candidate A wins
|
||||
Your probability: 65%
|
||||
Market odds: 2.20 (implied 45%)
|
||||
Net odds (b): 1.20
|
||||
Edge: 0.43 (43%)
|
||||
Full Kelly: 35.8%
|
||||
Fractional (1/4): 9.0%
Bankroll: $10,000
Bet size: $896
|
||||
Resolution: 2024-11-05
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Historical Examples
|
||||
|
||||
### Ed Thorp - Blackjack (1960s)
|
||||
|
||||
**Application**: Card counting edge varies with count → Dynamic Kelly
|
||||
|
||||
**Implementation**:
|
||||
- True count +1: Edge ~0.5%, bet ~0.5% of bankroll
|
||||
- True count +5: Edge ~2.5%, bet ~2.5% of bankroll
|
||||
|
||||
**Results**: Turned $10k into $100k+, proved Kelly works in practice
|
||||
|
||||
**Lessons**: Used fractional Kelly (~1/2), dynamic sizing, managed "heat" (detection risk)
|
||||
|
||||
### Princeton-Newport Partners (1970s-1980s)
|
||||
|
||||
**Strategy**: Statistical arbitrage, convertible bonds
|
||||
|
||||
**Kelly application**: 1-3% per position, 50-100 positions (diversification)
|
||||
|
||||
**Results**: 19.1% annual (1969-1988), only 4 down months in 19 years, <5% max drawdown
|
||||
|
||||
**Lessons**: Fractional Kelly + diversification = low volatility, dominant strategy
|
||||
|
||||
### Renaissance Technologies / Medallion Fund
|
||||
|
||||
**Strategy**: Thousands of small edges, high frequency
|
||||
|
||||
**Kelly application**:
|
||||
- Each signal: 0.1-0.5% (tiny fractional Kelly)
|
||||
- Portfolio: 10,000+ positions
|
||||
- Leverage: 2-4× (portfolio Kelly supports with diversification)
|
||||
|
||||
**Results**: Reportedly ~66% annual (gross) from 1988 to 2018, with only one losing year (1989)
|
||||
|
||||
**Lessons**: Kelly optimal for repeated bets with edge. Diversification enables leverage. Discipline crucial.
|
||||
|
||||
### Warren Buffett (Implicit Kelly)
|
||||
|
||||
**Concentrated bets**: American Express (40%), Coca-Cola (25%), Apple (40%)
|
||||
|
||||
**Why Kelly-like**: High conviction → High p → Large Kelly → Large position
|
||||
|
||||
**Quote**: "Diversification is protection against ignorance."
|
||||
|
||||
**Lessons**: Kelly justifies concentration with edge. Still uses fractional (~40% max, not 100%).
|
||||
|
||||
---
|
||||
|
||||
## 8. Comparison to Other Methods
|
||||
|
||||
### Fixed Fraction
|
||||
|
||||
**Method**: Always bet same percentage
|
||||
|
||||
**Pros**: Simple, prevents ruin
|
||||
|
||||
**Cons**: Ignores edge, suboptimal growth
|
||||
|
||||
**When to use**: Don't trust probability estimates, want simplicity
|
||||
|
||||
### Martingale (Double After Loss)
|
||||
|
||||
**Method**: Double bet after each loss
|
||||
|
||||
**Fatal flaws**:
|
||||
- Requires infinite bankroll
|
||||
- Exponential bet growth (starting at $10, after 10 straight losses the next bet is $10,240)
|
||||
- Negative edge → lose faster
|
||||
- Betting limits prevent recovery
|
||||
|
||||
**Conclusion**: **NEVER use**. Mathematically certain to fail.
|
||||
|
||||
### Fixed Amount
|
||||
|
||||
**Method**: Always bet same dollar amount
|
||||
|
||||
**Cons**: As bankroll changes, fraction changes inappropriately
|
||||
|
||||
**When to use**: Very small recreational betting
|
||||
|
||||
### Constant Proportion
|
||||
|
||||
**Method**: Fixed percentage, not optimized for edge
|
||||
|
||||
**Difference from Kelly**: Doesn't adjust for edge/odds
|
||||
|
||||
**Conclusion**: Better than fixed dollar, worse than Kelly
|
||||
|
||||
### Risk Parity
|
||||
|
||||
**Method**: Allocate to equalize risk contribution
|
||||
|
||||
**Difference from Kelly**: Doesn't use expected returns (ignores edge)
|
||||
|
||||
**When better**: Don't have reliable return estimates, defensive portfolio
|
||||
|
||||
**When Kelly better**: Have edge estimates, goal is growth
|
||||
|
||||
### Summary Comparison
|
||||
|
||||
| Method | Growth | Ruin Risk | When to Use |
|
||||
|--------|--------|-----------|-------------|
|
||||
| **Kelly** | Highest | None* | Active betting with edge |
|
||||
| **Fractional Kelly** | High | Very low | **Real-world (recommended)** |
|
||||
| **Fixed Fraction** | Medium | Low | Simple discipline |
|
||||
| **Fixed Amount** | Low | Medium | Recreational only |
|
||||
| **Martingale** | Negative | Certain | **NEVER** |
|
||||
| **Risk Parity** | Low-Med | Low | Defensive portfolios |
|
||||
|
||||
*Kelly theoretically no ruin risk, but model error creates practical risk → use fractional Kelly
|
||||
|
||||
**Final Recommendation**: **Quarter Kelly (f*/4)** for nearly all real-world scenarios.
|
||||
|
||||
---
|
||||
|
||||
## Return to Main Skill
|
||||
|
||||
[← Back to Market Mechanics & Betting](../SKILL.md)
|
||||
|
||||
**Related resources**:
|
||||
- [Betting Theory Fundamentals](betting-theory.md)
|
||||
- [Scoring Rules and Calibration](scoring-rules.md)
|
||||
494
skills/market-mechanics-betting/resources/scoring-rules.md
Normal file
@@ -0,0 +1,494 @@
|
||||
# Scoring Rules and Calibration
|
||||
|
||||
Comprehensive guide to proper scoring rules, calibration measurement, and forecast accuracy improvement.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Proper Scoring Rules](#1-proper-scoring-rules)
|
||||
2. [Brier Score Deep Dive](#2-brier-score-deep-dive)
|
||||
3. [Log Score](#3-log-score-logarithmic-scoring-rule)
|
||||
4. [Calibration Curves](#4-calibration-curves)
|
||||
5. [Resolution Analysis](#5-resolution-analysis)
|
||||
6. [Sharpness](#6-sharpness)
|
||||
7. [Practical Calibration Training](#7-practical-calibration-training)
|
||||
8. [Comparison Table](#8-comparison-table-of-scoring-rules)
|
||||
|
||||
---
|
||||
|
||||
## 1. Proper Scoring Rules
|
||||
|
||||
### What is a Scoring Rule?
|
||||
|
||||
A **scoring rule** assigns a numerical score to a probabilistic forecast based on the forecast and actual outcome.
|
||||
|
||||
**Purpose:** Measure accuracy, incentivize honesty, enable comparison, track calibration over time.
|
||||
|
||||
### Strictly Proper vs Quasi-Proper
|
||||
|
||||
**Strictly Proper:** Reporting your true belief uniquely maximizes your expected score. No other probability gives better expected score.
|
||||
|
||||
**Why it matters:** Incentivizes honesty, eliminates gaming, optimizes for accurate beliefs.
|
||||
|
||||
**Quasi-Proper (proper but not strictly):** Reporting your true belief maximizes expected score, but other reports can tie with it. Less desirable for forecasting because honesty is not uniquely rewarded.
|
||||
|
||||
### Common Proper Scoring Rules
|
||||
|
||||
**1. Brier Score** (strictly proper)
|
||||
```
|
||||
Score = -(p - o)²
|
||||
p = Your probability (0 to 1)
|
||||
o = Outcome (0 or 1)
|
||||
```
|
||||
|
||||
**2. Logarithmic Score** (strictly proper)
|
||||
```
|
||||
Score = log(p) if outcome occurs
|
||||
Score = log(1-p) if outcome doesn't occur
|
||||
```
|
||||
|
||||
**3. Spherical Score** (strictly proper)
|
||||
```
|
||||
Score = p / √(p² + (1-p)²) if outcome occurs
Score = (1-p) / √(p² + (1-p)²) if outcome doesn't occur
|
||||
```
|
||||
|
||||
### Common IMPROPER Scoring Rules (Avoid)
|
||||
|
||||
**Absolute Error:** `Score = -|p - o|` → Incentivizes extremes (NOT proper)
|
||||
|
||||
**Threshold Accuracy:** Binary right/wrong → Ignores calibration (NOT proper)
|
||||
|
||||
**Example of gaming improper rules:**
|
||||
```
|
||||
Using absolute error (improper):
|
||||
True belief: 60% → Optimal report: 100% (dishonest)
|
||||
|
||||
Using Brier score (proper):
|
||||
True belief: 60% → Optimal report: 60% (honest)
|
||||
```
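The gaming example above can be checked numerically. The sketch below grid-searches the report that maximizes expected score for a forecaster whose true belief is 60%; all names are illustrative assumptions:

```python
def absolute_score(report, outcome):
    """Improper rule: negative absolute error."""
    return -abs(report - outcome)


def brier_reward(report, outcome):
    """Strictly proper rule: negative squared error (higher is better)."""
    return -(report - outcome) ** 2


def expected_score(report, belief, score):
    """Expected score of a report when the event truly occurs with probability `belief`."""
    return belief * score(report, 1) + (1.0 - belief) * score(report, 0)


def best_report(belief, score, steps=100):
    """Report on a 0..1 grid that maximizes expected score."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda r: expected_score(r, belief, score))


print(best_report(0.60, absolute_score))  # 1.0
print(best_report(0.60, brier_reward))    # 0.6
```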
|
||||
|
||||
**Key Principle:** Only use strictly proper scoring rules for forecast evaluation.
|
||||
|
||||
---
|
||||
|
||||
## 2. Brier Score Deep Dive
|
||||
|
||||
### Formula
|
||||
|
||||
**Single forecast:** `Brier = (p - o)²`
|
||||
|
||||
**Multiple forecasts:** `Brier = (1/N) × Σ(pi - oi)²`
|
||||
|
||||
**Range:** 0.00 (perfect) to 1.00 (worst). Lower is better.
|
||||
|
||||
### Calculation Examples
|
||||
|
||||
```
|
||||
90% Yes → (0.90-1)² = 0.01 (good) | 90% No → (0.90-0)² = 0.81 (bad)
|
||||
60% Yes → (0.60-1)² = 0.16 (medium) | 50% Any → 0.25 (baseline)
|
||||
```
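A minimal sketch of the multi-forecast formula, applied to the four example forecasts above (the 50% forecast is scored against a "No" outcome here, though either outcome gives 0.25). The function name is an assumption:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


# (0.01 + 0.81 + 0.16 + 0.25) / 4
print(round(brier_score([0.90, 0.90, 0.60, 0.50], [1, 0, 1, 0]), 4))  # 0.3075
```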
|
||||
|
||||
### Brier Score Decomposition
|
||||
|
||||
**Murphy Decomposition:**
|
||||
```
|
||||
Brier Score = Reliability - Resolution + Uncertainty
|
||||
```
|
||||
|
||||
**Reliability (Calibration Error):** Are your probabilities correct on average? (Lower is better)
|
||||
|
||||
**Resolution:** Do you assign different probabilities to different outcomes? (Higher is better)
|
||||
|
||||
**Uncertainty:** Base rate variance (uncontrollable, depends on problem)
|
||||
|
||||
**Improving Brier:**
|
||||
1. Minimize reliability (fix calibration)
|
||||
2. Maximize resolution (differentiate forecasts)
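The decomposition can be sketched in a few lines. As a simplifying assumption, forecasts are grouped by their exact stated value rather than into ranges, which makes the identity Brier = Reliability - Resolution + Uncertainty hold exactly; the function name and sample data are illustrative:

```python
from collections import defaultdict


def murphy_decomposition(forecasts, outcomes):
    """Return (reliability, resolution, uncertainty) for binary outcomes."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Group outcomes by the probability that was stated for them.
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)

    reliability = 0.0
    resolution = 0.0
    for p, group in groups.items():
        observed = sum(group) / len(group)  # actual frequency in this group
        reliability += len(group) * (p - observed) ** 2
        resolution += len(group) * (observed - base_rate) ** 2

    uncertainty = base_rate * (1.0 - base_rate)
    return reliability / n, resolution / n, uncertainty


forecasts = [0.8] * 5 + [0.3] * 5
outcomes = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0]
rel, res, unc = murphy_decomposition(forecasts, outcomes)
print(f"reliability={rel:.3f}, resolution={res:.3f}, uncertainty={unc:.3f}")
print(f"reconstructed Brier={rel - res + unc:.3f}")
```

With range-based bins (for example 10-point buckets), the identity is only approximate because forecasts inside a bin differ.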
|
||||
|
||||
### Brier Score Interpretation
|
||||
|
||||
| Brier Score | Quality | Description |
|
||||
|-------------|---------|-------------|
|
||||
| 0.00 - 0.05 | Exceptional | Near-perfect |
|
||||
| 0.05 - 0.10 | Excellent | Top tier |
|
||||
| 0.10 - 0.15 | Good | Skilled |
|
||||
| 0.15 - 0.20 | Average | Better than random |
|
||||
| 0.20 - 0.25 | Below Average | Approaching random |
|
||||
| 0.25+ | Poor | At or worse than random |
|
||||
|
||||
**Context matters:** Easy questions expect lower scores. Compare to baseline (0.25) and other forecasters.
|
||||
|
||||
### Improving Your Brier Score
|
||||
|
||||
**Path 1: Fix Calibration**
|
||||
|
||||
**If overconfident:** Your 80% predictions come true only 60% of the time → Be less extreme, widen intervals
|
||||
|
||||
**If underconfident:** Your 60% predictions come true 80% of the time → Be more extreme when you have evidence
|
||||
|
||||
**Path 2: Improve Resolution**
|
||||
|
||||
**Problem:** All forecasts near 50% → Differentiate easy vs hard questions, research more, be bold when warranted
|
||||
|
||||
**Balance:** `Good Forecaster = Well-Calibrated + High Resolution`
|
||||
|
||||
### Brier Skill Score
|
||||
|
||||
```
|
||||
BSS = 1 - (Your Brier / Baseline Brier)
|
||||
|
||||
Example:
|
||||
Your Brier: 0.12, Baseline: 0.25
|
||||
BSS = 1 - 0.48 = 0.52 (52% improvement over baseline)
|
||||
```
|
||||
|
||||
**Interpretation:** BSS = 1.00 (perfect), 0.00 (same as baseline), <0 (worse than baseline)
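A one-line sketch of the skill score, defaulting to the always-50% baseline of 0.25. The function name is an assumption:

```python
def brier_skill_score(brier, baseline=0.25):
    """BSS = 1 - (your Brier / baseline Brier); 0.25 is the always-50% baseline."""
    if baseline <= 0:
        raise ValueError("baseline Brier must be positive")
    return 1.0 - brier / baseline


print(round(brier_skill_score(0.12), 2))  # 0.52
```

Where a base rate is known, its Brier score (base rate × (1 - base rate)) is usually a tougher and fairer baseline than 0.25.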
|
||||
|
||||
---
|
||||
|
||||
## 3. Log Score (Logarithmic Scoring Rule)
|
||||
|
||||
### Formula
|
||||
|
||||
```
|
||||
Log Score = log₂(p) if outcome occurs
|
||||
Log Score = log₂(1-p) if outcome doesn't occur
|
||||
|
||||
Range: -∞ (worst) to 0 (perfect)
|
||||
Higher (less negative) is better
|
||||
```
|
||||
|
||||
### Calculation Examples
|
||||
|
||||
```
|
||||
90% Yes → -0.15 | 90% No → -3.32 (severe) | 50% Yes → -1.00
|
||||
99% No → -6.64 (catastrophic penalty for overconfidence)
|
||||
```
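The figures above can be reproduced with a small helper. Clamping probabilities away from 0 and 1 is an added assumption to avoid an infinite penalty, and `log_score` is an illustrative name:

```python
import math


def log_score(p, outcome, eps=1e-12):
    """Base-2 log score: log2(p) if the event occurred, else log2(1 - p)."""
    p = min(max(p, eps), 1.0 - eps)  # keep the score finite at 0% and 100%
    return math.log2(p if outcome else 1.0 - p)


for p, outcome in [(0.90, 1), (0.90, 0), (0.50, 1), (0.99, 0)]:
    print(f"{p:.0%} {'Yes' if outcome else 'No'}: {log_score(p, outcome):.2f}")
# 90% Yes: -0.15
# 90% No: -3.32
# 50% Yes: -1.00
# 99% No: -6.64
```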
|
||||
|
||||
### Relationship to Information Theory
|
||||
|
||||
**Log score measures bits of surprise:**
|
||||
```
|
||||
Surprise = -log₂(p)
|
||||
|
||||
p = 50% → 1 bit surprise
|
||||
p = 25% → 2 bits surprise
|
||||
p = 12.5% → 3 bits surprise
|
||||
```
|
||||
|
||||
**Connection to entropy:** The negative log score is the cross-entropy between the realized outcome and the forecast distribution; averaged over many forecasts it is the familiar cross-entropy (log) loss.
|
||||
|
||||
### When to Use Log Score vs Brier
|
||||
|
||||
**Use Log Score when:**
|
||||
- Severe penalty for overconfidence desired
|
||||
- Tail risk matters (rare events important)
|
||||
- Information-theoretic interpretation useful
|
||||
- Comparing probabilistic models
|
||||
|
||||
**Use Brier Score when:**
|
||||
- Human forecasters (less punishing)
|
||||
- Easier interpretation (squared error)
|
||||
- Standard benchmark (more common)
|
||||
- Avoiding extreme penalties
|
||||
|
||||
**Key Difference:**
|
||||
|
||||
Brier: Quadratic penalty (grows with square)
|
||||
```
|
||||
Error: 10% → 0.01, 20% → 0.04, 30% → 0.09, 40% → 0.16
|
||||
```
|
||||
|
||||
Log: Logarithmic penalty (grows faster for extremes)
|
||||
```
|
||||
Forecast: 90% wrong → -3.3, 95% wrong → -4.3, 99% wrong → -6.6
|
||||
```
|
||||
|
||||
**Recommendation:** Default to Brier. Add Log for high-stakes or to penalize overconfidence. Track both for complete picture.
|
||||
|
||||
---
|
||||
|
||||
## 4. Calibration Curves
|
||||
|
||||
### What is a Calibration Curve?
|
||||
|
||||
**Visualization of forecast accuracy:**
|
||||
```
|
||||
Y-axis: Actual frequency (how often outcome occurred)
|
||||
X-axis: Stated probability (your forecasts)
|
||||
Perfect calibration: Diagonal line (y = x)
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Actual %
100 ┤                   ╱
 80 ┤               ●
 60 ┤           ●
 40 ┤       ●     ← Perfect calibration line
 20 ┤   ●
  0 └────────────────────
    0   20  40  60  80  100
      Stated probability %
|
||||
```
|
||||
|
||||
### How to Create
|
||||
|
||||
**Step 1:** Collect 50+ forecasts and outcomes
|
||||
|
||||
**Step 2:** Bin by probability (0-10%, 10-20%, ..., 90-100%)
|
||||
|
||||
**Step 3:** For each bin, calculate actual frequency
|
||||
```
|
||||
Example: 60-70% bin
|
||||
Forecasts: 15 total, Outcomes: 9 Yes, 6 No
|
||||
Actual frequency: 9/15 = 60%
|
||||
Plot point: (65, 60)
|
||||
```
|
||||
|
||||
**Step 4:** Draw perfect calibration line (diagonal from (0,0) to (100,100))
|
||||
|
||||
**Step 5:** Compare points to line
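Steps 2 and 3 can be sketched as a binning function whose rows are the points to plot against the diagonal. The function name and return shape are assumptions made for illustration:

```python
def calibration_table(forecasts, outcomes, n_bins=10):
    """Return (mean forecast, observed frequency, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(forecasts, outcomes):
        # p = 1.0 would index past the end, so it is folded into the top bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, o))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed = sum(o for _, o in members) / len(members)
        rows.append((mean_forecast, observed, len(members)))
    return rows


# The 60-70% bin from the example: 15 forecasts, 9 Yes and 6 No.
for mean_forecast, observed, count in calibration_table([0.65] * 15, [1] * 9 + [0] * 6):
    print(f"stated {mean_forecast:.0%} -> actual {observed:.0%} (n={count})")
```

Forecasts that sit exactly on a bin edge can land on either side because of floating-point rounding, which rarely matters at this sample size.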
|
||||
|
||||
### Over/Under Confidence Detection
|
||||
|
||||
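**Overconfidence:** Points below the diagonal for forecasts above 50% (said 90%, happened 70%), and above it for forecasts below 50%. Fix: Be less extreme, widen intervals.

**Underconfidence:** Points above the diagonal for forecasts above 50% (said 90%, happened 95%), and below it for forecasts below 50%. Fix: Be more extreme when evidence is strong.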
|
||||
|
||||
**Sample size:** <10/bin unreliable, 10-20 weak, 20-50 moderate, 50+ strong evidence
|
||||
|
||||
---
|
||||
|
||||
## 5. Resolution Analysis
|
||||
|
||||
### What is Resolution?
|
||||
|
||||
**Resolution** measures ability to assign different probabilities to outcomes that actually differ.
|
||||
|
||||
**High resolution:** Events you call 90% happen much more than events you call 10% (good)
|
||||
|
||||
**Low resolution:** All forecasts near 50%, can't discriminate (bad)
|
||||
|
||||
### Formula
|
||||
|
||||
```
|
||||
Resolution = (1/N) × Σ nk(ok - ō)²
|
||||
|
||||
nk = Forecasts in bin k
|
||||
ok = Actual frequency in bin k
|
||||
ō = Overall base rate
|
||||
|
||||
Higher is better
|
||||
```
|
||||
|
||||
### How to Improve Resolution
|
||||
|
||||
**Problem: Stuck at 50%**
|
||||
|
||||
Bad pattern: All forecasts 48-52% → Low resolution
|
||||
|
||||
Good pattern: Range from 20% to 90% → High resolution
|
||||
|
||||
**Strategies:**
|
||||
|
||||
1. **Gather discriminating information** - Find features that distinguish outcomes
|
||||
2. **Use decomposition** - Fermi, causal models, scenarios
|
||||
3. **Be bold when warranted** - If evidence strong → Say 85% not 65%
|
||||
4. **Update with evidence** - Start with base rate, update with Bayesian reasoning
|
||||
|
||||
### Calibration vs Resolution Tradeoff
|
||||
|
||||
```
|
||||
Perfect Calibration Only: Say 60% for everything when base rate is 60%
|
||||
→ Calibration: Perfect
|
||||
→ Resolution: Zero
|
||||
→ Brier: 0.24 (bad)
|
||||
|
||||
High Resolution Only: Say 10% or 90% (extremes) incorrectly
|
||||
→ Calibration: Poor
|
||||
→ Resolution: High
|
||||
→ Brier: Terrible
|
||||
|
||||
Optimal Balance: Well-calibrated AND high resolution
|
||||
→ Calibration: Good
|
||||
→ Resolution: High
|
||||
→ Brier: Minimized
|
||||
```
|
||||
|
||||
**Best forecasters:** Well-calibrated (low reliability error) + High resolution (discriminate events) = Low Brier
|
||||
|
||||
**Recommendation:** Don't sacrifice resolution for perfect calibration. Be bold when evidence warrants.
|
||||
|
||||
---
|
||||
|
||||
## 6. Sharpness
|
||||
|
||||
### What is Sharpness?
|
||||
|
||||
**Sharpness** = Tendency to make extreme predictions (away from 50%) when appropriate.
|
||||
|
||||
**Sharp:** Predicts 5% or 95% when evidence supports it (decisive)
|
||||
|
||||
**Unsharp:** Stays near 50% (plays it safe, indecisive)
|
||||
|
||||
### Why Sharpness Matters
|
||||
|
||||
```
|
||||
Scenario: Base rate 60%
|
||||
|
||||
Unsharp forecaster: 50% for every event → Brier: 0.25, Usefulness: Low
|
||||
Sharp forecaster: Range 20-90% → Brier: 0.12 (if calibrated), Usefulness: High
|
||||
```
|
||||
|
||||
**Insight:** Extreme predictions (when accurate) improve Brier significantly. When wrong, hurt badly. Solution: Be sharp when you have evidence.
|
||||
|
||||
### Measuring Sharpness
|
||||
|
||||
```
|
||||
Sharpness = Variance of forecast probabilities
|
||||
|
||||
Forecaster A: [0.45, 0.50, 0.48, 0.52, 0.49] → Var = 0.0007 (unsharp)
|
||||
Forecaster B: [0.15, 0.85, 0.30, 0.90, 0.20] → Var = 0.1333 (sharp)
|
||||
```
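Sharpness as defined above can be computed with the standard library. This sketch assumes the sample variance (n - 1 denominator); `sharpness` is an illustrative name:

```python
from statistics import variance


def sharpness(forecasts):
    """Sample variance of stated probabilities; higher means bolder forecasts."""
    if len(forecasts) < 2:
        raise ValueError("need at least two forecasts")
    return variance(forecasts)


forecaster_a = [0.45, 0.50, 0.48, 0.52, 0.49]
forecaster_b = [0.15, 0.85, 0.30, 0.90, 0.20]
print(f"A: {sharpness(forecaster_a):.4f}")
print(f"B: {sharpness(forecaster_b):.4f}")
```

Sharpness says nothing about accuracy on its own, so read it alongside the calibration curve.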
|
||||
|
||||
### When to Be Sharp
|
||||
|
||||
**Be sharp (extreme probabilities) when:**
|
||||
- Strong discriminating evidence (multiple independent pieces align)
|
||||
- Easy questions (outcome nearly certain)
|
||||
- You have expertise (domain knowledge, track record)
|
||||
|
||||
**Stay moderate (near 50%) when:**
|
||||
- High uncertainty (limited information, conflicting evidence)
|
||||
- Hard questions (true probability near 50%)
|
||||
- No expertise (unfamiliar domain)
|
||||
|
||||
**Goal:** Sharp AND well-calibrated (extreme when warranted, accurate probabilities)
|
||||
|
||||
---
|
||||
|
||||
## 7. Practical Calibration Training
|
||||
|
||||
### Calibration Exercises
|
||||
|
||||
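**Exercise Set 1:** Make 10 forecasts at each confidence level on verifiable questions (fair coin lands heads: 50%; Paris is the capital of France: 99%; two heads in two flips: 25%; die shows a 6: 16.67%). Check: Did your 99% forecasts come true 9-10 times out of 10? Did your 50% forecasts come true ~5 times?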
|
||||
|
||||
**Exercise Set 2:** Make 20 "80% confident" predictions. Expected: 16/20 correct. Common: 12-14/20 (overconfident). What feels "80%" should be reported as "65%".
|
||||
|
||||
### Tracking Methods
|
||||
|
||||
**Method 1: Spreadsheet**
|
||||
```
|
||||
| Date | Question | Prob | Outcome | Brier | Notes |
|
||||
Monthly: Calculate mean Brier
|
||||
Quarterly: Generate calibration curve
|
||||
```
|
||||
|
||||
**Method 2: Apps**
|
||||
- PredictionBook.com (free, tracks calibration)
|
||||
- Metaculus.com (forecasting platform)
|
||||
- Good Judgment Open (tournament)
|
||||
|
||||
**Method 3: Focused Practice**
|
||||
- Week 1: Make 20 predictions (focus on honesty)
|
||||
- Week 2: Check calibration curve (identify bias)
|
||||
- Week 3: Increase resolution (be bold)
|
||||
- Week 4: Balance calibration + resolution
|
||||
|
||||
### Training Drills
|
||||
|
||||
**Drill 1:** Generate 10 "90% CIs" for unknowns. Target: 9/10 contain true value. Common mistake: Only 5-7 (overconfident). Fix: Widen by 1.5×.
|
||||
|
||||
**Drill 2:** Bayesian practice - State prior, observe evidence, update posterior, check calibration.
|
||||
|
||||
**Drill 3:** Make 10 predictions >80% or <20%. Force extremes when "pretty sure". Track: Are >80% happening >80%?
|
||||
|
||||
---
|
||||
|
||||
## 8. Comparison Table of Scoring Rules
|
||||
|
||||
### Summary
|
||||
|
||||
| Feature | Brier | Log | Spherical | Threshold |
|
||||
|---------|-------|-----|-----------|-----------|
|
||||
| **Proper** | Strictly | Strictly | Strictly | NO |
|
||||
| **Range** | 0 to 1 | -∞ to 0 | 0 to 1 | 0 to 1 |
|
||||
| **Penalty** | Quadratic | Logarithmic | Moderate | None |
|
||||
| **Interpretation** | Squared error | Bits surprise | Geometric | Binary |
|
||||
| **Usage** | Default | High-stakes | Rare | Avoid |
|
||||
| **Human-friendly** | Yes | Somewhat | No | Yes (misleading) |
|
||||
|
||||
### Detailed Comparison
|
||||
|
||||
**Brier Score**
|
||||
|
||||
Pros: Easy to interpret, standard in competitions, moderate penalty, good for humans
|
||||
|
||||
Cons: Less severe penalty for overconfidence
|
||||
|
||||
Best for: General forecasting, calibration training, standard benchmarking
|
||||
|
||||
**Log Score**
|
||||
|
||||
Pros: Severe penalty for overconfidence, information-theoretic, strongly incentivizes honesty
|
||||
|
||||
Cons: Too punishing for humans, infinite at 0%/100%, less intuitive
|
||||
|
||||
Best for: High-stakes forecasting, penalizing overconfidence, ML models, tail risk
|
||||
|
||||
**Spherical Score**
|
||||
|
||||
Pros: Strictly proper, bounded, geometric interpretation
|
||||
|
||||
Cons: Uncommon, complex formula, rarely used
|
||||
|
||||
Best for: Theoretical analysis only
|
||||
|
||||
**Threshold / Binary Accuracy**
|
||||
|
||||
Pros: Very intuitive, easy to explain
|
||||
|
||||
Cons: NOT proper (incentivizes extremes), ignores calibration, can be gamed
|
||||
|
||||
Best for: Nothing (don't use for forecasting)
|
||||
|
||||
### When to Use Each
|
||||
|
||||
| Your Situation | Recommended |
|
||||
|----------------|-------------|
|
||||
| Starting out | **Brier** |
|
||||
| Experienced forecaster | **Brier** or **Log** |
|
||||
| High-stakes decisions | **Log** |
|
||||
| Comparing to benchmarks | **Brier** |
|
||||
| Building ML model | **Log** |
|
||||
| Personal tracking | **Brier** |
|
||||
| Teaching others | **Brier** |
|
||||
|
||||
**Recommendation:** Use **Brier** as default. Add **Log** for high-stakes or to penalize overconfidence.
|
||||
|
||||
### Conversion Example
|
||||
|
||||
**Forecast: 80%, Outcome: Yes**
|
||||
```
|
||||
Brier: (0.80-1)² = 0.04
|
||||
Log (base 2): log₂(0.80) = -0.322
|
||||
Spherical: 0.80/√(0.80²+0.20²) = 0.970
|
||||
```
|
||||
|
||||
**Forecast: 80%, Outcome: No**
|
||||
```
|
||||
Brier: (0.80-0)² = 0.64
|
||||
Log (base 2): log₂(0.20) = -2.322 (much worse penalty)
|
||||
Spherical: 0.20/√(0.80²+0.20²) = 0.243
|
||||
```
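The conversion example can be reproduced by scoring one forecast under all three rules. This is a self-contained sketch with hypothetical function names:

```python
import math


def brier(p, outcome):
    """Squared error; lower is better."""
    return (p - outcome) ** 2


def log2_score(p, outcome):
    """Base-2 log score; higher (less negative) is better."""
    return math.log2(p if outcome else 1.0 - p)


def spherical(p, outcome):
    """Probability given to the realized outcome, divided by the forecast vector's length."""
    norm = math.sqrt(p ** 2 + (1.0 - p) ** 2)
    return (p if outcome else 1.0 - p) / norm


for outcome in (1, 0):
    label = "Yes" if outcome else "No"
    print(f"80% forecast, outcome {label}: "
          f"Brier={brier(0.80, outcome):.2f}, "
          f"Log={log2_score(0.80, outcome):.3f}, "
          f"Spherical={spherical(0.80, outcome):.3f}")
```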
|
||||
|
||||
---
|
||||
|
||||
## Return to Main Skill
|
||||
|
||||
[← Back to Market Mechanics & Betting](../SKILL.md)
|
||||
|
||||
**Related Resources:**
|
||||
- [Betting Theory Fundamentals](betting-theory.md)
|
||||
- [Kelly Criterion Deep Dive](kelly-criterion.md)
|
||||
|
||||