NHL Player Props Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Opus (3 of 5 peer review votes)
Status: PENDING BOSS RULING on open questions


COUNCIL SUMMARY

Where Advisors Agreed

  1. Player game logs are the #1 missing data — need per-game G, A, P, SOG, TOI, PP TOI from NHL API
  2. EWMA at player level is essential — with separate decay rates per stat type
  3. Poisson for goals, Negative Binomial for SOG — consensus on distribution models
  4. Power play unit tracking is critical — PP1 vs PP2 assignment changes prop projections by 30-65%
  5. Saves props need opponent shot volume as primary input — not just goalie SV%
  6. Matchup cards must separate skater format from goalie format
  7. Minimum sample thresholds before any scanner fires (10+ games suggested)
  8. 7 separate edge scanners — one per prop market type

Where Advisors Disagreed

  1. Player vs goalie history: Gemini says "mathematically toxic, drop completely." Other 4 say use with regression to mean and minimum sample thresholds. Council verdict: Use with minimum sample (15+ shots) and Bayesian regression — do NOT drop.
  2. Points distribution: gpt-oss proposes additive delta that can go negative (breaks Poisson). Opus uses convolution of goals + assists. Sonnet uses Monte Carlo with correlation. Council verdict: Monte Carlo with ~0.4 correlation is most practical.
  3. Database engine: gpt-oss recommends PostgreSQL, Gemini's schema was broken (missing prop_type column on EWMA table). Council verdict: SQLite WAL is sufficient for current scale.
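The SQLite WAL verdict takes only a couple of pragmas to enact. A minimal sketch (the database filename is illustrative, not the project's actual path):

```python
import sqlite3

# Open (or create) the props database and switch it to WAL mode so the
# edge scanners can read while the ingest job writes concurrently.
conn = sqlite3.connect("edgeclaw_props.db")
conn.execute("PRAGMA journal_mode=WAL")    # persists across connections
conn.execute("PRAGMA synchronous=NORMAL")  # common pairing with WAL
conn.close()
```

WAL mode is a property of the database file, so it only needs to be set once; subsequent connections inherit it.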

Strongest Arguments (from peer review)

Opus won with the most production-ready design (3 of 5 peer review votes).

Biggest Blind Spot (4/5 reviewers)

Gemini: Categorically rejected player-vs-goalie matchup history as "mathematically toxic." This is overconfident and wrong — the correct approach is minimum sample thresholds with Bayesian regression, not deletion. Also, Gemini's EWMA table schema was broken (no prop_type column).
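The "minimum sample plus Bayesian regression" approach can be sketched as empirical-Bayes shrinkage: below the 15-shot minimum the matchup history is ignored entirely, above it the observed conversion rate is blended toward the player's baseline with a pseudo-count prior. The `prior_strength` value here is an illustrative choice, not a council-tuned parameter:

```python
def shrunk_matchup_rate(matchup_goals, matchup_shots, baseline_rate,
                        prior_strength=50, min_shots=15):
    """Empirical-Bayes shrinkage of a player-vs-goalie conversion rate.

    Below the minimum-shot threshold, return the baseline untouched.
    Above it, blend the observed rate toward the baseline using
    `prior_strength` pseudo-shots (Beta-Binomial-style shrinkage).
    """
    if matchup_shots < min_shots:
        return baseline_rate
    return (matchup_goals + prior_strength * baseline_rate) / \
           (matchup_shots + prior_strength)
```

For example, 4 goals on 20 shots against a goalie, with a 10% baseline, shrinks to (4 + 5) / (20 + 50) ≈ 0.129 rather than the raw 0.200 — the history nudges the estimate without dominating it.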

What Everyone Missed

  1. Empty net dynamics — 10-15% of goals scored on empty nets. EN minutes inflate Goals/Points/SOG. Must calculate P(close game) and adjust 3rd-period conversion rates.
  2. Kalshi order book liquidity — Kelly sizing on thin exchange is dangerous. Must ingest bid/ask depth and constrain sizing to available liquidity.
  3. Portfolio-level correlation — Multiple props on same player are NOT independent bets. Need correlation matrix and aggregate exposure limits.
  4. In-season regime changes — Mid-season coaching changes, line reshuffles, and trade deadline moves structurally break historical baselines.
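The liquidity concern in item 2 can be made concrete: fractional Kelly sizing for a binary contract, hard-capped at a fraction of resting order book depth. All parameters here (quarter Kelly, 10% of book) are illustrative assumptions, not the council's tuned values:

```python
def liquidity_capped_kelly(prob, price_cents, book_depth_contracts,
                           bankroll_cents, kelly_fraction=0.25,
                           max_book_share=0.10):
    """Size a YES position on a binary contract, capped by book depth.

    Kelly fraction for a binary contract costing `price_cents` with a
    100-cent payout: f* = (p*100 - price) / (100 - price). We take a
    fractional Kelly stake and never consume more than `max_book_share`
    of the resting depth at that price level.
    """
    edge = prob * 100 - price_cents
    if edge <= 0:
        return 0
    f_star = edge / (100 - price_cents)
    stake_cents = bankroll_cents * f_star * kelly_fraction
    contracts = int(stake_cents // price_cents)
    cap = int(book_depth_contracts * max_book_share)
    return min(contracts, cap)
```

On a thin book the cap binds first, which is the point: the model's conviction never outruns what the market can actually absorb.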

BUILD PLAN

Phase 1: Player Data Tables

nhl_player_game_logs:

nhl_player_baselines:

nhl_player_matchup_context:
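As a starting point for the first table above, here is an illustrative DDL sketch covering the per-game stats the council agreed on (G, A, P, SOG, TOI, PP TOI). Column names and types are assumptions, not the council's final schema:

```python
import sqlite3

# Hypothetical schema for nhl_player_game_logs; adjust columns to taste.
SCHEMA = """
CREATE TABLE IF NOT EXISTS nhl_player_game_logs (
    player_id      INTEGER NOT NULL,
    game_id        INTEGER NOT NULL,
    game_date      TEXT    NOT NULL,
    team           TEXT,
    opponent       TEXT,
    goals          INTEGER,
    assists        INTEGER,
    points         INTEGER,
    sog            INTEGER,
    toi_seconds    INTEGER,  -- total time on ice
    pp_toi_seconds INTEGER,  -- power play time on ice
    PRIMARY KEY (player_id, game_id)
);
"""

conn = sqlite3.connect(":memory:")  # stand-in for the real WAL database
conn.executescript(SCHEMA)
```

Storing TOI in seconds avoids the rounding ambiguity of "minutes:seconds" strings when computing per-60 rates downstream.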

Phase 2: Prop-Specific Metrics

Custom metrics to compute:

| Metric | Formula | Purpose |
|---|---|---|
| Goals/60 EWMA | EWMA(goals / (TOI/60), α=0.12) | Primary rate for goals Poisson λ |
| Assists/60 EWMA | EWMA(assists / (TOI/60), α=0.12) | Primary rate for assists NB μ |
| SOG/60 EWMA | EWMA(shots / (TOI/60), α=0.10) | Higher stability for shots |
| PP Production Share | PP_points / total_points | How dependent on power play |
| Shot Quality Index | xG / SOG ratio from MoneyPuck | Measures shot quality vs volume |
| Boom/Bust Ratio | StdDev(stat) / Mean(stat) | CV — high = inconsistent |
| Hot/Cold Streak | Z-score of last 5 vs season | Detect streaks for context flags |
| Line Chemistry Score | Individual production with linemate set vs without | Measures line-dependent output |
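The EWMA rows above can be sketched as a small pure-Python helper. This is a minimal version that weights each game's per-60 rate equally regardless of TOI; a production version might weight by ice time:

```python
def ewma_rate_per60(values, toi_minutes, alpha=0.12):
    """EWMA of a per-60 rate over a game-log sequence (oldest first).

    Each game contributes stat / (TOI/60); alpha=0.12 matches the
    Goals/60 row above. Higher alpha reacts faster to recent games.
    """
    rate = None
    for stat, toi in zip(values, toi_minutes):
        per60 = stat / (toi / 60)
        rate = per60 if rate is None else alpha * per60 + (1 - alpha) * rate
    return rate
```

For example, game logs of [1, 0, 2] goals over [18, 20, 15] minutes give per-60 rates of 3.33, 0.0, and 8.0, which the EWMA smooths to about 3.54 goals/60.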

Phase 3: Distribution Models Per Prop

| Prop | Distribution | Parameters | Notes |
|---|---|---|---|
| Goals | Zero-Inflated Poisson | λ from goals/60 × proj_TOI × matchup_mult; π from 4th-line probability | ZIP handles the ~70% of players who score 0 in a game |
| Assists | Negative Binomial | μ from assists/60 × proj_TOI; k from player variance | Overdispersion from teammate dependency |
| Points | Monte Carlo (10K sims) | Correlated draws from Goals + Assists (r ≈ 0.4) | Joint distribution, NOT independent sum |
| SOG | Negative Binomial | μ from shots/60 × proj_TOI; k per player | High overdispersion in shot counts |
| Saves | Normal | μ = opp_SA/G × SV%; σ from goalie's game-to-game variance | High count, roughly symmetric |
| Anytime GS | Bernoulli | P = 1 − P(0) from Goals ZIP | Derived, not independent model |
| First Goal | Weighted Bernoulli | P(first goal) = P(any goal) × first_period_share × deployment_weight | Very low base rate, high variance |
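The correlated Points simulation can be sketched with a Gaussian copula: draw correlated normals, map them to uniforms, then invert each marginal's CDF. This sketch simplifies both marginals to plain Poisson (the table above uses NB for assists) and uses only the standard library:

```python
import math
import random
from statistics import NormalDist

def poisson_ppf(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    k, p, cdf = 0, math.exp(-lam), math.exp(-lam)
    while cdf < u and k < 200:  # k cap guards against float saturation
        k += 1
        p *= lam / k
        cdf += p
    return k

def simulate_points(lam_goals, lam_assists, r=0.4, n_sims=10_000, seed=1):
    """Monte Carlo points distribution with correlated goal/assist draws.

    Gaussian copula: correlated standard normals -> uniforms via the
    normal CDF -> counts via the Poisson inverse CDF. Returns a dict
    mapping points totals to simulated probabilities.
    """
    rng = random.Random(seed)
    nd = NormalDist()
    counts = {}
    for _ in range(n_sims):
        z1 = rng.gauss(0, 1)
        z2 = r * z1 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        g = poisson_ppf(nd.cdf(z1), lam_goals)
        a = poisson_ppf(nd.cdf(z2), lam_assists)
        counts[g + a] = counts.get(g + a, 0) + 1
    return {k: v / n_sims for k, v in sorted(counts.items())}
```

The positive correlation fattens both tails relative to an independent sum: zero-point games and multi-point games are each more likely than independence would suggest, which is exactly why the council rejected the independent-sum approach.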

Phase 4: Edge Scanner Architecture

Common scanner engine:

  1. Ingest FanDuel alt lines (3+ thresholds required)
  2. De-vig each threshold (multiplicative 2-way)
  3. Fit distribution curve to de-vigged probabilities
  4. Compare fitted curve to Kalshi contract prices
  5. Calculate edge after 7% Kalshi fee
  6. Apply minimum edge (4 cents) and minimum sample (10 games) gates
  7. Output: {player, prop_type, threshold, model_prob, kalshi_price, edge, confidence}
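Steps 2 and 5 of the engine can be sketched as follows. The fee treatment here is a simplification (a flat 7% haircut on winnings); Kalshi's actual trading-fee schedule is per-contract and price-dependent, so treat this as an assumption to refine:

```python
def devig_two_way(over_odds, under_odds):
    """Multiplicative de-vig of a 2-way market quoted in decimal odds.

    Converts both sides to implied probabilities and normalizes them to
    sum to 1. Returns the fair Over probability.
    """
    p_over, p_under = 1 / over_odds, 1 / under_odds
    return p_over / (p_over + p_under)

def kalshi_edge_cents(model_prob, yes_price_cents, fee_rate=0.07):
    """Expected edge in cents per YES contract after a simplified 7% fee.

    Assumes the fee applies to gross winnings (100 - price) on a win;
    losses forfeit the purchase price with no fee.
    """
    payout = 100 - yes_price_cents
    return model_prob * payout * (1 - fee_rate) \
        - (1 - model_prob) * yes_price_cents
```

A symmetric -110/-110 style quote (1.87/1.87 decimal) de-vigs to exactly 0.5, and a 55% model probability against a 50-cent contract clears the 4-cent minimum-edge gate from step 6 (about 3.1 cents of EV would not).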

Per-prop scanner differences:

| Scanner | Unique Logic |
|---|---|
| Goals | Goalie quality multiplier, PP time boost, EN probability adjustment |
| Assists | Linemate shooting talent factor, PP quarterback bonus |
| Points | Joint Goals+Assists simulation with correlation |
| SOG | Coaching system factor (shot-heavy teams), matchup pace |
| Saves | Opponent shot volume is primary driver, goalie pull caps upside |
| Anytime GS | Derived from goals scanner P(goals ≥ 1) |
| First Goal | First-shift deployment, first-period scoring tendencies, opening faceoff team |
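The Anytime GS derivation in the table above is a one-liner under the ZIP model: the zero mass is the structural-zero probability plus the Poisson component's mass at zero, and P(goals ≥ 1) is its complement:

```python
import math

def p_anytime_goal(lam, pi_zero):
    """P(goals >= 1) under a Zero-Inflated Poisson.

    `pi_zero` is the structural-zero probability (e.g. a low-deployment,
    4th-line game); the Poisson component contributes exp(-lam) extra
    mass at zero. Derived from the goals model, never fit independently.
    """
    p_zero = pi_zero + (1 - pi_zero) * math.exp(-lam)
    return 1 - p_zero
```

With π = 0 this collapses to the ordinary Poisson result 1 − e^(−λ), so the zero-inflation term only ever lowers the anytime-scorer probability.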

Phase 5: Matchup Card Format

(See Research Pipeline ruling for full card format — same cards serve both rulings)

Phase 6: Dashboard


OPEN QUESTIONS FOR BOSS RULING

  1. Player vs goalie matchup history: Council says use it with 15+ shot minimum and Bayesian regression. Confirm?

  2. Empty net modeling: Should we explicitly model P(empty net game state) and adjust 3rd-period scoring rates, or is this over-engineering for now?

  3. In-season regime detection: Should the system auto-detect coaching changes, line reshuffles, and trade deadline moves and reset baselines? Or just let EWMA naturally adjust?

  4. Data history depth: How far back for player game logs? 2 seasons? 3 seasons? (Gemini said 2 years, others said 3+)

  5. EWMA decay rates: Track all 3 (0.10, 0.12, 0.15) as decided for team-level, or pick one for player-level props?


COUNCIL METADATA

| Detail | Value |
|---|---|
| Council date | 2026-04-01 |
| Advisory responses | 5 (all completed) |
| Peer reviews | 5 (all completed) |
| Strongest advisor | Opus (3/5 votes) |
| Runner-up | Sonnet (1/5), Grok (1/5) |
| Biggest blind spot | Gemini (4/5 votes) |
| Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/nhl-player-props-data-audit/ |
Source: ~/edgeclaw/results/panel-results/nhl-player-props-data-audit-ruling.md