NHL Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Sonnet (4 of 5 peer review votes)
Status: BOSS RULING COMPLETE (2026-04-01)


SOURCE CONSTRAINT

NaturalStatTrick (NST) is NOT available as a data source. All advanced stats (5v5 splits, shot quality, HDCF%, score-adjusted metrics) must come from MoneyPuck (free CSV downloads) or the NHL API. MoneyPuck covers nearly everything NST provided.

Primary Data Sources:


CONSENSUS (All 5 Advisors Agreed)

  1. PP/PK and 5v5 splits are the #1 critical gap. Aggregate stats without strength-state splits are contaminated.
  2. Goaltending is the highest-leverage single variable (25-35% of between-team goal variance).
  3. EWMA decay-weighted form is essential. Replace season-long averages with recency-weighted metrics.
  4. Shot quality (HDCF%, high-danger chances) must be added. Raw Corsi is outdated.
  5. Drop nhl_one_goal_stats. Pure luck/variance, not signal.
  6. Research timing: 3 passes — morning skate, afternoon confirmation, pre-game lock.
  7. Goalie confirmation is the #1 research priority.
  8. Structured prose format (not JSON) for AI analyst matchup cards — 4/5 peer reviewers endorsed this.

BUILD PLAN — DATA COLLECTION

Tier 1 — Must Build First

Data | What | Source | Free?
PP/PK Rates | PP%, PK%, PP xG/60, PK xGA/60 | NHL API + MoneyPuck CSV | Yes
5v5 Splits | All stats split by 5v5/PP/PK | MoneyPuck CSV (situation columns) | Yes
Shot Quality | HDCF%, HDCA%, high-danger chances for/against | MoneyPuck CSV | Yes
EWMA Form | Decay-weighted xGF%, goals, save%, PP%, PK% | Computed from existing data | Yes
Goalie GSAx | Goals Saved Above Expected per 60 | MoneyPuck goalies.csv | Yes
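
A minimal sketch of pulling strength-state splits out of a MoneyPuck-style team CSV and converting them to per-60 rates. The column names (`team`, `situation`, `xGoalsFor`, `xGoalsAgainst`, `iceTime` in seconds), the `5on5` label, and the sample rows are assumptions for illustration; check them against the actual download before relying on them.

```python
import csv
import io

# Invented sample rows. Column names and the "5on5" label are assumptions,
# not something this ruling specifies.
SAMPLE_CSV = """team,situation,xGoalsFor,xGoalsAgainst,iceTime
AAA,all,210.0,190.0,295200
AAA,5on5,150.0,140.0,230400
AAA,5on4,40.0,5.0,25200
BBB,5on5,120.0,144.0,216000
"""


def per60_by_situation(csv_text, situation):
    """Return {team: (xGF/60, xGA/60)} for a single strength state."""
    rates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["situation"] != situation:
            continue
        # iceTime is assumed to be in seconds; 3600 s = 60 minutes.
        hours = float(row["iceTime"]) / 3600.0
        rates[row["team"]] = (
            float(row["xGoalsFor"]) / hours,
            float(row["xGoalsAgainst"]) / hours,
        )
    return rates


splits_5v5 = per60_by_situation(SAMPLE_CSV, "5on5")
```

Filtering on the situation column before computing rates keeps power-play and penalty-kill time out of the 5v5 numbers, which is the gap the consensus list ranks first.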

Tier 2 — Build Second

Data | What | Source | Free?
Goalie Matchup History | Save%, GAA vs specific opponents (N>=5) | NHL API game logs filtered | Yes
Schedule Strength | Avg points% of recent opponents | Computed from standings+schedule | Yes
H2H Season Series | Record, GF/GA between tonight's teams | NHL API | Yes
PDO / Luck Metric | 5v5 shooting% + save% (regression flag) | Computed from MoneyPuck | Yes
Player Game Logs | Top-6 F + Top-4 D: TOI, goals, assists, xG | NHL API + MoneyPuck | Yes

Tier 3 — Build Later

Data | What | Source | Free?
Referee Tracking | Penalties/game, PP opportunities generated | NHL API officials | Yes
Line Combinations | Projected lines from DailyFaceoff | Already scraped | Yes
Coaching Changes | Before/after performance flags | Computed | Yes
Roster Changes | Trade deadline impact, call-ups | NHL API roster endpoint | Yes

DELTA FORMULA

Start with additive approach (most implementable), test multiplicative in parallel:

Additive (implement first):

Matchup_xG_A = League_Avg
  + (Team_A_5v5_xGF - League_Avg) * opp_quality_scalar
  + ST_Impact_A
  + Goalie_Adjustment
  + Fatigue_Penalty
  + Venue_Adjustment

Where:
  opp_quality_scalar = 1 + 0.15 * (Team_B_def_rating - League_Avg) / League_Avg
  All inputs use EWMA values, not season-long

Multiplicative (test in parallel):

Team_A_5v5_xG = (Team_A_5v5_xGF/60) * (Team_B_5v5_xGA/60) / (League_Avg_5v5_xG/60)
+ PP modifier computed separately

CUSTOM CALCULATED METRICS (Top 10)

  1. Goalie-Adjusted xGA: Team_xGA * (1 - Starter_sv%) / (1 - League_avg_sv%), so a starter with above-average save% lowers expected goals against
  2. Special Teams Impact Score: (PP% - Lg_PP%) * PP_opps/game + (PK% - Lg_PK%) * PK_times/game — expressed as goals above/below average
  3. Net Goalie Matchup Delta: Starter_A_GSAx/60 - Starter_B_GSAx/60
  4. Fatigue xG Penalty: B2B = -0.25 xG, 3-in-4 = -0.35 xG, with travel modifier
  5. PDO Regression Flag: Teams with 5v5 PDO outside 0.985-1.015 flagged as "due for regression"
  6. EWMA Form (10-game window, alpha 0.10-0.15)
  7. Comeback Probability Index: 3rd period scoring + trailing performance + EN pull tendencies
  8. Score-State Adjusted Metrics: Separate xGF/xGA when leading, tied, trailing
  9. Starter-Backup Delta: Save% gap between starter and backup (for B2B assessment)
  10. Playoff Context Flags: Clinched/eliminated/magic number/games remaining
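
A few of the metrics above are simple enough to sketch directly. The function and parameter names are hypothetical, percentages are passed as fractions, and two details are assumptions the list does not settle: the `UNLUCKY` label for low PDO, and 3-in-4 taking precedence when a team is also on a B2B. The travel modifier is left as a caller-supplied number because no value is given.

```python
def special_teams_impact(pp_pct, lg_pp_pct, pp_opps_per_game,
                         pk_pct, lg_pk_pct, pk_times_per_game):
    """Goals above/below average per game from special teams (metric 2)."""
    return ((pp_pct - lg_pp_pct) * pp_opps_per_game
            + (pk_pct - lg_pk_pct) * pk_times_per_game)


def net_goalie_delta(starter_a_gsax60, starter_b_gsax60):
    """Positive values favor Team A's starter (metric 3)."""
    return starter_a_gsax60 - starter_b_gsax60


def fatigue_penalty(back_to_back=False, three_in_four=False,
                    travel_modifier=0.0):
    """xG penalty from metric 4; precedence of 3-in-4 is an assumption."""
    if three_in_four:
        base = -0.35
    elif back_to_back:
        base = -0.25
    else:
        base = 0.0
    return base + travel_modifier


def pdo_flag(pdo, low=0.985, high=1.015):
    """Flag 5v5 PDO outside the 0.985-1.015 band (metric 5)."""
    if pdo > high:
        return "LUCKY"
    if pdo < low:
        return "UNLUCKY"  # label assumed; the ruling only names the high side
    return "NORMAL"


# Invented inputs: league PP 21.0%, league PK 79.0%, 3 opportunities each way.
st_goals = special_teams_impact(0.242, 0.21, 3.0, 0.824, 0.79, 3.0)
```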

DATA FORMAT — Structured Prose Matchup Card

Format: Structured prose blocks with league percentiles, NOT raw JSON.

GAME: Team A @ Team B — [Date] [Time] ET
VENUE: [Arena] | HOME ADV: +0.2 goals (league standard)

GOALIE MATCHUP [CONFIRMED/EXPECTED/UNCONFIRMED]
  Team A: [Name] — sv% .921 (78th pct) | GSAx +4.2 (85th pct) | L10: .918
    vs Team B: .908 in 6 starts (BELOW career avg)
    Workload: 7 starts in 14d (HIGH — decay flag)
  Team B: [Name] — sv% .914 (55th pct) | GSAx -1.1 (38th pct) | L10: .920
  NET GOALIE EDGE: Team A +0.3 goals

[INTEL & CONTEXT]
  [CRITICAL]: None
  [MODERATE]: Team A missing Top-4 D (est. -0.15 xG)
  [CONTEXT]: Team B on 2nd of B2B (fatigue: -0.25 xG applied)

5v5 ENGINE (EWMA last 10)
  Team A: xGF/60 2.71 (72nd pct) | HDCF% 54.1% (80th pct) | PDO 1.008 (NORMAL)
  Team B: xGF/60 2.48 (45th pct) | HDCF% 48.2% (30th pct) | PDO 1.022 (LUCKY — regression likely)
  5v5 EDGE: Team A +0.23 xG/60

SPECIAL TEAMS
  Team A: PP 24.2% (68th pct) vs Team B PK 78.1% (25th pct) — MISMATCH FAVORS A
  Team B: PP 19.8% (40th pct) vs Team A PK 82.4% (65th pct) — NEUTRAL
  ST IMPACT: Team A +0.18 expected goals

MATCHUP DELTAS
  5v5 xG delta: +0.23 (Team A)
  Shot quality delta: +5.9% HDCF (Team A)
  Pace mismatch: Both moderate tempo (neutral)
  Season series: Team A leads 2-1, +1.3 GF differential

SITUATIONAL
  Fatigue: Team B on B2B (-0.25 xG applied)
  Referee: [Crew] — 4.2 penalties/game (BELOW avg, suppresses PP value)
  Playoff implications: Both in wild card race
  EWMA trend: Team A rising (3W streak), Team B flat

PROJECTED: Team A 3.1 — Team B 2.5 | Total 5.6 | Spread: Team A -0.6
QUALITY: HIGH (both goalies confirmed)
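
A small sketch of how one stat fragment of the card could be rendered with league percentiles attached. The helper names are hypothetical, and the percentile definition (share of league values strictly below the given value) is an assumption, since the card does not say how percentiles are computed.

```python
def ordinal(n):
    """72 -> '72nd', 11 -> '11th', 53 -> '53rd'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def percentile_rank(value, league_values):
    """Percent of league values strictly below `value` (assumed definition)."""
    below = sum(1 for v in league_values if v < value)
    return round(100 * below / len(league_values))


def _sv(x):
    """Format a save percentage hockey-style: 0.921 -> '.921'."""
    return f"{x:.3f}".lstrip("0")


def goalie_stat_block(sv_pct, sv_rank, gsax, gsax_rank, last10_sv):
    """Render the stat portion of a GOALIE MATCHUP line."""
    return (f"sv% {_sv(sv_pct)} ({ordinal(sv_rank)} pct) | "
            f"GSAx {gsax:+.1f} ({ordinal(gsax_rank)} pct) | "
            f"L10: {_sv(last10_sv)}")


line = goalie_stat_block(0.921, 78, 4.2, 85, 0.918)
```

Pairing every raw number with its percentile is what lets an analyst read the card without knowing the league distribution for each stat.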

NOISE TO TRIM


NHL-SPECIFIC FACTORS

  1. Goaltending dominance — single position explains 25-35% of variance
  2. Highest parity/randomness of major sports — better team wins ~55-58%
  3. 3-point game distortion — use regulation win% over raw points%
  4. B2B severity — key variable is starter-to-backup save% delta
  5. Score effects — leading teams turtle, trailing teams open up (violates Poisson independence)
  6. Shootout is noise — strip from goal metrics
  7. Trade deadline roster flux — aggressive EWMA weighting post-deadline
  8. April playoff context — clinched/eliminated/fighting changes everything
  9. Arena scorekeeper bias — public xG inherits rink bias
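
To illustrate factor 3, the sketch below compares two invented teams with identical points% but different regulation win%. The records are made up, and the function names are hypothetical.

```python
def points_pct(wins, ot_losses, games_played):
    """Standings points earned divided by points available (2 per game)."""
    return (2 * wins + ot_losses) / (2 * games_played)


def regulation_win_pct(regulation_wins, games_played):
    """Share of games won in regulation; ignores OT and shootout results."""
    return regulation_wins / games_played


# Team X: 40 wins (28 in regulation) and 10 OT/SO losses in 82 games.
# Team Y: 45 wins (42 in regulation) and 0 OT/SO losses in 82 games.
x_points, y_points = points_pct(40, 10, 82), points_pct(45, 0, 82)
x_rw, y_rw = regulation_win_pct(28, 82), regulation_win_pct(42, 82)
```

Both teams sit on 90 points, yet Team Y wins in regulation far more often. Points% hides that gap because overtime and shootout losses still earn a point.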

RESEARCH PIPELINE

Timing (3 passes + optional 4th)

Pass | Time (ET) | Purpose
1 | 10:00-11:30 AM | Morning skate: goalie confirmations, injury updates, lineup news
2 | 2:00-3:00 PM | Afternoon confirmation: goalie ~80% confirmed, AHL call-ups, roster moves
3 | 5:30-6:00 PM | Pre-game lock: late scratches, game-time decisions, final confirmations
3.5 (optional) | 8:00 PM | West Coast late games only

Search Query Templates

Auto-Adjust vs Context-Only

Finding | Action | Cap
Goalie change (confirmed) | FULL RECOMPUTE (not a percentage adjustment) | N/A
Key player OUT (>15% team xGF) | Auto-adjust | 15% single, 25% cumulative
B2B confirmed | Apply team-specific B2B penalty | Pre-computed
Goalie "expected" (not confirmed) | HALVE the adjustment | Half of computed delta
Goalie "unconfirmed" | Context only, no auto-adjust | -
Line changes | Context only | -
Coach quotes/motivation | Context only | -
Referee assignment | Context only (flag for analysts) | -
AHL call-up (bottom 6) | Context only | -
AHL call-up (replaces top 6 F) | Treat as injury adjustment | 15% cap
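
The goalie-status and injury-cap rules in this table can be sketched as two small functions. The names are hypothetical, and the sketch assumes the caps apply to each absent player's share of team xGF expressed as a fraction. For a confirmed goalie change the table calls for a full recompute; here that is represented by applying the whole delta that the recompute produced.

```python
def goalie_delta_to_apply(status, computed_delta):
    """Scale the goalie adjustment by confirmation status."""
    status = status.upper()
    if status == "CONFIRMED":
        return computed_delta        # result of the full recompute
    if status == "EXPECTED":
        return computed_delta / 2    # halve the adjustment
    if status == "UNCONFIRMED":
        return 0.0                   # context only, no auto-adjust
    raise ValueError(f"unknown goalie status: {status!r}")


def capped_injury_reduction(player_xgf_shares,
                            single_cap=0.15, cumulative_cap=0.25):
    """Total fractional xGF reduction for absent players, with both caps."""
    total = sum(min(share, single_cap) for share in player_xgf_shares)
    return min(total, cumulative_cap)


# Invented example: two absent players worth 18% and 16% of team xGF.
reduction = capped_injury_reduction([0.18, 0.16])
```

Applying the single-player cap before the cumulative cap means no one absence can move the projection by more than 15%, and any combination of absences stops at 25%.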

Confidence Gating

Key Sources


PEER REVIEW FINDINGS (What All 5 Advisors Missed)

1. Calibration & Feedback Loops

No way to measure if changes improve the model. Need: Brier scores, closing line value (CLV) tracking vs Pinnacle, incremental feature rollout.
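
As a rough sketch of the two measurements named here: the Brier score is the mean squared error between forecast probabilities and 0/1 outcomes, and CLV is shown using one common definition, the ratio of the decimal price taken to the closing decimal price minus one. The CLV definition and all numbers are assumptions; the ruling does not specify how CLV should be computed.

```python
def brier_score(forecast_probs, outcomes):
    """Mean squared error of probabilities vs outcomes (1 = event happened)."""
    if len(forecast_probs) != len(outcomes):
        raise ValueError("forecasts and outcomes must be the same length")
    return sum((p - o) ** 2
               for p, o in zip(forecast_probs, outcomes)) / len(outcomes)


def closing_line_value(bet_decimal_odds, closing_decimal_odds):
    """Positive when the price taken was better than the closing price."""
    return bet_decimal_odds / closing_decimal_odds - 1


# Invented examples.
score = brier_score([0.6, 0.3], [1, 0])   # lower is better; 0.25 is a coin flip
clv = closing_line_value(2.10, 2.00)      # took 2.10, market closed at 2.00
```

Logging both per game, before and after each feature is switched on, gives the incremental rollout something concrete to compare.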

2. Poisson Independence Violation

NHL goals are NOT independent events. Score effects are systematic — leading teams turtle, trailing teams open up. Standard Poisson with blended rates systematically misprices totals. Need score-state-separated xG rates.
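
For reference, the independent-Poisson baseline that this finding criticizes looks roughly like the sketch below: each team's goals are drawn from its own Poisson rate, and the joint probability of any scoreline is the product of the two. The function names and the 20-goal truncation are assumptions made for the illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def regulation_outcome_probs(lam_a, lam_b, max_goals=20):
    """Return (P(A wins), P(tied), P(B wins)) assuming independent scoring."""
    p_a = p_tie = p_b = 0.0
    for goals_a in range(max_goals + 1):
        prob_a = poisson_pmf(goals_a, lam_a)
        for goals_b in range(max_goals + 1):
            # Independence assumption: joint probability is a plain product.
            p = prob_a * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                p_a += p
            elif goals_a == goals_b:
                p_tie += p
            else:
                p_b += p
    return p_a, p_tie, p_b


# Projected goals from the sample matchup card: Team A 3.1, Team B 2.5.
p_a, p_tie, p_b = regulation_outcome_probs(3.1, 2.5)
```

The product on the marked line is exactly where score effects are ignored. A score-state-separated model would replace the two fixed rates with rates that depend on whether a team is leading, tied, or trailing.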

3. Arena Scorekeeper Bias ("Rink Bias")

Public play-by-play is manually charted by home arena officials with known biases. Without normalization, xG models inherit geographical biases.

4. No Proprietary xG Model

Everyone uses the same public MoneyPuck models. True alpha = building a proprietary model from raw play-by-play with novel features. Major effort but major edge.

5. Data Quality & Validation Framework

No health checks on scrapers. Need: automated ingestion checks, null-field thresholds, duplicate detection, schema versioning, alerting on failures.
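
A minimal sketch of two of the checks listed here, null-field thresholds and duplicate detection, run over ingested rows held as dictionaries. The field names, the 5% threshold, and the sample rows are assumptions for illustration; schema versioning and alerting are out of scope for the sketch.

```python
def validate_rows(rows, key_fields, required_fields, max_null_rate=0.05):
    """Return a list of human-readable issues; an empty list means healthy."""
    if not rows:
        return ["no rows ingested"]
    issues = []
    # Null-field threshold: flag any required field missing too often.
    for field in required_fields:
        nulls = sum(1 for row in rows if row.get(field) in (None, ""))
        rate = nulls / len(rows)
        if rate > max_null_rate:
            issues.append(
                f"{field}: null rate {rate:.0%} exceeds {max_null_rate:.0%}")
    # Duplicate detection on the natural key.
    seen, duplicates = set(), 0
    for row in rows:
        key = tuple(row.get(field) for field in key_fields)
        if key in seen:
            duplicates += 1
        seen.add(key)
    if duplicates:
        issues.append(
            f"{duplicates} duplicate row(s) on key {', '.join(key_fields)}")
    return issues


# Invented sample: one missing xg value and one repeated (game_id, team) key.
sample = [
    {"game_id": "g1", "team": "AAA", "xg": 2.7},
    {"game_id": "g1", "team": "BBB", "xg": None},
    {"game_id": "g2", "team": "AAA", "xg": 3.1},
    {"game_id": "g2", "team": "AAA", "xg": 3.1},
]
issues = validate_rows(sample, ("game_id", "team"), ("xg",))
```

Returning messages instead of raising lets a scheduler decide whether an issue should block the run or only raise an alert.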


BOSS RULINGS (2026-04-01)

  1. Build order — All tiers at once, in phases. Each phase gets 2 simplify passes, a math check, and test verification before moving to the next phase.
  2. Delta formula — Both. Multiplicative is primary (council-endorsed). Log additive result alongside for comparison in post-mortem.
  3. EWMA decay factor — Track multiple alphas (0.10, 0.12, 0.15) simultaneously. Primary feeds the model, others logged for post-mortem comparison to find the best fit over time.
  4. Poisson model — Stick with Poisson for now. Build it clean enough to swap in negative binomial later if post-mortems show Poisson is consistently off.
  5. Proprietary xG — Use MoneyPuck public xG as baseline now. Start collecting raw shot data from NHL API in the background. Build our own xG model when enough data is collected and pipeline is stable.
  6. CLV tracking — Yes, build it.
  7. Data validation — Build as a shared service for ALL desks (not per-desk). DeepSeek R1 assigned. Added to Future Additions Tracker on dashboard.
  8. Rink bias — No. Do not normalize for scorekeeper effects.
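
Ruling 3 can be sketched as computing one EWMA per candidate alpha over the same game log, feeding one to the model and logging the rest. Which alpha is primary is not stated in the ruling, so `primary=0.12` below is a placeholder assumption, as are the 10-game window default, seeding with the oldest game in the window, and the sample values.

```python
def ewma(values, alpha, window=10):
    """Exponentially weighted mean of the last `window` values, oldest first."""
    recent = list(values)[-window:]
    if not recent:
        raise ValueError("need at least one value")
    smoothed = recent[0]  # seed with the oldest game in the window (assumed)
    for x in recent[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def ewma_by_alpha(values, alphas=(0.10, 0.12, 0.15), primary=0.12):
    """Return (primary EWMA, {alpha: EWMA}) so every alpha can be logged."""
    tracked = {alpha: ewma(values, alpha) for alpha in alphas}
    return tracked[primary], tracked


# Invented per-game xGF values, oldest first.
model_input, all_alphas = ewma_by_alpha([2.0, 3.0])
```

Keeping the non-primary values in the log is what makes the post-mortem comparison possible without re-running history.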

COUNCIL METADATA

Detail | Value
Council date | 2026-04-01
Advisory responses | 5 (all completed)
Peer reviews | 5 (all completed)
Strongest advisor | Sonnet (4/5 votes)
Biggest blind spot | gpt-oss (broken delta formula), Grok (no self-critique)
Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/nhl-data-audit/
Original prompt | docs/nhl-panel-prompt-filled.md
Gemini response (reference) | /tmp/gemini-nhl-response.txt
Grok response (reference) | /tmp/grok-nhl-response.txt
Source: ~/edgeclaw/results/panel-results/nhl-data-audit-ruling.md