MLB Player Props Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: gpt-oss (3 of 5 peer review votes)
Status: PENDING BOSS RULING on open questions


COUNCIL SUMMARY

Where Advisors Agreed

  1. Game-log data is the #1 missing layer — per-game stats for every pitcher and batter from MLB Stats API
  2. Statcast data essential — exit velocity, barrel rate, launch angle, sprint speed from Baseball Savant
  3. EWMA with prop-specific decay rates — different spans for different stat types (K=8 games, HR=20, SB=25)
  4. 8 separate edge scanners — one per prop type with appropriate distribution
  5. FanDuel alt lines as sharp anchor for player prop pricing
  6. Platoon splits (L/R) critical at individual batter level
  7. Catcher framing and pop time data needed for pitcher K props and SB props
  8. Park factors at prop level (HR factor hand-specific, K factor by park)
  9. 3-season backfill recommended for historical data
  10. Cross-prop correlation tracking needed (e.g., K-over and Outs-over for the same pitcher correlate at roughly 0.6)
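
The prop-specific decay in item 3 can be sketched as follows. This is a minimal illustration, not the council's implementation: the spans come from the summary above, while the `alpha = 2 / (span + 1)` convention, the recursive update, and the names `EWMA_SPAN` and `ewma` are assumptions.

```python
from typing import Sequence

# Spans in games, taken from the council summary (K=8, HR=20, SB=25).
EWMA_SPAN = {"strikeouts": 8, "home_runs": 20, "stolen_bases": 25}


def ewma(values: Sequence[float], span: int) -> float:
    """Exponentially weighted mean of per-game values, ordered oldest to newest."""
    if not values:
        raise ValueError("need at least one game")
    # Assumed convention: alpha = 2 / (span + 1), so a shorter span
    # puts more weight on the most recent games.
    alpha = 2.0 / (span + 1)
    avg = float(values[0])
    for x in values[1:]:
        avg = alpha * x + (1.0 - alpha) * avg
    return avg


# Usage: the same recent spike moves a short-span baseline more than a long-span one.
recent_games = [0, 0, 10]
fast = ewma(recent_games, EWMA_SPAN["strikeouts"])
slow = ewma(recent_games, EWMA_SPAN["home_runs"])
```

A short span suits stats that stabilize quickly, and a long span suits rare events where single games carry little signal.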

Where Advisors Disagreed

  1. Database engine: gpt-oss recommended PostgreSQL + TimescaleDB, Sonnet specified exact SQLite schemas. Council verdict: SQLite WAL for current scale.
  2. Strikeouts distribution: Gemini proposed Conway-Maxwell-Poisson, Sonnet used Negative Binomial. Council verdict: NB is sufficient and simpler — COM-Poisson adds complexity without clear benefit.
  3. Total bases distribution: Gemini used Zero-Inflated NB, Grok used multinomial per-PA outcomes. Council verdict: Multinomial per-PA is more correct (each at-bat produces discrete base outcomes).
  4. Architecture scope: gpt-oss built full production system (ETL, caching, Celery workers, React dashboard). Others focused on data/modeling only. Council verdict: Hybrid — gpt-oss architecture with Sonnet's distribution choices and EWMA specifics.
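
The multinomial per-PA verdict in item 3 can be illustrated by convolving a single plate appearance's base-outcome distribution over the projected number of PAs. This sketch assumes independent, identically distributed PAs and a fixed PA count. The function names and the rates in the usage line are hypothetical placeholders, not fitted values.

```python
from typing import List


def total_bases_pmf(p_1b: float, p_2b: float, p_3b: float, p_hr: float,
                    n_pa: int) -> List[float]:
    """Exact distribution of total bases over n_pa plate appearances.

    Index i of the returned list is P(total bases == i).
    """
    p_zero = 1.0 - (p_1b + p_2b + p_3b + p_hr)  # outs, walks, HBP, etc.
    if p_zero < 0:
        raise ValueError("hit-type probabilities sum to more than 1")
    per_pa = {0: p_zero, 1: p_1b, 2: p_2b, 3: p_3b, 4: p_hr}

    pmf = [1.0]  # before any PA, total bases is 0 with certainty
    for _ in range(n_pa):
        nxt = [0.0] * (len(pmf) + 4)  # one PA adds at most 4 bases
        for tb, p in enumerate(pmf):
            for bases, q in per_pa.items():
                nxt[tb + bases] += p * q
        pmf = nxt
    return pmf


def tb_prob_over(pmf: List[float], line: float) -> float:
    """P(total bases > line), e.g. line=1.5 means 2 or more total bases."""
    return sum(p for tb, p in enumerate(pmf) if tb > line)


# Usage with placeholder per-PA rates and 4 projected PAs.
pmf = total_bases_pmf(0.15, 0.05, 0.005, 0.04, n_pa=4)
p_over_1_5 = tb_prob_over(pmf, 1.5)
```

Because each PA contributes a discrete 0 to 4 bases, the convolution gives the exact distribution without needing a zero-inflation term.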

Strongest Arguments (from peer review)

gpt-oss wins with the only complete full-stack architecture:

Sonnet runner-up with deepest analytical specifics:

Biggest Blind Spot

Opus: Response appeared truncated or incomplete in peer review. When complete, it focused heavily on matchup card formatting rather than data architecture and pipeline design.

What Everyone Missed (from peer reviews)

  1. Player identity mapping system — MLB ID, FanGraphs ID, Savant ID, Kalshi ID, retail betting IDs are all different. Need canonical player_master table with ID cross-references, alias mapping, and trade/call-up handling.
  2. Umpire home plate assignment — Strike zone changes every night. Pitcher K prop and BB prop are massively impacted by umpire zone size. Need daily umpire-assignment pipeline.
  3. Retractable roof status — Seven MLB stadiums have retractable roofs. Open vs. closed drastically changes HR/TB probabilities. The NWS API does not report roof status, so a separate data feed is needed.
  4. Real-time odds ingestion — Polling FanDuel/Kalshi via REST every few minutes is too slow. Need streaming odds architecture for edge detection.
  5. Manager hook probability — Pitcher outs/K totals depend on when manager pulls the pitcher. Need historical hook-point model per manager.
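
The ID crosswalk in item 1 might look like the sketch below. It is one possible shape, using SQLite to match the council's database verdict. The column names, the `player_alias` table, and the `make_db` and `resolve` helpers are all hypothetical. Trade and call-up handling is omitted.

```python
import sqlite3
from typing import Optional, Union

# Hypothetical schema: one canonical row per player, one column per external ID.
SCHEMA = """
CREATE TABLE player_master (
    player_id    INTEGER PRIMARY KEY,  -- internal canonical key
    full_name    TEXT NOT NULL,
    mlb_id       INTEGER UNIQUE,
    fangraphs_id TEXT UNIQUE,
    savant_id    INTEGER UNIQUE,
    kalshi_id    TEXT UNIQUE
);
CREATE TABLE player_alias (
    alias     TEXT NOT NULL,           -- alternate spelling used by a source
    player_id INTEGER NOT NULL REFERENCES player_master(player_id),
    PRIMARY KEY (alias, player_id)
);
"""

# Whitelist of source names to columns, so the column is never user-supplied.
ID_COLUMNS = {
    "mlb": "mlb_id",
    "fangraphs": "fangraphs_id",
    "savant": "savant_id",
    "kalshi": "kalshi_id",
}


def make_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create a database with the crosswalk tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def resolve(conn: sqlite3.Connection, source: str,
            external_id: Union[int, str]) -> Optional[int]:
    """Map an external ID from one source to the canonical player_id."""
    column = ID_COLUMNS[source]
    row = conn.execute(
        f"SELECT player_id FROM player_master WHERE {column} = ?",
        (external_id,),
    ).fetchone()
    return row[0] if row else None
```

Every game-log and odds row would then reference `player_id` only, so a source that changes its own identifiers requires an update in one place.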

BUILD PLAN

Phase 1: Player Master & Game Logs

mlb_player_master:

mlb_pitcher_game_logs:

mlb_batter_game_logs:

Phase 2: Baselines & EWMA

mlb_pitcher_prop_baselines:

mlb_batter_prop_baselines:

Phase 3: Matchup Context

mlb_prop_matchup_context:

Phase 4: Distribution Models

| Prop | Distribution | EWMA Span | Key Parameters |
|---|---|---|---|
| Pitcher Strikeouts | Negative Binomial | 8 games | μ from K/9 × proj_IP × opp_K_rate × ump_zone; k from variance |
| Pitcher Outs | Truncated Normal | 12 games | μ from avg outs; σ from variance; hook probability model |
| Batter Hits | Beta-Binomial | 12 games | α, β from hit rate history; n from projected PA × platoon |
| Batter HRs | Zero-Inflated Poisson | 20 games | λ from HR/PA × PA × park_HR × weather; π from ~85% zero |
| Batter RBIs | Monte Carlo (10K) | 15 games | Lineup simulation with baserunner states |
| Batter Total Bases | Multinomial per PA | 15 games | P(1B), P(2B), P(3B), P(HR) summed over projected PAs |
| Batter Runs | Monte Carlo (10K) | 15 games | Lineup simulation; depends on subsequent batters |
| Stolen Bases | Bernoulli per opp | 25 games | P(attempt) × P(success) × opportunities; catcher pop time |
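
As one worked instance of the models listed above, the pitcher strikeout row can be sketched with a method-of-moments Negative Binomial. The projected mean and variance are taken as inputs here, so the K/9, opponent, and umpire adjustments are out of scope. The function names are hypothetical, and `r` plays the role of the dispersion parameter that the row calls k.

```python
import math
from typing import Tuple


def nb_params(mean: float, var: float) -> Tuple[float, float]:
    """Method-of-moments fit: returns (r, p) with mean = r(1-p)/p, var = r(1-p)/p^2."""
    if var <= mean:
        raise ValueError("Negative Binomial requires variance > mean")
    r = mean ** 2 / (var - mean)
    p = mean / var
    return r, p


def nb_pmf(k: int, r: float, p: float) -> float:
    """P(X == k) for a Negative Binomial with real-valued r, computed in log space."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def k_prob_over(line: float, mean: float, var: float) -> float:
    """P(strikeouts > line), e.g. line=5.5 means 6 or more strikeouts."""
    r, p = nb_params(mean, var)
    k_max = math.floor(line)
    return 1.0 - sum(nb_pmf(k, r, p) for k in range(k_max + 1))


# Usage with placeholder inputs: projected mean 6.0 K, variance 9.0.
p_over_5_5 = k_prob_over(5.5, mean=6.0, var=9.0)
```

The fit fails by design when variance does not exceed the mean, since in that case the data show no overdispersion and a plain Poisson would be the natural fallback.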

Phase 5: Edge Scanners (8)

Common engine: FanDuel alt lines → de-vig → fit distribution → compare against Kalshi price → require a minimum 4¢ edge after the 7% fee
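
The de-vig and edge steps of this engine can be sketched as below; the distribution-fitting step is skipped. Two assumptions are made that the ruling does not specify: the vig is removed by proportional normalization of a two-way market, and the fee is modeled as 7% of price × (1 − price) per contract. The fee model in particular should be checked against the exchange's current fee schedule. All function names are hypothetical.

```python
from typing import Tuple


def american_to_implied(odds: int) -> float:
    """Convert American odds to the implied (vig-inclusive) probability."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def devig_two_way(over_odds: int, under_odds: int) -> Tuple[float, float]:
    """Proportional de-vig: scale both implied probabilities to sum to 1."""
    p_over = american_to_implied(over_odds)
    p_under = american_to_implied(under_odds)
    total = p_over + p_under
    return p_over / total, p_under / total


def edge_cents(fair_prob: float, price_cents: float,
               fee_rate: float = 0.07) -> float:
    """Expected edge per contract in cents, net of the assumed fee."""
    price = price_cents / 100.0
    # Assumed fee model: fee_rate * price * (1 - price), expressed in cents.
    fee = fee_rate * price * (1.0 - price) * 100.0
    return fair_prob * 100.0 - price_cents - fee


def is_playable(fair_prob: float, price_cents: float,
                min_edge_cents: float = 4.0) -> bool:
    """Apply the minimum-edge threshold from the common engine."""
    return edge_cents(fair_prob, price_cents) >= min_edge_cents


# Usage: sharp line at -150 / +120, exchange "yes" contract offered at 50 cents.
fair_over, _ = devig_two_way(-150, 120)
playable = is_playable(fair_over, price_cents=50)
```

Keeping the fee inside `edge_cents` means the 4¢ threshold is always applied to the net figure, which matches the wording of the engine description.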

Cross-prop correlation rules:

Phase 6: Dashboard


OPEN QUESTIONS FOR BOSS RULING

  1. 3-season backfill: Council recommends downloading 3 full seasons of game logs (2023-2025) for every pitcher and batter. This is ~150K pitcher games + ~500K batter games. Confirm?

  2. Statcast data depth: Full pitch-by-pitch data is massive. Should we collect summary Statcast (per-game exit velo, barrel rate) or full pitch-level data?

  3. Player ID mapping: Need canonical player_master table mapping MLB, FanGraphs, Savant, and Kalshi IDs. Build manually or find existing crosswalk?

  4. Umpire data pipeline: Should we build daily umpire assignment scraper now, or defer to later phase?

  5. Manager hook model: Should we build a logistic regression model predicting when managers pull pitchers (affects outs/K totals)? Or use historical averages?

  6. Monte Carlo expense: RBI and Runs props need full game simulation (10K runs per game). Build now or defer?


COUNCIL METADATA

| Detail | Value |
|---|---|
| Council date | 2026-04-01 |
| Advisory responses | 5 (all completed) |
| Peer reviews | 5 (all completed) |
| Strongest advisor | gpt-oss (3/5 votes) |
| Runner-up | Sonnet (2/5 votes) |
| Biggest blind spot | Opus (truncated response) |
| Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/mlb-player-props-data-audit/ |

Source: ~/edgeclaw/results/panel-results/mlb-player-props-data-audit-ruling.md