MLB Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: gpt-oss (3 of 5 peer review votes)
Status: PENDING BOSS RULING on open questions

COUNCIL SUMMARY

Where Advisors Agreed

Starting pitcher is the #1 data input — SP quality metrics (ERA, FIP, xFIP, WHIP, K/9, BB/9) are foundational
Bullpen availability tracking is #2 — pitch counts, days rest, multi-day workload all must be tracked daily
Weather data from NWS API — critical for outdoor parks, affects totals significantly
Park factors per stadium — Coors (extreme), Yankee Stadium (short porch), Oracle Park (marine layer)
Platoon splits (L/R matchups) — 30-50 point wOBA swings at team level
EWMA with stat-specific decay rates — different alphas for SP metrics vs team batting vs bullpen
4 separate edge scanners — Moneyline, Run Line, Totals, First 5 Innings
Poisson/Negative Binomial for run distributions — vanilla Poisson insufficient due to overdispersion
First-time-through-order (FTTO) splits essential for F5 market
Early-season protocol — limited SP sample in April requires blending with projections

Where Advisors Disagreed

Database engine: gpt-oss recommended PostgreSQL + TimescaleDB with 20+ tables. Opus used SQLite with 14 tables. Council verdict: SQLite in WAL mode at the current scale, with the schema designed to be migration-ready.
Distribution model: gpt-oss used Zero-Inflated Negative Binomial, Opus used Poisson with walk-off correction, Gemini used basic Poisson. Council verdict: Negative Binomial preferred for run scoring (overdispersion), with walk-off correction for run line.
BvP (batter vs pitcher) data: Opus explicitly disqualified individual BvP due to small samples. Others included it with caveats. Council verdict: Drop individual BvP, use platoon splits at team level instead.
Weather source: Grok recommended OpenWeatherMap, Opus specified NWS API only. Council verdict: NWS API only (free, reliable, already established policy).

Strongest Arguments (from peer review)

gpt-oss wins with the most complete data architecture design:

End-to-end data flow: raw → clean → gold layer with audit trail
20+ tables with PK/FK, explicit data types, and change-history tables
Row-level checksums on daily pulls, staleness alarms, schema drift detection
Automated data quality pipeline with alerting
Four ready-to-run edge scanner blueprints per market type
Distribution model justification with empirical variance-to-mean ratios
Exact API endpoints, pagination limits, rate-limit handling, fallback plans
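
The row-level checksums, schema drift detection, and staleness alarms listed above could look roughly like the following. This is a minimal sketch, not gpt-oss's actual design: the function names, the example row, and the idea of hashing a sorted-key JSON serialization are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timedelta


def row_checksum(row: dict) -> str:
    """SHA-256 of a row, serialized with sorted keys so field order is irrelevant."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def schema_drift(expected_columns: set, row: dict) -> tuple:
    """Return (missing, unexpected) column names relative to the expected schema."""
    actual = set(row)
    return sorted(expected_columns - actual), sorted(actual - expected_columns)


def is_stale(last_pull: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when the most recent successful pull is older than the freshness limit."""
    return now - last_pull > max_age


# Hypothetical row from a daily pull; the values are illustrative only.
row = {"pitcher_id": 123, "date": "2026-04-01", "innings_pitched": 6.0}
reordered = {"date": "2026-04-01", "innings_pitched": 6.0, "pitcher_id": 123}
print(row_checksum(row) == row_checksum(reordered))
print(schema_drift({"pitcher_id", "date", "earned_runs"}, row))
```

Storing the checksum next to each pulled row makes a re-pull that silently changes a historical value detectable by comparing hashes rather than every column.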

Opus runner-up with deepest baseball analytics knowledge:

Walk-off truncation problem for run line fully worked through
Humidity physics (humid air LESS dense = ball carries farther)
Bullpen 4-tier system with specific pitch-count thresholds
Dual-EWMA crossover system (fast vs slow signal)
Opener/bullpen game detection (a gap only Opus identified)
Self-assessment of own weaknesses

Biggest Blind Spot

Gemini: Skeleton schema (4 tables, no indexes, no constraints), recommended wrong weather API (OpenWeatherMap instead of NWS), no formulas for distribution parameters, vague source references without API specs.

What Everyone Missed (from peer reviews)

Real audit pipeline vs data feeds — All advisors designed data collection but none built proper data quality observability: freshness SLAs, source reconciliation, anomaly detection, data lineage, reproducibility.
Market data integration layer — Real-time odds ingestion, line movement tracking, de-vigging architecture, steam/reverse-edge alerts.
P&L attribution per data input — No way to measure whether SPQC, bullpen availability, or weather adjustments are actually profitable over time.
Lineup delta detection — Star position player rest days happen daily; need automated parser comparing official vs projected lineup.
Kalshi-specific liquidity constraints — Thin exchange, position sizing must account for market impact.

BUILD PLAN

Phase 1: Core MLB Data Tables

mlb_sp_game_logs:

pitcher_id, pitcher_name, date, game_id, team, opponent
innings_pitched, earned_runs, hits_allowed, walks, strikeouts, pitches_thrown
game_score, era_after, fip_after, xfip_after, whip_after
k_per_9, bb_per_9, hr_per_9, gb_rate
ftto_woba (first time through order), ftto_k_rate, ftto_bb_rate
Source: MLB Stats API + Baseball Savant

mlb_sp_baselines:

pitcher_id, date, stat_type
last_3_starts, last_5_starts, last_10_starts, season_avg
ewma_015 (fast, SP form), ewma_010 (standard), ewma_005 (slow, bullpen ERA)
steamer_projection, zips_projection (preseason/updating)
games_started, season_ip
early_season_flag (boolean — fewer than 5 starts)
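
The three EWMA columns above can be illustrated with a short sketch. It assumes the column suffixes map to alphas of 0.15, 0.10, and 0.05 and that the series is seeded with its first observation; the seeding choice and the sample game scores are assumptions, not part of the ruling.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average over values ordered oldest to newest.

    Seeds with the first observation (an assumption), then applies
    avg = alpha * x + (1 - alpha) * avg for each later value.
    """
    if not values:
        raise ValueError("ewma needs at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    avg = values[0]
    for x in values[1:]:
        avg = alpha * x + (1 - alpha) * avg
    return avg


# Alphas implied by the column names in mlb_sp_baselines.
ALPHAS = {"ewma_015": 0.15, "ewma_010": 0.10, "ewma_005": 0.05}

# Hypothetical game scores for one starter, oldest first.
game_scores = [52, 61, 48, 70, 66]
baselines = {name: ewma(game_scores, a) for name, a in ALPHAS.items()}
print(baselines)
```

A higher alpha reacts faster to the most recent start, which is why the fast column is the one labeled for SP form.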

mlb_team_batting:

team, date, opponent_sp_hand (L/R)
team_woba, team_ops, team_wrc_plus
vs_lhp_woba, vs_rhp_woba (platoon splits)
home_woba, away_woba
last_7_woba, last_14_woba, last_30_woba
iso_power, k_rate, bb_rate, barrel_rate

mlb_bullpen_status:

team, date, pitcher_id, pitcher_name, role (closer/setup/middle/long/opener)
availability_tier (GREEN/YELLOW/RED/BLACK)
yesterday_pitches, two_days_pitches, three_days_pitches
appearances_last_3d, appearances_last_7d, pitches_last_7d
high_leverage_innings_last_7d
Criteria: BLACK = unavailable; RED = 30+ pitches yesterday OR pitched on 3 of the last 4 days; YELLOW = 20-29 pitches yesterday; GREEN = available
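
The tier criteria can be written as a small classifier. This is a sketch under stated assumptions: the function name and its arguments are hypothetical, "3 of last 4 days" is read as having pitched on at least three of the previous four days, and the checks run in order of severity.

```python
def availability_tier(unavailable, yesterday_pitches, days_pitched_last_4):
    """Map one reliever's recent workload to an availability tier.

    unavailable: True for injured, optioned, or otherwise ruled-out arms.
    yesterday_pitches: pitches thrown yesterday (0 if he did not pitch).
    days_pitched_last_4: how many of the previous 4 days he appeared.
    """
    if unavailable:
        return "BLACK"
    if yesterday_pitches >= 30 or days_pitched_last_4 >= 3:
        return "RED"
    if 20 <= yesterday_pitches <= 29:
        return "YELLOW"
    return "GREEN"


print(availability_tier(False, 24, 1))
```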

mlb_game_weather:

game_id, date, park_id
temperature_f, humidity_pct, wind_speed_mph
wind_direction_relative (OUT_TO_CF/IN_FROM_CF/CROSSWIND_LR/CROSSWIND_RL/CALM)
wind_run_impact (park-specific multiplier × wind component)
precip_probability, precip_type
air_density_adjustment (altitude + humidity + temp)
roof_status (open/closed/retractable_open/retractable_closed/dome)
Source: NWS API only
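
One way the wind_direction_relative label and the wind component could be derived is sketched below. Several things here are assumptions rather than part of the ruling: the park's compass bearing from home plate to center field is supplied by the caller, the wind direction follows the meteorological convention (degrees the wind blows from), the sector boundaries are 45 degrees either side of the center-field line, CALM means under 3 mph, and CROSSWIND_LR is read as left field toward right field.

```python
import math

CALM_MPH = 3.0  # assumed threshold, not specified in the ruling


def relative_angle(wind_from_deg, cf_bearing_deg):
    """Signed angle in [-180, 180) between the direction the wind blows
    toward and the home-plate-to-center-field line. Positive is clockwise,
    i.e. toward right field."""
    blowing_to = (wind_from_deg + 180.0) % 360.0
    return (blowing_to - cf_bearing_deg + 180.0) % 360.0 - 180.0


def classify_wind(speed_mph, wind_from_deg, cf_bearing_deg):
    """Return one of the wind_direction_relative labels."""
    if speed_mph < CALM_MPH:
        return "CALM"
    rel = relative_angle(wind_from_deg, cf_bearing_deg)
    if abs(rel) <= 45.0:
        return "OUT_TO_CF"
    if abs(rel) >= 135.0:
        return "IN_FROM_CF"
    return "CROSSWIND_LR" if rel > 0 else "CROSSWIND_RL"


def out_component_mph(speed_mph, wind_from_deg, cf_bearing_deg):
    """Wind speed projected onto the center-field line (positive = blowing out)."""
    rel = math.radians(relative_angle(wind_from_deg, cf_bearing_deg))
    return speed_mph * math.cos(rel)


# Hypothetical park whose center field is due north of home plate.
print(classify_wind(12.0, 180.0, 0.0), out_component_mph(12.0, 180.0, 0.0))
```

The projected component is what a park-specific multiplier would scale to produce wind_run_impact; the multiplier itself is not defined in this document.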

mlb_park_factors:

park_id, park_name, team, season
runs_factor, hr_factor_lhb, hr_factor_rhb, hits_factor
dimensions_lf, dimensions_cf, dimensions_rf
altitude_ft, roof_type
Source: FanGraphs park factors

mlb_umpire_data:

umpire_id, umpire_name, date, game_id
career_k_above_avg, career_bb_above_avg, career_runs_above_avg
season_k_rate, season_bb_rate
abs_challenge_rate, abs_overturn_rate (new for 2026)
zone_size_index (relative to league average)

mlb_lineups:

game_id, date, team, batting_order (1-9)
player_id, player_name, position
confirmed (boolean), source, timestamp
season_wrc_plus, vs_hand_wrc_plus
lineup_total_wrc_plus, projected_total_wrc_plus
delta_wrc_plus (flags rest days)

Phase 2: Derived Metrics

Metric	Formula	Purpose
SP Quality Composite (SPQC)	Weighted: 0.3×xFIP + 0.3×FIP + 0.2×ERA + 0.2×EWMA_GS	Single SP quality number
Bullpen Availability Index (BAI)	Weighted avg of available arms × role importance	Team bullpen readiness score
Weather Run Factor (WRF)	wind_component × park_multiplier + temp_adj + humidity_adj + altitude_adj	Total weather impact on runs
Platoon Advantage Score	Team wOBA vs SP hand − team season wOBA	Measures platoon edge
FTTO Decay Rate	SP's innings 1-3 wOBA vs innings 4-5 wOBA	How much SP degrades through order
Day-Night Fatigue	Team batting stats in day-after-night games vs baseline	Quantified fatigue effect
Lineup Strength Delta	Actual lineup wRC+ − projected lineup wRC+	Detects star rest days
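
Three of the formulas in the table are simple enough to sketch directly. The function names are hypothetical. The table does not say how EWMA_GS is put on the same scale as the run-based metrics, so the sketch takes an already rescaled value as input; that rescaling remains an open assumption. The rest-day flag reads the matchup card's 15 wRC+ threshold as a shortfall of actual versus projected, which is also an assumption.

```python
SPQC_WEIGHTS = {"xfip": 0.3, "fip": 0.3, "era": 0.2, "ewma_gs": 0.2}


def spqc(xfip, fip, era, ewma_gs_scaled):
    """SP Quality Composite as the weighted sum from the Phase 2 table.

    ewma_gs_scaled must already be on the same scale as the other inputs.
    """
    w = SPQC_WEIGHTS
    return (w["xfip"] * xfip + w["fip"] * fip
            + w["era"] * era + w["ewma_gs"] * ewma_gs_scaled)


def platoon_advantage_points(woba_vs_sp_hand, season_woba):
    """Team wOBA vs the SP's hand minus season wOBA, in wOBA points (x1000)."""
    return (woba_vs_sp_hand - season_woba) * 1000.0


def lineup_strength_delta(actual_wrc_plus, projected_wrc_plus):
    """Actual lineup wRC+ minus projected lineup wRC+."""
    return actual_wrc_plus - projected_wrc_plus


def is_key_rest_day(actual_wrc_plus, projected_wrc_plus, threshold=15.0):
    """Flag when the actual lineup falls short of projection by more than threshold."""
    return projected_wrc_plus - actual_wrc_plus > threshold


# Illustrative inputs only.
print(spqc(3.8, 3.6, 3.4, 3.5))
print(platoon_advantage_points(0.330, 0.315))
print(lineup_strength_delta(95, 112), is_key_rest_day(95, 112))
```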

Phase 3: Distribution Models Per Market

Market	Distribution	Parameters	Notes
Moneyline	Negative Binomial (each team's runs)	μ from SPQC × batting × park × weather; k from team variance	Win prob = P(runs_home > runs_away)
Run Line (-1.5)	Negative Binomial with walk-off correction	Same μ and k, plus home walk-off truncation	Home team does not bat in the bottom of the 9th when leading, which reduces the home -1.5 cover probability
Totals	Negative Binomial (combined runs)	μ_total = μ_home + μ_away; adjusted for weather, park, bullpen	Over/under probability at each threshold
First 5 Innings	Modified NB (SP-only, no bullpen)	μ from FTTO splits × batting vs SP hand × park; NO bullpen component	Isolates SP — use innings 1-5 specific rates only
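
The moneyline and totals rows can be illustrated with a plain Negative Binomial sketch using only the standard library. It parameterizes each team's runs by mean μ and dispersion k (variance μ + μ²/k), treats the two teams as independent, truncates at 40 runs, and splits regulation ties 50/50; independence, the truncation point, and the tie split are assumptions. For totals it convolves the two team distributions rather than fitting a single combined distribution, and it does not include the walk-off correction.

```python
import math


def nb_pmf(x, mu, k):
    """P(X = x) for a Negative Binomial with mean mu and dispersion k."""
    log_p = (math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
             + k * math.log(k / (k + mu)) + x * math.log(mu / (k + mu)))
    return math.exp(log_p)


def run_dist(mu, k, max_runs=40):
    """Probabilities of scoring 0..max_runs runs."""
    return [nb_pmf(x, mu, k) for x in range(max_runs + 1)]


def home_win_prob(home, away):
    """P(home runs > away runs), with tied outcomes split evenly (assumption)."""
    win = sum(ph * sum(away[:h]) for h, ph in enumerate(home))
    tie = sum(ph * pa for ph, pa in zip(home, away))
    return win + 0.5 * tie


def over_prob(home, away, line):
    """P(home runs + away runs > line) by direct convolution."""
    return sum(ph * pa
               for h, ph in enumerate(home)
               for a, pa in enumerate(away)
               if h + a > line)


# Illustrative parameters only; real mu and k come from the adjustments above.
home = run_dist(mu=4.8, k=6.0)
away = run_dist(mu=4.2, k=6.0)
print(home_win_prob(home, away), over_prob(home, away, 8.5))
```

As k grows large the distribution approaches Poisson, so the same code can be used to check how much the overdispersion assumption moves a price.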

Phase 4: Edge Scanners (4 scanners)

Common engine:

Ingest Pinnacle odds for all 4 markets
De-vig using Shin + Power methods
Build NB probability curves with all adjustments
Compare to Kalshi contract prices
Min edge: 4 cents after Kalshi 7% fee
Min sample: SP must have 5+ starts this season (early-season gate)
Output: {game_id, market_type, side, model_prob, kalshi_price, edge, confidence, sp_status, weather_flag}
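
The de-vig and edge steps of the common engine might be sketched as follows. Only the Power method is shown (the Shin method is omitted), solved by bisection for the exponent that makes the implied probabilities sum to 1. The ruling says only "7% fee"; the sketch assumes it is charged per contract as fee_rate × price × (1 − price), which should be checked against Kalshi's current fee schedule. Function names, decimal odds as the input format, and the sample numbers are all hypothetical.

```python
MIN_EDGE = 0.04   # 4 cents per $1 contract, from the ruling
FEE_RATE = 0.07   # how the 7% is applied is an assumption (see note above)


def implied_probs(decimal_odds):
    """Raw implied probabilities from decimal odds; these sum to more than 1."""
    return [1.0 / o for o in decimal_odds]


def devig_power(probs, iterations=200):
    """Power-method de-vig: find k with sum(p ** k) == 1 by bisection."""
    lo, hi = 1.0, 50.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if sum(p ** mid for p in probs) > 1.0:
            lo = mid
        else:
            hi = mid
    k = (lo + hi) / 2.0
    return [p ** k for p in probs]


def net_edge(model_prob, kalshi_price, fee_rate=FEE_RATE):
    """Model probability minus contract price minus the assumed per-contract fee."""
    fee = fee_rate * kalshi_price * (1.0 - kalshi_price)
    return model_prob - kalshi_price - fee


def passes_gate(model_prob, kalshi_price, sp_starts):
    """Apply the minimum-edge and early-season (5+ starts) gates."""
    return sp_starts >= 5 and net_edge(model_prob, kalshi_price) >= MIN_EDGE


fair = devig_power(implied_probs([1.87, 2.05]))
print(fair, net_edge(0.60, 0.52), passes_gate(0.60, 0.52, sp_starts=6))
```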

Per-market unique logic:

Scanner	Unique Logic
Moneyline	SP quality is primary driver, bullpen quality secondary, weather minimal impact
Run Line	Walk-off correction for home favorites, bullpen quality MORE important (late-game leverage)
Totals	Weather is PRIMARY driver (wind × park × temp × humidity), bullpen quality important, umpire zone
First 5	SP-only — FTTO splits, umpire zone, NO bullpen factor, weather less impactful (fewer innings)

Phase 5: Matchup Card Format

GAME: [Away] @ [Home] | [Date] [Time ET] | [Park]
WEATHER: [Temp]°F | Wind: [Speed]mph [Direction] | Humidity: [%] | WRF: [+/-runs]
ROOF: [Status] | PARK: Runs [factor] | HR-L [factor] | HR-R [factor]
UMPIRE: [Name] | K+[adj] | BB+[adj] | R+[adj] | ABS Overturn: [rate]

HOME SP: [Name] ([L/R]) | Status: [Confirmed/Probable/TBD]
  SPQC: [composite] | xFIP: [val] | FIP: [val] | ERA: [val] | WHIP: [val]
  K/9: [val] | BB/9: [val] | HR/9: [val] | GB%: [val]
  FTTO: wOBA [val] | K% [val] (innings 1-3 vs 4-5)
  Last 3 Starts: [date, opp, IP, ER, K, pitches] × 3
  Days Rest: [n] | Season IP: [total] | Trend: [up/stable/down]

AWAY SP: [Name] ([L/R]) | Status: [Confirmed/Probable/TBD]
  [Same fields]

HOME BATTING vs [Away SP Hand]:
  Team wOBA: [season] | vs [L/R]HP: [platoon] | Platoon Advantage: [+/- pts]
  Last 7 wOBA: [val] | Barrel Rate: [val] | K Rate: [val]
  Lineup wRC+: [total] | Delta from Projected: [+/-]
  Key Rest Day: [player name if delta > 15 wRC+]

AWAY BATTING vs [Home SP Hand]:
  [Same fields]

HOME BULLPEN: [GREEN/YELLOW/RED/BLACK]
  BAI: [score] | Closer: [Name]-[status] | Setup: [Names]-[status]
  Team Pitches Last 3 Days: [total] | High-Leverage Available: [Y/N]

AWAY BULLPEN: [Status]
  [Same fields]

SCHEDULING:
  Day Game After Night Game: [Home Y/N] [Away Y/N]
  Series Game: [1/2/3/4] | Travel: [arrived yesterday/same city/off day]

INTELLIGENCE:
  [Findings tagged CRITICAL/MODERATE/CONTEXT]

Phase 6: Dashboard

Daily slate: All games with SP status, weather flags, bullpen status, edge counts per market
Game drill-down: Full matchup card + all 4 market edges + research findings + lineup delta
SP tracker: All 30 teams' probable pitchers with status color coding and SPQC rankings
Bullpen board: Team-by-team availability tiers, pitch counts, trending fatigue
Weather map: Outdoor parks with wind/temp/precip impact visualization
Lineup monitor: Delta alerts when actual lineup deviates from projected
Edge alerts: Sorted by magnitude, filterable by market type, with staleness timestamps
P&L tracker: Performance by market type, by edge bucket, Brier scores
Data quality dashboard: Source freshness, pull failures, staleness alarms, schema drift alerts

OPEN QUESTIONS FOR BOSS RULING

Walk-off correction for Run Line: Opus identified that standard Poisson/NB overstates home -1.5 cover probability because the home team does not bat in the bottom of the 9th when leading. Should we implement a correction formula now, or build a full inning-by-inning Markov simulation later?
Early-season protocol: Recommended blending Steamer/ZiPS projections with actual stats in April-May (weighted 70/30 projections/actual with 3 starts, shifting to 30/70 by 10 starts). Confirm?
Individual BvP data: Council says drop it (sample size too small to be reliable). Use platoon splits at team level instead. Confirm?
Data history depth: How many seasons of SP game logs? 2 seasons? 3 seasons?
Umpire zone impact: Track ABS Challenge System data starting 2026 but treat as CONTEXT only through May, actionable June+. Confirm?
Opener/bullpen game detection: Should the system auto-detect when a team announces an "opener" (1-2 inning starter followed by bulk reliever) and treat it differently from a traditional start?
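
For the early-season protocol question, the proposed schedule (70/30 projections/actual at 3 starts, 30/70 by 10 starts) could be implemented as below. The ruling gives only the two endpoints, so the straight-line interpolation between them and the clamping outside the 3-to-10 range are assumptions, as are the function names.

```python
def projection_weight(starts, lo_starts=3, hi_starts=10, w_lo=0.70, w_hi=0.30):
    """Weight on the Steamer/ZiPS projection for a pitcher with `starts` starts.

    Linear between the two stated endpoints (assumption); held constant
    below lo_starts and above hi_starts (assumption).
    """
    if starts <= lo_starts:
        return w_lo
    if starts >= hi_starts:
        return w_hi
    frac = (starts - lo_starts) / (hi_starts - lo_starts)
    return w_lo + frac * (w_hi - w_lo)


def blended_stat(projection, actual, starts):
    """Blend a projected stat with the observed season stat."""
    w = projection_weight(starts)
    return w * projection + (1.0 - w) * actual


# Illustrative: projected 4.00 ERA, observed 3.00 ERA.
for n in (3, 5, 10):
    print(n, round(projection_weight(n), 3), round(blended_stat(4.0, 3.0, n), 3))
```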

COUNCIL METADATA

Detail	Value
Council date	2026-04-01
Advisory responses	5 (all completed)
Peer reviews	5 (all completed)
Strongest advisor	gpt-oss (3/5 votes)
Runner-up	Opus (2/5 votes)
Biggest blind spot	Gemini (2/5 votes)
Full council data	`/home/ubuntu/edgeclaw/data/councils/2026-04-01/mlb-data-audit/`

Source: ~/edgeclaw/results/panel-results/mlb-data-audit-ruling.md