Motorsports Data Audit — Council Ruling

Date: 2026-04-01
Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling)
Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b
Winner: Sonnet (2 of 5 peer review votes; Opus endorsement)
Status: PENDING BOSS RULING on open questions


COUNCIL SUMMARY

Where Advisors Agreed

  1. Qualifying data is #1 missing input — grid position is 60-70% predictive in F1
  2. DNF probability must be tracked per driver/constructor — rolling 20-race window, split mechanical vs crash
  3. Safety car probability per circuit — historical base rate × weather adjustment (Poisson model)
  4. Track type classification critical — F1: power/downforce/street; NASCAR: superspeedway/short/intermediate/road
  5. Practice long-run pace is primary intelligence for race-day prediction
  6. Tire degradation rates from practice — compound-specific, determines strategy trees
  7. Constructor/manufacturer tiers — F1 car performance rated A/B/C tier
  8. Grid penalty tracking — engine penalties, sporting penalties affect start position
  9. Series-specific models required — F1, NASCAR, IndyCar, MotoGP each fundamentally different
  10. Weather integration — rain creates regime changes in competitive order
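
As a minimal sketch of agreement item 2, the rolling 20-race DNF record can be kept in a fixed-length window and turned into separate mechanical and crash probabilities with a Beta posterior. The class name `DnfTracker`, the uniform Beta(1, 1) prior, and the choice to treat the two DNF types as independent Beta-Binomial models are assumptions for illustration, not something the council specified.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW = 20  # rolling window length named in the ruling


@dataclass
class DnfTracker:
    """Rolling DNF record for one driver or constructor (hypothetical helper)."""
    prior_alpha: float = 1.0  # assumed uniform Beta(1, 1) prior
    prior_beta: float = 1.0
    history: deque = field(default_factory=lambda: deque(maxlen=WINDOW))

    def record(self, status: str) -> None:
        # status follows the race_results convention:
        # "finished", "DNF_mech", "DNF_crash" or "DSQ"
        self.history.append(status)

    def posterior_mean(self, kind: str) -> float:
        """Posterior mean of the per-race probability of `kind`."""
        hits = sum(1 for s in self.history if s == kind)
        misses = len(self.history) - hits
        alpha = self.prior_alpha + hits
        beta = self.prior_beta + misses
        return alpha / (alpha + beta)


tracker = DnfTracker()
for status in ["finished"] * 17 + ["DNF_mech"] * 2 + ["DNF_crash"]:
    tracker.record(status)

p_mech = tracker.posterior_mean("DNF_mech")    # (1 + 2) / (2 + 20)
p_crash = tracker.posterior_mean("DNF_crash")  # (1 + 1) / (2 + 20)
```

Because the deque has `maxlen=20`, older races drop out automatically as new results are recorded. A Dirichlet model over all outcome types would be an alternative if the two DNF probabilities need to be estimated jointly.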

Where Advisors Disagreed

  1. VSC vs full SC impact: Sonnet correctly identified that bookmakers misprice VSC and full SC periods differently; the other advisors treated SC as monolithic. Council verdict: Model VSC, SC, and red flag separately.
  2. Simulation granularity: Some proposed position-at-end, others lap-by-lap. Council verdict: Lap-by-lap for F1 (strategy matters each lap), position-at-end for NASCAR (field too large).
  3. Data sources: gpt-oss proposed comprehensive dimensional model (dim/fact tables). Others used simpler schemas. Council verdict: Lean schema with series-specific tables where needed.

Strongest Arguments (from peer review)

Sonnet wins with the most analytically precise design:

Biggest Blind Spot

No validation framework. No advisor specified how to backtest race simulations against historical outcomes, verify SC probability calibration, or confirm DNF model accuracy. No Brier scores, no CLV tracking.
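
For reference, the two missing measures could be computed along these lines. This is a sketch only: the function names are hypothetical, and the CLV formula shown (bet price relative to closing price, in decimal odds) is one common convention that the council did not specify.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect forecast.
    """
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def closing_line_value(bet_decimal_odds, closing_decimal_odds):
    """Relative price advantage over the closing line (assumed convention).

    Positive means the bet was struck at a better price than the close.
    """
    return bet_decimal_odds / closing_decimal_odds - 1.0


# Example: two SC forecasts (0.8 and 0.3) where only the first race had an SC.
score = brier_score([0.8, 0.3], [1, 0])
# Example: bet placed at 2.20, market closed at 2.00.
clv = closing_line_value(2.20, 2.00)
```

Tracking the Brier score per model (SC, DNF, finish position) would address calibration, while CLV measures whether the scanners beat the market before results are known.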

What Everyone Missed (from peer reviews)

  1. Regulation changes create structural breaks — New regulations (ground effect 2022, 2026 engine regs) invalidate all historical data. Pipeline needs regulation-era flag and model reset.
  2. Reverse-grid sprint qualifying — Some series use reverse-grid formats that completely change grid prediction methodology.
  3. Penalty decisions are discretionary — Steward decisions (time penalties, grid drops) are subjective and unpredictable. Need to model penalty probability per incident type.
  4. Driver market/contract dynamics — Contract year drivers may take more risks. Team dynamics change when a driver is leaving.
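
One way the regulation-era flag from item 1 might look is sketched below. The boundary seasons come from the two F1 examples named above; the function names and the dict-based row shape are assumptions for illustration.

```python
from bisect import bisect_right

# Assumed F1 era boundaries, taken from the two examples in the ruling.
ERA_START_SEASONS = [2022, 2026]


def regulation_era(season: int) -> int:
    """Return an era index: 0 before 2022, 1 for 2022-2025, 2 from 2026."""
    return bisect_right(ERA_START_SEASONS, season)


def same_era_rows(rows, current_season):
    """Keep only historical rows from the same regulation era (model reset)."""
    era = regulation_era(current_season)
    return [r for r in rows if regulation_era(r["season"]) == era]


history = [{"season": 2021}, {"season": 2023}, {"season": 2025}]
usable = same_era_rows(history, current_season=2025)  # drops the 2021 row
```

A hard filter like this discards all pre-break data; down-weighting older eras instead of dropping them is a possible softer variant.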

BUILD PLAN

Phase 1: Core Data Tables

motorsport_drivers: driver_id, name, series, team_id, nationality, car_number, contract_end, rookie, active
motorsport_teams: team_id, name, series, manufacturer, performance_tier, reliability_rating, pit_crew_rating
motorsport_tracks: track_id, name, country, track_type, length_km, turns, surface, sc_prob_dry, sc_prob_wet, red_flag_prob, overtaking_difficulty
motorsport_qualifying: race_id, driver_id, session, position, time, gap_to_pole, grid_penalty
motorsport_practice: race_id, session, driver_id, best_time, long_run_pace, tire_compound, fuel_corrected_pace
motorsport_race_results: race_id, driver_id, grid_pos, finish_pos, status (finished/DNF_mech/DNF_crash/DSQ), laps, fastest_lap, pit_stops, tire_strategy
motorsport_reliability: driver_id, team_id, season, races, mech_dnfs, crash_dnfs, dnf_rate
motorsport_safety_car: race_id, type (VSC/SC/red_flag), lap_deployed, lap_ended, cause
motorsport_weather: race_id, session, temp, track_temp, wind, rain_prob, conditions
motorsport_championships: season, series, driver_id, points, position, title_prob
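
To make the lean schema concrete, here is a sketch of two of the tables in SQLite via Python's standard `sqlite3` module. The table and column names are from the list above; the column types, the `CHECK` constraints, and the `create_db` helper are assumptions, since the ruling does not fix a database engine or types.

```python
import sqlite3

# Column types and constraints below are assumed, not part of the ruling.
SCHEMA = """
CREATE TABLE motorsport_tracks (
    track_id              INTEGER PRIMARY KEY,
    name                  TEXT NOT NULL,
    country               TEXT,
    track_type            TEXT,
    length_km             REAL,
    turns                 INTEGER,
    surface               TEXT,
    sc_prob_dry           REAL CHECK (sc_prob_dry BETWEEN 0 AND 1),
    sc_prob_wet           REAL CHECK (sc_prob_wet BETWEEN 0 AND 1),
    red_flag_prob         REAL CHECK (red_flag_prob BETWEEN 0 AND 1),
    overtaking_difficulty REAL
);
CREATE TABLE motorsport_safety_car (
    race_id      INTEGER NOT NULL,
    type         TEXT NOT NULL CHECK (type IN ('VSC', 'SC', 'red_flag')),
    lap_deployed INTEGER NOT NULL,
    lap_ended    INTEGER,
    cause        TEXT
);
"""


def create_db(path=":memory:"):
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_db()
conn.execute(
    "INSERT INTO motorsport_safety_car VALUES (?, ?, ?, ?, ?)",
    (1, "VSC", 12, 14, "debris"),
)
# Count deployments by type, which keeps VSC, SC and red flag separate.
counts = conn.execute(
    "SELECT type, COUNT(*) FROM motorsport_safety_car GROUP BY type"
).fetchall()
```

The `CHECK` on `type` enforces the council verdict that VSC, SC, and red flag are modeled as distinct event types rather than a single safety car flag.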

Phase 2: Custom Metrics

Metric | Formula | Notes
Qualifying Gap Score | (Driver quali time - pole time) / pole time × 100 | Normalized qualifying deficit
Tire Degradation Rate | Linear regression on long-run laps: time = a + b×lap | Per-compound, per-driver from practice
SC Probability | Poisson(λ_circuit × weather_mult) per race | Historical λ + rain adjustment
DNF Probability | Beta(α, β) updated Bayesian from rolling 20 races | Split mechanical vs crash
Constructor Pace Delta | Team avg quali gap to pole, EWMA 5-race | Car performance baseline
Strategy Flexibility | Crossover lap between 1-stop and 2-stop from tire deg | Higher = more strategic options
Grid-to-Finish Conversion | Historical regression: finish ~ f(grid, track_type, SC_prob) | Per-series, per-track type
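
Four of these metrics are simple enough to sketch directly. The function names are hypothetical, and two interpretations are assumed: SC Probability is read as the chance of at least one deployment under a Poisson rate, and "EWMA 5-race" is read as a span of 5 with smoothing factor 2 / (span + 1).

```python
import math


def qualifying_gap_score(driver_time_s, pole_time_s):
    """Percentage deficit to pole: (driver - pole) / pole * 100."""
    return (driver_time_s - pole_time_s) / pole_time_s * 100.0


def tire_degradation_rate(lap_times_s):
    """Least-squares slope b in time = a + b * lap, laps numbered 1..n.

    The result is in seconds lost per lap on the given stint.
    """
    n = len(lap_times_s)
    if n < 2:
        raise ValueError("need at least two laps to fit a slope")
    laps = range(1, n + 1)
    mean_x = (n + 1) / 2
    mean_y = sum(lap_times_s) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(laps, lap_times_s))
    sxx = sum((x - mean_x) ** 2 for x in laps)
    return sxy / sxx


def sc_probability(lam_circuit, weather_mult=1.0):
    """P(at least one SC) when deployments ~ Poisson(lam_circuit * weather_mult)."""
    return 1.0 - math.exp(-lam_circuit * weather_mult)


def ewma(values, span=5):
    """Exponentially weighted moving average, oldest value first."""
    if not values:
        raise ValueError("values must be non-empty")
    alpha = 2.0 / (span + 1)  # assumed smoothing convention
    out = values[0]
    for v in values[1:]:
        out = alpha * v + (1 - alpha) * out
    return out


gap = qualifying_gap_score(90.9, 90.0)                  # 0.9 s off a 90 s pole lap
deg = tire_degradation_rate([90.0, 90.1, 90.2, 90.3])   # steady 0.1 s/lap fade
p_sc = sc_probability(lam_circuit=0.8, weather_mult=1.5)
pace_delta = ewma([0.9, 0.7, 0.8, 0.6, 0.5])            # last five quali gaps
```

Real long-run data would need fuel correction and outlier laps (traffic, in/out laps) removed before the regression; the table's per-compound, per-driver split means one fit per stint.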

Phase 3: 7 Edge Scanners

Scanner | Min Edge | Unique Logic
Outright | 5% (F1 20-way), 8% (NASCAR 40-way) | Grid × pace × DNF × SC; constructor tier primary for F1
Podium/Top 5/10 | 4% | MC finish distribution; DNF removes from contention
H2H | 3% | Teammate vs cross-team; team order probability; void risk for DNF
Qualifying/Pole | 4% | Practice pace → quali prediction; track-specific qualifying mode
Fastest Lap | 5% | Late pit strategy, tire freshness, SC timing
Championship | 4% | Season-long MC; regulation-era handling
Sprint Race | 4% | Shorter format, sprint-specific grid
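
The minimum-edge gate shared by all seven scanners could be sketched as follows. The thresholds are the ones in the table; everything else is assumed, in particular the definition of edge as model probability minus the raw implied probability from decimal odds (no vig removal), and the scanner keys used in the dictionary.

```python
# Thresholds from the scanner table; keys are hypothetical identifiers.
MIN_EDGE = {
    "outright_f1": 0.05,
    "outright_nascar": 0.08,
    "podium_top5_top10": 0.04,
    "h2h": 0.03,
    "qualifying_pole": 0.04,
    "fastest_lap": 0.05,
    "championship": 0.04,
    "sprint_race": 0.04,
}


def implied_probability(decimal_odds):
    """Raw implied probability of a decimal price (bookmaker margin included)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def edge(model_prob, decimal_odds):
    """Assumed edge definition: model probability minus implied probability."""
    return model_prob - implied_probability(decimal_odds)


def passes_scanner(scanner, model_prob, decimal_odds):
    """True when the edge meets the scanner's minimum threshold."""
    return edge(model_prob, decimal_odds) >= MIN_EDGE[scanner]


# H2H: model says 56%, price 2.00 implies 50%, so the 6-point edge clears 3%.
h2h_ok = passes_scanner("h2h", 0.56, 2.00)
# NASCAR outright: model 10%, price 20.0 implies 5%, short of the 8% bar.
nascar_ok = passes_scanner("outright_nascar", 0.10, 20.0)
```

If the intended definition of edge is expected value per unit staked (model_prob × decimal_odds - 1) instead, only the `edge` function would change; the threshold table stays the same.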

Phase 4: Dashboard


OPEN QUESTIONS FOR BOSS RULING

  1. Series priority: F1 + NASCAR first, or all 4 at launch?
  2. Practice data sourcing: Requires live timing scraping. Worth cost?
  3. VSC/SC/Red flag split modeling: Build separate impact models?
  4. Tire degradation model: Build from practice long-run data each race weekend?
  5. Regulation-era model resets: Auto-flag when new regulations invalidate historical data?
  6. Sprint race scanner: Separate scanner worth building?
  7. Historical depth: How many seasons to backfill per series?

COUNCIL METADATA

Detail | Value
Council date | 2026-04-01
Advisory responses | 5 (all completed)
Peer reviews | 5 (all completed)
Strongest advisor | Sonnet (2/5 votes; Opus endorsement)
Runner-up | Gemini, Grok (1 genuine cross-vote each)
Biggest blind spot | No validation/backtesting framework
Full council data | /home/ubuntu/edgeclaw/data/councils/2026-04-01/motorsports-data-audit/
Source: ~/edgeclaw/results/panel-results/motorsports-data-audit-ruling.md