WNBA Data Audit — Council Ruling

Date: 2026-04-01 Process: Full 5-phase council (Advisory → Anonymization → Peer Review → Chairman Synthesis → Boss Ruling) Advisors: Opus, Sonnet, Gemini 3.1 Pro, Grok 4.20 Reasoning, gpt-oss-120b Winner: Gemini (2 of 5 genuine votes — from Grok and gpt-oss) Status: PENDING BOSS RULING on open questions

COUNCIL SUMMARY

Where Advisors Agreed

Starting from zero — no WNBA data in DB at all, everything must be built
40-game season = small samples — Bayesian shrinkage mandatory, priors from prior seasons
Player availability is dominant signal — smaller roster = higher individual impact per absence
Four Factors model adapted for WNBA — eFG%, TOV%, ORB%, FT rate, with WNBA-specific weights
Home court advantage is real but smaller — ~3-4 points (vs NBA's 3.5-4.5)
Overseas performance data is intelligence source — offseason leagues inform preseason projections
Commissioner's Cup changes team motivation — need to track as separate competition flag
Her Hoop Stats is gold standard WNBA source — better than raw Basketball Reference for analytics
WNBA Stats API mirrors NBA structure — similar endpoints, transferable scraping code
Expansion teams need prior-based estimation — wide confidence intervals, roster-component approach

Where Advisors Disagreed

Score distribution model: Some proposed normal, others Skellam (discrete), one proposed log-normal. Council verdict: Skellam for spread/totals (WNBA's lower possession count makes discrete distribution more appropriate).
Overseas adjustment factors: Opus provided specific conversion rates (EuroLeague 0.85x, Turkish 0.75x, WNBL 0.70x). Others left vague. Council verdict: Use Opus's tiered conversion factors as starting point, refine with data.
Per-minute normalization: Gemini proposed Per-40 (WNBA standard), others used Per-36 or raw. Council verdict: Per-40 minutes for all WNBA stats.
Charter flight impact: Gemini identified 2024 charter transition as structural break invalidating pre-2024 rest/travel data. Council verdict: Flag 2024+ as new era for travel models.

Strongest Arguments (from peer review)

Gemini wins with the most structurally aware analysis:

Caught charter flight transition as paradigm shift that invalidates historical HCA models
Per-40 normalization (not Per-36 or Per-48)
Modified Skellam distribution for lower-possession WNBA scoring
Her Hoop Stats identified as gold-standard data source
Practical architecture without over-engineering

Opus strong runner-up (endorsed by Sonnet):

Caught 2025 three-point line distance change as structural break (all pre-2025 3PT data non-comparable)
Pinnacle coverage gaps identified (not every game covered = no SCL anchor)
4-5 cent minimum edge threshold for thin WNBA lines (2 cents = noise)
Specific overseas adjustment factors
"You have nothing" intellectual honesty

Biggest Blind Spot

Player prop architecture completely absent — All advisors focused on game-level markets (ML, spread, totals) while ignoring player props, which are the most inefficient WNBA market. With 12-woman rosters, predictable 7-8 player rotations, and extreme offensive concentration in stars, player prop pricing is easier to beat than game-level markets. Need Minutes Projection Engine accounting for foul trouble (referee crew correlated) and blowout risk.

What Everyone Missed (from peer reviews)

Hard Cap + Emergency Hardship Exception — WNBA's hard cap means teams can't add players without dropping below 10 healthy. Teams routinely play with 9-10 players, forcing starters to 38-40 minutes. This creates predictable late-game fatigue affecting 4th quarter pace, defense, and 2nd-half totals. No advisor modeled this roster constraint.
Live in-game betting — WNBA in-game lines are the softest in North American sports. No advisor discussed real-time play-by-play modeling, foul trouble tracking, or live lineup detection.
Arena-sharing conflicts — Teams sharing venues with NBA teams get suboptimal game times, court setups, or alternative venues during NBA playoff overlap. Affects practice access, shootarounds, and attendance.
2025 three-point line structural break — All pre-2025 three-point data is non-comparable (line moved to NBA distance).
Charter flight 2024 structural break — Pre-2024 travel/rest data overvalues HCA.

BUILD PLAN

Phase 1: Core Data Tables

wnba_teams: team_id, name, abbreviation, conference, arena, shares_arena_with, charter_flight, expansion_year, coach_id, cap_space, active wnba_players: player_id, name, team_id, position, height, age, experience, salary, overseas_team, national_team, star_tier (1-3), active wnba_games: game_id, season, date, time, home_team, away_team, home_score, away_score, attendance, commissioner_cup, era (pre/post_charter, pre/post_3pt_change) wnba_player_game_stats: stat_id, game_id, player_id, minutes, pts, reb, ast, stl, blk, tov, fg_made, fg_att, 3p_made, 3p_att, ft_made, ft_att, plus_minus, usage_rate, per_40_pts, per_40_reb, per_40_ast wnba_team_game_stats: stat_id, game_id, team_id, pace, off_rtg, def_rtg, net_rtg, efg_pct, tov_pct, orb_pct, ft_rate, active_roster_count wnba_injuries: injury_id, player_id, status, reason, first_reported, last_updated, games_missed, hardship_eligible wnba_schedule: game_id, rest_days_home, rest_days_away, travel_distance, timezone_change, arena_conflict_flag wnba_rosters: roster_id, team_id, player_id, joined_date, left_date, transaction_type, hardship_exception wnba_referees: ref_id, game_id, referee_name, home_cover_pct, over_pct, avg_fouls_called wnba_overseas: overseas_id, player_id, league, team, season, games, stats_json, wnba_equiv_factor wnba_draft: draft_id, year, round, pick, player_name, college, mock_consensus_pick, workout_reports wnba_awards: award_id, season, award_type, player_id, votes_or_shares, rank wnba_odds: odds_id, game_id, market_type, book, selection, odds, timestamp, pinnacle_covered

Phase 2: Custom Metrics

Metric	Formula	Notes
Star Impact (On/Off)	Team net rating WITH - WITHOUT player	Per-40 minute basis
Bayesian Team Rating	Shrunk four-factors with prior-season data	WNBA-specific weights
Overseas Import Score	Stats × league_conversion_factor (0.65-0.85)	Preseason projections
Roster Depth Index	Sum of player impact ratings for players 6-12	Hardship vulnerability
HCA (Charter Era)	2024+ only data, venue-specific crowd factor	Ignore pre-2024 travel
3PT Era Adjustment	Pre/post 2025 three-point line structural break	Separate models
Minutes Fatigue Model	f(active_roster_count, minutes_played, game_time)	Hard cap effect
Sample Confidence	Games / stability_threshold (15-20)	Scale edge thresholds
Pinnacle Coverage Flag	Binary: is Pinnacle covering this game?	Falls back to synthetic line

Phase 3: 7 Edge Scanners

Scanner	Min Edge	Unique Logic
Moneyline	4%	Four-factors × availability × HCA (charter era); 5 cent min on Pinnacle
Spread	5%	Star absence margin shift; Skellam distribution
Totals	5%	Pace matchup × rest × roster count (hardship); Skellam
Series	6%	MC simulation; best-of-3 or best-of-5; home court pattern
MVP	8%	Stats + narrative + voting history; Dirichlet-multinomial
ROY	10%	Draft position × usage opportunity × team context
Draft #1	12%	Mock consensus × GM signals × workout reports

Phase 4: Dashboard

Game board: upcoming games with availability, rest, ref crew, edge flags
Player drill-down: per-40 stats, on/off splits, overseas form, injury timeline
Team ratings: four-factors, pace, net rating, roster depth, hardship risk
MVP/ROY tracker: running projections + narrative score
Referee tendencies: home cover %, over %, foul rates
Commissioner's Cup standings and motivation flags
Expansion team calibration progress
P&L by market type and edge bucket

OPEN QUESTIONS FOR BOSS RULING

Her Hoop Stats subscription: Required for best WNBA analytics. Cost?
Player props: Build prop architecture or focus on game-level markets first?
Live in-game betting: Build real-time model for WNBA in-game lines?
Historical depth: How many seasons to backfill? (Note: pre-2024 charter and pre-2025 3PT breaks)
Overseas league coverage: Which leagues to track (EuroLeague Women, Turkish, WNBL, Chinese)?
Pinnacle gap strategy: When Pinnacle doesn't cover a game, use synthetic line or skip?
Commissioner's Cup: Model as separate competition or flag only?

COUNCIL METADATA

Detail	Value
Council date	2026-04-01
Advisory responses	5 (all completed)
Peer reviews	5 (all completed)
Strongest advisor	Gemini (2/5 genuine votes — from Grok and gpt-oss)
Runner-up	Opus (1/5 genuine from Sonnet — deepest domain specifics)
Biggest blind spot	Player prop architecture absent
Full council data	`/home/ubuntu/edgeclaw/data/councils/2026-04-01/wnba-data-audit/`

Source: ~/edgeclaw/results/panel-results/wnba-data-audit-ruling.md