Purpose: This document is a universal template for building a new player props desk in EdgeClaw. An AI reading this should know exactly what to build, how to wire it up, and what questions to ask before starting.
Reference implementations: MLB Player Props Desk (docs/mlb-player-props-desk-construction-spec.md), MLB Desk (docs/mlb-desk-construction-spec.md)
Every player props desk follows the same 4-table pipeline:
Table 1: Kalshi Prop Data (soft book — what we trade against)
↓
Table 2: Anchor Book Prop Data (sharp book — our fair value reference)
↓
Table 3: Prop Probability Curves (book curve + model curve vs Kalshi price)
↓
Table 4: Prop Edge Scanner Output (mispricings found)
The rule: Each table is sport-specific. No mixing data across sports. No mixing props data with game-level data. Each props desk gets its own isolated databases, separate from the game desk for the same sport.
What: Raw player prop contract prices pulled from the Kalshi API.
Database: kalshi-{sport}-props.db
Table name: kalshi_{sport}_props
Standard columns:
| Column | Type | Description |
|---|---|---|
| ticker | TEXT | Full Kalshi ticker (e.g., KXMLBHR-26APR07-A.BREGMAN-O1) |
| player_name | TEXT | Parsed player name (standardized) |
| prop_type | TEXT | What stat: pitcher_strikeouts, batter_hits, batter_home_runs, etc. |
| threshold | REAL | The line (e.g., 6.5 strikeouts, 1.5 hits) |
| yes_bid | INTEGER | Cents (0-100) |
| yes_ask | INTEGER | Cents (0-100) |
| yes_exec | REAL | Midpoint of yes bid/ask |
| no_bid | INTEGER | Cents (0-100) |
| no_ask | INTEGER | Cents (0-100) |
| no_exec | REAL | Midpoint of no bid/ask |
| spread | INTEGER | Ask minus bid |
| volume | INTEGER | Contracts traded |
| scan_type | TEXT | Collection window: 6am, 8am, 10am, 2pm, 6pm, close |
| snapshot_type | TEXT | scheduled / event_triggered / closing |
| captured_at | TEXT | ISO timestamp |
Data quality rules:
Player name parsing: Kalshi tickers encode player names in abbreviated format (e.g., "A.BREGMAN"). You need a parser to extract this and a crosswalk table to map it to the full name used by other sources. This is critical — without it, you can't match Kalshi props to anchor book props.
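A minimal sketch of the parsing step, assuming the hyphen-delimited ticker format shown in the example above; the crosswalk contents here are illustrative, the real table is a DB lookup:

```typescript
// Sketch of a Kalshi prop ticker parser. Assumes the format from the
// example above (e.g., KXMLBHR-26APR07-A.BREGMAN-O1); real tickers may
// need more cases.
interface ParsedPropTicker {
  series: string;       // e.g., KXMLBHR
  date: string;         // e.g., 26APR07
  playerAbbrev: string; // e.g., A.BREGMAN
  suffix: string;       // e.g., O1 (side/threshold segment)
}

function parsePropTicker(ticker: string): ParsedPropTicker | null {
  const parts = ticker.split("-");
  if (parts.length < 4) return null;
  return {
    series: parts[0],
    date: parts[1],
    // Everything between the date and the final segment is the player.
    playerAbbrev: parts.slice(2, -1).join("-"),
    suffix: parts[parts.length - 1],
  };
}

// Crosswalk lookup: abbreviated ticker name -> full name used by other
// sources. Entries here are illustrative only.
const crosswalk: Record<string, string> = {
  "A.BREGMAN": "Alex Bregman",
};

function fullName(abbrev: string): string | undefined {
  return crosswalk[abbrev.toUpperCase()];
}
```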
What: Sharp book prop lines that serve as our fair value reference.
For player props desks, the anchor is typically FanDuel (the sharpest prop book). SBR multi-book lines serve as an alternative reference for cross-validation.
Database: fd-{sport}-props.db (FanDuel)
Table names: fd_{sport}_prop_lines
Standard columns:
| Column | Type | Description |
|---|---|---|
| player_name | TEXT | Full player name as listed by the book |
| market | TEXT | Base prop type: pitcher_strikeouts, batter_hits, batter_home_runs, etc. |
| threshold | REAL | The rung: 1 (1+), 2 (2+), 3 (3+), etc. |
| side | TEXT | Yes (always for FD direct data) |
| price | INTEGER | American odds |
| implied_prob | REAL | Implied probability for Yes (0-1) |
| no_implied_prob | REAL | Implied probability for No = 1 - implied_prob |
| line | REAL | Raw handicap from API (0 for batter props, not used) |
| event_id | TEXT | FD event ID |
| home_team | TEXT | |
| away_team | TEXT | |
| game_date | TEXT | YYYY-MM-DD |
| scan_type | TEXT | Collection window |
| captured_at | TEXT | ISO timestamp |
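The implied_prob and no_implied_prob columns can be derived from the stored American odds with the standard conversion; a minimal sketch:

```typescript
// Convert American odds to implied probability (still includes the
// book's vig; de-vigging happens in Table 3).
function americanToProb(odds: number): number {
  return odds < 0
    ? -odds / (-odds + 100) // favorite: -150 -> 150/250 = 0.60
    : 100 / (odds + 100);   // underdog: +150 -> 100/250 = 0.40
}

// Per the schema above: no_implied_prob = 1 - implied_prob.
function noImpliedProb(yesImplied: number): number {
  return 1 - yesImplied;
}
```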
Key insight from MLB build: FanDuel's direct API gives you a full prop ladder as separate Yes/No markets at each threshold. "To Record A Hit" (1+), "To Record 2+ Hits" (2+), etc. The FD price IS the implied probability at each threshold — no distribution math needed for the book side. Store with market = batter_hits and threshold = 1, 2, 3, 4 — NOT as separate market types per threshold.
FD Direct API pattern: sbapi.{state}.sportsbook.fanduel.com/api/event-page?_ak={key}&eventId={id}&tab=batter-props (and tab=pitcher-props) per event. No auth needed. Returns markets with runners; each runner has American odds.
Cross-validation rule: If FanDuel and SBR multi-book consensus both offer the same prop and their de-vigged probabilities agree within 2%, that edge gets tagged HIGH confidence. If they disagree by more than 10%, tag it LOW confidence.
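The de-vig and agreement check can be sketched as follows; treating the 2-10% band as MEDIUM is an assumption consistent with the HIGH/MEDIUM/LOW tiers used in Table 4:

```typescript
// Two-way de-vig by normalization: a book's raw Over/Under implied
// probabilities sum to more than 1 (the vig); dividing by the sum
// recovers a fair probability pair.
function devig(rawOver: number, rawUnder: number): [number, number] {
  const total = rawOver + rawUnder;
  return [rawOver / total, rawUnder / total];
}

// Cross-validation rule from above: within 2% -> HIGH, more than 10%
// apart -> LOW. The MEDIUM middle band is an assumption.
function confidenceTier(fdProb: number, sbrProb: number): "HIGH" | "MEDIUM" | "LOW" {
  const gap = Math.abs(fdProb - sbrProb);
  if (gap <= 0.02) return "HIGH";
  if (gap > 0.10) return "LOW";
  return "MEDIUM";
}
```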
What: For each player, for each prop type, for each game, build TWO probability curves and compare both against every Kalshi threshold:
Book curve — Derived from FanDuel's de-vigged prop ladder. FD gives you the probability at each threshold directly. De-vig their over/under prices to get true probability.
Model curve — Derived from the desk's own player model. This uses player baselines (EWMA of recent performance), matchup adjustments (opponent quality, venue factors), and a statistical distribution to generate an independent probability at each threshold.
Both curves are compared against Kalshi's price at each threshold. When either curve says a Kalshi contract is mispriced by more than the fee (typically 7%), that's an edge.
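That comparison reduces to a simple per-threshold check; a sketch, assuming the 7% fee stated above and Kalshi exec prices stored in cents (Table 1):

```typescript
const KALSHI_FEE = 0.07; // typical fee, per the rule above

// A Kalshi YES contract is an edge when a curve's fair probability
// exceeds the exec price (cents -> probability) by more than the fee.
function isYesEdge(curveProb: number, yesExecCents: number): boolean {
  return curveProb - yesExecCents / 100 > KALSHI_FEE;
}
```

Either curve (book or model) can be passed as curveProb; the NO side is the mirror-image check against the NO exec price.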
Database: {sport}-prop-edges.db
Table name: {sport}_prop_probability_curves
Standard columns:
| Column | Type | Description |
|---|---|---|
| scan_type | TEXT | Collection window |
| game_date | TEXT | YYYY-MM-DD |
| game_time | TEXT | HH:MM ET |
| player_name | TEXT | Standardized player name |
| player_id | TEXT | Crosswalk ID |
| prop_type | TEXT | pitcher_strikeouts, batter_hits, etc. |
| threshold | REAL | The alt-line value (e.g., 5.5, 6.5, 7.5 for strikeouts) |
| fd_anchor | REAL | FanDuel's posted line for this prop |
| fd_yes | REAL | FD de-vigged probability of Over at this threshold |
| fd_no | REAL | FD de-vigged probability of Under at this threshold |
| model_yes | REAL | Model probability of Over at this threshold |
| model_no | REAL | Model probability of Under at this threshold |
| kalshi_yes | REAL | Kalshi exec price for YES |
| kalshi_no | REAL | Kalshi exec price for NO |
| rung | INTEGER | 0 = main line, positive = further from 50/50 |
| is_main_line | INTEGER | 1 if this threshold is the book's headline line (closest to 50/50), 0 otherwise |
| actual_stat | REAL | What actually happened (filled by settlement after game ends) |
| outcome | TEXT | "over" or "under" relative to this row's threshold |
| fd_was_right | INTEGER | 1 if FD's implied probability favored the correct side, 0 if not |
| model_was_right | INTEGER | 1 if the model favored the correct side, 0 if not |
| fd_error | REAL | How far off FD was (positive = underconfident on correct side, negative = wrong side) |
| model_error | REAL | Same for model |
| settled_at | TEXT | Timestamp when result was recorded |
| captured_at | TEXT | ISO timestamp |
The model curve needs a statistical distribution to convert a player baseline into probabilities at each threshold. The right distribution depends on the stat type:
General rules:
| Stat characteristic | Distribution | Why |
|---|---|---|
| Counting stats, low mean (0-2 range) | Poisson or Zero-Inflated Poisson | Many zeros, rare events (home runs, stolen bases) |
| Counting stats, medium mean (2-8 range) | Negative Binomial | Overdispersed counts — variance > mean (strikeouts, hits) |
| Sum of multiple stats | Normal approximation | Central limit theorem (H+R+RBI, PRA) |
| High-count stats (20+ range) | Normal | Large enough for Normal to work (points, fantasy score) |
MLB examples (for reference):
| Prop type | Distribution | Parameters |
|---|---|---|
| Pitcher strikeouts | Negative Binomial | mean from EWMA, dispersion from game log variance |
| Batter hits | Negative Binomial | mean from EWMA |
| Batter home runs | Zero-Inflated Poisson | lambda from per-PA rate × projected PA |
| Batter total bases | Negative Binomial | mean from EWMA |
| H+R+RBI combo | Normal | mean and sigma from component sums |
| Stolen bases | Zero-Inflated Negative Binomial | Very rare event, many zeros |
| Runs scored | Poisson | Low-count event |
| Pitcher outs | Normal | High enough count for Normal |
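As one concrete case, the Negative Binomial tail used for strikeouts and hits can be computed from a mean (the EWMA baseline) and a dispersion parameter without any special-function library, using the pmf recurrence; a sketch:

```typescript
// P(X >= threshold) for a Negative Binomial with the given mean and
// dispersion r (variance = mean + mean^2 / r). Uses the recurrence
// pmf(k+1) = pmf(k) * (k + r) / (k + 1) * (1 - p), with p = r / (r + mean).
function negBinomTail(mean: number, dispersion: number, threshold: number): number {
  const p = dispersion / (dispersion + mean);
  let pmf = Math.pow(p, dispersion); // P(X = 0)
  let cdf = 0;
  for (let k = 0; k < threshold; k++) {
    cdf += pmf;
    pmf *= ((k + dispersion) / (k + 1)) * (1 - p);
  }
  return 1 - cdf; // P(X >= threshold)
}
```

For example, a pitcher with an EWMA mean of 6 strikeouts clears a 6.5 line with probability negBinomTail(6, r, 7), where r comes from game log variance.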
The model curve is built from player-specific baselines adjusted for matchup context. Standard inputs:
Player baselines:
blended = (career × k + season × n) / (k + n) where k varies by stat (HR needs ~170 PA to stabilize, K needs ~60 PA, Hits needs ~800 PA)

Matchup adjustments (sport-specific):
The adjustment formula:
adjusted_rate = blended_rate × opponent_factor × venue_factor × other_factors
Cap adjustments to reasonable range (e.g., 0.75 to 1.30) to prevent extreme outputs.
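A sketch of the blending and adjustment steps above (the rates and factor values in the comments are illustrative):

```typescript
// Blend of career and season rates, weighted by the stat's
// stabilization constant k and the season sample size n.
function blendedRate(careerRate: number, seasonRate: number, k: number, n: number): number {
  return (careerRate * k + seasonRate * n) / (k + n);
}

// Apply matchup multipliers (opponent, venue, etc.), clamping the
// combined factor to the 0.75-1.30 band suggested above so extreme
// inputs cannot produce extreme outputs.
function adjustedRate(blended: number, factors: number[], lo = 0.75, hi = 1.3): number {
  const combined = factors.reduce((acc, f) => acc * f, 1);
  return blended * Math.min(hi, Math.max(lo, combined));
}
```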
What: The final output — player prop mispricings found by comparing Kalshi prices against both the book curve and model curve.
Database: {sport}-prop-edges.db (same DB as curves)
Table name: player_prop_edges
Standard columns:
| Column | Type | Description |
|---|---|---|
| player_name | TEXT | |
| prop_type | TEXT | pitcher_strikeouts, batter_hits, etc. |
| side | TEXT | over / under |
| line | REAL | Kalshi threshold |
| anchor_line | REAL | FD's posted line |
| fd_prob | REAL | FD de-vigged fair probability |
| sbr_prob | REAL | SBR multi-book consensus probability (cross-validation) |
| model_prob | REAL | Model probability |
| kalshi_price | REAL | What Kalshi is offering |
| execution_price | REAL | What you'd actually pay |
| raw_edge_book | REAL | fd_prob - kalshi_price |
| raw_edge_model | REAL | model_prob - kalshi_price |
| net_edge | REAL | Edge after fee (7% Kalshi fee) |
| confidence_tier | TEXT | HIGH / MEDIUM / LOW |
| executable | BOOLEAN | Spread tight enough to trade? |
| distribution_type | TEXT | Which distribution was used |
| player_sigma | REAL | Player-specific variance parameter |
| scan_type | TEXT | Collection window |
| detected_at | TEXT | ISO timestamp |
| actual_outcome | TEXT | win / loss / push (filled after settlement) |
| settled_at | TEXT | |
| closing_price | REAL | For CLV tracking |
| clv | REAL | Closing line value |
Confidence tier rules: per the cross-validation rule in Table 2 — FD and SBR de-vigged probabilities within 2% → HIGH; more than 10% apart → LOW; otherwise MEDIUM.
In addition to edges, track sharp line movement:
Table: {sport}_prop_steam
Signals to detect:
Everything runs on Eastern Time (ET).
| Minute | What fires | Cron pattern |
|---|---|---|
| :00 | FanDuel prop lines pull | 0 6,8,10,14,18 * * * |
| :00 | Kalshi prop contracts pull | 0 6,8,10,14,18 * * * |
| :00 | Game-level anchor (Pinnacle) for matchup context | 0 6,8,10,14,18 * * * |
Game-day windows: 6 AM, 8 AM, 10 AM, 2 PM, 6 PM ET
Closing snapshot: 1 minute before each game's start time (staggered per game)
Three staggered groups:
| Group | Time | What runs |
|---|---|---|
| Group 1 — Raw Data | 9:00 AM | Stats API pulls: player game logs, season stats, external scrapes |
| Group 2 — Baselines | 9:05 AM | EWMA baselines, career stats, blended rates, player crosswalk |
| Group 3 — Derived | 9:10 AM | Matchup context, player variance, adjusted rates, correlations |
| Minute | What fires | Cron pattern |
|---|---|---|
| :03 | Steam detection (compare consecutive snapshots) | 3 6,8,10,14,18 * * * |
| :10 | Prop probability curves rebuild | 10 6,8,10,14,18 * * * |
| :12 | Prop edge scanner | 12 6,8,10,14,18 * * * |
Critical: Same rule as game desks — edge scanner runs AFTER curves, curves run AFTER data collection. The timing chain is: data (:00) → steam (:03) → curves (:10) → edges (:12).
If the sport has season-long player props or award futures:
| Phase | Frequency | When to transition |
|---|---|---|
| Phase 1 | Weekly (Monday) | Start of season → ~2 months before end |
| Phase 2 | Every 3 days | ~2 months before end → ~1 month before end |
| Phase 3 | Daily | ~1 month before end → season end |
| Expired | Stop scanning | After season ends |
Database: kalshi-{sport}-prop-futures.db (season props), kalshi-{sport}-awards.db (awards)
{
name: '{Sport} Player Props',
slug: '{sport}-player-props',
kalshi: [{
category: 'Sports / {Category} / {Sport} Props',
hasScraper: true,
scraperName: 'collector.ts (Kalshi sports cron)',
freshnessKey: 'kalshi-{sport}-props',
series: [
{ ticker: 'KXSPORTPROPTYPE', label: 'Human Label', dataViewKey: 'kalshi-{sport}-prop-type' },
// one per Kalshi prop series
],
}],
sources: [
// Sources organized by group — see Standard Groups below
],
}
Every player props desk should have these groups on the dashboard:
Same pattern as game desks. Every freshnessKey needs a mapping:
'freshnessKey': {
db: 'database-name',
tables: ['table_name'],
filter?: { column: 'col', value: 'val' },
}
Naming convention for freshnessKeys:
- kalshi-{sport}-props — all Kalshi props for this sport
- kalshi-{sport}-prop-{type} — filtered by prop type (hr, hits, ks, etc.)
- fd-{sport}-props — FanDuel prop lines
- prop-edge-{sport}-{type} — edges per prop type
- {sport}-prop-curves-{type} — probability curves per prop type
- {sport}-prop-model — player baselines and model data
- {sport}-prop-matchup-context — matchup adjustments
- {sport}-batter-baselines / {sport}-player-baselines — player stat baselines
- {sport}-blended-rates — Bayesian shrinkage output
- {sport}-adjusted-rates — matchup-adjusted rates

Same pattern as game desks. Register each cron job, update freshness after each run.
| Source type | Yellow (stale) | Red (alert) |
|---|---|---|
| FD prop lines (6/8/10/2/6) | 30 min after window | 90 min after window |
| Kalshi props (6/8/10/2/6) | 30 min after window | 90 min after window |
| Season props/awards (adaptive) | 2 days after expected | 7 days after expected |
| Prop edge scanner | 30 min after window | 90 min after window |
| Player baselines (9 AM) | 60 min after expected | 180 min after expected |
| Matchup context (9 AM) | 60 min after expected | 180 min after expected |
| Steam detection | 60 min after last signal | 180 min after last signal |
Player props desks READ from the parent game desk's databases but don't write to them. This gives props access to game-level context without duplicating scrapers.
What the props desk reads from the game desk:
The props desk does NOT duplicate these scrapers. It just reads the tables that the game desk already populates.
| # | Item | What to check |
|---|---|---|
| 1 | Databases isolated | Props in own .db files, separate from game desk |
| 2 | Scraper queue independent | Props scrapers don't depend on game desk timing |
| 3 | Recovery queue | Props scrapers in recovery queue |
| 4 | Freshness tracking | Every props source has freshnessKey on dashboard |
| 5 | Dashboard views | Every freshnessKey resolves to working view |
| 6 | Edge scanner | Props scanner separate from game scanner |
| 7 | Scan windows | Every row tagged with scan_type |
| 8 | Data cleanliness | No live, no settled, same-day for game-day props |
| 9 | Column formatting | Human-readable on dashboard |
| 10 | Filters | Column filters on all views |
| 11 | No cross-desk dependencies | Props desk works even if game desk scraper fails |
| 12 | Player name crosswalk | Names standardized across all sources |
| 13 | Alerts | Freshness alerts for all props sources |
| 14 | Season props isolated | Separate DB from game-day props |
| 15 | Awards isolated | Separate DB from game-day props |
| 16 | Probability curves | Both book and model curves, per prop type |
| 17 | Matchup context | Daily refresh wired to Group 3 cron |
Hard-won lessons from building the MLB Player Props desk. Read these before starting any new props desk.
Player name mismatch is the #1 problem. Kalshi uses abbreviated tickers ("A. BREGMAN"), FanDuel uses full names ("Alex Bregman"), and stats APIs use yet another format. Without a crosswalk table that maps names across ALL sources, you can't match Kalshi props to FD props, and your curves table will have nulls everywhere. Build the crosswalk FIRST. Handle edge cases: Jr./Sr. suffixes, accented characters, misspellings (Kalshi had "SUREZ" for Suarez), nicknames.
Wrong scanner wired to cron. The MLB prop cron was calling the generic scanner (scanPropEdges('mlb') which read from an empty player_props table) instead of the dedicated MLB scanner (scanMlbPropEdges() which read from mlb_prop_lines). The edges table was empty for days. Always test the cron manually after wiring.
Filter too aggressive on minimum price. Initial 5-cent minimum filter was dropping valid alt lines. Lowered to 1 cent. Be conservative with data-side filters — you can always filter tighter on the dashboard.
Kalshi API pagination. Some sports have 5+ pages of prop contracts. The initial pull only fetched page 1 and missed most data. Always paginate fully. Add 500ms+ delays between pages to avoid rate limits.
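The pagination pattern can be isolated from any specific endpoint; a sketch assuming cursor-style paging, where fetchPage is a stand-in for whatever wraps the real Kalshi API call:

```typescript
// Generic cursor pagination with a delay between pages. fetchPage
// returns one page of items plus a cursor for the next page
// (undefined when there are no more pages).
type Page<T> = { items: T[]; cursor?: string };

async function fetchAllPages<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>,
  delayMs = 500, // per the lesson above: 500ms+ between pages
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined;
  do {
    const page = await fetchPage(cursor);
    all.push(...page.items);
    cursor = page.cursor;
    if (cursor) await new Promise((resolve) => setTimeout(resolve, delayMs));
  } while (cursor);
  return all;
}
```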
FD price IS the probability (for book curve). FanDuel gives you a full ladder of over/under lines at different thresholds. You de-vig each line and that's your book probability. No distribution math needed for the book side. The distribution math is only for the MODEL curve.
FanDuel direct API vs Odds API — use the direct API. The Odds API only returned 3 prop types for MLB FanDuel (stolen bases, strikeouts, pitcher outs). FanDuel's own public API (sbapi.il.sportsbook.fanduel.com/api/event-page) returns 25 prop types including all hits, HR, TB, RBI, runs, doubles, triples, singles, combos, and alt strikeout lines. No auth needed — just the public _ak key. The Odds API is a backup, not the primary source. Build a direct scraper first.
FD stores one market name per prop type with a threshold column. FanDuel's data uses a single market value per prop type (e.g. batter_hits, pitcher_strikeouts, batter_total_bases) with a separate threshold column (1, 2, 3, 4, etc.) for each rung. Do NOT create separate market names per threshold like batter_hits_1plus, batter_hits_2plus — that creates 25 market types instead of 11 and makes filtering/grouping impossible. When building probability curves, query by market and group by threshold. The line column is always 0 for props — the real value is in threshold.
Filter out combined player props. FanDuel offers combined pitcher props like "Zac Gallen & Freddy Peralta" for total strikeouts between both starters. Skip any player name containing " & " — there's no model rate for pairs, Kalshi doesn't offer combined markets, and they produce meaningless curve rows.
Store both Yes and No implied probabilities. MLB props on Kalshi have both Yes and No contracts. Store implied_prob (Yes) and no_implied_prob (1 - Yes) for FD data. For Kalshi, store yes_bid, yes_ask, no_bid (= 100 - yes_ask), no_ask (= 100 - yes_bid). The edge scanner needs both sides to find the best trade.
FD runner name parsing is tricky. For batter props, the runner name IS the player name ("Alex Bregman"). For pitcher K alt lines, the runner name includes the threshold ("Gavin Williams 3+ Strikeouts") — strip the "3+ Strikeouts" part. For pitcher K O/U, the runner name includes Over/Under ("Mike Burrows Over") — if you don't catch this, "Mike Burrows Over" becomes a player name. Handle all three cases in the parser.
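A sketch of a parser handling the three runner-name shapes described; the regexes are inferred from the examples above:

```typescript
// Extract a player name from a FanDuel runner name. Three cases:
// batter props ("Alex Bregman"), pitcher K alt lines
// ("Gavin Williams 3+ Strikeouts"), pitcher K O/U ("Mike Burrows Over").
function parseRunnerPlayerName(runner: string): string {
  return runner
    .replace(/\s+\d+\+\s+Strikeouts$/i, "") // strip "3+ Strikeouts"
    .replace(/\s+(Over|Under)$/i, "")       // strip "Over" / "Under"
    .trim();                                // plain names pass through
}
```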
Edge scanner must run AFTER curves. Curves must run AFTER data collection. The timing chain is data (:00) → curves (:10) → edges (:12). If the scanner runs at :00, it reads yesterday's curves.
Staggered morning schedule matters. Raw stats (Group 1, 9:00) → baselines (Group 2, 9:05) → derived metrics & matchup context (Group 3, 9:10). If you compute matchup adjustments before baselines are updated, you get yesterday's matchup context.
Live data leaks in without explicit checks. Always check game start time. Skip props for games that have already started. Filter settled contracts (bid <= 5 or bid >= 95). This isn't just about cleanliness — live data corrupts your curves and produces fake edges.
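A sketch of the two checks, assuming a row shape with a game start timestamp and a yes bid in cents (field names are illustrative):

```typescript
// Skip props for games that have already started, and contracts that
// are effectively settled (bid <= 5 or bid >= 95, per the rule above).
interface PropRow {
  gameStartIso: string; // ISO timestamp of the game's start
  yesBid: number;       // cents, 0-100
}

function shouldSkipRow(row: PropRow, now: Date = new Date()): boolean {
  if (new Date(row.gameStartIso) <= now) return true; // live game
  if (row.yesBid <= 5 || row.yesBid >= 95) return true; // settled
  return false;
}
```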
Kalshi API rate limits are aggressive. Kalshi returns 429 (too many requests) if you hit their API too fast. When scanning multiple series in a loop, use at least 2-second delays between calls. The collector already handles this with sleep between series, but one-off scripts and new scrapers need it too. If you're scanning 15+ series (like league leaders), expect it to take 30+ seconds. Don't retry 429s immediately — wait and try on the next cron cycle. The cadence system handles this automatically for scheduled runs.
An AI starting a new player props desk MUST ask and get answers for these before writing any code:
This template is updated as new desks are built and new lessons are learned. The "Things to Watch Out For" section grows with every desk.
Last updated: 2026-04-08
Running log of changes and decisions for the player props pipeline. Each entry has two sections: Changes and Boss Notes.
If you're building this from scratch, read every Boss Notes section first — that's how the boss thinks about this system.
Changes:
The probability curves table had threshold = 0 for all pitcher strikeout rows, and batter prop rows were mostly missing entirely. Three bugs:
- Threshold came from fdData.line (always 0) instead of fdData.threshold (the actual number like 5, 6, 7). Fix: use fdData.threshold - 0.5 for the exceedance calculation.
- Rows were keyed by market name only — for pitcher strikeouts, every threshold shares the same market name, so only 1 threshold per pitcher was kept. Fix: key by market + threshold.
- Prop type configs expected per-threshold market names (batter_hits_1plus, pitcher_alt_strikeouts, etc.). Actual FD data uses single market names (batter_hits, pitcher_strikeouts) with a threshold column. Fix: updated all 5 prop type configs.

Also filtered out combined pitcher props ("Zac Gallen & Freddy Peralta") — useless for curves (no model rate for pairs, Kalshi doesn't offer them).
Result: 424 broken rows → 4,257 correct rows. Lessons #7 and #8 in "Things to Watch Out For" rewritten.
Files: src/pipeline/data/scrapers/mlb-prop-probability-curves.ts
Boss Notes:
- He reviews everything through the dashboard at /data-status/view/. If a column looks wrong (all zeros, weird names, missing data), he'll flag it. The dashboard is his primary QA tool.

Changes:
Built MLB prop settlement and added tracking columns to the probability curves table.
New columns on mlb_prop_probability_curves:
- is_main_line — flags which threshold is FD's headline line (closest to 50/50 implied probability)
- actual_stat — what actually happened (e.g., 7 strikeouts)
- outcome — "over" or "under" relative to each threshold
- fd_was_right / model_was_right — who called the correct side
- fd_error / model_error — how far off each probability was from reality
- settled_at — when result was recorded

New file: src/pipeline/data/scrapers/settle-mlb-props.ts
Cron: 0 15,17,19,21,23,1 * * * (every 2h from 3PM-1AM ET) — settles as games finish, not waiting until morning.
Schema update added to ensureTable() with migration logic (ALTERs if columns missing).
Updated Table 3 standard schema in this template with all settlement columns.
Files: src/pipeline/data/scrapers/mlb-prop-probability-curves.ts, src/pipeline/data/scrapers/settle-mlb-props.ts (new), src/cron/scheduler.ts
Boss Notes:
- He wants one threshold flagged (is_main_line = 1) so he can filter the dashboard to just headline lines and see the scorecard at a glance. Not a value on every row — just a flag on the one threshold closest to 50/50.
- No separate fd_main_threshold column — redundant, just filter by is_main_line = 1 and the threshold column tells you what it is.

Changes:
Added normalizeName() to the curve builder and settlement function. Strips accents (ñ→n, é→e), removes Jr./Sr. suffixes, lowercases. Applied on both sides — when building the model rates map and when looking up rates for FD players. Same normalizer added to settlement for matching box score names.
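A minimal sketch of that normalizer (same transforms: accent stripping, Jr./Sr. removal, lowercasing):

```typescript
// Normalize player names for cross-source matching: strip accents,
// drop Jr./Sr. suffixes, lowercase, collapse whitespace.
function normalizeName(name: string): string {
  return name
    .normalize("NFD")                  // decompose: ñ -> n + combining tilde
    .replace(/[\u0300-\u036f]/g, "")   // strip combining marks
    .replace(/\s+(Jr\.?|Sr\.?)$/i, "") // drop generational suffix
    .toLowerCase()
    .replace(/\s+/g, " ")
    .trim();
}
```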
Players like "Ronald Acuña Jr." (model) now match "Ronald Acuna Jr." (FanDuel). Missing model data went from 42 players to 14. Remaining 14 are genuinely not in the model (rookies, bench players, or unusual name formats like "C.J. Abrams", "Max P. Muncy").
Files: src/pipeline/data/scrapers/mlb-prop-probability-curves.ts, src/pipeline/data/scrapers/settle-mlb-props.ts
Boss Notes:
Changes:
The dashboard's percentage formatter was multiplying edge_yes/edge_no by 100, but these columns are already stored as 0-100 percentages. A value of 1.25 (meaning 1.25%) was displayed as 125%. Removed edge_yes and edge_no from the explicit ×100 formatting list in data-view.ts. These columns only exist in probability curves tables, which all store values as 0-100.
Files: src/pipeline/data-status/data-view.ts
Boss Notes:
Changes:
- mlb-tier2-metrics.ts now reads actual SP hand from sp_baselines and passes it through to matchup context. Was missing, so platoon adjustments had no hand to work with.
- mlb-prop-matchup-context.ts now reads individual batter splits (vs LHP/RHP) from mlb-batting.db and computes a platoon_factor. Team-level platoon splits moved off the props desk (too noisy).
- mlb-prop-edge-scanner.ts and mlb-props-edge-scanner.ts now apply the platoon factor to adjust model probability before computing edges.
- batter_game_logs. Column naming fixed: vsSplit→statSplits, LHP/RHP→L/R.
- New savant_batter_stats table in mlb-batting.db. Scrapes the Statcast leaderboard for barrel%, exit velo, hard hit%, xBA, xSLG, etc. Replaced the old team batting scraper that was pulling wrong data.
- final_total, result, and settlement cron now work against mlb-edges.db (was only running against research-pipeline.db, which was empty for MLB).
- (mlb-pitching.db). Cron schedules fixed to match actual run times.

Files: mlb-tier2-metrics.ts, mlb-prop-matchup-context.ts, mlb-prop-edge-scanner.ts, mlb-props-edge-scanner.ts, scrape-baseball-savant.ts, data-view.ts, desk-config.ts, source-tables.ts, scheduler.ts
Boss Notes:
Changes:
Added all 15 KXLEADERMLB series to the collector, dual-writer, and dashboard. New DB: kalshi-mlb-leaders.db with kalshi_mlb_leaders table. Columns: ticker, leader_type, player_name, yes_bid, yes_ask, volume, snapshot_type, captured_at. Initial scan pulled 951 contracts across 13 active series (Saves and Batter Strikeouts have 0 markets).
Series: HR, hits, RBI, runs, steals, doubles, triples, batting avg, OPS, ERA, pitcher wins, saves, pitcher K, batter K, WAR.
Dashboard: new "League Leaders" group with filtered views for HR, hits, ERA, WAR.
Adaptive schedule via existing cadence system: weekly when 3+ months out, every 3 days at 1-3 months, daily in final month.
Added lesson #9 to "Things to Watch Out For": Kalshi API rate limits — 2-second delays between series scans.
Files: src/pipeline/sports/collector.ts, src/pipeline/sports/mlb-dual-write.ts, src/pipeline/mlb-db.ts, src/pipeline/data-status/source-tables.ts, src/pipeline/data-status/desk-config.ts, scripts/scan-mlb-leaders.ts (new)
Boss Notes: