Certainty Gap Desk — Final Merged Specification

Version: 1.0 — Merged Final Date: 2026-03-15 Panel: Claude Opus 4.6 (base), Gemini 3.1 Pro, Grok 4.1, DeepSeek V3.1, Sonar Reasoning Pro Status: Authoritative build document for coding bot

This document merges all four panel responses into one complete spec. Opus 4.6 is the complete base — nothing removed. Unique contributions from Gemini, Grok, and DeepSeek are tagged [PANEL: Model Name]. This is a data collection spec targeting Kalshi and Polymarket (both ladder-structured prediction markets), plus traditional financial markets. The Certainty Gap IS the edge: where AI/models are highly certain but prediction market prices still imply doubt, you trade the gap.


Table of Contents

  1. What Is the Certainty Gap Desk
  2. Architecture Decision
  3. System Architecture
  4. Data Collection Spec
  5. SQLite Schema — CREATE Statements
  6. API Endpoints and Collection Cadence
  7. Storage Estimates
  8. All Strategies — Complete Catalog
  9. Prediction Market Execution — Kalshi and Polymarket
  10. Portfolio Allocation
  11. Build Priority
  12. Key SQL Queries
  13. Implementation Checklist

What Is the Certainty Gap Desk

A cross-market desk that looks for ONE pattern everywhere:

Something is near-100% certain (because of physics, math, or rules), but the market is still pricing it like there's doubt. That gap between certainty and market price is where money lives.

This came out of reviewing the weather lock-in strategy. Temperature always drops after the daily high — that is physics. But Kalshi still prices "over" contracts like the temp might come back up. The same pattern appears in options, crypto, sports, bonds, forex, and prediction markets.

The desk is specifically optimized to find situations where:

The gap IS the edge. We are not forecasting. We are exploiting certainty the market has not priced in.


Architecture Decision

3 out of 4 panel members: STANDALONE desk.

Breakdown:

Why this is NOT arbitrage:

Dimension Arbitrage Desk Certainty Gap Desk
Time horizon Milliseconds to minutes Hours to weeks
Edge source Price discrepancy between venues Knowledge asymmetry about outcome certainty
Risk profile Near-zero (simultaneous execution) Small but real (the 0.02% tail)
Data needed Price feeds from multiple venues Physics models, rules databases, statistical distributions
Execution style Speed-critical, simultaneous legs Patient entry, time-based exit
Capital management Brief exposure, rapid recycling Capital locked for holding period
Failure mode Execution risk (leg risk) The rare event where "certain" was not certain

[PANEL: Gemini 3.1 Pro] Matrix Architecture framing: Think of the system as a grid. Columns are execution environments (Crypto Desk, Options Desk, Sports Desk). They handle API connections, liquidity checks, and order routing. Rows are Strategy Engines (Arbitrage Engine, Momentum Engine, Convergence/Certainty Engine). The Certainty Gap engine scans for gaps and sends trade instructions to the relevant execution desk. This avoids duplicating infrastructure.


System Architecture

Certainty Gap Desk
├── Pattern Library (the "certainty database" — the moat)
│   ├── Physics-based certainties (weather, thermodynamics)
│   ├── Mathematical certainties (options expiry, theta decay, futures convergence)
│   ├── Rule/structural certainties (sports rules, contracts, redemptions, index rules)
│   └── Statistical near-certainties (>99.5% historical base rate)
│
├── Scanner Layer (per-market monitors)
│   ├── Options scanner (theta, pin risk, expiry mechanics, IV percentile)
│   ├── Crypto scanner (pegs, funding rates, liquidation cascades, futures basis)
│   ├── Volatility scanner (VIX term structure, backwardation detection)
│   ├── Prediction market scanner (Kalshi, Polymarket — ladder structure aware)
│   ├── Sports scanner (live game states, garbage time detection)
│   ├── Weather scanner (NWS physical models vs market prices)
│   ├── Forex/rates scanner (central bank rate differentials, forward curves)
│   ├── Equity scanner (index rebalancing announcements, earnings IV)   [PANEL: Gemini 3.1 Pro]
│   └── Fixed income scanner (callable bond call dates, CEF NAV discounts, tender offers)   [PANEL: Gemini 3.1 Pro]
│
├── Gap Detector (compares certainty level vs market price)
│   ├── Certainty score: how sure are we? (99.5%? 99.99%?)
│   ├── Market implied probability: what does the price say?
│   ├── Gap size: certainty_score - market_implied_prob
│   ├── Threshold filter: only act when gap > minimum edge
│   └── Fee-adjusted net edge: gap minus round-trip trading cost
│
├── Risk Manager
│   ├── Per-trade max exposure
│   ├── Correlation monitor (are multiple bets on same underlying risk?)
│   ├── Tail risk budget (what if the 0.02% event hits?)
│   ├── Kelly criterion position sizing adjusted for certainty level
│   └── Per-strategy capital caps
│
└── Execution Layer
    ├── Kalshi adapter (ladder-structure aware, REST API)
    ├── Polymarket adapter (ladder-structure aware, REST/CLOB API)
    ├── Crypto exchange adapters (Binance, Bybit, Coinbase, DEX)
    ├── Options/equities adapter (IBKR API)
    └── Holding period manager (time-based exits)

The Pattern Library is the moat. Every new certainty gap discovered gets added. The desk becomes a living encyclopedia of market mispricing about known outcomes. Over time, the library compounds in value.

[PANEL: Gemini 3.1 Pro] Execution routing model: The scanner layer does not execute directly. It emits a gap_signal record with: market, instrument, certainty_score, market_implied_prob, gap_size, suggested_action, target_desk. The target desk executes via its own adapters. This keeps execution logic centralized per asset class.


Data Collection Spec

What Gets Collected

For every market we monitor, we collect:

  1. Raw market prices — what the market currently implies as probability or price
  2. Certainty scores — our computed probability from physics/math/rules/statistics
  3. Gap signals — when certainty > market_price by threshold
  4. Positions and outcomes — what we traded, at what price, what it settled at
  5. Pattern performance — which pattern types produce real edges vs. false positives

Collection feeds by market:

Market Data Feed Cadence Key Fields
Kalshi prediction markets REST API (no auth for market data) 60s for active contracts, 5min for inactive ticker, yes_price, no_price, volume, close_time, result
Polymarket REST API + CLOB 60s for active condition_id, yes_price, no_price, volume, end_date, resolved
NWS weather api.weather.gov 15min temperature, forecast_temp, station_id, observation_time
VIX spot + futures CBOE REST or IBKR 5min during market hours vix_spot, m1_futures, m2_futures, term_structure_slope
SPX/SPY 0DTE options IBKR or Tradier API 1min last 3 hours of trading day strike, expiry, iv, delta, theta, gamma, bid, ask, oi
Crypto funding rates Binance/Bybit WebSocket Real-time (every 8h funding, 1min spot) symbol, funding_rate, mark_price, spot_price, next_funding_time
Crypto liquidations Coinglass API + exchange WebSocket Real-time symbol, liq_amount_usd, direction, cumulative_1h, cumulative_4h
Stablecoin prices Binance/Kraken/Coinbase REST 60s pair, price, volume_24h, deviation_from_peg
Crypto futures basis Binance quarterly futures 5min symbol, spot_price, futures_price, basis_pct, expiry_date
Index rebalancing events SEC EDGAR + S&P/FTSE press releases scrape Daily at 6AM ET company, index, announcement_date, effective_date, sector
CEF NAV and price Morningstar or fund sponsor Daily at 5PM ET ticker, nav, market_price, discount_pct, discount_vs_historical
CEF tender offers SEC EDGAR Form TO-I scrape Daily at 6AM ET fund_ticker, tender_price_pct_nav, expiry_date, announced
Callable bond call dates FINRA TRACE or Bloomberg Daily cusip, issuer, call_date, call_price, current_price, ytc
Earnings IV data IBKR or Tradier Weekly (earnings calendar sweep) ticker, earnings_date, current_iv, historical_move_avg, implied_move
Forex forward rates + central bank rates OANDA REST 1h pair, spot_rate, 1m_forward, 3m_forward, base_rate, quote_rate, differential
Sports live game states The Odds API + ESPN live 30s during games game_id, sport, score_diff, time_remaining, live_moneyline, live_spread
LST (liquid staking token) prices CoinGecko + Curve pool 5min token, underlying, discount_pct, redemption_queue_days

SQLite Schema

All tables go into a single database: /home/ubuntu/edgeclaw/data/db/certainty-gap.db

-- ============================================================
-- CERTAINTY GAP DESK — SQLite Schema
-- /home/ubuntu/edgeclaw/data/db/certainty-gap.db
-- ============================================================

PRAGMA journal_mode=WAL;
PRAGMA foreign_keys=ON;

-- ============================================================
-- PATTERN LIBRARY
-- The growing database of "certainty types" the desk knows about.
-- Each row is a class of certainty gap, not an individual trade.
-- ============================================================

CREATE TABLE IF NOT EXISTS pattern_library (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern_name        TEXT NOT NULL UNIQUE,          -- e.g. "weather_daily_high_lockin"
    category            TEXT NOT NULL,                 -- physics | math | rule | statistical
    description         TEXT NOT NULL,
    source_of_certainty TEXT NOT NULL,                 -- NWS thermodynamics, options math, etc.
    markets             TEXT NOT NULL,                 -- JSON array: ["kalshi","polymarket","options"]
    typical_certainty   REAL NOT NULL,                 -- 0.0-1.0, e.g. 0.9970
    typical_gap_size    REAL,                          -- avg gap seen historically
    win_rate_observed   REAL,                          -- updated from outcomes table
    trade_count         INTEGER DEFAULT 0,
    is_active           INTEGER DEFAULT 1,             -- 0 = retired
    notes               TEXT,
    created_at          TEXT DEFAULT (datetime('now')),
    updated_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_pattern_library_category ON pattern_library(category);
CREATE INDEX IF NOT EXISTS idx_pattern_library_active ON pattern_library(is_active);

-- ============================================================
-- MARKET PRICE SNAPSHOTS
-- Raw price data from all monitored markets.
-- This is the "what does the market say" side of the gap.
-- ============================================================

CREATE TABLE IF NOT EXISTS market_snapshots (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    market_type         TEXT NOT NULL,                 -- kalshi | polymarket | options | vix | crypto_funding | stablecoin | lst | cef | forex | sports_live | earnings_iv
    market_source       TEXT NOT NULL,                 -- exchange or API name
    instrument_id       TEXT NOT NULL,                 -- ticker, contract ID, market slug
    instrument_label    TEXT,                          -- human-readable description
    snapshot_time       TEXT NOT NULL,                 -- ISO8601
    -- prediction market fields
    yes_price           REAL,                          -- 0-1 implied probability (or 0-100 cents)
    no_price            REAL,
    volume_usd          REAL,
    open_interest       REAL,
    close_time          TEXT,                          -- when market resolves
    -- options fields
    option_type         TEXT,                          -- call | put
    strike              REAL,
    expiry              TEXT,
    iv                  REAL,
    delta               REAL,
    theta               REAL,
    gamma               REAL,
    bid                 REAL,
    ask                 REAL,
    -- VIX fields
    vix_spot            REAL,
    vix_m1_futures      REAL,
    vix_m2_futures      REAL,
    term_structure_slope REAL,                         -- m1-m2, negative = backwardation
    -- crypto fields
    funding_rate        REAL,                          -- 8h rate, e.g. 0.0001 = 0.01%
    funding_rate_annualized REAL,
    spot_price          REAL,
    futures_price       REAL,
    basis_pct           REAL,                          -- (futures-spot)/spot
    liquidations_1h_usd REAL,
    liquidations_4h_usd REAL,
    -- stablecoin / LST fields
    peg_target          REAL DEFAULT 1.0,
    peg_deviation       REAL,                          -- current_price - peg_target
    redemption_queue_days INTEGER,
    -- CEF / bond fields
    nav                 REAL,
    market_price        REAL,
    discount_pct        REAL,                          -- (market_price - nav) / nav
    call_date           TEXT,
    call_price          REAL,
    -- forex fields
    spot_rate           REAL,
    forward_1m          REAL,
    forward_3m          REAL,
    base_interest_rate  REAL,
    quote_interest_rate REAL,
    rate_differential   REAL,
    -- sports fields
    score_diff          REAL,                          -- home minus away
    time_remaining_sec  INTEGER,
    win_prob_live       REAL,                          -- sportsbook implied
    -- earnings fields
    historical_move_avg REAL,                          -- avg actual move over N earnings
    implied_move        REAL,                          -- what options are pricing
    move_overpricing    REAL,                          -- implied - historical
    raw_payload         TEXT,                          -- full JSON from API for reprocessing
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_snapshots_market_type ON market_snapshots(market_type);
CREATE INDEX IF NOT EXISTS idx_snapshots_instrument ON market_snapshots(instrument_id);
CREATE INDEX IF NOT EXISTS idx_snapshots_time ON market_snapshots(snapshot_time);
CREATE INDEX IF NOT EXISTS idx_snapshots_type_time ON market_snapshots(market_type, snapshot_time);

-- ============================================================
-- CERTAINTY SCORES
-- Our computed probability from physics/math/rules/statistics.
-- This is the "what do WE know" side of the gap.
-- ============================================================

CREATE TABLE IF NOT EXISTS certainty_scores (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern_id          INTEGER REFERENCES pattern_library(id),
    instrument_id       TEXT NOT NULL,                 -- same key as market_snapshots
    computed_at         TEXT NOT NULL,
    certainty_score     REAL NOT NULL,                 -- 0.0-1.0, our confidence in outcome
    outcome_direction   TEXT NOT NULL,                 -- YES | NO | UP | DOWN | REPEG | DECAY
    certainty_source    TEXT NOT NULL,                 -- NWS_API | options_math | kelly | historical_base_rate
    model_inputs        TEXT,                          -- JSON of the inputs used to compute
    supporting_evidence TEXT,                          -- JSON array of evidence items
    expiry              TEXT,                          -- when this certainty expires (trade window closes)
    is_stale            INTEGER DEFAULT 0,             -- 1 if certainty_score was revised
    notes               TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_certainty_pattern ON certainty_scores(pattern_id);
CREATE INDEX IF NOT EXISTS idx_certainty_instrument ON certainty_scores(instrument_id);
CREATE INDEX IF NOT EXISTS idx_certainty_time ON certainty_scores(computed_at);

-- ============================================================
-- GAP SIGNALS
-- When certainty_score - market_implied_prob exceeds threshold.
-- These are the actionable alerts.
-- ============================================================

CREATE TABLE IF NOT EXISTS gap_signals (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern_id          INTEGER REFERENCES pattern_library(id),
    certainty_score_id  INTEGER REFERENCES certainty_scores(id),
    snapshot_id         INTEGER REFERENCES market_snapshots(id),
    instrument_id       TEXT NOT NULL,
    instrument_label    TEXT,
    market_type         TEXT NOT NULL,
    detected_at         TEXT NOT NULL,
    -- gap math
    certainty_score     REAL NOT NULL,
    market_implied_prob REAL NOT NULL,
    gap_size            REAL NOT NULL,                 -- certainty_score - market_implied_prob
    fee_estimate        REAL DEFAULT 0.0,              -- round-trip cost as fraction
    net_edge            REAL,                          -- gap_size - fee_estimate
    -- signal metadata
    recommended_action  TEXT,                          -- BUY_YES | BUY_NO | SELL_PUT | SHORT_VIX | etc.
    position_size_kelly REAL,                          -- Kelly fraction of bankroll
    max_capital_usd     REAL,                          -- hard cap regardless of Kelly
    holding_period_est  TEXT,                          -- e.g. "2-6 hours" "1-3 days"
    target_desk         TEXT,                          -- which desk should execute
    -- outcome
    signal_status       TEXT DEFAULT 'open',           -- open | traded | expired | dismissed
    dismissed_reason    TEXT,
    trade_id            INTEGER,                       -- FK to trades table, set on execution
    resolved_at         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_signals_status ON gap_signals(signal_status);
CREATE INDEX IF NOT EXISTS idx_signals_detected ON gap_signals(detected_at);
CREATE INDEX IF NOT EXISTS idx_signals_market ON gap_signals(market_type);
CREATE INDEX IF NOT EXISTS idx_signals_net_edge ON gap_signals(net_edge DESC);
CREATE INDEX IF NOT EXISTS idx_signals_instrument ON gap_signals(instrument_id);

-- ============================================================
-- TRADES
-- Every position taken by the desk.
-- ============================================================

CREATE TABLE IF NOT EXISTS trades (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    signal_id           INTEGER REFERENCES gap_signals(id),
    pattern_id          INTEGER REFERENCES pattern_library(id),
    instrument_id       TEXT NOT NULL,
    instrument_label    TEXT,
    market_type         TEXT NOT NULL,
    direction           TEXT NOT NULL,                 -- YES | NO | LONG | SHORT | SELL_CALL | etc.
    entry_time          TEXT NOT NULL,
    entry_price         REAL NOT NULL,                 -- in probability (0-1) or dollar price
    position_size_usd   REAL NOT NULL,
    max_loss_usd        REAL,
    -- exit
    exit_time           TEXT,
    exit_price          REAL,
    exit_reason         TEXT,                          -- expiry | target_hit | stop_loss | manual
    -- P&L
    gross_pnl_usd       REAL,
    fees_paid_usd       REAL DEFAULT 0.0,
    net_pnl_usd         REAL,
    return_pct          REAL,                          -- net_pnl / position_size_usd
    -- context at entry
    certainty_at_entry  REAL,
    market_price_at_entry REAL,
    gap_at_entry        REAL,
    -- context at exit
    outcome             TEXT,                          -- win | loss | partial | pushed
    notes               TEXT,
    created_at          TEXT DEFAULT (datetime('now')),
    updated_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_trades_signal ON trades(signal_id);
CREATE INDEX IF NOT EXISTS idx_trades_market ON trades(market_type);
CREATE INDEX IF NOT EXISTS idx_trades_time ON trades(entry_time);
CREATE INDEX IF NOT EXISTS idx_trades_outcome ON trades(outcome);
CREATE INDEX IF NOT EXISTS idx_trades_pattern ON trades(pattern_id);

-- ============================================================
-- PATTERN PERFORMANCE SUMMARY
-- Materialized view (maintained by code) of per-pattern stats.
-- Updated after each trade settlement.
-- ============================================================

CREATE TABLE IF NOT EXISTS pattern_performance (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    pattern_id          INTEGER REFERENCES pattern_library(id) UNIQUE,
    total_trades        INTEGER DEFAULT 0,
    wins                INTEGER DEFAULT 0,
    losses              INTEGER DEFAULT 0,
    pushes              INTEGER DEFAULT 0,
    win_rate            REAL,
    total_gross_pnl     REAL DEFAULT 0.0,
    total_fees          REAL DEFAULT 0.0,
    total_net_pnl       REAL DEFAULT 0.0,
    avg_return_pct      REAL,
    avg_certainty_at_entry REAL,
    avg_gap_at_entry    REAL,
    avg_holding_hours   REAL,
    sharpe_ratio        REAL,
    max_single_loss_usd REAL,
    max_drawdown_usd    REAL,
    last_trade_at       TEXT,
    updated_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_perf_pattern ON pattern_performance(pattern_id);

-- ============================================================
-- KALSHI MARKET INDEX
-- Tracks known Kalshi markets relevant to certainty gap strategies.
-- Refreshed daily.
-- ============================================================

CREATE TABLE IF NOT EXISTS kalshi_markets (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker              TEXT NOT NULL UNIQUE,
    title               TEXT,
    category            TEXT,                          -- weather | economic | sports | politics | etc.
    yes_price           REAL,                          -- latest cents (0-100)
    no_price            REAL,
    volume_24h          REAL,
    open_interest       REAL,
    close_time          TEXT,
    result              TEXT,                          -- null until settled
    pattern_match       TEXT,                          -- which pattern_library entry this maps to
    is_active           INTEGER DEFAULT 1,
    last_checked        TEXT,
    created_at          TEXT DEFAULT (datetime('now')),
    updated_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_kalshi_category ON kalshi_markets(category);
CREATE INDEX IF NOT EXISTS idx_kalshi_close ON kalshi_markets(close_time);
CREATE INDEX IF NOT EXISTS idx_kalshi_active ON kalshi_markets(is_active);
CREATE INDEX IF NOT EXISTS idx_kalshi_pattern ON kalshi_markets(pattern_match);

-- ============================================================
-- POLYMARKET MARKET INDEX
-- Tracks known Polymarket markets relevant to certainty gap strategies.
-- ============================================================

CREATE TABLE IF NOT EXISTS polymarket_markets (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    condition_id        TEXT NOT NULL UNIQUE,
    slug                TEXT,
    title               TEXT,
    category            TEXT,
    yes_price           REAL,                          -- 0.0-1.0 USDC
    no_price            REAL,
    volume_24h          REAL,
    liquidity_usd       REAL,
    end_date            TEXT,
    resolved            INTEGER DEFAULT 0,
    resolution          TEXT,                          -- YES | NO | null
    pattern_match       TEXT,
    is_active           INTEGER DEFAULT 1,
    last_checked        TEXT,
    created_at          TEXT DEFAULT (datetime('now')),
    updated_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_poly_category ON polymarket_markets(category);
CREATE INDEX IF NOT EXISTS idx_poly_end ON polymarket_markets(end_date);
CREATE INDEX IF NOT EXISTS idx_poly_active ON polymarket_markets(is_active);
CREATE INDEX IF NOT EXISTS idx_poly_pattern ON polymarket_markets(pattern_match);

-- ============================================================
-- WEATHER OBSERVATIONS
-- NWS data for weather-based certainty strategies.
-- ============================================================

CREATE TABLE IF NOT EXISTS weather_observations (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    station_id          TEXT NOT NULL,
    city                TEXT,
    observation_time    TEXT NOT NULL,
    temperature_f       REAL,
    temperature_c       REAL,
    daily_high_f        REAL,                          -- max so far today
    daily_low_f         REAL,                          -- min so far today
    forecast_high_f     REAL,                          -- NWS forecast for today
    forecast_low_f      REAL,
    sunrise_time        TEXT,
    sunset_time         TEXT,
    is_post_sunrise     INTEGER,                       -- 1 if past sunrise
    is_post_peak        INTEGER,                       -- 1 if temp has started falling
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_weather_station ON weather_observations(station_id);
CREATE INDEX IF NOT EXISTS idx_weather_time ON weather_observations(observation_time);
CREATE INDEX IF NOT EXISTS idx_weather_station_time ON weather_observations(station_id, observation_time);

-- ============================================================
-- VIX TIMESERIES
-- Dedicated table for VIX data (high write frequency).
-- ============================================================

CREATE TABLE IF NOT EXISTS vix_timeseries (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    observed_at         TEXT NOT NULL,
    vix_spot            REAL,
    vix_m1              REAL,
    vix_m2              REAL,
    vix_m3              REAL,
    contango_m1_m2      REAL,                          -- m1-m2 spread (positive = contango, negative = backwardation)
    realized_vol_30d    REAL,
    vol_risk_premium    REAL,                          -- vix_spot - realized_vol_30d
    spike_flag          INTEGER DEFAULT 0,             -- 1 if vix_spot > 30
    reversion_signal    INTEGER DEFAULT 0,             -- 1 if spike is fading (first down day)
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_vix_time ON vix_timeseries(observed_at);
CREATE INDEX IF NOT EXISTS idx_vix_spike ON vix_timeseries(spike_flag, observed_at);

-- ============================================================
-- CRYPTO FUNDING RATES
-- Dedicated table for funding rate data across exchanges.
-- ============================================================

CREATE TABLE IF NOT EXISTS crypto_funding_rates (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    exchange            TEXT NOT NULL,                 -- binance | bybit | dydx | hyperliquid
    symbol              TEXT NOT NULL,                 -- BTCUSDT, ETHUSDT, etc.
    funding_time        TEXT NOT NULL,
    funding_rate        REAL NOT NULL,                 -- 8h rate
    funding_rate_annualized REAL,                      -- funding_rate * 3 * 365
    mark_price          REAL,
    spot_price          REAL,
    basis              REAL,                           -- mark - spot
    open_interest_usd   REAL,
    is_favorable        INTEGER DEFAULT 0,             -- 1 if rate > threshold (worthwhile to harvest)
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE UNIQUE INDEX IF NOT EXISTS idx_funding_unique ON crypto_funding_rates(exchange, symbol, funding_time);
CREATE INDEX IF NOT EXISTS idx_funding_symbol ON crypto_funding_rates(symbol);
CREATE INDEX IF NOT EXISTS idx_funding_favorable ON crypto_funding_rates(is_favorable, funding_time);

-- ============================================================
-- CRYPTO LIQUIDATIONS
-- Liquidation cascade data for cascade recovery strategy.
-- ============================================================

CREATE TABLE IF NOT EXISTS crypto_liquidations (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    observed_at         TEXT NOT NULL,
    symbol              TEXT NOT NULL,
    exchange            TEXT,
    liq_usd_1h          REAL,                          -- rolling 1h liquidation total
    liq_usd_4h          REAL,                          -- rolling 4h liquidation total
    liq_usd_24h         REAL,
    long_liq_usd_1h     REAL,
    short_liq_usd_1h    REAL,
    spot_price          REAL,
    cascade_signal      INTEGER DEFAULT 0,             -- 1 if >$500M in 1-4h for BTC
    funding_rate        REAL,                          -- negative = shorts paying longs (max bearish)
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_liq_symbol ON crypto_liquidations(symbol);
CREATE INDEX IF NOT EXISTS idx_liq_time ON crypto_liquidations(observed_at);
CREATE INDEX IF NOT EXISTS idx_liq_cascade ON crypto_liquidations(cascade_signal, observed_at);

-- ============================================================
-- STABLECOIN PRICES
-- Price monitoring for depeg detection.
-- ============================================================

CREATE TABLE IF NOT EXISTS stablecoin_prices (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    observed_at         TEXT NOT NULL,
    symbol              TEXT NOT NULL,                 -- USDC, USDT, DAI
    exchange            TEXT NOT NULL,
    price_usd           REAL NOT NULL,
    volume_24h          REAL,
    peg_deviation       REAL,                          -- price - 1.0
    depeg_alert_level   TEXT,                          -- none | alert | buy_tranche_1 | buy_tranche_2 | buy_tranche_3
    -- tranche thresholds: alert=$0.995, t1=$0.98, t2=$0.97, t3=$0.96
    is_algorithmic      INTEGER DEFAULT 0,             -- 0 = collateralized, 1 = never trade
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_stable_symbol ON stablecoin_prices(symbol);
CREATE INDEX IF NOT EXISTS idx_stable_time ON stablecoin_prices(observed_at);
CREATE INDEX IF NOT EXISTS idx_stable_alert ON stablecoin_prices(depeg_alert_level, observed_at);

-- ============================================================
-- CEF (CLOSED-END FUND) TRACKING
-- NAV discount monitoring and tender offer tracking.
-- [PANEL: Gemini 3.1 Pro]
-- ============================================================

CREATE TABLE IF NOT EXISTS cef_snapshots (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker              TEXT NOT NULL,
    fund_name           TEXT,
    category            TEXT,                          -- muni | equity | bond | etc.
    nav                 REAL NOT NULL,
    market_price        REAL NOT NULL,
    discount_pct        REAL NOT NULL,                 -- (market-nav)/nav, negative = discount
    historical_avg_discount REAL,
    discount_z_score    REAL,                          -- how unusual is this discount
    has_tender_offer    INTEGER DEFAULT 0,
    tender_offer_price  REAL,                          -- % of NAV if tender exists
    tender_expiry       TEXT,
    observed_at         TEXT NOT NULL,
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_cef_ticker ON cef_snapshots(ticker);
CREATE INDEX IF NOT EXISTS idx_cef_discount ON cef_snapshots(discount_pct);
CREATE INDEX IF NOT EXISTS idx_cef_tender ON cef_snapshots(has_tender_offer);

-- ============================================================
-- INDEX REBALANCING EVENTS
-- Tracks S&P 500 and Russell additions/deletions.
-- [PANEL: Gemini 3.1 Pro]
-- ============================================================

CREATE TABLE IF NOT EXISTS index_rebalancing (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker              TEXT NOT NULL,
    company_name        TEXT,
    index_name          TEXT NOT NULL,                 -- SP500 | Russell2000 | Russell1000 | etc.
    action              TEXT NOT NULL,                 -- ADD | REMOVE
    announcement_date   TEXT NOT NULL,
    effective_date      TEXT NOT NULL,
    sector              TEXT,
    sector_etf_hedge    TEXT,                          -- which ETF to short as hedge
    -- opportunity window
    entry_signal_date   TEXT,                          -- when to buy (after announcement)
    exit_signal_date    TEXT,                          -- when to sell (effective_date close)
    estimated_forced_buying_usd REAL,                 -- approx. how much index funds must buy
    -- outcome tracking
    price_at_announcement REAL,
    price_at_effective   REAL,
    pct_return          REAL,
    trade_id            INTEGER REFERENCES trades(id),
    notes               TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_rebal_ticker ON index_rebalancing(ticker);
CREATE INDEX IF NOT EXISTS idx_rebal_effective ON index_rebalancing(effective_date);
CREATE INDEX IF NOT EXISTS idx_rebal_index ON index_rebalancing(index_name);

-- ============================================================
-- EARNINGS IV TRACKING
-- Historical move vs implied move for earnings IV crush.
-- ============================================================

CREATE TABLE IF NOT EXISTS earnings_iv_history (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ticker              TEXT NOT NULL,
    earnings_date       TEXT NOT NULL,
    -- pre-earnings state
    implied_move_pct    REAL,                          -- what options implied before earnings
    iv_rank             REAL,                          -- IV rank 0-100
    straddle_price      REAL,
    -- post-earnings actual
    actual_move_pct     REAL,                          -- what actually happened
    direction           TEXT,                          -- UP | DOWN
    iv_crush_pct        REAL,                          -- how much IV dropped after
    -- analysis
    move_overpriced     REAL,                          -- implied - actual (positive = options were expensive)
    edge_available      INTEGER DEFAULT 0,             -- 1 if historical pattern suggests selling is profitable
    -- trade outcome if we traded
    trade_id            INTEGER REFERENCES trades(id),
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE UNIQUE INDEX IF NOT EXISTS idx_earnings_unique ON earnings_iv_history(ticker, earnings_date);
CREATE INDEX IF NOT EXISTS idx_earnings_ticker ON earnings_iv_history(ticker);
CREATE INDEX IF NOT EXISTS idx_earnings_edge ON earnings_iv_history(edge_available);

-- ============================================================
-- LST (LIQUID STAKING TOKEN) PRICES
-- stETH, cbETH, rETH tracking for redemption arbitrage.
-- [PANEL: Gemini 3.1 Pro]
-- ============================================================

CREATE TABLE IF NOT EXISTS lst_prices (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    token               TEXT NOT NULL,                 -- stETH | cbETH | rETH
    underlying          TEXT NOT NULL,                 -- ETH
    exchange_or_pool    TEXT NOT NULL,                 -- curve | coinbase | uniswap
    price_in_underlying REAL NOT NULL,                 -- e.g. 0.9700 for 1 stETH = 0.97 ETH
    discount_pct        REAL NOT NULL,                 -- 1 - price_in_underlying
    redemption_queue_days INTEGER,
    smart_contract      TEXT,                          -- for direct redemption calls
    redemption_possible INTEGER DEFAULT 1,             -- 0 if contract paused
    observed_at         TEXT NOT NULL,
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_lst_token ON lst_prices(token);
CREATE INDEX IF NOT EXISTS idx_lst_discount ON lst_prices(discount_pct DESC);
CREATE INDEX IF NOT EXISTS idx_lst_time ON lst_prices(observed_at);

-- ============================================================
-- SPORTS LIVE GAME STATES
-- For garbage time detection and live sports certainty gaps.
-- ============================================================

CREATE TABLE IF NOT EXISTS sports_live_states (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    game_id             TEXT NOT NULL,
    sport               TEXT NOT NULL,                 -- NBA | NFL | NHL | MLB | NCAAB
    observed_at         TEXT NOT NULL,
    home_team           TEXT,
    away_team           TEXT,
    home_score          REAL,
    away_score          REAL,
    score_diff          REAL,                          -- home - away
    time_remaining_sec  INTEGER,
    quarter_period      INTEGER,
    -- derived
    win_prob_true       REAL,                          -- our computed true win probability
    win_prob_live_market REAL,                         -- what sportsbooks say
    garbage_time_flag   INTEGER DEFAULT 0,             -- 1 if score_diff and time trigger near-certain win
    kalshi_market_id    TEXT,
    polymarket_id       TEXT,
    raw_payload         TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_sports_game ON sports_live_states(game_id);
CREATE INDEX IF NOT EXISTS idx_sports_time ON sports_live_states(observed_at);
CREATE INDEX IF NOT EXISTS idx_sports_garbage ON sports_live_states(garbage_time_flag, sport);

-- ============================================================
-- COLLECTION LOG
-- Tracks every data collection run for debugging and monitoring.
-- ============================================================

CREATE TABLE IF NOT EXISTS collection_log (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    collector_name      TEXT NOT NULL,                 -- which collector ran
    run_at              TEXT NOT NULL,
    records_written     INTEGER DEFAULT 0,
    duration_ms         INTEGER,
    success             INTEGER DEFAULT 1,             -- 0 = error
    error_message       TEXT,
    created_at          TEXT DEFAULT (datetime('now'))
);

CREATE INDEX IF NOT EXISTS idx_log_collector ON collection_log(collector_name);
CREATE INDEX IF NOT EXISTS idx_log_time ON collection_log(run_at);
CREATE INDEX IF NOT EXISTS idx_log_success ON collection_log(success);

API Endpoints and Collection Cadence

Kalshi

Base URL: https://api.elections.kalshi.com/trade-api/v2
Auth: RSA-PSS signing (API Key ID: 54bc18ae-be42-49a7-a498-d336818c0d3b)

GET /markets
  - Params: limit=200, status=open, category={weather|economic|sports|...}
  - Cadence: Every 5 minutes for active markets, every 60 minutes for sweep of all categories
  - Fields to store: ticker, title, yes_ask, no_ask, volume, close_time, category

GET /markets/{ticker}/orderbook
  - Cadence: Every 60 seconds for markets flagged as gap_candidate
  - Fields: best_yes, best_no, depth

GET /markets/{ticker}
  - Cadence: On resolution (settled markets)
  - Fields: result, settled_price

Polymarket

Base URL: https://gamma-api.polymarket.com
No auth for market data reads.

GET /markets?active=true&closed=false&limit=100
  - Cadence: Every 5 minutes
  - Fields: condition_id, question, outcomes, outcomePrices, volume24hr, endDate

CLOB API: https://clob.polymarket.com
GET /markets/{condition_id}
  - Cadence: Every 60 seconds for gap candidates
  - Fields: yes, no prices from order book

Websocket: wss://clob.polymarket.com/ws/market
  - Subscribe to specific condition_ids for near-real-time prices
  - Use for any active trade positions

NWS Weather

Base URL: https://api.weather.gov
No auth required.

GET /stations/{station_id}/observations/latest
  - Stations: NWS station IDs for cities with active Kalshi weather markets
  - Cadence: Every 15 minutes
  - Fields: temperature, timestamp

GET /gridpoints/{office}/{gridX},{gridY}/forecast/hourly
  - Cadence: Every 60 minutes
  - Fields: temperature (hourly forecast), startTime

GET /points/{lat},{lon}
  - One-time call per city to get gridpoint coordinates

Sunrise/sunset: api.sunrise-sunset.org/json?lat={}&lng={}&date=today
  - Cadence: Once daily at midnight

CBOE / VIX Data

VIX spot:
  URL: https://cdn.cboe.com/api/global/delayed_quotes/charts/historical/_VIX.json
  Cadence: Every 5 minutes during market hours (9:30AM-4PM ET weekdays)

VIX futures term structure:
  URL: https://www.cboe.com/delayed_quotes/vx/term_structure  (scrape)
  OR: IBKR API for real-time VIX futures (requires account)
  Cadence: Every 5 minutes during market hours

Realized vol:
  Compute from SPX daily closes (calculate 30-day trailing HV)
  Source: Yahoo Finance or IBKR historical data

Crypto — Binance

Base URL: https://api.binance.com/api/v3
WebSocket: wss://stream.binance.com:9443/ws

Funding rates:
  REST: GET /fapi/v1/fundingRate?symbol={}&limit=1
  Cadence: Every 5 minutes per symbol (BTC, ETH, SOL, top 20 by OI)
  WebSocket: wss://fstream.binance.com/ws/{symbol}@markPrice  (real-time)

Liquidations:
  WebSocket: wss://fstream.binance.com/ws/!forceOrder@arr  (all symbols)
  Cadence: Real-time stream, aggregate into 1h/4h rolling windows

Futures basis:
  Spot: GET /api/v3/ticker/price?symbol=BTCUSDT
  Futures: GET /fapi/v1/ticker/price?symbol=BTCUSDT  (quarterly)
  Cadence: Every 5 minutes

Coinglass (Liquidation Aggregation)

Base URL: https://open-api.coinglass.com/public/v2
API key available on free tier.

GET /indicator/liquidation_history?symbol=BTC&time_type=h1&limit=24
  Cadence: Every 10 minutes
  Fields: liqUsd, longLiqUsd, shortLiqUsd, time

GET /futures/liquidation/detail/chart?symbol=BTC&interval=1h
  Cadence: Every 15 minutes

Stablecoin Prices

Coinbase: GET https://api.coinbase.com/v2/prices/USDC-USD/spot
Binance: GET https://api.binance.com/api/v3/ticker/price?symbol=USDCUSDT
Kraken: GET https://api.kraken.com/0/public/Ticker?pair=USDCUSD
DEX (Curve): The Graph API or direct RPC call to Curve USDC pool
  Endpoint: https://api.thegraph.com/subgraphs/name/curvefi/curve

Cadence: Every 60 seconds for all stablecoins (USDC, USDT, DAI)
Alert threshold: price < 0.995 = log alert, price < 0.98 = gap_signal

SEC EDGAR (CEF Tender Offers and Index Events)

EDGAR full-text search:
  URL: https://efts.sec.gov/LATEST/search-index?q=%22tender+offer%22&dateRange=custom&startdt={yesterday}&enddt={today}&forms=TO-I
  Cadence: Daily at 6:00 AM ET

S&P Dow Jones press releases:
  URL: https://www.spglobal.com/spdji/en/search/#overview  (scrape)
  Cadence: Daily at 6:00 AM ET and 6:00 PM ET (announcements after close)

FTSE Russell:
  URL: https://www.ftserussell.com/resources/russel-reconstitution  (scrape)
  Cadence: Daily during reconstitution season (May-June for Russell)

The Odds API (Sports)

API key: f63a46439d104a3a78dee17580c96279
Base URL: https://api.the-odds-api.com/v4

GET /sports/{sport}/odds?apiKey={}&markets=h2h&regions=us&live=true
  Sports: basketball_nba | americanfootball_nfl | icehockey_nhl | baseball_mlb
  Cadence: Every 30 seconds during live games
  Fields: home_team, away_team, bookmakers[0].markets[0].outcomes (live h2h)

ESPN live scores (supplemental for game state):
  URL: https://site.api.espn.com/apis/site/v2/sports/{sport}/scoreboard
  Cadence: Every 30 seconds during games
  Fields: score, clock, quarter/period, status

OANDA (Forex)

Base URL: https://api-fxtrade.oanda.com/v3
Auth: Bearer {OANDA_API_KEY} (existing key in forex bot)

GET /instruments/{pair}/candles?count=1&granularity=M1
  Cadence: Every 60 minutes
  Pairs: USD_JPY, USD_MXN, EUR_USD, GBP_USD (top carry differentials)

Central bank rate data:
  Fed: https://www.federalreserve.gov/releases/h15/  (scrape, daily)
  BOJ: https://www.stat-search.boj.or.jp/ (daily)
  ECB: https://data.ecb.europa.eu/data/datasets/FM (daily)

LST Prices (Liquid Staking Tokens)

CoinGecko:
  GET https://api.coingecko.com/api/v3/simple/price?ids=staked-ether,coinbase-wrapped-staked-eth,rocket-pool-eth&vs_currencies=eth
  Cadence: Every 5 minutes

Curve pool price (on-chain via RPC):
  Pool: 0xDC24316b9AE028F1497c275EB9192a3Ea0f67022  (stETH/ETH pool)
  Read: pool.get_dy(0, 1, 1e18) / 1e18 = price of 1 stETH in ETH
  RPC: Infura or Alchemy endpoint
  Cadence: Every 5 minutes

Storage Estimates

Table Rows/Day (Est.) Avg Row Size Monthly Storage
market_snapshots 5,000 2 KB ~300 MB
kalshi_markets 50 500 B ~750 KB
polymarket_markets 100 500 B ~1.5 MB
weather_observations 1,500 1 KB ~45 MB
vix_timeseries 400 500 B ~6 MB
crypto_funding_rates 200 300 B ~1.8 MB
crypto_liquidations 288 800 B ~7 MB
stablecoin_prices 4,320 400 B ~52 MB
cef_snapshots 50 600 B ~900 KB
index_rebalancing 2 800 B ~50 KB
earnings_iv_history 20 800 B ~480 KB
lst_prices 864 600 B ~16 MB
sports_live_states 2,000 1 KB ~60 MB
gap_signals 50 1 KB ~1.5 MB
trades 10 1 KB ~300 KB
certainty_scores 200 1 KB ~6 MB
collection_log 500 200 B ~3 MB

Total estimated monthly storage: ~500 MB

With WAL mode and standard SQLite compression, real disk usage will be roughly 40-60% of that. Recommend a daily vacuum on the DB and archiving market_snapshots older than 90 days to a compressed SQLite file.

Archive strategy:


All Strategies — Complete Catalog

Tier 1: Near-Physics Level Certain

1. Options Theta Decay (0DTE)

2. Weather Daily High Lock-In (Existing Strategy)

3. Weather Overnight Low Lock-In

4. Stablecoin Depeg Buying

Tier 2: Structural / Math Certain

5. Crypto Funding Rate Harvest

6. VIX Mean Reversion

7. Crypto Futures Cash-and-Carry

8. Earnings IV Crush

9. Kalshi Economic Data Leading Indicators

Tier 3: Clever / Niche

10. Liquidation Cascade Recovery (Crypto)

11. Index Rebalancing Front-Running

12. Closed-End Fund Tender Offers

13. Bureaucratic Impossibility (Prediction Markets)

14. Callable Bond Convergence

15. Forex Carry Trade with Certainty Filter

16. Sports Garbage Time

17. Tennis Tiebreak Rules

18. ETF NAV Discount / Premium

19. Liquid Staking Token Depegs

20. Election Night Mail-In Ballot Lag

21. Corporate Bond Callable Near Call Date


Prediction Market Execution — Kalshi and Polymarket

Kalshi Ladder Structure

Kalshi uses a CLOB (Central Limit Order Book) with ladder pricing in cents (0-100). Key execution details:

Fee-adjusted certainty gap threshold for Kalshi:

Polymarket Ladder Structure

Polymarket uses a CLOB denominated in USDC (0.00-1.00):

Fee-adjusted certainty gap threshold for Polymarket:

Gap Signal Routing Logic

if market_type IN ('kalshi', 'polymarket'):
    min_threshold = 0.05  # 5% net edge minimum for prediction markets
    max_position_usd = 2500  # per market cap
    holding_period = until_resolution or certainty_expires

if market_type IN ('options', 'vix'):
    min_threshold = 0.03  # lower threshold acceptable with high liquidity
    max_position_usd = 10000
    holding_period = 0DTE to 90DTE depending on strategy

if market_type IN ('crypto_funding', 'stablecoin', 'lst'):
    min_threshold = 0.02
    max_position_usd = 5000
    holding_period = hours to weeks

Portfolio Allocation

Recommended baseline allocation for $200K portfolio:

Strategy Allocation Expected Annual Return
0DTE Theta Decay $40K (20%) 30-60%
Crypto Funding Rate Harvest $40K (20%) 15-40%
VIX Mean Reversion $30K (15%) 40-80% when deployed
Earnings IV Crush $30K (15%) 25-50%
Stablecoin Depeg $20K (10%) 10-30% opportunistic
Liquidation Cascade Recovery $20K (10%) 40-80%
Cash Reserve (crisis opportunities) $20K (10%) 0% until deployed

Blended expected annual return: 25-50%

These strategies are largely uncorrelated:

The meta-strategy: at any given time, multiple strategies generate returns while others sit idle waiting for their certainty gap to appear.


Build Priority

Build in this order — each one teaches you something needed for the next:

  1. Crypto Funding Rate Harvest — simplest to implement, runs 24/7, steady returns, teaches exchange API integration and hedge management.
  2. Liquidation Cascade Recovery — straightforward signals (Coinglass API), high per-trade returns, teaches real-time WebSocket data handling.
  3. Stablecoin Depeg Monitor — set it and forget it. Teaches threshold alerting and tranche execution.
  4. 0DTE Theta Decay — highest frequency, most data to learn from quickly. Teaches options chain processing and IV calculations.
  5. Earnings IV Crush — seasonal, very backtestable. Teaches historical earnings data pipeline.
  6. VIX Mean Reversion — less frequent but high conviction when it triggers. Teaches term structure analysis.
  7. Weather Overnight Low Lock-In — copy of existing weather strategy, extend with sunrise logic.
  8. Kalshi Economic Leading Indicators — fast to add once weather bot pipeline is running.
  9. Index Rebalancing Front-Running [PANEL: Gemini 3.1 Pro] — needs NLP press release scraper. High value per trade.
  10. Closed-End Fund Tender Offers [PANEL: Gemini 3.1 Pro] — needs SEC EDGAR scraper. Medium frequency, high win rate.
  11. LST (Liquid Staking) Depegs [PANEL: Gemini 3.1 Pro] — needs Web3.py integration. Set and forget.
  12. Sports Garbage Time — needs live game state pipeline (already partially built for sports desk).

Key SQL Queries

Find Active Gap Signals Worth Trading

SELECT
    gs.id,
    gs.instrument_label,
    gs.market_type,
    gs.certainty_score,
    gs.market_implied_prob,
    gs.gap_size,
    gs.net_edge,
    gs.recommended_action,
    gs.position_size_kelly,
    gs.max_capital_usd,
    gs.holding_period_est,
    gs.detected_at,
    pl.pattern_name,
    pl.category
FROM gap_signals gs
JOIN pattern_library pl ON gs.pattern_id = pl.id
WHERE gs.signal_status = 'open'
  AND gs.net_edge >= 0.03
  AND gs.resolved_at IS NULL
  AND datetime(gs.detected_at) > datetime('now', '-4 hours')
ORDER BY gs.net_edge DESC;

Pattern Win Rate Summary

SELECT
    pl.pattern_name,
    pl.category,
    pp.total_trades,
    pp.wins,
    pp.losses,
    ROUND(pp.win_rate * 100, 1) AS win_rate_pct,
    ROUND(pp.total_net_pnl, 2) AS total_net_pnl,
    ROUND(pp.avg_return_pct * 100, 2) AS avg_return_pct,
    ROUND(pp.avg_certainty_at_entry * 100, 1) AS avg_certainty_pct,
    ROUND(pp.avg_gap_at_entry * 100, 1) AS avg_gap_pct,
    pp.last_trade_at
FROM pattern_performance pp
JOIN pattern_library pl ON pp.pattern_id = pl.id
WHERE pp.total_trades >= 5
ORDER BY pp.total_net_pnl DESC;

Detect Active VIX Spike (Requires Backfill)

SELECT
    observed_at,
    vix_spot,
    vix_m1,
    vix_m2,
    contango_m1_m2,
    spike_flag,
    reversion_signal
FROM vix_timeseries
WHERE spike_flag = 1
ORDER BY observed_at DESC
LIMIT 20;

Stablecoin Depeg Alert Check

SELECT
    symbol,
    exchange,
    price_usd,
    peg_deviation,
    depeg_alert_level,
    observed_at
FROM stablecoin_prices
WHERE depeg_alert_level != 'none'
  AND datetime(observed_at) > datetime('now', '-30 minutes')
  AND is_algorithmic = 0
ORDER BY peg_deviation ASC;

Find Crypto Funding Rate Opportunities

SELECT
    exchange,
    symbol,
    funding_rate,
    funding_rate_annualized,
    mark_price,
    spot_price,
    basis,
    funding_time
FROM crypto_funding_rates
WHERE is_favorable = 1
  AND datetime(funding_time) > datetime('now', '-2 hours')
ORDER BY funding_rate_annualized DESC
LIMIT 20;

Sports Garbage Time Signal Check

SELECT
    game_id,
    sport,
    home_team,
    away_team,
    score_diff,
    time_remaining_sec,
    win_prob_true,
    win_prob_live_market,
    (win_prob_true - win_prob_live_market) AS gap,
    kalshi_market_id,
    polymarket_id,
    observed_at
FROM sports_live_states
WHERE garbage_time_flag = 1
  AND (win_prob_true - win_prob_live_market) > 0.03
  AND datetime(observed_at) > datetime('now', '-5 minutes')
ORDER BY gap DESC;

Liquidation Cascade Detection

SELECT
    symbol,
    observed_at,
    liq_usd_1h,
    liq_usd_4h,
    long_liq_usd_1h,
    short_liq_usd_1h,
    spot_price,
    funding_rate,
    cascade_signal
FROM crypto_liquidations
WHERE cascade_signal = 1
  AND datetime(observed_at) > datetime('now', '-2 hours')
ORDER BY liq_usd_1h DESC;

Daily Collection Health Check

SELECT
    collector_name,
    COUNT(*) AS runs_today,
    SUM(records_written) AS records_written,
    SUM(CASE WHEN success = 0 THEN 1 ELSE 0 END) AS errors,
    MAX(run_at) AS last_run,
    ROUND(AVG(duration_ms)) AS avg_duration_ms
FROM collection_log
WHERE date(run_at) = date('now')
GROUP BY collector_name
ORDER BY collector_name;

Kalshi Markets Matching Pattern Library

SELECT
    km.ticker,
    km.title,
    km.category,
    km.yes_price,
    km.no_price,
    km.volume_24h,
    km.close_time,
    km.pattern_match,
    pl.typical_certainty,
    pl.typical_gap_size,
    (pl.typical_certainty - km.yes_price / 100.0) AS estimated_gap
FROM kalshi_markets km
JOIN pattern_library pl ON km.pattern_match = pl.pattern_name
WHERE km.is_active = 1
  AND km.close_time > datetime('now')
  AND (pl.typical_certainty - km.yes_price / 100.0) > 0.05
ORDER BY estimated_gap DESC;

Trade P&L by Pattern (All Time)

SELECT
    pl.pattern_name,
    pl.category,
    COUNT(t.id) AS trades,
    SUM(t.net_pnl_usd) AS total_pnl,
    AVG(t.return_pct) AS avg_return_pct,
    MIN(t.net_pnl_usd) AS worst_trade,
    MAX(t.net_pnl_usd) AS best_trade,
    SUM(CASE WHEN t.outcome = 'win' THEN 1 ELSE 0 END) AS wins,
    SUM(CASE WHEN t.outcome = 'loss' THEN 1 ELSE 0 END) AS losses
FROM trades t
JOIN pattern_library pl ON t.pattern_id = pl.id
GROUP BY t.pattern_id
ORDER BY total_pnl DESC;

Implementation Checklist

Phase 0: Infrastructure (Do First)

Phase 1: Crypto Data Collection (Simplest Feeds First)

Phase 2: Prediction Market Feeds

Phase 3: Weather Collection

Phase 4: Financial Market Feeds

Phase 5: Equity and Niche Feeds [PANEL Sources]

Phase 6: Gap Detector and Signal Engine

Phase 7: Trade Tracking and Performance

Maintenance



Panel Addition: Sonar Reasoning Pro Review (2026-03-16)

Sonar Reasoning Pro reviewed this spec with web-search-augmented analysis. Unique additions below.

[PANEL: Sonar Reasoning Pro] Prediction Market Ladder Exploitation Module

Both Kalshi and Polymarket offer ladder contracts where the same event has multiple threshold contracts (e.g., "Will CPI MoM be 0-0.2%?" / "0.2-0.4%?" / "0.4-0.6%?" etc.). These are mutually exclusive and sum to 100%. The spec must explicitly address how to exploit mispricing across the ladder.

Approach:

  1. For each economic data release (CPI, NFP, GDP), build a distribution model using nowcast feeds (Atlanta Fed GDPNow, Bloomberg consensus, ADP data).
  2. Integrate the probability density function over each ladder rung to get true probability per contract.
  3. Compare model probability to market price for each rung.
  4. Execute when gap > threshold (fee + 1% buffer minimum).

Key insight: Tails (extreme outcomes) have the fattest edges because retail doesn't price tail distributions correctly. A certainty gap desk should hunt tail contracts aggressively. Contracts at the center of the distribution are crowded and efficient; move to the edges.

Ladder interaction rule: Since ladder contracts are mutually exclusive and sum to 100%, if you're bullish on a high outcome, you can simultaneously sell the low-end tail and buy the high-end tail for a synthetic directional position.

def exploit_ladder(event, model_distribution, market_prices):
    """
    event: "Will CPI print X-Y%?"
    model_distribution: probability density function from nowcasts
    market_prices: dict of {contract_id: market_price}
    """
    for contract_id, (floor, ceiling) in LADDER_BOUNDS.items():
        true_prob = integrate(model_distribution, floor, ceiling)
        market_prob = market_prices[contract_id]

        edge = true_prob - market_prob
        fee = 0.07 * 100 * market_prob * (1 - market_prob) / (market_prob * 100)

        if edge > fee + 0.01:  # Buy (underpriced)
            position_size = kelly_fraction(edge) * portfolio
            execute_buy(contract_id, position_size)
        elif edge < -(fee + 0.01):  # Sell (overpriced)
            execute_sell(contract_id, abs(position_size))

[PANEL: Sonar Reasoning Pro] Kalshi Fee Calculation — All Strategies

No strategy should execute without factoring in Kalshi's fee formula:

Kalshi taker fee: 0.07 × contracts × price × (1 − price), capped at $1.75 per 100 contracts. Kalshi maker fee: $0.44 per 100 contracts (flat).

Rule for all strategies: Skip any trade where edge < fee + 1% buffer.

Example: CPI tail contract at $0.08, model says $0.10.

Example 2: Mid-range contract at $0.50, model says $0.52.

[PANEL: Sonar Reasoning Pro] Kalshi Interest Income Optimization

Kalshi pays 3.75-4% APY on account balances, accruing daily. The desk should optimize for this:

[PANEL: Sonar Reasoning Pro] Liquidity Check Before Execution

def is_trade_feasible(venue, contract_id, position_size):
    order_book = fetch_order_book(venue, contract_id)
    best_ask = order_book["asks"][0]["price"]
    available_at_ask = sum([ask["size"] for ask in order_book["asks"][:3]])

    # Only trade if liquidity > 1.5x position size
    if available_at_ask < position_size * 1.5:
        return False, f"Insufficient liquidity: {available_at_ask} < {position_size * 1.5}"

    return True, "Trade feasible"

Small position sizes on low-liquidity tail contracts can cause massive slippage. A $5K buy on a $0.05 contract with $10K open interest could push the price to $0.08, destroying the edge.

[PANEL: Sonar Reasoning Pro] Cross-Venue Hedging Logic

If trading Kalshi sports, consider hedging with Polymarket (better political/crypto liquidity) or Pinnacle (sharp sports lines). No strategy should trade in isolation without checking whether a better price exists on another venue.

Venue Selection Rules:

Strategy Type Primary Venue Secondary Venue Decision Rule
Economic data (CPI, NFP) Kalshi Polymarket Kalshi if edge >2%, else monitor Poly
Sports (NFL, NBA, MLB) Kalshi Polymarket Kalshi: majority sports volume, use unless liquidity <$100K
Political (elections) Polymarket Kalshi Poly deeper in presidential; Kalshi better state-level
Crypto (BTC price bands) Polymarket Kalshi Both available; Poly has better funding/derivatives arbs
Weather Kalshi Kalshi only venue for weather markets

[PANEL: Sonar Reasoning Pro] NWS API Enforcement for Weather

Weather signals MUST use National Weather Service API only (grid-based forecast data). No scraping of forecast.weather.gov — use official NDFD API (NOAA/National Digital Forecast Database). Lock in positions when NWS official forecast diverges >2% from market price. Max lag: 15 minutes from NWS update to order placement.

[PANEL: Sonar Reasoning Pro] Resolution Dispute Risk

Kalshi and Polymarket resolve differently:

A trade could win on fundamentals but lose on how the market resolves. Add to risk assessment: don't trade if resolution criteria are ambiguous. Flag any contract where resolution rules reference subjective judgment.


End of Certainty Gap Desk — Final Merged Specification

Source: ~/edgeclaw/results/certainty-gap-desk/spec-final.md