TA-Futures Desk — Data Collection Spec (MERGED FINAL)

This is the merged final spec for the TA-Futures Desk. Base: Claude Opus 4.6 (complete, unmodified). Panel additions cherry-picked from: Gemini 3.1 Pro, Grok 4.1, DeepSeek V3.1, GPT-4.1, Grok 4 Fast, Sonar Reasoning Pro. Each panel addition is tagged [PANEL: Model Name]. Purpose: data collection spec for a coding bot, feeding the Futures Desk analyst pipeline. Primary trading venue: futures broker (Interactive Brokers recommended as primary data source). Target infrastructure: SQLite, TypeScript/Node.js, 3 VPS servers, ~$115-145/month API budget.


Instruments to Monitor

Replace the 15 forex pairs with futures contracts. Start with:

Symbol Name Exchange Contract Months Session (ET)
ES E-mini S&P 500 CME H, M, U, Z (quarterly) Sun 6PM - Fri 5PM, halt 4:15-4:30
NQ E-mini Nasdaq 100 CME H, M, U, Z Sun 6PM - Fri 5PM, halt 4:15-4:30
CL Crude Oil NYMEX Every month Sun 6PM - Fri 5PM, halt 5-6PM daily
GC Gold COMEX G, J, M, Q, V, Z (even months) Sun 6PM - Fri 5PM, halt 5-6PM daily
ZB 30-Year Treasury Bond CBOT H, M, U, Z Sun 6PM - Fri 5PM, halt 4:15-4:30
NG Natural Gas NYMEX Every month Sun 6PM - Fri 5PM, halt 5-6PM daily
SI Silver COMEX H, K, N, U, Z Sun 6PM - Fri 5PM, halt 5-6PM daily
HG Copper COMEX H, K, N, U, Z Sun 6PM - Fri 5PM, halt 5-6PM daily
ZC Corn CBOT H, K, N, U, Z Sun 8PM - Fri 1:20PM, daytime 8:30AM-1:20PM
ZS Soybeans CBOT F, H, K, N, Q, U, X Sun 8PM - Fri 1:20PM, daytime 8:30AM-1:20PM

Config file lists symbols, and for each: exchange, contract roll cycle, RTH hours, ETH hours, tick size, point value, daily price limit (if applicable), and round number grid settings.

Phased Product Rollout [PANEL: Claude Opus 4.6 — A+ changes]

Do NOT attempt all 10 products from day 1. Follow this phased plan:

Phase 1 (weeks 1-4): ES and CL only. These are the two most liquid products and are maximally different from each other (equity index vs commodity). They exercise every code path: quarterly rolls (ES) vs monthly rolls (CL), CME exchange vs NYMEX, no daily limits (ES) vs daily limits (CL), equity correlations vs energy correlations, weak seasonality (ES) vs strong seasonality (CL).

Phase 2 (weeks 5-6): Add GC and ZB. GC adds precious metals and COMEX. ZB adds fixed income and 32nds pricing.

Phase 3 (weeks 7-8): Add NQ and NG. NQ validates ES pipeline generalization. NG validates CL pipeline generalization.

Phase 4 (weeks 9-12): Add SI, HG, ZC, ZS. Lower-liquidity products that may expose edge cases.

Acceptance criteria: Each phase is complete when all indicators produce values within 1% of a reference source (TradingView or broker platform) for 20 consecutive trading days.


Global Changes (Apply to ALL Sections)

1. Continuous Contracts and Roll Handling

Futures expire. You must build or consume continuous contract data.

What to store per symbol:

Roll rules:

Triple-storage approach [PANEL: Claude Opus 4.6 — A+ changes]: Store THREE continuous price series per symbol:

  1. raw_contract — actual traded prices per individual contract. No adjustment. Used for exact entry/exit levels and roll validation.
  2. adjusted_panama — Panama Canal (additive) adjustment. Recalculate only for the MOST RECENT 2 YEARS of data on each roll (not all history). Used for short-term indicator calculation (SuperTrend, ATR, BB, etc.).
  3. adjusted_ratio — ratio adjustment (multiply all historical prices by new_close / old_close). Applied to FULL history. Used for long-term seasonal analysis (15-year averages) and any analysis spanning more than 2 years.

Add to contract_rolls table: panama_gap (additive), ratio_factor (multiplicative), cumulative_panama_adjustment (running total — when this exceeds 20% of current price, log a warning that Panama series is degrading).

Add to config per symbol: preferred_adjustment_method defaulting to 'panama' for intraday/swing timeframes and 'ratio' for daily+ timeframes.

Enhanced roll validation [PANEL: Claude Opus 4.6 — A+ changes]: Use a composite roll signal for less-liquid products:

  1. Back month volume > front month for 2 consecutive sessions AND
  2. Back month open interest > front month open interest AND
  3. Within 15 trading days of first notice day

For agricultural products, also check if the front month is in delivery month. Hardcode known CME recommended roll dates as a fallback.

Post-roll, verify the adjusted continuous series has no jump greater than 1 tick at the roll boundary. If a jump is detected, the Panama adjustment was applied incorrectly — halt and alert.

Why this matters: Without roll adjustment, a $5 gap between ES contracts would corrupt ATR, SuperTrend, Bollinger Bands, moving averages, ZigZag — everything.

2. RTH vs ETH (Globex) Session Hierarchy

Futures trade nearly 24 hours but RTH is where real volume happens.

Session hierarchy [PANEL: Claude Opus 4.6 — A+ changes]: For each timeframe, designate one session type as 'canonical' (used for prediction signals and confluence scoring) and the other as 'supplementary' (stored for analyst reference but not fed into automated signal generation).

Timeframe Canonical Session Reason
5min RTH 75%+ of volume, signal quality
15min RTH Same
1H RTH Same
4H Full session RTH-only 4H candles are irregular (only 6.75 hours available)
Daily Full session + settlement close Institutional standard
Weekly Full session Standard
Monthly Full session Standard

For v1, ONLY calculate the canonical session type. Defer supplementary session calculations to v2. This cuts indicator compute in half while preserving the most analytically valuable view at each timeframe.

Add a session_canonical boolean column to the candles table. Downstream consumers (confluence scoring, cross-section flags, event detection) ONLY read candles where session_canonical = true.

A SuperTrend flip that happens at 2 AM on Globex with 500 contracts traded is not the same as a flip at 10:30 AM RTH with 50,000 contracts traded.

3. Settlement Price vs Last Trade

Futures have an official daily settlement price calculated by the exchange at 4:15 PM ET (for CME products). This is NOT necessarily the last trade.

Store per symbol, per day:

Use settlement price for: Daily candle close, daily pivot point calculations, daily range calculations. Use last trade for: Intraday indicators, real-time analysis.

Settlement sourcing and fallback [PANEL: Claude Opus 4.6 — A+ changes]:

Weekly settlement cross-validation [PANEL: Claude Opus 4.6 — A+ changes]: Compare stored settlement prices against CME's published settlement file once per week. If any close differs by more than 2 ticks, the data source is wrong — flag and use CME's value.

4. Daily Price Limits

Store per symbol (from config):

Products with limits: CL, NG, ZC, ZS, SI, HG. ES, NQ, ZB have circuit breakers.

Enhanced limit event handling [PANEL: Claude Opus 4.6 — A+ changes]: When limit is hit: mark all indicators from that candle as unreliable. In backtesting, exclude limit-hit candles from validation. Add a limit_tail_risk flag for the next session (elevated gap risk). Track next_session_gap_direction and next_session_1h_return after limit events.

5. Round Number Grids (Per Product)

Symbol Tick Size Minor Grid Medium Grid Major Grid
ES 0.25 25 pts 50 pts 100 pts
NQ 0.25 100 pts 250 pts 500 pts
CL 0.01 $0.50 $1.00 $5.00
GC 0.10 $25 $50 $100
ZB 1/32 1 point (32nds-aware) 2 points 5 points
NG 0.001 $0.25 $0.50 $1.00
SI 0.005 $1.00 $2.50 $5.00
HG 0.0005 $0.10 $0.25 $0.50
ZC 0.25 25 cents 50 cents $1.00
ZS 0.25 25 cents 50 cents $1.00

ZB note: Prices are quoted in points and 32nds (e.g., 118'16 = 118 and 16/32nds). The round number grid must use 32nds-aware format: Minor = full point (e.g., 118'00), Medium = half point (e.g., 118'16), Major = 5 points. CL adds a $0.50 sub-minor level.

The S&R engine (Section 17) rounding logic: instead of rounding to .0025, round outward to the nearest Minor Grid level for that product.

6. "Pips" Do Not Exist

Replace ALL references to "pips" with "ticks" or "points." Every field that stores a distance in pips must store in ticks instead, with a config lookup for tick value in dollars.

7. Data Source Architecture [PANEL: Claude Opus 4.6 — A+ changes]

PRIMARY source: Interactive Brokers API (free with funded account)

IB API endpoints: reqHistoricalData for candles, reqMktDepth for Level 2, reqFundamentalData for contract details.

IB rate limits: 60 historical data requests per 10 minutes, 3 simultaneous market depth subscriptions on basic plan (10 with CME Level 2 add-on).

IB fallback: if IB connection drops, queue missed candle requests and backfill on reconnection. For real-time depth, simply skip — depth is ephemeral and backfilling is meaningless.

SECONDARY sources (all free):

TERTIARY source (v2 only): Databento at ~$50/month for tick-level data — enables delta/cumulative delta and accurate Volume Profile. Do NOT use for v1.

Data source priority queue [PANEL: Sonar Reasoning Pro]: Build with source abstraction: SwitchableDataProvider interface, not hardcoded API calls. Priority queue:

  1. IB API (preferred)
  2. Yahoo Finance (fallback, delayed)
  3. Barchart.com free tier (fallback)
  4. Last-resort: use 24h-old data until source recovers

Total data cost: $10-20/month for v1 (IB market data bundles). $60-70/month for v2 with Databento.

Replace OANDA with:

Data Type Forex Source Futures Source
Price candles (OHLCV) OANDA v20 REST IB TWS API (primary), Yahoo Finance (fallback)
Volume OANDA tick volume (proxy) REAL exchange volume — major upgrade
Order book OANDA order/position book CME Market Depth (Level 2) via IB
Settlement prices N/A IB daily bars (primary), CME files (validation)
COT data CFTC (same) CFTC disaggregated (same URL, different format)

Critical difference: real volume. Forex uses OANDA tick volume as a proxy. Futures have REAL exchange volume. All volume-based indicators (OBV, CMF, Force Index, Volume Profile) are genuinely accurate.

8. Open Interest

Store per symbol, daily:

Important: OI is a daily-resolution signal only. CME publishes preliminary OI around 8:30 PM ET, final OI next morning. DO NOT use OI for intraday decisions. Do not include OI in confluence scoring for intraday timeframes.

Store per symbol, weekly (from COT):

Micro vs standard volume ratio [PANEL: Claude Opus 4.6 — missing_data]: Where available, track volume for micro contracts (MES, MNQ, MCL, MGC) alongside standard contracts.

9. Term Structure (Contango / Backwardation)

Store per symbol, daily (commodity futures only: CL, NG, GC, SI, HG, ZC, ZS):

Simplified for equity index and fixed income (ES, NQ, ZB): Only store front_back_spread as a roll-timing signal, not as an analytical signal. Term structure is dominated by interest rates and dividends for these products, not supply/demand.

Actual calendar spread prices [PANEL: Claude Opus 4.6 — missing_data, NICE-TO-HAVE]: Where the data provider makes it available, track the actual traded spread price (not just the calculated front-back difference). Calendar spreads trade as their own instruments with their own bid/ask and volume. Spread volume indicates whether term structure moves are driven by real positioning or just one leg moving.

10. Delta / Cumulative Delta (Bid-Ask Volume Analysis)

v1 approximation [PANEL: Claude Opus 4.6 — A+ changes]: Defer true delta (requires tick data) to v2. In v1, for each candle calculate:

This captures ~60-70% of the signal that true delta provides, sufficient for daily/4H timeframes.

v2 (when Databento tick data available):


Section-by-Section Adaptation

SECTION 1: CORE STRATEGY — SuperTrend + ATR

Minor tweak.

SECTION 2: TREND LINES — 3 Price Sets + 1 RSI Set

No change.

ZigZag (15/5), HTF validation, the 3 price sets, RSI trend lines, and event tracking all work identically on futures OHLCV data. Cosmetic change: replace "pips" with "ticks" in distance thresholds.

SECTION 3: EVENT TIMING CORRELATIONS

No change.

The correlation logic between SuperTrend flips, price TL breaks, and RSI TL breaks is purely about candle-counting. Works identically on any instrument.

SECTION 4: RSI DIVERGENCES

No change.

RSI divergence detection is mathematically identical across asset classes. All 4 divergence types and outcome tracking work as-is.

SECTION 5: OTHER TECHNICAL INDICATORS

Minor tweak.

Moving Averages, RSI, MACD, Bollinger Bands, Stochastic, ADX/DMI, Keltner Channels: No change. Math on OHLC data.

VWAP [PANEL: Claude Opus 4.6 — A+ changes]: Simplify to RTH-anchored VWAP only for v1:

Volume-Pressure Indicators (OBV, CMF, Force Index): No formula changes. Now calculated from REAL exchange volume. Flag in schema: volume_type = "exchange". Add volume quality flag at ingestion: if volume is estimated/synthetic, disable all volume-based indicators and log a warning.

Fibonacci Retracement: No change.

Pivot Points:

Round Numbers: See Global Change #5 for product-specific grids.

Pair Correlations — replace with futures inter-market correlations:

Known lead-lag patterns: VIX often leads ES by 1-2 candles. ZB often leads ES during macro events. CL leads energy stocks.

SECTION 6: CONFLUENCE EVENTS

Minor tweak.

All 4 event types (Momentum Squeeze, Trend Exhaustion, Breakout Setup, Mean Reversion Extreme) work as-is.

Add new event type: Delta Divergence Confluence:

Multi-Indicator Confluence Score: Replace -5 to +5 range with -6 to +6 by adding delta direction as a 6th signal (+1 if cumulative_close_position positive, -1 if negative).

SECTION 7: ORDER FLOW AND POSITIONING

Major rewrite.

Remove:

Replace with:

CME Market Depth (Level 2 — event-driven snapshots only) [PANEL: Claude Opus 4.6 — A+ changes]:

Do NOT poll depth continuously. Capture a depth snapshot ONLY when:

  1. Price enters within 1.5 × ATR of an active S&R zone with score ≥ 50
  2. A SuperTrend flip occurs
  3. A confluence event (Section 6) triggers
  4. First/last 30 minutes of RTH (opening/closing imbalance)

Outside these conditions, poll depth every 60 seconds as a background heartbeat.

Store depth snapshots in depth_snapshots table:

Expected volume: ~50-200 snapshots per symbol per day. This fits comfortably in SQLite.

Volume Profile — exact algorithm [PANEL: Claude Opus 4.6 — A+ changes]:

Calculate at RTH session close using 1-minute OHLCV bars:

  1. For each 1-minute bar, distribute its volume across the price range using a TRIANGULAR DISTRIBUTION:

  2. Aggregate volume into price buckets:

  3. POC = bucket with highest volume.

  4. Value Area: starting from POC, alternately add next-highest-volume bucket above and below until 70% of total session volume is included. VAH = highest price in value area, VAL = lowest.

  5. HVN (High Volume Node) = any bucket with volume > 1.5 standard deviations above mean bucket volume.

  6. LVN (Low Volume Node) = any bucket with volume < 0.5 standard deviations below mean.

  7. Store full volume-at-price histogram as JSON array [{price, volume}] for visualization.

  8. Add field vp_data_quality: 'approximated_1min' (v1) or 'tick_accurate' (v2).

v1 Market Profile fields (store at RTH close):

Defer to v2 (advanced Market Profile): tpo_distribution_shape, open_type classification, poor_high, poor_low, single_prints, developing_poc_migration, day_type classification. These require subjective algorithmic decisions (no standard algorithm for shape classification) and consume weeks of development for marginal v1 value.

COT Report (Commitment of Traders) — upgrade to disaggregated [PANEL: Claude Opus 4.6 — A+ changes, Gemini 3.1 Pro]:

Replace standard COT fields with disaggregated fields. Source: CFTC Disaggregated Futures Only report at https://www.cftc.gov/dea/futures/deacmelf.htm (free, same release schedule: Tuesday data, Friday 3:30 PM ET).

For physical commodity futures (CL, GC, NG, SI, HG, ZC, ZS):

Derived signals:

For equity index and bond futures (ES, NQ, ZB) — use Traders in Financial Futures (TFF) report instead:

Standard COT lumps swap dealers into "commercials," which destroys the signal. In CL, swap dealer positions can be 40% of total commercial OI. Disaggregated COT takes 1 day to add and provides a signal that no price-based indicator can replicate.

SECTION 8: CROSS-MARKET SIGNALS

Major rewrite.

Remove:

Replace with — inter-market correlations between our futures:

Equity-Bond Nexus:

Energy Complex:

Precious Metals:

Ag Complex:

Safe Haven Flows:

External data still needed:

VIX term structure detail [PANEL: Sonar Reasoning Pro — missing_data]: Store separately: vix_9d, vix_30d, vix_term_ratio = vix_9d / vix_30d. When ratio < 1 (9-day < 30-day), fear is concentrated short-term (buy-dip environment). When ratio ≥ 1 (inverted or flat), sustained fear. This is predictive for ES mean reversion timing and is available free from CBOE.

Seasonal Supply/Demand Data:

Each inventory/report event gets tracked like the economic calendar: actual vs consensus, surprise magnitude, 5min/30min/1h/4h price reaction.

Consensus estimate sources [PANEL: Claude Opus 4.6 — A+ changes]:

Store per report:

The surprise z-score is the key signal: a -2 z-score EIA crude build is strongly bearish regardless of the absolute number.

Weather data for commodity futures (v1 simple version) [PANEL: Grok 4.1, Grok 4 Fast, DeepSeek V3.1]: For CL, NG, ZC, ZS — collect from NOAA API (free, https://api.weather.gov):

Cross-reference: if growing_season = true AND weather_anomaly_score > 2, flag as 'high weather risk' for amplified price moves. NOAA NWS API is free and authoritative — use it over any commercial weather service.

Breaking news / geopolitical event tracking [PANEL: Grok 4.1 — NICE-TO-HAVE, GPT-4.1]: For CL, NG, GC: monitor RSS feeds from Reuters, Bloomberg (via scraping), and CME/ICE operational status pages for:

Store per event: event_timestamp, symbol_affected, headline_text, price_reaction_5min, price_reaction_1h. Free tier NewsAPI or RSS scraping is sufficient for v1. Do NOT use paid sentiment APIs unless backtest shows value.

CME SPAN margin requirement tracking [PANEL: Claude Opus 4.6 — A+ changes, Gemini 3.1 Pro, Grok 4.1]: Source: CME SPAN margin file (free) — parse the 'Performance Bond' CSV at https://www.cmegroup.com/clearing/margins/outright-vol-scans.html.

Store per symbol, daily:

Track 2-day and 5-day forward returns after margin hikes. Historically, margin hikes on commodities (CL, NG, SI) cause 2-5% selloffs within 48 hours as undercapitalized longs liquidate.

Add to Section 18 cross-section flags: margin_hike_at_resistance (boolean — margin hike occurred while price is within 1 ATR of an active S&R zone = amplified selling pressure).

SECTION 9: CURRENCY STRENGTH ISOLATION — REPLACED WITH RELATIVE STRENGTH

Replace with: Momentum Ranking and Pairwise Relative Strength [PANEL: Claude Opus 4.6 — A+ changes]

Remove the sector abstraction entirely. Replace with a simple momentum ranking table updated daily:

Per symbol: roc_1d, roc_5d, roc_20d, roc_60d (rate of change). Rank all 10 symbols by each ROC period: rank_1d, rank_5d, rank_20d, rank_60d (1 = strongest, 10 = weakest). Track rank_change_5d = current rank minus rank 5 days ago (positive = improving).

Pairwise ratio ROCs for key pairs:

Drop: sector_roc_5d, sector_roc_zscore, sector_leadership_rank, rotation_signal. A 'sector' with one instrument (HG, ZB) is meaningless.

SECTION 10: SESSION-SPECIFIC TRACKING

Major rewrite.

Remove:

Replace with:

Session definitions per product (store in config):

Product ETH Open RTH Open RTH Close ETH Close Daily Halt
ES, NQ, ZB Sun 6PM 9:30AM 4:15PM Fri 5PM 4:15-4:30PM
CL, GC, NG, SI, HG Sun 6PM 9:00AM 2:30PM Fri 5PM 5:00-6:00PM
ZC, ZS Sun 8PM 8:30AM 1:20PM Fri 1:20PM varies

Per session, per symbol, log:

Initial Balance:

Day type classification (simplified for v1):

Overnight inventory:

Options expiration and roll date calendar [PANEL: Gemini 3.1 Pro — A+ changes]:

Source: CME Group calendar (free scraping) and CBOE.

Triple/quad witching (3rd Friday of Mar/Jun/Sep/Dec) causes massive volume and volatility spikes. Price often pins to heavy gamma strikes.

Holiday/half-day calendar:

SECTION 11: VOLATILITY REGIME CLASSIFICATION

Minor tweak.

All calculations (Historical Volatility Percentile, Kaufman Efficiency Ratio, Fractal Dimension, Squeeze Detection, Adaptive Parameters) work identically on futures OHLCV data.

Changes:

SECTION 12: ECONOMIC CALENDAR AND NEWS PROXIMITY

Minor tweak.

The structure (pull calendar, track events, log post-release reactions) works identically.

Add futures-relevant events:

Add inventory_report_day: boolean — is this an EIA/USDA/Baker Hughes report day?

Remove forex-specific flags: ecb_week, boj_week. Keep NFP Friday and FOMC week (both move futures).

Post-release tracking (actual vs consensus, 5min/30min/1h/4h reaction, retrace_pct) works as-is.

SECTION 13: SEASONAL AND TIME-BASED PATTERNS

Major rewrite.

Keep:

Remove:

Add — Commodity Seasonality:

Store per symbol (from historical data, build from 5-15 years depending on data available):

Known seasonal patterns (hardcoded as flags for v1):

Hardcoded agricultural growing season windows [PANEL: DeepSeek V3.1 — A+ changes]: Encode USDA phases explicitly in config per commodity:

Add: usda_growth_phase: categorical — 'pre_planting', 'planting', 'vegetative', 'pollination', 'grain_fill', 'dough', 'harvest', 'off_season'. Source from USDA's published calendar phases. Analysts weight weather data heavily when phase is 'pollination' or 'grain_fill' for ZC, 'pod_fill' for ZS.

Growing season / weather sensitivity:

Delivery month effects:

Seasonal baseline regime note [PANEL: Sonar Reasoning Pro — flaws]: When computing seasonal baselines from 10+ year histories, weight recent years 2x more than older years, and recalculate quarterly. Seasonal patterns shift with economic regime (NG seasonality changed post-fracking; ZC changed post-renewable energy push). Store seasonal_baseline_years = number of years used in baseline so analysts can assess data currency.

SECTION 14: DERIVED METRICS

Minor tweak.

All derived metrics (forward returns, setup win rates, price path efficiency, intermarket correlation shifts) work identically.

Changes:

SECTION 15: DATA STORAGE AND ARCHITECTURE

Minor tweak plus additions.

Rename:

SQLite CREATE statements:

-- Main candle store
CREATE TABLE futures_candles (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT NOT NULL,
    timeframe TEXT NOT NULL,             -- '1m','5m','15m','1h','4h','1d','1w','1mo'
    session_canonical INTEGER NOT NULL,  -- 1 = use for signals, 0 = supplementary
    session_type TEXT NOT NULL,          -- 'RTH' or 'Full'
    open_time INTEGER NOT NULL,          -- Unix timestamp
    open REAL NOT NULL,
    high REAL NOT NULL,
    low REAL NOT NULL,
    close REAL NOT NULL,
    volume INTEGER NOT NULL,             -- Real exchange volume (not tick count)
    volume_type TEXT DEFAULT 'exchange', -- 'exchange' or 'estimated'
    open_interest INTEGER,              -- Daily snapshot (null for intraday)
    settlement_price REAL,              -- Daily only, official exchange settlement
    settlement_missing INTEGER DEFAULT 0, -- 1 if settlement not available
    close_position REAL,                -- (close-low)/(high-low), v1 delta proxy
    cumulative_close_position REAL,     -- Running sum from session open
    delta_data_quality TEXT DEFAULT 'candle_approximation',
    is_roll_day INTEGER DEFAULT 0,       -- 1 = exclude from event detection
    adjusted_panama REAL,               -- Panama-adjusted close
    adjusted_ratio REAL,                -- Ratio-adjusted close
    raw_close REAL,                     -- Unadjusted close
    limit_hit INTEGER DEFAULT 0,        -- 1 = price reached daily limit
    limit_type TEXT,                    -- 'up','down', or NULL
    indicators_reliable INTEGER DEFAULT 1, -- 0 if roll_day or limit_hit
    UNIQUE(symbol, timeframe, session_type, open_time)
);

CREATE INDEX idx_candles_symbol_tf_time ON futures_candles(symbol, timeframe, session_type, open_time);
CREATE INDEX idx_candles_canonical ON futures_candles(symbol, timeframe, session_canonical, open_time);

-- Market profile (per session close)
CREATE TABLE market_profile (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,              -- Unix timestamp of session close
    symbol TEXT NOT NULL,
    poc REAL NOT NULL,
    vah REAL NOT NULL,
    val REAL NOT NULL,
    ib_high REAL NOT NULL,
    ib_low REAL NOT NULL,
    ib_range_ticks REAL NOT NULL,
    ib_range_vs_avg_20d REAL,
    ib_extension TEXT,                  -- 'Up','Down','Both','Inside'
    day_type TEXT,                      -- 'Trend','Normal','Neutral','Double_Distribution'
    vp_histogram TEXT,                  -- JSON: [{price, volume}]
    vp_data_quality TEXT DEFAULT 'approximated_1min',
    -- v2 fields (NULL until tick data available)
    tpo_distribution_shape TEXT,
    open_type TEXT,
    poor_high INTEGER DEFAULT 0,
    poor_low INTEGER DEFAULT 0,
    single_prints TEXT,                 -- JSON array of price levels
    UNIQUE(symbol, date)
);

CREATE INDEX idx_mp_symbol_date ON market_profile(symbol, date);

-- Term structure (commodity products only)
CREATE TABLE term_structure (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    front_price REAL NOT NULL,
    second_price REAL,
    third_price REAL,
    front_back_spread REAL,
    term_structure_shape TEXT,           -- 'contango','backwardation','flat'
    contango_pct REAL,
    -- v2 fields
    roll_yield_annualized REAL,
    term_structure_shift_5d REAL,
    UNIQUE(symbol, date)
);

-- Contract roll tracking
CREATE TABLE contract_rolls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    old_contract TEXT NOT NULL,         -- e.g., ESH26
    new_contract TEXT NOT NULL,         -- e.g., ESM26
    roll_gap REAL NOT NULL,             -- additive Panama gap
    ratio_factor REAL NOT NULL,         -- multiplicative ratio factor
    cumulative_panama_adjustment REAL,
    roll_validated INTEGER DEFAULT 0,   -- 1 = post-roll continuity check passed
    UNIQUE(symbol, date)
);

-- Open interest (daily)
CREATE TABLE open_interest_daily (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    total_oi INTEGER NOT NULL,
    oi_change INTEGER,
    oi_change_pct REAL,
    oi_vs_volume_ratio REAL,
    oi_price_confirmation TEXT,         -- 'New Longs','Short Covering','New Shorts','Long Liquidation'
    oi_momentum_5d REAL,
    UNIQUE(symbol, date)
);

-- COT disaggregated
CREATE TABLE cot_disaggregated (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    report_date INTEGER NOT NULL,       -- Tuesday date of the data
    release_date INTEGER NOT NULL,      -- Friday release date
    symbol TEXT NOT NULL,
    report_type TEXT NOT NULL,          -- 'disaggregated' or 'tff'
    -- Disaggregated fields (physical commodities)
    oi_producer_merchant_long INTEGER,
    oi_producer_merchant_short INTEGER,
    oi_swap_dealer_long INTEGER,
    oi_swap_dealer_short INTEGER,
    oi_managed_money_long INTEGER,
    oi_managed_money_short INTEGER,
    oi_other_reportable_long INTEGER,
    oi_other_reportable_short INTEGER,
    oi_non_reportable_long INTEGER,
    oi_non_reportable_short INTEGER,
    -- Derived signals
    managed_money_net INTEGER,
    managed_money_net_pct_oi REAL,
    managed_money_net_zscore REAL,
    producer_merchant_net INTEGER,
    swap_dealer_net INTEGER,
    -- TFF fields (equity index / bond futures)
    asset_manager_net INTEGER,
    leveraged_funds_net INTEGER,
    dealer_net INTEGER,
    UNIQUE(symbol, report_date)
);

-- Seasonal baselines
CREATE TABLE seasonal_baselines (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    symbol TEXT NOT NULL,
    week_of_year INTEGER NOT NULL,
    avg_return_5yr REAL,
    avg_return_15yr REAL,
    historical_direction_pct REAL,      -- % of years price rose this week
    volatility_percentile REAL,
    seasonal_inflection INTEGER DEFAULT 0,
    seasonal_baseline_years INTEGER,
    calculated_date INTEGER,            -- when this baseline was last computed
    UNIQUE(symbol, week_of_year)
);

-- Inventory reports
CREATE TABLE inventory_reports (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    report_date INTEGER NOT NULL,
    report_type TEXT NOT NULL,          -- 'eia_crude','eia_natgas','usda_wasde','usda_crop_progress','baker_hughes'
    symbol_affected TEXT NOT NULL,
    actual_value REAL,
    consensus_value REAL,
    consensus_source TEXT,              -- 'investing.com','barchart','prior_value','rolling_avg'
    prior_value REAL,
    surprise REAL,                      -- actual - consensus
    surprise_pct REAL,
    surprise_zscore REAL,              -- vs 52-week history
    price_5min REAL,
    price_30min REAL,
    price_1h REAL,
    price_4h REAL,
    UNIQUE(symbol_affected, report_date, report_type)
);

-- Depth snapshots (event-driven, not continuous)
CREATE TABLE depth_snapshots (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    trigger_reason TEXT NOT NULL,       -- 'zone_approach','st_flip','confluence','rth_open','rth_close','heartbeat'
    best_bid REAL,
    best_ask REAL,
    spread_ticks REAL,
    bid_depth_5 INTEGER,
    ask_depth_5 INTEGER,
    depth_imbalance REAL,               -- (bid-ask)/(bid+ask), -1 to +1
    large_resting_bid INTEGER DEFAULT 0,
    large_resting_ask INTEGER DEFAULT 0
);

CREATE INDEX idx_depth_symbol_time ON depth_snapshots(symbol, timestamp);

-- Session tracking
CREATE TABLE session_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    eth_open_price REAL,
    rth_open_price REAL,
    rth_high REAL,
    rth_low REAL,
    rth_close REAL,
    eth_only_high REAL,
    eth_only_low REAL,
    overnight_range_ticks REAL,
    overnight_range_vs_avg REAL,
    gap_from_settlement REAL,
    gap_filled INTEGER DEFAULT 0,
    rth_range_ticks REAL,
    rth_range_vs_avg REAL,
    rth_volume INTEGER,
    eth_volume INTEGER,
    rth_volume_pct REAL,
    globex_open_price REAL,             -- simple reference level
    prior_settlement REAL,              -- simple reference level
    is_opex_day INTEGER DEFAULT 0,
    opex_type TEXT,
    is_vix_expiration INTEGER DEFAULT 0,
    exchange_holiday INTEGER DEFAULT 0,
    half_day INTEGER DEFAULT 0,
    UNIQUE(symbol, date)
);

-- CME margin tracking
CREATE TABLE margin_requirements (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    initial_margin REAL NOT NULL,
    maintenance_margin REAL NOT NULL,
    margin_change_pct REAL DEFAULT 0,
    margin_change_direction TEXT,       -- 'hike','cut', or NULL
    margin_event INTEGER DEFAULT 0,     -- 1 if change >10%
    UNIQUE(symbol, date)
);

-- Limit events
CREATE TABLE limit_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    event_type TEXT NOT NULL,           -- 'limit_up','limit_down','circuit_breaker_L1','circuit_breaker_L2','circuit_breaker_L3'
    trigger_price REAL NOT NULL,
    duration_minutes REAL,
    session_type TEXT,                  -- 'RTH','ETH'
    prior_day_settlement REAL,
    gap_from_settlement_at_trigger REAL,
    next_session_open REAL,
    next_session_gap_direction TEXT,    -- 'continuation','reversal','flat'
    next_session_1h_return REAL
);

-- Contract spec change log
CREATE TABLE contract_spec_changes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    detected_date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    spec_field TEXT NOT NULL,           -- 'tick_size','point_value','trading_hours','price_limit'
    old_value TEXT,
    new_value TEXT,
    cme_advisory_number TEXT,
    reviewed INTEGER DEFAULT 0,         -- 1 = human reviewed and config updated
    UNIQUE(symbol, detected_date, spec_field)
);

-- Data quality log
CREATE TABLE data_quality_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT,
    check_name TEXT NOT NULL,
    status TEXT NOT NULL,               -- 'pass','fail','warning'
    detail TEXT,
    timestamp INTEGER NOT NULL DEFAULT (strftime('%s','now'))
);

CREATE INDEX idx_dql_date ON data_quality_log(date, symbol, check_name);

-- Weather data (commodity futures)
CREATE TABLE weather_data (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,
    region TEXT NOT NULL,               -- 'midwest_corn_belt','gulf_coast','henry_hub'
    temp_anomaly_zscore REAL,
    precip_anomaly_zscore REAL,
    growing_degree_days REAL,
    growing_degree_days_deviation REAL,
    weather_anomaly_score REAL,
    critical_weather_event INTEGER DEFAULT 0,
    usda_growth_phase TEXT,
    UNIQUE(symbol, date, region)
);

-- Micro vs standard volume ratio
CREATE TABLE micro_volume (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date INTEGER NOT NULL,
    symbol TEXT NOT NULL,               -- standard contract (ES, CL, etc.)
    micro_symbol TEXT,                  -- micro contract (MES, MCL, etc.)
    standard_volume INTEGER,
    micro_volume INTEGER,
    micro_to_standard_ratio REAL,
    UNIQUE(symbol, date)
);

Sampling frequencies:

Data Type Cadence
OHLCV candles 1m, 5m, 15m, 1H, 4H, 1D, 1W, 1Mo
Volume Per candle (real exchange volume)
Open interest Daily (preliminary 8:30PM ET, final next morning)
Market Profile RTH session close
Term structure Daily from settlement prices
COT disaggregated Weekly (Tuesday data, Friday 3:30PM ET)
Inventory reports Per report (EIA Wed, USDA Thu, BH Fri)
SPAN margins Daily check
Depth snapshots Event-driven (~50-200/symbol/day)
Weather data Daily
Contract spec check Weekly automated

Storage estimates:

SECTION 15.5: DATA QUALITY AND VALIDATION [PANEL: Claude Opus 4.6 — A+ changes]

Run these automated checks daily. Store results in data_quality_log.

1. candle_continuity_check: For each symbol and timeframe, verify no gaps in candle sequence during trading hours. If gaps found, log and attempt backfill from data source.

2. roll_adjustment_check: After each roll, verify adjusted continuous series has no jump greater than 1 tick at the roll boundary. If jump detected, Panama adjustment was applied incorrectly — halt and alert.

3. indicator_sanity_check: Verify: ATR > 0, RSI between 0-100, SuperTrend value within 5 ATR of current price, BB upper > BB lower, VWAP within day's range. If any check fails, flag the candle and exclude from event detection.

4. cross_source_validation (weekly): Compare last 5 daily closes from IB against barchart.com free delayed data. If any close differs by more than 2 ticks, investigate.

5. settlement_validation: Compare stored settlement price against CME's published settlement file. If they differ, use CME's value and log discrepancy.

6. volume_validation: Verify daily volume is within 2 standard deviations of the 20-day average. Extreme outliers may indicate double-counted volume during roll.

7. contract_spec_check (weekly): Pull current contract specifications from CME product pages. Compare against stored config values: tick_size, point_value, trading_hours, price_limit_formula, position_limits, settlement_method. If ANY value has changed, set config_alert = true, halt indicator calculation for that symbol, and require human review. Log: old_value, new_value, detected_date.

8. volume_quality_flag: At ingestion time, if volume source is estimated/synthetic rather than exchange-reported, set volume_type = 'estimated' and disable all volume-based indicators for those candles.

SECTION 16: CLOSED-PERIOD STRUCTURE TABLE

Minor tweak.

All 12 subsections (16.1-16.12) work as-is on futures OHLCV data.

Changes:

Schema versioning [PANEL: GPT-4.1 — flaws]: Add schema_version table tracking all migrations. Build explicit forward/backward compatibility for schema changes. With so many per-product variations, schema drift will otherwise cause future pipeline changes to be painful.

SECTION 17: MULTI-METHOD S&R ZONE ENGINE

Minor tweak.

The engine is asset-class agnostic. All 10 detection methods, scoring, decay, clustering, timeframe hierarchy, cross-referencing, and validation work as-is.

Changes:

SECTION 18: CROSS-SECTION INTELLIGENCE FLAGS

Major rewrite.

Remove (forex-specific):

Replace with:

18.1 COT + Term Structure + SuperTrend Alignment

18.2 Open Interest Confirmation

18.3 Market Depth at S&R Zones

18.4 Seasonal + Fundamental Alignment

18.5 Volatility Regime + Economic Calendar Same as forex 18.5. Works identically.

18.6 Inter-Market Regime Signal

18.7 Volume Profile + S&R Zone Interaction

18.8 Margin Hike at Resistance [PANEL: Claude Opus 4.6 — A+ changes]

SECTION 19: FUTURE — INSTITUTIONAL-GRADE DATA

Minor tweak.

Tier 1: Options-Derived Intelligence (v2 priority — defer from v1): CME options on ES, CL, GC are among the most liquid options markets.

Options expiration tracking [PANEL: Gemini 3.1 Pro — A+ changes]: Even without options IV data, track OpEx dates (free from CME/CBOE calendar). Price often "pins" to heavy gamma strikes around expiration. Monthly and triple witching expirations are particularly impactful.

Tier 2: Depth of Market / Order Flow:

Tier 3 — REMOVE [PANEL: Claude Opus 4.6 — over_engineered]: Remove satellite imagery, electricity grid data, mine production data from this spec entirely. Replace with: "Alternative data sources (satellite imagery, IoT sensor networks, satellite-based physical commodity monitoring) are hedge-fund-grade tools costing $10K-100K+/month. They are explicitly out of scope for this system."

Tier 4: Advanced Analytics (free, compute-intensive):


Summary Table

Section Verdict Key Change
1: SuperTrend + ATR Minor tweak session_type, volume_at_flip, close_position proxy; pips→ticks
2: Trend Lines No change Pips→ticks cosmetic only
3: Event Timing No change Pips→ticks cosmetic only
4: RSI Divergences No change Works identically
5: Other Indicators Minor tweak RTH VWAP only (v1); settlement pivots; futures correlations
6: Confluence Events Minor tweak Add delta divergence; +1 score for delta
7: Order Flow Major rewrite CME depth (event-driven); Volume Profile (triangular dist algorithm); disaggregated COT
8: Cross-Market Major rewrite Inter-futures correlations; inventory reports with consensus; weather; margin tracking
9: Currency Strength Major rewrite Momentum ranking + pairwise ratio ROC
10: Sessions Major rewrite RTH/ETH/IB/overnight structure; OpEx calendar
11: Volatility Regime Minor tweak VIX integration for equities; overnight vol ratio
12: Economic Calendar Minor tweak Add inventory reports (EIA, USDA, OPEC, Baker Hughes)
13: Seasonal Patterns Major rewrite Commodity seasonality; USDA growth phases; weather data
14: Derived Metrics Minor tweak OI momentum; term structure momentum; seasonal deviation
15: Storage/Architecture Minor tweak + SQL Full CREATE statements; all new tables
15.5: Data Quality NEW section 8 automated validation checks; weekly contract spec check
16: Closed-Period Minor tweak Settlement as close; product-specific gap thresholds
17: S&R Engine Minor tweak Product-specific grids; Market Profile levels; settlement as method
18: Cross-Section Flags Major rewrite OI confirmation; depth at zones; seasonal alignment; margin events
19: Future Data Minor tweak Emphasize options/GEX; remove alternative data (out of scope)

New global systems not in forex spec:

  1. Continuous contract roll handling — triple storage (raw, Panama, ratio)
  2. RTH vs ETH canonical session hierarchy
  3. Settlement price integration + validation
  4. Daily price limits + limit event logging
  5. Open interest tracking (daily, daily-resolution only)
  6. Term structure (commodity futures only, simplified)
  7. Delta approximation v1 (close_position) + v2 path (tick data)
  8. Volume Profile — exact triangular distribution algorithm
  9. Market Profile v1 (POC/VAH/VAL/IB only) + v2 advanced features deferred
  10. Product-specific round number grids (with ZB 32nds fix)
  11. Commodity seasonality engine with USDA growth phases
  12. SPAN margin tracking + margin event signal
  13. Circuit breaker / limit event logging
  14. Contract spec change monitoring (automated weekly)
  15. Data quality validation (15.5)
  16. Weather data (NOAA API, commodity futures)
  17. OpEx / roll date calendar
  18. Disaggregated COT (replaces standard COT)

API Endpoints Reference

# Interactive Brokers TWS API
reqHistoricalData(symbol, duration, barSize, whatToShow='TRADES', useRTH=0)
reqMktDepth(symbol, numRows=5, isSmartDepth=False)
reqContractDetails(contract)

# CFTC Disaggregated COT
https://www.cftc.gov/dea/futures/deacmelf.htm  (physical commodities)
https://www.cftc.gov/dea/futures/deacmesf.htm  (financial/TFF report)

# CME SPAN margins (free)
https://www.cmegroup.com/clearing/margins/outright-vol-scans.html
# Download: Performance Bond CSV, updated daily

# CME contract specifications
https://www.cmegroup.com/markets/equities/sp/e-mini-sandp500.contractSpecs.html
# (similar pattern per product)

# CME holiday calendar
https://www.cmegroup.com/tools-information/holiday-calendar.html

# CBOE VIX
https://www.cboe.com/tradable_products/vix/  (free, real-time delayed)
https://cdn.cboe.com/api/global/us_indices/daily_prices/VIX_History.csv  (free historical)

# EIA inventories (economic calendar with consensus)
https://www.investing.com/economic-calendar/eia-crude-oil-inventories-75
https://www.investing.com/economic-calendar/natural-gas-storage-386

# USDA WASDE consensus
https://www.barchart.com/cmdty/reports/wasde  (free tier)

# NOAA weather API
https://api.weather.gov/points/{lat},{lon}
https://api.weather.gov/gridpoints/{office}/{X},{Y}/forecast

# EIA data API (free)
https://api.eia.gov/v2/petroleum/sum/sndw/data/  (crude inventory data)
https://api.eia.gov/v2/natural-gas/stor/sum/data/  (natgas storage)

# FRED (Fed funds, SOFR, economic data)
https://api.stlouisfed.org/fred/series/observations?series_id=FEDFUNDS&api_key=...

Collection Cadence Summary

Signal Cadence Source Cost
OHLCV candles (all timeframes) Continuous / on-close IB TWS API $10-20/mo (IB bundle)
Real exchange volume Per candle IB TWS API Included above
Settlement prices Daily 4:20PM ET IB daily bars Included
Open interest Daily 8:30PM ET IB / CME files Included
Term structure (front/second/third) Daily IB Included
RTH session metrics Daily at RTH close Derived from candles $0
Volume Profile (POC/VAH/VAL) Daily at RTH close Calculated from 1m bars $0
Market Profile IB metrics Daily at RTH close Derived from 1m bars $0
COT disaggregated Weekly Fri 3:30PM CFTC.gov $0
SPAN margins Daily CME.com $0
EIA crude inventory Weekly Wed 10:30AM investing.com + EIA API $0
EIA natgas storage Weekly Thu 10:30AM investing.com + EIA API $0
USDA WASDE Monthly (~12th) barchart.com $0
Baker Hughes rig count Weekly Fri 1PM barchart.com $0
NOAA weather data Daily api.weather.gov $0
VIX + VIX term structure Daily CBOE $0
DXY Daily IB or free source $0
Depth snapshots Event-driven IB Level 2 ~$10/mo add-on
Contract spec check Weekly CME product pages $0
OpEx calendar As needed CME/CBOE $0
Tick data (delta v2) Per tick, RTH only Databento $50/mo (v2 only)

Total v1 budget: $10-20/month (IB market data subscriptions) Total v2 budget: $60-70/month (add Databento for tick data)


Implementation Checklist

Phase 1: Foundation (Weeks 1-4, ES + CL only)

Phase 2: Core Futures-Specific Data (Weeks 5-8, add GC + ZB)

Phase 3: Calendar, Fundamental, Weather (Weeks 9-10, add NQ + NG)

Phase 4: Seasonality, Correlations, Remaining Products (Weeks 11-16)

v2 Additions (Post-launch, when data sources and budget allow)


Key SQL Queries for Analysts

-- Get latest confluence signals for all symbols
SELECT c.symbol, c.open_time, c.close, c.cumulative_close_position,
       mp.poc, mp.vah, mp.val,
       oi.oi_price_confirmation, oi.oi_momentum_5d,
       ts.term_structure_shape, ts.contango_pct
FROM futures_candles c
LEFT JOIN market_profile mp ON mp.symbol = c.symbol AND mp.date = date(c.open_time, 'unixepoch')
LEFT JOIN open_interest_daily oi ON oi.symbol = c.symbol AND oi.date = date(c.open_time, 'unixepoch')
LEFT JOIN term_structure ts ON ts.symbol = c.symbol AND ts.date = date(c.open_time, 'unixepoch')
WHERE c.timeframe = '1d' AND c.session_canonical = 1
ORDER BY c.open_time DESC;

-- COT managed money extremes (crowded positioning signal)
SELECT symbol, report_date, managed_money_net, managed_money_net_zscore,
       managed_money_net_pct_oi
FROM cot_disaggregated
WHERE ABS(managed_money_net_zscore) > 2.0
ORDER BY report_date DESC, ABS(managed_money_net_zscore) DESC;

-- Recent inventory report surprises above 2-sigma
SELECT report_date, report_type, symbol_affected,
       actual_value, consensus_value, surprise, surprise_zscore,
       price_1h
FROM inventory_reports
WHERE ABS(surprise_zscore) > 2.0
ORDER BY report_date DESC;

-- Margin events with price context
SELECT m.date, m.symbol, m.margin_change_pct, m.margin_change_direction,
       c.close AS close_on_event,
       s.gap_from_settlement
FROM margin_requirements m
JOIN futures_candles c ON c.symbol = m.symbol
    AND c.timeframe = '1d' AND c.session_canonical = 1
    AND date(c.open_time, 'unixepoch') = date(m.date, 'unixepoch')
JOIN session_data s ON s.symbol = m.symbol AND s.date = m.date
WHERE m.margin_event = 1
ORDER BY m.date DESC;

-- Active S&R zones near current price (requires zone engine output table)
-- Assumes zone_engine_output table exists with: symbol, zone_center, score, method
SELECT z.symbol, z.zone_center, z.score, z.method,
       c.close, ABS(c.close - z.zone_center) AS distance_pts
FROM zone_engine_output z
JOIN futures_candles c ON c.symbol = z.symbol
    AND c.timeframe = '1d' AND c.session_canonical = 1
    AND c.open_time = (SELECT MAX(open_time) FROM futures_candles WHERE symbol = z.symbol AND timeframe = '1d')
WHERE z.score >= 50
    AND ABS(c.close - z.zone_center) < (SELECT atr FROM futures_candles WHERE symbol = z.symbol AND timeframe = '1d' ORDER BY open_time DESC LIMIT 1) * 1.5
ORDER BY z.symbol, z.score DESC;

-- Seasonal deviation — price vs seasonal expectation
SELECT sb.symbol, sb.week_of_year,
       sb.avg_return_5yr AS seasonal_expected_return,
       c.close / LAG(c.close, 5) OVER (PARTITION BY c.symbol ORDER BY c.open_time) - 1 AS actual_return_5d,
       (c.close / LAG(c.close, 5) OVER (PARTITION BY c.symbol ORDER BY c.open_time) - 1) - sb.avg_return_5yr AS seasonal_deviation
FROM seasonal_baselines sb
JOIN futures_candles c ON c.symbol = sb.symbol AND c.timeframe = '1d' AND c.session_canonical = 1
WHERE CAST(strftime('%W', datetime(c.open_time, 'unixepoch')) AS INTEGER) = sb.week_of_year
ORDER BY c.open_time DESC, ABS((c.close / LAG(c.close, 5) OVER (PARTITION BY c.symbol ORDER BY c.open_time) - 1) - sb.avg_return_5yr) DESC;

-- Data quality failures in last 7 days
SELECT date, symbol, check_name, status, detail
FROM data_quality_log
WHERE status IN ('fail', 'warning')
    AND date > strftime('%s', 'now', '-7 days')
ORDER BY date DESC;

Cross-Desk Reuse

Component Reusable For
Economic calendar + event reaction tracker forex, stocks, options, crypto, prediction-markets
VIX and VIX term structure data stocks, options, crypto
S&R Zone Engine (multi-method, scoring, decay, clustering) forex, stocks, crypto
Volatility regime classification (HV percentile, Kaufman ER, fractal dim, squeeze) forex, stocks, crypto, options
COT parsing infrastructure forex (same CFTC source, different column mappings)
Continuous contract / roll handling concept prediction-markets (Kalshi/Polymarket recurring contracts)
Seasonal pattern engine stocks, sports-betting, prediction-markets
Inter-market correlation tracking + regime shift detection forex, stocks, crypto
SQLite schema patterns (time-series candles, event tables, reference tables) all desks
SwitchableDataProvider abstraction all desks
Backtesting framework (Sharpe, Brier, out-of-sample) sports, crypto, stocks, options, prediction-markets
Confluence scoring logic (multi-signal weighting) crypto, stocks, options, sports
Alert / notification system (when confluence fires) all desks
Settlement tracking (predicted vs actual, Brier scoring) all desks
Volume-based signal library (OBV, CMF, Force Index) stocks, crypto, options
NOAA weather ingestion weather prediction desk
Inventory report parsing (EIA PDF parsing) energy-focused prediction markets

Reality Check

v1 is realistic at $10-20/month. IB API provides OHLCV, real volume, settlement, and basic Level 2 for free with a funded account. All the high-value signals (SuperTrend, Volume Profile, disaggregated COT, seasonal patterns, inventory reports, SPAN margins) are available from free sources.

What will break first: Settlement price sourcing at 4:15 PM ET. CME timing varies. Build the retry logic and settlement_missing fallback on day 1, not as an afterthought.

Biggest operational risk: Roll automation. A missed or miscalculated roll propagates silently through every indicator for every subsequent candle. The post-roll jump validation check (≤1 tick at boundary) is not optional — it is the safety net.

What to skip in v1 (in priority order):

  1. Tick data / true delta (defer to v2, use close_position approximation)
  2. Full dual-track RTH + Full session indicators (use canonical hierarchy)
  3. Advanced Market Profile (TPO shapes, open type, single prints, poor highs/lows)
  4. Options IV / GEX (use VIX as proxy for ES/NQ, accept blind spot for CL/GC/ZC/ZS)
  5. Alternative data (satellite, Level 3, footprint — explicitly out of scope)

Start with ES and CL. They are maximally different, exercise every code path, and have the best free data availability. Do not add more products until the pipeline is validated and stable.


Build Status (Updated 2026-03-29)

Data Sources

Source Table Cadence Status
Polygon ETF proxies (OHLCV) ta_futures_ohlcv 15 min FLOWING
CFTC COT data cftc_cot (from Futures desk) Weekly FLOWING
EIA inventories eia_* tables (from Futures desk) Weekly FLOWING
Baker Hughes rig counts baker_hughes (from Futures desk) Weekly FLOWING
NOAA degree days noaa_degree_days (from Futures desk) Daily FLOWING

Panel-mandated source change: Yahoo Finance -> Polygon ETF proxies -- DONE.

Code

Proprietary Metrics

Metric Status Table
TSV (Term Structure Velocity) Computing ta_futures_metrics
SDZ (Seasonal Deviation Z-Score) Computing ta_futures_metrics
IMSG (Inter-Market Stress Gauge) Computing ta_futures_metrics
CCC (Commodity Curve Carry) Computing ta_futures_metrics
CCSS (Cross-Commodity Spreads) Computing ta_futures_metrics
Volume Profile (POC/VAH/VAL) Computing ta_futures_overlay
RTH/ETH Session Tagging Computing ta_futures_overlay
COT Positioning Overlay Computing (reader built) ta_futures_overlay
CPM (CFTC COT alignment to price) STUBBED -- needs live COT feed --
ISF (Inventory Surprise Factor) STUBBED -- needs inventory data --
PMSI (Physical Market Stress) STUBBED -- needs physical market data --

Instruments

SPY, QQQ, GLD, USO, TLT, SLV, UNG, DBA (ETF proxies) x 1h, 4h, 1d timeframes

What's NOT Built Yet

Foundation Models (shared sidecar, port 5050, systemd auto-restart)

# Model Params Type In Ensemble Latency
1 Chronos-T5-Small 46M General probabilistic Yes ~1s
2 Chronos-T5-Base 200M General probabilistic Yes ~3s
3 Chronos-2 120M General probabilistic (v2) Yes ~0.1s
4 Kronos-mini 4.1M Finance OHLCV Yes ~0.5s
5 Kronos-base 102M Finance OHLCV Yes ~2s
6 Moirai v1.1 ~300M Multivariate Yes ~1.7s
7 TimesFM 2.5 200M General (Google) Yes ~0.6s
8 Lag-Llama ~10M Probabilistic On-demand ~2s

Forecasts persisted to ta_model_forecasts with Kronos OHLCV + derived candle shape. Stubbed: TTM (Python 3.11), FinCast (no weights).

Shared Core (all desks, computed every 15 min per instrument)

Module Table What It Does
Dual SuperTrend (10/2+10/3) ta_supertrend_flips Flip events, direction, outcomes (2R-6R)
RSI crosses (8 pairs, 16 lines) ta_rsi_crosses Crossover events, dual stop tracking
Pullback monitor (5 methods) ta_pullback_results Pullback setups from recent flips
S&R Zone Engine (6 methods) ta_sr_zones + events Zones scored 0-100+, decay, lifecycle
Structure analysis ta_closed_structure Candle morphology, trend structure
Regime detection ta_regime Hurst, autocorrelation, ATR forecast
Liquidity sweeps ta_liquidity_sweeps Stop-hunt reversals
Fair value gaps ta_fvg Body gaps + fill tracking
Candle patterns (6 types) ta_candle_patterns Zone-gated pattern detection
RSI divergences (4 types) ta_divergences Z-score strength
Lead-lag ta_lead_lag Cross-instrument correlation
Confluence scoring ta_confluence -35 to +35 composite
Additional signals (5 types) ta_signals Composite entry signals
Order blocks ta_sr_zones Inside S&R engine

Pipeline Wiring

Source: ~/edgeclaw/results/spec-panel/ta-futures/spec-final.md