FOREX-MACRO DESK: FINAL PANEL RULING

Date: 2026-03-24
Panel: Opus 4.6 + Sonnet 4.6 + Grok 4.2 Reasoning + Gemini Pro 3.1
Judge: Opus 4.6
Question: Additional data sources + advanced calculated metrics for Forex-Macro desk edge over retail Kalshi bettors

Panel Quality: A-

All four panelists delivered substantive, well-structured responses with concrete data sources, formulas, and prioritization. No hallucinated APIs, no vaporware. Overlap was high on core themes (carry models, regime detection, liquidity indices, COT positioning). Because each panelist arrived via a different reasoning path, the overlap signals genuine consensus rather than groupthink.


CONSENSUS POINTS (3+ panelists agree)

  1. Carry Model is foundational — All four propose carry_score = rate_diff / realized_vol
  2. Regime Detection is essential — All four propose classifying risk-on/off/etc using VIX, yield curve, credit spreads, MOVE
  3. COT Positioning as contrarian signal — All four recommend z-score normalization for crowding detection
  4. Liquidity/Financial Conditions Index — All four propose composite from Fed BS, credit spreads, volatility
  5. Economic Surprise Index — Three of four recommend custom Citi-style ESI from actual vs consensus
  6. FRED as primary data backbone — Unanimous
  7. Real yield differentials matter — Three of four call out TIPS/breakevens as key inputs

DISSENT RULINGS

| Dissent | Ruling | Reason |
|---|---|---|
| How many regimes? | 4 regimes (Opus): Risk-On, Risk-Off, Stagflation, Transition | Stagflation is distinct for forex; Transition prevents false signals |
| Custom Dollar Index? | Yes, but Phase 2: simple trade-weighted from OANDA | Useful but not urgent; DXY proxy works for now |
| Fed speech NLP? | Deferred | High effort, moderate reward. Just track dates as vol catalysts |
| Foreign central bank APIs? | FRED only; FRED mirrors the key foreign yields | ECB SDW, BoJ, BoE have painful non-standard APIs |
| Google Trends? | Skip | Noisy, rate-limited, poor forex signal. COT serves this better |
| Term Premium? | Skip | Requires replicating academic models. Yield curve shape captures same info simpler |
| Geopolitical Risk Index? | Phase 3 | Trivial to collect, marginal edge |
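
The four-regime ruling can be sketched as a small rule-based classifier. This is a minimal illustration only: the ruling fixes the regime names and the inputs (2s10s_z, vix_z, hy_spread_z, breakeven_direction) but not the decision rules, so every threshold below and the `gate` helper are hypothetical placeholders to be calibrated.

```python
from enum import Enum


class Regime(Enum):
    RISK_ON = "Risk-On"
    RISK_OFF = "Risk-Off"
    STAGFLATION = "Stagflation"
    TRANSITION = "Transition"


def classify_regime(curve_2s10s_z: float, vix_z: float,
                    hy_spread_z: float, breakeven_direction: float) -> Regime:
    """Map the four inputs to one regime.

    ASSUMPTION: all cutoffs (+/-1.0, 0.0) and the rule order are illustrative,
    not part of the ruling.
    """
    # Rising inflation expectations with an unusually flat/inverted curve.
    if breakeven_direction > 0 and curve_2s10s_z < -1.0:
        return Regime.STAGFLATION
    # Equity vol and credit spreads both elevated.
    if vix_z > 1.0 and hy_spread_z > 1.0:
        return Regime.RISK_OFF
    # Equity vol and credit spreads both below their averages.
    if vix_z < 0.0 and hy_spread_z < 0.0:
        return Regime.RISK_ON
    # Mixed readings: no clear regime.
    return Regime.TRANSITION


def gate(signal: float, regime: Regime) -> float:
    """ASSUMPTION: 'gating' is modeled as zeroing signals during Transition."""
    return 0.0 if regime is Regime.TRANSITION else signal
```

Mixed readings fall through to Transition, which is how the sketch reflects the "Transition prevents false signals" reasoning.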

FINAL BUILD LIST

PHASE 1: Core Foundation (Build First)

Data Sources

| # | Source | Frequency | Method |
|---|---|---|---|
| 1 | FRED bulk pull (15+ new series) | Daily | FRED API: TIPS yields (DFII5/10), breakevens (T5YIE, T10YIE, T5YIFR), foreign 10Y (6 countries), WALCL, RRPONTSYD, WTREGEN, BAMLH0A0HYM2, NFCI, ADSINDEX |
| 2 | CME FedWatch probabilities | Daily | Derive from Fed Funds futures or scrape |
| 3 | CFTC COT (forex futures) | Weekly | Bulk CSV download |
| 4 | Econoday consensus + actual | Daily | Scrape consensus for major releases |
| 5 | Atlanta Fed GDPNow | ~Weekly | FRED or scrape |
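
For source 2, deriving probabilities from Fed Funds futures could look roughly like the sketch below. It is a deliberate simplification, not CME's published FedWatch methodology: it assumes one contract month fully before the meeting and one fully after it, a single 25bp step, and no day-weighting within the meeting month. The function names are illustrative.

```python
def implied_rate(futures_price: float) -> float:
    """Fed Funds futures are quoted as 100 minus the expected average rate (in %)."""
    return 100.0 - futures_price


def move_probabilities(price_pre: float, price_post: float,
                       step: float = 0.25) -> dict[str, float]:
    """Split the priced-in rate change into hike / cut / hold probabilities.

    ASSUMPTION: price_pre is a contract month with no meeting effect,
    price_post is a month fully after the meeting, and only one move
    of size `step` (in percentage points) is possible.
    """
    expected_change = implied_rate(price_post) - implied_rate(price_pre)
    p_move = min(abs(expected_change) / step, 1.0)  # cap at certainty
    probs = {"hike": 0.0, "cut": 0.0, "hold": 1.0 - p_move}
    probs["hike" if expected_change > 0 else "cut"] = p_move
    return probs


# Example: 12.5bp of tightening priced in against a 25bp step.
probs = move_probabilities(price_pre=95.67, price_post=95.545)
```

The resulting probability is what feeds cme_fedwatch_prob in M1; a production version would need the day-weighted adjustment for contract months that contain the meeting.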

Metrics

| # | Metric | Formula | Priority |
|---|---|---|---|
| M1 | Rate Expectations Deviation | deviation = kalshi_prob - cme_fedwatch_prob | HIGHEST: closest to arbitrage |
| M2 | Carry-to-Volatility Score | carry_score = rate_diff / realized_vol_20d | Core FX signal |
| M3 | Net Liquidity Index | net_liq = WALCL - RRPONTSYD - WTREGEN; impulse = 30d change; z-score over 2Y | Explains risk asset direction |
| M4 | Economic Surprise Index | surprise_z = (actual - consensus) / historical_std; ESI = EWMA(surprise_z, span=90) | Maps to Kalshi macro contracts |
| M5 | Macro Regime Classifier | 4 regimes from: 2s10s_z, vix_z, hy_spread_z, breakeven_direction | Gates all other signals |
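
The M1 to M4 formulas translate almost directly into code. The sketch below uses only the standard library and makes several assumptions the ruling does not specify: realized vol is the annualized sample standard deviation of daily log returns, the EWMA uses alpha = 2 / (span + 1), and the three FRED balance-sheet inputs have already been converted to one common unit (FRED series do not necessarily share units, so check before subtracting).

```python
import math
import statistics


def rate_expectations_deviation(kalshi_prob: float, cme_fedwatch_prob: float) -> float:
    """M1: positive means Kalshi prices the outcome higher than futures imply."""
    return kalshi_prob - cme_fedwatch_prob


def realized_vol(closes: list[float], window: int = 20,
                 periods_per_year: int = 252) -> float:
    """ASSUMPTION: annualized sample stdev of the last `window` daily log returns."""
    returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    return statistics.stdev(returns[-window:]) * math.sqrt(periods_per_year)


def carry_score(rate_diff: float, realized_vol_20d: float) -> float:
    """M2: both inputs must share units (e.g. both annualized decimals)."""
    return rate_diff / realized_vol_20d


def net_liquidity(walcl: float, rrpontsyd: float, wtregen: float) -> float:
    """M3: ASSUMPTION: inputs are already in the same unit."""
    return walcl - rrpontsyd - wtregen


def impulse(series: list[float], lag: int = 30) -> float:
    """M3: change over `lag` observations (30d change on a daily series)."""
    return series[-1] - series[-1 - lag]


def surprise_z(actual: float, consensus: float, historical_std: float) -> float:
    """M4: standardized surprise for a single release."""
    return (actual - consensus) / historical_std


def ewma(values: list[float], span: int = 90) -> float:
    """M4: latest recursive EWMA value, ASSUMING alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    smoothed = values[0]
    for value in values[1:]:
        smoothed = alpha * value + (1.0 - alpha) * smoothed
    return smoothed
```

For example, `carry_score(0.03, 0.10)` scores a 3% rate differential against 10% annualized vol, and `ewma([surprise_z(a, c, s) for a, c, s in releases])` would give the ESI for a hypothetical list of `(actual, consensus, historical_std)` tuples.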

PHASE 2: Edge Enhancement

Data: Daily Treasury Statement, ISM sub-components, Initial/Continuing Claims, NFCI sub-indices

Metrics

| # | Metric | Formula |
|---|---|---|
| M6 | COT Positioning Signal | net_spec_pct z-scored 52w; contrarian at ±1.5 |
| M7 | Financial Conditions Index | mean(z(10Y), z(HY_spread), z(MOVE), z(USD)) |
| M8 | Yield Curve Shape | level/slope/curvature/butterfly, all z-scored 2Y |
| M9 | Real Yield Gap (G10) | (US_real_10Y) - (foreign_real_10Y) for Germany, Japan |
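
M6 and M7 both reduce to z-scores, so a rough sketch is short. Assumptions here: net_spec_pct arrives as a weekly list of speculative net position values (the ruling does not define how it is computed), the latest observation is included in its own 52-week window, and the signal labels are illustrative names.

```python
import statistics


def latest_zscore(history: list[float], window: int) -> float:
    """Z-score of the most recent value against the trailing `window` values.

    ASSUMPTION: the latest value is included in the window.
    """
    recent = history[-window:]
    return (recent[-1] - statistics.fmean(recent)) / statistics.stdev(recent)


def cot_signal(net_spec_pct: list[float], threshold: float = 1.5,
               window: int = 52) -> tuple[float, str]:
    """M6: contrarian read on speculative positioning at +/- threshold."""
    z = latest_zscore(net_spec_pct, window)
    if z >= threshold:
        return z, "crowded_long_fade"   # specs stretched long: lean short
    if z <= -threshold:
        return z, "crowded_short_fade"  # specs stretched short: lean long
    return z, "neutral"


def financial_conditions_index(component_z: dict[str, float]) -> float:
    """M7: equal-weighted mean of component z-scores (10Y, HY spread, MOVE, USD)."""
    return statistics.fmean(component_z.values())


# Example with made-up z-scores for the four M7 components.
fci = financial_conditions_index(
    {"10Y": 1.0, "HY_spread": -1.0, "MOVE": 0.5, "USD": -0.5}
)
```

Equal weighting follows the mean(...) formula in the table. Whether higher values mean tighter conditions depends on the sign convention of each component, which the ruling leaves open.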

PHASE 3: Nice-to-Have

Custom Dollar Index, GPR Index, BIS REER, correlation regime breaks, cross-asset momentum, Fed speech calendar flags

REJECTED

Fed speech NLP, Term Premium models, Google Trends, foreign CB direct APIs, TIC/COFER/AAII/BIS quarterly data


IMPLEMENTATION NOTES

  1. Generic FRED scraper with configurable series list → single fred_series table covers 70% of data needs
  2. SQLite is fine — ~30 series daily, hundreds of rows/day max
  3. COT data — weekly CFTC bulk CSV, parse "Traders in Financial Futures" report
  4. CME FedWatch — derive from Fed Funds futures (most robust) or scrape
  5. All metrics computed in daily batch cron (6AM ET), not real-time
  6. Forex prices stay at 4H — fundamentals desk, not scalping
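
Notes 1 and 2 can be sketched as a generic puller that writes every series into one SQLite table. The endpoint and the "." missing-value marker follow the public FRED API; the column layout of fred_series, the function names, and the default start date are assumptions, since the ruling only names the table.

```python
import json
import sqlite3
from urllib.parse import urlencode
from urllib.request import urlopen

FRED_URL = "https://api.stlouisfed.org/fred/series/observations"


def fetch_observations(series_id: str, api_key: str,
                       start: str = "2020-01-01") -> list[dict]:
    """Download raw observations for one series as a list of dicts."""
    query = urlencode({"series_id": series_id, "api_key": api_key,
                       "file_type": "json", "observation_start": start})
    with urlopen(f"{FRED_URL}?{query}", timeout=30) as resp:
        return json.load(resp)["observations"]


def parse_observations(series_id: str, observations: list[dict]) -> list[tuple]:
    """Convert FRED observations to (series_id, date, value) rows."""
    rows = []
    for obs in observations:
        if obs["value"] == ".":  # FRED marks missing values with "."
            continue
        rows.append((series_id, obs["date"], float(obs["value"])))
    return rows


def store_rows(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Upsert rows into fred_series. ASSUMPTION: this schema is illustrative."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fred_series ("
        "series_id TEXT NOT NULL, obs_date TEXT NOT NULL, value REAL NOT NULL, "
        "PRIMARY KEY (series_id, obs_date))"
    )
    conn.executemany("INSERT OR REPLACE INTO fred_series VALUES (?, ?, ?)", rows)
    conn.commit()


def run_daily_pull(db_path: str, api_key: str, series_ids: list[str]) -> None:
    """Batch entry point for the daily cron: one pass over the configured list."""
    conn = sqlite3.connect(db_path)
    try:
        for series_id in series_ids:
            raw = fetch_observations(series_id, api_key)
            store_rows(conn, parse_observations(series_id, raw))
    finally:
        conn.close()
```

The composite primary key makes reruns idempotent, so the 6AM batch can re-pull overlapping date ranges without duplicating rows. A call such as `run_daily_pull("macro.db", key, ["DFII10", "T10YIE", "WALCL"])` covers the configurable series list, where `macro.db` is a placeholder path.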

This ruling is final and binding.

Source: ~/edgeclaw/results/panel-results/forex-macro-final-ruling.md