Is there alpha in an AI trading signal?

November 13, 2025 | Product

Last week we introduced Pythia-v0.1.0, our finance-native AI model. It’s trained on CME Globex Level-II order-book (LOB) data and produces minute-by-minute intraday predictions of mid-price direction (Up / Stable / Down) for two U.S. futures: ES and NQ. For this benchmark, we evaluate a signal-only policy that takes a position every 5 minutes based solely on the prediction and then flattens all positions at “end of day” (EOD). 

Review the detailed benchmark report here

What we tested (PnL first)

  • Policy: Every 5 minutes from 09:45 to 15:30 ET (70 decision points) we go long, short, or flat, allocating 1/70 of the day’s starting portfolio per decision. All positions are exited between 15:55 and 16:00 ET. Costs: 1 bp round-turn (exchange/clearing fees excluded). Contract: front month, rolled 5 days before expiry (T–5).
  • Why this policy: It’s intentionally naïve—no overlays, no sizing logic—to expose what the signal contributes on its own before any strategy-specific tuning.
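
The policy above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the benchmark's actual harness: the function names are hypothetical, each 1/70 tranche is assumed to be held until the end-of-day flatten, the exit is collapsed to a single price (the post exits across 15:55 to 16:00 ET), and the 1 bp cost is assumed to apply to the notional of each non-flat tranche.

```python
from datetime import date, datetime, time, timedelta

N_DECISIONS = 70   # 09:45 to 15:30 ET inclusive, one decision every 5 minutes
COST_BP = 1.0      # round-turn cost in basis points, as stated in the policy


def decision_times(day):
    """Return the naive (assumed ET) decision timestamps for one trading day."""
    t = datetime.combine(day, time(9, 45))
    end = datetime.combine(day, time(15, 30))
    times = []
    while t <= end:
        times.append(t)
        t += timedelta(minutes=5)
    return times


def daily_policy_return(signals, entry_prices, exit_price, cost_bp=COST_BP):
    """Net daily return as a fraction of the day's starting portfolio.

    signals: +1 (long), -1 (short), or 0 (flat), one per decision time.
    entry_prices: price at each decision time, same length as signals.
    exit_price: single assumed price for the end-of-day flatten.
    """
    if len(signals) != len(entry_prices):
        raise ValueError("signals and entry_prices must have the same length")
    weight = 1.0 / N_DECISIONS
    cost = cost_bp / 10_000.0
    total = 0.0
    for s, p in zip(signals, entry_prices):
        if s == 0:
            continue  # flat: no position, no cost
        gross = s * (exit_price / p - 1.0)  # signed return of this tranche
        total += weight * (gross - cost)
    return total
```

A full day would pass 70 signals and 70 entry prices; shorter inputs are accepted so the function can be checked by hand on a few tranches.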
Figure 1: E‑mini NASDAQ-100 Futures: Cumulative Strategy P&L (Net, Long‑Only, Short‑Only) vs Price
Figure 2: E‑mini S&P 500 Futures: Cumulative Strategy P&L (Net, Long‑Only, Short‑Only) vs Price

Results at a glance:

  • Sharpe ratios overall and by quarter: 
    • ES: Overall Sharpe 1.17. The lowest quarter is Q1’25 with Sharpe ~1.12 and the strongest is Q3’25 with Sharpe ~3.75. 
    • NQ: Overall Sharpe 2.1. The lowest quarter is also Q1’25 with Sharpe ~1.47 and the strongest is Q3’25 with Sharpe ~3.30.
  • Regime sensitivity. Performance improves notably in Q2–Q3’25, consistent with a rising market; the signal still carries information in the weaker Q1’25 period. 
  • Data note. The models are trained exclusively on Level-II LOB, with no news inputs. They appear to pick up order-flow patterns that reflect major macro and policy events, even without ingesting text. For example:
    • Presidential election in November 2024.
    • Fed rate announcement in December 2024. 
    • Tariff announcements in March and April 2025. 
    • Volatility around “Liberation Day” in April 2025.
  • Asset-specific behavior. NQ tracks the broader swings in Q3’25, though some early-Q2 upside isn’t captured. ES captures direction in Q3’25 but doesn’t monetize the full magnitude of the move.
  • Long vs. short. Contributions vary by regime: longs dominate in uptrends, shorts add value during pullbacks; the balance shifts quarter to quarter.
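
For readers who want to compute comparable overall and per-quarter figures from their own daily returns, here is a small sketch. The post does not state how its Sharpe ratios are calculated, so the conventions here are assumptions: daily returns, a zero risk-free rate, sample standard deviation, and annualization by the square root of 252. The function names are hypothetical.

```python
import math
import statistics
from datetime import date


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    if len(daily_returns) < 2:
        raise ValueError("need at least two returns")
    mean = statistics.fmean(daily_returns)
    sd = statistics.stdev(daily_returns)  # sample standard deviation
    if sd == 0:
        raise ValueError("returns have zero variance")
    return mean / sd * math.sqrt(periods_per_year)


def sharpe_by_quarter(dated_returns):
    """Group (date, return) pairs by calendar quarter and compute each Sharpe.

    Keys follow the post's labeling, e.g. "Q1'25".
    """
    buckets = {}
    for d, r in dated_returns:
        key = f"Q{(d.month - 1) // 3 + 1}'{d.year % 100:02d}"
        buckets.setdefault(key, []).append(r)
    return {k: annualized_sharpe(v) for k, v in buckets.items()}
```

Note that the overall Sharpe is not an average of the quarterly values, because the mean and volatility are pooled across the whole period before the ratio is taken.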

How to read this

  • These are signal-only results with uniform entries and EOD exits. They demonstrate that the signal contains tradable information before any portfolio construction or execution optimization.
  • In practice, we expect teams to combine our signal with their own models and signals, using approaches such as majority-rule voting, weighted averaging, or weighted experts.
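
Two of the combination approaches mentioned above can be illustrated briefly. This is a sketch only: the function names, the tie-breaking rule (ties go flat), and the 0.33 threshold are assumptions chosen for the example, and signals are encoded as +1 (Up), 0 (Stable), and -1 (Down). A weighted-experts scheme, which updates the weights over time from realized performance, is not shown.

```python
from collections import Counter


def majority_vote(signals):
    """Return the most common signal; go flat (0) on a tie or empty input."""
    if not signals:
        return 0
    ranked = Counter(signals).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return 0  # tie for first place: take no position
    return ranked[0][0]


def weighted_average_signal(signals, weights, threshold=0.33):
    """Map a weighted average of signals back to +1, 0, or -1.

    threshold is a hypothetical cutoff: the average must exceed it in
    absolute value before a long or short position is taken.
    """
    if len(signals) != len(weights):
        raise ValueError("signals and weights must have the same length")
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(s * w for s, w in zip(signals, weights)) / total_w
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return 0
```

For example, `majority_vote([1, 1, -1])` goes long, while `weighted_average_signal([1, -1, 0], [3, 1, 1])` scores 0.4 and also goes long under the assumed threshold.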

Reproduce or extend

We provide all predictions generated by the model for the four-quarter period. They are available for download so you can run them through your own harness.