US Futures (ES and NQ): Performance Benchmark Q3 2025

November 12, 2025

1. Executive Summary

Quantum Signals provides pre-trained intraday AI signals for US equity index futures (ES, NQ). Our models predict short-horizon mid-price trends (up / stable / down) using finance-native neural networks trained on Level II limit order book (LOB) and microstructure features.  

Objective: Review the performance of pre-trained AI signals on ES and NQ futures and understand how the technology performs and generalizes.

Bigger picture: Pre-trained signals are a baseline that can be used for validation. We do not expect all our customers to trade using these signals. The goal is to help you design custom signals tailored to your symbols, horizons, entry cadence, and risk rules.

2. Review the Basics

2.1 Data

We use Level II LOB (10 levels, prices & volumes) from CME Globex with ~3 years’ worth of history for training, plus the real-time LOB feed for near-real-time predictions.

2.2 Predictions

The two pre-trained signals reviewed in this document predict the following:

  • Target variable: mid-price trend (Up / Down / Stable) between two averaging windows:
    • Start window: [now, now + 5 min] (average over the next 5 minutes)
    • End window: end of the day (average over 15:55–16:00 ET)
  • Stable: a change within a ±2 bps band around 0
  • Prediction cadence: every 1 minute on CME Trading Days for Equities, 09:30–16:00 ET
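
As an illustration of the target definition above, here is a minimal labeling sketch. It assumes the trend is measured as the relative change between the simple arithmetic means of the two windows, and that a change exactly on the ±2 bps boundary counts as Stable; neither detail is stated in this document. The function name `trend_label` and its arguments are hypothetical.

```python
from statistics import mean

STABLE_BAND_BPS = 2.0  # from the text: Stable is a ±2 bps band around 0


def trend_label(start_window_mids, end_window_mids, band_bps=STABLE_BAND_BPS):
    """Classify one prediction timestamp as 'Up', 'Down', or 'Stable'.

    start_window_mids: mid-prices observed in [now, now + 5 min]
    end_window_mids:   mid-prices observed in 15:55-16:00 ET
    """
    start_avg = mean(start_window_mids)
    end_avg = mean(end_window_mids)
    # Assumption: relative change between window averages, in basis points.
    change_bps = (end_avg / start_avg - 1.0) * 1e4
    if change_bps > band_bps:
        return "Up"
    if change_bps < -band_bps:
        return "Down"
    return "Stable"  # assumption: the band boundaries are treated as Stable
```

For example, a start-window average of 5000.0 and an end-window average of 5002.0 is a move of about 4 bps, which this sketch labels Up.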

2.3 AI Model

Current model used: Pythia-v0.1.0-Sep25

We introduced this model in a recent blog post. The “Sep25” suffix corresponds to the last month in the data set used to train, test, and validate the model.

We use a Temporal Fusion Transformer (TFT) architecture optimized for market microstructure. This is not an LLM; we use transformer components tailored for finance. Here are some of the key considerations the model takes into account:

  • multi-scale time features (LOB events, seconds, minutes, days)
  • probability calibration (post-processing of the output using class probabilities)
  • cost-sensitive loss function aligned to PnL

Technical paper on TFT: “Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting”, B. Lim et al. (available at https://arxiv.org/abs/1912.09363).

3. Benchmarks

3.1 Training, Validation, and Test periods

We retrain our models every 3 months and evaluate performance out-of-sample.

  • Training: The historical data used to fit the model, starting in Q1 2022 (see Table 1). Every 3 months we append another 3 months of data to the end of this data set.
  • Validation: A 3-month period (rolling forward every 3 months), not seen during training, that we use to pick the best performing model.
  • Test: A 3-month period (rolling forward every 3 months), not seen during training or validation, that we use to report performance.

Reported Performance for   Q4 2024   Q1 2025   Q2 2025   Q3 2025
Training                   Q1 2022   Q1 2022   Q1 2022   Q1 2022
Validation                 Q3 2024   Q4 2024   Q1 2025   Q2 2025
Out of Sample Test         Q4 2024   Q1 2025   Q2 2025   Q3 2025

Table 1: Training, Validation, and Test Periods.
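
The rolling schedule in Table 1 can be sketched as a small generator. This is only an illustration: the helper names are hypothetical, and the `train_end` field assumes the training set runs up to the quarter just before validation, which Table 1 does not state explicitly (it lists only Q1 2022 for training).

```python
def next_quarter(year, q):
    """Return the (year, quarter) following the given one."""
    return (year + 1, 1) if q == 4 else (year, q + 1)


def prev_quarter(year, q):
    """Return the (year, quarter) preceding the given one."""
    return (year - 1, 4) if q == 1 else (year, q - 1)


def fmt(yq):
    """Format (2025, 3) as 'Q3 2025', matching Table 1."""
    return f"Q{yq[1]} {yq[0]}"


def walk_forward(first_test, n_reports, train_start=(2022, 1)):
    """Build one row per reported quarter: training, validation, test."""
    rows = []
    test = first_test
    for _ in range(n_reports):
        validation = prev_quarter(*test)
        rows.append({
            "train_start": fmt(train_start),
            # Assumption: training ends the quarter before validation.
            "train_end": fmt(prev_quarter(*validation)),
            "validation": fmt(validation),
            "test": fmt(test),
        })
        test = next_quarter(*test)  # roll everything forward 3 months
    return rows


schedule = walk_forward(first_test=(2024, 4), n_reports=4)
for row in schedule:
    print(row)
```

The validation and test columns produced here reproduce the corresponding rows of Table 1.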

3.2 Trading Strategy Used for Benchmarking

Every 5 minutes (between 09:45 and 15:30 ET) we decide to take a long, short or no position using 1/70 of our starting portfolio for the day (there are 70 possible openings per day).  

  • Each long/short position is then split into 5 parts and executed on each minute for the next 5 minutes following the decision. There is no sizing adjustment.
  • We exit all positions at the end of the day, again split over five minutes (15:55–16:00 ET).
  • Costs: 1 bp round-turn assumption (ES: 1 bp ≈ 2 ticks = $25.00, NQ: 1 bp ≈ 8 ticks = $40.00).
  • Extra exchange/clearing fees are not included.
  • Contract series & roll: front-month continuous. We switch at the open five trading days before expiration (T–5): we stop trading the expiring contract and start trading the next.
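
The sizing and cost arithmetic above can be checked with a short sketch. The tick values are the standard E-mini contract specifications ($12.50 per tick for ES, $5.00 per tick for NQ). The mapping from predicted class to position direction is an assumption, since this document does not spell out the decision rule, and all names are hypothetical.

```python
from datetime import datetime, timedelta


def decision_times(start="09:45", end="15:30", step_min=5):
    """List the HH:MM decision times, both endpoints included."""
    t = datetime.strptime(start, "%H:%M")
    stop = datetime.strptime(end, "%H:%M")
    times = []
    while t <= stop:
        times.append(t.strftime("%H:%M"))
        t += timedelta(minutes=step_min)
    return times


# Assumption: Down (0) -> short, Stable (1) -> no position, Up (2) -> long.
POSITION_FOR_CLASS = {0: -1, 1: 0, 2: +1}

# Standard E-mini tick values in USD (one tick = 0.25 index points).
TICK_VALUE_USD = {"ES": 12.50, "NQ": 5.00}
# Round-turn cost in ticks, taken from the cost bullet above.
COST_TICKS = {"ES": 2, "NQ": 8}


def round_turn_cost_usd(symbol):
    """Assumed round-turn cost per contract in USD."""
    return COST_TICKS[symbol] * TICK_VALUE_USD[symbol]


times = decision_times()
slots_per_day = len(times)                   # possible openings per day
fraction_per_decision = 1 / slots_per_day    # share of starting portfolio
fraction_per_minute_slice = fraction_per_decision / 5  # 5 one-minute slices

print(slots_per_day, round_turn_cost_usd("ES"), round_turn_cost_usd("NQ"))
```

Counting both 09:45 and 15:30 gives the 70 openings per day quoted above, so each one-minute slice commits 1/350 of the day's starting portfolio.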

3.3 Performance Metrics

E-mini S&P 500 (ES):
Quarter   Sharpe   Calmar    Win %     Ann. Return %   Ann. Vol %   MDD %
25Q3      3.7526   11.7375   58.1402   12.0748         3.2178       -1.0287
25Q2      1.5868   7.1555    55.7388   31.2954         19.7219      -4.3736
25Q1      1.1222   3.1175    54.2403   8.2167          7.322        -2.6357
24Q4      1.2532   4.4738    51.3311   8.4703          6.7591       -1.8933
Overall   1.1759   2.6264    53.5831   11.4871         9.7686       -4.3737

Table 2: ES Performance Metrics by Quarter

Figure 1: E‑mini S&P 500 Futures: Cumulative Strategy P&L (Net, Long‑Only, Short‑Only) vs Price

E-mini Nasdaq-100 (NQ):
Quarter   Sharpe   Calmar    Win %     Ann. Return %   Ann. Vol %   MDD %
25Q3      3.3025   12.0777   56.326    11.8201         3.5792       -0.9787
25Q2      2.5332   12.6965   50.5501   48.6833         19.2178      -3.8344
25Q1      1.4694   4.0019    54.0082   14.28           9.7184       -3.5683
24Q4      1.8842   6.3836    53.5973   15.165          8.0483       -2.3756
Overall   2.1033   5.9438    53.7524   22.791          10.836       -3.8344

Table 3: NQ Performance Metrics by Quarter

Figure 2: E‑mini NASDAQ-100 Futures: Cumulative Strategy P&L (Net, Long‑Only, Short‑Only) vs Price

Notes:

  • Annualization: daily statistics are converted to annual terms assuming 252 trading days per year (volatility scales with √252).
  • Sharpe uses daily returns.
  • Ann. Vol: Annualized Volatility.
  • MDD: Max Drawdown.
  • Rebasing: curves rebased to 100 at period start.
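
For readers who want to recompute these metrics in their own harness, here is a minimal sketch. The figures in Tables 2 and 3 are consistent with Sharpe = Ann. Return / Ann. Vol and Calmar = Ann. Return / |MDD| (for ES 25Q3, 12.0748 / 3.2178 ≈ 3.75 and 12.0748 / 1.0287 ≈ 11.74). Beyond that, the exact formulas are assumptions: the sketch uses an arithmetic mean return scaled by 252, a zero risk-free rate, a compounded equity curve for drawdown, and a win rate measured per day, none of which this document confirms.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # matches the √252 annualization note


def performance_metrics(daily_returns):
    """Compute summary metrics from daily returns given as fractions."""
    # Assumption: arithmetic annualization of the mean daily return.
    ann_return = mean(daily_returns) * TRADING_DAYS
    ann_vol = stdev(daily_returns) * math.sqrt(TRADING_DAYS)
    sharpe = ann_return / ann_vol  # assumption: zero risk-free rate

    # Max drawdown on a compounded equity curve rebased to 100.
    equity, peak, mdd = 100.0, 100.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        mdd = min(mdd, equity / peak - 1.0)

    calmar = ann_return / abs(mdd) if mdd < 0 else float("inf")
    # Assumption: win rate counted over days, not individual trades.
    win_rate = sum(r > 0 for r in daily_returns) / len(daily_returns)
    return {
        "sharpe": sharpe,
        "calmar": calmar,
        "win_rate": win_rate,
        "ann_return": ann_return,
        "ann_vol": ann_vol,
        "mdd": mdd,
    }


# Made-up daily returns, for illustration only.
example = performance_metrics([0.01, -0.005, 0.002, 0.003, -0.001])
print(example)
```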

4. Historical Prediction Data

The predictions generated by the model are available in the following files for download to anyone interested in using them in their own test harness. There is one file per model.  

Here are the column names and meaning in those files:  

  • date_time: Timestamp of each observation (mostly 1-minute cadence) | (datetime string)
  • predictions: Model-predicted class (0 for Down, 1 for Stable, 2 for Up) | (int)
  • actual_labels: Realized (actual) class label for the outcome (0/1/2) | (int)
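
A test harness could start by comparing the two label columns. The sketch below assumes the files are comma-separated with a header row using the column names above; the file format, the timestamp format, and the sample rows are all made up for illustration.

```python
import csv
import io
from collections import Counter

CLASS_NAMES = {0: "Down", 1: "Stable", 2: "Up"}


def evaluate(rows):
    """Return (accuracy, confusion) where confusion[(actual, predicted)] is a count."""
    confusion = Counter()
    for row in rows:
        confusion[(int(row["actual_labels"]), int(row["predictions"]))] += 1
    total = sum(confusion.values())
    correct = sum(n for (actual, pred), n in confusion.items() if actual == pred)
    return correct / total, confusion


# Hypothetical sample data; real files would be opened with open(path, newline="").
SAMPLE = """date_time,predictions,actual_labels
2025-07-01 09:30:00,2,2
2025-07-01 09:31:00,1,1
2025-07-01 09:32:00,0,2
2025-07-01 09:33:00,2,2
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
accuracy, confusion = evaluate(rows)
print(f"accuracy: {accuracy:.2f}")
for (actual, pred), n in sorted(confusion.items()):
    print(f"actual={CLASS_NAMES[actual]:<6} predicted={CLASS_NAMES[pred]:<6} count={n}")
```

Because predictions are issued every minute against a shared end-of-day window, consecutive rows are strongly correlated, so accuracy computed this way is only a rough first check.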

5. Next steps

  1. Get on a short call to ask more questions and/or ask for a detailed table/CSV of past predictions for your test harness.  
  2. Review out-of-sample performance by consuming the predictions in real time using our API. This requires signing up for our “Professional” tier, which includes a 1-month free trial.
  3. Go beyond pre-trained signals by tailoring a signal to your specific needs. Customize by symbol, target variable, time horizon, neutral band, entry cadence and risk rules.

6. Contact

Yianni Gamvros
Co-founder & CEO
yianni[at]quantumsignals.ai

Disclaimers:  

  • Futures trading involves substantial risk of loss and is not suitable for all investors.  
  • Hypothetical/simulated results do not represent actual trading, may under- or over-state market impacts (e.g., liquidity), and reflect hindsight.  
  • No representation that any account will achieve similar results.  
  • Past performance is not necessarily indicative of future results.  
  • Quantum Signals is not an NFA member.