Pythia v0.1.0: Intraday AI Trading Signal for US Futures (ES and NQ)

A Finance-Native AI Model for Intraday Mid-Price Predictions

Markets evolve faster than most models can adapt. In a previous post, we surveyed the computational methods in use today and the gaps they leave.

Order-book dynamics shift, volatility regimes appear and vanish, and what looked predictive in January may be noise by March. After months of research and training, we’re introducing Pythia, our quarterly-adaptive, multi-asset AI model for intraday prediction of mid-price trends. Building it took a sustained investment of skills, patience, and budget that most teams cannot make, which is why most teams do not use advanced AI methods in trading.

It’s built to learn from the market’s structure without being trapped by its past.

1. The Problem: Microstructure Never Sits Still

Intraday alpha is fragile. Liquidity providers, execution algos, and venue behavior constantly reshape the flow that drives price discovery. A static model, however sophisticated, fades as soon as the market changes character. What we needed was a framework that keeps the memory of structural features while continuously adapting to new regimes.

That’s the idea behind Pythia.

2. The Model: A Shared Intelligence Across Assets

Pythia learns from the full limit-order book and message flow of the assets it is trained on, encoding how micro-events ripple through short-horizon price formation. Its temporal-attention backbone captures sequencing, reaction time, and cross-asset interactions.

The output is a forecast of intraday-horizon mid-price, expressed as a directional signal ready to be turned into tradable strategies. It is trained to remain stable under realistic costs and capacity constraints, making it robust for desks that operate intraday at scale.

3. The Learning Loop: Built to Evolve

Pythia doesn’t stay fixed.

Every quarter, it retrains on the most recent data, re-evaluates its performance, and promotes the stronger version forward — keeping structural memory while adapting to new flow and volatility conditions.

This lightweight refresh cycle ensures the model remains synchronized with current market regimes without losing the stability that comes from long-term learning.

4. The Pipeline: From Tick Data to Live Signal

Pre-processing.
We construct fixed time bars from full limit-order-book and message data. Each bar aggregates raw and derived features — depth, imbalance, spreads, order flow, and volatility proxies — totaling a few hundred inputs.
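
As an illustration of the bar-construction step, here is a minimal sketch that aggregates top-of-book snapshots into one bar's features. The `Snapshot` schema, the restriction to the top of the book, and the specific formulas are assumptions made for the example, not Pythia's actual feature set, and the sketch assumes a non-empty bar with non-zero resting size on at least one side.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Snapshot:
    """One top-of-book observation (hypothetical schema)."""
    bid: float
    ask: float
    bid_size: float
    ask_size: float


def bar_features(snaps):
    """Aggregate the snapshots that fall inside one fixed time bar."""
    mids = [(s.bid + s.ask) / 2 for s in snaps]
    spreads = [s.ask - s.bid for s in snaps]
    # Depth imbalance in [-1, 1]: positive when resting bid size dominates.
    imbalances = [
        (s.bid_size - s.ask_size) / (s.bid_size + s.ask_size) for s in snaps
    ]
    return {
        "mid_close": mids[-1],
        "spread_mean": mean(spreads),
        "imbalance_mean": mean(imbalances),
        # Volatility proxy: dispersion of mid-prices within the bar.
        "mid_std": pstdev(mids),
    }


snaps = [
    Snapshot(100.00, 100.25, 30, 10),
    Snapshot(100.25, 100.50, 20, 20),
]
features = bar_features(snaps)
```

A production pipeline would compute many more such aggregates per bar, across several book levels and the message stream, to reach the few hundred inputs mentioned above.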

Labels mark each bar as Up, Down, or Stable, based on the average mid-price over the next five minutes and over the closing window (15:55–16:00).
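
One way to picture the three-class labeling is the sketch below. It compares the average future mid-price with the current mid-price and applies a dead band for the Stable class. The comparison against the current mid and the `band` value are assumptions for illustration; the exact thresholds are not specified here.

```python
from statistics import mean


def label_bar(current_mid, future_mids, band=0.0002):
    """Return 'Up', 'Down', or 'Stable' for one bar.

    future_mids: mid-prices observed over the label horizon
                 (for example, the next five minutes).
    band: hypothetical relative dead band; moves inside it count as 'Stable'.
    """
    change = mean(future_mids) / current_mid - 1.0
    if change > band:
        return "Up"
    if change < -band:
        return "Down"
    return "Stable"


labels = [
    label_bar(5000.0, [5002.0, 5003.0, 5004.0]),
    label_bar(5000.0, [4998.0, 4997.0, 4996.0]),
    label_bar(5000.0, [5000.25, 4999.75, 5000.0]),
]
```

The same function could be applied with the closing-window mid-prices as `future_mids` to produce the second label.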

Contextual features from minutes, hours, and prior days are added to capture multi-scale memory and intraday patterns.

Aggregation & Normalization.
Multiple assets are processed within the same dataset. Features are aligned on a common time grid and normalized using rolling statistics to maintain point-in-time integrity and cross-session stability.
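
The point-in-time requirement can be sketched as a rolling z-score that only ever looks backward. The window length and the choice of a z-score are assumptions for the example; the key property is that the statistics never include the value being scored or anything after it.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_zscore(values, window=3):
    """Z-score each value against the `window` observations before it.

    Returns None until the window is full, or when the trailing window
    has zero variance.
    """
    history = deque(maxlen=window)
    out = []
    for v in values:
        if len(history) == window:
            mu, sigma = mean(history), pstdev(history)
            out.append((v - mu) / sigma if sigma > 0 else None)
        else:
            out.append(None)
        history.append(v)  # update after scoring to avoid look-ahead
    return out


z = rolling_zscore([1.0, 2.0, 3.0, 4.0, 10.0], window=3)
```

Normalizing with full-sample statistics instead would leak future information into every bar, which is exactly what a rolling scheme is meant to prevent.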

Architecture & Training.
The model is a Temporal Fusion Transformer adapted for order-book data. It uses variable lookback windows with attention across multiple temporal scales to represent both fast and slow market dynamics.

Training employs a cost-sensitive loss, directly optimizing returns net of an assumed 1 bp round-trip cost to ensure alignment with tradable outcomes.
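
To make the idea of a cost-sensitive objective concrete, here is a plain-Python sketch of a loss equal to the negative mean return after costs. The position convention, the way the cost is charged on position changes, and the function name are assumptions for illustration; an actual trainer would express the same quantity with differentiable tensor operations.

```python
ROUND_TRIP_COST = 1e-4  # 1 bp round trip, as stated in the text


def net_return_loss(positions, returns, round_trip_cost=ROUND_TRIP_COST):
    """Negative mean per-bar return after costs (lower is better).

    positions: target position in [-1, 1] held over each bar.
    returns: realized mid-price return over the same bar.
    Half the round-trip cost is charged per unit of position change,
    so opening and then closing a full position costs one round trip.
    """
    half_cost = round_trip_cost / 2
    prev = 0.0
    net = []
    for pos, ret in zip(positions, returns):
        net.append(pos * ret - half_cost * abs(pos - prev))
        prev = pos
    return -sum(net) / len(net)


# Go long for two bars, then flat: one full round trip is paid in total.
loss = net_return_loss([1.0, 1.0, 0.0], [0.0010, -0.0004, 0.0002])
```

Because the cost term penalizes every change in position, a model trained this way is discouraged from flipping direction on moves too small to cover the cost.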

Backtesting.
Evaluation follows a rolling, forward-chained protocol with strictly non-overlapping train, validation, and test windows. All metrics (Sharpe, Calmar, maximum drawdown (MDD), returns, and volatility) are measured after costs and capacity constraints.
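
A forward-chained protocol of this kind can be sketched as a generator of index ranges. The window sizes, the step, and the function name are hypothetical; the sketch only shows the ordering constraint, with each fold's validation window after its training window and its test window last.

```python
def forward_chained_splits(n_bars, train, val, test, step=None):
    """Yield (train, val, test) index ranges that roll forward in time.

    Within each fold the three windows are contiguous and non-overlapping,
    and the test window always lies after training and validation.
    """
    step = step or test  # by default, advance by one test window per fold
    start = 0
    while start + train + val + test <= n_bars:
        t1 = start + train
        v1 = t1 + val
        yield range(start, t1), range(t1, v1), range(v1, v1 + test)
        start += step


folds = list(forward_chained_splits(n_bars=10, train=4, val=2, test=2))
```

With these toy sizes the generator produces two folds, and the test windows of successive folds do not overlap because the step equals the test length.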

Post-processing.
The model’s raw outputs are converted into directional signals, with thresholds tuned for turnover, cost, and stability across assets.
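
The trade-off behind threshold selection can be illustrated with a small sketch that maps class probabilities to a signal and measures the resulting turnover. The probability inputs, the threshold values, and both function names are assumptions for the example.

```python
def to_signal(p_up, p_down, threshold=0.55):
    """Map class probabilities to +1 (long), -1 (short), or 0 (flat)."""
    if p_up >= threshold and p_up > p_down:
        return 1
    if p_down >= threshold and p_down > p_up:
        return -1
    return 0


def turnover(signals):
    """Total absolute position change, starting from flat."""
    prev, total = 0, 0
    for s in signals:
        total += abs(s - prev)
        prev = s
    return total


probs = [(0.70, 0.10), (0.50, 0.30), (0.20, 0.60), (0.58, 0.22)]
loose = [to_signal(u, d, threshold=0.45) for u, d in probs]
strict = [to_signal(u, d, threshold=0.65) for u, d in probs]
```

On this toy input the looser threshold trades on every bar and flips direction twice, while the stricter one takes a single position and exits, so raising the threshold lowers turnover and therefore cost.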

Real-time Inference.
Signals are generated in near real time — one or two seconds behind the live feed — and refreshed every minute. Latency and load are continuously benchmarked to ensure consistency with backtested performance.

Retraining & Monitoring.
Each quarter, the model is retrained on the most recent data under the same forward-chained discipline. Monitoring tracks drift, Sharpe degradation, long/short balance, and volatility to verify that performance remains within the model’s stability band.
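
A monitoring check along these lines might look like the sketch below, which flags Sharpe degradation relative to the backtest and a lopsided long/short mix. The alert names, the tolerated Sharpe drop, and the balance band are hypothetical values chosen for the example, and the risk-free rate is taken as zero.

```python
from statistics import mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns."""
    return mean(returns) / stdev(returns) * periods_per_year ** 0.5


def long_short_balance(signals):
    """Fraction of active signals that are long; 0.5 means balanced."""
    active = [s for s in signals if s != 0]
    if not active:
        return 0.5
    return sum(1 for s in active if s > 0) / len(active)


def monitor(live_returns, backtest_sharpe, signals,
            max_sharpe_drop=0.5, balance_band=(0.3, 0.7)):
    """Return a list of alert names; an empty list means within band."""
    alerts = []
    if sharpe(live_returns) < backtest_sharpe * (1 - max_sharpe_drop):
        alerts.append("sharpe_degradation")
    lo, hi = balance_band
    if not lo <= long_short_balance(signals) <= hi:
        alerts.append("long_short_imbalance")
    return alerts


alerts = monitor(
    live_returns=[0.001, -0.002, 0.0005, -0.001, 0.0002],
    backtest_sharpe=1.5,
    signals=[1, 1, 1, 1, -1, 0],
)
```

Drift and volatility checks would follow the same pattern: compute a live statistic, compare it with a reference band, and raise an alert when it leaves the band.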

5. Risk & Controls

Pythia operates under clear quantitative guardrails:

Per-symbol balance across all symbols in the training set.
Positive Sharpe under slippage stress tests.
Last-month stability as a prerequisite for promotion.
Tail concentration limits, ensuring no single day dominates quarterly P&L.

These controls make Pythia more explainable, reproducible, and operationally reliable — the foundation required for institutional deployment.
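
Three of the guardrails above can be combined into a simple promotion gate, sketched here. The 25% cap on a single day's share of P&L, the use of summed last-month P&L as a stability proxy, and the function names are all assumptions for illustration.

```python
def tail_concentration(daily_pnl):
    """Share of total P&L contributed by the single best day."""
    total = sum(daily_pnl)
    if total <= 0:
        return float("inf")  # no positive P&L to attribute; treat as failing
    return max(daily_pnl) / total


def passes_guardrails(daily_pnl, stressed_sharpe, last_month_pnl,
                      max_day_share=0.25):
    """Promotion gate: every condition must hold."""
    return (
        stressed_sharpe > 0              # positive Sharpe under slippage stress
        and sum(last_month_pnl) > 0      # crude last-month stability proxy
        and tail_concentration(daily_pnl) <= max_day_share
    )


steady = passes_guardrails([1.0] * 8, 0.8, [1.0, -0.5, 0.7])
spiky = passes_guardrails([10.0, 1.0, -1.0, 0.5], 0.8, [1.0])
```

The second candidate fails even though its total P&L is higher, because a single day accounts for almost all of it.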

6. Why It Matters

Most AI-driven trading models chase patterns until they decay.

Pythia instead listens to the market’s structure and learns deliberately. It bridges the gap between research and execution by treating adaptivity as infrastructure, not an afterthought.

Every quarterly update comes with its own model card, data snapshot, and Weights & Biases trace, so each signal can be fully audited, compared, and reused in production.

This isn’t a one-off backtest — it’s a living system designed to evolve.

7. What’s Next

We’re now promoting the latest quarterly model for paper-trading and shadow runs with fixed entry/exit rules, while tracking live reliability, turnover, and capacity.

Our next post will cover performance results and cost-adjusted benchmarks across multiple quarters — stay tuned for the analysis.

Adaptivity is the edge that lasts.

Pythia is how we build it — quarter by quarter.
