Over the last 18 months we’ve been talking with quants, PMs, and AI researchers (basically anyone who’d chat with us) across large and small buy-side and sell-side institutions. Our goal was to learn where AI is used in production for alpha and execution. More specifically, we wanted to see the gaps in current methods and the real benchmarks AI has to beat to get adopted. What follows is a quick re-org of our notes and the anecdotes we heard along the way. We’ve organized these from the ultra-short microstructure horizons to the longer-term, macro and portfolio horizons.

LOB-event AI models

At one end of the spectrum, we talked to AI researchers who use deep learning models trained on raw Level-II/III limit-order-book (LOB) states to forecast very short-horizon moves. Published results report high scores on some benchmarks, for example 80% accuracy at predicting the price trend over the next 10 LOB events. However, these results do not seem to hold across data sets: accuracy drops to around 60% on some US equities data sets. It is also worth keeping in mind that 10 events can elapse in microseconds to milliseconds on liquid symbols. Although the literature is rich with these models, it is not clear whether they are used in production at scale.
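
To make "predicting the price trend over the next 10 LOB events" concrete, here is a minimal sketch of one labeling convention that appears in the LOB literature: compare the average mid-price over the next few events with the current mid-price. The notes above do not say which labeling or threshold the published benchmarks use, so the function names, the `horizon` default, and the `threshold` value are illustrative assumptions.

```python
from statistics import fmean


def trend_labels(mid_prices, horizon=10, threshold=1e-4):
    """Label each event +1 (up), -1 (down) or 0 (flat).

    Assumed convention: compare the mean mid-price over the next
    `horizon` events with the current mid-price, as a relative change.
    `threshold` is an illustrative value, not one taken from the notes.
    """
    labels = []
    for t in range(len(mid_prices) - horizon):
        future_mean = fmean(mid_prices[t + 1 : t + 1 + horizon])
        change = (future_mean - mid_prices[t]) / mid_prices[t]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


def accuracy(predicted, actual):
    """Share of events where the predicted label equals the realized label."""
    if not actual:
        raise ValueError("need at least one labeled event")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Toy series: flat at 100.0 for five events, then a step up to 101.0.
mids = [100.0] * 5 + [101.0] * 15
labels = trend_labels(mids, horizon=10)
```

Under this convention the accuracy figure depends on the threshold and on how balanced the three classes are, which may be one reason scores vary across data sets.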

Book imbalance momentum models

Many of the market makers we talked to employ simple, fast signals from order-flow/queue imbalance, i.e., how much volume sits at the best bid vs. the best ask and how that pressure changes. Academic work shows imbalance is a statistically significant predictor of short-term mid-price changes. These models are good at predicting how the book moves over time frames that typically range from tens of seconds to a few minutes. Most of the practitioners we talked to expect ‘hit rates’ (accuracies) between 50% and 55% from these models.
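
As a rough illustration of the signal described above, the sketch below computes a normalized top-of-book imbalance, (bid size - ask size) / (bid size + ask size), and scores a hit rate against subsequent mid-price moves. The function names and the 0.3 threshold are hypothetical; the notes do not describe any desk's actual implementation.

```python
def queue_imbalance(bid_size, ask_size):
    """Normalized top-of-book imbalance in [-1, 1]; positive means bid-heavy."""
    total = bid_size + ask_size
    if total == 0:
        return 0.0  # empty top of book: treat as neutral
    return (bid_size - ask_size) / total


def imbalance_signal(bid_size, ask_size, threshold=0.3):
    """Return +1 (expect up), -1 (expect down) or 0 (no view).

    `threshold` is an illustrative assumption, not a calibrated value.
    """
    imbalance = queue_imbalance(bid_size, ask_size)
    if imbalance > threshold:
        return 1
    if imbalance < -threshold:
        return -1
    return 0


def hit_rate(signals, future_moves):
    """Fraction of non-zero signals whose sign matched the later mid-price move."""
    scored = [(s, m) for s, m in zip(signals, future_moves) if s != 0]
    if not scored:
        raise ValueError("no non-zero signals to score")
    hits = sum(1 for s, m in scored if s * m > 0)
    return hits / len(scored)
```

For example, `imbalance_signal(300, 100)` returns `1` because the imbalance of 0.5 exceeds the threshold. A hit rate in the 50% to 55% range means a signal like this is right only slightly more often than a coin flip.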

Physics‑inspired models

We also keep running into a well-documented school of thought popular with quants who came up through physics. Think Avellaneda–Stoikov for inventory-aware quoting and Hawkes/point-process views of order arrivals and LOB dynamics. The appeal is that these models might explain how and when the next few orders (at specific prices and volumes) are generated, and therefore allow teams to anticipate them and benefit from the associated market moves. Additionally, these models typically let you tune levers like risk, inventory, and sensitivity, which makes them adaptable to different strategies. They show up intraday for quoting/inventory and as a structured starting point when data are thin or noisy, so the model doesn’t chase randomness. It’s a mature academic backbone many shops still borrow from.
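
To show what "tunable levers" look like in practice, here is a minimal sketch of the closed-form quotes from the Avellaneda–Stoikov model: a reservation price shifted away from the mid by inventory, and a spread that widens with risk aversion and volatility. The parameter values are made up for illustration, and the function name is hypothetical; real desks would calibrate `gamma`, `sigma`, and `k` and handle units carefully.

```python
import math


def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, k, time_left):
    """Return (bid, ask) from the Avellaneda-Stoikov closed-form approximation.

    mid        current mid-price
    inventory  signed position (positive = long)
    gamma      risk-aversion lever
    sigma      mid-price volatility (same time unit as time_left)
    k          decay of fill intensity with distance from the mid
    time_left  remaining horizon T - t
    """
    risk_term = gamma * sigma ** 2 * time_left
    # Long inventory pushes the reservation price below the mid,
    # so the quotes lean towards selling.
    reservation = mid - inventory * risk_term
    spread = risk_term + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0


# Illustrative numbers only.
flat_bid, flat_ask = avellaneda_stoikov_quotes(100.0, 0, 0.1, 2.0, 1.5, 1.0)
long_bid, long_ask = avellaneda_stoikov_quotes(100.0, 5, 0.1, 2.0, 1.5, 1.0)
```

With zero inventory the quotes sit symmetrically around the mid. With a long position of 5 both quotes shift down by the same amount while the spread stays unchanged, which is the inventory-aware behavior the paragraph above describes.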

Legacy ML methods

By far the most common methods cited by PMs at hedge funds and other buy-side institutions were what we would call “legacy ML” techniques. These are tried and tested approaches like gradient-boosted trees, random forests, regularized linear models, and careful feature plumbing. The appeal is obvious: they perform strongly on tabular data, are governance-friendly, and are easy to refresh when edges fade. Large studies in cross-sectional asset pricing keep finding that tree/NN families beat plain linear baselines in economic terms, which matches what we heard from most desks. These methods are used across horizons from days to weeks to months.

LLMs for signals and portfolio construction

We’ve also heard a lot about LLMs. News-driven signal generation isn’t new, but modern models have definitely revived it. Text pipelines that summarize or score news, earnings reports, Fed statements, and quarterly filings are widely used, often via finance-specific platforms. Peer-reviewed work shows LLM-scored headlines can predict next-day returns out-of-sample and sometimes outperform traditional sentiment, which explains the current interest. The newest turn we’re seeing: LLMs for stock picking and portfolio construction on weeks-to-months horizons.

Macro time‑series models

Risk and treasury teams still live with VAR (vector autoregression) for multi-series dynamics and (A)GARCH for volatility: explainable, documented, and battle-tested. They’re not glamorous, but they anchor planning and stress testing at monthly or quarterly horizons and fit neatly into model-risk frameworks. If you’ve sat in a risk committee, you know why these endure.
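
Part of why these models are considered explainable is that the core recursion fits in a few lines. The sketch below implements the plain GARCH(1,1) conditional-variance update, variance_t = omega + alpha * return_{t-1}^2 + beta * variance_{t-1}, assuming zero-mean returns and already-estimated parameters. It is not the asymmetric variant and does not estimate anything; the function name and the parameter values are illustrative.

```python
def garch11_variance_path(returns, omega, alpha, beta):
    """Conditional variances from a GARCH(1,1) recursion.

    Assumes zero-mean returns and known parameters. The path starts at
    the unconditional variance omega / (1 - alpha - beta); entry i is
    the variance forecast for period i given returns before period i.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0 and alpha, beta >= 0")
    if alpha + beta >= 1:
        raise ValueError("need alpha + beta < 1 for a finite long-run variance")
    variance = omega / (1.0 - alpha - beta)
    path = [variance]
    for r in returns:
        # A large shock raises next period's variance through alpha,
        # and beta makes that elevated variance persist.
        variance = omega + alpha * r * r + beta * variance
        path.append(variance)
    return path


# Illustrative parameters: long-run variance of 1.0, then a quiet
# period followed by a 2-unit shock.
path = garch11_variance_path([0.0, 2.0], omega=0.1, alpha=0.1, beta=0.8)
```

Each parameter has a direct reading (baseline level, reaction to shocks, persistence), which is the kind of property that makes a model straightforward to document for a model-risk review.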

Where we are focusing

Given the above, we have identified a gap in the use of state-of-the-art AI for intraday horizons of minutes to hours. Like the shorter-term approaches, we consume raw Level-II limit-order-book data, but instead of applying legacy ML or physics-inspired models, we build finance-native transformer models from scratch and train them only on quantitative time series, not text. Think of it as taking the useful ideas behind LLMs (long- and short-pattern understanding, static-plus-dynamic inputs) and tailoring them to market data. The goal is simple: deliver better intraday predictions than prior approaches, packaged as a B2B SaaS solution with a clean API for fast deployment.

AI and Adjacent Methods in Trading: Notes from the Field