What we shipped recently

Plain-English notes on the last few weeks of work. Accuracy improvements to the backtest engine, UX fixes that surface failures honestly, and the occasional bit of infrastructure plumbing.

Subscribe via RSS.

May 2026

2 May 2026 · 2 min read

Changelog gets a proper page, and signed-in users can find their way back

UX · marketing · navigation

The first version of /changelog shipped last week as a card grid. It was honest, but it didn't read like a release-notes page — you landed and saw nothing but a header and a list of titles. We rebuilt the layout so the most recent entry now opens out in full on the index page, with a sticky right-rail of every other release as compact version cards. The detail page got the same treatment: breadcrumbs at the top, a real article body in the middle, the same sidebar of other releases on the right, and a "back to all updates" link at the bottom.

We also added the small things that make a release-notes surface feel finished — relative dates ("today", "yesterday", "3 days ago") for entries within the last week, reading-time estimates on the detail page, and tag chips that match the chrome we use everywhere else. Mobile collapses the two columns into a single stack, so nothing gets squished.
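As a rough illustration of the relative-date rule, here is a minimal sketch. The function name and the cut-off of "fewer than 7 days old" are assumptions for the example, not the site's actual code.

```python
from datetime import date


def relative_date(entry: date, today: date) -> str:
    """Label an entry date relative to today, falling back to an absolute date."""
    days = (today - entry).days
    if days < 0 or days >= 7:
        # Assumption: "within the last week" means fewer than 7 days old.
        return f"{entry.day} {entry.strftime('%B %Y')}"
    if days == 0:
        return "today"
    if days == 1:
        return "yesterday"
    return f"{days} days ago"
```

Anything older than the cut-off keeps its absolute date, so the index never shows labels like "43 days ago".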

The second fix was less visible but more annoying. If you were signed in and clicked through to /changelog or /learn, the marketing header still showed "Sign in" and "Start free trial" — and there was no link back to your dashboard anywhere on the page. You had to retype the URL or hit the browser back button. Now the header detects that you're authenticated and swaps both CTAs for a single "← Back to dashboard" button. Logged-out visitors see the original sign-up CTAs, exactly as before.

While we were in there we also added Changelog as a top-level link in the marketing nav (it was hidden in the footer before), and wrapped /learn with the same marketing chrome so the back-to-dashboard affordance works there too.

The RSS feed is unchanged, the markdown-on-disk data model is unchanged, and the in-app "What's new" sidebar link still goes where it always did. None of the existing 16 entries needed any edits.

April 2026

30 April 2026 · 3 min read

The agent stack now speaks plain English

a11y · agent · UX · trust

Quantis Trade was built quant-first. Every layer of the agent stack — intent parsing, the strategy classifier, the goal translator, the misconception handler, the result card — assumed the user already knew what Sharpe ratio, drawdown, factor exposure, and PSR meant. A casual trader who'd never used a backtesting platform landed on the result card, saw twenty abbreviations with no explanations, and bounced.

We did a full audit (26 findings across 10 layers) and shipped the fixes in two waves.

Audience mode. Every user now has an audienceMode preference — casual by default, with explicit quant and auto options. The setting flows from the user record through the agent session into every LLM-driven layer, so the same query can route to plain-English or quant-flavoured responses without changing the underlying classifier output. Sessions can override with a header, cookie, or query param.
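One way the override chain could resolve is sketched below. The precedence order (query param, then header, then cookie, then stored preference) and the names `X-Audience-Mode` and `audienceMode` as header, cookie, and query keys are assumptions for illustration; only the three mode values and the casual default come from the notes above.

```python
from typing import Mapping, Optional

VALID_MODES = {"casual", "quant", "auto"}


def resolve_audience_mode(
    user_pref: Optional[str],
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    query: Mapping[str, str],
) -> str:
    """Pick the first valid mode from the session overrides, then the user record."""
    # Assumed precedence: query param > header > cookie > stored preference.
    candidates = [
        query.get("audienceMode"),
        headers.get("X-Audience-Mode"),
        cookies.get("audienceMode"),
        user_pref,
    ]
    for value in candidates:
        if value in VALID_MODES:
            return value
    # Casual is the documented default when nothing valid is set.
    return "casual"
```

Unknown values are ignored rather than rejected, so a stale cookie can't push a session into an undefined mode.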

Plain-English LLM prompts. The strategy intent classifier, goal translator, and misconception handler each got a casual variant of their system prompt. In casual mode, "highest-Sharpe strategies" becomes "strategies with the steadiest, most consistent month-to-month returns". "Try a global portfolio across SPY, EFA, EEM" becomes "Try a global portfolio across SPY (US stocks), EFA (developed-market stocks outside the US — Europe, Japan, UK), and EEM (emerging-market stocks — China, India, Brazil)". The semantic output is identical; only the surfaced language changes.

Result card glossary. Every metric pill now has a ? icon that opens an accessible modal with the metric's full name, a one-sentence plain-English definition, why a casual trader should care, and a link to the relevant Learning Centre article. 33 metrics covered, organised into core (always visible) and advanced (behind a "Show advanced metrics" toggle for casual users). Drawdown gets a specific clarification — "this is the temporary fall from a peak, not money lost permanently unless you sell at the bottom" — because that confusion bounces more first-time users than any other concept.

Honest fallbacks. When the regex parser couldn't classify a query it used to silently fall back to a momentum strategy. Now it returns a ClarificationCard with three concrete reformulations — "I didn't catch that, did you mean one of these?". Same fix for ambiguous tickers, single-strategy hints the LLM detected but the regex missed, and bare "unsupported" responses with no redirect. Picking an option fills the chat input but doesn't auto-send, so you can edit before running.
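The shape of that change, sketched under assumptions: the patterns, the `ParsedStrategy` type, and the three suggested reformulations are hypothetical stand-ins. Only the `ClarificationCard` name and its prompt text come from the notes above.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ParsedStrategy:
    kind: str


@dataclass
class ClarificationCard:
    message: str
    options: List[str]  # picking one fills the chat input; nothing is auto-sent


# Hypothetical patterns standing in for the real regex parser.
PATTERNS = {
    "momentum": re.compile(r"\bmomentum\b", re.IGNORECASE),
    "mean_reversion": re.compile(r"\bmean[- ]reversion\b", re.IGNORECASE),
}


def parse_query(text: str) -> Union[ParsedStrategy, ClarificationCard]:
    for kind, pattern in PATTERNS.items():
        if pattern.search(text):
            return ParsedStrategy(kind)
    # No match: ask for clarification instead of silently defaulting to momentum.
    return ClarificationCard(
        message="I didn't catch that, did you mean one of these?",
        options=[
            "Backtest a momentum strategy on US large caps",
            "Backtest RSI mean reversion on QQQ",
            "Compare a 50-day / 200-day moving average crossover with buy-and-hold",
        ],
    )
```

Returning a distinct type forces the caller to handle the "didn't understand" case explicitly, which is what makes the silent fallback hard to reintroduce.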

Trust block, factor exposure, and regime labels. The data-quality strip below the equity curve now has three tiers (headline → expandable explainer → technical). Factor exposure shows a one-sentence plain-English summary on top, with the regression coefficients collapsible behind "Show technical details". Regime-conditional Sharpe is labelled "Up markets / Down markets / Volatile markets" instead of Bull/Bear/HighVol.

Onboarding. A prompt-suggestion carousel under the chat input shows eight starter queries (mix of strategies and research questions), filtered live as you type and hidden after three messages. /help is a chat-local command — /help sharpe opens the glossary modal at that metric, /help on its own lists every entry. Asking "what is X" where X is a known metric is intercepted locally before it hits the backend, so it doesn't burn tokens on a definitional question.
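A minimal sketch of the local interception, assuming a small hypothetical glossary and hypothetical action names. Listing the whole glossary for an unrecognised `/help` argument is also an assumption.

```python
import re
from typing import Optional

# Hypothetical glossary keys; the real glossary covers 33 metrics.
GLOSSARY = {"sharpe", "drawdown", "calmar"}

WHAT_IS = re.compile(r"^\s*what\s+is\s+(?:an?\s+|the\s+)?(.+?)\s*\??\s*$", re.IGNORECASE)


def normalise(term: str) -> str:
    # "Sharpe ratio" and "sharpe" should hit the same glossary entry.
    return term.lower().strip().removesuffix(" ratio")


def handle_locally(message: str) -> Optional[dict]:
    """Return a local UI action, or None to send the message to the backend."""
    parts = message.strip().split(maxsplit=1)
    if parts and parts[0].lower() == "/help":
        term = normalise(parts[1]) if len(parts) > 1 else ""
        if term in GLOSSARY:
            return {"action": "open_glossary", "metric": term}
        return {"action": "list_glossary", "metrics": sorted(GLOSSARY)}
    match = WHAT_IS.match(message)
    if match and normalise(match.group(1)) in GLOSSARY:
        return {"action": "open_glossary", "metric": normalise(match.group(1))}
    return None
```

Only exact glossary hits are intercepted; any other "what is ..." question still goes to the backend.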

The audit report (docs/audits/AUDIT-AGENT-STACK-ACCESSIBILITY-2026-04-30.md) is in the repo with the full finding list, the discovered-during-implementation gaps, and the sequencing rationale. Future plan: an A/B test harness to actually measure whether casual-mode improves time-to-interpretation and re-run rate.

30 April 2026 · 1 min read

Calmar / MAR distinction + survivorship bias estimate

backtest · accuracy · education

Two quant-rigour additions to the result card. First, Calmar and MAR are now reported as separate ratios. They're often used interchangeably, but Calmar divides return by the worst drawdown inside a 36-month window while MAR divides it by the lifetime max drawdown, so the gap between them is itself useful information.
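To make the difference concrete, here is a small sketch. It assumes monthly equity values and uses the most recent 36 months for Calmar (both return and drawdown), which is the common convention. The platform's exact windowing isn't spelled out here, so treat that choice as an assumption.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough fall, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def cagr(equity: Sequence[float], periods_per_year: int = 12) -> float:
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1 / years) - 1


def mar_ratio(equity: Sequence[float]) -> float:
    # MAR: full-history CAGR over lifetime max drawdown.
    dd = max_drawdown(equity)
    return cagr(equity) / dd if dd > 0 else float("inf")


def calmar_ratio(equity: Sequence[float], window_months: int = 36) -> float:
    # Calmar (assumed convention): same ratio, restricted to the last 36 months.
    recent = equity[-(window_months + 1):]
    dd = max_drawdown(recent)
    return cagr(recent) / dd if dd > 0 else float("inf")
```

A strategy with one deep early drawdown and a calm recent record will show a Calmar well above its MAR, and that gap is the information the two pills are meant to surface.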

Second, every backtest now carries a survivorship-bias estimate in the trust block. Our universe is curated to liquid US names that exist today, which means strategies are implicitly tested only on companies that didn't go bust. We compute a rough adjustment based on the historical delisting rate for the relevant sector mix, and surface it as a downward adjustment to the headline CAGR: "your reported CAGR is X%, survivorship-adjusted estimate is Y%".

This isn't a perfect correction (no public dataset of delisted ticker bars exists at this scale), but it's a more honest number than pretending the bias doesn't exist.
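For intuition only, one possible shape of such a haircut is sketched below. The formula, the `loss_given_delisting` parameter, and every number in the usage note are assumptions made for the example. The notes above don't specify the actual adjustment.

```python
from typing import Mapping


def survivorship_adjusted_cagr(
    reported_cagr: float,
    sector_weights: Mapping[str, float],
    annual_delisting_rate: Mapping[str, float],
    loss_given_delisting: float = 0.5,
) -> float:
    """Apply a rough annual haircut for names that would have delisted."""
    # Blend sector delisting rates by the strategy's sector exposure.
    blended = sum(w * annual_delisting_rate[s] for s, w in sector_weights.items())
    # Assumed model: each year a `blended` share of capital loses
    # `loss_given_delisting` of its value.
    return (1 + reported_cagr) * (1 - blended * loss_given_delisting) - 1
```

With a 10% reported CAGR, a 50/50 split across two sectors delisting at 4% and 2% a year, and a 50% assumed loss on delisting, the sketch gives roughly 8.35%.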

30 April 2026 · 1 min read

Documentation hub at /docs — one place to find everything

UX · docs · fix

If you typed /docs into the address bar on quantistrade.co.uk, you used to get a 404. That was embarrassing — somebody we'd just promoted the platform to went looking for it and bounced. Fixed.

/docs is now a documentation hub: five sections of plain-English links into the rest of the site. Start here, core metrics (Sharpe, Sortino, Calmar, MAR, max drawdown, hit rate, turnover, alpha, beta, information ratio), concepts that show up across the platform (overfitting, Sharpe vs PSR vs DSR, walk-forward vs CPCV, factor decomposition, IC decay), strategy types the natural-language agent recognises (momentum, mean reversion, quality, low volatility), and the usual contact / security shortcuts.

Each entry is one short blurb that links into deeper material in the Learning Centre or to the contact page. The point is to give somebody who's never used the platform a single place to land that doesn't dump them straight into the dashboard.

While shipping this we also fixed a build-time issue where the changelog loader was reading from docs/changelog/ but the Dockerfile wasn't copying that directory into the image. That's why an unrelated catastrophic restart earlier today briefly took /changelog offline along with /docs. Both are now baked into the image and won't regress on the next rebuild.

30 April 2026 · 1 min read

Single-strategy queries now work without keyword tricks

accuracy · research · fix

Asking the agent for "a 50-day / 200-day moving average crossover" or "RSI mean reversion on QQQ" used to fall through the keyword router and get treated as a generic portfolio query. The result was technically correct but never the strategy you actually asked for.

A small LLM-based strategy-intent classifier now sits in front of the router. It recognises the common single-strategy shapes — moving-average crossovers, RSI thresholds, Bollinger band touches, breakout systems — and dispatches them straight to the relevant DSL template. Free-form portfolio queries still get the goal translator, unchanged.

You should notice the difference most when you describe a strategy in trader-speak rather than in keywords. The previous router needed you to say "crossover" specifically; the new one understands the intent.

30 April 2026 · 1 min read

Monte Carlo chart no longer renders as a flat line

backtest · fix

The Monte Carlo chart was technically correct but visually useless: a couple of outlier paths dragged the Y-axis to such an extreme range that the interquartile band — the part that actually matters — was compressed into a flat horizontal smear.

The Y-axis is now clamped to the 5th–95th percentile of simulated paths. Outliers are still drawn (clipped to the chart edge with a subtle marker), but the IQR band, median path, and your realised equity curve are all clearly visible. The summary stats below the chart are unchanged — only the visualisation was off.
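The clamping idea can be sketched as follows. Pooling every value from every path before taking percentiles is an assumption, as are the function names. The chart code itself may compute the bounds differently, for example per time step.

```python
from typing import List, Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between neighbouring values."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def y_axis_bounds(
    paths: Sequence[Sequence[float]], low_q: float = 5, high_q: float = 95
) -> Tuple[float, float]:
    # Assumption: pool all simulated values across paths and time steps.
    values = sorted(v for path in paths for v in path)
    return percentile(values, low_q), percentile(values, high_q)


def clip_path(path: Sequence[float], bounds: Tuple[float, float]) -> List[float]:
    """Pin outlier points to the chart edge instead of stretching the axis."""
    lo, hi = bounds
    return [min(max(v, lo), hi) for v in path]
```

Only the drawing is clipped. The summary statistics should still be computed from the unclipped paths.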

The bug had been there since we shipped the Monte Carlo panel. The summary numbers were always right, which is partly why nobody caught it sooner.

30 April 2026 · 1 min read

Sign-in rate limit moved to Redis (no more reset on deploy)

security · infra

The auth rate limiter used to live in process memory, which had two embarrassing properties: every container restart wiped the counter (so an attacker could time their attempts to coincide with deploys), and a multi-replica deployment would split the count across replicas instead of summing them.

It now runs against Redis, shared across replicas and durable across restarts. If Redis is unreachable the request fails closed — no silent fallback to the old in-memory store, because that would be exactly the bug we just fixed in disguise.
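A minimal sketch of the fail-closed behaviour, written against a small store interface so it runs without a Redis server. The key format, limit, window, and exception name are assumptions. The `incr` and `expire` methods mirror the commands of the same name that a Redis client such as redis-py exposes.

```python
from typing import Protocol


class CounterStore(Protocol):
    def incr(self, key: str) -> int: ...
    def expire(self, key: str, seconds: int) -> bool: ...


class RateLimitUnavailable(Exception):
    """Raised when the shared store can't be reached; callers must reject the request."""


def allow_sign_in(
    store: CounterStore, client_id: str, limit: int = 5, window_seconds: int = 900
) -> bool:
    key = f"auth:rl:{client_id}"  # hypothetical key format
    try:
        count = store.incr(key)
        if count == 1:
            # First attempt in this window: start the expiry clock.
            store.expire(key, window_seconds)
    except Exception as exc:
        # Fail closed: no fallback to an in-memory counter.
        raise RateLimitUnavailable("rate limit store unreachable") from exc
    return count <= limit
```

Because the counter lives in the shared store, every replica increments the same key and a restart doesn't reset it.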

Most users will never notice this. If you've ever been mysteriously rate-limited despite having only made one or two attempts, that's the thing it fixes — the counter no longer carries garbage from a previous user on the same IP.

30 April 2026 · 1 min read

Strategies show pending, ready, or failed status

UX · trust · fix

If strategy creation errored partway through, the broken strategy still appeared in your list looking exactly like a working one. Clicking it would surface the failure, but only after you'd already tried to use it. Not great.

Every strategy now carries one of three statuses: pending (still being built), ready (backtested at least once), or failed (creation hit an error and never recovered). The list view shows a small status pill on each row, failed strategies are visually de-emphasised, and the row links to the underlying error rather than pretending it ran.
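The three statuses map naturally onto a small enum. In this sketch the `derive_status` helper and its two inputs are hypothetical. Only the status names and their meanings come from the description above.

```python
from enum import Enum
from typing import Optional


class StrategyStatus(str, Enum):
    PENDING = "pending"  # still being built
    READY = "ready"      # backtested at least once
    FAILED = "failed"    # creation hit an error and never recovered


def derive_status(backtest_count: int, creation_error: Optional[str]) -> StrategyStatus:
    # An unrecovered creation error wins over everything else.
    if creation_error is not None:
        return StrategyStatus.FAILED
    if backtest_count >= 1:
        return StrategyStatus.READY
    return StrategyStatus.PENDING
```

Checking for the error first means a broken strategy can never be shown as ready.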

Same idea as the wallet fix below: when something is broken, say so. The whole point of the strategy agent is to be a trustworthy collaborator, and trust evaporates fast when broken work looks identical to working work.

30 April 2026 · 1 min read

Trade attribution panel + active return / tracking error pills

backtest · accuracy

Headline backtest stats answer "did the strategy work" but not "where did the return actually come from". The new trade-attribution panel breaks the equity curve down by trade, ranking the largest contributors and detractors so you can see whether your CAGR is being carried by three lucky trades or by a broad base.
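The ranking behind that panel can be approximated in a few lines. This sketch assumes each trade is a (label, P&L) pair and adds a hypothetical concentration measure, the share of net P&L coming from the top few winners. The real panel may attribute returns differently.

```python
from typing import List, Sequence, Tuple

Trade = Tuple[str, float]  # (label, profit or loss in account currency)


def top_contributors(trades: Sequence[Trade], n: int = 3) -> List[Trade]:
    winners = [t for t in trades if t[1] > 0]
    return sorted(winners, key=lambda t: t[1], reverse=True)[:n]


def top_detractors(trades: Sequence[Trade], n: int = 3) -> List[Trade]:
    losers = [t for t in trades if t[1] < 0]
    return sorted(losers, key=lambda t: t[1])[:n]


def top_share(trades: Sequence[Trade], n: int = 3) -> float:
    """Share of net P&L that comes from the n largest winning trades."""
    net = sum(pnl for _, pnl in trades)
    if net <= 0:
        return float("nan")  # not meaningful for a losing or flat strategy
    return sum(pnl for _, pnl in top_contributors(trades, n)) / net
```

A share near or above 1.0 means the result depends on a handful of trades. A low share points to a broader base.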

Three new pills appear next to Sharpe and CAGR on every result card: active return (vs benchmark), tracking error, and information ratio. Together they tell you whether the strategy actually beat its benchmark per unit of relative risk taken — not just whether the absolute number looks good.
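The three pills are closely related, which a short sketch makes clear. It assumes aligned per-period returns for strategy and benchmark and simple arithmetic annualisation. The platform's exact annualisation convention isn't stated here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def active_stats(
    strategy: Sequence[float], benchmark: Sequence[float], periods_per_year: int = 252
) -> Tuple[float, float, float]:
    """Return (active return, tracking error, information ratio), annualised."""
    active = [s - b for s, b in zip(strategy, benchmark)]
    active_return = mean(active) * periods_per_year
    # Tracking error is the volatility of the active (strategy minus benchmark) returns.
    tracking_error = stdev(active) * math.sqrt(periods_per_year)
    info_ratio = active_return / tracking_error if tracking_error > 0 else float("nan")
    return active_return, tracking_error, info_ratio
```

The information ratio is just active return divided by tracking error, so it answers the "per unit of relative risk" question directly.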

If you've been wondering whether your favourite backtest is genuinely diversified or quietly concentrated in a few outliers, the attribution panel is the first place to look.

30 April 2026 · 1 min read

Wallet page now tells you when something fails

UX · trust · fix

The billing and usage pages used to render blank when an API call failed — no error, no retry button, just a missing panel. A beta tester (thanks Petar) flagged it; we couldn't tell whether his page was broken, his account was empty, or his plan needed selecting.

Each fetch now returns a typed result that distinguishes "ok", "empty", and "error". An empty wallet shows an onboarding callout. A failed fetch shows an amber banner with the HTTP status, the underlying message, and a Retry button. You'll never see a blank panel again.
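The three-way result can be modelled as a small tagged union. The class names, the `classify_response` helper, and the rendered strings below are hypothetical. The sketch only shows why a blank panel becomes impossible once every branch has to be handled.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Ok:
    data: Any


@dataclass
class Empty:
    pass


@dataclass
class Error:
    status: int
    message: str


FetchResult = Union[Ok, Empty, Error]


def classify_response(status: int, payload: Any, message: str = "") -> FetchResult:
    if status >= 400:
        return Error(status, message)
    if not payload:
        return Empty()  # a successful call with nothing to show
    return Ok(payload)


def render_panel(result: FetchResult) -> str:
    if isinstance(result, Ok):
        return f"Balance: {result.data['balance']}"
    if isinstance(result, Empty):
        return "Onboarding callout"
    return f"Amber banner: HTTP {result.status}: {result.message} [Retry]"
```

Separating "empty" from "error" is the key move: the two used to look identical to the user.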

This is the kind of fix that should have been there from day one. We were silently swallowing errors in production — exactly the pattern our no-silent-fallback rule was supposed to prevent. The audit fixed roughly twenty callsites across billing, usage, settings, and research.

29 April 2026 · 1 min read

AUM-scaled CAGR + drawdown duration distribution

backtest · accuracy

Two additions for traders who care about deploying real capital, not just looking at curves.

AUM-scaled CAGR adjusts the headline return for the slippage and impact you'd realistically take at three notional account sizes — $10k, $100k, and $1M. The math uses the strategy's average position size, our existing slippage model, and the typical bid-ask on each name. The headline number stays the same; an extra row shows what's left after costs at scale.

Drawdown duration distribution shows you the full histogram of "time spent underwater" rather than only the worst drawdown depth. A strategy with a 15% max drawdown that recovers in three weeks is a very different beast from one with the same depth that takes eighteen months — and the median duration is often more useful than the worst.
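Here is a minimal sketch of how "time spent underwater" can be measured from an equity curve. It counts the periods spent below the previous peak and treats a drawdown still open at the end of the series as a completed spell, which is an assumption.

```python
from statistics import median
from typing import List, Sequence


def underwater_durations(equity: Sequence[float]) -> List[int]:
    """Length, in periods, of each spell spent below the running peak."""
    durations: List[int] = []
    peak = equity[0]
    current = 0
    for value in equity[1:]:
        if value >= peak:
            # Recovered to a new (or equal) high: close any open spell.
            if current:
                durations.append(current)
            current = 0
            peak = value
        else:
            current += 1
    if current:
        durations.append(current)  # still underwater at the end of the data
    return durations


durations = underwater_durations([100, 95, 98, 101, 99, 100, 97, 102, 101])
print(durations, median(durations))  # [2, 3, 1] 2
```

The list of durations is what feeds the histogram, and its median is often the more useful headline than the single worst spell.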

29 April 2026 · 1 min read

Honest expectation card before every portfolio backtest

UX · education · trust

A short framing card now appears between "you submitted a portfolio query" and "here are the backtest results". It tells you what to realistically expect from the kind of strategy you described — what range of CAGR is plausible for that asset class and risk level, what drawdowns are typical, what the historical hit rate looks like for similar systems.

The point is anchoring. If you ask for "a low-risk income strategy" and the backtest comes back at 28% CAGR with a 4% max drawdown, the right reaction is suspicion, not celebration. The expectation card primes that suspicion by showing you the realistic range first, so an unrealistic result reads as a warning sign rather than a win.

For high-CAGR queries (above ~25%) the card is more direct: it explicitly names the trade-off — strategies in that range historically pay for it with deep drawdowns, high turnover, or fragile regime dependence. You can still run the backtest. You just see the catch first.

29 April 2026 · 1 min read

Plain-English portfolio goals get multiple backtest variants

UX · research · accuracy

Asking for "a diversified healthcare portfolio targeting 12% annualised" used to require you to know the system's preferred phrasing. The new goal translator parses free-form portfolio descriptions — sector, target return, risk constraints, holding period — and produces a small set of backtest variants that each match the goal in a different way.

Instead of one answer, you get three to five honest options. A momentum-biased version. An equal-weight buy-and-hold. A volatility-targeted version. Each gets a one-line summary of the trade-offs ("higher CAGR, deeper drawdowns") so you can compare without having to read every result card.

This pairs with the expectation-screen card that now appears between query and results — it tells you upfront what the realistic range looks like for the kind of strategy you described, before any specific numbers come back. Anchoring matters; quoting a single backtest as "the answer" implies a precision the data doesn't support.

29 April 2026 · 1 min read

Honest refusals for guaranteed-return, crypto, and international queries

UX · trust · education

Some queries can't be answered honestly with the platform we have today. Asking for a "guaranteed 20% return" misunderstands what backtesting can prove. Asking for "Indian small caps" hits the universe limitation. Asking for "best crypto strategy" hits both data and methodology limits.

The system used to silently degrade these into the closest momentum strategy and return numbers anyway. That was the worst possible behaviour: the user got a number that looked like an answer but was unrelated to what they asked. We now recognise three classes of dead-end query — guarantee-language, unsupported universe, unsupported asset class — and respond with a short refusal that names the specific limitation, plus one or two honest reformulations that we can actually run.

For example, "guaranteed 20%" becomes "no backtest can prove a guarantee — here's a strategy targeting that return level with the historical hit-rate, drawdown, and CI". "Indian small caps" becomes "we're limited to 89 US-listed names; here are some India-exposed ADRs that trade on US exchanges". The point isn't to gatekeep — it's to explain rather than fake an answer.

29 April 2026 · 1 min read

Regime-conditional Sharpe + Omega ratio

backtest · accuracy

A single Sharpe ratio averaged across a 10-year backtest hides a lot. A strategy with Sharpe 1.2 overall might be Sharpe 2.5 in bull markets and Sharpe -0.4 in high-volatility regimes — and you'd want to know that before sizing a position.

Every backtest now reports Sharpe broken out by regime: bull, bear, and high-volatility (defined by the SPY 200-day trend and rolling 60-day realised vol). The breakdown lives in the trust block on the result card and can also be exported in the audit trail.
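A sketch of the breakdown, under assumptions: the 25% annualised volatility threshold, the rule that high volatility takes priority over trend, the zero risk-free rate, and the regime label strings are all illustrative. Only the two inputs (SPY 200-day trend, rolling 60-day realised vol) come from the description above.

```python
import math
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Sequence


def label_regime(
    spy_price: float, spy_sma_200: float, realised_vol_60: float, vol_threshold: float = 0.25
) -> str:
    # Assumption: a high-volatility reading overrides the trend label.
    if realised_vol_60 > vol_threshold:
        return "high_vol"
    return "bull" if spy_price >= spy_sma_200 else "bear"


def sharpe_by_regime(
    returns: Sequence[float], regimes: Sequence[str], periods_per_year: int = 252
) -> Dict[str, float]:
    """Annualised Sharpe (risk-free rate assumed zero) per regime label."""
    buckets = defaultdict(list)
    for r, regime in zip(returns, regimes):
        buckets[regime].append(r)
    result = {}
    for regime, rs in buckets.items():
        if len(rs) < 2 or stdev(rs) == 0:
            result[regime] = float("nan")  # too little data to say anything
        else:
            result[regime] = mean(rs) / stdev(rs) * math.sqrt(periods_per_year)
    return result
```

Regimes with only a handful of observations produce noisy Sharpe values, so the per-regime sample size is worth reading alongside the number.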

We also added the Omega ratio next to Sharpe. Omega captures asymmetry: it compares the total size of gains above a threshold with the total size of losses below it, so it rewards strategies whose upside outweighs their downside in absolute terms, not just risk-adjusted. Plenty of strategies look fine on Sharpe and badly skewed on Omega.
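For reference, the usual discrete form of the Omega ratio is the sum of gains above a threshold divided by the sum of losses below it. The sketch assumes a threshold of zero, and the platform's threshold may differ.

```python
from typing import Sequence


def omega_ratio(returns: Sequence[float], threshold: float = 0.0) -> float:
    """Sum of returns above the threshold divided by the sum of shortfalls below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    return gains / losses if losses > 0 else float("inf")
```

An Omega above 1 means total gains above the threshold outweigh total shortfalls below it. Unlike Sharpe, it uses the whole shape of the return distribution rather than just mean and volatility.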

29 April 2026 · 1 min read

Research surface now classifies queries across 16 typed intents

research · accuracy

The research and chat surface used to dispatch queries with a regex template — fast and cheap, but it failed quietly on anything that didn't match a hard-coded shape. Asking for "compare AAPL and MSFT capital allocation over the last decade" might route to a generic news search instead of the structured comparison view.

The classifier now runs on Anthropic Haiku and returns a typed intent ("compare", "screen", "explain", "macro_lookup", and twelve others) along with structured entities and a confidence score. The 16 dispatch handlers each render their own component: a stock comparison page looks different from a screener result, which looks different from a macro chart. When confidence is low, the response says so and asks you to clarify, instead of guessing.
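The dispatch step might look roughly like this. The `Classification` shape, the 0.6 confidence threshold, the handler bodies, and the clarification wording are assumptions. Only two of the 16 intents are shown, and the classifier call itself is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Classification:
    intent: str
    entities: Dict[str, object] = field(default_factory=dict)
    confidence: float = 0.0


# Hypothetical handlers; the real surface has one per intent.
HANDLERS: Dict[str, Callable[[Dict[str, object]], str]] = {
    "compare": lambda e: f"comparison view for {e.get('tickers')}",
    "screen": lambda e: f"screener for {e.get('criteria')}",
}

CLARIFY = "I'm not confident I understood that. Could you rephrase or add detail?"


def dispatch(result: Classification, min_confidence: float = 0.6) -> str:
    handler = HANDLERS.get(result.intent)
    # Low confidence or an unknown intent: say so instead of guessing.
    if handler is None or result.confidence < min_confidence:
        return CLARIFY
    return handler(result.entities)
```

Treating an unknown intent the same way as low confidence keeps the failure mode visible, which the old regex dispatcher did not.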

If you've been doing research that needed manual workarounds — phrasing things in specific ways to get the right view — try the natural phrasing now. It should mostly just work.

29 April 2026 · 1 min read

Bootstrap confidence intervals on hit rate, profit factor, expectancy

backtest · accuracy · trust

A 0.55 hit rate sounds great until you realise it came from 24 trades, in which case the 95% confidence interval brackets a coin flip. We were quoting trade-level stats as point estimates without saying how confident we were, which is exactly the false-precision problem quant rigour is supposed to prevent.

Hit rate, profit factor, and expectancy are now reported with bootstrap 95% confidence intervals (1,000 resamples of the trade list). The interval shows up next to the point estimate as 0.55 [0.37, 0.73]. When the trade count is small, the interval is wide. When it's large, the interval narrows. Either way you're seeing the actual statistical quality.
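A percentile bootstrap of this kind can be sketched in a few lines. The function names, the fixed seed, and the percentile method are assumptions. The 1,000 resamples and the 95% level match the description above.

```python
import random
from typing import Callable, List, Sequence, Tuple


def hit_rate(pnls: Sequence[float]) -> float:
    return sum(1 for p in pnls if p > 0) / len(pnls)


def bootstrap_ci(
    trades: Sequence[float],
    stat: Callable[[Sequence[float]], float],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval: resample trades with replacement, recompute the stat."""
    rng = random.Random(seed)
    estimates: List[float] = sorted(
        stat(rng.choices(trades, k=len(trades))) for _ in range(n_resamples)
    )
    lo_idx = round(alpha / 2 * n_resamples)
    hi_idx = round((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# 24 trades, 13 winners: the interval is wide enough to include 0.5.
trades = [1.0] * 13 + [-1.0] * 11
low, high = bootstrap_ci(trades, hit_rate)
print(f"{hit_rate(trades):.2f} [{low:.2f}, {high:.2f}]")
```

The same function works for expectancy. Profit factor needs a guard for resamples that happen to contain no losing trades, which this sketch leaves out.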

If a strategy's "edge" disappears once you look at the lower CI bound, that's important information — and now it's visible by default rather than something you'd have to compute yourself.