Calmar Ratio

Annualised return divided by your worst drawdown. The honest answer to 'how much pain did this return cost me?'

What it is

Calmar is the ratio of annualised return (CAGR) to the absolute value of maximum drawdown. Sharpe divides excess return by total volatility and Sortino divides it by downside deviation; Calmar instead divides by the actual peak-to-trough loss, the worst experience someone holding the strategy would have lived through.

Strategies with shallow drawdowns score high on Calmar even with modest CAGR; strategies with deep drawdowns score poorly even when their CAGR looks great. Two backtests with identical CAGR can have wildly different Calmars.
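To make the last point concrete, here is a minimal sketch that applies the ratio described above to two hypothetical backtests. The function name `calmar` and both sets of numbers are illustrative assumptions, not output from any real strategy.

```python
def calmar(cagr: float, max_drawdown: float) -> float:
    """Annualised return divided by the magnitude of max drawdown.

    Both inputs are fractions (0.12 means 12%). The sign of max_drawdown
    is ignored, so -0.30 and 0.30 give the same result.
    """
    if max_drawdown == 0:
        raise ValueError("max drawdown is zero; Calmar is undefined")
    return cagr / abs(max_drawdown)


# Two hypothetical backtests with the same 12% CAGR (made-up numbers)
shallow = calmar(0.12, -0.10)  # worst drawdown of 10%
deep = calmar(0.12, -0.40)     # worst drawdown of 40%

print(round(shallow, 2), round(deep, 2))  # prints: 1.2 0.3
```

Same CAGR, but the second backtest scores a quarter of the first because its worst loss was four times deeper.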

Formula

# Use the magnitude of max drawdown as a positive fraction, whatever sign it is reported with
calmar = CAGR / |max_drawdown|

# Example: 12% CAGR, 30% max DD
calmar = 0.12 / 0.30 = 0.40
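
The formula takes CAGR and max drawdown as given. As a rough sketch of where those inputs come from, the snippet below derives both from an equity curve and then forms the ratio. The function names, the `periods_per_year` parameter, and the sample curve are assumptions for illustration; this is not a description of how Quantis computes the metric.

```python
def cagr(equity, periods_per_year=252):
    """Compound annual growth rate of an equity curve (list of portfolio values)."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1 / years) - 1


def max_drawdown(equity):
    """Largest peak-to-trough loss, returned as a positive fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # running high-water mark
        worst = max(worst, (peak - value) / peak)  # loss relative to that mark
    return worst


def calmar_ratio(equity, periods_per_year=252):
    """CAGR divided by the magnitude of max drawdown."""
    mdd = max_drawdown(equity)
    if mdd == 0:
        raise ValueError("no drawdown in sample; Calmar is undefined")
    return cagr(equity, periods_per_year) / mdd


# Hypothetical year-end values over four years (made-up numbers)
curve = [100.0, 130.0, 91.0, 120.0, 146.41]
growth = cagr(curve, periods_per_year=1)        # about 0.10
depth = max_drawdown(curve)                     # 39 / 130 = 0.30
ratio = calmar_ratio(curve, periods_per_year=1)
print(round(growth, 4), round(depth, 4), round(ratio, 4))
```

With daily data, leave `periods_per_year` at 252 (an assumed trading-day count); the drawdown logic is unchanged.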

Typical ranges

SPY long-run: 0.2–0.4 (roughly 10% CAGR against the ~50% MaxDD of 2007–2009).
Decent systematic strategy: 0.5–1.0.
Strong: > 1.0 sustained. Your CAGR exceeds your worst drawdown, so roughly one year of average returns is enough to make back the worst loss.
Above 2.0: usually a regime artefact — small drawdowns because the sample didn't include a stress event. Check the sample window before believing.

Common mistakes

Backtest doesn't span a stress event. A 5-year backtest from 2015–2020 has its worst drawdown in March 2020, a sell-off that lasted about a month and reversed quickly. The same strategy run through 2008 could show a very different Calmar.
Confusing Calmar with Sharpe. They penalise different things. A high-volatility strategy that never has a deep drawdown (it oscillates around a rising trend) can have high Calmar and low Sharpe; a steady-Eddie strategy with one fat-tail event has the opposite profile.
Ignoring drawdown duration. Calmar uses the depth of the drawdown but not how long you stayed underwater. Two strategies with the same Calmar can differ by months, or longer, in time to recovery.
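
The duration point can be illustrated with a small sketch. The two curves below are hypothetical and built to share the same start value, end value, length, and 20% maximum drawdown, so their CAGR and Calmar are identical; only the time spent below the prior peak differs. The helper name `longest_underwater` is an assumption for this example.

```python
def longest_underwater(equity):
    """Longest run of consecutive periods spent below a previous peak."""
    peak = equity[0]
    current = longest = 0
    for value in equity[1:]:
        if value >= peak:
            # New high (or full recovery): the underwater spell ends
            peak = value
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


# Made-up curves: both fall from 110 to 88 (a 20% drawdown) and end at 150
fast = [100, 110, 88, 110, 120, 130, 140, 150]
slow = [100, 110, 88, 92, 96, 100, 105, 150]

fast_periods = longest_underwater(fast)
slow_periods = longest_underwater(slow)
print(fast_periods, slow_periods)  # prints: 1 5
```

Reading the longest underwater spell next to Calmar separates a quick dip from a long grind back, which the ratio alone cannot do.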

What the platform flags

Quantis shows Calmar alongside MaxDD and recovery duration so you see both the ratio and the underlying components. The platform also surfaces the date of the MaxDD trough, which lets you tie the number to a specific historical event rather than treating it as an abstract statistic.
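
For readers working outside the platform, the following is a minimal sketch of how a trough date could be located in a dated equity series. It is an assumption about one reasonable approach, not Quantis's implementation; the function name `drawdown_trough` and the sample values are hypothetical.

```python
from datetime import date


def drawdown_trough(dated_equity):
    """Return (peak_date, trough_date, depth) for the maximum drawdown.

    dated_equity is a chronological list of (date, value) pairs.
    depth is a positive fraction of the peak value.
    """
    peak_date, peak = dated_equity[0]
    worst = (peak_date, peak_date, 0.0)
    for day, value in dated_equity:
        if value > peak:
            peak_date, peak = day, value  # new high-water mark
        depth = (peak - value) / peak
        if depth > worst[2]:
            worst = (peak_date, day, depth)
    return worst


# Made-up month-end values, not market data
series = [
    (date(2020, 1, 31), 100.0),
    (date(2020, 2, 28), 104.0),
    (date(2020, 3, 31), 78.0),
    (date(2020, 4, 30), 88.0),
    (date(2020, 5, 29), 95.0),
]

peak_day, trough_day, depth = drawdown_trough(series)
print(peak_day, trough_day, round(depth, 2))
```

Knowing both the peak and trough dates makes it possible to check whether the worst loss lines up with a known market event or with something specific to the strategy.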
