Does anyone actually know WHY their backtest failed — not just the numbers?

I've been running strategies on Streak for a while now and one thing genuinely frustrates me.

After every failed backtest, the platform shows me:

- Win rate: 43%
- Max drawdown: 31%
- Sharpe ratio: 0.4

But it never tells me WHY it failed.

Was it because February and March were strongly trending and my RSI strategy only works in sideways markets? Was it because my stop loss was too tight for the volatility that month? Was it because I over-optimised and the edge was never real to begin with?

I genuinely can't tell. I end up just tweaking parameters and running the backtest again — which I know is wrong but I don't know what else to do.

How do you all approach this? Do you manually go through the trade log? Do you look at market conditions during the losing period? Is there a systematic way to diagnose a failed strategy that I'm missing?

Would love to hear how experienced traders here actually approach this problem.

@Streak

@Shivanjali_Vishwakar

A backtest engine can accurately simulate entries, exits, P&L, drawdown, win rate, and other statistical metrics because these are rule-based calculations. However, identifying why a strategy underperformed is a much more complex layer of analysis.

For instance, determining whether the underperformance was caused by trending markets, changing market regimes, liquidity conditions, or stop-loss sensitivity requires contextual interpretation. These factors are dynamic in nature and can often overlap, making them difficult to attribute to a single cause.

Disclaimer:
Please note that all Discover strategies and scanners available on the platform are meant purely for educational purposes to help users understand how the platform and strategy building work. They should not be considered as buy or sell recommendations. Before deploying any strategy in live markets, we strongly recommend thoroughly analyzing the logic, validating it through backtesting across different market conditions, and understanding the associated risks.

This is really well put and honestly confirms exactly what I’m exploring. The gap you described between what a backtest engine can calculate and what actually caused the underperformance is precisely what I’m trying to bridge.

The idea is to use AI to provide that contextual interpretation layer on top of the raw backtest metrics. So instead of just seeing “drawdown 31%”, a trader sees “your strategy struggled in February because Nifty was in a strong uptrend, and your mean-reversion logic doesn’t perform in trending conditions”.

Quick question: do you work on one of these platforms? I’d genuinely love to understand this problem from the builder’s side too. Would you be open to a 10-minute chat?

@Shivanjali_Vishwakar Thank you for the interest and for sharing your thoughts. For any product-related discussions, suggestions, or queries, please feel free to write to [email protected]. The relevant team will review and assist you further.

@Streak’s point about context is important. One practical habit I would add: keep a small “failure journal” for every backtest, not just the final metrics.

For each failed run, tag the losing trades by cause:

  • trend day vs range day
  • entry late vs entry early
  • stop too tight vs target too far
  • losses concentrated in one time window
  • losses concentrated after a volatility spike

After 30-50 trades, patterns usually become clearer. Sometimes the answer is not “change RSI from 14 to 12”, but “this logic only works in low-volatility sideways phases” or “most losses happen in the first 20 minutes”.

That also reduces over-optimisation, because you are diagnosing behaviour instead of only tuning numbers.
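
If you keep the journal in a spreadsheet or export, a rough sketch like the one below can do the tallying for you. Everything here is my own assumption for illustration: the `Trade` fields, the tag names, the 09:15 session open, and the sample entries are made up, not something Streak exports in this shape. It just answers two questions from the list above: which tags carry most of the losing P&L, and how much of the loss comes from the first 20 minutes.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Trade:
    # Hypothetical journal row; fill these in by hand from your trade log.
    entry_time: datetime
    pnl: float
    tags: list[str] = field(default_factory=list)


def loss_share_by_tag(trades):
    """Share of total losing P&L carried by each tag (tags may overlap)."""
    losers = [t for t in trades if t.pnl < 0]
    total_loss = sum(-t.pnl for t in losers)
    if total_loss == 0:
        return {}
    by_tag = Counter()
    for t in losers:
        for tag in t.tags:
            by_tag[tag] += -t.pnl
    # most_common() orders tags from largest to smallest loss
    return {tag: loss / total_loss for tag, loss in by_tag.most_common()}


def loss_share_in_opening(trades, session_open=(9, 15), minutes=20):
    """Share of losing P&L from trades entered within `minutes` of the open.

    session_open is an assumed (hour, minute); adjust for your market.
    """
    losers = [t for t in trades if t.pnl < 0]
    total_loss = sum(-t.pnl for t in losers)
    if total_loss == 0:
        return 0.0
    open_mins = session_open[0] * 60 + session_open[1]
    early_loss = sum(
        -t.pnl
        for t in losers
        if 0 <= (t.entry_time.hour * 60 + t.entry_time.minute) - open_mins < minutes
    )
    return early_loss / total_loss


# Made-up journal entries, only to show the shape of the data.
trades = [
    Trade(datetime(2024, 2, 5, 9, 20), -500.0, ["trend_day", "stop_too_tight"]),
    Trade(datetime(2024, 2, 5, 11, 0), 300.0, ["range_day"]),
    Trade(datetime(2024, 2, 6, 9, 30), -300.0, ["trend_day"]),
    Trade(datetime(2024, 2, 7, 13, 45), -200.0, ["range_day", "entry_late"]),
]

for tag, share in loss_share_by_tag(trades).items():
    print(f"{tag}: {share:.0%} of losses")
print(f"first 20 minutes: {loss_share_in_opening(trades):.0%} of losses")
```

Because one trade can carry several tags, the shares can add up to more than 100%. That is fine for spotting which cause dominates, but it is not a clean attribution.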