Mistakes and Biases in Trading Strategy Development

Some time ago I stumbled upon a CitiFX presentation from 2014 titled State of the Retail Foreign Exchange Market. Among other statistics, it contained two graphs showing the difference between the expected and actual returns of current and potential forex traders. One of them shows that while 84 percent of traders believe they can make money in the market, fewer than a third of them actually do.

FX Traders: Expectation vs Reality

While the share of people achieving positive returns is not that surprising to me, the gap between expectation and reality is. It is a great example of optimism bias, one of many cognitive biases that burden our minds and prevent us from succeeding in investing. Other well-known examples are the anchoring effect, loss aversion and hindsight bias.

The importance of psychology and discipline is mentioned again and again in the trading literature. It is often claimed that self-control and the right mindset account for 80% of success. So shouldn't the recent increase in the number of retail traders using automated trading systems (ATS) also lead to an increase in the number of profitable accounts? Shouldn't an emotionless machine be free of all the biases that affect humans? It turns out that the opposite is true. Not only can we introduce some of the common cognitive biases during the development process, but there is also a group of biases related mainly to trading system design and backtesting.

I have seen countless automated trading systems fail. Some of them were mine. Here is a list of the most common mistakes and biases in trading strategy development and use. It is meant as an introductory overview; if you want to read more on any particular item, I recommend Behavioral Finance and Wealth Management by Michael M. Pompian and Evidence-Based Technical Analysis by David Aronson.

  • Confirmation bias is the tendency to acquire and interpret evidence selectively in order to confirm our pre-existing beliefs. People tend to search for evidence in support of their hypothesis instead of looking for contrary evidence. Many of the following items in the list are forms of, or related to, confirmation bias.

  • Data dredging or data snooping is caused by extensive data mining on a limited data set (e.g. market prices), which results in finding misleading relationships. With enough tries it is usually possible to find correlations that occur solely by chance. These of course will not generalize well to unseen data. One possible remedy for data dredging is out-of-sample testing on a large enough data set. But with enough tries, some of the hypotheses will inevitably pass the out-of-sample test by chance too. For further details I recommend reading Five Myths About Data-Mining Bias.

  • Peeking bias or look-ahead bias is caused by using information in data analysis or simulation that would not have been known during the period being analysed. One way it can be introduced is by training or optimizing on later data and then testing out-of-sample on earlier data.

  • Overfitting - market data contain a lot of noise, and in all that noise we want to find genuine patterns. Choosing excessively complex models can lead to poor out-of-sample performance relative to in-sample performance because such models describe random error instead of the underlying relationship. In other words, the model does not generalize well. The solution to this problem is beyond the scope of this overview, but it usually involves penalizing complex models, testing the model's generalization ability on previously unseen data, or both.

  • Survivorship bias is caused by focusing on things that survived a certain process and omitting those that did not. For example, one can inadvertently exclude delisted stocks from backtests. Every day stocks are delisted because they are acquired, merged, reorganized or because they go bankrupt. If, say, you want to test a relative momentum system and you do not include the delisted stocks, your results might be distorted.

  • Pre-inclusion bias is caused by backtesting on stocks that are part of an index while assuming that they were always part of it. If you do not check carefully when each particular stock entered the index, you can end up with misleading results.

  • Sampling bias is a type of selection bias. It is caused by a flaw in the sample selection process, for example the exclusion of a subset of data from testing because of its particular attributes. Imagine backtesting a strategy for stocks. You decide that you need at least 20 years of historical data for each of them, so you exclude those with less history from the testing. This may be fine if your final strategy also trades only stocks with 20+ years of history. But if not, you have just introduced a sampling bias.

  • Selection bias is the improper selection of data for analysis. The resulting sample is not representative of the population being analysed. It is closely related to many other mentioned biases (e.g. confirmation, publication, sampling, peeking and survivorship bias).

  • Overlap bias - training and testing data sets are usually adjacent in time. If we intend to predict a price change n bars into the future (where n > 1), the targets of the last training cases are computed from bars that already belong to the test period. Because indicators also often exhibit strong serial correlation, these boundary effects can bias the results optimistically. A practical example: imagine using the relative strength index (RSI), computed over the last 30 bars, to predict the price change over the following 20 bars. Now let's compare the last case in the training set with the first case in the testing set. Their RSI values will be quite similar because only one of the 30 bars used for the computation has changed. The price changes will also be very similar unless a huge market move happened during one bar. The same will probably be true for the several following cases in the test set. So instead of two independent data sets, one used for training and one for testing, we now have two intertwined sets, and this can lead to overly optimistic test results.

  • Market fitting bias can be caused by selecting the test period at random from within the price data rather than from its end. The test then uses data that, although out-of-sample, comes from the same market situation as the training data.

  • Position bias or trend bias affects all "asymmetric" strategies, i.e. those that use different algorithms or rules for long and short trades. In such situations a price trend can greatly influence the results. One solution is to remove the underlying trend from the data (detrending), which produces a new time series with an average price change of zero. Note, however, that the market itself can behave differently in upward and in downward moves.

  • Granularity bias occurs when you use data with different periodicity for testing and for live trading. In live trading, price data usually arrive with each price change, sometimes many times per second. If you use OHLC data for testing, you know the first, last, highest and lowest price in each time period, but you cannot possibly know how the price moved during that time. So sometimes you might not know whether your stop-loss was hit before the trade exit or not.

  • Sample size bias - if you draw conclusions from backtesting, bear in mind that values derived from maxima and minima (e.g. drawdown) depend on the test length. With a longer testing period there is a bigger chance of obtaining more extreme values. This is intuitive. On the other hand, if you are computing a mean, large deviations are more likely in smaller samples. With a large enough sample the average value will be closer to the true mean.

  • Gambler's fallacy or Monte Carlo fallacy is something you would expect among casino players, but it can be seen in investing surprisingly often too. It is the mistaken belief that the probability of an independent random event depends on how often it has occurred in the past. For example, on social trading platforms you can see many systems based on martingale "money management", which means increasing position size after a loss or a series of losses. Needless to say, such strategies will ruin your account sooner or later.

  • Trusting stateful trading system luck - be wary of strategies that produce very different backtest results when started on slightly different dates (e.g. a few days or weeks apart). The start date may affect the strategy's asset holdings at a given time: the strategy can skip a trade signal because it is fully invested during one backtest run, yet take the same trade if you start the backtest on a different date.

  • Ignoring market impact - backtests often consider only the best bid and ask price for trading. But during live trading each trade has an impact on the market itself. For example, the market may not be able to absorb your order at the requested price, and you might end up with a partially filled order or a slightly worse price. There are many other situations like that. The market is a live beast which reacts and even adapts to your behaviour.

  • Automation bias is rarely mentioned in connection with investing, but I consider it highly relevant, especially in the light of the previous financial crisis. In short: people trust automated systems too much and often discard contradictory information provided by other sources even if it is correct. There is also research suggesting that introducing deliberate errors into automated systems is more effective in reducing automation bias than a mere warning that errors can occur. In the financial world, automation bias can manifest itself as over-reliance on risk models based on historical data, while concealed or even known risks not included in the models are overlooked. This reduces the ability to adapt to the ever-changing nature of risk.

  • Publication bias occurs when the decision to publish research findings depends not solely on the quality of the research but also, for example, on the tested hypothesis or the significance of the findings (e.g. authors are more likely to publish positive results than negative or inconclusive ones). A special case is outcome-reporting bias, which occurs when a particular analysis has multiple outcomes but their publication depends on the strength and / or result of each outcome. When using other people's research, remember that it can be affected by publication bias but also by other concealed biases, which can subsequently "infect" your research too.
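
To make the data dredging item above more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the "market" is pure Gaussian noise, each "strategy" goes long or short by coin flip, and the names `dredge` and `random_strategy_returns` are hypothetical. It only shows that the best of many edge-free strategies can look good in-sample.

```python
import random
from statistics import mean


def random_strategy_returns(market, rng):
    """Go long or short each day by coin flip: a strategy with no edge."""
    return [r if rng.random() < 0.5 else -r for r in market]


def dredge(n_strategies=500, n_days=500, seed=1):
    rng = random.Random(seed)
    # Hypothetical market: zero-mean daily returns, i.e. nothing to find.
    market = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    split = n_days // 2  # first half in-sample, second half out-of-sample
    results = []
    for _ in range(n_strategies):
        rets = random_strategy_returns(market, rng)
        results.append((mean(rets[:split]), mean(rets[split:])))
    # Pick the "winner" by in-sample mean return, as a data miner would.
    best_in, best_out = max(results)
    avg_out = mean(out for _, out in results)
    return best_in, best_out, avg_out


best_in, best_out, avg_out = dredge()
print(f"best strategy, in-sample mean daily return: {best_in:.5f}")
print(f"same strategy, out-of-sample:               {best_out:.5f}")
print(f"all strategies, out-of-sample average:      {avg_out:.5f}")
```

The in-sample winner is selected from 500 tries, so its in-sample figure is inflated by selection alone; its out-of-sample figure is just another draw from noise.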
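
For the overlap bias item above, one possible countermeasure is to drop some cases at the boundary between the training and the testing set. The sketch below is a conservative variant of my own choosing, not a prescribed method: it assumes each case uses `lookback` bars for the indicator and `horizon` bars for the target, and it removes enough cases that no bar is shared across the boundary. The function names are hypothetical.

```python
def bars_used(case, lookback, horizon):
    """Bars touched by one case: indicator window plus target window."""
    return range(case - lookback + 1, case + horizon + 1)


def purged_split(n_cases, n_test, lookback, horizon):
    """Split case indices into train/test with a gap at the boundary.

    Assumption: dropping lookback + horizon - 1 cases guarantees that the
    last training case and the first test case share no bars at all.
    """
    gap = lookback + horizon - 1
    test_start = n_cases - n_test
    train_end = test_start - gap  # exclusive upper bound
    if train_end <= 0:
        raise ValueError("not enough cases for this gap and test size")
    return range(0, train_end), range(test_start, n_cases)


# The RSI example from the list: 30-bar indicator, 20-bar target.
train, test = purged_split(n_cases=1000, n_test=200, lookback=30, horizon=20)

# Without a gap, adjacent cases 799 and 800 share almost all their bars.
naive_shared = set(bars_used(799, 30, 20)) & set(bars_used(800, 30, 20))
# With the gap, the boundary cases share none.
purged_shared = set(bars_used(train[-1], 30, 20)) & set(bars_used(test[0], 30, 20))

print(len(train), len(test), len(naive_shared), len(purged_shared))
# prints: 751 200 49 0
```

The price is 49 discarded cases in this example; a smaller gap trades some residual overlap for more training data.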
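
The granularity bias item above can be sketched in a few lines. The example assumes a long position with a stop-loss below and a profit target above the entry price; `Bar` and `long_exit_in_bar` are hypothetical names, not part of any trading library. When both levels fall inside the same OHLC bar, the bar alone cannot tell us which was hit first.

```python
from typing import NamedTuple, Optional


class Bar(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def long_exit_in_bar(bar: Bar, stop: float, target: float) -> Optional[str]:
    """Classify what one OHLC bar tells us about a long position's exit."""
    hit_stop = bar.low <= stop
    hit_target = bar.high >= target
    if hit_stop and hit_target:
        # Both levels were touched; the order is unknown at this granularity.
        return "ambiguous"
    if hit_stop:
        return "stop"
    if hit_target:
        return "target"
    return None  # neither level reached, position stays open


print(long_exit_in_bar(Bar(100.0, 101.0, 99.5, 100.5), stop=99.0, target=102.0))
print(long_exit_in_bar(Bar(100.0, 100.5, 98.8, 99.2), stop=99.0, target=102.0))
print(long_exit_in_bar(Bar(100.0, 102.5, 98.5, 101.0), stop=99.0, target=102.0))
```

A backtester has to resolve the ambiguous case somehow, for instance by assuming the worse outcome or by looking at finer-grained data for that bar. Whichever rule is chosen, it is worth counting how many trades it affects.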
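
The first half of the sample size bias item above (extreme values depend on test length) can also be checked with a small simulation. The equity curve here is an assumed random walk with zero-mean returns, and `max_drawdown` and `simulate_equity` are illustrative helpers. The point is only that the maximum drawdown over a longer window can never be smaller than over a shorter window contained in it.

```python
import random


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def simulate_equity(n_bars, seed=7):
    """Hypothetical equity curve: compounded zero-mean Gaussian returns."""
    rng = random.Random(seed)
    equity = [1.0]
    for _ in range(n_bars):
        equity.append(equity[-1] * (1.0 + rng.gauss(0.0, 0.01)))
    return equity


equity = simulate_equity(4000)
# Same curve, evaluated over increasingly long test periods.
drawdowns = {n: max_drawdown(equity[: n + 1]) for n in (250, 1000, 4000)}
for n, dd in drawdowns.items():
    print(f"first {n:>4} bars: max drawdown {dd:.1%}")
```

So a drawdown figure from a short backtest should be read as a lower bound on what a longer period is likely to show, not as a typical value.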

Designing automated trading systems is not rocket science, although it might seem that way in the beginning. People who are able to keep all these potential pitfalls in mind have a good chance of being among the thirty percent of profitable traders.