I have the feeling that there is some subtle yet widespread misconception about data-driven research in financial markets, and I will take this article, Seeking Alpha — Not Even Wrong: Mercenary Trader, as a starting point for the discussion. Plain old boring markets? In fact they are very complex. As such, predictions based on data mining of a single historical variable, or on a single cherry-picked pattern, are almost always worse than useless, because they ignore a core confluence of factors.
What matters is the ability to assess a confluence of key factors in the present, as they impact important market relationships here and now. This is true for any field and for any prediction method, be it AI or human reasoning. The hard part, of course, is finding these features and combining them.
Is this really that different from properly done data mining? If along the way researchers can find some sort of explanation for what they observe, all the better: markets are a complex sea of swirling and interlocking variables, and it is the historical drivers and qualitative cause-effect relationships that have lasting value.
It is not the output of a spreadsheet that matters (pattern-based cherry picking lacks insight into what created the results), but the qualitative relationships truly attributable to joint causation of various outcomes, assessed case by case, with a very big nod to history and context.

Something I noticed quite some time ago when trading 30-year US bond futures was that whenever my limit orders were executed, I was immediately at a loss. What this means is better explained by an example.
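To make the mechanics concrete, here is a minimal sketch of price-time (FIFO) matching at a single price level. This is a toy illustration, not CME's actual algorithm, and all names in it are made up:

```python
from collections import deque

# Toy FIFO (price-time priority) queue of limit buys resting at one price.
# Oldest orders fill first; an incoming market sell sweeps the queue.
resting_bids = deque()  # items are (order_id, quantity)

def add_limit_buy(order_id, qty):
    resting_bids.append((order_id, qty))

def match_market_sell(qty):
    """Fill an incoming sell against resting bids in strict time priority."""
    fills = []
    while qty > 0 and resting_bids:
        oid, resting = resting_bids.popleft()
        traded = min(qty, resting)
        fills.append((oid, traded))
        qty -= traded
        if resting > traded:  # a partially filled order keeps its queue position
            resting_bids.appendleft((oid, resting - traded))
    return fills

add_limit_buy("A", 5)  # A joined the queue first
add_limit_buy("B", 5)
fills = match_market_sell(7)
print(fills)  # [('A', 5), ('B', 2)]
```

Note that a resting order only trades when aggressive flow crosses it, which is one intuition for why fills on limit orders so often coincide with the market moving against you.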
What is a matching algorithm? CME explains it as:

Lately I have been looking for a more systematic way to get around overfitting, and in my quest I found it useful to borrow some techniques from the Machine Learning field. If you think about it, a trading algorithm is just a form of AI applied to price series. As complexity increases, performance on the training set improves while predictive power degrades.
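A toy illustration of this effect, using entirely synthetic data and numpy only: fit polynomials of increasing degree to a noisy linear trend and compare in-sample and out-of-sample error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "signal": a simple linear trend plus noise.
x = np.linspace(0.0, 1.0, 40)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, size=x.size)

# Alternate samples between a training set and a held-out test set.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def fit_errors(degree):
    """Return (in-sample MSE, out-of-sample MSE) for a polynomial fit."""
    poly = np.poly1d(np.polyfit(x_train, y_train, degree))
    return (np.mean((poly(x_train) - y_train) ** 2),
            np.mean((poly(x_test) - y_test) ** 2))

for degree in (1, 3, 9):
    train_mse, test_mse = fit_errors(degree)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Training error can only go down as the degree grows, while the gap to the held-out error widens: the higher-degree fits are learning the noise.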
Something particularly interesting is that the very same feature can behave quite differently across settings. In this new scenario, a model using the season of the year as a feature is more likely to result in an overfitted model, because the underlying dynamics are different. Luckily for us, there is a whole set of techniques developed in the Machine Learning field to perform feature selection.
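For instance, one of the simplest such techniques is a univariate filter: score each candidate feature against the target and keep only the top scorers. A sketch on synthetic data (the feature layout and coefficients here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 6 candidate features, but only columns 0 and 3
# actually drive the target.
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(0.0, 0.5, size=200)

# Univariate filter: rank features by |correlation| with the target
# and keep the top k. Cheap, but blind to feature interactions.
scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
selected = sorted(np.argsort(scores)[-2:].tolist())
print("selected features:", selected)  # [0, 3]
```

Wrapper and embedded methods (e.g. recursive feature elimination or L1 regularisation) handle feature interactions better, at a higher computational cost.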
I recommend the following paper for an overview of the methods:

This is a quick follow-up on my previous post on Quantile normalization. The advantages are symmetric to those discussed in the previous post, as long as your backtest allows for realistic modelling of trade execution.
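As a sketch of the idea (using a deterministic toy return series, not real data): drop the k worst and/or m best daily returns before computing a performance statistic such as the Sharpe ratio, and see how much the figure depends on a handful of extreme days.

```python
import numpy as np

def trimmed_sharpe(returns, trim_worst=0, trim_best=0):
    """Annualised Sharpe ratio after dropping the trim_worst smallest
    and trim_best largest daily returns -- a robustness check."""
    r = np.sort(np.asarray(returns, dtype=float))
    r = r[trim_worst:len(r) - trim_best]
    return np.sqrt(252.0) * r.mean() / r.std(ddof=1)

# Deterministic toy series: small positive drift plus oscillation...
daily = 0.001 + 0.005 * np.sin(np.arange(252))
daily[10] = -0.08  # ...and one outsized loss

raw = trimmed_sharpe(daily)
robust = trimmed_sharpe(daily, trim_worst=1)
print(f"raw Sharpe: {raw:.2f}, Sharpe without the worst day: {robust:.2f}")
```

A large gap between the two numbers tells you the headline Sharpe hinges on a single day, which is exactly the fragility this estimator is meant to expose.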
Trimming out the worst returns is particularly useful for strategies prone to single big losses (usually mean-reversion strategies of some kind), whereas trimming the best returns is more useful for strategies with occasional big positive days.

Math Trading: Applying quantitative analysis to gain an edge in financial markets.
Order matching algorithms, posted on April 21 by mathtrading.
Trimmed performance estimators, posted on October 25 by mathtrading.