Walk forward optimization (WFO) has the potential to be an excellent way to both verify and optimize trading systems. In practice, however, most retail trading platform WFO implementations today have limitations that reduce their ultimate value:
1. Markets are so efficient that systems may need to be reoptimized very frequently. It is not clear how to make use of the most recent data while maintaining statistical validity.
2. Traditional WFO cannot distinguish between stable systems (or parameter sets) and those that outperform due to chance. There is no ability to consider the solution space as a whole, that is, to see how profitable the surrounding or neighboring systems are.
3. There is not enough “recent” data for the WFO to use, and it is not clear what the window size should be. Rolling WFO requires an arbitrary guess at the window, while anchored WFO uses all the data no matter how old.
4. Most retail WFO engines can only surface a single best result, which is highly random.
5. Sample sizes may vary (see my latest observation). Small sample sizes will exhibit greater variation!
In order to address those problems, I present the following solutions.
I suggest that the WFO should surface the top N best-performing results and that those results should be averaged. This change also enables portfolios of WFO results to be traded.
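As a rough illustration of surfacing and averaging the top N results, here is a minimal Python sketch. The function name top_n_average, the parameter names (lookback, stop) and the scores are all made up for the example, and it assumes that "averaged" means taking the mean of each parameter value and of the score across the top N results; other readings, such as trading all N as a portfolio, are equally possible.

```python
from statistics import mean

def top_n_average(results, n):
    """results: list of (params_dict, score) pairs from one optimization run.

    Returns the top n results, the per-parameter mean of those results,
    and their mean score. Averaging parameter values is an assumption.
    """
    if not results or n < 1:
        raise ValueError("need at least one result and n >= 1")
    # Rank by score, best first, and keep the top n (or fewer if short).
    top = sorted(results, key=lambda r: r[1], reverse=True)[:n]
    keys = top[0][0].keys()
    avg_params = {k: mean(params[k] for params, _ in top) for k in keys}
    avg_score = mean(score for _, score in top)
    return top, avg_params, avg_score

# Hypothetical optimization output: (parameters, in-sample score).
results = [
    ({"lookback": 10, "stop": 2.0}, 1.2),
    ({"lookback": 20, "stop": 2.5}, 1.5),
    ({"lookback": 30, "stop": 3.0}, 1.4),
    ({"lookback": 90, "stop": 0.5}, 0.3),
]

top, avg_params, avg_score = top_n_average(results, 3)
print(avg_params, avg_score)
```

The list returned as top could also be traded as a small portfolio instead of being collapsed into one averaged parameter set.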
Weighted Anchored Walk Forward Analysis
Weighted WFO uses all the data, like anchored WFO, except that it weights newer data more heavily (that is, it decreases the value of older data). The objective is to give more value to stable parameters while still allowing the system to adapt to new data. The decay can be linear or exponential: all the data is taken into account, but older profits and losses are decayed while newer ones count for more. This should give stable parameters or systems an edge in selection over relatively unstable configurations. It also solves the window problem of deciding what window to use for the in-sample portion.
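The decay idea can be sketched in a few lines of Python. This is only one possible reading: the function names, the half_life parameter and the per-period P&L figures are assumptions made for illustration, and the score is taken to be a simple weighted average of per-period profit and loss.

```python
def decay_weights(n_periods, kind="exponential", half_life=4.0):
    """Weights for periods ordered oldest -> newest; the newest gets 1.0."""
    weights = []
    for i in range(n_periods):
        age = n_periods - 1 - i  # 0 for the most recent period
        if kind == "exponential":
            w = 0.5 ** (age / half_life)  # halves every half_life periods
        elif kind == "linear":
            w = (n_periods - age) / n_periods  # falls off in equal steps
        else:
            raise ValueError("kind must be 'exponential' or 'linear'")
        weights.append(w)
    return weights

def weighted_score(pnl, weights):
    """Weighted average P&L over all periods (anchored: nothing is dropped)."""
    return sum(p * w for p, w in zip(pnl, weights)) / sum(weights)

# Hypothetical per-period P&L for two parameter sets, oldest first.
stable = [10, 10, 10, 10, 10, 10, 10, 10]
hot = [-5, -5, -5, -5, -5, -5, 30, 30]  # only the last two periods are good

w = decay_weights(len(stable), kind="exponential", half_life=4.0)
print(weighted_score(stable, w), weighted_score(hot, w))
```

With this made-up data, a short rolling window covering only the last two periods would prefer the second set, while the decayed score over all periods prefers the steady one. Note that the half-life is still a choice, but it is a softer one than a hard window cutoff.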
Variation Punishment Algorithm
VPA does not weight based on time but instead punishes parameters that have higher variation. This optimizer can be anchored or non-anchored; in either case, the VPA punishes systems based on the historical variation in their performance over all the data. As the simplest example, imagine a one-variable system with a bi-modal optimum, with optimal values at 0 and 100. Say the optimum at 100 produces returns like +100, -100, +100, -100, while the other mode produces +90, +80, +94, +79. The VPA will punish the results of the former because of the variation, or volatility, in its returns. The optimizer will favor systems or parameters that perform consistently well over time. A sensitivity analysis could be used to place systems into a smaller set of buckets for this analysis, reducing the number of combinations.
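One simple way to express the punishment is to subtract a multiple of the standard deviation from the mean return. The sketch below uses that form purely as an assumption; the text does not specify the penalty function, and the penalty multiplier and the third "erratic" series are hypothetical additions to the two return series from the example above.

```python
from statistics import mean, pstdev

def vpa_score(returns, penalty=1.0):
    """Mean return minus a penalty proportional to its standard deviation."""
    return mean(returns) - penalty * pstdev(returns)

candidates = {
    "param=100": [100, -100, 100, -100],  # from the example above
    "param=0": [90, 80, 94, 79],          # from the example above
    "erratic": [300, -100, 300, -100],    # hypothetical: higher mean, wild swings
}

scores = {name: vpa_score(r) for name, r in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The hypothetical third series has the highest raw average return, yet once the variation penalty is applied the steady series at parameter value 0 comes out on top.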
Quadrant Grouping/Sensitivity Analysis
This solution attempts to tackle the problem of picking a best system that is surrounded by bad systems. The idea is to perform a sensitivity analysis that groups every parameter into 3 or 4 classes and averages the results over each class. So, no matter how many values a parameter can take on, there are only 3 or 4 possibilities. As an example, with 3 classes per parameter, a system with 3 parameters can only exist in 3*3*3 = 27 quadrants. The WFO analyzes the averaged results of the quadrants to determine which results to surface. The best results from the best quadrants are surfaced.
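The grouping step might look like the following Python sketch. It assumes equal-width classes over each parameter's range and a plain mean per quadrant; the function names, ranges and scores are invented for the example, and two parameters are used instead of three to keep the data short.

```python
from collections import defaultdict
from statistics import mean

def bucket(value, lo, hi, n_classes=3):
    """Map a parameter value to one of n_classes equal-width classes."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_classes)
    return max(0, min(idx, n_classes - 1))  # clamp the edges into range

def best_quadrant(results, ranges, n_classes=3):
    """results: list of (params_tuple, score); ranges: (lo, hi) per parameter."""
    cells = defaultdict(list)
    for params, score in results:
        key = tuple(bucket(v, lo, hi, n_classes)
                    for v, (lo, hi) in zip(params, ranges))
        cells[key].append((params, score))
    # Average the score inside each quadrant, then pick the best quadrant.
    averages = {k: mean(s for _, s in members) for k, members in cells.items()}
    best_key = max(averages, key=averages.get)
    # Surface the best individual result from within the best quadrant.
    best_params, _ = max(cells[best_key], key=lambda r: r[1])
    return best_key, averages[best_key], best_params

# Hypothetical results: an isolated spike at (5, 5) with poor neighbors,
# and a plateau of decent results around (75, 75).
results = [
    ((5, 5), 10.0), ((10, 10), -4.0), ((20, 15), -3.0),
    ((70, 70), 4.0), ((75, 80), 5.0), ((85, 65), 4.5),
]
ranges = [(0, 90), (0, 90)]

best_key, best_avg, best_params = best_quadrant(results, ranges)
print(best_key, best_avg, best_params)
```

In this made-up data the single highest score sits at (5, 5), but its quadrant averages poorly because of its neighbors, so the result surfaced comes from the plateau instead.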
The author is passionate about markets. He has developed top ranked futures strategies. His core focus is (1) applying machine learning and developing systematic strategies, and (2) solving the toughest problems of discretionary trading by applying quantitative tools, machine learning, and performance discipline. You can contact the author at email@example.com.