Testing The Ivy 10 Trading System

December 31, 2012 5:00 am


The Ivy Portfolio article posted several weeks back resulted in a lot of good discussion. The main thrust of many criticisms of the Ivy Portfolio came down to curve fitting. Was the Ivy Portfolio curve fitted to produce such great results? Some readers also questioned the quality of the historical data provided by ETF Replay, but that is beyond the scope of this article, as I have no evidence that their data is tainted. Besides, the curve fitting aspect is much more relevant to our main theme here at System Trader Success.

Curve fitting can be a huge problem as there are many ways to curve fit a system. It can even be introduced into your testing without you knowing about it.  In fact, it’s hard to get out of bed in the morning without curve fitting. Within the comments of the original article there were some good examples of where curve fitting can creep into your testing. For example, curve fitting could be introduced by…

  • Choosing the lookback period for the simple moving average filter
  • Choosing the number of top instruments to trade
  • Choosing the instruments to trade
  • Choosing the ranking formula

Just about anytime you make a decision when building a trading system, you introduce the possibility of curve fitting. That’s why it’s important to think carefully about how you choose particular values. In my personal endeavors I always test trading system parameters independently over a range of values. When looking at the results I want to see some consistency among the tested values. That is, if I’m testing a lookback period for a moving average, I don’t want to see a large variation between a 10-period and a 20-period lookback. I’ve seen systems where you change one value by a single increment and the results change dramatically. This is a clear warning flag. Instead, I prefer to see a clustering of results, which gives me confidence that the system is not curve fitted to one select value.
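One way to make the "single increment" warning flag concrete is to scan the results of a parameter sweep for sharp changes between adjacent values. The sketch below is only an illustration: the function name `sharp_jumps`, the 50% threshold, and the numbers are all made up and are not taken from the tests in this article.

```python
def sharp_jumps(results, max_rel_change=0.5):
    """Return adjacent parameter pairs whose results differ sharply.

    results: dict mapping a parameter value to a performance metric
             (for example, total return).
    max_rel_change: hypothetical threshold; a pair is flagged when the gap
             between the two results exceeds this fraction of the larger one.
    """
    params = sorted(results)
    jumps = []
    for a, b in zip(params, params[1:]):
        scale = max(abs(results[a]), abs(results[b]))
        if scale > 0 and abs(results[b] - results[a]) / scale > max_rel_change:
            jumps.append((a, b))
    return jumps


# Made-up sweep results for illustration only.
smooth = {8: 0.95, 9: 1.00, 10: 1.02, 11: 0.98, 12: 0.96}
spiky = {8: 0.30, 9: 0.35, 10: 1.40, 11: 0.32, 12: 0.28}

print(sharp_jumps(smooth))  # []
print(sharp_jumps(spiky))   # [(9, 10), (10, 11)]
```

A clustered sweep like `smooth` produces no flags, while `spiky` shows the kind of isolated peak that suggests a value was fitted to the data.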

In this article I would like to perform some simple quick-and-dirty testing to see if we can gain more confidence in the performance of the Ivy Ten Portfolio. The tests are far from exhaustive but should shed some light on the robustness of the trading system which was both created and tested in the original article. The system is a simple relative strength portfolio that consists of 10 different ETFs. Every month the top three producing ETFs are picked from the 10 potential ETFs. A simple 5-period moving average is also applied to a monthly chart to act as a regime filter. Trades are only taken if a given ETF is above the moving average. The system is called the Ivy 10 and is explained in more detail here. The same testing assumptions and methods used during the original article are used for all the tests within this article.
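The monthly selection step described above can be sketched roughly as follows. The ranking formula is not spelled out here, so the 3-month return used for ranking is an assumption, as are the function names, the made-up tickers and prices, and the choice to leave a slot in cash when a top-ranked ETF fails the moving average filter.

```python
def sma(values, period):
    """Simple moving average of the last `period` values, or None if too short."""
    if len(values) < period:
        return None
    return sum(values[-period:]) / period


def pick_holdings(monthly_closes, rank_lookback=3, sma_period=5, top_n=3):
    """Pick this month's holdings.

    monthly_closes: dict mapping ticker -> list of month-end closes, oldest first.
    rank_lookback:  months used for the relative strength score (assumed).
    """
    # Score each ETF by its return over the last `rank_lookback` months.
    scores = {}
    for ticker, closes in monthly_closes.items():
        if len(closes) > rank_lookback:
            scores[ticker] = closes[-1] / closes[-1 - rank_lookback] - 1.0

    # Keep the top-ranked ETFs, then apply the regime filter to each one.
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
    holdings = []
    for ticker in ranked:
        average = sma(monthly_closes[ticker], sma_period)
        if average is not None and monthly_closes[ticker][-1] > average:
            holdings.append(ticker)
    return holdings


# Hypothetical tickers and prices, for illustration only.
prices = {
    "AAA": [100, 102, 104, 106, 108, 110],
    "CCC": [100, 99, 98, 97, 96, 95],
    "DDD": [50, 50, 50, 50, 50, 60],
}
print(pick_holdings(prices))  # ['DDD', 'AAA']
```

Here CCC ranks in the top three but closes below its 5-month average, so it is not bought.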

Out of Sample Results

The concepts and trading instruments that make up the Ivy Ten trading system were published in early 2009. Thus, the book was written in 2008 or earlier. Just for the sake of testing, let’s assume 2007 as our starting point for out-of-sample results. Yes, there may be some overlap which includes some in-sample results, but starting from 2007 will give us 191 trades over 1,500 days. This should be a nice number of trades for our out-of-sample testing. The following trading results are from 2007 through 2012.

Ivy-10 Trading System

Total Return: 123%
Drawdown: -28.7%
CAGR: 14.4%
Sharpe: .83
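For readers who want to check how figures like these relate to each other, here is a minimal sketch of the standard CAGR and maximum drawdown calculations. It assumes a full six-year period for 2007 through 2012; the exact dates ETF Replay used are not given, so the result only approximates the reported CAGR. The function names and the sample equity curve are illustrative.

```python
def cagr(total_return, years):
    """Compound annual growth rate from a total return (1.23 means +123%)."""
    return (1.0 + total_return) ** (1.0 / years) - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# 123% total return over an assumed six years.
print(round(cagr(1.23, 6) * 100, 1))  # 14.3

# Made-up equity curve: the drop from 120 to 90 is a 25% drawdown.
print(max_drawdown([100, 120, 90, 130]))  # -0.25
```

The 14.3% result is close to the 14.4% reported above; the small gap is consistent with a test period slightly shorter than six full years.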

Below is the equity graph of the trading system in green, and our benchmark (SPY) in blue. Overall, it appears the trading system has held up nicely since it was first conceived in 2008.

 

Different Instruments

It’s my impression the Ivy Portfolio was designed to mimic the trading instruments of the Ivy endowments by picking ETFs which represented the large asset classes utilized by the endowments. However, were the instruments picked so as to show good results? In other words, is the trading system curve fitted so that it only performs well on the portfolio of ETFs proposed by the author? To test the robustness of the system I created several different portfolios of broad market ETFs. The assumption here is that the system is less likely to be curve fitted if it demonstrates solid performance over different portfolios of ETFs. The following test was conducted on data from 2002 through December 28, 2012.

International Portfolio – ECH, EGPT, EPHE, EPU, EWZ, FXI, GXG, IDX
U.S. Leveraged Portfolio – DDM, DUG, QLD, ROM, SSO, URE, UYG, UYM
U.S. Major Market Indices Portfolio – DIA, IWM, NYC, QQQ, SPY
Sample Portfolio – This was an example portfolio that was included when I joined ETF rewind. It included: EWC, GLD, IEF, SHY, SPY
U.S. Sectors and Group Portfolio – IGN, XLB, XLE, XLF, XLI, XLK, XLP, XLU, XLV, XLY
Bond Portfolio – BLV, BND, CFT, LQD, PCY, SHY, TIP, TLT, WIP
Commodities Portfolio – DIA, IWM, NYC, QQQ, SPY
Currency Portfolio – CEW, FXA, FXC, FXE, FXF, FXY, UDN

The Ivy-10 portfolio is on the high side, but it’s certainly not an outlier like the International Portfolio. Five of the other portfolios also generated double-digit CAGR results. The U.S. Leveraged Portfolio produced a nearly identical CAGR but a lower Total Return because its ETFs have only existed for a short time. What struck me as I performed this test is that the trading system performed the two main tasks it was designed to do when compared to the benchmark: 1) increase total returns and 2) reduce drawdown. All portfolios executed against the relative strength strategy did just that.

Testing Lookback Periods

The trading system utilizes a simple moving average applied to a monthly chart to determine if a given instrument is in a bullish or bearish mode. A given ETF will only be purchased if it’s above this moving average. In this test I varied the lookback period of this moving average over the values two through 16. Below is a chart containing the results as well as a graph depicting the total return as the lookback period was modified.

The following test was conducted on data from 2002 through December 28, 2012.

The recommendation of using a lookback period of 10, as stated in the Ivy Portfolio book, is clearly not an optimal value. In my version of the rotational system I halved this value to produce a lookback period of five. The value five produces similar results to lookback periods of two, three, four, and seven. The lookback period of six produces the best results. The lookback period of five does not appear to be an outlier and there is a clear orderly pattern to the values. All lookback periods produced positive returns. It’s interesting to note that the longer the lookback period, the lower the total return. This is clearly seen in the falling trendline in red. This makes sense: the longer you wait to jump into a momentum trade, the lower your returns are likely to be.
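A lookback sweep of this kind can be sketched in a few lines. This is not the ETF Replay test itself: it applies the moving average filter to a single synthetic price series rather than the ranked portfolio, and the function names and generated data are assumptions made for illustration. Every lookback starts trading on the same month so that longer lookbacks are not penalized for their warm-up period.

```python
import math


def filtered_total_return(closes, lookback, start):
    """Total return from holding one asset only in months that follow a
    month-end close above its `lookback`-month simple moving average."""
    equity = 1.0
    for t in range(start, len(closes) - 1):
        window = closes[t - lookback + 1 : t + 1]
        if closes[t] > sum(window) / lookback:
            equity *= closes[t + 1] / closes[t]
    return equity - 1.0


def sweep_lookbacks(closes, lookbacks=range(2, 17)):
    """Run the filter for each lookback, all starting from the same month."""
    start = max(lookbacks) - 1  # first month where the longest average exists
    return {n: filtered_total_return(closes, n, start) for n in lookbacks}


# Synthetic monthly closes: a steady uptrend with a slow cycle on top.
closes = [100 * (1.01 ** i) * (1 + 0.1 * math.sin(i / 6)) for i in range(120)]

for lookback, total in sweep_lookbacks(closes).items():
    print(f"lookback {lookback:2d}: total return {total:7.1%}")
```

Reading the printed values side by side is the same exercise as reading the chart: look for an orderly pattern rather than one isolated best value.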

Testing Number of Top Ranked

The trading system will pick the top three ranked instruments to trade. In this test I will vary the number of top instruments to test.

The following test was conducted on data from 2002 through December 28, 2012.

Again, there is a clear orderly progression of reduced returns as you increase the number of top ranked instruments. This makes sense when you consider as you increase the number of top ranked instruments to trade you are “diversifying” your holdings. This reduces returns and reduces drawdown. Notice the drawdown also falls as you increase the number of instruments chosen. This is a great example of balancing returns vs drawdown – a classic dilemma. All values produce positive returns. The recommended value of three does not appear to be an outlier.

Conclusion

It appears the trading concept can be applied to other portfolios of ETFs. It also appears two of the key trading parameters (the lookback period and the number of top ranked instruments to pick) are not highly optimized. While this is far from a complete test, it should give some confidence in the relative strength trading system proposed in the book, The Ivy Portfolio.

Download

Here is an Excel document that contains the individual trade information as generated by ETFReplay.

Jeff is the founder of System Trader Success – an inBox magazine dedicated to sharing great ideas and concepts from the world of automated trading systems.


12 Comments

  • I have been using a system like this for years (using a 140d MA). I have used fib numbers as my RS calc — 21, 34 and 55d weighted. The best part of the system is the ability to get you out of a bear market. It saved my bacon in 2000-03, 2008-09. In fact the benefits are understated because of the loss of mental capital and increased stress in bear markets. “The ability to stay calm …….When all around you are losing their head…..”

  • Nice article! I tried to replicate your various portfolios using my own software and Yahoo data, but was unable to do so from 2002. E.g., I can only run the Ivy10 from 2007 if I demand data for all ETFs. Can you add an additional column to indicate the starting month for each portfolio where you have data for all ETFs?

    • Yes you are correct. Not all the ETFs were available until 2007. Before then you had to work with only the ETFs that were available. I just added an Excel document at the bottom of the article which was generated by ETF Replay. It contains more detail on each trade. Hope this helps.

  • Hi Jeff
    Really great article on an interesting topic. I like your writing style and analysis. I know for myself that the max. drawdown will be too painful for me and I would be looking for ways to minimize it.

    Will surely be looking to read more from you. Greetings from Denmark.

  • Jeff, the International Portfolio is awesome. See, that’s why the Ivy needs to be tweaked to include momentum. If you had allocated say 20% to International stocks in the Ivy and see that International is beating Domestics and other Asset groups, you could increase it to say 40%. Just run a screen for country ETFs and choose the best. Or you could go with VWO for a broader exposure. And stay with it till it goes below your chosen MA or the 3-month return turns negative.

    • Ross, my fear is this: curve fitting. The International Portfolio may not be reproducible. Just because it did well over the past few years does not mean it will continue to do so. In hindsight it looks great, but hindsight is not a guarantee it will continue to work. Expecting a 30% CAGR is not realistic. In short, I’m leery of attempting to optimize the returns too much. For example, I could have also picked a portfolio of Apple stock, gold and international stocks and created a killer portfolio that performed well over the past 10 years. However, that would be optimizing my trading instruments: I would be using my current knowledge of the past to select what would have produced great results in my backtests. I’m more interested in a robust system that will work well over longer periods of time. One that shows more consistency, and will likely continue into the future.

  • Jeff, if you look at the last 10 years, yes that would be curve fitting. I look at 3-month and 6-month returns and upward sloping MA. So if gold is breaking out and beating the other asset classes, I’d increase my allocation from 20% to 40%, or from 0% to 20%. Go with the winner.

  • Glenn Madison

    I’m enjoying your series on this. Your analysis is well-written and understandable by the not-so-knowledgeable like me.

    The simplicity is what drew me to Ivy in the first place. Your tweaks are very interesting, and your commitment to KISS is heartening.

    Good luck to you.

  • Great article, as per usual. It really is a simple strategy: select a set of ETFs (properly diversified!), rank them by RS and by whether they are above or below a longer-term MA, and buy the top ones until they lose their spot. Simple as pie. Trouble is just this: we have no way of knowing whether this will work in the future. While it may protect us from crushing bear markets (above the MA, else no buys at all), which indeed is already a victory, I fear that we might be setting ourselves up. But it does sound terrific, I’ll give you that!

    • What I forgot to add: the reasons why I am rather sceptical:
      1) the test range is very short in itself (10 years), and even shorter if we take into account that some ETFs didn’t exist until 2007…
      2) we get distorted results for all the analysis posted here due to this change in starting date between the various ETFs: it’s not quite accurate to say that the best lookback period for say the MA is 6m when some of the test subjects were included only halfway through the testing period. Then again, conducting this kind of analysis would not be very trustworthy had the starting period been the date that ALL ETFs were available (because then we’d only test about 7 years of data, way too short).

      All in all some interesting thoughts!
