Using Genetic Programming to evolve Trading Strategies
A friend and I recently worked together on a research assignment where we successfully used Genetic Programming (GP) to evolve solutions to a real-world financial classification problem. This problem, called security analysis, involves determining which securities ought to be bought in order to realize a good return on investment in the future. To find a solution we used Genetic Programming to evolve a population of decision trees that could perform security analysis on sixty-two of the technology stocks listed on the S&P 500. That is, we evolved decision trees capable of classifying those stocks according to whether they should be bought or sold short.
Security Analysis Decision Trees
During the study we evolved two types of security analysis decision trees. The first used only indicators from fundamental analysis and the second used only indicators from technical analysis. Fundamental analysis is a method of evaluating a security to measure its intrinsic value by examining related economic, financial and other qualitative and quantitative factors. Technical analysis is a method of evaluating securities by analyzing statistics generated by market activity.
A strategy for security analysis, regardless of whether it uses technical or fundamental indicators, will consist of a number of rules for making investment decisions. That strategy can be represented as a decision tree where the terminal nodes represent investment decisions and the functional nodes represent rules based either on technical or fundamental indicators. Due to this fact, many existing investment strategies are represented in the form of decision trees.
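To make the representation concrete, here is a minimal Python sketch of such a decision tree. It is not the code from our study: the class names, the indicator names (`price_to_earnings`, `rsi_14`) and the thresholds are all made up for illustration, and it assumes every rule is a simple "indicator below threshold" test.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Decision:
    """Terminal node: an investment decision."""
    action: str  # "BUY" or "SELL_SHORT"


@dataclass
class Rule:
    """Functional node: compares one indicator against a threshold."""
    indicator: str            # hypothetical indicator name
    threshold: float          # hypothetical cut-off value
    if_below: "Node"          # branch taken when indicator < threshold
    if_at_or_above: "Node"    # branch taken otherwise


Node = Union[Decision, Rule]


def classify(node: Node, indicators: Dict[str, float]) -> str:
    """Walk from the root to a terminal node and return its decision."""
    while isinstance(node, Rule):
        value = indicators[node.indicator]
        node = node.if_below if value < node.threshold else node.if_at_or_above
    return node.action


# Illustrative strategy only: indicators and thresholds are invented.
strategy = Rule(
    "price_to_earnings", 15.0,
    if_below=Decision("BUY"),
    if_at_or_above=Rule(
        "rsi_14", 70.0,
        if_below=Decision("BUY"),
        if_at_or_above=Decision("SELL_SHORT"),
    ),
)

print(classify(strategy, {"price_to_earnings": 22.0, "rsi_14": 81.0}))  # SELL_SHORT
```

Classifying a stock is just a walk from the root to a leaf, which is part of what makes this representation easy to read and to evolve.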
In total, forty-two different indicators were selected from technical analysis and fundamental analysis. Each evolved strategy assumed a fixed holding period of three, six, nine or twelve months. The decision trees were then back-tested using market data from 2011 to 2013.
Genetic Programming is a specialization of the Genetic Algorithm. Genetic Algorithms are population-based, meaning that they operate on a population consisting of many different individuals. Each individual is represented by a genotype (usually encoded as a vector). Genetic Algorithms model the process of genetic evolution through a number of operators: the selection operator, which models survival of the fittest; the crossover operator, which models sexual reproduction; and the mutation operator, which models the genetic mutations that occur randomly to individuals in a population. These operators, when combined, produce what computer scientists refer to as a Genetic Algorithm.
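The three operators can be sketched in a few lines of Python. This is a toy example under stated assumptions, not the algorithm from our study: genotypes are bit vectors, the fitness function simply counts 1-bits, selection is done by tournament, and the population size, mutation rate and number of generations are arbitrary.

```python
import random
from typing import List

Genotype = List[int]


def fitness(individual: Genotype) -> int:
    """Toy fitness (an assumption): the number of 1-bits in the genotype."""
    return sum(individual)


def select(population: List[Genotype], rng: random.Random, k: int = 3) -> Genotype:
    """Tournament selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a: Genotype, b: Genotype, rng: random.Random) -> Genotype:
    """One-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(individual: Genotype, rng: random.Random, rate: float = 0.02) -> Genotype:
    """Flip each gene independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in individual]


def evolve(pop_size: int = 30, length: int = 20,
           generations: int = 40, seed: int = 0) -> Genotype:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Each new individual: select two parents, cross them, mutate the child.
        population = [
            mutate(crossover(select(population, rng),
                             select(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

In a real application the only problem-specific parts are the genotype encoding and the fitness function; the selection, crossover and mutation loop stays much the same.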
The difference between a Genetic Algorithm and Genetic Programming is the way in which individual genotypes are represented. In Genetic Algorithms genotypes are represented either as strings or as vectors, whereas in Genetic Programming they are represented using tree data structures. The crossover and mutation operations on tree structures can happen in a few ways: a sub-tree is swapped out, a leaf node is removed or changed, or the values of some node are adjusted. An illustration of this is shown below.
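For readers who prefer code to pictures, the following is a rough Python sketch of two of these tree operations. The `Node` class, the function names and the rule labels are hypothetical, and the operators are simplified (for example, there is no depth limit and no check that a swapped-in sub-tree makes financial sense).

```python
import copy
import random
from typing import List, Optional


class Node:
    """A tree node: a rule if it has children, a decision if it is a leaf."""

    def __init__(self, label: str, children: Optional[List["Node"]] = None):
        self.label = label
        self.children = children or []

    def __repr__(self) -> str:
        if not self.children:
            return self.label
        return f"{self.label}({', '.join(map(repr, self.children))})"


def all_nodes(root: Node) -> List[Node]:
    """Return every node in the tree in pre-order."""
    nodes = [root]
    for child in root.children:
        nodes.extend(all_nodes(child))
    return nodes


def subtree_crossover(parent_a: Node, parent_b: Node, rng: random.Random) -> Node:
    """Copy parent_a, then replace one random sub-tree of the copy
    with a copy of a random sub-tree taken from parent_b."""
    child = copy.deepcopy(parent_a)
    target = rng.choice(all_nodes(child))
    donor = copy.deepcopy(rng.choice(all_nodes(parent_b)))
    target.label, target.children = donor.label, donor.children
    return child


def point_mutation(tree: Node, rng: random.Random,
                   decisions=("BUY", "SELL_SHORT")) -> Node:
    """Copy the tree and re-draw the decision at one random leaf
    (the new decision may equal the old one)."""
    mutant = copy.deepcopy(tree)
    leaves = [n for n in all_nodes(mutant) if not n.children]
    rng.choice(leaves).label = rng.choice(decisions)
    return mutant


rng = random.Random(1)
a = Node("pe < 15", [Node("BUY"), Node("SELL_SHORT")])
b = Node("rsi < 70", [Node("BUY"),
                      Node("momentum < 0", [Node("SELL_SHORT"), Node("BUY")])])
print(subtree_crossover(a, b, rng))
print(point_mutation(a, rng))
```

Both operators work on deep copies, so the parents survive unchanged and can be selected again in the same generation.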
After this study we concluded that Genetic Programming has great potential to evolve new strategies for security analysis and investment management, provided that better functions for calculating fitness can be derived. Throughout our research we saw that decision trees evolved using Genetic Programming were able to produce stock classifications that beat the average market return consistently over all four quarters. This was true both for decision trees that used technical indicators and for decision trees that used fundamental indicators. A number of other conclusions were derived from our research, including the optimal sizes and level of heterogeneity for the decision trees, the value added by the different indicators, and the performance of the strategies relative to one another. Some results are included below.
My friend and I each produced an independent research report. Both reports go into much more detail about our research study, the approach taken, our design and implementation, the testing strategies we used, our conclusions and our recommendations for further research. You can also download a copy of the source code created during the implementation. For my colleague's more technical account of the project, please click here.