
(Moved from haskell@haskell.org) Hi Kenneth, I'm sorry for the very late reply. On Wed, 2010-12-08 at 10:01 +0100, Kenneth Hoste wrote:
Hi Jan,
On 11/29/2010 12:55 PM, Jan Snajder wrote:
Dear Haskellers,
I am pleased to announce the release of genprog-0.1, a genetic programming library.
(snip)
Very interesting, and nice job on the code (elegant, well-structured, well-documented, ...)!
Thank you. This was my first shot at building a Haskell library and I'm flattered to receive such positive comments :-)
Genetic programming recently got my attention because one of the bots in the Google AI contest was built using this technique (see http://ai-contest.com/forums/viewtopic.php?f=17&t=1136), and it performed really well (way better than my hand-crafted bot).
Interesting, thanks for the link.
I have a bit of experience with genetic algorithms, on a practical and pragmatic level. The field of genetic programming is something I hope to look at and play with in the upcoming months (just for fun).
I was kind of planning to implement my own genetic programming library in Haskell as I became familiar with the field, but after diving into your code it quickly became clear to me you've done a way better job than I would have.
I really like the example you provided for evolving an expression that computes a specified integer value.
I kept this example deliberatly simple so to make it clear how you can use the library. The example is actually a degenerated case of symbolic regression, in which you are given only a single instance to train at, rather than a set of instances. You can find much more interesting examples in Koza's book.
I plan to start playing with that and improve the example (faster convergence to a perfect solution, and also tweaking the current GP config to obtain smaller solutions).
A couple of questions (some fairly unrelated to Haskell):
How hard would it be to extend the current version to support the evolution of actual Haskell programs? As far as I can tell, the current version has support for (simple?) self-defined expressions, but this seems like a long way from supporting Haskell programs with multi-argument functions that operate on lists, etc., even if you just limit it to non-monadic Haskell98 code.
Actually, I have no idea. :-) I don't know if there is a data structure that can represent an AST of a Haskell program. I guess there is, because I guess at some point a compiler like GHC has to maintain such a structure internally, but I never looked into this. At present, I am more than satisfied to be able to evolve custom-made data structures. Another issue with evolving arbitrary Haskell (or whatever) programs is the "closure property". When doing crossover, you have to ensure that what you get by exchanging two subtrees still is a valid program. In other words, if F is a set of (arbitrary arity) functional nodes and T is a set of terminal nodes, then, given a functional node F(x1,...,xi,...xn), you should be able to substitute for argument xi each element of F `union` T. Obviouslly, this will not work for programms in general and you would have to either introduce specialized crossover operations or treat the nodes differently depending on their types. In genprog, I have kept things simple and closure propery is ensured by allowing crossover to take place only at nodes that are of the same type as the root node. This is certainly a limitation, but still I think one can do interesting stuff with it.
Have you considered playing with dynamic values for the various parameters that steer the evolution, i.e. population size, crossover/mutation probabilities, etc.? One thing I've always wanted to try (but never really got to it) is e.g. increasing the mutation probability as the evolution seems to be getting stuck in a (local?) optimum. Also, shrinking the population size if enough progress is being made could be interesting, to speed up the evolutionary process. Are you aware of studies that do such a thing?
Yes, definitely! Variable mutation probability is often used in genetic algorithms to improve the results. The idea is to have higher mutation probability in the beginning, so to encourage exploration, and then decrease the mutation probability as the evolution progresses, so to encourage exploitation. This increases the chances of not getting stuck in a local optima. But I didn't include this into genprog because I don't need variable mutation right now. I will probably do it in the next release. The most straightforward to do it, I guess, would be make the mutation prob. (and pop. size) a function of the evolution state. There are also other things that I plan to include in the next release. I hope I will find time to do it. Btw., mutation is actually rarely used in GP (unlike in GA). Koza, for example, used mutation in one example only. The thing is that, if you think of mutation as a replacement of a subexpression with a randomly generated subexpression, then you already can do this with crossover. Now, this only true for this default type of mutation. If you have a custom-defined mutation operator, one that changes a node depending on its value and/or type, than this cannot be accomplished by crossover anymore. Whether or not you need custom-defined mutations depends on the task at hand. E.g., if the expressions you're evolving feature values from a very large domain set, than a crossover operation alone will not get you enough genetic diversity. For this reason I've defined in genprog the mutation operation as a function of an expression that is being mutated.
Are you planning to actively maintain and extend this library? Are you using it for research purposes, or did you hack this together just for fun (if so, nice job!). What do you plan to use it for?
Yes, I will maintain the library actively. I did this for fun, of course :-), and also for research, which I also do for fun :-) I am using the library for solving specific problems related to natural language processing.
Anyway, great job, I hope I'll find the time to play around with this.
So do I. Please let me know if you have any suggestions on how to improve the library. If you do interesting stuff with it, please let us know what it is. Cheers, Jan