
At this very moment I'm struggling to fit a huge graph of Twitter communications into a Haskell program. It apparently gets stuck in a loop freeing memory. As I suspected, the JVM garbage collector has seen more testing at this scale than Haskell's, since not many people load Haskell up this heavily. The memory behavior apparently requires tweaking the GHC runtime options -A (allocation area size) and -H (suggested heap size) in ways that are not immediately obvious. I'm still new to Haskell, so perhaps with more profiling and tuning experience it would have been easier, but so far Clojure is more predictable: even though Haskell beats it on a smaller data set, I get linear resource consumption with Clojure while Haskell explodes. I'd say at this point Haskell is not ready for prime time for large-scale data mining on a single box. -- Alexy