
Hi Tobias,
A friend is currently looking into how best to work with assorted usage data: at the moment, 250 million entries in a 12 GB CSV file, comprising information such as which channel was tuned in, for how long, with which user agent, and so on.
As much as I love Haskell, the tool of choice for data analysis is GNU R, not so much because of the language, but simply because of the vast array of high-quality libraries that cover topics like statistics, machine learning, and visualization. You'll find it at http://www.r-project.org/.

If you wanted to analyze 12 GB of data in Haskell, you'd have to jump through all kinds of hoops just to load that CSV file into memory. It's possible, no doubt, but pulling it off efficiently requires a lot of Haskell expertise that statistics people don't necessarily have (and arguably shouldn't have to).

The package Rlang-QQ integrates R into Haskell, which might be a nice way to deal with this task, but I have no personal experience with that library, so I'm not sure whether it adds much value.

Just my 2 cents,
Peter
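P.S. To give a flavor of what I mean: even without loading everything into memory, you can fold over the rows in a streaming fashion. Below is a minimal sketch that tallies total viewing time per channel. The column layout (channel, user agent, duration in seconds) is made up for illustration, and the naive comma split ignores quoting and escaping, which a real CSV parser (e.g. the cassava package) would handle.

```haskell
import qualified Data.Map.Strict as M
import Data.List (foldl')

-- Naive comma split; good enough for this sketch, but a real CSV
-- file needs a proper parser that handles quoting (e.g. cassava).
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, [])     -> [a]
  (a, _:rest) -> a : splitOn c rest

-- Fold over the rows, accumulating total viewing seconds per channel.
-- Assumes a hypothetical layout: channel,user_agent,duration_seconds
tally :: [String] -> M.Map String Int
tally = foldl' step M.empty
  where
    step acc row = case splitOn ',' row of
      (channel : _userAgent : dur : _) ->
        M.insertWith (+) channel (read dur) acc
      _ -> acc  -- skip malformed rows

main :: IO ()
main = do
  -- In practice you'd stream the rows from the 12 GB file, e.g. with
  -- `lines <$> readFile path`; here a tiny in-memory sample suffices.
  let sample = [ "bbc1,Mozilla,120"
               , "bbc2,Safari,30"
               , "bbc1,Chrome,60" ]
  print (tally sample)
```

The strict left fold and strict Map keep the accumulator small, but getting real streaming I/O, parsing, and memory behavior right on 250 million rows is exactly the kind of expertise I was alluding to above, and it all comes for free with R's read routines.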