
jeff p wrote:
Hello,
So before I embark on day 1 of the project, I thought I should check and see if anyone on this list has used Haskell to munge a ten-million-row database table, and if there are any particular gotchas I should watch out for.
One immediate thing to be careful about is how you do IO. In my experience, Haskell's standard String-based IO functions are slow at reading large files, so you'll probably want to skip them and use the lazy bytestring library (http://www.cse.unsw.edu.au/~dons/fps.html).
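To make the bytestring suggestion concrete, here is a minimal sketch of streaming a large file with lazy bytestrings. It assumes the table has been dumped to a tab-separated text file, one row per line; that dump format and the names splitFields and summarize are made up for illustration and are not part of any library.

```haskell
module Main where

import qualified Data.ByteString.Lazy.Char8 as BL
import Data.List (foldl')
import System.Environment (getArgs)

-- Hypothetical helper: split one tab-separated row into its fields.
splitFields :: BL.ByteString -> [BL.ByteString]
splitFields = BL.split '\t'

-- Hypothetical helper: count rows and total fields in a dump.
-- foldl' plus seq keeps both counters evaluated as we go, so the
-- lazily read input can be consumed chunk by chunk.
summarize :: BL.ByteString -> (Int, Int)
summarize = foldl' step (0, 0) . BL.lines
  where
    step (rows, fields) line =
      let rows'   = rows + 1
          fields' = fields + length (splitFields line)
      in rows' `seq` fields' `seq` (rows', fields')

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      -- BL.readFile reads the file lazily, in chunks.
      contents <- BL.readFile path
      let (rows, fields) = summarize contents
      putStrLn (show rows ++ " rows, " ++ show fields ++ " fields")
    _ -> putStrLn "usage: summarize FILE"
```

The point of the strict fold is that nothing holds on to the start of the input, so memory use should stay roughly flat however many rows there are. Swapping in real per-row logic only means changing step.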
I'm planning to use HSQL, since it's in Debian stable and the API resembles what I'm already familiar with. Database access is slower than file access (which is one reason I want to move as much logic as I can out of SQL), so if the speed of getting rows out of the database turns out to be the bottleneck in my code, I'll either be happy that all the other code is so efficient or peeved that HSQL is so inefficient.