
Joe Fredette wrote:
My suggestion would be to look into writing a parser (via parsec) to handle this. Parsec is fairly easy to learn, and since your data is a pretty simple format, the parser won't be hard to write.
While I'm all for using a proper parser, Brent Pedersen notes that his data will have millions of rows, so Parsec is likely to run into memory problems. I think something along the lines of

    import qualified Data.ByteString.Lazy.Char8 as B
    import Data.Maybe (fromJust)

    -- The parsers in 'formats' share one list, so they need a common
    -- result type; Field wraps the three kinds of column.
    data Field = S B.ByteString | I Int | D Double

    parse :: B.ByteString -> [[Field]]
    parse = map (zipWith ($) formats . B.split '\t') . B.lines
      where
        formats = [str, str, int, int, int, int, int, int, int, float, float]
        int     = I . fst . fromJust . B.readInt   -- crashes on a non-numeric cell
        float   = D . read . B.unpack              -- kludge: goes through String
        str     = S

will do just fine. (The implementation of float is a kludge; I think there's something on Hackage for that, though?)

Regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com
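
P.S. A rough sketch of how float could avoid the round trip through String, using only bytestring and base. This is not the Hackage package alluded to above; readDecimal is a hypothetical name introduced here, and the sketch assumes plain decimal notation (optional leading minus sign, digits, optional fractional part) with no exponent, so it rejects anything like 1e-3.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Char (digitToInt, isDigit)

-- Hypothetical helper: parse "[-]digits[.digits]" straight from a lazy
-- ByteString. Returns Nothing on empty input, a missing integer part,
-- or trailing characters. Exponent notation is deliberately unsupported.
readDecimal :: B.ByteString -> Maybe Double
readDecimal s0
  | B.null whole || not (B.null end) = Nothing
  | negative                         = Just (negate value)
  | otherwise                        = Just value
  where
    -- Strip an optional leading minus sign.
    (negative, s) = case B.uncons s0 of
      Just ('-', t) -> (True, t)
      _             -> (False, s0)
    -- Digits before the decimal point.
    (whole, rest) = B.span isDigit s
    -- Digits after the decimal point, if there is one.
    (frac, end) = case B.uncons rest of
      Just ('.', t) -> B.span isDigit t
      _             -> (B.empty, rest)
    -- Accumulate a run of digits as a Double.
    digits :: B.ByteString -> Double
    digits = B.foldl' (\acc c -> acc * 10 + fromIntegral (digitToInt c)) 0
    value = digits whole + digits frac / 10 ^ B.length frac
```

Because it returns a Maybe, a malformed cell can be reported or skipped instead of aborting the whole run the way read and fromJust do. Accuracy for very long digit strings is not guaranteed to match read, since the digits are accumulated in floating point.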