
Jefferson Heard wrote:
I thought this was fairly straightforward, but where the marked line finishes in 0.31 seconds on my machine, the actual transpose takes more than 5 minutes. I know it must be possible to read data in [snip]
dataFromFile :: String -> IO (M.Map String [S.ByteString])
dataFromFile filename = do
    f <- S.readFile filename
    print . length . map (S.split ',' $!) . S.lines $ f   -- finishes in 0.31 seconds
The S.split applications will never be evaluated: the list that you produce is full of thunks of the form

    (S.split ',' $! <some bytestring>)

The $! only takes effect when those thunks are forced, and length doesn't do that, since it walks the spine of the list without looking at the elements. Try

    print . sum . map (length . S.split ',') . S.lines $ f

instead, to force S.split to produce a result.

(In fact, S.split is strict in its second argument, so the $! shouldn't have any effect on the running time at all. I didn't measure that, though.)
return . transposeCSV . map (S.split ',' $!) . S.lines $ f -- this takes 5 minutes and 6 seconds
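To make the difference between the two measurements concrete, here is a minimal sketch. The names countRows, countFields and spineOnly are made up for illustration and do not appear in the original code, and S is assumed to be Data.ByteString.Char8.

```haskell
import qualified Data.ByteString.Char8 as S

-- Hypothetical helper: counts rows only. length walks the list spine
-- and never inspects the elements, so no S.split call is evaluated.
countRows :: S.ByteString -> Int
countRows = length . map (S.split ',') . S.lines

-- Hypothetical helper: counts fields. The inner length needs the
-- result of each S.split, so every row really is split.
countFields :: S.ByteString -> Int
countFields = sum . map (length . S.split ',') . S.lines

-- length ignores the elements entirely: this evaluates to 2 even
-- though every element of the list is undefined.
spineOnly :: Int
spineOnly = length (map (const undefined) "ab" :: [()])
```

Loaded into GHCi, countRows and countFields can be timed on the same input (for example with :set +s); only the second one includes the cost of the splitting.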
HTH, Bertram