
On Thu, 2008-07-10 at 03:16 -0700, Grzegorz Chrupala wrote:
Hi all,
Is there a less ugly way of avoiding laziness in the code pasted below then the use of seq in the last line? The program is supposed to split a large input file into chunks and check in how many of those chunks each of a list of words appear, as well as the total number of chunks. Without the seq it consumes huge amounts of memory.
Strategies! Try ((,) $| rnf) dfs' (n + 1) Or (id $| seqPair rnf r0) (dfs', n + 1) But I don't know if that falls within the intended meaning of `less ugly'. jcc
{-# LANGUAGE BangPatterns, PatternGuards #-} import Data.List (foldl')
split delim s | [] <- rest = [token] | otherwise = token : split delim (tail rest) where (token,rest) = span (/=delim) s
main = do putStrLn =<< fmap (show . stats ["the","a","and"] . split "<DOC>" . words) getContents
stats ws docs = foldl' f ((map (const 0) ws),0) docs where f (dfs,n) d = let dfs' = zipWith (\w df -> (df + fromEnum (w `elem` d))) ws dfs in sum dfs' `seq` (dfs',n+1)