More idiomatic use of strictness

Hi all, Is there a less ugly way of avoiding laziness in the code pasted below then the use of seq in the last line? The program is supposed to split a large input file into chunks and check in how many of those chunks each of a list of words appear, as well as the total number of chunks. Without the seq it consumes huge amounts of memory. Thanks! Grzegorz {-# LANGUAGE BangPatterns, PatternGuards #-} import Data.List (foldl') split delim s | [] <- rest = [token] | otherwise = token : split delim (tail rest) where (token,rest) = span (/=delim) s main = do putStrLn =<< fmap (show . stats ["the","a","and"] . split "<DOC>" . words) getContents stats ws docs = foldl' f ((map (const 0) ws),0) docs where f (dfs,n) d = let dfs' = zipWith (\w df -> (df + fromEnum (w `elem` d))) ws dfs in sum dfs' `seq` (dfs',n+1) -- View this message in context: http://www.nabble.com/More-idiomatic-use-of-strictness-tp18379800p18379800.h... Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

On Thu, 2008-07-10 at 03:16 -0700, Grzegorz Chrupala wrote:
Hi all,
Is there a less ugly way of avoiding laziness in the code pasted below then the use of seq in the last line? The program is supposed to split a large input file into chunks and check in how many of those chunks each of a list of words appear, as well as the total number of chunks. Without the seq it consumes huge amounts of memory.
Strategies! Try ((,) $| rnf) dfs' (n + 1) Or (id $| seqPair rnf r0) (dfs', n + 1) But I don't know if that falls within the intended meaning of `less ugly'. jcc
{-# LANGUAGE BangPatterns, PatternGuards #-} import Data.List (foldl')
split delim s | [] <- rest = [token] | otherwise = token : split delim (tail rest) where (token,rest) = span (/=delim) s
main = do putStrLn =<< fmap (show . stats ["the","a","and"] . split "<DOC>" . words) getContents
stats ws docs = foldl' f ((map (const 0) ws),0) docs where f (dfs,n) d = let dfs' = zipWith (\w df -> (df + fromEnum (w `elem` d))) ws dfs in sum dfs' `seq` (dfs',n+1)

jonathanccast:
On Thu, 2008-07-10 at 03:16 -0700, Grzegorz Chrupala wrote:
Hi all,
Is there a less ugly way of avoiding laziness in the code pasted below then the use of seq in the last line? The program is supposed to split a large input file into chunks and check in how many of those chunks each of a list of words appear, as well as the total number of chunks. Without the seq it consumes huge amounts of memory.
Strategies! Try
((,) $| rnf) dfs' (n + 1)
Or
(id $| seqPair rnf r0) (dfs', n + 1)
But I don't know if that falls within the intended meaning of `less ugly'.
I'd use a strict pair and the rnf strategy. data P = P [Something] !Int rnf dfs' (P dfs' (n+1)

Don Stewart-2 wrote:
I'd use a strict pair and the rnf strategy.
data P = P [Something] !Int
rnf dfs' (P dfs' (n+1)
Thanks all, it definitely seems like an improvement. -- Grzegorz -- View this message in context: http://www.nabble.com/More-idiomatic-use-of-strictness-tp18379800p18403657.h... Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.
participants (3)
-
Don Stewart
-
Grzegorz Chrupala
-
Jonathan Cast