
Aleksandar Dimitrov wrote:
The important bits are as follows:
mf :: [C.ByteString] -> StdWord
mf [] = Word [] C.empty
mf s  = Word (tail s) (head s)

f' = mf . reverse . C.words

main :: IO ()
main = do
    corpus_name <- liftM head getArgs
    corpus <- liftM (Corpus . (map f') . C.lines) $ C.readFile corpus_name
    print $ length (content corpus)
    let interesting = filterForInterestingTags interestingTags corpus
    print $ show (freqMap interesting)
[...]
Ideally, only a very small part of the file should ever be in memory, with processing happening incrementally!
The print $ length (content corpus) statement seems contradictory to your goal? After all, the whole file is read into the corpus variable to calculate its length, and because corpus is used again afterwards, none of it can be garbage collected in the meantime.

Regards,
apfelmus

--
http://apfelmus.nfshost.com
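
P.S. One way around this might be to compute the length and the frequency map in a single strict left fold, so that each line can be dropped as soon as it has been counted. The following is only a rough sketch under assumptions of mine, not your actual code: I assume C is Data.ByteString.Lazy.Char8, that the tag is the last whitespace-separated token of each line, and the names Summary, step and summarize are made up for illustration.

```haskell
import qualified Data.ByteString.Lazy.Char8 as C
import qualified Data.Map.Strict as M
import Data.List (foldl')
import System.Environment (getArgs)

-- Hypothetical accumulator: the number of lines seen so far and how often
-- each tag occurred. Strict fields keep thunks from piling up in the fold.
data Summary = Summary
  { lineCount :: !Int
  , tagFreq   :: !(M.Map C.ByteString Int)
  }

-- Consume one line. Assumption: the tag is the last token of the line.
step :: Summary -> C.ByteString -> Summary
step (Summary n freq) line =
  case reverse (C.words line) of
    []        -> Summary (n + 1) freq
    (tag : _) ->
      -- C.copy detaches the key from the input chunk it was sliced from,
      -- so the map does not keep whole chunks of the file alive.
      Summary (n + 1) (M.insertWith (+) (C.copy tag) 1 freq)

-- One pass over the lazily read input; nothing else refers to the lines,
-- so they can be collected as the fold moves on.
summarize :: C.ByteString -> Summary
summarize = foldl' step (Summary 0 M.empty) . C.lines

main :: IO ()
main = do
  args <- getArgs
  case args of
    [corpusName] -> do
      contents <- C.readFile corpusName
      let summary = summarize contents
      print (lineCount summary)
      print (M.toList (tagFreq summary))
    _ -> putStrLn "usage: pass exactly one corpus file name"
```

The point of the design is that the list of lines is consumed exactly once and never bound to a name that is used again later, which is what keeps the whole corpus from staying resident.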