
Am Donnerstag 11 März 2010 00:24:28 schrieb Daniel Fischer:
Hmm, offhand, I don't see why that isn't strict enough.
Turns out, mapM_ was a red herring. The villain was (zip and map). I must confess, I don't know why it sort-of worked without the mapM_, though. "sort-of", because that also hung on to unnecessarily much memory, the space leak was just smaller than with the mapM_. A very small change that eliminates the space leak, is readFasta :: Int -> [Char] -> [Window] readFasta windowSize sequence = -- get the header let (header,rest) = span (/= '\n') sequence chr = parseChromosome header go i (w:ws) = Window w chr i : go (i+1) ws go _ [] = [] in go 0 $ slideWindow windowSize $ filter (/= '\n') rest You can improve performance by eliminating slideWindow and the intermediate Window list (merging fastaExtractor and readFasta), {-# LANGUAGE BangPatterns #-} readFasta2 :: (String -> Bool) -> Int -> String readFasta2 test windowSize sequence = let (header,rest) = span (/= '\n') sequence chr = parseChromosome header schr = show chr go !i st@(_:tl) | test w = w ++ '\t' : schr ++ '\t' : show i ++ '\n' : go (i+1) tl | otherwise = go (i+1) tl where w = take windowSize st go _ [] = [] in go 0 (filter (/= '\n')) rest