
Ahn, Ki Yung wrote:
Recently, I'm facing the dark side of laziness -- the memory leak because of laziness.
Are there standardized approaches for detecting and fixing these kind of problems?
Not really. As Don S. already said, try heap profiling. The function that is too lazy will show up as producer. Other than that, you'll just have to learn to look for the typical patterns. Understanding Haskell's evaluation model and being able to simulate it in your head also helps.
sctAnal gs = null cgs || all (not . null) dcs where gs' = fixSize compose $ Set.fromList [TT (x,y,cs) [] | To _ x y cs<-Set.toList gs] cgs = [z | z@(TT (x,y,cs) _)<-Set.toList gs', x==y] dcs = [[c| c@(a,D,b)<-Set.toList cs , a==b] | TT (_,_,cs) _<-cgs] compose gs = trace ("## "++show (Set.size gs)) $ foldr checkInsert gs $ do ^^^^^ point 1 TT (x1,y1,cs1) l1 <- Set.toList gs TT (_,y2,cs2) l2 <- takeWhileTTfrom y1 . Set.toList $ setGT (TT (y1,Al""(-1),Set.empty) []) gs return $ TT (x1,y2,cs1 `comp` cs2) (l1++y1:l2) ^^^^^^^^^^^ point 2 takeWhileTTfrom y = takeWhile (\(TT (y',_,_) _) -> y==y') checkInsert x s | Set.member x s = s | otherwise = Set.insert x s
I can see two sources of problems. Point 2 seems to be the cause of your immediate problem: this builds nested applications of (++) and never evaluates them. If the result is demanded, (++) calls itself recursively, and if the list is too long, the stack gets exhausted. 'seq' doesn't help, that would only let the (++) accumulate in the list's tail, but 'foldr seq' should help, and so would deepSeq. I wonder why
instance (Ord a, Ord b) => Ord (TT a b) where (TT x lx) < (TT y ly) = lx==lx && ly==ly && x < y
doesn't. Does the (lx == lx) get optimized away? The easiest solution would be to use a data structure that directly supports concatenation. Any implementation of a deque is good (FingerTrees? Having them around can never hurt...) and so is a function. Replace the list [a] by a function ([a] -> [a]), replace [] by id and replace (l1++y1:l2) by (l1.(y1:).l2). Also helps with the quadratic runtime, btw. At point 1i, there lurks another problem. You may find that some graphs will blow your stack or even your heap. That's because the repeated application of checkInsert is not evaluated and this thunk may get too deep or need more space than the Set it would buils. I think, you want foldl' (note the prime) here. Udo. -- F: Was ist ansteckend und kommutiert? A: Eine Abelsche Grippe.