
Philip Armstrong wrote:
On Sun, Feb 17, 2008 at 10:01:14PM +0000, Adrian Hey wrote:
Philip Armstrong wrote:
Since no-one else has answered, I'll take a stab. Obiously, you have a stack leak due to laziness somewhere
I wouldn't say that was obvious, though it is certainly a possibility.
I'm never exactly clear what people mean by a "stack leak". It seems some folk regard any algorithm that makes use of O(n) or worse stack space as a "stack leak".
In my personal lexicon, a stack leak is one where you accumulate unevaluated thunks on the stack.
In this case, the combination of Data.Map not evaluating it's values and Data.Binary being lazy is (I think) resulting in decode thunks accumulating on the stack.
My opinion is that using O(n) or worse working memory when the job could be done in O(log n) or O(1) memory is certainly bad, but using O(n) stack is no worse in principle than using O(n) heap. But at the moment it is worse in practice with ghc, regretably :-(
I'd agree with this entirely.
In fact, a little experimentation has revealed that this:
do [path] <- getArgs m <- liftM decode (BS.readFile path)::IO [((Int, Maybe String), Int)] putStrLn . show . findMax . fromAscList $ m
will work just fine. No extra evaluation needed at all! I'll leave it to the Haskell gurus to explain what's going on...
That's very interesting. Strangely if you use fromDistinctAscList instead (as used by the Binary instance get method) the problem re-appears. This is despite fromAscList making use of fromDistinctAscList.
Yup. It's because (I've just realised) fromAscList is evaluating the values returned by decode in order to build the Distinct Ascending List to pass to fromDistinctAscList. A previous version I was going to put up simply ran another function over the list before passing it to the map constructor, which was not particularly elegant but also worked.
Data.Binary calls fromDistinctAscList to build the map, which is why it gets into this mess.
BTW, I find this especially ironic as fromDistinctAscList is the perfect example what I was talking about in another thread (continuation passing madness caused by an irrational fear of stack use).
In *some* cases, continuation passing can be quite a bit faster than other approaches IIRC. I haven't looked at this bit of code though.
As to what's really going on here, I haven't figured it out and I'm not really inclined to try :-)
I did try and do some profiling, but gave up when I realised that I'd have to go sprinking SCCs around the core libraries, which would mean recompiling Data.Map and friends...
Heap profile graphs which assign everything to "CAF" are not very helpful :)
Phil
Thanks to everyone who answered, this fixes my particular problem, and I think thanks to the comments I sort of vaguely understand what going on now. I run into this kind of esoteric laziness leaks from time to time and it's very frustrating -- I think that's about the only thing I don't like about Haskell. -- Grzegorz -- View this message in context: http://www.nabble.com/Stack-overflow-tp15479718p15543838.html Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.