strange stack overflow with Data.Map

Hi all, I've got a problem that I'm seeing using either Data.Map or Data.IntMap.
module Main where import Data.List import qualified Data.IntMap as Map
stats elems = foldl add_elem Map.empty elems add_elem m x = Map.insertWith (+) x 1 m
main = print $ stats $ take 1000000 $ repeat 1
This program has a space leak and runs out of stack space. I'm guessing that I'm being bit here by an unnatural amount of laziness in Map.insertWith, but I can't see how to fix this. :( I'm sure it's something elementary... I tried defining add_elem m x = let m' = Map.insertWith (+) x 1 m Just num = Map.lookup x m' in seq num m' to force the (+) to be evaluated strictly, but that didn't help. -- David Roundy http://www.darcs.net

David Roundy wrote:
stats elems = foldl add_elem Map.empty elems add_elem m x = Map.insertWith (+) x 1 m [...] I tried defining
add_elem m x = let m' = Map.insertWith (+) x 1 m Just num = Map.lookup x m' in seq num m' to force the (+) to be evaluated strictly, but that didn't help.
No, it doesn't, unless you also use foldl'. Not tested, but been there, seen it, bought the t-shirt. Udo. -- The Force is what holds everything together. It has its dark side, and it has its light side. It's sort of like cosmic duct tape.

Hello David, Thursday, December 29, 2005, 3:42:05 AM, you wrote:
stats elems = foldl add_elem Map.empty elems
DR> This program has a space leak and runs out of stack space. I'm guessing DR> that I'm being bit here by an unnatural amount of laziness in DR> Map.insertWith stack overflows AFAIK is never occur because of laziness, but only because your recursion is not tail-optimized. as Udo says, problem may be fixed by using proper fold. if foldl' will not work - try to write your own, explicit looping scheme -- Best regards, Bulat mailto:bulatz@HotPOP.com

On Thu, Dec 29, 2005 at 11:22:11AM +0300, Bulat Ziganshin wrote:
stack overflows AFAIK is never occur because of laziness, but only because your recursion is not tail-optimized.
AFAIK, evaluating a thunk in GHC-compiled code does use the stack - update frames are being put on it. There is a nice description in this paper, at the bottom of page 3: http://research.microsoft.com/Users/simonpj/papers/parallel/multiproc.pdf Best regards Tomasz -- I am searching for a programmer who is good at least in some of [Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland

David Roundy wrote:
stats elems = foldl add_elem Map.empty elems add_elem m x = Map.insertWith (+) x 1 m
main = print $ stats $ take 1000000 $ repeat 1
This program has a space leak and runs out of stack space. I'm guessing that I'm being bit here by an unnatural amount of laziness in Map.insertWith, but I can't see how to fix this. :( I'm sure it's something elementary...
try: stats = foldl' add_elem Map.empty add_elem m x = myinsertWith (+) x 1 m myinsertWith f k v m = case Map.lookup k m of Nothing -> Map.insert k v m Just w -> let r = f v w in r `seq` Map.insert k r m main = print $ stats $ take 1000000 $ repeat 1

On Thu, Dec 29, 2005 at 01:42:29PM +0100, Christian Maeder wrote:
David Roundy wrote:
stats elems = foldl add_elem Map.empty elems add_elem m x = Map.insertWith (+) x 1 m
main = print $ stats $ take 1000000 $ repeat 1
This program has a space leak and runs out of stack space. I'm guessing that I'm being bit here by an unnatural amount of laziness in Map.insertWith, but I can't see how to fix this. :( I'm sure it's something elementary...
try:
stats = foldl' add_elem Map.empty add_elem m x = myinsertWith (+) x 1 m
myinsertWith f k v m = case Map.lookup k m of Nothing -> Map.insert k v m Just w -> let r = f v w in r `seq` Map.insert k r m
Thanks, that does it! I had tried something like that, but without the foldl', and had tried foldl', but without the myinsertWith. The reason I didn't spend much time using this implementation is that for large maps, it will take twice as long as Map.insertWith, so I assumed (incorrectly) that Map.insertWith should be written so that it behaves in a non-leaky manner. Should the Map modules have an additional Map.insertWith' that behaves strictly, or might it be the case that you always want strict behavior when using insertWith? -- David Roundy http://www.darcs.net

David Roundy wrote:
Should the Map modules have an additional Map.insertWith' that behaves strictly, or might it be the case that you always want strict behavior when using insertWith?
I think so. Once strictness is lost, there's nothing the user of a library could do about it. If a container is too strict however, lazyness can be recovered by wrapping values in an additional data type. So strict variants of updating functions would be a big win, and if two versions of every function are deemed too much of a burden, I'd rather get rid of the lazy version. Udo. -- Ours is a world where people don't know what they want and are willing to go through hell to get it. -- Don Marquis

Laziness and strictness are both important depending on the situation.
I'd recommend keeping both variants around. Having to wrap values in
an extra data type just to keep laziness sort of defeats the point of
Haskell being lazy in the first place. It also makes it somewhat
awkward to use in the cases where laziness is important, which is, I
suspect, quite a lot of the time where the codomain of the Map is a
recursive datatype.
Having lazy and strict variants of data structure libraries (i.e.
splitting the modules into *.Lazy and *.Strict) seems like a good idea
though, but I can imagine some cases arising where it would be
convenient to swap between the two, so it would be good to keep the
type the same anyway. The parent module would then reexport one of the
variants. (Obviously the lazy one, since this is Haskell, right? ;)
- Cale
On 29/12/05, Udo Stenzel
David Roundy wrote:
Should the Map modules have an additional Map.insertWith' that behaves strictly, or might it be the case that you always want strict behavior when using insertWith?
I think so. Once strictness is lost, there's nothing the user of a library could do about it. If a container is too strict however, lazyness can be recovered by wrapping values in an additional data type. So strict variants of updating functions would be a big win, and if two versions of every function are deemed too much of a burden, I'd rather get rid of the lazy version.
Udo. -- Ours is a world where people don't know what they want and are willing to go through hell to get it. -- Don Marquis
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux)
iD8DBQFDtAMdc1ZCC9bsOpURAs7ZAJ9pS/L7pf9tpRoTGi3HDR0gyindOQCfakED Xdpm15vvyqF51QAmCJeCm1U= =tZhs -----END PGP SIGNATURE-----
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Although it's already been solved, I'd like to point out here that
foldl is (or may be) getting tail optimised, but that the stack
overflow isn't from the foldl itself, but from the evaluation of the
huge expression which that foldl builds. Evaluating the left
associative expression involves immediately pushing 1000000 items on
the stack.
Note that:
foldl (flip (:)) [] (replicate 1000000 1)
works fine after a short pause, due to the fact that the result can be
lazily evaluated and printed one piece at a time, whereas
foldl (+) 0 (replicate 1000000 1)
causes a stack overflow.
As was mentioned, the solution is to use the stricter foldl' to keep
the accumulated expression small.
- Cale
On 28/12/05, David Roundy
Hi all,
I've got a problem that I'm seeing using either Data.Map or Data.IntMap.
module Main where import Data.List import qualified Data.IntMap as Map
stats elems = foldl add_elem Map.empty elems add_elem m x = Map.insertWith (+) x 1 m
main = print $ stats $ take 1000000 $ repeat 1
This program has a space leak and runs out of stack space. I'm guessing that I'm being bit here by an unnatural amount of laziness in Map.insertWith, but I can't see how to fix this. :( I'm sure it's something elementary...
I tried defining
add_elem m x = let m' = Map.insertWith (+) x 1 m Just num = Map.lookup x m' in seq num m' to force the (+) to be evaluated strictly, but that didn't help. -- David Roundy http://www.darcs.net _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (7)
-
Bulat Ziganshin
-
Cale Gibbard
-
Christian Maeder
-
David Roundy
-
David Roundy
-
Tomasz Zielonka
-
Udo Stenzel