
firefly:
On Sun, 2007-12-16 at 04:53 +0100, Daniel Fischer wrote:
Hmm. Lazy accumulator eh, on String? Should exhibit a space leak.
Doesn't (with -O2, at least), seems ghc's strictness analyser did a good job. It is indeed about 10* slower than ByteStrings, but very memory friendly -
Daniel is right, there's no space leak.
Try it. You'll get a nice surprise :)
Very nice. If we disable the strictness analsyer, $ ghc -O2 -fno-strictness A.hs -no-recomp -o A We get a core loop that looks like: lgo :: Int -> [Char] -> Int lgo n xs = case xs of [] -> n a : as -> lgo (case a of -- sad dons C# c1_amH -> case c1_amH of { ' ' -> case n of I# i -> I# (i +# 1) _ -> n; ) as Look at that big lazy expression for 'n' not being forced! And when run: 1015M 1017M onproc/1 - 0:05 22.36% A Scary stuff. Lots of Int thunks :) But enabling the strictness analyser: lgo :: Int# -> [Char] -> Int# lgo (n :: Int#) (xs :: [Char]) = case xs of [] -> n a : as -> case a of C# c -> case c of ' ' -> lgo (n +# 1) as -- makes me happy _ -> lgo n as And life is good again :) What is quite amazing is how efficient this program is. I had to rerun this a dozen or so times, since I didn't quite believe it: $ time ./A < /usr/obj/data/150M 24569024 ./A < /usr/obj/data/150M 2.42s user 0.47s system 100% cpu 2.883 total Pretty stunning, I think. Swapping in a slightly more eager structure, the lazy ByteString, $ time ./A < /usr/obj/data/150M 24569024 ./A < /usr/obj/data/150M 0.86s user 0.07s system 98% cpu 0.942 total improves things by a good amount, but I think we can revisit the low level performance of lazy bytestrings again, in light of all the changes to the optimiser in the past 2 years. -- Don