Re: [Haskell-cafe] Unnecessarily strict implementations

On Friday 03 September 2010 12:28:43, Jan Christiansen wrote:
Hi,
On 03.09.2010, at 01:32, Daniel Fischer wrote:
No surprise, there aren't many 'ä's in Shakespeare's works, are there?
nope
On the other hand, the current implementation of lines does not seem to suffer from Wadler's tuple space leak (according to one test I made), so I'd stick with the current implementation for the time being.
Well, I think it does.
Indeed. I botched my test, allowing it to drop the reference to the first line too early by not using enough of it, so although lines shows the leak,

    let {
      ds1_s1ts :: ([Char], [Char])
      LclId
      ds1_s1ts =
        case $wbreak @ Char lvl_r1uj wild_B1 of _ { (# ww1_anp, ww2_anq #) ->
        (ww1_anp, ww2_anq)
        } } in
    : @ String
      (case ds1_s1ts of _ { (l_aij, _) -> l_aij })
      (case ds1_s1ts of _ { (_, s'_ail) ->
       case s'_ail of _ {

I managed to conceal it.
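A source-level reading of the Core fragment above, as a minimal sketch: the pair returned by break is bound once and then taken apart by two separate thunks. The names linesShape, pair, firstLine and rest are illustrative and do not come from GHC's output. While rest is unevaluated it keeps pair alive, and with it the start of the first line. Because rest does more than project the second component, it is presumably not the plain selector thunk that the garbage collector can short-circuit.

```haskell
-- Illustrative only: mirrors the structure of the Core fragment,
-- not the actual source of lines in any particular base version.
linesShape :: String -> [String]
linesShape "" = []
linesShape s = firstLine : rest
  where
    -- Shared by both thunks below.
    pair = break (== '\n') s
    -- A plain projection of the first component.
    firstLine = case pair of
      (l, _) -> l
    -- A projection of the second component fused with further work.
    rest = case pair of
      (_, s') -> case s' of
        []        -> []
        (_ : s'') -> linesShape s''
```

On ordinary input this should agree with Prelude's lines; the point of the sketch is only which closures reference pair.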
But obviously one can argue that this is a rare application.
Yes. Ordinarily, lines in text files aren't longer than a few hundred characters, leaking those, who cares? But. Occasionally, long lines occur, and avoiding the space leak seems more important to me than having lines (_|_ : _|_) = _|_ : _|_ instead of lines (_|_ : _|_) = _|_
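The difference in definedness can be checked directly. The following is a sketch under assumptions: linesStrict is a hypothetical variant that inspects the result of break with a case before producing the first cons cell, and whnfOk is a helper invented here to test whether a list can be evaluated to its outermost constructor. Neither is claimed to be the implementation discussed in this thread, and the sketch says nothing about which variant leaks.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical stricter variant: the pair from break is forced
-- before the first cons cell is returned.
linesStrict :: String -> [String]
linesStrict "" = []
linesStrict s = case break (== '\n') s of
  (l, s') -> l : case s' of
    []        -> []
    (_ : s'') -> linesStrict s''

-- True if the list can be evaluated to weak head normal form,
-- False if doing so throws (for example because it is undefined).
whnfOk :: [a] -> IO Bool
whnfOk xs = do
  r <- try (evaluate (xs `seq` ())) :: IO (Either SomeException ())
  pure (either (const False) (const True) r)
```

In GHCi, whnfOk (lines (undefined : undefined)) should give True for the Haskell Report definition of lines, while whnfOk (linesStrict (undefined : undefined)) should give False, because break has to test the undefined first character before the case can proceed.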
Hopefully I haven't made a mistake here?
No.
By accident I stumbled across an odd behaviour. If I use the following definition and replace all occurrences of break by break' in the program above the memory behaviour is bad if I compile it without profiling. But if I compile the program with -prof -auto-all the program runs in constant space. Is this a known behaviour?
I can't reproduce that. For me, it leaks also with profiling.
break' :: (a -> Bool) -> [a] -> ([a],[a])
break' p xs = (ys,zs)
  where (ys,zs) = break p xs
Cheers, Jan

On 03.09.2010, at 14:38, Daniel Fischer wrote:
I can't reproduce that. For me, it leaks also with profiling.
Have you used optimizations? It disappears if I compile the program with -O2. Without profiling I get the following. Here the maximum residency is nearly 45MB.

$ ghc --make Temp.hs -fforce-recomp
[1 of 1] Compiling Main ( Temp.hs, Temp.o )
Linking Temp ...
$ ./Temp +RTS -sstderr
./Temp +RTS -sstderr
5458199
  647,520,792 bytes allocated in the heap
  256,581,176 bytes copied during GC
  44,934,408 bytes maximum residency (11 sample(s))
  1,363,496 bytes maximum slop
  103 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0: 1223 collections, 0 parallel, 0.83s, 0.85s elapsed
  Generation 1: 11 collections, 0 parallel, 0.49s, 0.62s elapsed

  INIT time 0.00s ( 0.00s elapsed)
  MUT time 0.63s ( 0.67s elapsed)
  GC time 1.32s ( 1.46s elapsed)
  EXIT time 0.00s ( 0.00s elapsed)
  Total time 1.96s ( 2.13s elapsed)

  %GC time 67.6% (68.5% elapsed)
  Alloc rate 1,022,883,082 bytes per MUT second
  Productivity 32.3% of total user, 29.6% of total elapsed

With profiling it looks as follows. Here the maximum residency is less than 15KB.

$ ghc --make Temp.hs -prof -auto-all -fforce-recomp
[1 of 1] Compiling Main ( Temp.hs, Temp.o )
Linking Temp ...
$ ./Temp +RTS -sstderr
./Temp +RTS -sstderr
5458199
  1,051,844,836 bytes allocated in the heap
  110,134,944 bytes copied during GC
  14,216 bytes maximum residency (96 sample(s))
  37,068 bytes maximum slop
  2 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0: 1908 collections, 0 parallel, 0.57s, 0.59s elapsed
  Generation 1: 96 collections, 0 parallel, 0.02s, 0.02s elapsed

  INIT time 0.00s ( 0.00s elapsed)
  MUT time 1.46s ( 1.51s elapsed)
  GC time 0.60s ( 0.61s elapsed)
  RP time 0.00s ( 0.00s elapsed)
  PROF time 0.00s ( 0.00s elapsed)
  EXIT time 0.00s ( 0.00s elapsed)
  Total time 2.05s ( 2.12s elapsed)

  %GC time 29.0% (28.7% elapsed)
  Alloc rate 721,170,248 bytes per MUT second
  Productivity 71.0% of total user, 68.9% of total elapsed

On Saturday 04 September 2010 00:21:39, Jan Christiansen wrote:
On 03.09.2010, at 14:38, Daniel Fischer wrote:
I can't reproduce that. For me, it leaks also with profiling.
Have you used optimizations?
Of course. Always do :)
It disappears if I compile the program with -O2.
Yeah, without optimisations and with profiling, it runs in small memory. Without optimisations, it also uses less than half the memory it uses with optimisations when compiled without profiling (it does still leak, just less badly).

Daniel Fischer schrieb:
Yes. Ordinarily, lines in text files aren't longer than a few hundred characters, leaking those, who cares?
I got several space leaks of this kind in the past. They are very annoying. They are especially annoying if input comes from the outside world, where people can attack them to crash your program because of memory exhaustion.

On Sunday 05 September 2010 21:52:44, Henning Thielemann wrote:
Daniel Fischer schrieb:
Yes. Ordinarily, lines in text files aren't longer than a few hundred characters, leaking those, who cares?
I got several space leaks of this kind in the past. They are very annoying. They are especially annoying if input comes from the outside world, where people can attack them to crash your program because of memory exhaustion.
That would likely be the case of long lines, wouldn't it? I have trouble imagining a scenario where `lines' holding on to a few hundred characters which could already be released causes a noticeable space leak, let alone memory exhaustion. If you have a case where leaking a few KB creates a serious problem, I'd like to learn about it.
Cheers, Daniel

Daniel Fischer schrieb:
On Sunday 05 September 2010 21:52:44, Henning Thielemann wrote:
Daniel Fischer schrieb:
Yes. Ordinarily, lines in text files aren't longer than a few hundred characters, leaking those, who cares?
I got several space leaks of this kind in the past. They are very annoying. They are especially annoying if input comes from the outside world, where people can attack them to crash your program because of memory exhaustion.
That would likely be the case of long lines, wouldn't it? I have trouble imagining a scenario where `lines' holding on to a few hundred characters which could already be released causes a noticeable space leak, let alone memory exhaustion.
I talked about an _attack_! I provide a program that processes external data (say a webserver, for instance one for an ICFP contest) and someone feeds it intentionally with megabytes of text without any line ending.
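For input from untrusted sources, one defensive option that does not depend on how lines behaves is to cap the line length while splitting. The following is only a sketch of that idea, not something proposed in this thread: boundedLine, boundedLines and the limit parameter are hypothetical names, and the error text is made up for illustration.

```haskell
-- Take one line of at most `limit` characters.
-- Nothing means the limit was exceeded before a newline or the end of input.
boundedLine :: Int -> String -> Maybe (String, String)
boundedLine limit = go limit id
  where
    go _ acc []          = Just (acc [], [])
    go _ acc ('\n' : cs) = Just (acc [], cs)
    go n acc (c : cs)
      | n <= 0    = Nothing
      | otherwise = go (n - 1) (acc . (c :)) cs

-- Split input into lines lazily, stopping with a Left at the first
-- line that is longer than `limit` characters.
boundedLines :: Int -> String -> [Either String String]
boundedLines limit = go
  where
    go "" = []
    go s = case boundedLine limit s of
      Nothing        -> [Left ("line exceeds " ++ show limit ++ " characters")]
      Just (l, rest) -> Right l : go rest
```

With this shape at most limit characters of a single line are accumulated before the splitter gives up, so megabytes of text without a line ending produce a Left instead of one enormous line. It does not address the leak in lines itself.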

On Monday 06 September 2010 10:47:54, Henning Thielemann wrote:
Daniel Fischer schrieb:
On Sunday 05 September 2010 21:52:44, Henning Thielemann wrote:
Daniel Fischer schrieb:
Yes. Ordinarily, lines in text files aren't longer than a few hundred characters, leaking those, who cares?
I got several space leaks of this kind in the past. They are very annoying. They are especially annoying if input comes from the outside world, where people can attack them to crash your program because of memory exhaustion.
That would likely be the case of long lines, wouldn't it? I have trouble imagining a scenario where `lines' holding on to a few hundred characters which could already be released causes a noticeable space leak, let alone memory exhaustion.
I talked about an _attack_! I provide a program that processes external data (say a webserver, for instance one for an ICFP contest) and someone feeds it intentionally with megabytes of text without any line ending.
Yes, that's absolutely a problem. I was irritated/confused by the selection of text you quoted which was only about the ordinary case of relatively short lines. I didn't originally think about an attack but only about accidental long lines, which make it important enough to fix lines' leak. Throw a possible attack in, and it's urgent to fix it.
participants (3): Daniel Fischer, Henning Thielemann, Jan Christiansen