Alternatively, just use "return $! x", which has the same effect as seq-ing x before returning it, and is generally good practice (unless you explicitly want to be lazy). See http://johantibell.com/files/haskell-performance-patterns.html.

the-thing +RTS -s     (with GHC 8)

  3,965,758,088 bytes allocated in the heap
      46,006,592 bytes copied during GC
         207,792 bytes maximum residency (5 sample(s))             <---- down to 200K from 1.5G
         182,864 bytes maximum slop
               8 MB total memory in use (0 MB lost due to fragmentation)
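
For illustration, a minimal sketch of the two spellings side by side. This is not the program measured above: addLazy, addSeq, addStrict, loop and sumTo are made-up names, and the summing loop only stands in for the real computation.

```haskell
import Control.Monad.ST (ST, runST)

-- All names here are hypothetical; this is not the benchmarked program.

-- Lazy: hands back an unevaluated (a + b) thunk.
addLazy :: Int -> Int -> ST s Int
addLazy a b = return (a + b)

-- Explicit seq: evaluate the sum to WHNF, then return it.
addSeq :: Int -> Int -> ST s Int
addSeq a b = let x = a + b in x `seq` return x

-- Same effect, shorter: ($!) forces its argument before applying return.
addStrict :: Int -> Int -> ST s Int
addStrict a b = return $! a + b

-- A loop that threads its accumulator through addStrict, so each
-- partial sum is evaluated before the next iteration starts.
loop :: Int -> Int -> Int -> ST s Int
loop n acc i
  | i > n     = return acc
  | otherwise = do
      acc' <- addStrict acc i
      loop n acc' (i + 1)

sumTo :: Int -> Int
sumTo n = runST (loop n 0 1)
```

sumTo 100 is 5050 whichever helper the loop calls; any difference shows up in allocation and residency, not in the result.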

Robin

2017-01-26 12:42 GMT+01:00 Johannes Waldmann <johannes.waldmann@htwk-leipzig.de>:
Thanks for pointing out this (Knuth's) nice test case.
I should use this as an exam question ...

Allocation and run-time can be reduced
by replacing  return (x3' + x4')
with  let x = (x3' + x4') in x `seq` return x
And, modifySTRef' *does* help.
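
A minimal sketch of the modifySTRef' point, assuming a plain running total kept in an STRef (sumLazy and sumStrict are made-up names, not the code under discussion):

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef, modifySTRef', newSTRef, readSTRef)

-- Hypothetical example; not the program being measured.

-- modifySTRef writes the unevaluated application of (+ i) to the old
-- contents, so the reference can end up holding a chain of thunks
-- that is only collapsed when the value is finally demanded.
sumLazy :: Int -> Int
sumLazy n = runST (do
  ref <- newSTRef 0
  forM_ [1 .. n] (\i -> modifySTRef ref (+ i))
  readSTRef ref)

-- modifySTRef' evaluates the new value to WHNF before writing it,
-- so the reference always holds an evaluated Int.
sumStrict :: Int -> Int
sumStrict n = runST (do
  ref <- newSTRef 0
  forM_ [1 .. n] (\i -> modifySTRef' ref (+ i))
  readSTRef ref)
```

Both functions compute the same sum; how much the primed version saves in practice depends on the GHC version and optimisation level.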

I *do* notice a regression
(for the seq-ed and primed version)

ghc-8.0.2 : 2.6 GB alloc, 2.3 sec
ghc-6.10.4: 2.0 GB alloc, 1.7 sec

(measured on X5365 @ 3.00GHz)

Numbers are for total allocation;
residency is small (it runs with +RTS -H10k -A10k,
but then the number of GCs goes up)

- J.
_______________________________________________
Haskell-Cafe mailing list