I made a few modifications to your code, and found that replacing `return (x3' + x4')` with `return $! x3' + x4'` reduced maximum residency down to 64kb. This forces evaluation earlier. You can see the progression here:

https://gist.github.com/snoyberg/6a48876aedb9b19c808a0c53e86109ac

I took it one step further, and used the mutable-containers package to use an unboxed reference instead of the boxed STRef type. In other words: it avoids allocating a heap object. Here's that version of the code:

#!/usr/bin/env stack
-- stack --resolver lts-7.14 --install-ghc exec --package mutable-containers -- ghc -O2 -with-rtsopts=-s
import Control.Monad.ST
import Data.Mutable

a :: Int
  -> ST s Int
  -> ST s Int
  -> ST s Int
  -> ST s Int
  -> ST s Int
  -> ST s Int
a k x1 x2 x3 x4 x5 =
   do kk <- fmap asURef $ newRef k
      let b = do k0 <- readRef kk
                 let k1 = k0 - 1
                 writeRef kk k1
                 a k1 b x1 x2 x3 x4
      if k <= 0 then do x3' <- x3; x4' <- x4; return $! x3' + x4'
                else do x5' <- x5; b' <- b; return $! x5' + b'

main = print (runST (a 22 (return 1) (return (-1)) (return (-1)) (return 1) (return 0)))

It knocked down total allocations from 2.7GB to 1.8GB, which is an improvement, but I think there's still some more low hanging fruit here.

On Thu, Jan 26, 2017 at 12:26 PM, Antoine Rimlet <antoine.rimlet@gmail.com> wrote:
Hi list,

I try to get the following little program (a slightly modified "Man or boy", it prints -14254067) work "as expected", that is, without consuming lots of memory:

 import Data.STRef
 a k x1 x2 x3 x4 x5 =
   do kk <- newSTRef k
      let b = do k <- modifySTRef kk pred >> readSTRef kk; a k b x1 x2 x3 x4
      if k <= 0 then do x3' <- x3; x4' <- x4; return (x3' + x4')
                else do x5' <- x5; b' <- b; return (x5' + b')
 main = print (runST (a 22 (return 1) (return (-1)) (return (-1)) (return 1) (return 0)))

I use GHC 7.8.4, and executing this program uses about 2.5 GB of memory (or about 3.5 GB with runghc).  However, the "equivalent" program in OCaml only needs 4 MB (or 9 MB with the interpreter):

let rec a k x1 x2 x3 x4 x5 =
  let kk = ref k in let rec b () = begin decr kk; a !kk b x1 x2 x3 x4 end
  in if k <= 0 then let v = x3 () in v + x4 () else let v = x5 () in v + b ();;
print_int (a 22 (fun () -> 1) (fun () -> -1) (fun () -> -1) (fun () -> 1) (fun () -> 0));; print_newline ();;

Therefore I suspect I'm doing something wrong, but I can't see what.  I did try to use the strict version modifySTRef' as indicated in the manual, but with no visible improvement.

Thanks,

Antoine


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.