Are you sure? I'm using GHC 7.6.2 (compiled with -O2), and without bang
patterns it blows the stack at 1 million iterations.
With bang patterns it runs in constant space. Is that not the same as
the other version?
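
For reference, here is a minimal sketch of the kind of loop I mean. It is
not the original program: the S6 type (a counter plus an accumulator) and
the names sumTo and loop are stand-ins I made up for illustration, and it
assumes the strict State monad from mtl. The bang patterns force both
fields each time the state is read, so no chain of thunks should build up.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Monad.State.Strict (State, evalState, get, put)

-- Hypothetical stand-in for the S6 type discussed in the thread:
-- first field is the loop counter, second is the running sum.
data S6 = S6 Int Int

-- Sums 1..n by counting the first field down to zero.
sumTo :: Int -> Int
sumTo n = evalState loop (S6 n 0)
  where
    loop :: State S6 Int
    loop = do
      -- The bangs evaluate 'i' and 'a' to WHNF on every iteration,
      -- instead of leaving (a + i) as an unevaluated thunk.
      (S6 !i !a) <- get
      if i == 0
        then return a
        else do
          put (S6 (i - 1) (a + i))
          loop

main :: IO ()
main = print (sumTo 1000000)
```

Built with ghc -O2 and run with +RTS -s, a loop like this should show
small, flat residency. The statistics below are from my actual program,
not from this sketch.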

bmaxa@maxa:~/haskell$ ./state +RTS -s
500000500000
          52,080 bytes allocated in the heap
           3,512 bytes copied during GC
          44,416 bytes maximum residency (1 sample(s))
          17,024 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.00s  (  0.00s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.00s  (  0.00s elapsed)

  %GC     time       0.0%  (6.2% elapsed)

  Alloc rate    0 bytes per MUT second

  Productivity 100.0% of total user, 0.0% of total elapsed

> Date: Wed, 20 Mar 2013 08:04:01 +0200
> From: to.darkangel@gmail.com
> To: bmaxa@hotmail.com
> CC: haskell-cafe@haskell.org
> Subject: Re: [Haskell-cafe] Streaming bytes and performance
>
> On 03/20/2013 12:47 AM, Branimir Maksimovic wrote:
> > Your problem is that main_6 thunks 'i' and 'a'.
> > If you write (S6 !i !a) <- get
> > then there is no problem any more...
> >
>
> Nope :( Unfortunately that doesn't change anything. Still allocating...
>