Re: [Haskell-cafe] mdo with multiple values

16 Jan 2014

      Joachim Breitner wrote:
...
the background is a binary format assembler, so you can think of the
monad as the Put monad, only sufficiently lazy to admit a useful
MonadFix instance. Then one can do nice things like
mdo 
          putWord32 offset
          putBS someMoreHeaderData
          ...
          offset <- getCurrentOffset
          putBS byteString1
where I conceptually use the offset before it is known.
[..]     
Now I try to generalize that to a list of Bytestrings, and just from the
looks of it, this is what you want to do:
mdo 
        mapM_ putWord32 offsets
        putBS someMoreHeaderData
        ...
        offsets <- forM byteStrings $ \b -> 
           offset <- getCurrentOffset
           putBS b
           return offset
but that will <<loop>>.
So, the overall idea is that we can reserve 32 bits for the offset and 
defer its calculation to later. In other words,  putWord32  can "forward 
the file pointer" long before it actually writes the word.

Now, what's the problem with the `offsets` list? I think it may actually 
work in some monads that have a good MonadFix instance, but often, the 
MonadFix instances tend to be poor. One noteable example of the latter 
is the IO monad, and I have found the following rule to be very useful:

   To use value recursion in the IO monad, make sure
   that the sequence of *computations* is always
   defined in advance, only the values that these
   computations operate on may be lazy.

This rule is best explained with the problem at hand, by asking the 
following question: how many `putWord32` instructions does the following 
expression generate?

   mapM_ putWord32 offsets

Clearly, it should be `length offsets` many, but the problem is that 
this number is not known until the spine of `offsets` has been 
calculated. Now, some monads may be able to do that, but the rule for IO 
is that the number of `putWord32` must be known before the later 
definition of `offsets` can yield a value different from _|_ at all.

Now, if a recursive expression works in the IO monad, then it will work 
in any other monad as well.

Fortunately, we do known the spine of `offsets` in advance: it has the 
same spine as `byteStrings`. The solution is to make that explicit in 
the code, by using a lazy `zip`:

      ...
      mapM_ putWord32 (offsets `spine` byteStrings)
      ...

    where
    spine :: [a] -> [void] -> [a]
    spine ~[]     []     = []
    spine ~(x:xs) (y:ys) = x : tag xs ys

This code takes the spine of `byteStrings` and fills it with values from 
`offsets`.

It may also be possible to get rid of the <<loop>> by defining the monad 
appropriately: one pass calculates all offsets, a second pass calculates 
the actual output. The spine of `offsets` does not depend on the offset 
calculation, so there is a good chance that this might work recursively.

Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com