
Hi everyone, With stream-fusion, we can write functions which construct and destruct lists, such as (this is the main example from the Stream Fusion paper[1]) f :: Int -> Int f n = sum [k * m | k <- [1..n], m <- [1..k]] and the rewrite rules in stream-fusion replace these operations on lists with fusible operations on the Stream (non-recursive) datatype. In this example, the domain and codomain of the function don't mention the list datatype. There seem to be many functions like this: the lists are being used internally as composable loops rather than as data-structures. The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly? For example: In Data.Stream (from the stream-fusion package) we can find most of the Prelude list functions but with Stream in all the types instead of []. For example, Data.Stream.sum :: Num a => S.Stream a -> a Using this module, we can rewrite f without mentioning lists. We first need a Monad instance for Data.Stream.Stream:
import qualified Data.List.Stream as S
instance Monad S.Stream where return = S.return (>>=) = S.concatMap
Now we can write
f :: Int -> Int f n = S.sum $ do k <- S.enumFromToInt 1 n m <- S.enumFromToInt 1 k return (k*m)
which is essentially the same as the original f, although lacking the syntactic sugar of the list comprehension. To me, it seems that Stream is more sensible data type to use than [] when the algorithm being expressed is just a loop (rather than something which needs a [] as a cache/buffer), for the following reasons: 1. If I am programming with lists and I need some function which I can't express with Prelude functions, I have to write it myself. I will probably lose fusion in this case, because I will write it recursively in terms of lists. On the other hand, if I am programming with Streams, I will write it myself in terms of Streams, and it should be easier to maintain fusion because it won't be recursive. 2. Holding on to a [] too long can cause a space leak. This is not the case for Stream, because a Stream only ever contains one "state". More generally, the memory use of Stream is more easily predictable than that of [], since "running" a Stream only holds on to one "state" at a time, and we often know how big the "state" is. 3. Fusion doesn't rely on rewrite rules firing. I consider this point less significant than the other two. So, thoughts? Do people program with Streams directly? What have I not considered? Cheers, Reiner [1] http://www.cse.unsw.edu.au/~dons/papers/stream-fusion.pdf