
Hi everyone, With stream-fusion, we can write functions which construct and destruct lists, such as (this is the main example from the Stream Fusion paper[1]) f :: Int -> Int f n = sum [k * m | k <- [1..n], m <- [1..k]] and the rewrite rules in stream-fusion replace these operations on lists with fusible operations on the Stream (non-recursive) datatype. In this example, the domain and codomain of the function don't mention the list datatype. There seem to be many functions like this: the lists are being used internally as composable loops rather than as data-structures. The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly? For example: In Data.Stream (from the stream-fusion package) we can find most of the Prelude list functions but with Stream in all the types instead of []. For example, Data.Stream.sum :: Num a => S.Stream a -> a Using this module, we can rewrite f without mentioning lists. We first need a Monad instance for Data.Stream.Stream:
import qualified Data.List.Stream as S
instance Monad S.Stream where return = S.return (>>=) = S.concatMap
Now we can write
f :: Int -> Int f n = S.sum $ do k <- S.enumFromToInt 1 n m <- S.enumFromToInt 1 k return (k*m)
which is essentially the same as the original f, although lacking the syntactic sugar of the list comprehension. To me, it seems that Stream is more sensible data type to use than [] when the algorithm being expressed is just a loop (rather than something which needs a [] as a cache/buffer), for the following reasons: 1. If I am programming with lists and I need some function which I can't express with Prelude functions, I have to write it myself. I will probably lose fusion in this case, because I will write it recursively in terms of lists. On the other hand, if I am programming with Streams, I will write it myself in terms of Streams, and it should be easier to maintain fusion because it won't be recursive. 2. Holding on to a [] too long can cause a space leak. This is not the case for Stream, because a Stream only ever contains one "state". More generally, the memory use of Stream is more easily predictable than that of [], since "running" a Stream only holds on to one "state" at a time, and we often know how big the "state" is. 3. Fusion doesn't rely on rewrite rules firing. I consider this point less significant than the other two. So, thoughts? Do people program with Streams directly? What have I not considered? Cheers, Reiner [1] http://www.cse.unsw.edu.au/~dons/papers/stream-fusion.pdf

On 12/05/2009, at 14:45, Reiner Pope wrote:
The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly?
I think the main reason is that streams don't store data and therefore don't support sharing. That is, in let xs = map f ys in (sum xs, product xs) the elements of xs will be computed once if it is a list but twice if it is a stream. Roman

rl:
On 12/05/2009, at 14:45, Reiner Pope wrote:
The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly?
I think the main reason is that streams don't store data and therefore don't support sharing. That is, in
let xs = map f ys in (sum xs, product xs)
the elements of xs will be computed once if it is a list but twice if it is a stream.
The other issue is reminding developers to preserve stream invariants, so as not to break the heavy duty rewriting that's going to happen to their code. Still, if someone finds a use for it, proceed! -- Don

2009/5/13 Don Stewart
rl:
On 12/05/2009, at 14:45, Reiner Pope wrote:
The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly?
I think the main reason is that streams don't store data and therefore don't support sharing. That is, in
let xs = map f ys in (sum xs, product xs)
the elements of xs will be computed once if it is a list but twice if it is a stream.
The other issue is reminding developers to preserve stream invariants, so as not to break the heavy duty rewriting that's going to happen to their code.
Can you elaborate on this please? I've only seen a few invariants mentioned when reading the Stream Fusion paper and the stream-fusion source code. They are: 1. 'Skip' values should have no semantic significance. 2. Don't construct bottom Streams. However, this seems to only apply when the fusion rewrite rule is applied, which is not the case I am talking about. Cheers, Reiner

On Tue, May 12, 2009 at 1:39 PM, Roman Leshchinskiy
let xs = map f ys in (sum xs, product xs)
the elements of xs will be computed once if it is a list but twice if it is a stream.
If you're using lists for loops rather than data, that's what you want (what you probably really want is cfoldl' ((,) <$> sumF <*> productF) xs, in terms of combinable folds).
Roman
--Max

Roman Leshchinskiy wrote:
On 12/05/2009, at 14:45, Reiner Pope wrote:
The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly?
This is more or less the conclusion I came to myself the other day when I sat down and tried to implement stream fusion myself (just for giggles).
I think the main reason is that streams don't store data and therefore don't support sharing. That is, in
let xs = map f ys in (sum xs, product xs)
the elements of xs will be computed once if it is a list but twice if it is a stream.
...and I hadn't thought of this part! ;-)

Sure, but this definition leaks space, which I think is one of the
points that Reiner made.
-- ryan
On Tue, May 12, 2009 at 5:39 AM, Roman Leshchinskiy
On 12/05/2009, at 14:45, Reiner Pope wrote:
The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly?
I think the main reason is that streams don't store data and therefore don't support sharing. That is, in
let xs = map f ys in (sum xs, product xs)
the elements of xs will be computed once if it is a list but twice if it is a stream.
Roman
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, 12 May 2009, Reiner Pope wrote:
So, thoughts? Do people program with Streams directly? What have I not considered?
Yes, for signal processing I sometimes use my custom Streams data type that misses the Skip constructor since I do not use 'filter' functions: http://darcs.haskell.org/synthesizer/src/Synthesizer/State/Signal.hs It has however the problem, that sharing is not possible. In order to get sharing, too, I introduced an arrow for causal processes: http://darcs.haskell.org/synthesizer/src-4/Synthesizer/Causal/Process.hs (Note that Hudak et.al. have independently developed a theory on causal processes.)
participants (7)
-
Andrew Coppin
-
Don Stewart
-
Henning Thielemann
-
Max Rabkin
-
Reiner Pope
-
Roman Leshchinskiy
-
Ryan Ingram