space-efficient, composable list transformers [was: Re: Reifying case expressions [was: Re: On stream processing, and a new release of timeplot coming]]

Hello Heinrich, On Tue, Dec 27, 2011 at 1:09 PM, Heinrich Apfelmus < apfelmus@quantentunnel.de> wrote:
Sebastian Fischer wrote:
all functions defined in terms of `ListTo` and `interpret` are spine strict - they return a result only after consuming all input list constructors.
Indeed, the trouble is that my original formulation cannot return a result before it has evaluated all the case expressions. To include laziness, we need a way to return results early.
Sebastian's ListTransformer type does precisely that for the special case of lists as results,
Hmm, I think it does more than that.. (see below) but it turns out that there is also a completely generic way of returning
results early. In particular, we can leverage lazy evaluation for the result type.
This is nice! It would be cool if we could get the benefits of ListConsumer and ListTransformer in a single data type. I know that you chose these names to avoid confusion, but I would like to
advertise again the idea of choosing the *same* names for the constructors as the combinators they represent [...] This technique for designing data structures has the huge advantage that it's immediately clear how to interpret it and which laws are supposed to hold.
I also like your names better, although they suggest that there is a single possible interpretation function. Even at the expense of blinding eyes to the possibility of other interpretation functions, I agree that it makes things clearer to use names from a *motivating* interpretation. In hindsight, my names for the constructors of ListTransformer seem to be inspired by operations on handles. So, `Cut` should have been named `Close` instead..
Especially in the case of lists, I think that this approach clears up a lot of confusion about seemingly new concepts like Iteratees and so on.
A share the discomfort with seemingly alien concepts and agree that clarity of exposition is crucial, both for the meaning of defined combinators and their implementation. We should aim at combinators that people are already familiar with, either because they are commonplace (like id, (.), or fmap) or because they are used by many other libraries (like the Applicative combinators). A good way to explain the meaning of the combinators is via the meaning of the same combinators on a familiar type. Your interpretation function is a type-class morphism from `ListTo a b` to `[a] -> b`. For Functor we have: interpret (fmap f a) = fmap f (interpret a) On the left side, we use `fmap` for `ListTo a` on the right side for `((->) l)`. Similarly, we have the following properties for the coresponding Applicative instances: interpret (pure x) = pure x interpret (a <*> b) = interpret a <*> interpret b Such morphism properties simplify how to think about programs a lot, because one can think about programs as if they were written in the *meaning* type without knowing anything about the *implementation* type. The computed results are the same but they are computed more efficiently. Your `ListTo` type achieves space efficiency for Applicative composition of list functions by executing them in lock-step. Because of the additional laziness provided by the `Fmap` constructor, compositions like interpret a . interpret b can also be executed in constant space. However, we cannot use the space efficient Applicative combinators again to form parallel compositions of sequential ones because we are already in the meaning type. We could implement composition for the `ListTo` type as follows (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c a <. b = interpret a <$> b But if we use a result of this function as argument of <*>, then the advantage of using `ListTo` is lost. While interpret ((,) <$> andL <*> andL) runs in constant space, interpret ((,) <$> (andL <. idL) <*> (andL <. idL)) does not. The ListTransformer type supports composition in lock-step via a category instance. The meaning of `ListTransformer a b` is `[a] -> [b]` with the additional restriction that all functions `f` in the image of the interpretation function are incremental: xs `isPrefixOf` ys ==> f xs `isPrefixOf` f ys Composition as implemented in the ListTransformer type satisfies morphism properties for the category instance: transformList id = id transformList (a . b) = transformList a . transformList b As it is implemented on the ListTransformer type directly (without using the interpretation function), it can be combined with the Applicative instance for parallel composition without losing space efficiency. The Applicative instance for `ListTransformer` is different from the Applicative instance for `ListTo` (or `ListConsumer`). While interpret ((,) <$> idL <*> idL) is of type `[a] -> ([a],[a])` transformList ((,) <$> idL <*> idL) is of type `[a] -> [(a,a)]`. We could achieve the latter behaviour with the former instance by using an additional fmap. But uncurry zip <$> ((,) <$> idL <*> idL) has the same disadvantages regarding space efficiency as referred to above because the computation of `zip` cannot occur in lock-step with another computation. The meaning of the Applicative instance of `ListTransformer` can again be described using morphism properties on Applicative structures. The corresponding structure on the meaning type `[a] -> [b]` is the following combination of the Applicative instances for functions and for ZipLists. fmap f a = liftA (fmap f) a pure x = pure (getZipList (pure x)) a <*> b = getZipList <$> liftA2 (<*>) (ZipList <$> a) (ZipList <$> b) I have a gut feeling that the laziness provided by the `Fmap` constructor is too implicit to be useful for the kind of lock-step composition provided by ListTransformer. So I don't have high hopes that we can unify `ListConsumer` and `ListTransformer` into a single type. Do you have an idea? Sebastian

Sebastian Fischer wrote:
Your `ListTo` type achieves space efficiency for Applicative composition of list functions by executing them in lock-step. Because of the additional laziness provided by the `Fmap` constructor, compositions like
interpret a . interpret b
can also be executed in constant space. However, we cannot use the space efficient Applicative combinators again to form parallel compositions of sequential ones because we are already in the meaning type.
We could implement composition for the `ListTo` type as follows
(<.) :: ListTo b c -> ListTo a [b] -> ListTo a c a <. b = interpret a <$> b
But if we use a result of this function as argument of <*>, then the advantage of using `ListTo` is lost. While
interpret ((,) <$> andL <*> andL)
runs in constant space,
interpret ((,) <$> (andL <. idL) <*> (andL <. idL))
does not.
The ListTransformer type supports composition in lock-step via a category instance. The meaning of `ListTransformer a b` is `[a] -> [b]` with the additional restriction that all functions `f` in the image of the interpretation function are incremental:
xs `isPrefixOf` ys ==> f xs `isPrefixOf` f ys
[..]
The Applicative instance for `ListTransformer` is different from the Applicative instance for `ListTo` (or `ListConsumer`). While
interpret ((,) <$> idL <*> idL)
is of type `[a] -> ([a],[a])`
transformList ((,) <$> idL <*> idL)
is of type `[a] -> [(a,a)]`. [..]
Ah, so ListTransformer is actually quite different from ListTo because the applicative instance yields a different type. Then again, the former can be obtained form the latter via unzip .
I have a gut feeling that the laziness provided by the `Fmap` constructor is too implicit to be useful for the kind of lock-step composition provided by ListTransformer. So I don't have high hopes that we can unify `ListConsumer` and `ListTransformer` into a single type.
Do you have an idea?
Well, the simple solution would be to restrict the type of (<.) to (<.) :: ListTo b c -> ListTransformer a b -> ListTo a c so that the second argument is guaranteed to be incremental. Of course, this is rather unsatisfactory. Fortunately, there is a nicer solution that keeps everything in the ListTo type. The main problem with Fmap is that it can be far from incremental, because we can plug in any function we like: example :: ListTo a [a] example = Fmap reverse Without an explicit guarantee that the function is incremental, we can't do anything here. But we can just add another constructor to that effect if we turn ListTo into a GADT: data ListTo a b where CaseOf :: b -> (a -> ListTo a b) -> ListTo a b Fmap :: (b -> c) -> ListTo a b -> ListTo a c FmapCons :: b -> ListTo a [b] -> ListTo a [b] The interpretation for this case is given by the morphism interpret (FmapCons x g) = fmap (x:) $ interpret g and sequential composition reads -- sequential composition -- interpret (a <. b) = interpret $ interpret a <$> b (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c (CaseOf _ cons) <. (FmapCons y b) = cons y <. b (Fmap f a) <. (FmapCons y b) = Fmap f $ a <. (FmapCons y b) (FmapCons x a) <. (FmapCons y b) = FmapCons x $ a <. (FmapCons y b) a <. (CaseOf nil cons) = CaseOf (interpret a nil) ((a <.) . cons) a <. (Fmap f b) = fmap (interpret a . f) b Of course, the identity has to be redefined to make use of the new guarantee idL :: ListTo a [a] idL = caseOf [] $ \x -> FmapCons x idL I'm going to omit the new definition for the applicative instance, the full code can be found here: https://gist.github.com/1550676 With all these combinators in place, even examples like liftA2 (,) (andL <. takeL 3) (andL <. idL) should work as expected. While nice, the above solution is not perfect. One thing we can do with ListTransformer type is to perform an apply first and then do a sequential composition. a <. (b <*> c) This only works because the result of <*> is already zipped. And there is an even more worrisome observation: all this work would have been superfluous if we had *partial evaluation*, i.e. if it were possible to evaluate expressions like \xs -> f (4:xs) beneath the lambda. Then we could dispense with all the constructor yoga above and simply use a plain type ListTo a b = [a] -> b with the applicative instance instance Applicative (ListTo a) where pure b = const b (f <*> x) cs = case cs of [] -> f [] $ x [] (c:cs) -> let f' = f . (c:); x; = x . (c:) in f' `partialseq` x' `partialseq` (f' <*> x') to obtain space efficient parallel and sequential composition. In fact, by using constructors, we are essentially doing partial evaluation by hand. Best regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Hi, On Jan 2, 2012, at 2:34 PM, Heinrich Apfelmus wrote:
Without an explicit guarantee that the function is incremental, we can't do anything here. But we can just add another constructor to that effect if we turn ListTo into a GADT:
data ListTo a b where CaseOf :: b -> (a -> ListTo a b) -> ListTo a b Fmap :: (b -> c) -> ListTo a b -> ListTo a c
FmapCons :: b -> ListTo a [b] -> ListTo a [b]
I did not follow your discussion but how about using an additional GADT for the argument of Fmap, that is data Fun a b where Fun :: (a -> b) -> Fun a b Cons :: a -> Fun [a] [a] data ListTo a b where CaseOf :: b -> (a -> ListTo a b) -> ListTo a b Fmap :: Fun b c -> ListTo a b -> ListTo a c and provide a function to interpret this data type as well interpretFun :: Fun a b -> a -> b interpretFun (Fun f) = f interpretFun (Cons x) = (x:) for the sequential composition if I am not mistaken. (<.) :: ListTo b c -> ListTo a [b] -> ListTo a c (CaseOf _ cons) <. (Fmap (Cons y) b) = cons y <. b (Fmap f a) <. (Fmap g b) = Fmap f $ a <. (Fmap g b) a <. (CaseOf nil cons) = CaseOf (interpret a nil) ((a <.) . cons) a <. (Fmap f b) = fmap (interpret a . interpretFun f) b -- functor instance instance Functor (ListTo a) where fmap f = normalize . Fmap (Fun f) normalize :: ListTo a b -> ListTo a b normalize (Fmap (Fun f) (Fmap (Fun g) c)) = fmap (f . g) c normalize x = x Cheers, Jan

Jan Christiansen wrote:
On Jan 2, 2012, at 2:34 PM, Heinrich Apfelmus wrote:
Without an explicit guarantee that the function is incremental, we can't do anything here. But we can just add another constructor to that effect if we turn ListTo into a GADT:
data ListTo a b where CaseOf :: b -> (a -> ListTo a b) -> ListTo a b Fmap :: (b -> c) -> ListTo a b -> ListTo a c
FmapCons :: b -> ListTo a [b] -> ListTo a [b]
I did not follow your discussion but how about using an additional GADT for the argument of Fmap, that is
data Fun a b where Fun :: (a -> b) -> Fun a b Cons :: a -> Fun [a] [a]
data ListTo a b where CaseOf :: b -> (a -> ListTo a b) -> ListTo a b Fmap :: Fun b c -> ListTo a b -> ListTo a c
and provide a function to interpret this data type as well
interpretFun :: Fun a b -> a -> b interpretFun (Fun f) = f interpretFun (Cons x) = (x:)
for the sequential composition if I am not mistaken.
(<.) :: ListTo b c -> ListTo a [b] -> ListTo a c (CaseOf _ cons) <. (Fmap (Cons y) b) = cons y <. b (Fmap f a) <. (Fmap g b) = Fmap f $ a <. (Fmap g b) a <. (CaseOf nil cons) = CaseOf (interpret a nil) ((a <.) . cons) a <. (Fmap f b) = fmap (interpret a . interpretFun f) b
-- functor instance instance Functor (ListTo a) where fmap f = normalize . Fmap (Fun f)
normalize :: ListTo a b -> ListTo a b normalize (Fmap (Fun f) (Fmap (Fun g) c)) = fmap (f . g) c normalize x = x
Cheers, Jan
Nice, that is a lot simpler indeed, and even closer to the core of the idea. Best regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com
participants (3)
-
Heinrich Apfelmus
-
Jan Christiansen
-
Sebastian Fischer