encoding for least fixpoint

Hello,

I am trying to understand the definition of (co)inductive types in Haskell. After reading the beginning of Vene's thesis [1] I was happy, because I understood the definition of the least fixpoint:

newtype Mu f = In (f (Mu f))

But this definition coincides with his definition of the greatest fixpoint

newtype Nu f = Wrap (f (Nu f)),

which in reality is not true (e.g. for f x = a * x -- the 'stream functor' -- the type Mu f should be empty, shouldn't it?).

Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there over an article of Wadler [3], where the least fixpoint is encoded as

Lfix X. F X = All X. (F X -> X) -> X

and the greatest fixpoint as

Gfix X. F X = Exists X. (X -> F X) * X

I would like to understand these definitions, or get an intuition about their meaning. Could somebody give some explanation or a pointer to a place where I can find one?

Thanks a lot in advance,
ben

[1] http://www.cs.ut.ee/~varmo/papers/thesis.pdf
[2] http://www.iis.sinica.edu.tw/~scm/2007/encoding-inductive-and-coinductive-ty...
[3] http://homepages.inf.ed.ac.uk/wadler/papers/free-rectypes/free-rectypes.txt

On Tuesday 17 March 2009 7:36:21 pm ben wrote:
I am trying to understand the definition of (co)inductive types in Haskell. After reading the beginning of Vene's thesis [1] I was happy, because I understood the definition of the least fixpoint:
newtype Mu f = In (f (Mu f)).
But this definition coincides with his definition of the greatest fixpoint
newtype Nu f = Wrap (f (Nu f)),
which in reality is not true (e.g. for f x = a * x -- the 'stream functor' -- the type Mu f should be empty, shouldn't it?).
This is true in a categorical semantics where initial algebras and final coalgebras are distinct, like in a total language that gets its semantics from sets and total functions thereon. However, Haskell gets its semantics (modulo some potential weirdness in IO that people have been discussing lately) from categories of partial orders and monotone functions. It turns out that you can show that initial algebras and final coalgebras coincide in such a category, and so least and greatest fixed points are the same in Haskell (I don't know how to demonstrate this; my fairly cursory searching hasn't revealed any very good online references on DCPO categories and algebraic semantics), which is why there's no distinction in the data declaration (as opposed to, say, Agda).
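(A small illustration of the coincidence Dan describes -- this is a sketch added here, with StreamF and ones as made-up names. In Haskell, laziness lets the "least" fixed point of ben's stream functor be inhabited by infinite values, which is exactly why it agrees with the greatest one.)

-- ben's "stream functor": F x = a * x
data StreamF a x = SCons a x

newtype Mu f = In (f (Mu f))

-- In a set-based semantics Mu (StreamF a) would be empty (there are no
-- finite streams), but Haskell's laziness admits infinite inhabitants,
-- built by exactly the general recursion a total language forbids:
ones :: Mu (StreamF Int)
ones = In (SCons 1 ones)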
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X = All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X = Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about their meaning. Could somebody give some explanation or a pointer to a place where I can find one?
You can glean some of this from the initial algebra/final coalgebra definitions of (co)data.

The least fixed point Mu F of F is an F-algebra, which means there's an operation:

in : F (Mu F) -> Mu F

And Mu F is initial in the category of F-algebras, which means that for any other F-algebra (X, f : F X -> X), there's a unique algebra homomorphism:

cata_f : Mu F -> X

such that:

cata_f . in = f . map cata_f

And you can finesse that in Haskell with higher-order functions and make it:

cata :: forall x. (F x -> x) -> Mu F -> x

And, I suppose, since initial algebras Mu F correspond uniquely with such parameterized algebra homomorphisms, you're safe in *representing* them as such, which gives you:

Mu F ~ forall x. (F x -> x) -> x

which is what the Lfix equation above gets you.

In the other case you have final coalgebras Nu F, which come with a coalgebra operation:

out : Nu F -> F (Nu F)

and for any other F-coalgebra (X, f : X -> F X), a unique homomorphism:

ana_f : X -> Nu F

which, parameterizing by f in the same way, gets us:

ana :: forall x. (x -> F x) -> x -> Nu F

which results in a Nu F this time. So suppose we took the definition:

ana = C

for some Haskell constructor C. Well, that means we need a constructor with the type of ana, and that is exactly the existential:

data Nu f = forall x. C (x -> f x) x -- exists x. (x -> F x) * x

which gets you the Gfix equation above.

I realize some of that was hand-wavey, and maybe someone with more expertise could flesh it out a bit, but hopefully that helps.

Cheers,
-- Dan
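(To pin Dan's cata down with a concrete functor -- a minimal sketch with made-up names ListF and foldrList, not from the original message. Instantiating F at the list base functor makes cata come out as foldr.)

{-# LANGUAGE DeriveFunctor #-}

-- The list base functor: F x = 1 + a * x.
data ListF a x = NilF | ConsF a x deriving Functor

newtype Mu f = In (f (Mu f))

cata :: Functor f => (f x -> x) -> Mu f -> x
cata alg (In t) = alg (fmap (cata alg) t)

-- At F = ListF a, cata is exactly foldr:
foldrList :: (a -> b -> b) -> b -> Mu (ListF a) -> b
foldrList c n = cata alg
  where
    alg NilF        = n
    alg (ConsF x r) = c x r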

On Wed, 2009-03-18 at 03:22 -0400, Dan Doel wrote:
On Tuesday 17 March 2009 7:36:21 pm ben wrote:
I am trying to understand the definition of (co)inductive types in Haskell. After reading the beginning of Vene's thesis [1] I was happy, because I understood the definition of the least fixpoint:
newtype Mu f = In (f (Mu f)).
But this definition coincides with his definition of the greatest fixpoint
newtype Nu f = Wrap (f (Nu f)),
which in reality is not true (e.g. for f x = a * x -- the 'stream functor' -- the type Mu f should be empty, shouldn't it?).
This is true in a categorical semantics where initial algebras and final coalgebras are distinct, like in a total language that gets its semantics from sets and total functions thereon. However, Haskell gets its semantics (modulo some potential weirdness in IO that people have been discussing lately) from categories of partial orders and monotone functions. It turns out that you can show that initial algebras and final coalgebras coincide in such a category, and so least and greatest fixed points are the same in Haskell (I don't know how to demonstrate this; my fairly cursory searching hasn't revealed any very good online references on DCPO categories and algebraic semantics), which is why there's no distinction in the data declaration (as opposed to, say, Agda).
You can explain it to yourself (not a proof) by writing out the example for lists and co-lists along with fold for the list and unfold for the co-list. Then write conversion functions between them. You can go from lists to co-lists using fold or unfold, but going from co-list to list requires general recursion. And that's the key of course. In Agda or something that distinguishes inductive and co-inductive types you could do the first two conversions but not the conversion in the other direction.

Duncan
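(A sketch of the exercise Duncan suggests, with made-up names List/CoList and so on; the point is which conversions need general recursion.)

-- Inductive lists and their fold.
data List a = Nil | Cons a (List a)

foldList :: (a -> b -> b) -> b -> List a -> b
foldList c n Nil         = n
foldList c n (Cons x xs) = c x (foldList c n xs)

-- "Co-lists" (possibly infinite) and their unfold.
data CoList a = CoNil | CoCons a (CoList a)

unfoldCoList :: (s -> Maybe (a, s)) -> s -> CoList a
unfoldCoList step s = case step s of
  Nothing      -> CoNil
  Just (x, s') -> CoCons x (unfoldCoList step s')

-- List -> CoList: definable with fold alone.
listToCoList :: List a -> CoList a
listToCoList = foldList CoCons CoNil

-- CoList -> List: needs general recursion (a total language would
-- reject this, since the input may be infinite).
coListToList :: CoList a -> List a
coListToList CoNil         = Nil
coListToList (CoCons x xs) = Cons x (coListToList xs)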

On Wednesday 18 March 2009 5:28:35 am Duncan Coutts wrote:
You can explain it to yourself (not a proof) by writing out the example for lists and co-lists along with fold for the list and unfold function for the co-list. Then write conversion functions between them. You can go from lists to co-lists using fold or unfold, but going from co-list to list requires general recursion. And that's the key of course. In Agda or something that distinguishes inductive and co-inductive types you could do the first two conversions but not the conversion in the other direction.
Yeah, I know how to show it can be done in Haskell. I meant more along the lines of a proof (sketch) starting from a definition of a CPO category and ending with the fact that initial algebras and terminal coalgebras are the same. Because, of course, in addition to going from "we have general recursion" to "least and greatest fixed points are the same", we can do the opposite:

fix :: (a -> a) -> a
fix = cata (\(f,fixf) -> f fixf) . ana (\f -> (f,f))

I suppose it's obvious that the least fixed point of 'F x = a*x' isn't empty in a category like the one people use to model Haskell, because every type is inhabited by at least one element, _|_. Once you have that as a base, you can construct plenty of elements in ways that would be acceptable even in a total language: (1, _|_), (2, (1, _|_)), etc. What remains is to show that you can get the infinite elements somehow (and that the greatest fixed point has the above values, but that seems less radical).

Anyhow, lastly I wanted to mention to ben that it's probably less hand-wavey to explain where you get:

Mu F = forall x. (F x -> x) -> x

from by mentioning the paper Build, Augment and Destroy. Universally. That develops a slightly tweaked semantics that takes as its basis, for some functor F, essentially pairs (Mu' F, fold') where Mu' F is an F-algebra, and fold' is obviously similar in function to the cata you get from an initial algebra. You define a data type as a limit for such things, which gets you a unique morphism that, when you finesse it back into Haskell types, gives you:

build :: (forall x. (F x -> x) -> x) -> Mu F

And the paper of course goes on to show that initial algebras also fit with this construction, and vice versa, so the two formulations are equivalent. But the above type looks a lot like the one we had for terminal coalgebras in my last mail, so the definition:

newtype Mu f = Build (forall x. (f x -> x) -> x)

seems in order.

I realize that was still pretty vague, but the paper gives a rigorous treatment which should hopefully clear up the details.

(There's an analogous construction for greatest fixed points in the paper, but it gets you

destroy :: (forall x. (x -> F x) -> x -> c) -> Nu f -> c

which gives you interesting ways to take apart coinductive objects, not build them. Of course, I suppose you could recognize that this can be tweaked into:

destroy :: ((exists x. (x -> F x) * x) -> c) -> Nu f -> c

and further suppose that the above is just an identity function, modulo some wrapping to make the type nice, which would get you to a similar place.)

Cheers,
-- Dan
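(A minimal sketch making Dan's fix definition above typecheck and run; the single Fix type and the standard cata/ana are assumptions added here -- they work precisely because, as discussed, least and greatest fixed points coincide in Haskell. The Functor ((,) e) instance comes from base.)

newtype Fix f = In (f (Fix f))

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg (In t) = alg (fmap (cata alg) t)

ana :: Functor f => (a -> f a) -> a -> Fix f
ana coalg x = In (fmap (ana coalg) (coalg x))

-- Dan's fix, at the functor F x = ((a -> a), x). Laziness in the
-- pair functor is what lets the infinite unfolding be consumed.
fix :: (a -> a) -> a
fix = cata (\(f, fixf) -> f fixf) . ana (\f -> (f, f))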

On Tue, Mar 17, 2009 at 4:36 PM, ben wrote:
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X = All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X = Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about their meaning.
Could somebody give some explanation or a pointer to a place where I can find one?
This is interesting. So, there are two things going on. First, we only allow "x" to appear in positive positions in f; for standard data types in Haskell, this is equivalent to saying that you can write an instance of Functor for f. (I have an interesting proof of this which I think I will write up for the Monad.Reader) Once you have that, you can get to defining fixed points on those types. Lfix encodes a fixed point by its fold. For example, consider this type:
data Cell x xs = Nil | Cons x xs
Lfix (Cell a) is then equivalent to (forall b. (Cell a b -> b) -> b), which, if you expand the function argument by case analysis on Cell, gives you (forall b. b -> (a -> b -> b) -> b); that is, foldr f z xs = xs z f. This is where the "for free" comes into the description; it's like the encoding of datatypes in System F without actual datatypes/case. Let "Pair a b" be an abbreviation for "forall c. (a -> b -> c) -> c"; then you have:

pair :: forall A B. A -> B -> Pair A B
pair = /\A B -> \a b -> /\C -> \k -> k a b

But in this case, instead of building simple datatypes, we are instead building *recursive* datatypes, without actually requiring fixed points in the language!

Gfix is similar, but instead of encoding a type by its fold, it encodes it as a state machine. So a Gfix ((,) Integer) stream of the natural numbers could look something like this: Gfix (\x -> (x, x+1)) 0; the first element of the pair is the stream result, and the second is the existential state which is used to generate the next pair.

For reference, here's some code implementing this concept. Keep in mind that we are working in a subset of Haskell without fixed points, so no recursion is allowed.

{-# LANGUAGE ExistentialQuantification, RankNTypes #-}
module FixTypes where

newtype Lfix f = Lfix (forall x. (f x -> x) -> x)

l_in :: Functor f => f (Lfix f) -> Lfix f
l_in fl = Lfix (\k -> k (fmap (\(Lfix j) -> j k) fl))
-- derivation of l_in was complicated!

-- l_out :: Functor f => Lfix f -> f (Lfix f)

instance Functor (Either a) where
  fmap f (Left a)  = Left a
  fmap f (Right x) = Right (f x)

test_l :: Lfix (Either Int)
test_l = Lfix (\k -> k $ Right $ k $ Right $ k $ Left 2)

data Gfix f = forall x. Gfix (x -> f x) x

g_out :: Functor f => Gfix f -> f (Gfix f)
g_out (Gfix f s) = fmap (\x -> Gfix f x) (f s)

-- g_in :: Functor f => f (Gfix f) -> Gfix f

instance Functor ((,) a) where
  fmap f ~(a,x) = (a, f x)

test_g :: Gfix ((,) Integer)
test_g = Gfix (\x -> (x, x+1)) 0
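(A small usage sketch for the FixTypes module above -- the helper names runL/headG/tailG are made up here, not part of the original message. It shows the encodings being consumed without any recursion on our side.)

-- assumes the FixTypes module above is in scope

-- Consume the Church-encoded value with an algebra for (Either Int):
-- count the Rights on top of the Left's payload.
runL :: Lfix (Either Int) -> Int
runL (Lfix k) = k (either id (+1))
-- runL test_l == 4

-- Observe the coinductive stream one step at a time:
headG :: Gfix ((,) a) -> a
headG (Gfix step s) = fst (step s)

tailG :: Gfix ((,) a) -> Gfix ((,) a)
tailG (Gfix step s) = Gfix step (snd (step s))
-- headG test_g == 0,  headG (tailG test_g) == 1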
[3]http://homepages.inf.ed.ac.uk/wadler/papers/free-rectypes/free-rectypes.txt
There's something from Wadler's draft that doesn't make sense to me. He says:
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism T ~ F T. Note that it is an isomorphism, not an equality: the type comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't think it's possible to derive a generic "out" for Lfix or "in" for Gfix; maybe such a function *can* always be generated (specific to a particular type), but I don't think it's possible to do so while just relying on Functor. Perhaps stronger generic programming methods are necessary.

-- ryan

On Wed, Mar 18, 2009 at 5:15 AM, Ryan Ingram wrote:
newtype Lfix f = Lfix (forall x. (f x -> x) -> x)
l_in :: Functor f => f (Lfix f) -> Lfix f
l_in fl = Lfix (\k -> k (fmap (\(Lfix j) -> j k) fl))
-- derivation of l_in was complicated!
I found l_in easiest to write in terms of cata and build.

cata :: (f a -> a) -> Lfix f -> a
cata f (Lfix t) = f t

build :: (forall x. (f x -> x) -> c -> x) -> c -> Lfix f
build t c = Lfix (\f -> t f c)

l_in :: Functor f => f (Lfix f) -> Lfix f
l_in = build (\f -> f . fmap (cata f))

And, dually,

ana :: (a -> f a) -> a -> Gfix f
ana f a = Gfix f a

destroy :: (forall x. (x -> f x) -> x -> c) -> Gfix f -> c
destroy t (Gfix f x) = t f x

g_out :: Functor f => Gfix f -> f (Gfix f)
g_out = destroy (\f -> fmap (ana f) . f)
There's something from Wadler's draft that doesn't make sense to me. He says:
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism T ~ F T. Note that it is an isomorphism, not an equality: the type comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't think it's possible to derive a generic "out" for Lfix or "in" for Gfix; maybe such a function *can* always be generated (specific to a particular type), but I don't think it's possible to do so while just relying on Functor. Perhaps stronger generic programming methods are necessary.
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)
g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
--
Dave Menendez

On Wed, Mar 18, 2009 at 8:10 AM, David Menendez wrote:
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)
g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
Well, you just blew my mind. I had an informal proof in my head of why g_in shouldn't be possible to write, but I must be missing something huge. Looks like I need to get out some paper and reduce this by hand, because there is some black magic going on here :)

-- ryan

On Wed, Mar 18, 2009 at 8:10 AM, David Menendez wrote:
l_out :: Functor f => Lfix f -> f (Lfix f)
l_out = cata (fmap l_in)
g_in :: Functor f => f (Gfix f) -> Gfix f
g_in = ana (fmap g_out)
OK, I understand these now. But they both seem to have significant performance implications, which I think are related to the "tail-in-terms-of-foldr" problem. Consider this definition:

safeTail [] = []
safeTail (x:xs) = xs

The simplest way to write this directly with foldr is:

safeTail' = fst . foldr f ([], [])
  where f x (t, l) = (l, x:l)

But the difference is that safeTail' here reconstructs the list from scratch, instead of reusing the existing data as safeTail does. This means that a chain (safeTail' . safeTail' . safeTail' ...) takes O(n^2) time, where the same chain of safeTail takes O(n).

Similarly, l_out and g_in both seem to add many administrative redexes to every element of the object being manipulated; consider

gid = g_in . g_out

then consider

(gid . gid . gid . gid) x

The result quickly gets filled up with administrative redexes that look like

Gfix (fmap g_out) $ g_out $ Gfix (fmap g_out) $ g_out $ Gfix ...

Is there a way to solve this problem and still maintain the nice totality guarantees that you get from System F without fix?

-- ryan

On Wednesday 18 March 2009 5:15:32 am Ryan Ingram wrote:
There's something from Wadler's draft that doesn't make sense to me. He says:
This introduces a new type, T = Lfix X. F X, satisfying the isomorphism T ~ F T. Note that it is an isomorphism, not an equality: the type comes equipped with functions in : T -> F T and out : F T -> T.
While I was able to derive "in" for Lfix, and "out" for Gfix, I don't think it's possible to derive a generic "out" for Lfix or "in" for Gfix; maybe such a function *can* always be generated (specific to a particular type), but I don't think it's possible to do so while just relying on Functor. Perhaps stronger generic programming methods are necessary.
The isomorphism comes from the fact that f (Nu/Mu f) is an f-(co)algebra.

fmap l_in  :: Functor f => f (f (Mu f)) -> f (Mu f)
fmap g_out :: Functor f => f (Nu f) -> f (f (Nu f))

Because of this and initiality/finality, there is a unique morphism from Mu f to f (Mu f), and from f (Nu f) to Nu f, namely:

cata (fmap l_in) :: Functor f => Mu f -> f (Mu f)
ana (fmap g_out) :: Functor f => f (Nu f) -> Nu f

where

cata f (Lfix k) = k f
ana g x = Gfix g x

Of course, this isomorphism is subject to caveats about bottom and seq in Haskell.

-- Dan

Hi Ben,
But this definition coincides with his definition of the greatest fixpoint
In Haskell the least and greatest fixed points coincide. (Categorically, this is called "algebraically compact" I think). You can still "fake" coinductive types to some degree by consistently using unfolds rather than folds.
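(For instance -- a tiny sketch, not from Wouter's message, of what "consistently using unfolds" can look like with Data.List.unfoldr:)

import Data.List (unfoldr)

-- Treat [Integer] as if it were coinductive: build it only through
-- unfold, never by explicit recursion.
nats :: [Integer]
nats = unfoldr (\n -> Just (n, n + 1)) 0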
Then I stumbled over a blog entry of Shin-Cheng Mu [2] and from there over an article of Wadler [3], where the least fixpoint is encoded as
Lfix X. F X = All X. (F X -> X) -> X.
and the greatest fixpoint as
Gfix X. F X = Exists X. (X -> F X) * X.
I would like to understand these definitions, or get an intuition about their meaning.
So here's my attempt at an explanation. For every least fixed point of a functor:

data Mu f = In (f (Mu f))

you can define a fold:

fold :: Functor f => forall a . (f a -> a) -> Mu f -> a
fold algebra (In t) = algebra (fmap (fold algebra) t)

Now your definition of Lfix above basically identifies the data type with all possible folds over it. (I suspect you will need some parametricity result to show that this is really an isomorphism.)

For codata, instead of having a fold you get an unfold:

unfold :: Functor f => forall a . (a -> f a) -> a -> Nu f
unfold coalg x = Wrap (fmap (unfold coalg) (coalg x))

And your Gfix type above identifies every codata type with its unfold. To see this, you need to realise that:

forall a . (a -> f a) -> a -> Nu f

is isomorphic to:

forall a . (a -> f a , a) -> Nu f

is isomorphic to:

(exists a . (a -> f a, a)) -> Nu f

which gives you one direction of the iso.

Now in case you think this is all airy-fairy category theory, there's a really nice paper "Stream fusion: from lists to streams to nothing at all" that shows how to use this technology to get some serious speedups over all kinds of list-processing functions.

Hope this helps,

Wouter
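(A sketch of the iso Wouter mentions, with made-up names toNu/fromNu; one direction is just the unfold, the other packages Nu f itself up as the state.)

{-# LANGUAGE ExistentialQuantification #-}

newtype Nu f = Wrap (f (Nu f))

data Gfix f = forall a. Gfix (a -> f a) a

-- (exists a. (a -> f a, a)) -> Nu f : run the unfold.
toNu :: Functor f => Gfix f -> Nu f
toNu (Gfix coalg x) = Wrap (fmap (toNu . Gfix coalg) (coalg x))

-- Nu f -> (exists a. (a -> f a, a)) : use Nu f itself as the state.
fromNu :: Nu f -> Gfix f
fromNu = Gfix (\(Wrap t) -> t)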

Thanks a lot to all of you for your help. It took some time for me to realize that the only difference between Vene and Wadler is, in fact, that Wadler has an explicit representation for the fixpoints -- which answers the question of existence. I will spend some more time on digesting all the information :-) and will try to find some information about algebraic compactness.

Greetings,
ben
participants (7): ben, Benedikt Ahrens, Dan Doel, David Menendez, Duncan Coutts, Ryan Ingram, Wouter Swierstra