Re: [Hs-Generics] Traversable Functor Data,or: X marks the spot

27 Jun 2008

      Hi Claus!

We have a deadline coming soon, so this is only a brief comment about your
GMap implementation.

I played a bit with your fmap' function to see whether I could find an
obvious problem with the implementation. For instance, I used this datatype:

data Tricky a = Tricky a Char deriving (Data,Typeable)

and tried to transform a value of type "Tricky Char" to "Tricky Bool". I
wanted to see whether the second argument of Tricky would be (incorrectly)
transformed to Bool. But it turned out that fmap' behaved as expected. So I
think that SYB can pass the GMap test.

On the other hand, I think that using unsafeCoerce as a way to define
generic functions is inelegant and probably bad practice due to possible
runtime failures. However, I must admit that I was surprised when I saw your
trick :).

Probably we'll get back to you around next week. Meanwhile, you may find
useful to look at the paper we wrote on comparing generic programming
libraries[1]. In particular, you can look at caveats that apply to SYB.

Cheers,

Alexey

[1]
http://www.cs.uu.nl/wiki/bin/view/Alexey/ComparingLibrariesForGenericProgram...

On Tue, Jun 24, 2008 at 8:45 PM, Claus Reinke 
wrote:
...
Dear Generics;-)
this is a repost from cvs-libraries of an experiment which you might find
of interest, and on which I'd welcome feedback. One impact being that SYB
can implement your GMap test (on which it currently defaults), among other
things (see message 1 below, the technique probably applies to Uniplate as
well?). One issue being that Data instances for non-algebraic types that
default to
doing nothing render this technique unsafe (see message 2 below).
Btw, the announcement of the unified generics library project was so long
ago that I had forgotten about it and about this list (thanks to Simon PJ
for reminding me). My interest in this is partially from my past with HaRe,
partially from wanting generic traversal support over GHC AST types, so I'm
more interested
in generic traversal/analysis style libraries.
Claus
--------- message 1
The issue at hand: can we use Data/Typeable to do what Functor and
Traversible do so easily, namely type-changing maps? Or should
there be support for deriving all of Data/Typeable/Functor/Traversible
over GHC's AST?
A drastically simplified example from David's Haddock 2 code
(Haddock.Interface.Rename) needs to do a kind of mapM
renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName)
which is straightforward with Traversible, but that is not derivable
yet (David has been working on that, though), while the usual basis
of SYB, Data/Typeable, is derivable, but all SYB transformations are based
on gfoldl, the type of which does not permit type-changing maps:
gfoldl :: (Data a)       =>(forall a1 b. (Data a1) => c (a1 -> b) -> a1
-> c b)
      -> (forall g. g -> c g)
      -> a-> c a
One could probably go from heterogeneous Data types to a
homogeneously typed tree representation, do the map there, then transform
back, but that would require additional overhead
and a type coercion for the backtransform. Also, it seems a pity that the
derived code for gfoldl can handle such maps - it is just
gfoldl's type that gets in the way (and trying to generalize that
type is a whole different kind of headache..).
While boilerplate scrappers around the globe eagerly await the
release of
  "Generalizing the type of gfold, or: to braindamage and beyond"
  (author: unknown; release date: unknown; release: unlikely;-)
I thought I'd have a go at the smaller problem of finding a way
to bypass gfoldl's type systematically in order to use its derivable
code for things like fmap and traverse. This message summarizes
how far I've got so far, and asks for comments (does this do
what it should? is it safe? assuming it can be cleaned up to do
what it should in a safe way, does this mean that deriving Data/
Typeable for GHCs AST types will be sufficient, or should we
still have Traversible as well? etc.).
First, here is an implementation of renameDoc in terms of
gfoldl and unsafeCoerce (slightly cleaned up version of what
I sent in the other thread earlier):
data Name    = Name    String deriving (Show,Data,Typeable)
  data DocName = DocName String deriving (Show,Data,Typeable)
renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName)
  renameDoc (DocIdentifier ids) =     mapM (\(Name n)->return (DocName n))
ids >>= return . DocIdentifier
  renameDoc hsDoc = n2d (gfoldl k return hsDoc)
    where k c x = ap c (mkM (d2n . renameDoc) x)
n2d :: Monad m => m (HsDoc Name) -> m (HsDoc DocName)
          n2d = unsafeCoerce
          d2n :: Monad m => m (HsDoc DocName) -> m (HsDoc Name)
          d2n = unsafeCoerce
'DocIdentifier :: [id] -> HsDoc id' is the only constructor in HsDoc that
involves the parameter type 'id', so renameDoc either does the
parameter type conversion or -for other constructors- recurses into the
subexpressions, using gfoldl to build a monadic map.
The important insight here is that gfoldl's code can handle the task,
we just pretend that our map is type-preserving to conform to gfoldl's
restrictive type, by coercing the result types (inside gfoldl, we pretend
that renameDoc returns a (HsDoc Name), which outside gfoldl, we coerce back
to (HsDoc DocName)).
Assuming that noone looks at the return types inside gfoldl (at least,
not in a way that would be affected by this change in type - SYB
does support adhoc overloading, after all, as in mkM here), this
seems to work:
testDoc =       DocAppend (DocParagraph (DocAppend (DocIdentifier [Name
"well-typed"]) DocEmpty))            (DocAppend (DocString "programs")
     (DocIdentifier [Name "don't",Name "go",Name "anywhere"]))
*Main> renameDoc testDoc
  DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
DocEmpty))
   (DocAppend (DocString "programs")   (DocIdentifier [DocName
"don't",DocName "go",DocName "anywhere"]))
But can we generalize this, and what about those coercions?
Can we -ideally- define something like fmap and traverse in terms of
gfoldl, and hide the uglyness in their implementations?
Well, I'll spare you (and me;-) the details of my struggle with
the type system, and just show the results, with some comments. First, the
simpler fmap:
-- "X marks the spots";-) X should be private
  data X = X deriving (Data,Typeable)
fmap' :: (Data (f X)) => (a -> b) -> f a -> f b
  fmap' f x = markTheSpots (rec (wrap f)) x
    where
    markTheSpots :: (f X -> f X) -> (f a -> f b)
    markTheSpots f = unsafeCoerce . f . unsafeCoerce
    rec ::  (Data a) => (X -> X) -> a -> a
    rec f x = (gmapT (rec f) `extT` f) x
    wrap :: (a -> b) -> (X -> X)
    wrap f = unsafeCoerce . f . unsafeCoerce
Surprisingly simple for something that seemed impossible
at first, isn't it?-) Since we're already committed (for this
experiment, at least) to some type fiddling, we can make
more constructive use of unsafeCoerce, for two purposes:
1. We wrap the function parameter f, to make it look
  like a type-preserving function, on some private type X.
2. We mark the occurrences of the type constructor f's   parameter type, by
coercing 'f a' to 'f X' and 'f X' to
  'f b'.
Then, we simply use SYB to apply f, when its type matches
the parameter, or to recurse into the subexpressions using
gmapT, otherwise. If X is private, f will only be applied
to the "functor parameter" _positions_ in 'f a', not to other
"functor parameter" _type_ occurrences in 'f a':
*Main> fmap' not $ (True,True)
  (True,False)
  *Main> fmap' not $ [True,True]
  [False,False]
*Main> fmap' (\(Name s)->(DocName s)) testDoc
  DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
DocEmpty))
      (DocAppend (DocString "programs")       (DocIdentifier [DocName
"don't",DocName "go",DocName "anywhere"]))
Note how we use one kind of recursion where a manual implementation of fmap
would use two: handling subexpressions
(which we'd usually do by pattern matching and expression
construction) and functor type recursion.
Ecouraged by this success, we are ready to tackle the slightly more tricky
traverse, using the same techniques:
mark the spot, wrap the worker, one kind of recursion.
Only this time, we need to take care of the applicative
plumbing as well, so it's gfoldl instead of gmapT, and
some more complex types.
We need the usual SYB type extension support, but for
Applicative, not Monad (SYB was defined before
Applicative existed, it seems..):
-- type extension over Applicative f
  mkF :: forall f a b . (Applicative f,Typeable a,Typeable b)
       => (b -> f b) -> a -> f a
  mkF f x = case gcast (F f) of { Just (F f) -> f x; Nothing -> pure x }
extF :: forall t f a b . (Typeable a,Typeable b)        => (a -> f a) ->
(b -> f b) -> (a -> f a)
  (f `extF` fspec) x = case gcast (F fspec) of { Just (F fspec) -> fspec x;
Nothing -> f x }
newtype F f x = F { unF :: x -> f x }
And here we go:
traverse' :: forall f t a b . (Applicative f,Typeable1 f,
                                 Typeable1 t,Data (t X),
                                 Typeable a)
                             => (a -> f b) -> t a -> f (t b)
  traverse' f x = markTheSpots (rec (wrap f)) x
    where
    markTheSpots :: forall a b . (t X -> f (t X)) -> (t a -> f (t b))
    markTheSpots f = unsafeCoerce . f . unsafeCoerce
    wrap :: forall a b . (a -> f b) -> (X -> f X)
    wrap f = unsafeCoerce . f . unsafeCoerce
rec :: forall x . Data x => (X -> f X) -> x -> f x
    rec f x = (gfoldl (k f) z `extF` f) x     k :: forall a b . Data a =>
(X -> f X) -> f (a -> b) -> a -> f b
    k f c x = c <*> (mkF (rec f :: Data a => a -> f a) `extF` f) x
    z c = pure c
This does seem to do the right thing, so I don't seem to be
completely on the wrong track:
*Main> traverse' (pure . not) (True,True)
  (True,False)
*Main> traverse' (pure . not) [True,True]
  [False,False]
*Main> traverse' print testDoc
  Name "well-typed"
  Name "don't"
  Name "go"
  Name "anywhere"
  DocAppend (DocParagraph (DocAppend (DocIdentifier [()]) DocEmpty))
(DocAppend (DocString "programs") (DocIdentifier [(),(),()]))
*Main> traverse' (pure) testDoc
  DocAppend (DocParagraph (DocAppend (DocIdentifier [Name "well-typed"])
DocEmpty))       (DocAppend (DocString "programs")       (DocIdentifier
[Name "don't",Name "go",Name "anywhere"]))
*Main> traverse' (pure . (\(Name s)->DocName s)) testDoc
  DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"])
DocEmpty))
   (DocAppend (DocString "programs")    (DocIdentifier [DocName
"don't",DocName "go",DocName "anywhere"]))
but I'm too battered from trying to coerce gfoldl to analyze this
properly at the moment, so I'm sending this in the hope of (a)
not having to look at gfoldl's type for a while !-) and (b) getting
some feedback (is this useful? does the extra overhead matter?..), caveats
(cyclic programs, nested traversals, escaping Xs, ..?), etc.
Over to you,
Claus
PS. the "X marks the spot" trick reminds me of the popular
  medicine topic: delivery system plus targeted activation
  (SYB plus unsafeCoerce).
------------- message 2
fmap' :: (Data (f X)) => (a -> b) -> f a -> f b
...
fmap' f x = markTheSpots (rec (wrap f)) x
    where
    markTheSpots :: (f X -> f X) -> (f a -> f b)
    markTheSpots f = unsafeCoerce . f . unsafeCoerce
    rec ::  (Data a) => (X -> X) -> a -> a
    rec f x = (gmapT (rec f) `extT` f) x
    wrap :: (a -> b) -> (X -> X)
    wrap f = unsafeCoerce . f . unsafeCoerce
..
...
1. We wrap the function parameter f, to make it look
  like a type-preserving function, on some private type X.
2. We mark the occurrences of the type constructor f's   parameter type,
by coercing 'f a' to 'f X' and 'f X' to
  'f b'.
Then, we simply use SYB to apply f, when its type matches
the parameter, or to recurse into the subexpressions using
gmapT, otherwise. If X is private, f will only be applied
to the "functor parameter" _positions_ in 'f a', not to other
"functor parameter" _type_ occurrences in 'f a'
..but f might not be applied at all, which leads to the first
issue with this technique:
I was surprised to see Data instances for (a->b) and IO a, since for such
non-algebraic types, there isn't anything to gfoldl or gmap over. And those
instances do indeed seem
to offer very little functionality (not to mention the runtime
errors..).
For fmap'/traverse', this means that f will not be applied, and so it is
not a good idea to coerce the types as if f had been applied (because the
hidden parameter could be exposed after traversal, with changed type and
unchanged
representation!).. So we need to restrict the types of fmap'/traverse'.
Which leads to two questions:
- what is the rationale for having these non-functional
  Data instances?
If one is to have 'instance Data (a->b)', is there a way
  to make it more functional?
- how can we capture algebraic types in a type (class)?
I thought that Data would do just that, being designed
  to gfoldl over concrete data constructors, but apparently
  not. And I don't really want to have a separate list of   all the types
for which Data works, or of all the types   for which Data doesn't quite
work.
Claus
ps. What is the right list for this topic?
_______________________________________________
Generics mailing list
Generics@haskell.org
http://www.haskell.org/mailman/listinfo/generics

Re: [Hs-Generics] Traversable Functor Data,or: X marks the spot

Alexey Rodriguez