
Hi Claus!
We have a deadline coming soon, so this is only a brief comment about your
GMap implementation.
I played a bit with your fmap' function to see whether I could find an
obvious problem with the implementation. For instance, I used this datatype:
data Tricky a = Tricky a Char deriving (Data,Typeable)
and tried to transform a value of type "Tricky Char" to "Tricky Bool". I
wanted to see whether the second argument of Tricky would be (incorrectly)
transformed to Bool. But it turned out that fmap' behaved as expected. So I
think that SYB can pass the GMap test.
On the other hand, I think that using unsafeCoerce as a way to define
generic functions is inelegant and probably bad practice due to possible
runtime failures. However, I must admit that I was surprised when I saw your
trick :).
Probably we'll get back to you around next week. Meanwhile, you may find
useful to look at the paper we wrote on comparing generic programming
libraries[1]. In particular, you can look at caveats that apply to SYB.
Cheers,
Alexey
[1]
http://www.cs.uu.nl/wiki/bin/view/Alexey/ComparingLibrariesForGenericProgram...
On Tue, Jun 24, 2008 at 8:45 PM, Claus Reinke
Dear Generics;-)
this is a repost from cvs-libraries of an experiment which you might find of interest, and on which I'd welcome feedback. One impact being that SYB can implement your GMap test (on which it currently defaults), among other things (see message 1 below, the technique probably applies to Uniplate as well?). One issue being that Data instances for non-algebraic types that default to doing nothing render this technique unsafe (see message 2 below).
Btw, the announcement of the unified generics library project was so long ago that I had forgotten about it and about this list (thanks to Simon PJ for reminding me). My interest in this is partially from my past with HaRe, partially from wanting generic traversal support over GHC AST types, so I'm more interested in generic traversal/analysis style libraries.
Claus
--------- message 1
The issue at hand: can we use Data/Typeable to do what Functor and Traversible do so easily, namely type-changing maps? Or should there be support for deriving all of Data/Typeable/Functor/Traversible over GHC's AST?
A drastically simplified example from David's Haddock 2 code (Haddock.Interface.Rename) needs to do a kind of mapM
renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName)
which is straightforward with Traversible, but that is not derivable yet (David has been working on that, though), while the usual basis of SYB, Data/Typeable, is derivable, but all SYB transformations are based on gfoldl, the type of which does not permit type-changing maps:
gfoldl :: (Data a) =>(forall a1 b. (Data a1) => c (a1 -> b) -> a1 -> c b) -> (forall g. g -> c g) -> a-> c a
One could probably go from heterogeneous Data types to a homogeneously typed tree representation, do the map there, then transform back, but that would require additional overhead and a type coercion for the backtransform. Also, it seems a pity that the derived code for gfoldl can handle such maps - it is just gfoldl's type that gets in the way (and trying to generalize that type is a whole different kind of headache..).
While boilerplate scrappers around the globe eagerly await the release of "Generalizing the type of gfold, or: to braindamage and beyond" (author: unknown; release date: unknown; release: unlikely;-)
I thought I'd have a go at the smaller problem of finding a way to bypass gfoldl's type systematically in order to use its derivable code for things like fmap and traverse. This message summarizes how far I've got so far, and asks for comments (does this do what it should? is it safe? assuming it can be cleaned up to do what it should in a safe way, does this mean that deriving Data/ Typeable for GHCs AST types will be sufficient, or should we still have Traversible as well? etc.).
First, here is an implementation of renameDoc in terms of gfoldl and unsafeCoerce (slightly cleaned up version of what I sent in the other thread earlier):
data Name = Name String deriving (Show,Data,Typeable) data DocName = DocName String deriving (Show,Data,Typeable)
renameDoc :: Monad m => HsDoc Name -> m (HsDoc DocName) renameDoc (DocIdentifier ids) = mapM (\(Name n)->return (DocName n)) ids >>= return . DocIdentifier renameDoc hsDoc = n2d (gfoldl k return hsDoc) where k c x = ap c (mkM (d2n . renameDoc) x)
n2d :: Monad m => m (HsDoc Name) -> m (HsDoc DocName) n2d = unsafeCoerce d2n :: Monad m => m (HsDoc DocName) -> m (HsDoc Name) d2n = unsafeCoerce
'DocIdentifier :: [id] -> HsDoc id' is the only constructor in HsDoc that involves the parameter type 'id', so renameDoc either does the parameter type conversion or -for other constructors- recurses into the subexpressions, using gfoldl to build a monadic map. The important insight here is that gfoldl's code can handle the task, we just pretend that our map is type-preserving to conform to gfoldl's restrictive type, by coercing the result types (inside gfoldl, we pretend that renameDoc returns a (HsDoc Name), which outside gfoldl, we coerce back to (HsDoc DocName)).
Assuming that noone looks at the return types inside gfoldl (at least, not in a way that would be affected by this change in type - SYB does support adhoc overloading, after all, as in mkM here), this seems to work:
testDoc = DocAppend (DocParagraph (DocAppend (DocIdentifier [Name "well-typed"]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [Name "don't",Name "go",Name "anywhere"]))
*Main> renameDoc testDoc DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [DocName "don't",DocName "go",DocName "anywhere"]))
But can we generalize this, and what about those coercions? Can we -ideally- define something like fmap and traverse in terms of gfoldl, and hide the uglyness in their implementations?
Well, I'll spare you (and me;-) the details of my struggle with the type system, and just show the results, with some comments. First, the simpler fmap:
-- "X marks the spots";-) X should be private data X = X deriving (Data,Typeable)
fmap' :: (Data (f X)) => (a -> b) -> f a -> f b fmap' f x = markTheSpots (rec (wrap f)) x where markTheSpots :: (f X -> f X) -> (f a -> f b) markTheSpots f = unsafeCoerce . f . unsafeCoerce rec :: (Data a) => (X -> X) -> a -> a rec f x = (gmapT (rec f) `extT` f) x wrap :: (a -> b) -> (X -> X) wrap f = unsafeCoerce . f . unsafeCoerce
Surprisingly simple for something that seemed impossible at first, isn't it?-) Since we're already committed (for this experiment, at least) to some type fiddling, we can make more constructive use of unsafeCoerce, for two purposes:
1. We wrap the function parameter f, to make it look like a type-preserving function, on some private type X.
2. We mark the occurrences of the type constructor f's parameter type, by coercing 'f a' to 'f X' and 'f X' to 'f b'.
Then, we simply use SYB to apply f, when its type matches the parameter, or to recurse into the subexpressions using gmapT, otherwise. If X is private, f will only be applied to the "functor parameter" _positions_ in 'f a', not to other "functor parameter" _type_ occurrences in 'f a':
*Main> fmap' not $ (True,True) (True,False) *Main> fmap' not $ [True,True] [False,False]
*Main> fmap' (\(Name s)->(DocName s)) testDoc DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [DocName "don't",DocName "go",DocName "anywhere"]))
Note how we use one kind of recursion where a manual implementation of fmap would use two: handling subexpressions (which we'd usually do by pattern matching and expression construction) and functor type recursion.
Ecouraged by this success, we are ready to tackle the slightly more tricky traverse, using the same techniques: mark the spot, wrap the worker, one kind of recursion. Only this time, we need to take care of the applicative plumbing as well, so it's gfoldl instead of gmapT, and some more complex types.
We need the usual SYB type extension support, but for Applicative, not Monad (SYB was defined before Applicative existed, it seems..):
-- type extension over Applicative f mkF :: forall f a b . (Applicative f,Typeable a,Typeable b) => (b -> f b) -> a -> f a mkF f x = case gcast (F f) of { Just (F f) -> f x; Nothing -> pure x }
extF :: forall t f a b . (Typeable a,Typeable b) => (a -> f a) -> (b -> f b) -> (a -> f a) (f `extF` fspec) x = case gcast (F fspec) of { Just (F fspec) -> fspec x; Nothing -> f x }
newtype F f x = F { unF :: x -> f x }
And here we go:
traverse' :: forall f t a b . (Applicative f,Typeable1 f, Typeable1 t,Data (t X), Typeable a) => (a -> f b) -> t a -> f (t b) traverse' f x = markTheSpots (rec (wrap f)) x where markTheSpots :: forall a b . (t X -> f (t X)) -> (t a -> f (t b)) markTheSpots f = unsafeCoerce . f . unsafeCoerce wrap :: forall a b . (a -> f b) -> (X -> f X) wrap f = unsafeCoerce . f . unsafeCoerce
rec :: forall x . Data x => (X -> f X) -> x -> f x rec f x = (gfoldl (k f) z `extF` f) x k :: forall a b . Data a => (X -> f X) -> f (a -> b) -> a -> f b k f c x = c <*> (mkF (rec f :: Data a => a -> f a) `extF` f) x z c = pure c
This does seem to do the right thing, so I don't seem to be completely on the wrong track:
*Main> traverse' (pure . not) (True,True) (True,False)
*Main> traverse' (pure . not) [True,True] [False,False]
*Main> traverse' print testDoc Name "well-typed" Name "don't" Name "go" Name "anywhere" DocAppend (DocParagraph (DocAppend (DocIdentifier [()]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [(),(),()]))
*Main> traverse' (pure) testDoc DocAppend (DocParagraph (DocAppend (DocIdentifier [Name "well-typed"]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [Name "don't",Name "go",Name "anywhere"]))
*Main> traverse' (pure . (\(Name s)->DocName s)) testDoc DocAppend (DocParagraph (DocAppend (DocIdentifier [DocName "well-typed"]) DocEmpty)) (DocAppend (DocString "programs") (DocIdentifier [DocName "don't",DocName "go",DocName "anywhere"]))
but I'm too battered from trying to coerce gfoldl to analyze this properly at the moment, so I'm sending this in the hope of (a) not having to look at gfoldl's type for a while !-) and (b) getting some feedback (is this useful? does the extra overhead matter?..), caveats (cyclic programs, nested traversals, escaping Xs, ..?), etc.
Over to you, Claus
PS. the "X marks the spot" trick reminds me of the popular medicine topic: delivery system plus targeted activation (SYB plus unsafeCoerce).
------------- message 2
fmap' :: (Data (f X)) => (a -> b) -> f a -> f b
fmap' f x = markTheSpots (rec (wrap f)) x where markTheSpots :: (f X -> f X) -> (f a -> f b) markTheSpots f = unsafeCoerce . f . unsafeCoerce rec :: (Data a) => (X -> X) -> a -> a rec f x = (gmapT (rec f) `extT` f) x wrap :: (a -> b) -> (X -> X) wrap f = unsafeCoerce . f . unsafeCoerce
..
1. We wrap the function parameter f, to make it look like a type-preserving function, on some private type X.
2. We mark the occurrences of the type constructor f's parameter type, by coercing 'f a' to 'f X' and 'f X' to 'f b'.
Then, we simply use SYB to apply f, when its type matches the parameter, or to recurse into the subexpressions using gmapT, otherwise. If X is private, f will only be applied to the "functor parameter" _positions_ in 'f a', not to other "functor parameter" _type_ occurrences in 'f a'
..but f might not be applied at all, which leads to the first issue with this technique:
I was surprised to see Data instances for (a->b) and IO a, since for such non-algebraic types, there isn't anything to gfoldl or gmap over. And those instances do indeed seem to offer very little functionality (not to mention the runtime errors..). For fmap'/traverse', this means that f will not be applied, and so it is not a good idea to coerce the types as if f had been applied (because the hidden parameter could be exposed after traversal, with changed type and unchanged representation!).. So we need to restrict the types of fmap'/traverse'. Which leads to two questions:
- what is the rationale for having these non-functional Data instances?
If one is to have 'instance Data (a->b)', is there a way to make it more functional?
- how can we capture algebraic types in a type (class)?
I thought that Data would do just that, being designed to gfoldl over concrete data constructors, but apparently not. And I don't really want to have a separate list of all the types for which Data works, or of all the types for which Data doesn't quite work.
Claus
ps. What is the right list for this topic?
_______________________________________________ Generics mailing list Generics@haskell.org http://www.haskell.org/mailman/listinfo/generics