Type hackery help needed!

Niklas Broberg

4 Aug 2006 4 Aug '06

11:19 p.m.

Hi fellow Haskelleers, I have a tricky problem that I just can't seem to solve. The problem is one of unresolved overloading, much like the (show . read) issue, but unlike that particular problem I feel there should be a solution to the one I'm wrestling with. I've tried to strip away all the unnecessary bits so the following will be a bit abstract, but it shows the gist of the problem. I have two overloaded functions - one I call 'build' that creates values from subparts, and one I call 'embed' that turns values into a form appropriate to call 'build' on. What they really represent is creating values of some XML tree structure, with embed making sure that the children are of the right type. Conceptually, we could imagine that these two functions to be defined as

...

class Build x c | x -> c, c -> x where build :: String -> [c] -> x

class Embed a b where embed :: a -> b

They would be used together as in e.g.

...

p c = build "p" [embed c]

This works pretty well, the fundep x -> c tells me what b should be, assuming I infer a proper type for the result of the composition. The type of p is then p :: (Embed a c, Build x c) => a -> x where the c in the middle is determined by x via the fundep. My problems arise because I want to start nesting these to form tree structures, as in

...

tree = p (p "foo")

Expanding the definition of p, the argument to the outer call to build is now embed $ build "p" [embed "foo"] :: (Embed String c, Build x c, Embed x x1) => x1 Through the fundep on the outer build I can infer what x1 should be, but there's no way I can infer x (without inserting an explicit type signature which is out of the question). This problem is probably unsolvable in general, but for my particular problem there are some restrictions that makes me feel there should be a clever way of working with classes and fundeps to make this all work out. I just can't seem to find one. These are the rules that the setup must obey: * A value of any type should be embeddable inside a build expression of any result type, i.e. a -> b or b -> a cannot hold on Embed in general. * The exception to the above is that if an expression using 'build' is embedded inside an outer 'build', as in 'tree' above, the inner build should have the same result type as the outer build (in a sense forming b -> a only for types instantiating Build). In other words, children of an xml element must be of the same type as their parent, even before the call to embed. * However, it would be nice, but probably not crucial, if by using explicit type signatures where I want to, I could actually embed values of "other" xml types than the outer one I am embedding them in, and letting embed convert them to xml of the correct type. I suspect this cannot be. The types of build and embed are negotiable, and certainly the type classes, as long as the two can be used together in roughly the same way as indicated above. The calls to build and embed will be autogenerated by a preprocessor from a HSP-style XML syntax, e.g. <p><% c %></p> <==> build "p" [embed c] and for this reason any solution *must* be syntactically expressible as expressions on a form similar to the above, but those expressions can be arbitrarily convoluted. For instance in my failed attempts so far I have used let expressions around build to form self-recursive versions to "pass a type" down through the evaluation, as in p c = let x = build "p" [embed x c] in x This is an ok solution syntactically, if only it solved the problem, I still can't see how to propagate it to the next level of build. :-( Is there anyone out there with the proper type class fu who can see a way out for me? Is this even possible to do at all? If yes, tell me please, and if not, I would be most interested in seeing why it cannot work. Any and all comments are welcome. Thanks for reading this long :-) /Niklas

Show replies by date

oleg＠pobox.com

5 Aug 5 Aug

9:43 a.m.

...

A value of any type should be embeddable inside a build expression of any result type, i.e. a -> b or b -> a cannot hold on Embed in general.

It seems you might benefit from local functional dependencies, which are asserted per instance rather than for the whole class. They are explained in http://pobox.com/~oleg/ftp/Haskell/typecast.html Incidentally, that web page's source also gives an illustration of their use: http://pobox.com/~oleg/ftp/Haskell/typecast.hs The page itself is written in HSXML, which had to deal with a similar problem: in HSXML, 'p' is a polyvariadic function, which has to accept as many strings as the user cares to specify -- and in addition, arbitrary HSXML fragments like 'em', 'code', 'cite', etc. The latter are too built by polyvariadic functions. So, at some point I had to force the constraints and tell the functions that their arguments are over and they better give some data structure. Incidentally, that's where [] notation comes in quite handy. The brackets are not just the embellishment; by lucky accident, they actually force the constraints. BTW, the rendering of the above HSXML code checks local file links (and inserts the file sizes into the HTML code, while at it). There is a version of the HSXML rendering that, in addition to formatting Haskell code, passes it to a Haskell system to verify its typing -- and to optionally run it as well. The complete source code for the renderers is http://pobox.com/~oleg/ftp/Haskell/HSXML.tar.gz

Niklas Broberg

10:04 a.m.

Yes, this will surely do the trick, thanks a lot! :-) I got as far as defining a TypeEq class myself in one of my attempts, trying to trick the inference engine, but now seeing the full ingenuity of the TypeCast class I realize how far from the solution I really was. Again, thanks a million! /Niklas On 8/5/06, oleg@pobox.com wrote:

...

...
A value of any type should be embeddable inside a build expression of any result type, i.e. a -> b or b -> a cannot hold on Embed in general.

It seems you might benefit from local functional dependencies, which are asserted per instance rather than for the whole class. They are explained in

http://pobox.com/~oleg/ftp/Haskell/typecast.html

Incidentally, that web page's source also gives an illustration of their use: http://pobox.com/~oleg/ftp/Haskell/typecast.hs

The page itself is written in HSXML, which had to deal with a similar problem: in HSXML, 'p' is a polyvariadic function, which has to accept as many strings as the user cares to specify -- and in addition, arbitrary HSXML fragments like 'em', 'code', 'cite', etc. The latter are too built by polyvariadic functions. So, at some point I had to force the constraints and tell the functions that their arguments are over and they better give some data structure. Incidentally, that's where [] notation comes in quite handy. The brackets are not just the embellishment; by lucky accident, they actually force the constraints. BTW, the rendering of the above HSXML code checks local file links (and inserts the file sizes into the HTML code, while at it). There is a version of the HSXML rendering that, in addition to formatting Haskell code, passes it to a Haskell system to verify its typing -- and to optionally run it as well. The complete source code for the renderers is http://pobox.com/~oleg/ftp/Haskell/HSXML.tar.gz

Niklas Broberg

9 Aug 9 Aug

10:56 p.m.

|> It seems you might benefit from local functional dependencies, which |> are asserted per instance rather than for the whole class. They are |> explained in |> |> http://pobox.com/~oleg/ftp/Haskell/typecast.html Unfortunately I come crawling back with a failure. Either my fu was not strong enough to fully tame the power of the TypeCast, or there's something here that's trickier than I realize. This message is a literate haskell source file, and I'll set the scene more carefully this time. Note that many of my definitions here are not really definitions but part of the literate comments.

...

{-# OPTIONS_GHC -fglasgow-exts #-} {-# OPTIONS_GHC -fallow-overlapping-instances #-} {-# OPTIONS_GHC -fallow-undecidable-instances #-} module GenXML where

We start out with a (very simplified) datatype for XML:

...

data XML = Element String [XML] | CDATA String

We want to be able to generate XML values of this type using a function (not that this is still a comment) build :: String -> [XML] -> XML build = Element The intended use of build is as the unsugared counterpart of HSP-style XML-like syntax, for instance p c = <p><% c %></p> =desugar=> p c = build "p" [embed c] However, we also want to use this syntax to generate values of other possible representations of XML, so we put the function definition in a type class:

...

class Build xml child | xml -> child where build :: String -> [child] -> xml asChild :: xml -> child

Read out, we can build values of type xml holding children of type child. The use of asChild will be evident below. Clearly XML should fit for this, so we define the instance

...

instance Build XML XML where build = Element asChild = id

The next step is to allow values of different types to be embedded inside XML elements using the embed function:

...

class Build xml child => Embed a xml child where embed :: xml -> a -> child

In English, we can embed values of type a into a tree of type xml, by turning it into something of type child. (The first argument to 'embed' is only there to guide type inference, instantiations of the class are not allowed to look at it.) As an example, now we can embed String values into a tree of type XML:

...

instance Embed String XML XML where embed _ = CDATA

Clearly we also want to be able to embed XML values as children of some element, so we could define (comment) instance Embed XML XML XML where embed _ = id Now we can define p as

...

p c = let x = build "p" [embed x c] in x

and define a test function (comment) test :: XML test1 = p (p "foo") and if we do we get the following error from GHCi: ---------------------------------------------------------------- GenXML.hs:25:8: No instance for (Embed a xml XML) arising from use of `p' at GenXML.hs:25:8 Probable fix: add an instance declaration for (Embed a xml XML) In the definition of `test1': test1 = p (p "foo") GenXML.hs:25:11: No instance for (Embed [Char] xml child) arising from use of `p' at GenXML.hs:25:11 Probable fix: add an instance declaration for (Embed [Char] xml child) In the first argument of `p', namely `(p "foo")' In the definition of `test1': test1 = p (p "foo") ---------------------------------------------------------------- The problem is that the type of the intermediate value (p "foo") cannot be determined. Looking at the type of p we see p :: (Build a1 child, Embed a xml child) => a -> a1 This is pretty obvious, we have no way of knowing what the result of p should be just by its use, it is polymorphic in its result type, and we get no help with inference from the usage site either since it then occurs in a polymorphic position too. But my intention here, which is really the core of my problem, is that I want to disambiguate this problem, by stating (somehow) that if the result of a build is embedded inside another build, the result types of the two should be identical. That is, if we want to generate subtrees of some tree, we should generate them as having the correct type immediately. My first attempt was to define the instance instance (Build xml child) => Embed xml xml child where embed _ x = asChild x but it didn't quite work out, the instance selection still couldn't know what the result of the generation should be, so I still get the same error as above. When I saw TypeCast I thought I had the answer to my problems, and tried to define

...

class TypeCast a b | a -> b, b->a where typeCast :: a -> b class TypeCast' t a b | t a -> b, t b -> a where typeCast' :: t->a->b class TypeCast'' t a b | t a -> b, t b -> a where typeCast'' :: t->a->b instance TypeCast' () a b => TypeCast a b where typeCast x = typeCast' () x instance TypeCast'' t a b => TypeCast' t a b where typeCast' = typeCast'' instance TypeCast'' () a a where typeCast'' _ x = x

...

instance (Build xml child, TypeCast x xml) => Embed x xml child where embed _ x = asChild (typeCast x :: xml)

(Btw, why is the type signature needed for typeCast? Shouldn't it be given by the FD from the TypeCast class?) The point here is the very general instance head, containing only type variables, so it will match the result of any call to build. TypeCast is then used to ensure that what I really get is the same type as that which I'm trying to embed in. This works, and I can write for instance

...

test2 :: XML test2 = p (p (CDATA "foo"))

...

From my use of CDATA here I'm sure you can see where I'm headed, if you didn't already. The above test2 definition type checks and works as expected, but I still run into a problem if I try to define test1 from above:

...

From the type signature for test1 we know a1 to be XML, and thus by

test1 :: XML test1 = p (p "foo") When I try to define this I get the following error from GHCi: ---------------------------------------------------------------- GenXML.hs:29:8: Overlapping instances for Embed a XML XML arising from use of `p' at GenXML.hs:29:8 Matching instances: GenXML.hs:31:0: instance (Build xml child, TypeCast x xml) => Embed x xml child GenXML.hs:20:0: instance Embed String XML XML (The choice depends on the instantiation of `a' Use -fallow-incoherent-instances to use the first choice above) In the definition of `test1': test1 = p (p "foo") GenXML.hs:29:11: Overlapping instances for Embed [Char] a child arising from use of `p' at GenXML.hs:29:11 Matching instances: GenXML.hs:31:0: instance (Build xml child, TypeCast x xml) => Embed x xml child GenXML.hs:20:0: instance Embed String XML XML (The choice depends on the instantiation of `a, child' Use -fallow-incoherent-instances to use the first choice above) In the first argument of `p', namely `(p "foo")' In the definition of `test1': test1 = p (p "foo") ---------------------------------------------------------------- For the first one, if we turn on incoherent instances we get what we want. But we would then get the wrong instance for the second one. Looking at the type signature of the expression p (p "foo") we see that p (p "foo") :: (Embed a a1 child, Embed [Char] a child1) => a1 the FD that child should be XML, so we can simplify this to p (p "foo") :: (Embed a XML XML, Embed [Char] a child) => XML If we look at the first constraint, we would get two possible instances matching as per the error message above, and similarly for the second. So, the error message seems right on the spot, but that doesn't make me any happier, since I have no idea how to get around this problem. It seems that the TypeCast trick only works when you have no more instances than that most general one, but that won't work in my case. So I come crawling back, hoping to pick up some more wisdom to help me solve this problematic case. If you read this long, then thanks a lot for your interest! :-) /Niklas

oleg＠pobox.com

14 Aug 14 Aug

4 p.m.

New subject: Local functional dependencies: solving show . read and XML generation problems

Sorry for a late reply, I'm out of town. As I understand it, the problem is as follows: we'd like to construct different realizations of XML documents from data of different types. We wish to write p (p "foo") and specify the desired type of the XML document, like (p (p "foo")) :: XMLTree or (p (p "foo")) :: IO () -- for writing out the XML document etc. The function 'p' obviously has to be overloaded with the type |p :: a -> xml|. However, |p (p "foo")| exhibits the `show . read' problem. What should be the result type of the internal `p'? Functional dependencies help resolve the ambiguity; alas, we can't assert any dependency here. We should be able to use Strings with XML documents of different types, so we can't assert the argument of 'p' determines the result. Also, XML documents of the same type can be created from children of different types. We indeed have the `show . read' problem. Fortunately, there is a solution that does not involve proxies or type annotations. We use a `syntactic hint' to tell the typechecker which intermediate type we want. To be more precise, we assert local functional dependencies. Thus we can write: p c = build "p" [embed c] test1 :: XML test1 = p [[p [[p "foo"]]]] Our syntactic crutch is the list notation: [[x]]. We could have used a single pair of brackets, but we'd like to avoid overlapping instances (as is done in the following self-contained code). {-# OPTIONS_GHC -fglasgow-exts #-} {-# OPTIONS_GHC -fallow-undecidable-instances #-} module GenXML where data XML = Element String [XML] | CDATA String deriving Show class Build child xml | xml -> child where build :: String -> [child] -> xml instance Build XML XML where build = Element -- This type class has no functional dependency class Embed a child where embed :: a -> child instance Embed String XML where embed = CDATA -- This instance exhibits the functional dependency child -> a instance TypeCast a XML => Embed [[a]] XML where embed [[x]] = typeCast x p c = build "p" [embed c] test1 :: XML test1 = p [[p [[p "foo"]]]] -- Our silver bullet class TypeCast a b | a -> b, b->a where typeCast :: a -> b class TypeCast' t a b | t a -> b, t b -> a where typeCast' :: t->a->b class TypeCast'' t a b | t a -> b, t b -> a where typeCast'' :: t->a->b instance TypeCast' () a b => TypeCast a b where typeCast x = typeCast' () x instance TypeCast'' t a b => TypeCast' t a b where typeCast' = typeCast'' instance TypeCast'' () a a where typeCast'' _ x = x

Niklas Broberg

16 Aug 16 Aug

10:32 p.m.

New subject: Local functional dependencies: solving show . read and XML generation problems

Hi Oleg, Thanks a lot for your reply. I see now where my attempt went wrong and why it couldn't work in the first place, the instances will indeed overlap. I'm not completely satisfied with your solution though, but seeing how you did it has lead me to the solution I want. Details below. :-) ] Fortunately, there is a solution that does not involve proxies or ] type annotations. We use a `syntactic hint' to tell the typechecker ] which intermediate type we want. To be more precise, we assert local ] functional dependencies. Thus we can write: ] ] p c = build "p" [embed c] ] ] test1 :: XML ] test1 = p [[p [[p "foo"]]]] ] ] Our syntactic crutch is the list notation: [[x]]. We could have used a ] single pair of brackets, but we'd like to avoid overlapping ] instances (as is done in the following self-contained code). While I appreciate the ingenuity of the solution, unfortunately I cannot use it. First of all I don't want to require my users to write double brackets everywhere, it makes the code a lot uglier IMO. Another problem is that in my real library (as opposed to the simplified example I gave here) I allow the embedding of lists, which means that the [[x]] is not safe from overlap as it is in your example. But I still see the general pattern here, the point is just to get something that won't clash with other instances. I could define data X a = X a instance (TypeCast a XML) => Embed (X a) XML where embed (X a) = typeCast a and write test1 = p (X $ p (X $ p "foo")) Not quite so pretty, even worse than with the [[ ]] syntax. However, I have an ace up my sleeve, that allows me to get exactly what I want using your trick. Let's start the .lhs file first:

...

{-# OPTIONS_GHC -fglasgow-exts #-} {-# OPTIONS_GHC -fallow-overlapping-instances #-} {-# OPTIONS_GHC -fallow-undecidable-instances #-} module HSP where

import Control.Monad.State import Control.Monad.Writer import TypeCast -- putting your six lines in a different module

Now, the thing I haven't told you in my simplified version is that all the XML generation I have in mind takes place in monadic code. In other words, all instances of Build will be monadic. My whole point of wanting more than one instance is that I want to use one monad, with an XML representation, in server-side code and another in client-side code, as worked on by Joel Björnson. Since everything is monadic, I can define what it means to be an XML-generating monad in terms of a monad transformer:

...

newtype XMLGen m a = XMLGen (m a) deriving (Monad, Functor, MonadIO)

and define the Build and Embed classes as

...

class Build m xml child | m -> xml child where build :: String -> [child] -> XMLGen m xml

class Embed a child where embed :: a -> child

Now for the server-side stuff:

...

data XML = CDATA String | Element String [XML] deriving Show

newtype HSPState = HSPState Int -- just to have something type HSP' = StateT HSPState IO type HSP = XMLGen HSP'

Note that by including XMLGen we define HSP to be an XML-generation monad. Now we can declare our instances. First we can generate XML values in the HSP monad (we use HSP [XML] as the child type to enable embedding of lists):

...

instance GenXML HSP' XML (HSP [XML]) where genElement s chs = do xmls <- fmap concat $ sequence chs return (Element s xmls)

Second we do the TypeCast trick, with XMLGen as the marker type:

...

instance TypeCast (m x) (HSP' XML) => Embed (XMLGen m x) (HSP [XML]) where embed (XMLGen x) = XMLGen $ fmap return $ typeCast x

And now we can safely declare other instances that will not clash with the above because of XMLGen, e.g.:

...

instance Embed String (HSP [XML]) where embed s = return [CDATA s]

instance Embed a (HSP [XML]) => Embed [a] (HSP [XML]) where embed = fmap concat . mapM embed -- (why is there no concatMapM??)

This last instance is why I cannot use lists as disambiguation, and also why I need overlapping instances. Now for some testing functions:

...

p c = build "p" [embed c]

...

test0 :: HSP XML test0 = p "foo"

...

test1 :: HSP XML test1 = p (p "foo")

...

test2 :: HSP XML test2 = p [p "foo", p "bar"]

All of these now work just fine. We could end here, but just to show that it works we do the same stuff all over again for the clientside stuff (mostly dummy code, the clientside stuff doesn't work like this at all, this is just for show):

...

data ElementNode = ElementNode String [ElementNode] | TextNode String deriving Show

type HJScript' = WriterT [String] (State Int) type HJScript = XMLGen HJScript'

instance Build HJScript' ElementNode (HJScript ElementNode) where build s chs = do xs <- sequence chs return $ ElementNode s xs

instance TypeCast (m x) (HJScript' ElementNode) => Embed (XMLGen m x) (HJScript ElementNode) where embed (XMLGen x) = XMLGen $ typeCast x

instance Embed String (HJScript ElementNode) where embed s = return $ TextNode s

Testing the new stuff, using the same p as above:

...

test3 :: HJScript ElementNode test3 = p "foo"

test4 :: HJScript ElementNode test4 = p (p "foo")

And these also work just as expected! :-) Thanks a lot for teaching my the zen of TypeCast, it works like a charm once you learn to use it properly. Really cool stuff! :-) /Niklas

6905

Age (days ago)

6917

Last active (days ago)

List overview

Download

5 comments

2 participants

participants (2)

Niklas Broberg
oleg＠pobox.com