
Hi all, I am writing, for my own amusement, a more general version of the trick to implement variadic functions in Haskell outlined at http://okmij.org/ftp/Haskell/vararg-fn.lhs. (If someone else has done this already, please point me to it!) Code is attached at the end of the message. My question concerns the use of `typeCast', which is defined as follows: class TypeCast x y | x -> y, y -> x where typeCast :: x -> y instance TypeCast x x where typeCast = id This is taken from the paper "Strongly typed heterogeneous collections" by Oleg Kiselyov, Ralf Lammel, and Keean Schupke. As observed there, the TypeCast instance declaration must occur in a separate module; otherwise the compiler is too eager about type simplification. (I also get the same errors if VarArg is interpreted by GHCi, even if TypeCast is in a separate module.) I think I understand the reason for this, but I find it a little surprising; it wasn't clear to me from the documentation that fundeps introduce a new dependence on module boundaries. Anyway, my main question about typeCast is this: why is it needed here at all? If I omit it entirely, the code compiles fine, but then using it gives error messages like the following: Prelude VarArg> build 'h' 'i' :: String <interactive>:1:0: No instances for (VarFold [a] [a] [Char], VarAccum Char [a]) arising from use of `build' at <interactive>:1:0-4 ... I don't understand why the type-checker is unable to infer that the type variable `a' should be specialized to `Char' here, since the only available instance of VarFold whose third type is [Char] has the first type also being [Char]. I've given the compiler all the type hints I can think of. Can someone explain this to me? I should say that if I add a functional dependency "l -> a" to the class VarAccum, then this particular example works. However, I have other examples in mind for which l doesn't functionally determine a, so I don't want to do that. And I don't see why it's necessary. Here's the code: class VarAccum a l where accum :: a -> l -> l class VarFold b l r where varFold :: (l -> b) -> l -> r instance (TypeCast b c) => VarFold b l c where varFold f xs = typeCast (f xs) instance (VarFold b l r, VarAccum a l) => VarFold b l (a -> r) where varFold f xs x = varFold f (accum x xs) -- This is the type of variadic functions from lots of As to a B. type a :--> b = forall r. (VarFold b [a] r) => r infixr 0 :--> instance VarAccum a [a] where accum x xs = (x:xs) varArg :: forall a b. ([a] -> b) -> (a :--> b) varArg f = varFold (f . reverse) ([] :: [a]) build :: forall a. a :--> [a] build = varArg (id :: [a] -> [a]) Thanks! Mike

Michael Shulman wrote:
class TypeCast x y | x -> y, y -> x where typeCast :: x -> y
instance TypeCast x x where typeCast = id
Anyway, my main question about typeCast is this: why is it needed here at all?
First of all, there is a version of TypeCast that works within the same module, please see any code described in http://pobox.com/~oleg/ftp/Haskell/typecast.html Appendix D of the full HList paper (or, HList technical report, I should say) gives the reasons for TypeCast. Briefly, TypeCast lets us replace a (ground) type T with a type variable "t" subject to the constraint: "TypeCast t T => t". The difference seems superficial since the typechecker can immediately discharge the constraint and replace "t" with "T". Well, placing the TypeCast declaration in a different module, or the more convenient magic 6-line declaration of class TypeCast make sure that the typechecker delays the discharging of the "TypeCast" constraint. So the type variable "t" stays uninstantiated for a while. It matters in instance declarations. For example, class Foo x where foo :: x -> Int instance Foo [Int] instance Foo Bool where foo x = fromEnum x If we write test1 = foo [read "1"] we get an error: No instance for (Foo [a]). Indeed, `read "1"' has the type 'a', which does not match "Int". The typechecker certainly cannot chose the "Foo [Int]" instance because who knows what other instances of Foo may occur in other modules. We can tell the typechecker however that "Foo [Int]" is the only instance that can ever be in the whole program (that is, the world is `locally' closed, as far as the Foo [a] is concerned). We do that by writing instance TypeCast x Int => Foo [x] where foo [x] = typeCast x and now test1 = foo [read "1"] typechecks (and works), without any type annotations. BTW, if we lied and did write another instance for "Foo [T]" for some T, the typecheker will complain.

On 9/27/06, oleg@pobox.com
First of all, there is a version of TypeCast that works within the same module, please see any code described in http://pobox.com/~oleg/ftp/Haskell/typecast.html
Yes, I was aware of that; I gave the shorter version just because it's shorter. I didn't realize how well-known this TypeCast is, so forgive my ignorance. Thanks for your answer. I think the real point at which I was confused is that the type-checker never unifies types in order to *find* an instance of a class, but functional dependencies of a class can make it unify types. Thus, by making the instance look more general but moving the unification into a TypeCast constraint, since TypeCast has a bidirectional fundep, we can make the instance match a pair of types that aren't yet unified, and then use TypeCast's fundep to force their unification. Is that right? Mike

This message is intended as a long answer to Michael Shulman's question (Re: variadic functions and typeCast) and Jason Dagit's question (Re: Duplicate Instance problem). Incidentally, the short answer to Jason Dagit's question is `constraints are disregarded during instance selection'. The answer to Michael Shulman is yes. There is a great temptation to read the following declarations class Pred a b instance (Ord b, Eq b) => Pred [Int] b instance Pred Bool Bool as a Prolog program: pred([int],B) <- ord(B),eq(B). pred(bool,bool). (In Prolog, the names of variables are capitalized and the names of constants begin with a lower-case letter. In Haskell type expressions, it's the other way around). There are significant differences between the evaluation of the two programs. The Prolog evaluator, given a goal 'pred([X],bool)', scans the rules in their order of appearance, trying to unify the goal with rule's head. If successful, the evaluator then tries to solve the goals in the body of the rule, again strictly in their order of appearance. On failure, it backtracks. In the above example, the goal unifies with the head of the first rule, yielding the substitution {X->int,B->bool}. So, the evaluator will try to solve the goal ord(bool), and then, if successful, eq(bool). On failure, the evaluator will try to unify the goal with the fact 'pred(bool,bool)'. So, Prolog follows the top-to-bottom, left-to-right strategy with backtracking -- so called SLD resolution. The evaluation of the typeclass program has two significant differences. First, there is no backtracking. That makes the evaluation deterministic, and, furthermore, independent on the order of the instances and the order of constraints. That is a very good thing. For comparison, consider XSLT 1.0, Sec 5.5, and the whole mess with import precedences and priorities. The NodeTest pattern is assigned the priority -0.5. Naturals are not enough for them. The second difference from the Prolog evaluation is the use of matching rather than the full unification when selecting a rule. For example, the goal "Pred [x] Bool" does _not_ match the head of the first Pred instance. The type variable 'x' in the goal is not considered variable during the instance selection. The absence of backtracking also means that the Haskell typechecker does not need to consider the constraints when selecting the instance. If the head of the instance matches the goal, the typechecker commits to that instance, and only then checks for the constraints such as Ord b and Eq b in our example. The failure to (eventually) satisfy these constraints is an unrecoverable type error. Thus, the following instances in Jason Dagit's question: instance Num a => StringValue a where instance RealFloat a => StringValue a where are identical for the instance selection purposes, where only the instance head matters. The absence of backtracking and head-matching rather than the full unification do not mean that the typeclass language is deficient. The TypeCast generalizes head-matching to controlled unification. Indeed, if we write the first instance as instance (TypeCast z Int, Ord b, Eq b) => Pred [z] b then the goal "Pred [x] Bool" will match the head, giving the substitution {z->x, b->Bool}. The typechecker commits to the instance and adds to the current constraints TypeCast x Int, Ord Bool, Eq Bool The latter two are obviously satisfied and so discharged. The former leads to the substitution {x->Int}. So, the net effect is as if we _unified_ "Pred [x] Bool" with "Pred [Int] b", rather than merely matched. Compared with Prolog, we have more flexibility: each rule may specify which of its arguments should be unified against the corresponding arguments of the goal, and which should be merely matched. The backtracking can also be emulated -- but it has to be programmed explicitly. We have to program the typechecker in the typechecker. That is not so difficult: the `Smash-your-boilerplate' message essentially implements the (conventional, backtracking-less) instance resolution algorithm in the typechecker itself. Please compare gsize in SYB3 with gsize in `Smash'. The special processing cases in the former are specified as methods in typeclass instances. With `Smash', exactly the same information is given in a heterogenous list. Perhaps the former is more convenient, but the latter is more flexible as we can add, inspect, and remove `instances'. The class SApply at the end of the `Smash' message implements the instance selection algorithm.

On 9/27/06, oleg@pobox.com
This message is intended as a long answer to Michael Shulman's question (Re: variadic functions and typeCast) and Jason Dagit's question (Re: Duplicate Instance problem). Incidentally, the short answer to Jason Dagit's question is `constraints are disregarded during instance selection'. The answer to Michael Shulman is yes.
Hmm...Perhaps I'm just sleepy but I read through the descriptions below and I'm still not sure if what I want to do is easy or extremely difficult. I think you're suggesting that I should use something like TypeCast to control the unification if I want to get rid of my duplicate instances problem? I'll re-read this a few times and look at the TypeCast stuff. I feel like this is quite a bit over my head at the moment, but your description is very good and hopefully it will make more sense in the morning. Thanks, Jason

Jason Dagit wrote:
I tried to create a type class for making instances of Show display a custom way. After using my class for a while I found that sometimes RealFloats would display as 'NaN' and this is unacceptable. So at this point I had something like:
class Show a => StringValue a where toString :: a -> String
instance Num a => StringValue a where toString = show
instance RealFloat a => StringValue a where toString x | isNaN x = "" -- as an example | otherwise = show x
The simplest solution: replace the constraint RealFloat with its members. That is, replace "instance RealFloat a => StringValue a" with instance StringValue Float where toString x | isNaN x = "" -- as an example | otherwise = show x and the same for Double. There are currently only two members of RealFloat, so code duplication should not be so terrible. Or if it is, then you can do
{-# OPTIONS -fglasgow-exts #-} {-# OPTIONS -fallow-undecidable-instances #-} {-# OPTIONS -fallow-overlapping-instances #-}
class Show a => StringValue a where toString :: a -> String toString = show
instance Num a => StringValue a -- works for all Num except the following
instance StringValue Float where toString = myrealfloatthing instance StringValue Double where toString = myrealfloatthing
myrealfloatthing x | isNaN x = "" -- as an example | otherwise = show x
t1 = toString (1::Int) t2 = toString (1::Double) t3 = toString ((0.0::Double)/0.0)
A more general solution will use generic programming. I doubt however it will be much shorter than the above.

Thank you Oleg! That explanation is very clear.
On 9/28/06, oleg@pobox.com
The typechecker commits to the instance and adds to the current constraints TypeCast x Int, Ord Bool, Eq Bool The latter two are obviously satisfied and so discharged. The former leads to the substitution {x->Int}.
By the rules just enumerated, this substitution would be forbidden, since the type variable x is not considered variable, right? So this happens via the magic of functional dependencies? The typechecker encounters (eventually) the instance instance TypeCast'' () x x and since TypeCast'' is declared with a functional dependency: class TypeCast'' t x y | t x -> y, t y -> x it concludes that the first x in the above instance is uniquely determined by the second, which in our case is Int, and therefore x must be Int. Is that right? This is all very beautiful, but it's a little annoying that the cornerstone "silver bullet" TypeCast has to be defined in a way that fools the typechecker into doing the right thing in spite of itself. Is this all part of the general "fundeps are very hard to get right" problem? Mike

I previously wrote:
The typechecker commits to the instance and adds to the current constraints TypeCast x Int, Ord Bool, Eq Bool The latter two are obviously satisfied and so discharged. The former leads to the substitution {x->Int}.
I should have been more precise and said: The former _eventually_ leads to the substitution {x->Int}. Your analysis is right on the mark. That's exactly how I think of TypeCast.
This is all very beautiful, but it's a little annoying that the cornerstone "silver bullet" TypeCast has to be defined in a way that fools the typechecker into doing the right thing in spite of itself.
One of the drafts of the HList paper, when describing TypeCast, literally had a phrase about `fooling the typechecker'... Well, it seems things like TypeCast were not envisioned by the Founding Fathers. In this respect, the story of C++ templates come to mind. My feeling is that we're still discovering the capabilities of the Haskell typechecker and the right abstractions. We should view TypeCast as an ungainly _encoding_ of a simple abstraction, or just as a tool for implementing type-level typecase and local improvement rules. When the right abstraction emerges (and perhaps it already has: CHR), GHC might implement it directly. Until then, we can use the encoding...

oleg@pobox.com writes:
There is a great temptation to read the following declarations
class Pred a b instance (Ord b, Eq b) => Pred [Int] b instance Pred Bool Bool
as a Prolog program:
pred([int],B) <- ord(B),eq(B). pred(bool,bool).
(In Prolog, the names of variables are capitalized and the names of constants begin with a lower-case letter. In Haskell type expressions, it's the other way around).
Why not simply say that type classes have a natural interpretation in terms of Constraint Handling Rules (CHRs). For example, the above instances translate to rule Pred [Int] <==> Ord b, Eq b rule Pred Bool Bool <==> True CHRs specify rewrite rules among constraints (from left to right). A CHRs fires if we find a constraint which matches the left hand side (compare this to Prolog where we use unification). Why CHRs are great for reasoning about type classes is well-documented here: http://www.comp.nus.edu.sg/~sulzmann/research/node4.html Martin
participants (4)
-
Jason Dagit
-
Martin Sulzmann
-
Michael Shulman
-
oleg@pobox.com