Re: [Haskell-cafe] data type declaration

From: Patrick Browne
Andrew, Thanks for your detailed feedback, it is a great help. I appreciate that the code does not do anything useful, nor is it an appropriate way to write Haskell, but it does help me understand language constructs. I have seen statements like
data C3 c3 a => Address c3 a = Address c3 a
Incidentally, there seems to be a consensus that this is a Bad Idea [1]. Even when you specify a type class context on a data declaration, Haskell still requires you to specify the context on functions that use that data (Address c a). What's worse is that you need the class restriction for *all* functions that use an Address, even if they don't operate on the component parts and don't make use of the type class at all. Basically, it ends up making all your type signatures longer with no benefit.

If you really want to ensure that every "Address c a" has this context, then use a smart constructor:

    address :: C3 c3 a => c3 -> a -> Address c3 a
    address = Address

and don't export the Address data constructor.

Also, Andrew said that type classes are closer to Java interfaces than C++ objects. This is true, but you may find the difference useful too. A small example of a common OOP-er mistake should demonstrate:

    -- return a Float if True, else an Int. At least it would if type classes were interfaces.
    toFloat :: Num a => Int -> Bool -> a
    toFloat x True = (fromIntegral x) :: Float
    toFloat x False = x

OOP-style interfaces have two features: they specify methods that are available for all types which implement the interface, and they are an existentially-quantified data constructor. Equivalent code in Java would return an existentially-quantified type - we know the interface methods are supported but not the data type, so the interface methods are all we can do. The exact type (either Float or Int) is wrapped in this existentially quantified box.

Haskell type classes don't do existential quantification, which is why the above doesn't compile. The return value of this function must be any a with a Num instance, and the caller (not toFloat) gets to choose the exact type to instantiate. Translating an interface to Haskell requires that you write both parts:

    class Num a where ...
    data INum = forall a. Num a => INum a -- requires the ExistentialQuantification extension

    -- now toFloat works
    toFloat :: Int -> Bool -> INum
    toFloat x True = INum ((fromIntegral x) :: Float)
    toFloat x False = INum x

Now the type variable is stuck in the INum type instead of being universally quantified (top-level forall), which is what an interface does. I think most Haskellers would prefer to choose a design that doesn't require existential quantification, but it really depends on the problem.

John

[1] GHC sources refer to this context as "stupid theta", see further discussion at http://hackage.haskell.org/trac/haskell-prime/wiki/NoDatatypeContexts
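For readers who want to try the existential version, here is a minimal self-contained sketch of the INum wrapper described above. It is an illustration rather than John's exact code: the Show constraint and the `describe` function are assumptions added here so that something observable can be done with the boxed value.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- INum hides the concrete numeric type, much as an interface value would.
-- The Show constraint is an assumption added for this sketch so the boxed
-- value can be rendered; the original message used only Num.
data INum = forall a. (Num a, Show a) => INum a

-- Return a boxed Float if True, else a boxed Int.
toFloat :: Int -> Bool -> INum
toFloat x True  = INum (fromIntegral x :: Float)
toFloat x False = INum x

-- Hypothetical consumer: once the value is boxed, only the class methods
-- (here Num's (+) and Show's show) can be used on it.
describe :: INum -> String
describe (INum n) = show (n + 1)
```

Loaded in GHCi, `describe (toFloat 2 True)` gives "3.0" and `describe (toFloat 2 False)` gives "3". The caller can no longer recover whether a Float or an Int is inside, which is the interface-like behaviour the message describes.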

On Jul 26, 2010, at 12:35 PM, John Lato wrote:
Incidentally, there seems to be a consensus that this is a Bad Idea [1]. Even when you specify a type class context on a data declaration, Haskell still requires you to specify the context on functions that use that data (Address c a).
This has always puzzled me. Take the obvious

    data Ord key => BST key val = Empty | Node key val (BST key val) (BST key val)

Why would anyone say this if they didn't *want* the constraint implied on every use? If you want the constraint implied on every use of any constructor, including ones where the constructor is used for pattern matching, what do you do if not this?

Good software engineering involves *controlled* use of redundancy. Having it *stated* in one place and *checked* in others is an example. Requiring the same information to be repeated everywhere is not.
What's worse is that you need the class restriction for *all* functions that use an Address,
and if you didn't WANT that, you wouldn't say this. Oh sure, something like

    is_empty (Empty) = True
    is_empty (Node _ _ _ _) = False

doesn't happen to make use of any constrained component. But it is part of a *group* of methods which collectively don't make any sense without it, so there's no real practical advantage to having some functions constrained and some not (unless you count delaying error message as an advantage).
and don't export the Address data constructor.
This doesn't help _within_ the defining module where you are pattern matching. In "stupid theta", the only stupidity would seem to be refusing to honour the programmer's evident intent. It's rather like saying "Oh the programmer said this constructor argument must be an Int, but I'll require him to repeat that everywhere".

Richard,
I'm not sure that I agree or disagree with you; I think the decision
is above my pay grade.
On Mon, Jul 26, 2010 at 4:49 AM, Richard O'Keefe wrote:
On Jul 26, 2010, at 12:35 PM, John Lato wrote:
Incidentally, there seems to be a consensus that this is a Bad Idea [1]. Even when you specify a type class context on a data declaration, Haskell still requires you to specify the context on functions that use that data (Address c a).
This has always puzzled me.
Take the obvious

    data Ord key => BST key val = Empty | Node key val (BST key val) (BST key val)
Why would anyone say this if they didn't *want* the constraint implied on every use? If you want the constraint implied on every use of any constructor, including ones where the constructor is used for pattern matching, what do you do if not this?
Currently you include the constraint manually every time you use the constructor (but you already know that). Another approach (which I wouldn't advocate) is to use existentially-quantified data, which carries its context automatically. I don't know if any other extensions would help, possibly GADTs?
Good software engineering involves *controlled* use of redundancy. Having it *stated* in one place and *checked* in others is an example. Requiring the same information to be repeated everywhere is not.
What's worse is that you need the class restriction for *all* functions that use an Address,
and if you didn't WANT that, you wouldn't say this.
I think this makes more sense when I think about a class context as a dictionary instead of a type restriction. If I think of a type class as meaning "I want these types to have this relationship", then I want that to be always true for this data. If I think of a type class as meaning "here's an extra set of functions that are available for these types", then I'd prefer not to carry it around unless it's necessary. In any case, even if you want to specify a type relation which is always valid, it's frequently irrelevant to the operation at hand, and can be ignored (left out) in those cases. If the behavior of class contexts on data types were changed to what you think it should mean, i.e. contexts specified in a data declaration are carried around for all uses of that type instead of just the data constructor, I wouldn't mind at all. Whether this is a good idea or would cause other problems, I can't say.
Oh sure, something like

    is_empty (Empty) = True
    is_empty (Node _ _ _ _) = False

doesn't happen to make use of any constrained component. But it is part of a *group* of methods which collectively don't make any sense without it, so there's no real practical advantage to having some functions constrained and some not (unless you count delaying error message as an advantage).
You don't delay an error message though; this is resolved at compile time. This function "is_empty" doesn't need the context, but any function that calls is_empty is likely to have it available anyway. If you write

    functionWithNoContext x = do_something_with (needsContext x)

the compiler complains that "functionWithNoContext" needs the context, exactly where it's required.

Would this be easier if "BST key val" carried the context implicitly? Probably so. And I do agree that for many data types it makes sense to have contexts available implicitly. Until that happens, though, I prefer to keep my type signatures as simple as possible.
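The per-function style described here can be sketched as follows. This is an illustration under assumptions, not code from the thread: the tree has no context on its declaration, and the names `isEmpty`, `keys`, `insert` and `fromList` are chosen for this example.

```haskell
-- A binary search tree with no context on the data declaration.
data BST key val
  = Empty
  | Node key val (BST key val) (BST key val)

-- Never compares keys, so it needs no Ord constraint.
isEmpty :: BST key val -> Bool
isEmpty Empty = True
isEmpty _     = False

-- In-order traversal; also needs no constraint.
keys :: BST key val -> [key]
keys Empty          = []
keys (Node k _ l r) = keys l ++ [k] ++ keys r

-- Compares keys, so the Ord constraint is stated here.
insert :: Ord key => key -> val -> BST key val -> BST key val
insert k v Empty = Node k v Empty Empty
insert k v (Node k' v' l r) = case compare k k' of
  LT -> Node k' v' (insert k v l) r
  GT -> Node k' v' l (insert k v r)
  EQ -> Node k v l r

-- Calls insert, so the constraint has to be propagated to this signature.
fromList :: Ord key => [(key, val)] -> BST key val
fromList = foldr (\(k, v) t -> insert k v t) Empty
```

If the `Ord key =>` part is dropped from the signature of `fromList`, GHC rejects the definition at compile time and points at the use of `insert`, which is the "exactly where it's required" behaviour mentioned above.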
and don't export the Address data constructor.
This doesn't help _within_ the defining module where you are pattern matching.
No, and it's particularly irksome that the only options are programmer discipline or creating a separate module for the data type and losing pattern matching. One thing on my wish list for Haskell' would be allowing for data constructors to be exported for pattern matching only. That is, you could do this:

    case x of
      Foo x -> ...

but not

    let y = Foo x

John
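Later versions of GHC (well after this thread) provide something close to this wish through the PatternSynonyms extension: a unidirectional pattern synonym can be used in patterns but not in expressions. The sketch below is an illustration only; the names `MkFoo`, `mkFoo` and `unFoo`, and the non-negative invariant, are hypothetical.

```haskell
{-# LANGUAGE PatternSynonyms #-}

-- The real constructor. A library module would leave MkFoo out of its
-- export list and export the synonym instead, e.g.
--   module M (Foo, pattern Foo, mkFoo) where
newtype Foo = MkFoo Int

-- Unidirectional pattern synonym: usable for matching, not for building.
pattern Foo :: Int -> Foo
pattern Foo x <- MkFoo x

-- Smart constructor enforcing a hypothetical invariant (non-negative values).
mkFoo :: Int -> Maybe Foo
mkFoo n
  | n >= 0    = Just (MkFoo n)
  | otherwise = Nothing

-- Pattern matching through the synonym works as usual.
unFoo :: Foo -> Int
unFoo (Foo x) = x
```

With these definitions, `case x of Foo n -> ...` is accepted, while `let y = Foo 3` is rejected by the compiler because the synonym has no builder.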

On 7/26/10 06:02, John Lato wrote:
If the behavior of class contexts on data types were changed to what you think it should mean, i.e. contexts specified in a data declaration are carried around for all uses of that type instead of just the data constructor, I wouldn't mind at all. Whether this is a good idea or would cause other problems, I can't say.
As I understand it: 1) carrying them around complicates Haskell98 (and now Haskell2010) compatibility (also see below); 2) GADTs do what you want, since they don't have backward compatibility baggage. As to the current proposal, I think nobody's certain what would happen to older programs if data were changed to carry contexts around --- someone might be relying on the current behavior, and changing it might produce runtime oddness instead of a compile-time error --- whereas making contexts illegal will produce an easily-fixed error message in all relevant cases.

-- 
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
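To make the GADT point concrete, here is a minimal sketch of the BST from earlier in the thread written in GADT syntax, with the Ord constraint attached to the Node constructor. It is an illustration under assumptions: the functions `member` and `insert` are chosen for this example and do not come from the thread.

```haskell
{-# LANGUAGE GADTs #-}

-- The Ord constraint is stored with every Node, so pattern matching on
-- Node brings it back into scope.
data BST key val where
  Empty :: BST key val
  Node  :: Ord key => key -> val -> BST key val -> BST key val -> BST key val

-- No Ord constraint in the signature: matching on Node supplies it.
member :: key -> BST key val -> Bool
member _ Empty = False
member k (Node k' _ l r) = case compare k k' of
  LT -> member k l
  GT -> member k r
  EQ -> True

-- Building a Node out of Empty still needs Ord from the caller, because
-- Empty carries no dictionary.
insert :: Ord key => key -> val -> BST key val -> BST key val
insert k v Empty = Node k v Empty Empty
insert k v (Node k' v' l r) = case compare k k' of
  LT -> Node k' v' (insert k v l) r
  GT -> Node k' v' l (insert k v r)
  EQ -> Node k v l r
```

So functions that only consume the tree can omit the constraint, while functions that construct a Node still state it once. Whether that counts as "more complex than needed" is the judgment call discussed in the replies.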

On Jul 27, 2010, at 3:02 AM, Brandon S Allbery KF8NH wrote:
As I understand it: 1) carrying [contexts] around complicates Haskell98 (and now Haskell2010) compatibility (also see below);
Like the availability of so many other features, this one could be controlled by a language pragma.
2) GADTs do what you want, since they don't have backward compatibility baggage.
They are also more complex than is needed for the problem at hand.
As to the current proposal, I think nobody's certain what would happen to older programs if data were changed to carry contexts around --- someone might be relying on the current behavior, and changing it might produce runtime oddness instead of a compile-time error --- whereas making contexts illegal will produce an easily-fixed error message in all relevant cases.
Does anyone know why `data' contexts were broken in the first place?
participants (3)
- Brandon S Allbery KF8NH
- John Lato
- Richard O'Keefe