Has anyone looked into adding subtyping to Haskell?

OCaml has been getting a lot of mileage from its polymorphic variants (which allow structural subtyping on sum types) especially on problems relating to AST transformations and the infamous "expression problem". Has there been any work on extending Haskell's type system with structural subtyping? What is the canonical solution to the expression problem in Haskell? What techniques do Haskellers use to simulate subtyping where it is appropriate? I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.). -- Alan Falloon

Al,
Has there been any work on extending Haskell's type system with structural subtyping?
Koji Kagawaga. Polymorphic variants in Haskell. In Andres Loeh, editor, Proceedings of the 2006 ACM SIGPLAN Workshop on Haskell, Portland, Oregon, USA, September 17, 2006, pages 37--47. ACM Press, 2006. [1]
What is the canonical solution to the expression problem in Haskell?
Not canonical but Loeh and Hinze have proposed open data types: Andres Loeh and Ralf Hinze. Open data types and open functions. In Annalisa Bossi and Michael J. Maher, editors, Proceedings of the 8th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming, July 10--12, 2006, Venice, Italy, pages 133--144. ACM Press, 2006. [2] Cheers, Stefan [1] http://portal.acm.org/citation.cfm? id=1159842.1159848&coll=ACM&dl=ACM&type=series&idx=1159842&part=Proceedi ngs&WantType=Proceedings&title=Haskell&CFID=24140934&CFTOKEN=52915302 [2] http://portal.acm.org/citation.cfm? id=1140335.1140352&coll=ACM&dl=ACM&type=series&idx=1140335&part=Proceedi ngs&WantType=Proceedings&title=International%20Conference%20on% 20Principles%20and%20Practice%20of%20Declarative% 20Programming&CFID=24141725&CFTOKEN=61657761

Stefan Holdermans wrote:
Al,
Has there been any work on extending Haskell's type system with structural subtyping?
Koji Kagawaga. Polymorphic variants in Haskell. In Andres Loeh, editor, Proceedings of the 2006 ACM SIGPLAN Workshop on Haskell, Portland, Oregon, USA, September 17, 2006, pages 37--47. ACM Press, 2006. [1]
What is the canonical solution to the expression problem in Haskell?
Not canonical but Loeh and Hinze have proposed open data types:
Andres Loeh and Ralf Hinze. Open data types and open functions. In Annalisa Bossi and Michael J. Maher, editors, Proceedings of the 8th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming, July 10--12, 2006, Venice, Italy, pages 133--144. ACM Press, 2006. [2]
Thanks for the pointers. I will definitely be reading these papers.

stefan:
Al,
Has there been any work on extending Haskell's type system with structural subtyping?
Koji Kagawaga. Polymorphic variants in Haskell. In Andres Loeh, editor, Proceedings of the 2006 ACM SIGPLAN Workshop on Haskell, Portland, Oregon, USA, September 17, 2006, pages 37--47. ACM Press, 2006. [1]
What is the canonical solution to the expression problem in Haskell?
Not canonical but Loeh and Hinze have proposed open data types:
For a short term solution, we used Typeable + type classes to provide a open Message data type. Similar techniques are used in Simon Marlow's extensible exceptions paper. -- An open Message type class Typeable a => Message a -- -- A wrapped value of some type in the Message class. -- data SomeMessage = forall a. Message a => SomeMessage a -- -- And now, unwrap a given, unknown Message type, performing a (dynamic) -- type check on the result. -- fromMessage :: Message m => SomeMessage -> Maybe m fromMessage (SomeMessage m) = cast m -- Don

I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
Proper subtyping or at least extendable ADTs would be nicer, but you can have type-checked progress flags using phantom types, e.g.: data LT flag = L String (LT flag) | A (LT flag) (LT flag) | Var String data Input data Renamed data CPSed data ConstPropd rename :: LT Input -> LT Renamed cps :: LT Renamed -> LT CPSed constantPropagate :: LT CPSed -> LT ConstPropd dumpExpr :: (forall a. LT a) -> String -- ignores progress flag This way you have at least a way to check that the proper phases have been run before. It might even be possible to store different things in the nodes (not tested), like in: newtype Ident = MkIdent String class VarType flag vt | flag -> vt instance VarType Input String instance VarType Renamed Ident instance VarType CPSed Ident instance VarType ConstPropd Ident data LT flag = (VarType flag vt => L vt (LT flag)) | ... (This probably doesn't work, but you get the idea.) / Thomas

Thomas Schilling wrote:
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
Proper subtyping or at least extendable ADTs would be nicer, but you can have type-checked progress flags using phantom types, e.g.: <snip>
I thought that phantom types might be a solution, but how to you statically ensure that the AST has only the constructors that are appropriate for each phase? GADTS maybe. Constructors allowed in all phases, or in just one can be encoded easily enough. But then how do you encode a constructor that can be in two out of three phases? Maybe two phantom types? It all starts getting a little hairy. I will have to go through the papers that the others have provided to see if someone has already explored this direction. I have an inkling that there is a good idea here, it just might take some time to tease it out. Thanks for the replies.

On Thursday 31 May 2007 15:36:13 Al Falloon wrote:
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
Although this is the theoretical justification for OCaml's polymorphic variants, it is not their most common use. They are more commonly used to implement overlapping enumerations (see LablGL), to avoid sum type declarations with short scope (e.g. [`Some|`None| `Maybe] inside a single function) and when sum types keep changing during development. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e

Jon Harrop wrote:
On Thursday 31 May 2007 15:36:13 Al Falloon wrote:
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
Although this is the theoretical justification for OCaml's polymorphic variants, it is not their most common use.
They are more commonly used to implement overlapping enumerations (see LablGL), to avoid sum type declarations with short scope (e.g. [`Some|`None| `Maybe] inside a single function) and when sum types keep changing during development.
I kind of saw the overlapping enumeration case as just a common special case of the AST problem. I can't think of a lightweight way to encode overlapping enumerations in Haskell. If someone can come up with one, it would probably shed some light on the right direction for the AST problem. The other uses of PV, I hadn't been aware of, but it makes a lot of sense.

On Thursday 31 May 2007 17:46:38 Al Falloon wrote:
I kind of saw the overlapping enumeration case as just a common special case of the AST problem.
Theoretically, yes. In practice, this is quite an important distinction because enumerations do not suffer from the obfuscated error messages that arise when you have polymorphic variants derived from "too much" inference.
I can't think of a lightweight way to encode overlapping enumerations in Haskell.
I'd like to know if this is possible in Haskell.
If someone can come up with one, it would probably shed some light on the right direction for the AST problem.
I should emphasise that not everyone agrees with me here. I've noticed that Jacques Garrigue uses polymorphic variants in that way (their intended way). I have tried several times but I always give up and revert to ordinary variants for their simpler error messages. Sometimes I replace the ordinary variants with polymorphic variants when the program is complete, but only to make me look more intelligent. Just to clarify what you mean, the following "subst" function substitutes variables `Var _ in an expression and OCaml infers that the output expression needs no `Var constructor (the type 'b): # let rec subst vars = function | `Int _ as e -> e | `Var var -> List.assoc var vars | `Add(f, g) -> `Add(subst vars f, subst vars g);; val subst : ('a * ([> `Add of 'b * 'b | `Int of 'c ] as 'b)) list -> ([< `Add of 'd * 'd | `Int of 'c | `Var of 'a ] as 'd) -> 'b = <fun> You can see how inferred sum types can get unwieldy quite quickly.
The other uses of PV, I hadn't been aware of, but it makes a lot of sense.
Yes. I've found them quite useful in application areas that are better suited to more dynamic typing, like GUI and web programming. Polymorphic variants do a lot to close the gap there. Incidentally, I have heard people express concerns that polymorphic variants will let more bugs through the static type checker because they are weaker. I have never had a run-time problem caused by an unintended inference. My problems with polymorphic variants always boil down to the error messages. For example, forgetting the last "var" in that function allows it to pass type checking with a completely different inferred type: # let rec subst vars = function | `Int _ as e -> e | `Var var -> List.assoc var vars | `Add(f, g) -> `Add(subst vars f, subst g);; val subst : (('b * ([> `Add of 'c * (([< `Add of 'd * 'a | `Int of 'e | `Var of 'b ] as 'd) -> 'c) | `Int of 'e ] as 'c)) list as 'a) -> 'd -> 'c = <fun> -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e

On Thu, May 31, 2007 at 06:16:20PM +0100, Jon Harrop wrote:
I can't think of a lightweight way to encode overlapping enumerations in Haskell.
I'd like to know if this is possible in Haskell.
Maybe this way using GADTs and typeclasses? I haven't used such code in practice - there may be some hidden traps. {-# OPTIONS -fglasgow-exts #-} module Enum where data Height data Size class HasMEDIUM t instance HasMEDIUM Height instance HasMEDIUM Size data ENUM t where LOW :: ENUM Height MEDIUM :: HasMEDIUM t => ENUM t HIGH :: ENUM Height SMALL :: ENUM Size BIG :: ENUM Size Example use: *Enum> :t [MEDIUM, LOW] [MEDIUM, LOW] :: [ENUM Height] *Enum> :t [MEDIUM, SMALL] [MEDIUM, SMALL] :: [ENUM Size] *Enum> :t [MEDIUM, SMALL, LOW] <interactive>:1:16: Couldn't match expected type `Size' against inferred type `Height' Expected type: ENUM Size Inferred type: ENUM Height In the expression: LOW Best regards Tomek

On Thu, May 31, 2007 at 09:04:30PM +0200, Tomasz Zielonka wrote:
On Thu, May 31, 2007 at 06:16:20PM +0100, Jon Harrop wrote:
I can't think of a lightweight way to encode overlapping enumerations in Haskell.
I'd like to know if this is possible in Haskell.
Maybe this way using GADTs and typeclasses? I haven't used such code in practice - there may be some hidden traps.
Let me be the first one to admit that this approach is highly unmodular - you have to define all cases in one place, or at least each group of overlapping enum types. Best regards Tomek

Al Falloon wrote:
OCaml has been getting a lot of mileage from its polymorphic variants (which allow structural subtyping on sum types) especially on problems relating to AST transformations and the infamous "expression problem".
Has there been any work on extending Haskell's type system with structural subtyping?
There's OO'Haskell but I don't know much about it. The problem with subtyping is that it renders type inference undecidable and is more limited than parametric polymorphism. It's more like a "syntactic sugar", you can always explicitly pass around embeddings (a' -> a) and projections (a -> Maybe a').
What is the canonical solution to the expression problem in Haskell?
What techniques do Haskellers use to simulate subtyping where it is appropriate?
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
For this use case, there are some techniques available, mostly focussing on traversing the AST and not so much on the different data constructors. Functors, Monads and Applicative Functors are a structured way to do that. S. Liang, P. Hudak, M.P. Jones. Monad Transformers and Modular Interpreters. http://web.cecs.pdx.edu/~mpj/pubs/modinterp.html C. McBride, R. Paterson. Applicative Programming with Effects. http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf B. Bringert, A. Ranta. A Pattern for Almost Compositional Functions. http://www.cs.chalmers.se/~bringert/publ/composOp/composOp.pdf The fundamental way is to compose your data types just like you compose your functions. Here's an example data Expr a = A (ArithExpr a) | C (Conditional a) data ArithExpr a = Add a a | Mul a a data Conditional a = IfThenElse a a a | CaseOf a [(Int,a)] newtype Expression = E (Expr Expression) Now, functions defined on (Conditional a) can be reused on Expressions, although with some tedious embedding an projecting. I think that the third paper mentioned above makes clever use of GADTs to solve this. The topic of Generic programming is related to that and Applicative Functors make a reappearance here (although not always explicitly mentioned). http://haskell.org/haskellwiki/Research_papers/ /Generics#Scrap_your_boilerplate.21 Regards, apfelmus

apfelmus wrote:
Al Falloon wrote:
OCaml has been getting a lot of mileage from its polymorphic variants (which allow structural subtyping on sum types) especially on problems relating to AST transformations and the infamous "expression problem".
Has there been any work on extending Haskell's type system with structural subtyping?
There's OO'Haskell but I don't know much about it. The problem with subtyping is that it renders type inference undecidable and is more limited than parametric polymorphism. It's more like a "syntactic sugar", you can always explicitly pass around embeddings (a' -> a) and projections (a -> Maybe a').
Quite. So another interesting question to ask is 'has anyone proposed any good syntactic sugar?' Jules

apfelmus wrote:
Al Falloon wrote:
OCaml has been getting a lot of mileage from its polymorphic variants (which allow structural subtyping on sum types) especially on problems relating to AST transformations and the infamous "expression problem".
Has there been any work on extending Haskell's type system with structural subtyping?
There's OO'Haskell but I don't know much about it. The problem with subtyping is that it renders type inference undecidable and is more limited than parametric polymorphism. It's more like a "syntactic sugar", you can always explicitly pass around embeddings (a' -> a) and projections (a -> Maybe a').
I can see how nominal subtyping would make type inference a pain, but I'm pretty sure its solvable for structural subtyping because the inference algorithm can simply grow the type as it unifies the different terms. I think this paper describes how they pulled it off for OCaml: http://citeseer.ist.psu.edu/garrigue02simple.html Its true that subtyping can be considered syntactic sugar, but its the same sort of syntactic sugar as typeclasses, which are pretty popular. In fact, the mapping you suggest is so close to the mapping of typeclasses that it suggests some sort of convenient subtyping library using typeclasses might be possible. Has anyone looked into this? Is that what Data.Typeable provides?
What is the canonical solution to the expression problem in Haskell?
What techniques do Haskellers use to simulate subtyping where it is appropriate?
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
For this use case, there are some techniques available, mostly focussing on traversing the AST and not so much on the different data constructors. Functors, Monads and Applicative Functors are a structured way to do that.
S. Liang, P. Hudak, M.P. Jones. Monad Transformers and Modular Interpreters. http://web.cecs.pdx.edu/~mpj/pubs/modinterp.html
C. McBride, R. Paterson. Applicative Programming with Effects. http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf
B. Bringert, A. Ranta. A Pattern for Almost Compositional Functions. http://www.cs.chalmers.se/~bringert/publ/composOp/composOp.pdf
Thank you, I will definitely look through those.

On Thursday 31 May 2007, Al Falloon wrote:
OCaml has been getting a lot of mileage from its polymorphic variants (which allow structural subtyping on sum types) especially on problems relating to AST transformations and the infamous "expression problem".
Has there been any work on extending Haskell's type system with structural subtyping?
What is the canonical solution to the expression problem in Haskell?
What techniques do Haskellers use to simulate subtyping where it is appropriate?
I bring this up because I have been working on a Scheme compiler in Haskell for fun, and something like polymorphic variants would be quite convinent to allow you to specify versions of the AST (input ast, after closure conversion, after CPS transform, etc.), but allow you to write functions that work generically over all the ASTs (getting the free vars, pretty printing, etc.).
I'm not sure if it qualifies (as, I don't know OCaml), but you might want to look into some of the papers proposing extensible record systems for Haskell. Some of them note that the same type system extensions for records can be used for variants. Specifically, Daan Leijen's paper "Extensible records with scoped labels" goes into how they might work, although there may be other good papers on the subject. Would such a proposal be comparable to what OCaml has? Unfortunately, I don't think implementing such a proposal is high on the to-do list for any of the compilers currently. There's too little time in the day to build all the type system fanciness everyone could want, I guess. :) -- Dan
participants (9)
-
Al Falloon
-
apfelmus
-
Dan Doel
-
dons@cse.unsw.edu.au
-
Jon Harrop
-
Jules Bean
-
Stefan Holdermans
-
Thomas Schilling
-
Tomasz Zielonka