Proposal: Sum type branches as extended types (as Type!Constructor)

Extending sum types with data constructors would spare runtime errors or exception control, when applying functions to inappropriate branches, as in the example ... data List a = Nil | Cons a (List a) -- List!Nil and List!Cons -- as extended types * Actual system, with runtime errors (as in GHC Data.List head) or exception throwing hd :: List a -> a hd (Cons x _) -> x hd Nil -> error "error: hd: empty list" -- error or exception throwing * Proposed system extending types with constructors as Type!Constructor: User must do pattern matching before applying the constructor-specific type function. In ''var @ (Constructor _ _)'' the compiler should append the constructor to the type as a pair (Type, Constructor) as an extended type for ''var'' No need for runtime errors or exception control hd :: List!Cons a -> a hd (Cons x _) = x using it: headOf :: List a -> Maybe a headOf list = case list of li @ (Cons _ _) -> Just hd li -- extTypeOf li == ( 'List', 'Cons') -- should pass typechecker for List!Cons li @ Nil -> Just hd li -- compiler error !! -- extTypeOf ('List','Nil') don't match _ -> Just hd list -- compiler error !! -- extTypeOf ('List',Nothing) don't match Maybe we could take out importance on the number of _ wildcards (constructor arity) with a syntax like. li @ (Cons ...) li @ (Nil ...) Cheers! Gabriel Riba Faura.

On 3 Jun 2010, at 16:14, Gabriel Riba wrote:
Extending sum types with data constructors would spare runtime errors or exception control,
when applying functions to inappropriate branches, as in the example ...
data List a = Nil | Cons a (List a) -- List!Nil and List!Cons -- as extended types
* Actual system, with runtime errors (as in GHC Data.List head) or exception throwing
hd :: List a -> a hd (Cons x _) -> x hd Nil -> error "error: hd: empty list" -- error or exception throwing
* Proposed system extending types with constructors as Type!Constructor:
User must do pattern matching before applying the constructor-specific type function.
In ''var @ (Constructor _ _)'' the compiler should append the constructor to the type as a pair (Type, Constructor) as an extended type for ''var''
No need for runtime errors or exception control
hd :: List!Cons a -> a
hd (Cons x _) = x
using it:
headOf :: List a -> Maybe a
headOf list = case list of
li @ (Cons _ _) -> Just hd li -- extTypeOf li == ( 'List', 'Cons') -- should pass typechecker for List!Cons
li @ Nil -> Just hd li -- compiler error !! -- extTypeOf ('List','Nil') don't match
_ -> Just hd list -- compiler error !! -- extTypeOf ('List',Nothing) don't match
Maybe we could take out importance on the number of _ wildcards (constructor arity) with a syntax like. li @ (Cons ...) li @ (Nil ...)
This looks fairly similar to total functional programming, though putting the onus on the caller to make sure it meets preconditions, rather than the callee to make sure it's the right type. In total functional programming we would say head :: List a -> Maybe a head (Cons x _) = Just x head Nil = Nothing We'd then allow the caller to deal with the maybe any way it likes (commonly with fmap, or with the maybe function). Bob

On 4 June 2010 03:18, Ozgur Akgun
On 3 June 2010 16:14, Gabriel Riba
wrote: Maybe we could take out importance on the number of _ wildcards (constructor arity) with a syntax like. li @ (Cons ...) li @ (Nil ...)
can't you already use {} to get rid of the underscores?
li@(Cons {}) li@(Nil {})
Even better: you shouldn't need the parentheses: li@Cons{} li@Nil{} -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On 06/03/2010 10:14 AM, Gabriel Riba wrote:
No need for runtime errors or exception control
hd :: List!Cons a -> a
hd (Cons x _) = x
This is already doable using GADTs: data Z data S n data List a n where Nil :: List a Z Cons :: a -> List a n -> List a (S n) hd :: List a (S n) -> a hd (Cons x _) = x tl :: List a (S n) -> List a n tl (Cons _ xs) = xs - Jake

Jake McArthur wrote:
On 06/03/2010 10:14 AM, Gabriel Riba wrote:
No need for runtime errors or exception control
hd :: List!Cons a -> a
hd (Cons x _) = x
This is already doable using GADTs:
data Z data S n
data List a n where Nil :: List a Z Cons :: a -> List a n -> List a (S n)
hd :: List a (S n) -> a hd (Cons x _) = x
tl :: List a (S n) -> List a n tl (Cons _ xs) = xs
Sure, it is the whipping boy of dependent types afterall. However, * Haskell's standard lists (and Maybe, Either,...) aren't GADTs, * last I heard GADTs are still a ways off from being accepted into haskell', * and, given that this is the whipping boy of dependent types, you may be surprised to learn that pushing the proofs of correctness through for the standard library of list functions is much harder than it may at first appear. Which is why, apparently, there is no standard library for length-indexed lists in Coq.[1] But more to the point, this proposal is different. Gabriel is advocating for a form of refinement types (aka weak-sigma types), not for type families. This is something I've advocated for in the past (under the name of "difference types")[2] and am still an avid supporter of-- i.e., I'm willing to contribute to the research and implementation for it, given a collaborator familiar with the GHC code base and given sufficient community interest. The reason why length-indexed lists (and other type families) are surprisingly difficult to work with is because of the need to manipulate the indices in every library function. With refinement types predicated on the head constructor of a value, however, there is no need to maintain this information throughout all functions. You can always get the proof you need exactly when you need it by performing case analysis. The benefit of adding this to the type system is that (a) callees can guarantee that the necessary checks have already been done, thereby improving both correctness and efficiency, and (b) it opens the door for the possibility of moving the witnessing case analysis further away from the use site of the proof. While #b in full generality will ultimately lead to things like type families, there are still a number of important differences. Perhaps foremost is that you needn't define the proof index at the same point where you define the datatype. This is particularly important for adding post-hoc annotations to standard types like lists, Maybe, Either, etc. And a corollary to this is that you needn't settle on a single index for an entire programming community, nor do you impose the cost of multiple indices on users who don't care about them. In short, there's no reason why the equality proofs generated by case analysis should be limited to type equivalences of GADT indices. Value equivalences for ADTs are important too. [1] Though I'm working on one: http://community.haskell.org/~wren/coq/vecs/docs/toc.html [2] http://winterkoninkje.livejournal.com/56979.html -- Live well, ~wren

Edward Kmett suggests the use of @ instead of ! Edward Kmett wrote: This is just a form of refinement type. Googling the term should turn up a couple dozen papers on the topic. You can emulate them (and more) with phantom types in the existing type system, though admittedly the encoding is occasionally uncomfortable. As a result there hasn't been much interest in them from on high. And a minor technical issue with the use of ! there is that it would be awkward from a grammar perspective because it would require that ! bind more tightly than juxtaposition, but only when parsing a type. Another syntactic operator like @ that already binds very tightly might be more appropriate. -Edward Kmett New proposal draft: Proposal: Type supplement for constructor specific uses of sum types Purpose: Avoid error clauses (runtime errors), exception control or Maybe types in partially defined (constructor specific) functions on sum types. As an example, with data List a = Nil | Cons a (List a) * Actual system, with runtime errors (as in GHC Data.List head) or exception throwing or optional Maybe results. hd :: List a -> a hd (Cons x _) -> x hd Nil -> error "error: hd: empty list" -- error or exception throwing * Proposed system extending types with a suffix @ Constructor or @ {Constructor1, Constructor2, ..} hd :: List @ Cons a -> a hd (Cons x _) = x The caller must do pattern matching before applying the constructor-specific function. In ''var @ (Constructor {})'' the compiler should set a type supplement property for ''var'' to the specified pattern constructor and check the supplement when used in a constructor specific parameter position. using it: accumHead :: (a->b) -> a -> List a -> b accumHead f accum list = case list of li @ Cons {} -> f accum $ hd li -- typeSupplementOf li == 'Cons' -- should pass typechecker at ''hd'' for List!Cons li @ Nil -> f accum $ hd li -- compiler error !! -- typeSupplementOf li == 'Nil' ; no match _ -> f accum $ hd list -- compiler error !! -- typeSupplementOf list == Nothing ; no match _ -> accum -- default closing pattern matching exhaustivity. (from Jason Dagit contribution) For more than one constructor say data List2 a = Nil | Cons a (List2 a) | Cons2 a a (List2 a) hd :: List2 @ {Cons, Cons2} a -> a

Gabriel Riba wrote:
New proposal draft:
Proposal: Type supplement for constructor specific uses of sum types
Purpose: Avoid error clauses (runtime errors), exception control or Maybe types in partially defined (constructor specific) functions on sum types.
As an example, with
data List a = Nil | Cons a (List a)
* Actual system, with runtime errors (as in GHC Data.List head) or exception throwing or optional Maybe results.
hd :: List a -> a hd (Cons x _) -> x hd Nil -> error "error: hd: empty list" -- error or exception throwing
* Proposed system extending types with a suffix @ Constructor or @ {Constructor1, Constructor2, ..}
hd :: List @ Cons a -> a hd (Cons x _) = x
The caller must do pattern matching before applying the constructor-specific function.
Since the goal is to propagate case analysis information, the syntax should reflect that. That is, there should be support for nested patterns, e.g., cdar :: [a]@{_:_:_} -> a cdar (_:x:_) = x cadr :: [[a]]@{(_:_):_} -> a cadr ((_:xs):_) = xs headFromJust :: [Maybe a]@{Just _ : _} -> a ... The type@cons syntax is a nice shorthand, but we need to express the arguments to the data constructors in the extended syntax in order to support nested patterns. For delimiting multiple alternatives, it's not clear that comma is the best delimiter to use, especially since it could be the data constructor for tuples. Perhaps using ; or | would be better. Unless there's a syntactic reason for preferring braces over parentheses, perhaps we should just use parentheses for symmetry with as-patterns. Finally, there should also be support for negative patterns, i.e., propagation of the failure to match a pattern. One place this is useful is for distinguishing 0 from other numbers, which allows removing the error branches from functions like division. Sometimes we can't enumerate all the positive patterns we want to allow, but it's easy to express what should be disallowed. To match case analysis we should allow for a conjunction of negative patterns followed by a positive pattern. Or, if we want to incorporate multiple positive patterns, then a conjunction of negative patterns followed by a disjunction of positive patterns. (Disjunctive case matching has been an independent proposal in the past, and there's nothing prohibiting supporting it.) Thus, if we use | to delimit disjunctions, & to delimit conjunctions, and \\ to separate the disjuncts from the conjuncts, given the following case analysis: case x : T of p1 y1... -> e1 p2 y2... -> e2 _ -> eF The variable x has type T outside of the case expression. Within the branch e1 it is given the refinement type T@{p1 _...} where variables bound by the pattern are replaced with wildcards. In branch e2 it is given the refinement type T@{p2 _... \\ p1 _...}. This can be simplified to T@{p2 _...} if the head constructors of p2 and p1 are distinct. And in the eF branch x would be given the refinement type T@{_ \\ p1 _... & p2_...}. If this semantics is too hard to implement, we could instead require the use of as-patterns for introducing the refinements. The variable introduced by @ in the as-pattern would be given the refinement type, but the scrutinee would continue to have the unrefined type. This latter semantics is common in dependently typed languages, but it's verbose and ugly so it'd be nice to avoid it if we can. Other notes: Case matching on non-variable expressions would gain no extra support, since we have no variable to associate the refinement information with (unless as-patterns are used). A refinement type T@{p1} can always be weakened to T@{p1 | p2}. Similarly, a refinement type can always be weakened by erasing it. For type inference, I'd suggest that functions which do not have rigid signatures are treated the way they currently are; that is, all refinement information is weakened away unless it is explicitly requested or returned by a function's type signature. This could be improved upon later, but seems like the most reasonable place to start. One complication of trying to infer refinement types is that if we are too liberal then we won't catch bugs arising from non-exhaustive pattern matching. Syntax-wise, there's no particular reason for distinguishing difference from conjunctions under difference. That is, the type T@{... \\ p1 & p2} could just as well be written T@{... \\ p1 \\ p2}. And there's no need for conjunctions under disjunction because we can unify the patterns to get their intersection. Thus, it might be best to just have disjunction and difference for simplicity. -- Live well, ~wren

On Thu, Jun 3, 2010 at 8:14 AM, Gabriel Riba
Extending sum types with data constructors would spare runtime errors or exception control,
when applying functions to inappropriate branches, as in the example ...
data List a = Nil | Cons a (List a) -- List!Nil and List!Cons -- as extended types
* Actual system, with runtime errors (as in GHC Data.List head) or exception throwing
hd :: List a -> a hd (Cons x _) -> x hd Nil -> error "error: hd: empty list" -- error or exception throwing
* Proposed system extending types with constructors as Type!Constructor:
User must do pattern matching before applying the constructor-specific type function.
In ''var @ (Constructor _ _)'' the compiler should append the constructor to the type as a pair (Type, Constructor) as an extended type for ''var''
No need for runtime errors or exception control
hd :: List!Cons a -> a
hd (Cons x _) = x
How will this proposal scale with data types that have multiple alternatives that make sense? No natural examples come to mind so how about a contrived example: data List2 a = Nil | Cons a (List2 a) | Cons2 a a (List2 a) Now I want to define hd for both Cons and Cons2, but not Nil. Do I use an either type like this? hd :: Either (List2!Cons a) (List2!Cons2 a) -> a It seems like some other syntax would be desirable here, maybe: hd :: List2!{Cons, Cons2} a -> a How should it work for functions where no type signature is supplied? Should it infer the type we would now and only enable the subset of constructors when the type is explicit as above? Jason

Jason Dagit
How will this proposal scale with data types that have multiple alternatives that make sense? No natural examples come to mind so how about a contrived example:
data List2 a = Nil | Cons a (List2 a) | Cons2 a a (List2 a)
Now I want to define hd for both Cons and Cons2, but not Nil. Do I use an either type like this? hd :: Either (List2!Cons a) (List2!Cons2 a) -> a
It seems like some other syntax would be desirable here, maybe: hd :: List2!{Cons, Cons2} a -> a
How should it work for functions where no type signature is supplied? Shouldit infer the type we would now and only enable the subset of constructors when the type is explicit as above?
Jason
I agree with your syntax proposal. hd :: List2!{Cons, Cons2} a -> a in functions where no type signature, type inference could work as usual. After normal types passes typechecker with type inference, then Constructor matching could be checked.
participants (7)
-
Gabriel Riba
-
Ivan Miljenovic
-
Jake McArthur
-
Jason Dagit
-
Ozgur Akgun
-
Thomas Davie
-
wren ng thornton