irrefutable patterns for existential types / GADTs

Ross Paterson wrote:
The story so far: apfelmus: why are there no irrefutable patterns for GADTs? Conor: because you could use them to write unsafeCoerce Ross: how about irrefutable patterns (or newtypes) for existential types? Simon: Try giving the translation into System F + (existential) data types
I'd like to add that I see no problem with coerce :: Eq a b -> a -> b coerce ~Refl x = x as long as we have coerce _|_ x === _|_ The wish is that f = \refl -> Just . coerce refl = \~Refl x -> Just x should satisfy f _|_ x === Just _|_ f _|_ x =/= _|_ and likewise for any other constructor than Just. Regards, apfelmus

apfelmus@quantentunnel.de wrote:
Ross Paterson wrote:
The story so far: apfelmus: why are there no irrefutable patterns for GADTs? Conor: because you could use them to write unsafeCoerce Ross: how about irrefutable patterns (or newtypes) for existential types? Simon: Try giving the translation into System F + (existential) data types
I'd like to add that I see no problem with
coerce :: Eq a b -> a -> b coerce ~Refl x = x
as long as we have
coerce _|_ x === _|_
But that makes it refutable! For the above, either coerce _|_ x === x or the notation is being abused. The trouble is that GADT pattern matching has an impact on types, as well as being a selector-destructor mechanism, and for the impact on types to be safe, the match must be strict. For existentials, I'm not sure, but it seems to me that there's not such a serious issue. Isn't the only way you can use the type which allegedly exists to project out some dictionary/other data which is packed inside the existential? Won't this projection will cause a nice safe _|_ instead of a nasty unsafe segfault? I think it's the extra power of GADTs to tell you more about type variables already in play which does the damage. All the best Conor

But that makes it refutable! For the above, either
coerce _|_ x === x
or the notation is being abused.
Making a pattern irrefutable does not mean that the function in question will become lazy: fromJust (~Just x) = x fromJust _|_ === _|_ The point with coerce is that it looks very much like being lazy in its first argument but in fact it is not.
The trouble is that GADT pattern matching has an impact on types, as well as being a selector-destructor mechanism, and for the impact on types to be safe, the match must be strict.
I think it's the extra power of GADTs to tell you more about type variables already in play which does the damage.
But I think that something still can be squeezed out, strictness is not absolutely necessary. I thought something along the lines of f :: Eq a b -> a -> Maybe b f ~Refl x = Just x with f _|_ x === Just _|_ The point is one can always output the constructor Just, it does not inspect the type of x. Now, I don't think anymore that this is feasible as the type of (Just x) still depends on the type of x (even if the constructor Just does not mention it). Nevertheless, I still want to remove strictness, see my next mail in this thread.
For existentials, I'm not sure, but it seems to me that there's not such a serious issue. Isn't the only way you can use the type which allegedly exists to project out some dictionary/other data which is packed inside the existential? Won't this projection will cause a nice safe _|_ instead of a nasty unsafe segfault?
I agree. The only practical problem I can imagine is that GHC internally treats existentials as GADTs. Regards, apfelmus

Here is a formulation of what exactly I require from irrefutable pattern matches for GADTs. The problem arouse from the "Optimization problem" thread. In short, here is a GADT-using, type safe version of Bertram's solution (without balancing)
-- a binary search tree with witness about its shape data Map s k a where Leaf :: Map () k a Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a
empty :: Map () k a empty = Leaf
member :: Ord k => k -> Map s k a -> Bool member _ Leaf = False member k (Node k' _ l r) = case compare k k' of LT -> member k l EQ -> True GT -> member k r
-- a wrapper for an existential type data Undoer s k a where Undoer :: (Map t k a) -> (Map t k a -> (a,Map s k a)) -> Undoer s k a
-- insert key element blueprint map (blueprint, result, map) insert :: Ord k => k -> a -> Map s k a -> Undoer s k a insert k a Leaf = Undoer (Node k a Leaf Leaf) (\(Node k a Leaf Leaf) -> (a,Leaf)) insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a) = case compare k k' of LT -> case insert k a l of Undoer (m :: Map t k a) f -> Undoer (Node k' b m r :: Map (t,r) k a) (\(Node k' b' m' r' :: Map (t,r) k a) -> let (a,l') = f m' in (a,Node k' b' l' r' :: Map s k a)) EQ -> error "inserting existing element" GT -> case insert k a r of Undoer (m :: Map t k a) f -> Undoer (Node k' b l m :: Map (l,t) k a) (\(Node k' b' l' m' :: Map (l,t) k a) -> let (a,r') = f m' in (a,Node k' b' l' r' :: Map s k a))
update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a -- the culprit, to be defined later
splitSeq :: Ord a => [(a,b)] -> [(a,[b])] splitSeq = fst . splitSeq' empty
splitSeq' :: Ord a => Map s a [b] -> [(a,b)] -> ([(a,[b])], Map s a [b]) splitSeq' bp [] = ([], bp) splitSeq' bp ((a,b):xs) = case member a bp of True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) m) _ -> case insert a [] bp of Undoer bp' f -> let (rs,m) = splitSeq' bp' xs (bs,m') = f m in ((a, b:bs) : rs, m')
To make this work in a lazy manner (it becomes an online algorithm then and works for infinite lists), I'd like to have
update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a update k f ~(Node k' a l r) = case compare k k' of LT -> Node k' a (update k f l) r EQ -> Node k' (f a) l r GT -> Node k' a l (update k f r)
reasoning that the Node constructor should be output before one inspects the incoming ~Node. I thought that "(l, m) = splitSeq' bp xs" witnesses that bp and m have the same Shape s, but this is the point where the not-normalizing argument throws in: the type of splitSeq' claims to be a proof that bp and m have the same s but who knows whether it really terminates? So, I'm opting for a different update which is more along the lines of Bertram's original:
update :: Ord k => k -> (a -> a) -> Map s k a -> Map t k a -> Map s k a update k f (Node k' _ l' r') ~(Node _ a l r) = case compare k k' of LT -> Node k' a (update k f l' l) r EQ -> Node k' (f a) l r GT -> Node k' a l (update k f r)
The blueprint gives immediate witness that splitSeq' preserves shape, so this update should be fine. To summarize, the main problem is to get a lazy/online algorithm (the problem here falls into the "more haste, less speed" category) while staying more type safe. @Conor: how does this issue look like in Epigram? Regards, apfelmus

Hi apfelmus@quantentunnel.de wrote:
To summarize, the main problem is to get a lazy/online algorithm (the problem here falls into the "more haste, less speed" category) while staying more type safe. @Conor: how does this issue look like in Epigram?
Thanks for asking! In the current Epigram prototype editor/checker, nothing clever happens. Apart from anything else, typechecking relies on /partial/ evaluation, so we can't presume that the only normal forms in a datatype are made with constructors: they might be expressions which have got stuck. However, Edwin Brady's prototype compiler delivers code for run-time-only usage, and here we begin to get paid for working in a total (once suitably stratified) language. It isn't necessary to perform constructor discrimination when it's statically known that exactly one constructor is possible, so those patterns can always be made irrefutable, with matching replaced by projection. It's not yet obvious whether this is always, never, predictably sometimes, or unpredictably sometimes a good idea. However, we'd certainly be able to support an explicit notation for irrefutable patterns, guaranteed to match whenever the named subobjects were needed. So we don't just give that coerce example an irrefutable pattern, we can erase all run-time trace of equality evidence, and indeed any other data whose value is completely determined by the indices of its type. If you don't need to discriminate on it, you don't need to look at it, so you don't need to keep it. Even though type-vs-value distinction no longer aligns with the static-vs-dynamic distinction, you don't get any less run-time type-erasure. You should actually get more run-time value-erasure! Online algorithms do look like a good place to try to get some leverage from this, but we haven't really been in a position to experiment so far. I'm sure that will change. All the best Conor This message has been checked for viruses but the contents of an attachment may still contain software viruses, which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

Thanks for asking! [...] Online algorithms do look like a good place to try to get some leverage from this, but we haven't really been in a position to experiment so far. I'm sure that will change.
Well, I asked because laziness with memoization gives a definite edge in terms of algorithmic time complexity over strictness (proved for online algorithms in "more haste, less speed") and I wonder how this goes together with totality and dependent types. A forthcoming ultimate programming language should retain this edge, shouldn't it? ;-)
It isn't necessary to perform constructor discrimination when it's statically known that exactly one constructor is possible, so those patterns can always be made irrefutable, with matching replaced by projection.
Thanks for pinpointing the culprit, because translated to GADTs, it means that "whenever we know from the type index that only one constructor is possible, it can be made irrefutable." So this is the kind of irrefutable pattern one could safely add. In particular, this can be weakened to "whenever we know the type index, i.e. there is no type refinement depending on the value, an irrefutable pattern is safe." But once no type refinement happens, the irrefutable pattern can already be obtained via let bindings in existing Haskell! Below is a lazy version of Bertram solution to the "Optimization problem" following Ross' suggestions with existentials and GADTs:
{-# OPTIONS_GHC -fglasgow-exts #-}
data Map s k a where Leaf :: Map () k a Node :: k -> a -> Map s k a -> Map t k a -> Map (s,t) k a
(Map s k a) is a tree which makes its structure explicit in the type index s.
unNode :: Map (s,t) k a -> (k,a,Map s k a, Map t k a) unNode (Node k a l r) = (k,a,l,r)
unNode knows the type index and is to be used for a lazy pattern match let (k,a,l,r) = unNode ..
empty :: Map () k a empty = Leaf
member :: Ord k => k -> Map s k a -> Bool member _ Leaf = False member k (Node k' _ l r) = case compare k k' of LT -> member k l EQ -> True GT -> member k r
data Undoer s k a where Undoer :: (Map t k a) -> (Map t k a -> (a,Map s k a)) -> Undoer s k a
Undoer is just (exists t. ...) wrapped into a digestible type.
-- insert key element blueprint map (blueprint, result, map) insert :: Ord k => k -> a -> Map s k a -> Undoer s k a insert k a Leaf = Undoer (Node k a Leaf Leaf) (\map -> let (_,a,_,_) = unNode map in (a,Leaf))
Note how the type system proves that the function (\map -> ..) always has to expect (map) == Node ... Then, unNode binds k',b',... /lazily/. Same goes for the rest of insert:
insert k a (Node k' b (l :: Map l k a) (r :: Map r k a) :: Map s k a) = case compare k k' of LT -> case insert k a l of Undoer (m :: Map t k a) f -> Undoer (Node k' b m r :: Map (t,r) k a) (\map -> let (k',b',m',r') = unNode map (a,l') = f m' in (a,Node k' b' l' r' :: Map s k a)) EQ -> error "inserting existing element" GT -> case insert k a r of Undoer (m :: Map t k a) f -> Undoer (Node k' b l m :: Map (l,t) k a) (\map -> let (k',b',l',m') = unNode map (a,r') = f m' in (a,Node k' b' l' r' :: Map s k a))
-- update k fun blueprint map update :: Ord k => k -> (a -> a) -> Map s k a -> Map s k a -> Map s k a update k f (Node k' _ l' r') map = case compare k k' of LT -> Node k' a (update k f l' l) r EQ -> Node k' (f a) l r GT -> Node k' a l (update k f r' r) where (_,a,l,r) = unNode map
Again a lazy pattern match on (map). Note that the type system enforces that blueprint and actual (map) have the same shape s. The type refinement for GADTs happens in the strict pattern match on the blueprint and this allows us to lazily match the (map).
splitSeq :: Ord a => [(a,b)] -> [(a,[b])] splitSeq = fst . splitSeq' empty
splitSeq' :: Ord a => Map s a [b] -> [(a,b)] -> ([(a,[b])], Map s a [b]) splitSeq' bp [] = ([], bp) splitSeq' bp ((a,b):xs) = case member a bp of True -> let (l, m) = splitSeq' bp xs in (l, update a (b:) bp m) _ -> case insert a [] bp of Undoer bp' f -> let (rs,m) = splitSeq' bp' xs (bs,m') = f m in ((a, b:bs) : rs, m')
To summarize, the problem can be solved without irrefutable patterns for GADTs: the code above works for infinite lists. Yet, they are handy and can be introduced safely in the case where the type indices are known in advance and no type refinement happens. Regards, apfelmus
participants (3)
-
apfelmus@quantentunnel.de
-
Conor McBride
-
Conor McBride