irrefutable patterns for existential types / GADTs

It seems that irrefutable pattern matching with existentials is safe. The fact that irrefutable pattern matching with GADTs is unsafe was demonstrated back in September 2004. Let us consider the following regular existential data type
data TFoo where
    Foo :: Show a => a -> TFoo
    Bar :: Int -> TFoo
Despite the 'where' syntax, it is NOT a GADT; it is just a regular data type. We can write
test_foo vf = case vf of ~(Foo x) -> body
Now, if 'x' does not occur in `body', we could have just as well written

test_foo vf = body

If `x' does occur in `body' and the scrutinee is not of the form `Foo', then 'x' is undefined, and so 'body' bottoms out when it demands the value of 'x'. No surprise, and no concern here. Let us now consider a GADT
data GTFoo a where
    GFoo :: GTFoo Int
    GBar :: GTFoo Bool
test_gfoo :: GTFoo a -> a
test_gfoo vf = case vf of GFoo -> (1::Int)
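Note that GHC refuses the lazy variant of this match outright; a sketch of what is rejected (the name test_gfoo_lazy is made up, and the error text is approximate):

test_gfoo_lazy :: GTFoo a -> a
test_gfoo_lazy vf = case vf of ~GFoo -> (1::Int)

-- rejected with an error along the lines of
--   "An existential or GADT data constructor cannot be used inside a lazy (~) pattern"

The strict test_gfoo above is the form that is accepted.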
It can be faithfully emulated as follows
-- the data constructors Em_GFoo and Em_GBar must be hidden!
data EmulateGTFoo a = Em_GFoo | Em_GBar
em_gfoo :: EmulateGTFoo Int
em_gfoo = Em_GFoo

em_gbar :: EmulateGTFoo Bool
em_gbar = Em_GBar
The constructors Em_GFoo and Em_GBar should be hidden. The user should use `smart' constructors em_gfoo and em_gbar, which not only set the correct type (Int vs Bool) but also produce the witness, viz. Em_GFoo or Em_GBar. Now, the test_gfoo function should be written as
test_emulate_gfoo :: EmulateGTFoo a -> a
test_emulate_gfoo vf = case vf of Em_GFoo -> unsafeCoerce (1::Int)
So, we test for the evidence, Em_GFoo, and if it is presented, we proceed with unsafeCoerce, which generalizes Int to the desired type 'a'. We know this generalization is safe because the evidence Em_GFoo told us that 'a' was really Int. The similarity with Dynamics is uncanny. Now, had we used an irrefutable match instead, we would have proceeded with unsafeCoerce without checking for the evidence.

The following code, written back on Sep 13, 2004, shows that the above is not an empty concern. The code did indeed typecheck with the version of GHC (the tidt-branch CVS branch) that existed at that time. Running the code produced the result one would expect when reading an Int as a Bool. I think the code below was the reason GHC has prohibited irrefutable GADT pattern matching ever since.
{-# OPTIONS -fglasgow-exts #-}
module Main where
import Data.IORef
data T a where
    Li :: Int  -> T Int
    Lb :: Bool -> T Bool
    La :: a    -> T a

writeInt :: T a -> IORef a -> IO ()
writeInt v ref = case v of
    ~(Li x) -> writeIORef ref (1::Int)

readBool :: T a -> IORef a -> IO ()
readBool v ref = case v of
    ~(Lb x) -> readIORef ref >>= (print . not)

tt :: T a -> IO ()
tt v = case v of
    ~(Li x) -> print "OK"

main = do
    tt (La undefined)
    ref <- newIORef undefined
    writeInt (La undefined) ref
    readBool (La undefined) ref
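Recast in the emulation above, the failure mode is easy to see: with a lazy match the evidence Em_GFoo is never inspected, so the unsafeCoerce happens unconditionally. A minimal sketch of that lazy variant (the name test_emulate_gfoo_lazy is made up; Unsafe.Coerce must be imported):

-- the evidence is never demanded, so the coercion is performed blindly
test_emulate_gfoo_lazy :: EmulateGTFoo a -> a
test_emulate_gfoo_lazy vf = case vf of ~Em_GFoo -> unsafeCoerce (1::Int)

-- e.g. (test_emulate_gfoo_lazy em_gbar :: Bool) reads an Int as a Bool,
-- which is just what the IORef program above demonstrates.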

This code is more intricate than

data Eq a b where
    Refl :: Eq a a

coerce :: Eq a b -> a -> b
coerce ~Refl x = x

but I think it amounts to exactly the same thing: ref and x are forced to a particular type witnessed by the GADT. But I think that something can still be squeezed out; strictness is not absolutely necessary (see the upcoming mail in this thread).

Regards, apfelmus

On 9/30/06, apfelmus@quantentunnel.de wrote:
data Eq a b where
    Refl :: Eq a a
coerce :: Eq a b -> a -> b
coerce ~Refl x = x
But this works well with Leibniz-style equality ( http://homepage.mac.com/pasalic/p2/papers/thesis.pdf ), because the Equality proof/term is actually used:

data Equal a b = Equal (forall f . f a -> f b)

newtype Id x = Id { unId :: x }

coerce :: Equal a b -> a -> b
coerce ~(Equal f) x = unId (f (Id x))

Jim
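A small usage sketch of that point (refl and the two examples ok/bad are made up; the definitions from above are repeated so the fragment stands alone): applying f forces the proof before any value of type b can be produced, so a bogus proof yields bottom rather than an ill-typed value.

{-# LANGUAGE RankNTypes #-}

data Equal a b = Equal (forall f . f a -> f b)

newtype Id x = Id { unId :: x }

coerce :: Equal a b -> a -> b
coerce ~(Equal f) x = unId (f (Id x))

-- the only way to build a proof is for equal types
refl :: Equal a a
refl = Equal id

ok :: Int
ok = coerce refl 42                  -- 42: the proof is forced when f is applied

bad :: Int
bad = coerce undefined (42 :: Int)   -- _|_ (the bogus proof is demanded), never an ill-typed value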

On Sat, Sep 30, 2006 at 01:38:28AM -0700, oleg@pobox.com wrote:
It seems that irrefutable pattern matching with existentials is safe. The fact that irrefutable pattern matching with GADTs is unsafe was demonstrated back in September 2004.
it is safe in that you can't express coerce, but irrefutable pattern matching on existentials has other practical issues for an implementation. namely, even though you may never evaluate the irrefutable pattern, its type is probably used in the underlying translation somewhere, if you happen to use any polymorphic functions. for instance
data Foo = exists a . Foo a

f ~(Foo x) = const 4 (id x)
now, you would think that f _|_ would evaluate to 4, but (depending on your translation) it might evaluate to _|_. the reason being that id translates internally to
id = /\a . \x::a . x
since you cannot pull out an appropriate type to pass to id without evaluating the 'irrefutable' pattern, you end up with _|_ instead of 4.

basically, allowing irrefutable matching on existentials would expose whether the underlying implementation performs type-erasure. even in an implementation that does type erasure, like ghc, odd things would suddenly occur, like adding an unused class constraint to a type signature changing the behavior of a program (since it will suddenly have to pull out a dictionary).

All in all, even though type-safety is not broken, irrefutable patterns should not be allowed for existentials.

John

--
John Meacham - ⑆repetae.net⑆john⑈
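To see that the trouble is specific to the existential, compare an ordinary, non-existential wrapper. A small runnable sketch (the names Bar and g are made up): here no type has to be unpacked in order to instantiate id, so the lazy match behaves as one would naively expect and g undefined is 4.

-- ordinary wrapper: no existentially quantified type to unpack
data Bar a = Bar a

g :: Bar a -> Int
g ~(Bar x) = const 4 (id x)

main :: IO ()
main = print (g (undefined :: Bar ()))   -- prints 4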

John Meacham wrote:
the reason being that id translates internally to
id = /\a . \x::a . x
since you cannot pull out an appropriate type to pass to id without evaluating the 'irrefutable' pattern, you end up with _|_ instead of 4.
basically, allowing irrefutable matching on existentials would expose whether the underlying implementation performs type-erasure. even in an implementation that does type erasure, like ghc, odd things would suddenly occur, like adding an unused class constraint to a type signature changing the behavior of a program (since it will suddenly have to pull out a dictionary)
So you mean that

id = /\a . \x :: a . x

is /strict/ in its first argument, i.e. in the /type/ a. So to speak, type erasure is equivalent to strictness in all and every type argument. For an irrefutable existential pattern, the type argument should be lazy, of course. This allows you to "pull out an appropriate type", only that it's pulled out only when needed. The point is that

const c = /\a . \x :: a . c

has no problems of being lazy in its type argument.

Strictness in type arguments is of course a consequence of the fact that there is no _|_ type. A systematic way to add irrefutable pattern matches to existentials (or even GADTs) would therefore be the introduction of lazy type arguments (that is, the introduction of a type _|_). Type erasure then becomes a form of /strictness analysis/. I remember that Wadler's projection-based strictness analysis can discover unused values, and that's what happens very often with types, as they are seldom calculated with at run time. So you can erase them by virtue of your strictness analyzer.

Thinking further, this would even allow one to free type classes from the dictionary approach: an overloaded function like

(+) = /\a . \x :: a . \y :: a . x+y

performs a case analysis on its type argument and selects the right specialized function. In cases where this type is known at compile time, the /inliner/ will select the right specialization. This type-based dispatch is more natural for type classes than dictionaries, because the latter would in principle allow one to supply different dictionaries for one and the same type.

Regards, apfelmus
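One way to make this type-dispatch idea concrete in ordinary Haskell is to reify the type argument as a value. A small sketch (the names Rep and plus, and the choice of instances, are made up): scrutinizing the representation plays the role of the case analysis on the type argument described above.

{-# LANGUAGE GADTs #-}

-- an explicit representation of the type argument
data Rep a where
    RInt    :: Rep Int
    RDouble :: Rep Double

-- "overloaded" addition by case analysis on the (reified) type argument;
-- when the Rep is known at compile time, inlining picks the right branch
plus :: Rep a -> a -> a -> a
plus RInt    x y = x + y
plus RDouble x y = x + y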

On Tue, Oct 03, 2006 at 12:10:52PM +0200, apfelmus@quantentunnel.de wrote:
So you mean that
id = /\a . \x :: a . x
is /strict/ in its first argument, i.e. in the /type/ a. So to speak, type erasure is equivalent to strictness in all and every type argument. For an irrefutable existential pattern, the type argument should be lazy, of course. This allows you to "pull out an appropriate type", only that it's pulled out only when needed. The point is that
const c = /\a . \x :: a . c
has no problems of being lazy in its type argument.
yeah, that is what I mean. however, since we don't have explicit type passing in haskell, reasoning about the laziness of types would be quite tricky, leading to odd things: changing type signatures (without changing the underlying types) could change the behavior of a program. not to mention the optimizations that would be excluded by having to preserve the strictness in types behavior... worrying about it in values is tricky enough. :)

It would actually not be difficult to add lazy types to jhc at all. I just don't see any use for them when dealing with haskell source. but another front end.. (omega or cayenne perhaps?) certainly might make use of them.
Strictness in type arguments is of course a consequence of the fact that there is no _|_ type. A systematic way to add irrefutable pattern matches to existentials (or even GADTs) would therefore be the introduction of lazy type arguments (that is, the introduction of a type _|_). Type erasure then becomes a form of /strictness analysis/. I remember that Wadler's projection-based strictness analysis can discover unused values, and that's what happens very often with types, as they are seldom calculated with at run time. So you can erase them by virtue of your strictness analyzer.
this is actually exactly what jhc does. it is based on the lambda cube and makes no distinction between types and values, so there is no particular type erasure pass, but a lot of the types happen to be erased by run-time due to standard program transformations.
Thinking further, this would even allow one to free type classes from the dictionary approach: an overloaded function like
(+) = /\a . \x :: a . \y :: a . x+y
performs a case analysis on its type argument and selects the right specialized function. In cases where this type is known at compile time, the /inliner/ will select the right specialization. This type-based dispatch is more natural for type classes than dictionaries, because the latter would in principle allow one to supply different dictionaries for one and the same type.
again, this is precisely how jhc implements type classes. There are numerous benefits; in particular, a single scrutinization of a type determines the concrete methods for _every_ class used on that type, so in effect you get run-time specialization for free. The huge disadvantage is that it does not play nicely at all with separate compilation; this is an active area of research for me in jhc.

John

--
John Meacham - ⑆repetae.net⑆john⑈

yeah, that is what I mean. however, since we don't have explicit type passing in haskell, reasoning about the laziness of types would be quite tricky, leading to odd things: changing type signatures (without changing the underlying types) could change the behavior of a program.
You mean, for instance, by specializing a polymorphic type via a type signature? So that

data Foo = exists a . Foo a

f ~(Foo x) = const 4 (id x)

would behave differently for (const :: a -> Int) and (const :: (a,a) -> Int)? At least the latter does not even typecheck. I suspect that all possible program behavior changes are already caught at the type/implementation level: when all types are strict, there just cannot be irrefutable patterns for existentials :)
not to mention the optimizations that would be excluded by having to preserve the strictness in types behavior... worrying about it in values is tricky enough. :)
True. But you don't have different optimizations for types and values, do you?
It would actually not be difficult to add lazy types to jhc at all. I just don't see any use for them when dealing with haskell source. but another front end.. (omega or cayenne perhaps?) certainly might make use of them.
In Haskell, only irrefutable patterns for existentials and perhaps GADTs come to mind. But it would be for the good health of our poor computer programs ("whaa, My mind just exploded.") :)
again, this is precisely how jhc implements type classes. There are numerous benefits; in particular, a single scrutinization of a type determines the concrete methods for _every_ class used on that type, so in effect you get run-time specialization for free. The huge disadvantage is that it does not play nicely at all with separate compilation; this is an active area of research for me in jhc.
Mh, type classes are in essence polymorphic variants on the type level. As types and values do not really differ, you could gain something from existing polymorphic variant implementations, like those in some ML dialects. A quick google revealed http://caml.inria.fr/pub/papers/garrigue-polymorphic_variants-ml98.pdf but I don't know if it helps. Polymorphic recursion makes things difficult. As a side effect, once you get type classes separately compiled, you get some form of polymorphic variants on the value level for free! Then, it's only a step backwards to support the open data types proposed by Löh and Hinze: http://www.informatik.uni-bonn.de/~ralf/publications/OpenDatatypes.pdf

Regards, apfelmus