proposal: add 'unsafeCoerce'

Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package. Suggested location: Data.Function.Unsafe? http://hackage.haskell.org/trac/ghc/ticket/994 Regards, Malcolm

Hi I strongly support this - I have programs which have to use CPP just to find this single function. As to the module name, its Unsafe, but it doesn't operate on data really, and its a function but so is everything else. Maybe we need to put it in Type.Unsafe, and start a new Type.* hierarchy for type level computation? Thanks Neil

ndmitchell:
Hi
I strongly support this - I have programs which have to use CPP just to find this single function.
Agreed. I have cpp for the very same thing. The # must go!
As to the module name, its Unsafe, but it doesn't operate on data really, and its a function but so is everything else. Maybe we need to put it in Type.Unsafe, and start a new Type.* hierarchy for type level computation?
Hmm. Not sure. Nothing better suggests itself other than System.IO.Unsafe. -- Don

Simon Marlow writes:
On the existence of unsafeCoerce itself, I'm happy to see it in a standard place, but we should be very careful not to guarantee _anything_ about what it does. For example, GHC has some strange restrictions on what you can unsafeCoerce, and it's possible to crash the compiler by using it.
but on the other hand: On Fri, Nov 10, 2006 at 11:32:40PM +1100, Donald Bruce Stewart wrote:
ndmitchell:
I have programs which have to use CPP just to find this single function.
Agreed. I have cpp for the very same thing. The # must go!
So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?

On Fri, 17 Nov 2006, Ross Paterson wrote:
So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?
Coq's Haskell extraction uses unsafe coerce to defeat the type system. Some of Coq's dependent types are difficult to represent in Haskell (such as (\n -> (Bool -> ... (n times) ... -> Bool)). So it uses unsafeCoerce to get around this. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

Hi
So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?
Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler. Thanks Neil

On Sun, Nov 19, 2006 at 01:30:16PM +0000, Neil Mitchell wrote:
Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler.
This is what jhc does when compiling via ghc as well. John -- John Meacham - ⑆repetae.net⑆john⑈

Hello,
I think that it is a really bad idea to make 'unsafeCoerce' a part of
the standard libraries. As far as I understand, 'unsafeCoerce' is
only "safe" if the programmer assumes something about the
representations of values (in particular, that values of different
types have the same representations). Haskell makes no such
guarantees so, by definition, any program that uses 'unsafeCoerce' is
using an implementation specific extension. I was trying ot think of
cases where 'unsafeCoerce' might be somewhat safe, and the main
example I came up with is when the coersion happens on a phantom type.
Are there other reasonably portable examples?
-Iavor
On 11/19/06, John Meacham
On Sun, Nov 19, 2006 at 01:30:16PM +0000, Neil Mitchell wrote:
Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler.
This is what jhc does when compiling via ghc as well.
John
-- John Meacham - ⑆repetae.net⑆john⑈ _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

iavor.diatchki:
Hello, I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries. As far as I understand, 'unsafeCoerce' is only "safe" if the programmer assumes something about the representations of values (in particular, that values of different types have the same representations). Haskell makes no such guarantees so, by definition, any program that uses 'unsafeCoerce' is using an implementation specific extension. I was trying ot think of cases where 'unsafeCoerce' might be somewhat safe, and the main example I came up with is when the coersion happens on a phantom type. Are there other reasonably portable examples?
Perhaps the implementations of Data.Typeable (assuming the underlying Typeable stuff was portable): cast :: (Typeable a, Typeable b) => a -> Maybe b cast x = r where r = if typeOf x == typeOf (fromJust r) then Just $ unsafeCoerce x else Nothing Its another case of the programmer having a proof not expressible in the type system. I suspect there will be more of these in the future as people write more and more proof/type system preprocessing tools and compilers. -- Don

dons:
iavor.diatchki:
Hello, I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries. As far as I understand, 'unsafeCoerce' is only "safe" if the programmer assumes something about the representations of values (in particular, that values of different types have the same representations). Haskell makes no such guarantees so, by definition, any program that uses 'unsafeCoerce' is using an implementation specific extension. I was trying ot think of cases where 'unsafeCoerce' might be somewhat safe, and the main example I came up with is when the coersion happens on a phantom type. Are there other reasonably portable examples?
A quick grep in the core libraries, for insight: FFI stuff: ./base/Data/Array/Base.hs:678: nullStablePtr = StablePtr (unsafeCoerce# 0#) ./base/Data/Array/Base.hs:1522: case unsafeCoerce# memcpy marr'# marr# n# s2# of { (# s3#, () #) -> ./base/Data/Array/Base.hs:1592: case unsafeCoerce# memcpy marr# arr# n# s2# of { (# s3#, () #) -> ./base/Data/Array/Base.hs:1657: return (STUArray l u (unsafeCoerce# marr#)) Equalities not expressible in the type system: ./base/Data/Dynamic.hs:124: toDyn v = Dynamic (typeOf v) (unsafeCoerce v) ./base/Data/Dynamic.hs:135: | typeOf def == t = unsafeCoerce v ./base/Data/Dynamic.hs:147: case unsafeCoerce v of ./base/Data/Dynamic.hs:155: Just t3 -> Just (Dynamic t3 ((unsafeCoerce f) x)) ./base/Data/Typeable.hs:452: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:460: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:470: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:480: then Just $ unsafeCoerce x I was actually surprised to see. That could be removed I'm pretty sure: ./base/Data/ByteString/Char8.hs:304: go :: Addr# -> [Char] -> ST a () go _ [] = return () go p (C# c:cs) = writeByte p (unsafeCoerce# c) >> go (p `plusAddr#` 1#) cs And some GHC things ./base/GHC/Handle.hs:1661: puts s = do write_rawBuffer 1 (unsafeCoerce# (packCString# s)) 0 (fromIntegral (length s)) ./base/GHC/ForeignPtr.hs:159: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:171: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:193: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:205: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:318: castForeignPtr f = unsafeCoerce# f And some FFI binding stuff: ./ObjectIO/Graphics/UI/ObjectIO/Process/IOState.hs:139: | ioStGetIOId ioSt == id = unsafeCoerce# (ps, ioSt) ./ObjectIO/Graphics/UI/ObjectIO/Window/Access.hs:578: | identifyWIDS wid wids = Just (unsafeCoerce# ls) So the uses fall into 2 categories: * FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically Now, can we say something portable about these uses? -- Don

On Mon, Nov 20, 2006 at 04:51:29PM +1100, Donald Bruce Stewart wrote:
So the uses fall into 2 categories:
* FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically
Now, can we say something portable about these uses?
The second group, perhaps, but the first group (unsafeCoerce#) depend on GHC's internal representations.

Ross Paterson wrote:
On Mon, Nov 20, 2006 at 04:51:29PM +1100, Donald Bruce Stewart wrote:
So the uses fall into 2 categories:
* FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically
Now, can we say something portable about these uses?
The second group, perhaps, but the first group (unsafeCoerce#) depend on GHC's internal representations.
I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought. I *think* the following cases are guaranteed to work: * cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a', * casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast). * casting an unboxed type to another unboxed type of the same size. The difficult cases arise when you want to cast a value to a type other than its real type, such as you need to do in the implementation of Dynamic for example. I doubt that we could provide a portable implementation of Dynamic. Casting boxed types to unboxed types or vice versa is pretty much guaranteed to end in tears. Cheers, Simon

simonmarhaskell:
Casting boxed types to unboxed types or vice versa is pretty much guaranteed to end in tears.
Ah, you forget the 2004 IOHCC ;) module Main where{import GHC.Base;(&)=unsafeCoerce#;main=print(tail ((\(!)->map(\(?)->C#((&)(indexWordArray#((&)(?))4#)))(!))([106,117, 115,116,32,97,32,115,105,109,112,108,101,32,102,117,110,99,116,105, 111,110,97,108,32,108,97,110,103,117,97,103,101,(0x11)]::[Int])));} $ ./a.out "just a simple functional language" Admittedly, I took about 3 days to find an example that didn't end in segf^h^h^h^htears.. :) -- Don P.S. And it seems this only ever compiled with a snapshot of ghc 6.3.

Simon Marlow
Now, can we say something portable about these uses?
I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.
And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).
* casting an unboxed type to another unboxed type of the same size.
There is one more important use case you haven't mentioned: * casting from a newtype to the contained value (or vice versa). This latter type of cast is the only one I can remember ever having used myself. Regards, Malcolm

On 11/20/06, Malcolm Wallace
There is one more important use case you haven't mentioned:
* casting from a newtype to the contained value (or vice versa).
This latter type of cast is the only one I can remember ever having used myself.
Why do you need to cast for this? You have "casting" operators in the
form of the constructor and (synthesized) destructor.
--
Taral

Taral
* casting from a newtype to the contained value (or vice versa).
Why do you need to cast for this? You have "casting" operators in the form of the constructor and (synthesized) destructor.
OK, to be more explicit, newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs Here, the convert function actually traverses the list at runtime. convert' xs = unsafeCoerce xs But this version does not - it is a type-avoiding identity. Regards, Malcolm

On 11/20/06, Malcolm Wallace
newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs
Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\
Your coercion does assume that the underlying runtime doesn't have
some kind of type-tag implementation of type classes.
--
Taral

Taral
On 11/20/06, Malcolm Wallace
wrote: newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs
Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\
And not all compilers have optimisation phases.
Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.
I believe this is guaranteed by the definition of newtype in the Language Report. Regards, Malcolm

Malcolm Wallace wrote:
Taral
writes: On 11/20/06, Malcolm Wallace
wrote: newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs
Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\
And not all compilers have optimisation phases.
Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.
I believe this is guaranteed by the definition of newtype in the Language Report.
The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations. Cheers, Simon

On Nov 21, 2006, at 8:01 AM, Simon Marlow wrote:
Malcolm Wallace wrote:
Taral
writes: On 11/20/06, Malcolm Wallace
wrote: newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs
Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\ And not all compilers have optimisation phases. Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes. I believe this is guaranteed by the definition of newtype in the Language Report.
The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.
From Section 4.2.3, from the Haskell report: A declaration of the form newtype cx => T u1 ... uk = N t introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object. I'd say that discusses the runtime representation.
Cheers, Simon
Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Robert Dockins wrote:
On Nov 21, 2006, at 8:01 AM, Simon Marlow wrote:
Malcolm Wallace wrote:
Taral
writes: On 11/20/06, Malcolm Wallace
wrote: newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs
Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\
And not all compilers have optimisation phases.
Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.
I believe this is guaranteed by the definition of newtype in the Language Report.
The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.
From Section 4.2.3, from the Haskell report:
A declaration of the form
newtype cx => T u1 ... uk = N t
introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object.
*blink* I stand corrected (for the second time today, duh, maybe I should check facts before trusting my memory next time...). I have no idea why the report does say that though. Seems very odd, there's no need to mention the representation. Indeed, the language provides no way (absent unsafeCoerce) for a programmer to determine what the representation is, so how should we interpret that paragraph? An invisible requirement or an implementation hint? Cheers, Simon

Simon Marlow
From Section 4.2.3, from the Haskell report:
A declaration of the form
newtype cx => T u1 ... uk = N t
introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object.
*blink*
I stand corrected (for the second time today, duh, maybe I should check facts before trusting my memory next time...).
I have no idea why the report does say that though. Seems very odd, there's no need to mention the representation. Indeed, the language provides no way (absent unsafeCoerce) for a programmer to determine what the representation is, so how should we interpret that paragraph? An invisible requirement or an implementation hint?
Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function'' safeCoerce :: a -> b which can only be used if a and b have the same representation, as mandated by the language definition? (Or if the compiler can derive that for the type language currently used...) Wolfram

On 21 Nov 2006 15:41:01 -0500, kahl@cas.mcmaster.ca
Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function''
safeCoerce :: a -> b
which can only be used if a and b have the same representation, as mandated by the language definition?
It wouldn't be unreasonable for the compiler to provide instances of
TypeEq for newtypes and product types.
--
Taral

Hi
Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function''
safeCoerce :: a -> b
While this may be a useful function (whatever semantics it might end up having), its not the function that I need to use :) The whole point of my use is to make Haskell compilers accept a program that does not have a type system that can be expressed in Haskell, or that can't be easily expressed. This has to be unsafe, because if the necessary safety invariants could be inferred by Haskell, that's basically the same as it doing the type checking. If you give unsafeCoerce any safety properties, I think you're going to destroy at least some of its uses. Thanks Neil

On Tue, Nov 21, 2006 at 01:01:46PM +0000, Simon Marlow wrote:
The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.
indeed, in fact, even though jhc uses the same representation for newtypes as their value inside, unsafeCoerce here is still not valid. The reason being that various core-transformations inside the compiler are type directed, and the explicit knowledge that we are flipping something to its newtype or back is very important to guarenteed the safety of these optimizations. John -- John Meacham - ⑆repetae.net⑆john⑈

On Mon, Nov 20, 2006 at 02:45:13PM +0000, Malcolm Wallace wrote:
Simon Marlow
wrote: Now, can we say something portable about these uses?
I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.
And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).
* casting an unboxed type to another unboxed type of the same size.
There is one more important use case you haven't mentioned:
* casting from a newtype to the contained value (or vice versa).
This latter type of cast is the only one I can remember ever having used myself.
there are very few safe uses of unsafeCoerce in jhc. the only ones guarenteed safe are * casting a recursive newtype to its representation and back (note, recursive newtypes are chosen via a loop-breaking algorithm in the compiler, so it is best to let it worry about this) * casting arbitrary values of kind * to a system provided type 'Box' and back again * casting arbitrary values of kind ! (the kind of strict boxed values) to a system provided type 'BoxBang' and back again * values may not ever be cast to anything other than Box and back _to their original type_ again. Even passing through any other type will poison them. in particular, even 'seq' is unsafe on casted values until they have been cast back to their original type. casting of unboxed types is right out (and guarenteed impossible by the type system) there are specific primitives to cast from kind * values to the analog type of kind ! values and back again these are not casts and are not subject to the above rules. (though, converting something of kind * to kind ! necessarily evaluates it). any cross-platform use of unsafeCoerce is folly, unless of course you have verified it is possible to use it in each compilers documentation you wish to support. since this is a perfectly reasonable thing to do, I support putting unsafeCoerce in a standard spot, but with no standardized behavior. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
On Mon, Nov 20, 2006 at 02:45:13PM +0000, Malcolm Wallace wrote:
Simon Marlow
wrote: Now, can we say something portable about these uses?
I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.
And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).
* casting an unboxed type to another unboxed type of the same size.
There is one more important use case you haven't mentioned:
* casting from a newtype to the contained value (or vice versa).
This latter type of cast is the only one I can remember ever having used myself.
there are very few safe uses of unsafeCoerce in jhc. the only ones guarenteed safe are
* casting a recursive newtype to its representation and back (note, recursive newtypes are chosen via a loop-breaking algorithm in the compiler, so it is best to let it worry about this) * casting arbitrary values of kind * to a system provided type 'Box' and back again * casting arbitrary values of kind ! (the kind of strict boxed values) to a system provided type 'BoxBang' and back again
Is kind ! visible to the programmer? How? What are you allowed/not allowed to do with something of kind !? I added a kind ! to GHC recently, but it has a very limited use: it's the kind of boxed/unlifted primitive types like ByteArray# and MutVar#. So you can write polymorphic functions over boxed/unlifted things, and one day maybe have arrays of them. Cheers, Simon

On Thu, Nov 30, 2006 at 12:23:41PM +0000, Simon Marlow wrote:
Is kind ! visible to the programmer? How? What are you allowed/not allowed to do with something of kind !?
Yes it is, jhc has a rather robust kind system, originally developed to allow arbitrary typesafe mixing of strict and lazy code internally, I have found it quite useful to make its full power available to the programmer. A PTS is a type system that is parameterized by a set of 'sorts' that make up the kinds and superkinds in the system, a set of axioms describing the relationship between the sorts, and a set of rules describing what lambda abstractions are allowed based on sort. the PTS (pure type system) that jhc implements has the following 6 sorts. (in PTS's, both kinds and superkinds are considered "sorts") (#) - the kind of unboxed tuples # - the kind of unboxed values ! - the kind of boxed strict values * - the kind of boxed lazy values ** - the superkind of all boxed values ## - the superkind of all unboxed values the meaning of 'lazy' is exactly _|_ :: t iff t::* the axioms are as follows (*::**, #::##, (#)::##, !::**) basically, what you would expect, splitting things up into boxed and unboxed values. the way rules work are that they describe what you can abstract from what and then what the result is. so (#,*,*) - you can have a function from an unboxed value to a lazy boxed value and that function itself is lazy and boxed. (from,to,what) means we can have (\x:::from -> i:::to):::what a where ::: means the type of the type, so x:::y <-> exists t . x :: t and t :: y so, we have all of functions from any of #,*,! to #,*,! are allowed, and the result is a lazy boxed value. as a shortcut I will represent this as (*#!,*#!,*) (actually 9 rules in one) now there are a couple interesting ones: (*#!,(#),*) - a function may return an unboxed tuple and that function is lazy and boxed and the novel: ((#),*#!(#),!) - a function taking an unboxed tuple is a _strict_ value that is a function, meaning the function itself cannot be _|_. so, given the above rules, we can express any combination of lazy and strict interaction in a typesafe way. including the distinction between strict and lazy functional values. and on to the rules allowing types: (**,*,*) -- this says we can have functions from boxed types to values, as in forall a . a -> a we do not have a rule of the form (##,?,?) because an unboxed type cannot be passed to a function in a generic way, as values in it have a unique run-time representation. and we have (**,**,**) functions from types to types, such as the list constructor ([] :: * -> *) (**,##,##) - functions from types to unboxed types, often used as phantom arguments such as the declaration data Array__ a :: # making Array__ :: * -> # now, a great thing about this is that it is a valid PTS! so I get proofs of completeness, soundness, and a variety of robust typechecking algorithms for free! yay! in haskell source itself, you can use all of these kinds, the inference algorithm will only infer simple kinds, as in * or * -> * (ones made up of *'s and arrows) so you must use kind annotations to bring about other ones. some useful examples data StrictList a :: ! = Cons a (StrictList a) | Nil making StrictList :: * -> ! since a strictlist is of kind !, it can never contain _|_ in it, but the list argument is of kind *, so the list is strict in its spine, but may contain lazy values. data World__ :: # declares an unboxed type World__ with no values and hence no runtime representation. there is even a mild form of kind polymorphism allowed, with the folowing subkinding going on: ? / \ ?? (#) /\ *? # / \ * ! note that ?, ??, and *? are not kinds themselves, but can be used in haskell source files to represent that something can take more than one kind. for instance, IO has type data World__ :: # newtype IO (a::??) = IO (World__ -> (# World__, a #)) meaning, an IO action may return any sort of value, boxed, unboxed, strict or lazy (but not an unboxed tuple) any instantiation of IO will choose a concrete kind though. the declaration declares a simple sort of 'kind scheme'. none of ?, ??, or *? exist in core. they are only placeholders used in the kind inference algorithm presented to the programmer. an experimental feature is the 'undo' monad, whose syntax looks like the do monad, but is actually restricted to IO and function arguments and return values are automatically pulled into unboxed tuples and back again. It is entirely possible I mihht have an 'unboxed' mode, where '!' is the default infered kind for everything and unboxed tuples are automatically inserted at the appropriate points. it would make a very nice strict language with the full power of haskell type classes and the type system with the ability to call back and forth to haskell nicely! A front end for a strict language would just never produce anything of kind *. my original motivation for adding the ! kind was purely motivated by the back end, the grin back end is great at numerics, but is absolutely horrible at updates, doing update in place because I don't want to introduce the concept of redirections I would have to check for. Things of kind '!' never need indirections, as they are already evaluated, so they can use a more efficient representation, still boxed, but the code can just look directly into it or pass arguments to it without having to worry about it possibly being an unevaluated thunk. I am having some fun in 'undo', my strict toy language... so many distractions along the way, all I wanted was a nice way to avoid thunk indirections :) John -- John Meacham - ⑆repetae.net⑆john⑈

On 20/11/06, Simon Marlow
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
Couldn't we have the following for this case? castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z Not general, but it'd work in this case. And no need for unsafeCoerce. -- -David House, dmhouse@gmail.com

"David House"
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
Couldn't we have the following for this case?
castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z
castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z
No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs. Regards, Malcolm

Hello,
I just wanted to point out that even the 'reasonable' uses of
'unsafeCoerce' (i.e., where a type does not appear directly in a
value) may not be compatible between different implementations.
Consider, for example, an implementation which turns a polymorphic
program into a monomorphic one by generating different versions of
functions for each of its instantiations (this is not a new idea). In
particular, one also has to specialize constructor functions, which
amounts to specializing datatypes. For example, 'Either@Int@Char'
and 'Either@Int@Bool' become completely separate types. In addition,
we may choose to add only those constructors of a specialized type
that actually appear in the program. For example, 'Either@Int@Char'
may have both constructors 'Left@Int@Char' and 'Righ@Int@Char', while
'Either@Int@Bool' may only have a single constructor 'Right@Int@Bool'.
Now, if we use an integer tag to distinguish constructors, then by
using 'unsafeCoerce' we may end up casting 'Left@Int@Char' into
'Right@Int@Bool'.
-Iavor
On 11/21/06, Malcolm Wallace
"David House"
wrote: * cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
Couldn't we have the following for this case?
castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z
castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z
No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs.
Regards, Malcolm _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Iavor Diatchki wrote:
Hello, I just wanted to point out that even the 'reasonable' uses of 'unsafeCoerce' (i.e., where a type does not appear directly in a value) may not be compatible between different implementations. Consider, for example, an implementation which turns a polymorphic program into a monomorphic one by generating different versions of functions for each of its instantiations (this is not a new idea). In particular, one also has to specialize constructor functions, which amounts to specializing datatypes. For example, 'Either@Int@Char' and 'Either@Int@Bool' become completely separate types. In addition, we may choose to add only those constructors of a specialized type that actually appear in the program. For example, 'Either@Int@Char' may have both constructors 'Left@Int@Char' and 'Righ@Int@Char', while 'Either@Int@Bool' may only have a single constructor 'Right@Int@Bool'. Now, if we use an integer tag to distinguish constructors, then by using 'unsafeCoerce' we may end up casting 'Left@Int@Char' into 'Right@Int@Bool'.
Quite right. I didn't mean to suggest that this property of unsafeCoerce should be a defined part of its behaviour across implementations, just that it currently works in GHC. Cheers, Simon

On 2006-11-21, Malcolm Wallace
"David House"
wrote: * cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',
Couldn't we have the following for this case?
castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z
castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z
No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs.
How about:
castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL _ = error "castEitherL on Right"
castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR _ = error "castEitherR on Left"
-- Aaron Denney -><-

Hi
I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries.
Currently to use unsafeCoerce you have to do two evil things (assume about the type system, use CPP). Since people already do use unsafeCoerce, we can at least half the evilness of these people ;) Thanks Neil

Malcolm Wallace wrote:
Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package. Suggested location: Data.Function.Unsafe?
I'd much prefer Data.Unsafe. (It is a function, but has nothing to do with functions as data type.) Wolfram

kahl@cas.mcmaster.ca wrote:
I'd much prefer Data.Unsafe.
I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies. Otherwise unsafeCoerce should go with all the other unsafe functions in System.IO.Unsafe. -- Ashley Yakeley

On 11/10/06, Ashley Yakeley
kahl@cas.mcmaster.ca wrote:
I'd much prefer Data.Unsafe.
I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.
Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!

On Friday 10 November 2006 17:43, Samuel Bronson wrote:
On 11/10/06, Ashley Yakeley
wrote: kahl@cas.mcmaster.ca wrote:
I'd much prefer Data.Unsafe.
I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.
Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!
It would be nice to be able to disallow all unsafe code by managing module imports. Suppose I want to run untrusted code. If I can verify that it doesn't use FFI, that it uses no unsafe primitives, and that it typechecks, then I know it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good. So, I guess count this as another vote for Unsafe.* -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG

robdockins:
On Friday 10 November 2006 17:43, Samuel Bronson wrote:
On 11/10/06, Ashley Yakeley
wrote: kahl@cas.mcmaster.ca wrote:
I'd much prefer Data.Unsafe.
I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.
Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!
It would be nice to be able to disallow all unsafe code by managing module imports.
yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on). Currently unsafe things are scattered around System.*, Data.Array.* Control.*. This isn't ideal.
Suppose I want to run untrusted code. If I can verify that it doesn't use FFI, that it uses no unsafe primitives, and that it typechecks, then I know it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good.
So, I guess count this as another vote for Unsafe.*
-- Don

Hello Donald, Saturday, November 11, 2006, 5:47:36 AM, you wrote:
yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on).
Currently unsafe things are scattered around System.*, Data.Array.* Control.*.
isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM? -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Hi
isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?
It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects. Thanks Neil

Hello Neil, Saturday, November 11, 2006, 12:54:20 PM, you wrote:
It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects.
"look but don't touch" ? :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

I'd actually like to have an -unsafe flag which one would have to use
in order to compile a module using functions/values marked as unsafe.
Unsafety would be inherited by default, and using a compiler pragma
would avoid this inheritance (but the module would still need to be
compiled with -unsafe, just not its children.)
I tend to think of the unsafe functions as hooks into the
compiler/runtime to be used for modifying the language, not really as
part of the language itself.
Thanks,
- Cale
On 11/11/06, Neil Mitchell
Hi
isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?
It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects.
Thanks
Neil _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Saturday 11 November 2006 03:50, Bulat Ziganshin wrote:
Hello Donald,
Saturday, November 11, 2006, 5:47:36 AM, you wrote:
yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on).
Currently unsafe things are scattered around System.*, Data.Array.* Control.*.
isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?
Well, since you bring it up, I'm currently working on this topic. Yes, you can examine a program bytecode file and disallow FFI and primitive actions _in that module_ pretty easily. However, if your untrusted module imports Foreign.IO.unsafePerformIO, or Some.PathTo.unsafeCoerce, all your hard work goes out the window. The idea is to segregate all the unsafe code in one place do its easy to discriminate against it at the module level rather than at the individual function level. -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG

Hello Robert, Saturday, November 11, 2006, 4:06:20 AM, you wrote:
it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good.
So, I guess count this as another vote for Unsafe.*
don't forget about Data.Array.Base and Data.ByteString.Base modules :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

I wrote (back in November):
Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package.
Discussion finished, with a general feeling that this would be a good thing. There were some objections from Iavor (and others) that the behaviour of 'unsafeCoerce' is entirely implementation-dependent. Others replied, yes, indeed, but it _is_ marked "unsafe". The proposal merely makes the same name available in all compilers (to avoid the need for CPP), but gives no guarantees about its effects. You must satisfy yourself about whether any given usage is safe or not, with any given compiler. So I have just pushed a patch for this. The final location chosen in the module hierarchy was Unsafe.Coerce. Regards, Malcolm
participants (17)
-
Aaron Denney
-
Ashley Yakeley
-
Bulat Ziganshin
-
Cale Gibbard
-
David House
-
dons@cse.unsw.edu.au
-
Iavor Diatchki
-
John Meacham
-
kahl@cas.mcmaster.ca
-
Malcolm Wallace
-
Neil Mitchell
-
Robert Dockins
-
roconnor@theorem.ca
-
Ross Paterson
-
Samuel Bronson
-
Simon Marlow
-
Taral