proposal: add 'unsafeCoerce'

newer
Re: [Haskell] ANNOUNCE: new FGL...

Malcolm Wallace

10 Nov 2006 10 Nov '06

11:47 a.m.

Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package. Suggested location: Data.Function.Unsafe? http://hackage.haskell.org/trac/ghc/ticket/994 Regards, Malcolm

Show replies by date

Neil Mitchell

10 Nov 10 Nov

12:14 p.m.

Hi I strongly support this - I have programs which have to use CPP just to find this single function. As to the module name, its Unsafe, but it doesn't operate on data really, and its a function but so is everything else. Maybe we need to put it in Type.Unsafe, and start a new Type.* hierarchy for type level computation? Thanks Neil

dons＠cse.unsw.edu.au

12:32 p.m.

ndmitchell:

...

Hi

I strongly support this - I have programs which have to use CPP just to find this single function.

Agreed. I have cpp for the very same thing. The # must go!

...

As to the module name, its Unsafe, but it doesn't operate on data really, and its a function but so is everything else. Maybe we need to put it in Type.Unsafe, and start a new Type.* hierarchy for type level computation?

Hmm. Not sure. Nothing better suggests itself other than System.IO.Unsafe. -- Don

Ross Paterson

17 Nov 17 Nov

1:36 p.m.

Simon Marlow writes:

...

On the existence of unsafeCoerce itself, I'm happy to see it in a standard place, but we should be very careful not to guarantee _anything_ about what it does. For example, GHC has some strange restrictions on what you can unsafeCoerce, and it's possible to crash the compiler by using it.

but on the other hand: On Fri, Nov 10, 2006 at 11:32:40PM +1100, Donald Bruce Stewart wrote:

...

ndmitchell:

...
I have programs which have to use CPP just to find this single function.

Agreed. I have cpp for the very same thing. The # must go!

So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?

roconnor＠theorem.ca

19 Nov 19 Nov

10:28 a.m.

On Fri, 17 Nov 2006, Ross Paterson wrote:

...

So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?

Coq's Haskell extraction uses unsafe coerce to defeat the type system. Some of Coq's dependent types are difficult to represent in Haskell (such as (\n -> (Bool -> ... (n times) ... -> Bool)). So it uses unsafeCoerce to get around this. -- Russell O'Connor http://r6.ca/ ``All talk about `theft,''' the general counsel of the American Graphophone Company wrote, ``is the merest claptrap, for there exists no property in ideas musical, literary or artistic, except as defined by statute.''

Neil Mitchell

1:30 p.m.

...

So presumably Neil and Don think there are situations where it can be safely used across implementations. What are they?

Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler. Thanks Neil

John Meacham

7:46 p.m.

On Sun, Nov 19, 2006 at 01:30:16PM +0000, Neil Mitchell wrote:

...

Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler.

This is what jhc does when compiling via ghc as well. John -- John Meacham - ⑆repetae.net⑆john⑈

Iavor Diatchki

20 Nov 20 Nov

5:38 a.m.

Hello, I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries. As far as I understand, 'unsafeCoerce' is only "safe" if the programmer assumes something about the representations of values (in particular, that values of different types have the same representations). Haskell makes no such guarantees so, by definition, any program that uses 'unsafeCoerce' is using an implementation specific extension. I was trying ot think of cases where 'unsafeCoerce' might be somewhat safe, and the main example I came up with is when the coersion happens on a phantom type. Are there other reasonably portable examples? -Iavor On 11/19/06, John Meacham wrote:

...

On Sun, Nov 19, 2006 at 01:30:16PM +0000, Neil Mitchell wrote:

...
Yhc translates Haskell to a well-typed (using rank-2 types, but not explicitly typed) language. To translate this language back to Haskell requires type annotations (which were lost ages ago), or lots of unsafeCoerce, which is the significantly easier method. In practice the unsafeCoerce's have never gone wrong on any compiler.

This is what jhc does when compiling via ghc as well.

John

-- John Meacham - ⑆repetae.net⑆john⑈ _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

dons＠cse.unsw.edu.au

5:42 a.m.

iavor.diatchki:

...

Hello, I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries. As far as I understand, 'unsafeCoerce' is only "safe" if the programmer assumes something about the representations of values (in particular, that values of different types have the same representations). Haskell makes no such guarantees so, by definition, any program that uses 'unsafeCoerce' is using an implementation specific extension. I was trying ot think of cases where 'unsafeCoerce' might be somewhat safe, and the main example I came up with is when the coersion happens on a phantom type. Are there other reasonably portable examples?

Perhaps the implementations of Data.Typeable (assuming the underlying Typeable stuff was portable): cast :: (Typeable a, Typeable b) => a -> Maybe b cast x = r where r = if typeOf x == typeOf (fromJust r) then Just $ unsafeCoerce x else Nothing Its another case of the programmer having a proof not expressible in the type system. I suspect there will be more of these in the future as people write more and more proof/type system preprocessing tools and compilers. -- Don

dons＠cse.unsw.edu.au

5:51 a.m.

dons:

...

iavor.diatchki:

...
Hello, I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries. As far as I understand, 'unsafeCoerce' is only "safe" if the programmer assumes something about the representations of values (in particular, that values of different types have the same representations). Haskell makes no such guarantees so, by definition, any program that uses 'unsafeCoerce' is using an implementation specific extension. I was trying ot think of cases where 'unsafeCoerce' might be somewhat safe, and the main example I came up with is when the coersion happens on a phantom type. Are there other reasonably portable examples?

A quick grep in the core libraries, for insight: FFI stuff: ./base/Data/Array/Base.hs:678: nullStablePtr = StablePtr (unsafeCoerce# 0#) ./base/Data/Array/Base.hs:1522: case unsafeCoerce# memcpy marr'# marr# n# s2# of { (# s3#, () #) -> ./base/Data/Array/Base.hs:1592: case unsafeCoerce# memcpy marr# arr# n# s2# of { (# s3#, () #) -> ./base/Data/Array/Base.hs:1657: return (STUArray l u (unsafeCoerce# marr#)) Equalities not expressible in the type system: ./base/Data/Dynamic.hs:124: toDyn v = Dynamic (typeOf v) (unsafeCoerce v) ./base/Data/Dynamic.hs:135: | typeOf def == t = unsafeCoerce v ./base/Data/Dynamic.hs:147: case unsafeCoerce v of ./base/Data/Dynamic.hs:155: Just t3 -> Just (Dynamic t3 ((unsafeCoerce f) x)) ./base/Data/Typeable.hs:452: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:460: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:470: then Just $ unsafeCoerce x ./base/Data/Typeable.hs:480: then Just $ unsafeCoerce x I was actually surprised to see. That could be removed I'm pretty sure: ./base/Data/ByteString/Char8.hs:304: go :: Addr# -> [Char] -> ST a () go _ [] = return () go p (C# c:cs) = writeByte p (unsafeCoerce# c) >> go (p `plusAddr#` 1#) cs And some GHC things ./base/GHC/Handle.hs:1661: puts s = do write_rawBuffer 1 (unsafeCoerce# (packCString# s)) 0 (fromIntegral (length s)) ./base/GHC/ForeignPtr.hs:159: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:171: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:193: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:205: (# s, ForeignPtr (byteArrayContents# (unsafeCoerce# mbarr#)) ./base/GHC/ForeignPtr.hs:318: castForeignPtr f = unsafeCoerce# f And some FFI binding stuff: ./ObjectIO/Graphics/UI/ObjectIO/Process/IOState.hs:139: | ioStGetIOId ioSt == id = unsafeCoerce# (ps, ioSt) ./ObjectIO/Graphics/UI/ObjectIO/Window/Access.hs:578: | identifyWIDS wid wids = Just (unsafeCoerce# ls) So the uses fall into 2 categories: * FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically Now, can we say something portable about these uses? -- Don

Ross Paterson

12:12 p.m.

On Mon, Nov 20, 2006 at 04:51:29PM +1100, Donald Bruce Stewart wrote:

...

So the uses fall into 2 categories:

* FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically

Now, can we say something portable about these uses?

The second group, perhaps, but the first group (unsafeCoerce#) depend on GHC's internal representations.

Simon Marlow

1:40 p.m.

Ross Paterson wrote:

...

On Mon, Nov 20, 2006 at 04:51:29PM +1100, Donald Bruce Stewart wrote:

...
So the uses fall into 2 categories:

* FFI binding and raw pointer/foreign data manipulation * type equalities known, but not expressible statically

Now, can we say something portable about these uses?

The second group, perhaps, but the first group (unsafeCoerce#) depend on GHC's internal representations.

I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought. I *think* the following cases are guaranteed to work: * cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a', * casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast). * casting an unboxed type to another unboxed type of the same size. The difficult cases arise when you want to cast a value to a type other than its real type, such as you need to do in the implementation of Dynamic for example. I doubt that we could provide a portable implementation of Dynamic. Casting boxed types to unboxed types or vice versa is pretty much guaranteed to end in tears. Cheers, Simon

dons＠cse.unsw.edu.au

1:48 p.m.

simonmarhaskell:

...

Casting boxed types to unboxed types or vice versa is pretty much guaranteed to end in tears.

Ah, you forget the 2004 IOHCC ;) module Main where{import GHC.Base;(&)=unsafeCoerce#;main=print(tail ((\(!)->map(\(?)->C#((&)(indexWordArray#((&)(?))4#)))(!))([106,117, 115,116,32,97,32,115,105,109,112,108,101,32,102,117,110,99,116,105, 111,110,97,108,32,108,97,110,103,117,97,103,101,(0x11)]::[Int])));} $ ./a.out "just a simple functional language" Admittedly, I took about 3 days to find an example that didn't end in segf^h^h^h^htears.. :) -- Don P.S. And it seems this only ever compiled with a snapshot of ghc 6.3.

Malcolm Wallace

2:45 p.m.

Simon Marlow wrote:

...

...
...
Now, can we say something portable about these uses?

I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.

And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.

...

* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).

* casting an unboxed type to another unboxed type of the same size.

There is one more important use case you haven't mentioned: * casting from a newtype to the contained value (or vice versa). This latter type of cast is the only one I can remember ever having used myself. Regards, Malcolm

Taral

8:34 p.m.

On 11/20/06, Malcolm Wallace wrote:

...

There is one more important use case you haven't mentioned:

* casting from a newtype to the contained value (or vice versa).

This latter type of cast is the only one I can remember ever having used myself.

Why do you need to cast for this? You have "casting" operators in the form of the constructor and (synthesized) destructor. -- Taral "You can't prove anything." -- Gödel's Incompetence Theorem

Malcolm Wallace

9:19 p.m.

Taral writes:

...

...
* casting from a newtype to the contained value (or vice versa).

Why do you need to cast for this? You have "casting" operators in the form of the constructor and (synthesized) destructor.

OK, to be more explicit, newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs Here, the convert function actually traverses the list at runtime. convert' xs = unsafeCoerce xs But this version does not - it is a type-avoiding identity. Regards, Malcolm

Taral

11:46 p.m.

On 11/20/06, Malcolm Wallace wrote:

...

newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs

Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\ Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes. -- Taral "You can't prove anything." -- Gödel's Incompetence Theorem

Malcolm Wallace

21 Nov 21 Nov

9:40 a.m.

Taral writes:

...

On 11/20/06, Malcolm Wallace wrote:

...
newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs

Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\

And not all compilers have optimisation phases.

...

Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.

I believe this is guaranteed by the definition of newtype in the Language Report. Regards, Malcolm

Simon Marlow

1:01 p.m.

Malcolm Wallace wrote:

...

Taral writes:

...
On 11/20/06, Malcolm Wallace wrote:

...
newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs

Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\

And not all compilers have optimisation phases.

...
Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.

I believe this is guaranteed by the definition of newtype in the Language Report.

The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations. Cheers, Simon

Robert Dockins

2:47 p.m.

On Nov 21, 2006, at 8:01 AM, Simon Marlow wrote:

...

Malcolm Wallace wrote:

...
Taral writes:

...
On 11/20/06, Malcolm Wallace wrote:

...
newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs

Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\ And not all compilers have optimisation phases. Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes. I believe this is guaranteed by the definition of newtype in the Language Report.

The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.

From Section 4.2.3, from the Haskell report: A declaration of the form newtype cx => T u1 ... uk = N t introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object. I'd say that discusses the runtime representation.

...

Cheers, Simon

Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Simon Marlow

3:09 p.m.

Robert Dockins wrote:

...

On Nov 21, 2006, at 8:01 AM, Simon Marlow wrote:

...
Malcolm Wallace wrote:

...
Taral writes:

...
On 11/20/06, Malcolm Wallace wrote:

...
newtype Wrapper a = Wrap a convert :: [a] -> [Wrapper a] convert xs = map Wrap xs

Interesting! Looks like the compiler lacks rules for optimizing "map id" &c.\

And not all compilers have optimisation phases.

...
Your coercion does assume that the underlying runtime doesn't have some kind of type-tag implementation of type classes.

I believe this is guaranteed by the definition of newtype in the Language Report.

The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.

From Section 4.2.3, from the Haskell report:

A declaration of the form

newtype cx => T u1 ... uk = N t

introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object.

*blink* I stand corrected (for the second time today, duh, maybe I should check facts before trusting my memory next time...). I have no idea why the report does say that though. Seems very odd, there's no need to mention the representation. Indeed, the language provides no way (absent unsafeCoerce) for a programmer to determine what the representation is, so how should we interpret that paragraph? An invisible requirement or an implementation hint? Cheers, Simon

kahl＠cas.mcmaster.ca

8:41 p.m.

Simon Marlow wrote:

...

...
From Section 4.2.3, from the Haskell report:

A declaration of the form

newtype cx => T u1 ... uk = N t

introduces a new type whose representation is the same as an existing type. The type (T u1 ... uk) renames the datatype t. It differs from a type synonym in that it creates a distinct type that must be explicitly coerced to or from the original type. Also, unlike type synonyms, newtype may be used to define recursive types. The constructor N in an expression coerces a value from type t to type (T u1 ... uk). Using N in a pattern coerces a value from type (T u1 ... uk) to type t. These coercions may be implemented without execution time overhead; newtype does not change the underlying representation of an object.

*blink*

I stand corrected (for the second time today, duh, maybe I should check facts before trusting my memory next time...).

I have no idea why the report does say that though. Seems very odd, there's no need to mention the representation. Indeed, the language provides no way (absent unsafeCoerce) for a programmer to determine what the representation is, so how should we interpret that paragraph? An invisible requirement or an implementation hint?

Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function'' safeCoerce :: a -> b which can only be used if a and b have the same representation, as mandated by the language definition? (Or if the compiler can derive that for the type language currently used...) Wolfram

Taral

11:18 p.m.

On 21 Nov 2006 15:41:01 -0500, kahl@cas.mcmaster.ca wrote:

...

Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function''

safeCoerce :: a -> b

which can only be used if a and b have the same representation, as mandated by the language definition?

It wouldn't be unreasonable for the compiler to provide instances of TypeEq for newtypes and product types. -- Taral "You can't prove anything." -- Gödel's Incompetence Theorem

Neil Mitchell

11:24 p.m.

...

Or as a hint that we should introduce a ``pseudo-polymorphic pseudo-function''

safeCoerce :: a -> b

While this may be a useful function (whatever semantics it might end up having), its not the function that I need to use :) The whole point of my use is to make Haskell compilers accept a program that does not have a type system that can be expressed in Haskell, or that can't be easily expressed. This has to be unsafe, because if the necessary safety invariants could be inferred by Haskell, that's basically the same as it doing the type checking. If you give unsafeCoerce any safety properties, I think you're going to destroy at least some of its uses. Thanks Neil

John Meacham

28 Nov 28 Nov

4:12 a.m.

On Tue, Nov 21, 2006 at 01:01:46PM +0000, Simon Marlow wrote:

...

The language doesn't say anything about the runtime representation of newtype. It so happens that the semantics lead to the obvious implementation of a newtype as a type cast (that was the reason for introducing newtype, after all), but there's nothing to say you have to implement it this way. It would be wrong to require that unsafeCoerce let you convert between a newtype and its underlying type across implementations.

indeed, in fact, even though jhc uses the same representation for newtypes as their value inside, unsafeCoerce here is still not valid. The reason being that various core-transformations inside the compiler are type directed, and the explicit knowledge that we are flipping something to its newtype or back is very important to guarenteed the safety of these optimizations. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham

4:04 a.m.

On Mon, Nov 20, 2006 at 02:45:13PM +0000, Malcolm Wallace wrote:

...

Simon Marlow wrote:

...
...
...
Now, can we say something portable about these uses?

I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.

And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.

...
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).

* casting an unboxed type to another unboxed type of the same size.

There is one more important use case you haven't mentioned:

* casting from a newtype to the contained value (or vice versa).

This latter type of cast is the only one I can remember ever having used myself.

there are very few safe uses of unsafeCoerce in jhc. the only ones guarenteed safe are * casting a recursive newtype to its representation and back (note, recursive newtypes are chosen via a loop-breaking algorithm in the compiler, so it is best to let it worry about this) * casting arbitrary values of kind * to a system provided type 'Box' and back again * casting arbitrary values of kind ! (the kind of strict boxed values) to a system provided type 'BoxBang' and back again * values may not ever be cast to anything other than Box and back _to their original type_ again. Even passing through any other type will poison them. in particular, even 'seq' is unsafe on casted values until they have been cast back to their original type. casting of unboxed types is right out (and guarenteed impossible by the type system) there are specific primitives to cast from kind * values to the analog type of kind ! values and back again these are not casts and are not subject to the above rules. (though, converting something of kind * to kind ! necessarily evaluates it). any cross-platform use of unsafeCoerce is folly, unless of course you have verified it is possible to use it in each compilers documentation you wish to support. since this is a perfectly reasonable thing to do, I support putting unsafeCoerce in a standard spot, but with no standardized behavior. John -- John Meacham - ⑆repetae.net⑆john⑈

Simon Marlow

30 Nov 30 Nov

12:23 p.m.

John Meacham wrote:

...

On Mon, Nov 20, 2006 at 02:45:13PM +0000, Malcolm Wallace wrote:

...
Simon Marlow wrote:

...
...
...
Now, can we say something portable about these uses?

I'd like to have a precise (sound, if not complete) description of when it's safe to use unsafeCoerce in GHC, but it needs some careful thought.

And not just GHC. I think all the points you mention (below) would be entirely reasonable for all implementations.

...
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

* casting a polymorphic type to the actual type of the runtime value. That is, you can safely cast a value to its correct type. (eg. in Typeable.cast).

* casting an unboxed type to another unboxed type of the same size.

There is one more important use case you haven't mentioned:

* casting from a newtype to the contained value (or vice versa).

This latter type of cast is the only one I can remember ever having used myself.

there are very few safe uses of unsafeCoerce in jhc. the only ones guarenteed safe are

* casting a recursive newtype to its representation and back (note, recursive newtypes are chosen via a loop-breaking algorithm in the compiler, so it is best to let it worry about this) * casting arbitrary values of kind * to a system provided type 'Box' and back again * casting arbitrary values of kind ! (the kind of strict boxed values) to a system provided type 'BoxBang' and back again

Is kind ! visible to the programmer? How? What are you allowed/not allowed to do with something of kind !? I added a kind ! to GHC recently, but it has a very limited use: it's the kind of boxed/unlifted primitive types like ByteArray# and MutVar#. So you can write polymorphic functions over boxed/unlifted things, and one day maybe have arrays of them. Cheers, Simon

John Meacham

1:42 p.m.

On Thu, Nov 30, 2006 at 12:23:41PM +0000, Simon Marlow wrote:

...

Is kind ! visible to the programmer? How? What are you allowed/not allowed to do with something of kind !?

Yes it is, jhc has a rather robust kind system, originally developed to allow arbitrary typesafe mixing of strict and lazy code internally, I have found it quite useful to make its full power available to the programmer. A PTS is a type system that is parameterized by a set of 'sorts' that make up the kinds and superkinds in the system, a set of axioms describing the relationship between the sorts, and a set of rules describing what lambda abstractions are allowed based on sort. the PTS (pure type system) that jhc implements has the following 6 sorts. (in PTS's, both kinds and superkinds are considered "sorts") (#) - the kind of unboxed tuples # - the kind of unboxed values ! - the kind of boxed strict values * - the kind of boxed lazy values ** - the superkind of all boxed values ## - the superkind of all unboxed values the meaning of 'lazy' is exactly _|_ :: t iff t::* the axioms are as follows (*::**, #::##, (#)::##, !::**) basically, what you would expect, splitting things up into boxed and unboxed values. the way rules work are that they describe what you can abstract from what and then what the result is. so (#,*,*) - you can have a function from an unboxed value to a lazy boxed value and that function itself is lazy and boxed. (from,to,what) means we can have (\x:::from -> i:::to):::what a where ::: means the type of the type, so x:::y <-> exists t . x :: t and t :: y so, we have all of functions from any of #,*,! to #,*,! are allowed, and the result is a lazy boxed value. as a shortcut I will represent this as (*#!,*#!,*) (actually 9 rules in one) now there are a couple interesting ones: (*#!,(#),*) - a function may return an unboxed tuple and that function is lazy and boxed and the novel: ((#),*#!(#),!) - a function taking an unboxed tuple is a _strict_ value that is a function, meaning the function itself cannot be _|_. so, given the above rules, we can express any combination of lazy and strict interaction in a typesafe way. including the distinction between strict and lazy functional values. and on to the rules allowing types: (**,*,*) -- this says we can have functions from boxed types to values, as in forall a . a -> a we do not have a rule of the form (##,?,?) because an unboxed type cannot be passed to a function in a generic way, as values in it have a unique run-time representation. and we have (**,**,**) functions from types to types, such as the list constructor ([] :: * -> *) (**,##,##) - functions from types to unboxed types, often used as phantom arguments such as the declaration data Array__ a :: # making Array__ :: * -> # now, a great thing about this is that it is a valid PTS! so I get proofs of completeness, soundness, and a variety of robust typechecking algorithms for free! yay! in haskell source itself, you can use all of these kinds, the inference algorithm will only infer simple kinds, as in * or * -> * (ones made up of *'s and arrows) so you must use kind annotations to bring about other ones. some useful examples data StrictList a :: ! = Cons a (StrictList a) | Nil making StrictList :: * -> ! since a strictlist is of kind !, it can never contain _|_ in it, but the list argument is of kind *, so the list is strict in its spine, but may contain lazy values. data World__ :: # declares an unboxed type World__ with no values and hence no runtime representation. there is even a mild form of kind polymorphism allowed, with the folowing subkinding going on: ? / \ ?? (#) /\ *? # / \ * ! note that ?, ??, and *? are not kinds themselves, but can be used in haskell source files to represent that something can take more than one kind. for instance, IO has type data World__ :: # newtype IO (a::??) = IO (World__ -> (# World__, a #)) meaning, an IO action may return any sort of value, boxed, unboxed, strict or lazy (but not an unboxed tuple) any instantiation of IO will choose a concrete kind though. the declaration declares a simple sort of 'kind scheme'. none of ?, ??, or *? exist in core. they are only placeholders used in the kind inference algorithm presented to the programmer. an experimental feature is the 'undo' monad, whose syntax looks like the do monad, but is actually restricted to IO and function arguments and return values are automatically pulled into unboxed tuples and back again. It is entirely possible I mihht have an 'unboxed' mode, where '!' is the default infered kind for everything and unboxed tuples are automatically inserted at the appropriate points. it would make a very nice strict language with the full power of haskell type classes and the type system with the ability to call back and forth to haskell nicely! A front end for a strict language would just never produce anything of kind *. my original motivation for adding the ! kind was purely motivated by the back end, the grin back end is great at numerics, but is absolutely horrible at updates, doing update in place because I don't want to introduce the concept of redirections I would have to check for. Things of kind '!' never need indirections, as they are already evaluated, so they can use a more efficient representation, still boxed, but the code can just look directly into it or pass arguments to it without having to worry about it possibly being an unevaluated thunk. I am having some fun in 'undo', my strict toy language... so many distractions along the way, all I wanted was a nice way to avoid thunk indirections :) John -- John Meacham - ⑆repetae.net⑆john⑈

David House

21 Nov 21 Nov

3:11 p.m.

On 20/11/06, Simon Marlow wrote:

...

* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

Couldn't we have the following for this case? castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z Not general, but it'd work in this case. And no need for unsafeCoerce. -- -David House, dmhouse@gmail.com

Malcolm Wallace

5:35 p.m.

"David House" wrote:

...

...
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

Couldn't we have the following for this case?

castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z

castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z

No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs. Regards, Malcolm

Iavor Diatchki

6:44 p.m.

Hello, I just wanted to point out that even the 'reasonable' uses of 'unsafeCoerce' (i.e., where a type does not appear directly in a value) may not be compatible between different implementations. Consider, for example, an implementation which turns a polymorphic program into a monomorphic one by generating different versions of functions for each of its instantiations (this is not a new idea). In particular, one also has to specialize constructor functions, which amounts to specializing datatypes. For example, 'Either@Int@Char' and 'Either@Int@Bool' become completely separate types. In addition, we may choose to add only those constructors of a specialized type that actually appear in the program. For example, 'Either@Int@Char' may have both constructors 'Left@Int@Char' and 'Righ@Int@Char', while 'Either@Int@Bool' may only have a single constructor 'Right@Int@Bool'. Now, if we use an integer tag to distinguish constructors, then by using 'unsafeCoerce' we may end up casting 'Left@Int@Char' into 'Right@Int@Bool'. -Iavor On 11/21/06, Malcolm Wallace wrote:

...

"David House" wrote:

...
...
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

Couldn't we have the following for this case?

castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z

castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z

No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs.

Regards, Malcolm _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Simon Marlow

22 Nov 22 Nov

10:38 a.m.

Iavor Diatchki wrote:

...

Hello, I just wanted to point out that even the 'reasonable' uses of 'unsafeCoerce' (i.e., where a type does not appear directly in a value) may not be compatible between different implementations. Consider, for example, an implementation which turns a polymorphic program into a monomorphic one by generating different versions of functions for each of its instantiations (this is not a new idea). In particular, one also has to specialize constructor functions, which amounts to specializing datatypes. For example, 'Either@Int@Char' and 'Either@Int@Bool' become completely separate types. In addition, we may choose to add only those constructors of a specialized type that actually appear in the program. For example, 'Either@Int@Char' may have both constructors 'Left@Int@Char' and 'Righ@Int@Char', while 'Either@Int@Bool' may only have a single constructor 'Right@Int@Bool'. Now, if we use an integer tag to distinguish constructors, then by using 'unsafeCoerce' we may end up casting 'Left@Int@Char' into 'Right@Int@Bool'.

Quite right. I didn't mean to suggest that this property of unsafeCoerce should be a defined part of its behaviour across implementations, just that it currently works in GHC. Cheers, Simon

Aaron Denney

21 Nov 21 Nov

8:05 p.m.

On 2006-11-21, Malcolm Wallace wrote:

...

"David House" wrote:

...
...
* cast that changes a phantom type, or changes a type that is not reflected by a part of the value, eg. 'unsafeCoerce (Left 3) :: Either Int a' should be fine for any 'a',

Couldn't we have the following for this case?

castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL z = z

castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR z = z

No actually. Try it! Any compiler will reject the catch-all cases ('z') because you are asking for the usage of z on the rhs to be at a different type than the pattern usage of z on the lhs.

How about:

...

castEitherL :: Either a b -> Either a c castEitherL (Left x) = Left x castEitherL _ = error "castEitherL on Right"

castEitherR :: Either a b -> Either c b castEitherR (Right x) = Right x castEitherR _ = error "castEitherR on Left"

-- Aaron Denney -><-

Neil Mitchell

20 Nov 20 Nov

1:31 p.m.

...

I think that it is a really bad idea to make 'unsafeCoerce' a part of the standard libraries.

Currently to use unsafeCoerce you have to do two evil things (assume about the type system, use CPP). Since people already do use unsafeCoerce, we can at least half the evilness of these people ;) Thanks Neil

kahl＠cas.mcmaster.ca

10 Nov 10 Nov

2:38 p.m.

Malcolm Wallace wrote:

...

Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package. Suggested location: Data.Function.Unsafe?

http://hackage.haskell.org/trac/ghc/ticket/994

I'd much prefer Data.Unsafe. (It is a function, but has nothing to do with functions as data type.) Wolfram

Ashley Yakeley

10:35 p.m.

kahl@cas.mcmaster.ca wrote:

...

I'd much prefer Data.Unsafe.

I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies. Otherwise unsafeCoerce should go with all the other unsafe functions in System.IO.Unsafe. -- Ashley Yakeley

Samuel Bronson

10:43 p.m.

On 11/10/06, Ashley Yakeley wrote:

...

kahl@cas.mcmaster.ca wrote:

...
I'd much prefer Data.Unsafe.

I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.

Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!

Robert Dockins

11 Nov 11 Nov

1:06 a.m.

On Friday 10 November 2006 17:43, Samuel Bronson wrote:

...

On 11/10/06, Ashley Yakeley wrote:

...
kahl@cas.mcmaster.ca wrote:

...
I'd much prefer Data.Unsafe.

I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.

Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!

It would be nice to be able to disallow all unsafe code by managing module imports. Suppose I want to run untrusted code. If I can verify that it doesn't use FFI, that it uses no unsafe primitives, and that it typechecks, then I know it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good. So, I guess count this as another vote for Unsafe.* -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG

dons＠cse.unsw.edu.au

2:47 a.m.

robdockins:

...

On Friday 10 November 2006 17:43, Samuel Bronson wrote:

...
On 11/10/06, Ashley Yakeley wrote:

...
kahl@cas.mcmaster.ca wrote:

...
I'd much prefer Data.Unsafe.

I'd like to see all unsafe function in an Unsafe.* hierarchy (possibly consisting only of the single module Unsafe). Thus one could avoid unsafety altogether by avoiding the Unsafe and Foreign hierarchies.

Couldn't you just avoid using functions having "unsafe" as a name prefix? Its not like they have names like "inocuousPerformIO" or anything!

It would be nice to be able to disallow all unsafe code by managing module imports.

yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on). Currently unsafe things are scattered around System.*, Data.Array.* Control.*. This isn't ideal.

...

Suppose I want to run untrusted code. If I can verify that it doesn't use FFI, that it uses no unsafe primitives, and that it typechecks, then I know it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good.

So, I guess count this as another vote for Unsafe.*

-- Don

Bulat Ziganshin

8:50 a.m.

New subject: Re[2]: proposal: add 'unsafeCoerce'

Hello Donald, Saturday, November 11, 2006, 5:47:36 AM, you wrote:

...

yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on).

...

Currently unsafe things are scattered around System.*, Data.Array.* Control.*.

isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM? -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Neil Mitchell

9:54 a.m.

New subject: Re[2]: proposal: add 'unsafeCoerce'

...

isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?

It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects. Thanks Neil

Bulat Ziganshin

12 Nov 12 Nov

11:06 p.m.

New subject: Re[4]: proposal: add 'unsafeCoerce'

Hello Neil, Saturday, November 11, 2006, 12:54:20 PM, you wrote:

...

It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects.

"look but don't touch" ? :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Cale Gibbard

21 Nov 21 Nov

10:21 a.m.

New subject: Re[2]: proposal: add 'unsafeCoerce'

I'd actually like to have an -unsafe flag which one would have to use in order to compile a module using functions/values marked as unsafe. Unsafety would be inherited by default, and using a compiler pragma would avoid this inheritance (but the module would still need to be compiled with -unsafe, just not its children.) I tend to think of the unsafe functions as hooks into the compiler/runtime to be used for modifying the language, not really as part of the language itself. Thanks, - Cale On 11/11/06, Neil Mitchell wrote:

...

Hi

...
isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?

It could do with relative ease - just add a -safe flag and disallow IO in the interpretter. You could even keep the IO monad just not perform the actual IO effects.

Thanks

Neil _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Robert Dockins

11 Nov 11 Nov

1:47 p.m.

On Saturday 11 November 2006 03:50, Bulat Ziganshin wrote:

...

Hello Donald,

Saturday, November 11, 2006, 5:47:36 AM, you wrote:

...
yes! this was *critical* in lambdabot, for allowing random users to run pure h98 expressions. A lot of time went in to working out the trusted module import base (so not stToIo, , unsafe* and so on).

Currently unsafe things are scattered around System.*, Data.Array.* Control.*.

isn't it _much_ better to use Virtual Machine to disallow bad code? may be yhc may provide such VM?

Well, since you bring it up, I'm currently working on this topic. Yes, you can examine a program bytecode file and disallow FFI and primitive actions _in that module_ pretty easily. However, if your untrusted module imports Foreign.IO.unsafePerformIO, or Some.PathTo.unsafeCoerce, all your hard work goes out the window. The idea is to segregate all the unsafe code in one place do its easy to discriminate against it at the module level rather than at the individual function level. -- Rob Dockins Talk softly and drive a Sherman tank. Laugh hard, it's a long way to the bank. -- TMBG

Bulat Ziganshin

8:48 a.m.

New subject: Re[2]: proposal: add 'unsafeCoerce'

Hello Robert, Saturday, November 11, 2006, 4:06:20 AM, you wrote:

...

it is _unconditionally_ typesafe. If I can disallow unsafe primitives by just limiting the Unsafe.* and Foreign.* modules, that's a big win. If I instead have to keep a list of unsafe functions, that's not so good.

...

So, I guess count this as another vote for Unsafe.*

don't forget about Data.Array.Base and Data.ByteString.Base modules :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Malcolm Wallace

2 Apr 2 Apr

2:36 p.m.

I wrote (back in November):

...

Currently, all implementations provide either the 'unsafeCoerce' or 'unsafeCoerce#' functions, but they live in implementation-specific locations. This proposal is to adopt the Haskell'98-compatible name 'unsafeCoerce', and to add it to the standard base library package.

http://hackage.haskell.org/trac/ghc/ticket/994

Discussion finished, with a general feeling that this would be a good thing. There were some objections from Iavor (and others) that the behaviour of 'unsafeCoerce' is entirely implementation-dependent. Others replied, yes, indeed, but it _is_ marked "unsafe". The proposal merely makes the same name available in all compilers (to avoid the need for CPP), but gives no guarantees about its effects. You must satisfy yourself about whether any given usage is safe or not, with any given compiler. So I have just pushed a patch for this. The final location chosen in the module hierarchy was Unsafe.Coerce. Regards, Malcolm

6676

Age (days ago)

6819

Last active (days ago)

List overview

Download

45 comments

17 participants

participants (17)

Aaron Denney
Ashley Yakeley
Bulat Ziganshin
Cale Gibbard
David House
dons＠cse.unsw.edu.au
Iavor Diatchki
John Meacham
kahl＠cas.mcmaster.ca
Malcolm Wallace
Neil Mitchell
Robert Dockins
roconnor＠theorem.ca
Ross Paterson
Samuel Bronson
Simon Marlow
Taral