Are newtypes optimised and how much?

So I have the following nice things: {-# LANGUAGE GeneralizedNewtypeDeriving, OverloadedStrings #-} import Data.String newtype Foo = Foo { unFoo :: String } deriving (IsString) x :: Foo x = "Hello, World!" newtype Bar = Bar { unBar :: Integer } deriving (Eq,Show,Num,Integral,Real,Enum,Ord) y :: Bar y = 2 I can write literals and they will be converted to the type I wanted with no extra verbiage needed. Questions (I'm talking about GHC when I refer to compilation): (1) Are fromString and fromIntegral ran at compile time? I don't think that this is the case. I think they are just translated to fromString "Hello, World!" and fromIntegral 2 verbatim. (2) Regardless of this, the implementation of fromString and fromIntegral is essentially a no-op, it's just fromString = Foo, fromIntegral = Bar, which is in turn essentially fromString = id, fromIntegral = id, as far as I understand it. It's purely compile time. But supposing I write: fromIntegral (fromIntegral (2::Integer) :: Bar) :: Integer Is this at the end of the day equal to just (2::Integer)? Thinking simple-mindedly, I would say, yes. The compiler knows that fromIntegral :: Integer -> Bar == id, and that fromIntegral :: Bar -> Integer == id (right?). But is that the case? Perhaps the type class methods have some dictionary and thus cannot be inlined, or maybe that doesn't matter? At the end of the day what motivated me to ask these questions it that I like very much defining newtypes for most of the types I use, I have completely forgotten about `type' aliasing. I'm completely happy to write Foo and unFoo all over the place to aid my type correctness, but I want a nice generic way to convert to/from newtypes but keeping it a compile-time concept. Sometimes I have unThisThat, unTheOther, unThoseWhoShantBeNamed, etc. and it I could just use fromIntegral and fromString then that would be super. Also, is 'map unFoo' optimised away at compile-time, too? I think that it would be compiled to map id. So it would still wrap a thunk around each cons. How far does it go? So, if I go around using fromIntegral/fromString (etc. for other newtype types), is it still kept compile time? After having newtypes catch dozens of type mismatches that otherwise wouldn't unified happily but were completely wrong (e.g. wrong argument order), I've found newtype to be an indispensable part of Haskell and of writing a large piece of software. Cheers

On 10/19/10 2:12 PM, Christopher Done wrote:
Questions (I'm talking about GHC when I refer to compilation):
(1) Are fromString and fromIntegral ran at compile time? I don't think that this is the case. I think they are just translated to fromString "Hello, World!" and fromIntegral 2 verbatim. (2) Regardless of this, the implementation of fromString and fromIntegral is essentially a no-op, it's just fromString = Foo, fromIntegral = Bar, which is in turn essentially fromString = id, fromIntegral = id, as far as I understand it.
Foo and unFoo are /essentially/ id, but they're not actually id. In particular, they stick around as System Fc coersions in the core language, whereas id can be compiled away entirely. Unfortunately this means that rewrite rules involving id won't fire, which is why I often add things like: {-# RULES "map Foo" map Foo = unsafeCoerce "fmap Foo" fmap Foo = unsafeCoerce "liftA Foo" liftA Foo = unsafeCoerce "liftM Foo" liftM Foo = unsafeCoerce "map unFoo" map unFoo = unsafeCoerce "fmap unFoo" fmap unFoo = unsafeCoerce "liftA unFoo" liftA unFoo = unsafeCoerce "liftM unFoo" liftM unFoo = unsafeCoerce #-} if I want to ensure them. Unfortunately, last I heard the use of unsafeCoerce can interfere with other rewrite rules, too, since it's also /essentially/ but not exactly id. I'd love to get an up-to-date story on how exactly newtypes and things like fromString, fromInteger, fromEnum, and fromRational are handled re how they get optimized in GHC 7. -- Live well, ~wren

| At the end of the day what motivated me to ask these questions it that | I like very much defining newtypes for most of the types I use, I have | completely forgotten about `type' aliasing. I'm completely happy to | write Foo and unFoo all over the place to aid my type correctness, but | I want a nice generic way to convert to/from newtypes but keeping it a | compile-time concept. Sometimes I have unThisThat, unTheOther, | unThoseWhoShantBeNamed, etc. and it I could just use fromIntegral and | fromString then that would be super. Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the jargon of GHC's intermediate language, Foo and unFoo translate to *type-safe casts*, which generate no executable code. That includes the 'newtype deriving' stuff too, and hence your uses of fromInteger etc. However, sadly: | Also, is 'map unFoo' optimised away at compile-time, too? I think that | it would be compiled to map id. So it would still wrap a thunk around | each cons. How far does it go? No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree). What you really want is to say is something like this. Suppose my_tree :: Tree String. Then you'd like to say my_tree ::: Tree Foo meaning "please find a way to convert m_tree to type (Tree Foo), using newtype coercions. The exact syntax is a problem (as usual). We have the technology now. The question is how important it is. Simon | -----Original Message----- | From: haskell-cafe-bounces@haskell.org [mailto:haskell-cafe-bounces@haskell.org] | On Behalf Of Christopher Done | Sent: 19 October 2010 19:12 | To: Haskell Cafe | Subject: [Haskell-cafe] Are newtypes optimised and how much? | | So I have the following nice things: | | {-# LANGUAGE GeneralizedNewtypeDeriving, OverloadedStrings #-} | | import Data.String | | newtype Foo = Foo { unFoo :: String } deriving (IsString) | | x :: Foo | x = "Hello, World!" | | newtype Bar = Bar { unBar :: Integer } deriving | (Eq,Show,Num,Integral,Real,Enum,Ord) | | y :: Bar | y = 2 | | I can write literals and they will be converted to the type I wanted | with no extra verbiage needed. | | Questions (I'm talking about GHC when I refer to compilation): | | (1) Are fromString and fromIntegral ran at compile time? I don't think | that this is the case. I think they are just translated to fromString | "Hello, World!" and fromIntegral 2 verbatim. | (2) Regardless of this, the implementation of fromString and | fromIntegral is essentially a no-op, it's just fromString = Foo, | fromIntegral = Bar, which is in turn essentially fromString = id, | fromIntegral = id, as far as I understand it. It's purely compile | time. But supposing I write: | | fromIntegral (fromIntegral (2::Integer) :: Bar) :: Integer | | Is this at the end of the day equal to just (2::Integer)? Thinking | simple-mindedly, I would say, yes. The compiler knows that | fromIntegral :: Integer -> Bar == id, and that fromIntegral :: Bar -> | Integer == id (right?). But is that the case? Perhaps the type class | methods have some dictionary and thus cannot be inlined, or maybe that | doesn't matter? | | At the end of the day what motivated me to ask these questions it that | I like very much defining newtypes for most of the types I use, I have | completely forgotten about `type' aliasing. I'm completely happy to | write Foo and unFoo all over the place to aid my type correctness, but | I want a nice generic way to convert to/from newtypes but keeping it a | compile-time concept. Sometimes I have unThisThat, unTheOther, | unThoseWhoShantBeNamed, etc. and it I could just use fromIntegral and | fromString then that would be super. | | Also, is 'map unFoo' optimised away at compile-time, too? I think that | it would be compiled to map id. So it would still wrap a thunk around | each cons. How far does it go? | | So, if I go around using fromIntegral/fromString (etc. for other | newtype types), is it still kept compile time? After having newtypes | catch dozens of type mismatches that otherwise wouldn't unified | happily but were completely wrong (e.g. wrong argument order), I've | found newtype to be an indispensable part of Haskell and of writing a | large piece of software. | | Cheers | _______________________________________________ | Haskell-Cafe mailing list | Haskell-Cafe@haskell.org | http://www.haskell.org/mailman/listinfo/haskell-cafe

On 10/20/10 4:09 AM, Simon Peyton-Jones wrote:
No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree).
How about a special case for fmap? That seems like it should handle a lot of cases. Cheers, Greg

On 20 October 2010 13:09, Simon Peyton-Jones
Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the jargon of GHC's intermediate language, Foo and unFoo translate to *type-safe casts*, which generate no executable code.
That includes the 'newtype deriving' stuff too, and hence your uses of fromInteger etc.
Oh wow, excellent!
No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree).
What you really want is to say is something like this. Suppose my_tree :: Tree String. Then you'd like to say my_tree ::: Tree Foo meaning "please find a way to convert m_tree to type (Tree Foo), using newtype coercions.
The exact syntax is a problem (as usual). We have the technology now. The question is how important it is.
I don't know whether it's so important for me at least. I was just
interested in how much optimisation it did.
On 20 October 2010 17:58, Gregory Crosswhite
On 10/20/10 4:09 AM, Simon Peyton-Jones wrote:
No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree).
How about a special case for fmap? That seems like it should handle a lot of cases.
Personally I haven't had much use for mapping newtype constructors/unconstructors. My personal use case of newtypes unwrapping/wrapping is in passing them to functions and in record fields, not so much unwarpping them inside data types and trees across the board. -- | Assign a review for a submission to a user. assignReview :: TrackId -> UserId -> SubmissionId -> Model () assignReview tid uid sid = do insert T.review $ F.uid <<- unUserId uid # F.submission <<- unSubmissionId sid # F.trackId <<- unTrackId tid This is the simplest function I could find. However, consider if in my busy hacking I accidentally get the argument order incorrect in the definition, or the call, with newtypes I can't mismatch them. This has stopped me doing bad things a few times in my haste. Another nice thing I've found is combining them with view patterns: -- | Submit a review. submitReview :: UserId -> SubmissionId -> [ReviewField] -> Model () submitReview (unUserId -> uid) (unSubmissionId -> sid) rs = do forM_ rs $ \ReviewField{..} -> ... The next ten lines or so use uid and sid. This makes the submitReview function kind of "guarded" from being given the wrong values. Inside it can do what it wants. You might point out that now that I've unwrapped them I can pass them to some other function willy nilly, but of course if a function needs a submission id, then it needs a SubmissionId. So I find this really nice. And of course I use the deriving extensively: newtype SubmissionId = SubmissionId { unSubmissionId :: Int } deriving (Show,Num,Eq,Ord,Enum,Integral,Real,ShowConstant,JSON) I have a types file with 46 newtypes, zero type aliases, and ah, 47 data types. GHCi gets kind of slow when dealing with it. My solution is to have an Emacs shortcut to build the project with cabal every so often so that GHCi can load the .o files in an instant.

On Oct 20, 2010, at 11:58 AM, Gregory Crosswhite
On 10/20/10 4:09 AM, Simon Peyton-Jones wrote:
No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree).
How about a special case for fmap? That seems like it should handle a lot of cases.
Or even better, a special handling of 'id' in rules pragmas that would cause any rule matching id to also match any newtype constructor or projection. Or have the compiler automatically add rules that map all newtype wrappers and unwrappers to unsafeCoerce and make sure that unsafeCoerce has rules for map, fmap, (.), etc. --James

Do we really want to treat every newtype wrappers as a form of 'id'?
For example:
newtype Nat = Nat Integer -- must always be positive
A possible rule (doesn't actually typecheck, but you get the idea):
forall (x :: Nat). sqrt (x * x) = x
If we ignore newtyping we get an incorrect rewrite rule. It depends
on the exact implementation of which 'id's would be recognised.
On 20 October 2010 21:08, James Andrew Cook
On Oct 20, 2010, at 11:58 AM, Gregory Crosswhite
wrote: On 10/20/10 4:09 AM, Simon Peyton-Jones wrote:
No, this isn't optimised. The trouble is that you write (map Foo xs), but GHC doesn't know about 'map'. We could add a special case for map, but then you'd soon want (mapTree Foo my_tree).
How about a special case for fmap? That seems like it should handle a lot of cases.
Or even better, a special handling of 'id' in rules pragmas that would cause any rule matching id to also match any newtype constructor or projection. Or have the compiler automatically add rules that map all newtype wrappers and unwrappers to unsafeCoerce and make sure that unsafeCoerce has rules for map, fmap, (.), etc.
--James_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Push the envelope. Watch it bend.

On Oct 20, 2010, at 5:06 PM, Thomas Schilling
Do we really want to treat every newtype wrappers as a form of 'id'? For example:
newtype Nat = Nat Integer -- must always be positive
A possible rule (doesn't actually typecheck, but you get the idea):
forall (x :: Nat). sqrt (x * x) = x
If we ignore newtyping we get an incorrect rewrite rule. It depends on the exact implementation of which 'id's would be recognised.
That wouldn't be generalized to id, the special treatment would only apply to rule that _mention_ Prelude.id explicitly. Such rules would implicitly fire when, say, "fmap Nat xs" occurs because Nat would be considered a specialization of id in the (pattern side of) rule "fmap id = id". Rules mentioning Nat would not be magical in any way. -- James

On 10/20/10 7:09 AM, Simon Peyton-Jones wrote:
Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the jargon of GHC's intermediate language, Foo and unFoo translate to *type-safe casts*, which generate no executable code.
When does the conversion to type-safe casts occur relative to other optimizations (namely, rewrite rules)? -- Live well, ~wren

On 10/20/10 7:07 PM, wren ng thornton wrote:
On 10/20/10 7:09 AM, Simon Peyton-Jones wrote:
Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the jargon of GHC's intermediate language, Foo and unFoo translate to *type-safe casts*, which generate no executable code.
When does the conversion to type-safe casts occur relative to other optimizations (namely, rewrite rules)?
That is, I know that rewrite rules operate on the source language not on Core, but to what extent does that mean that type-safe casts inhibit the firing of rules? -- Live well, ~wren

On Thursday 21 October 2010 01:11:25, wren ng thornton wrote:
On 10/20/10 7:07 PM, wren ng thornton wrote:
On 10/20/10 7:09 AM, Simon Peyton-Jones wrote:
Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the jargon of GHC's intermediate language, Foo and unFoo translate to *type-safe casts*, which generate no executable code.
When does the conversion to type-safe casts occur relative to other optimizations (namely, rewrite rules)?
That is, I know that rewrite rules operate on the source language not on Core, but to what extent does that mean that type-safe casts inhibit the firing of rules?
Pretty much completely, I'm afraid. There's a rewrite rule for realToFrac :: Double -> Float. Build yourself a benchmark for that transformation that runs some measurable time (say 0.1 seconds or so). Now use a newtype wrapper around Float instead (and don't add a rewrite rule for it). Ouch.

| >> Yes, you can freely use Foo/unFoo. There's no runtime penalty. (In the | >> jargon of GHC's intermediate language, Foo and unFoo translate to | >> *type-safe casts*, which generate no executable code. | > | > When does the conversion to type-safe casts occur relative to other | > optimizations (namely, rewrite rules)? | | That is, I know that rewrite rules operate on the source language not on | Core, but to what extent does that mean that type-safe casts inhibit the | firing of rules? Rewrite rules work on Core, not source, so they "see" the type-safe casts. I don't know how to say "to what extent" they inhibit rules. If you show an example I will try to help. S

Simon Peyton-Jones wrote:
What you really want is to say is something like this. Suppose my_tree :: Tree String. Then you'd like to say my_tree ::: Tree Foo meaning "please find a way to convert m_tree to type (Tree Foo), using newtype coercions.
The exact syntax is a problem (as usual). We have the technology now. The question is how important it is.
I think extending the syntax for contexts would be sufficient: Write a ~~ b for "a can be converted to b by wrapping / unwrapping newtypes", which is a conservative approximation of "a and b have the same representation". Then we can define safeCoerce :: (a ~~ b) => a -> b safeCoerce = unsafeCoerce and your example would become safeCoerce my_tree :: Tree Foo The feature would add convenience to the language when working with newtypes, and reduce the tension between type safety and performance (where the choice is between using a newtype and unsafeCoerce, and just working with the plain underlying type.) So while the pressure is quite low, I imagine it would become quite a useful feature once we'd have it, but that's of course speculation. As far as I can see, the feature is nontrivial: Care has to be taken to not break abstractions (like "safely"coercing IO to ST), so it's quite possible that the engineering effort outweighs the potential benefits. Best regards, Bertram

| > The exact syntax is a problem (as usual). We have the technology now. The | question is how important it is. | | I think extending the syntax for contexts would be sufficient: | Write a ~~ b for "a can be converted to b by wrapping / unwrapping | newtypes", which is a conservative approximation of "a and b have the | same representation". | | Then we can define | | safeCoerce :: (a ~~ b) => a -> b | safeCoerce = unsafeCoerce Yes, that's right. When I said "we have the technology" I meant that we (will) have something similar to ~~. See our paper "Generative Type Abstraction and Type-level Computation" http://www.cis.upenn.edu/~sweirich/newtypes.pdf. No unsafeCoerce required. Simon

| Then we can define | | safeCoerce :: (a ~~ b) => a -> b | safeCoerce = unsafeCoerce
Yes, that's right. When I said "we have the technology" I meant that we (will) have something similar to ~~. See our paper "Generative Type Abstraction and Type-level Computation" http://www.cis.upenn.edu/~sweirich/newtypes.pdf. No unsafeCoerce required.
The idea was to put safeCoerce into a library. The syntax extension would be light-weight because contexts, unlike expressions, still have plenty of room for extensions. The idea is is based on the assumption that to the compiler, 'unsafeCoerce' looks like an artificial safe coercion, so that after inlining safeCoerce, we get exactly the effect of a safe coercion during type checking and further compilation. Perhaps that assumption is wrong. I'll look at the paper. Regards, Bertram
participants (8)
-
Bertram Felgenhauer
-
Christopher Done
-
Daniel Fischer
-
Gregory Crosswhite
-
James Andrew Cook
-
Simon Peyton-Jones
-
Thomas Schilling
-
wren ng thornton