
In http://www.cs.columbia.edu/~cdo/icfp99.ps Okasaki does a couple things I'm not sure I quite understand: He has a datatype newtype Empty a = E () I cannot for the life of me figure out why he didn't just do data Empty a = E The other weird thing was: newtype Pair v w a = P (v a, w a) I can't see why he didn't do data Pair v w a = P (v a) (w a) I was wondering if anyone on this list knew of any reasons the types he chose would be more efficient than mine.... David

Feuer
newtype Empty a = E ()
vs.
data Empty a = E
and
newtype Pair v w a = P (v a, w a)
vs.
data Pair v w a = P (v a) (w a)
I was wondering if anyone on this list knew of any reasons the types he chose would be more efficient than mine....
When recently reading the GHC docs, the chapter about profiling, I think, I came across something about this. From Chapter 6: | Newtypes are better than datatypes: | | If your datatype has a single constructor with a single field, | use a newtype declaration instead of a data declaration. The newtype | will be optimised away in most cases -kzm -- If I haven't seen further, it is by standing in the footprints of giants

I know what you mean. However, if you look at it, data Empty1 a = E1 is a datatype with one constructor, and that constructor takes no arguments. So this is in fact a "phantom" unit type. newtype Empty2 a = E2 () is in fact better than data EmptyBad a = EBad () because the constructor E will be optimised away, and the type will be share the representation of the unit type (). However, I don't see why Empty1 and Empty2 would differ in any way at all... In any reasonable implementation, they would have the same internal representation, right? Similarly, data Pair1 v w a = P1 (v a) (w a) is just a tuple type. It should be exactly the same as newtype Pair2 v w a = P2 (v a, w a) If someone can explain why they would be different, I'm all ears. By the way, I don't understand why Haskell98 provides both strictness flags and newtype declarations.... it seems to me that newtype M [a1 a2 ....] = MC (...) should be exactly the same as data N [b1 b2 ....] = NC !(...) If I'm not mistaken, any compiler too dumb to notice a datatype with only one constructor strict in its one argument is too dumb to use. Ketil's local user wrote:
Feuer
asks about: newtype Empty a = E ()
vs.
data Empty a = E
and
newtype Pair v w a = P (v a, w a)
vs.
data Pair v w a = P (v a) (w a)
I was wondering if anyone on this list knew of any reasons the types he chose would be more efficient than mine....
When recently reading the GHC docs, the chapter about profiling, I think, I came across something about this. From Chapter 6:
| Newtypes are better than datatypes: | | If your datatype has a single constructor with a single field, | use a newtype declaration instead of a data declaration. The newtype | will be optimised away in most cases
-kzm -- If I haven't seen further, it is by standing in the footprints of giants

| Newtypes are better than datatypes: | | If your datatype has a single constructor with a single field, | use a newtype declaration instead of a data declaration. The newtype | will be optimised away in most cases
This feature is one I consider among the worst in Haskell. If a datatype with just a single constructor can be optimized away, why burden the programmer with such a task. It's easy for the compiler to filter out the data declarations having only one constructor and optimize them in the newtype way. I can't see any reasons not to do such data decl. optimization, but maybe I'm missing some point. Can anyone tell why this is up to the programmer, rather than the compiler?

No-one appears to have responded to this with a definitive answer... On Tue, 15 Jan 2002, Feuer wrote:
I know what you mean. However, if you look at it, data Empty1 a = E1 is a datatype with one constructor, and that constructor takes no arguments. So this is in fact a "phantom" unit type. newtype Empty2 a = E2 () is in fact better than data EmptyBad a = EBad () because the constructor E will be optimised away, and the type will be share the representation of the unit type (). However, I don't see why Empty1 and Empty2 would differ in any way at all... In any reasonable implementation, they would have the same internal representation, right?
Might this not be simply that an implementation _could_ compile them down to the same thing, but that given newtype exists in the language anyway (as I understand it primarily so that you can have types which the typechecker can tell aren't the same and yet leverage off an existing type which provides all the operations you need without suffering a run-time penalty) it's not worth the extra complication of adding special cases to the code for standard types (declared using data) so they are more efficient in this case. It's a point often made by Simon P-J that there aren't that many people actually working on Haskell interpreters/compilers around the world and work does get prioritised by the reward/difficulty ratio.
By the way, I don't understand why Haskell98 provides both strictness flags and newtype declarations.... it seems to me that newtype M [a1 a2 ....] = MC (...) should be exactly the same as data N [b1 b2 ....] = NC !(...) If I'm not mistaken, any compiler too dumb to notice a datatype with only one constructor strict in its one argument is too dumb to use.
Dunno (but I'm a Haskell diletantte), and I can't remember which order they were added to the language in. (I know newtype wasn't in the Haskell 1.0 report, I'm 99% sure strictness annotation wasn't either.) ___cheers,_dave_________________________________________________________ www.cs.bris.ac.uk/~tweed/|`...heat generated by its microprocessors will email:tweed@cs.bris.ac.uk|slope upward exponentially, reaching the power work tel:(0117) 954-5250 |density of a nuclear reactor before 2010'-Intel
participants (4)
-
D. Tweed
-
Feuer
-
Ketil's local user
-
Rijk-Jan van Haaften