newtype/datatype (was efficiency)

hello, it looks like a lot of Haskell experts are at POPL (prog. language conference) and are not checking their email... so here is my attempt at explaining the differences between data and newtype. data T -- has 1 vale: _|_ data T = T -- has 2 values: _|_, T data T = T () -- has 3 values: _|_, T _|_, T () newtype T = T () -- has 2 values: _|_, T () basically the "data" declaration in Haskell "lifts" a type, by adding a new _|_ element. the reason for this is that in general a data may have different shapes (hence the constructors) and while trying to make a value of the type one might non-terminate before they computed what constructor should be applied. the annoying thing is that this happens even for datatypes which dont have different shapes, i.e. data declarations with only one different shape. Even if there is only one possible shape we always remember the shape (constructor) in the representation and when pattern matching check to see if the value is of this shape. i dont know why. to (kind of) avoid this, if you have a datatype with only one shape and one field you should use a newtype. unfortunatelly if you have multiple fields you are stuck with the additional _|_. i dont think there should be any difference in efficiency between: data P a b = P a b and newtype P a b = P (a,b) unless there are some hard-coded tuple specific optimizations done by the current implementations. i cannot see any reasons why these optimizations shouldnt happen for the data declarartion as well (i am not experienced compiler writer however...) which of the above is better is a matter of personal preference - i prefer the 1st one as i think it leads to more readable patterns. finaly there was the question about strictness annotations - they seem to be quite orthogonal to the whole newtype/data issue and enable you to specify that some constructors of your datatype are strict - i.e. when you construct something with a strict constructor it will first be evalued (until it becomes a function or an application of a lazy constructor to some arguments) and only then will the disired value be created. since this evaluation might not terminate of course, some "different" values in the lazy version of the datatype all collapse to teh same thing. so in some sense newtype makes things more lazy, while ! makes them more eager. sorry for the long email hope it helped iavor -- ---------------------------------------+---------------------------------------- Iavor S. Diatchki | email: diatchki@cse.ogi.edu Dept. of Computer Science | web: http://www.cse.ogi.edu/~diatchki OGI School of Science and Engineering | work phone: 5037481631 OHSU | home phone: 5036434085 ---------------------------------------+----------------------------------------

A while ago, to help myself understand newtype, data, and strictness, I tried to write down how Haskell types correspond to lifted domains. Here is a cleaned-up version of my attempt. I am not sure that what follows is entirely correct -- please point out any errors. I would also appreciate comments or pointers to where this stuff is better described. In particular, I have not considered recursive types. Every type "t" in Haskell can be denotationally modeled by a complete partial order (CPO), which we will notate as "[t]". In fact, every type in standard Haskell corresponds to a -pointed- CPO, meaning a CPO with a least element "bottom". GHC, in addition, supports -unboxed- types, which correspond to CPOs in general. Given a CPO "D", we can -lift- it to a pointed CPO "lift D" by adding a bottom. Given two (not necessarily pointed) CPOs "D" and "E", we can take their -cartesian product- "D x E", which is like the set-theoretic product. Given two pointed CPOs "D" and "E", we can take their -smash product- "D * E", which is like the set-theoretic/cartesian product, but which identifies all pairs of the form "(x,bottom)" or "(bottom,y)" into one single new bottom element. The identity of "*" is the two-element pointed CPO "1", which consists of a bottom and a top. Given two pointed CPOs "D" and "E", we can take their -coalesced sum- "D + E", which is like the set-theoretic sum, but which identifies the bottom from "D" and the bottom from "E" into one single new bottom element. The identity of "+" is the one-element pointed CPO "0", which consists only of a bottom. Given any two (not necessarily pointed) CPOs, we have the property lift D * lift E = lift (D x E). In Haskell, if you write newtype t' = T t -- , then "[t']" is exactly isomorphic to "[t]". Pattern matching on the "T" is a no-op (i.e., always succeeds). Now turn to a data declaration data t' = T1 ... | T2 ... | ... -- . Each branch in the declaration specifies a pointed CPO by naming zero or more data types, each of which can be strict or non-strict. We lift the non-strict ones, then take the smash product of everything. The result is the pointed CPO denoted by the branch. The pointed CPO "[t']" denoted by the combined data type "t'" is the coalesced sum of the pointed CPOs denoted by all the branches. For example, if we write data t' = T1 !t11 !t12 | T2 t2 | T3 -- , then [t'] = [t11] * [t12] + lift [t2] + 1 -- . Pattern-matching on any data-declared type such as "t'" above is always strict, in the sense that if the value being matched against is bottom, then the whole expression denotes bottom. For example, given how we defined "t'" above, the expression case x' of T1 _ _ -> y T2 _ -> y T3 -> y is entirely equivalent to x' `seq` y -- . The only reason to have `seq` is because primitive types such as Int cannot be pattern-matched against. (Hence people call "seq" "polymorphic case".) Unboxed types are non-pointed CPOs, so it doesn't make sense to say data t' = T !t if "t" is an unboxed type, but it does make sense to say data t' = T t because "[t]" can be lifted to a pointed CPO. You ought to be able to say newtype t' = T t to define a new unboxed type "t'" from an existing unboxed type "t", but I faintly remember this being not implemented or allowed or something. Tuple types are special notation for what could be otherwise defined using "data" declarations as follows: data () = () data (,) t1 = (,) t1 data (,,) t1 t2 = (,,) t1 t2 data (,,,) t1 t2 t3 = (,,,) t1 t2 t3 -- ... So we have [( )] = 1 [(t1 )] = lift [t1] [(t1,t2 )] = lift [t1] * lift [t2] = lift ([t1] x [t2]) [(t1,t2,t3)] = lift [t1] * lift [t2] * lift [t3] = lift ([t1] x [t2] x [t3]) ... If "[t1]" has 3 elements including its bottom, and "[t2]" has 5 elements including its bottom, then "[(t1,t2)]" has 3 * 5 + 1 = 16 elements including its bottom. Given the definitions above, let us consider some corner and not so corner declarations: (When I write that foo denotes bar, I mean that foo denotes something isomorphic to bar.) data T1 -- denotes "0", the identity for "+" -- (but this is not standard Haskell) data T2 = T2 -- denotes "1", the identity for "*" data T3 = T3 () -- denotes "lift 1", because "()" denotes "1" data T4 = T4 !() -- denotes "1", because "()" denotes "1" newtype T5 = T5 () -- denotes "1", because "()" denotes "1" data T6 = T6 !T1 Int -- denotes "0 * lift [Int]", which is just "0" Note that although both T4 and T5 denote "1", pattern-matching on them is different: case T4 undefined of T4 _ -> y === undefined case T5 undefined of T5 _ -> y === y Whew. -- Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig and view Ken's picture at http://www.medianstrip.net/~julia/pix/ken.jpg "Dont waste your fire men! They couldn't hit an elephant at this dist--" -Last words of Union General John Sedgwick at the Battle of Spotsylvania 1864

On Wed, Jan 16, 2002 at 11:25:43PM -0500, Ken Shan wrote:
Every type "t" in Haskell can be denotationally modeled by a complete partial order (CPO), which we will notate as "[t]". In fact, every type in standard Haskell corresponds to a -pointed- CPO, meaning a CPO with a least element "bottom".
We can avoid smash products and coalesced sums by analysing [t] as being the lifting of a not-necessarily-pointed cpo T[[t]]: [t] = lift (T[[t]]) We can define T[[t]] by induction over t, using the following operations on cpos (not necessarily having a bottom element): lift D = D plus a new bottom element D x E = cartesian product of D and E D + E = disjoint union of D and E D -> E = the cpo of continuous functions from D to E 1 = the one-element cpo 0 = the empty cpo For primitive types like Int, Integer, Double, etc, T[[t]] is just a discrete cpo (i.e. a set, with no bottom element). For Haskell's function type, we have T[[s -> t]] = [s] -> [t] For a data declaration data D = c1 | ... | cn we have T[[D]] = C[[c1]] + ... + C[[cn]] C[[K a1 ... an]] = A[[a1]] x ... x A[[an]] C[[K]] = 1 -- the identity for cartesian product A[[t]] = [t] A[[! t]] = T[[t]] For a newtype declaration newtype N = K t we have T[[N]] = T[[t]] To take the original questions, newtype Empty a = E () data Empty' a = E' newtype Pair v w a = P (v a, w a) data Pair' v w a = P' (v a) (w a) we have T[[Empty]] = T[[()]] = 1 T[[Empty']] = C[[E']] = 1 T[[Pair v w a]] = T[[(v a, w a)]] = [v a] x [w a] T[[Pair' v w a]] = C[[P' (v a) (w a)]] = [v a] x [w a] So semantically they're the same. I suspect the reason Chris used newtype is that GHC knows about the predefined tuple types and does some extra optimizations on them, but doing this for user-defined type constructors across module boundaries is a bit harder. Similarly, given newtype M = MC t data N = NC !t we have T[[M]] = T[[t]] = T[[N]], though as you pointed out, pattern matching is defined differently for the two types. (The formal semantics in the Report seems to be missing the relevant rule, though its informal semantics is clear on this point.) As for recursive types, I don't know a simple presentation of the general case, but if there's no recursion through the first argument of a -> type it's just the smallest cpo such that the two sides are isomorphic. For example, for data Nat = Zero | Succ Nat we have T[[Nat]] ~= 1 + lift (T[[Nat]]) or [Nat] ~= lift (1 + [Nat]) The elements of [Nat] are _|_, Zero, Succ _|_, Succ Zero, etc and, to make it complete, an extra element Succ (Succ (Succ ...)) which is the least upper bound of the chain _|_ <= Succ _|_ <= Succ (Succ _|_) <= ... On the other hand, for data Nat = Zero | Succ !Nat we have T[[Nat]] ~= 1 + T[[Nat]] i.e. the elements of T[[Nat] are equivalent to the natural numbers, and [Nat] has an extra bottom element. Another example (from Patrik Jansson): newtype Void = Void Void then T[[Void]] is the least cpo such that T[[Void]] = T[[Void]] so it is the empty cpo 0, and [Void] is a one-element cpo. Another definition of the same cpo is data Void = Void !Void and many more are possible, even without GHC's extension.

On Thu, Jan 17, 2002 at 12:16:14PM +0000, Ross Paterson wrote:
We can avoid smash products and coalesced sums by analysing [t] as being the lifting of a not-necessarily-pointed cpo T[[t]]:
[t] = lift (T[[t]])
We can define T[[t]] by induction over t, using the following operations on cpos (not necessarily having a bottom element):
lift D = D plus a new bottom element D x E = cartesian product of D and E D + E = disjoint union of D and E D -> E = the cpo of continuous functions from D to E 1 = the one-element cpo 0 = the empty cpo ... For Haskell's function type, we have
T[[s -> t]] = [s] -> [t]
If I understand it correctly, this makes \x.undefined :: a -> b different from undefined :: a -> b For instance, in this setup, the CPO [()->()] has four elements, in a totally ordered CPO; in increasing order, they are undefined const undefined id const () Is it really clear the first two ('undefined' and 'const undefined') are different? Ken says they are observationally equivalent. --Dylan Thurston

Dylan Thurston wrote:
If I understand it correctly, this makes \x.undefined :: a -> b different from undefined :: a -> b For instance, in this setup, the CPO [()->()] has four elements, in a totally ordered CPO; in increasing order, they are undefined const undefined id const () Is it really clear the first two ('undefined' and 'const undefined') are different? Ken says they are observationally equivalent.
The two first are different: seq undefined 0 gives bottom seq (\ x -> undefined) 0 gives 0 When seq was introduced as a polymorphic function the function space in Haskell got lifted. :-( Not that in lambda calculus they are the same; seq is not lambda definable. -- Lennart
participants (6)
-
Dylan Thurston
-
Iavor Diatchki
-
Jan de Wit
-
Ken Shan
-
Lennart Augustsson
-
Ross Paterson