
hello, it looks like a lot of Haskell experts are at POPL (prog. language conference) and are not checking their email... so here is my attempt at explaining the differences between data and newtype. data T -- has 1 vale: _|_ data T = T -- has 2 values: _|_, T data T = T () -- has 3 values: _|_, T _|_, T () newtype T = T () -- has 2 values: _|_, T () basically the "data" declaration in Haskell "lifts" a type, by adding a new _|_ element. the reason for this is that in general a data may have different shapes (hence the constructors) and while trying to make a value of the type one might non-terminate before they computed what constructor should be applied. the annoying thing is that this happens even for datatypes which dont have different shapes, i.e. data declarations with only one different shape. Even if there is only one possible shape we always remember the shape (constructor) in the representation and when pattern matching check to see if the value is of this shape. i dont know why. to (kind of) avoid this, if you have a datatype with only one shape and one field you should use a newtype. unfortunatelly if you have multiple fields you are stuck with the additional _|_. i dont think there should be any difference in efficiency between: data P a b = P a b and newtype P a b = P (a,b) unless there are some hard-coded tuple specific optimizations done by the current implementations. i cannot see any reasons why these optimizations shouldnt happen for the data declarartion as well (i am not experienced compiler writer however...) which of the above is better is a matter of personal preference - i prefer the 1st one as i think it leads to more readable patterns. finaly there was the question about strictness annotations - they seem to be quite orthogonal to the whole newtype/data issue and enable you to specify that some constructors of your datatype are strict - i.e. when you construct something with a strict constructor it will first be evalued (until it becomes a function or an application of a lazy constructor to some arguments) and only then will the disired value be created. since this evaluation might not terminate of course, some "different" values in the lazy version of the datatype all collapse to teh same thing. so in some sense newtype makes things more lazy, while ! makes them more eager. sorry for the long email hope it helped iavor -- ---------------------------------------+---------------------------------------- Iavor S. Diatchki | email: diatchki@cse.ogi.edu Dept. of Computer Science | web: http://www.cse.ogi.edu/~diatchki OGI School of Science and Engineering | work phone: 5037481631 OHSU | home phone: 5036434085 ---------------------------------------+----------------------------------------