Type system speculation

People in the Haskell community get awfully excited about Haskell's type system. When I was first learning Haskell, I found this rather odd. After all, a "type" is just a flat name that tells the compiler how many bits to allocate and which operations to allow, right? As I read further, I quickly discovered that in Haskell, the type system is something much more powerful. In a normal language, the type system prevents you from accidentally trying to multiply a string by a binary tree of integers, or pass a function an signed integer when it's expecting an unsigned integer. But in Haskell, the combination of parametric polymorphism, typeclasses and automatic type inference combine to form a potent mixture. Even without any exotic language extensions, in Haskell you can use the type system to *prove stuff about your programs*! And not necessary stuff that's anything to do with the usual notion of a "data type". You can prove that lists are non-empty or even of a specific size. You can prove that optional values get checked before they are used. And you can prove that user input is sanitised before being passed to sensitive functions. (Or rather, you can make your program simply not compile if the required properties are not guaranteed.) Let's take an example. Suppose you have some functions that obtain input from a user, and you have some other functions that do something with this data. What you can do is something like this: newtype Data x y = Data x data Checked data Unchecked check_string :: Data String Unchecked -> Data String Checked Now make it so that everything that obtains data from the user returns Data String Unchecked, and everything that uses a string to do something potentially hazardous demands Data String Checked. (And any functions that don't care whether the string is checked or not can just use Data String x.) Only one problem: Any functions that expect a plain String now won't work. Particularly, the entirity of Data.List will refuse to accept this stuff. Which is kind of galling, considering that a Data String Checked *is* a String! What we're really trying to do here is attach additional information to a value - information which exists only in the type checker's head, but has no effect on runtime behaviour (other than determining whether we *get* to runtime). As far as I can tell, Haskell does not provide any way to take an existing type and attach additional information to it in such a way that code which cares about the new information can make use of it, but existing code which doesn't care continues to work. Any comments on this one? On a rather unrelated note, I found myself thinking about the Type Families extension (or whatever the hell it's called). I found that I can do this: class Collection c where type Element c :: * empty :: c first :: c -> Maybe (Element c) ... This works for collections that can contain *anything*: instance Collection [x] where type Element [x] = x ... It works for collections that can only contain *one* type: instance Collection ByteString where type Element ByteString = Word8 ... And it works for collections that can handle more than one type, but less than *every* type: instance Ord x => Collection (Set x) where type Element (Set x) = x ... Something quite interesting is happening here: this "Element" thing is almost like a *function* which takes one type and returns another. A function that operates on types themselves. Now suppose I want to write a function that converts one kind of collection into another: convert :: (Collection c1, Collection c2) => c1 -> c2 Erm, no. That doesn't work AT ALL. What we want to say is that the element types are constrained (specifically, they're identical). I searched for quite some time looking for a way to say this. In the end, I gave up and let the compiler deduce the type signature automatically. Apparently the correct signature is convert :: (Collection c1, Collection c2, Element c1 ~ Element c2) => c1 -> c2 Apparently the bizare syntax "Element c1 ~ Element c2" means that these two types are supposed to be equal. (Why not just an equals sign?) This got me thinking... Maybe what we'd really like to say is something like convert :: (Collection c1 {Element = x}, Collection c2 {Element = x}) => c1 -> c2 In other words, make the assosiated type like a named field of the class, using syntax similar to the named fields that values can have. And then, if we can do it to classes, why not types? data Map {keys :: *, values :: *} = ... insert :: (Ord k) => k -> v -> Map {keys = k, values = v} -> Map {keys = k, values = v} Now, it seems reasonable that if one of the type fields is unconstrained, you do not need to mention it: keys :: (Ord k) => Map {keys = k} -> [k] values :: Map {values = v} -> [v] If we can have some mechanism for adding new type fields to an existing type, we are one step away from the problem at the beginning. You could take the existing String type, and add (by whatever the mechanism is) a new "safety" field to it. All of Data.List ignores this field, but any new code you write can choose to use it if required. At least, that's my vague idea. Anybody like it? (My first thought is that "Map {keys = k, values = v}" is going to start getting tedious to type very, very quickly. Perhaps we should do something like values :: (m :: Map {values = v}) => m -> [v], or even allow values :: (m :: Map {}) => m -> [values m] as an alternative? IDK.) So there you have it. Multiple paragraphs of semi-directed rambling about type systems. All written by somebody who doesn't even know what a Skolem variable is. :-}

Andrew Coppin wrote:
What we're really trying to do here is attach additional information to a value - information which exists only in the type checker's head, but has no effect on runtime behaviour (other than determining whether we *get* to runtime). As far as I can tell, Haskell does not provide any way to take an existing type and attach additional information to it in such a way that code which cares about the new information can make use of it, but existing code which doesn't care continues to work. Any comments on this one?
In defense of Haskell, that's not what the semantics of newtype are. The semantics of newtype declare that the, er, new type is unrelated to the old type--- despite however they may be represented at runtime. Thus, allowing functions to ignore the newtype wrapper would defeat the intentions of using newtypes in the first place. I actually find this to be one of the most powerful things in the type system of H98, and I miss having them in every other language where I'm forced to sacrifice performance or correctness. As a stop-gap measure you can always make your newtype a functor and then use (<$>) for application instead of ($) or whitespace. That said, I've also considered the ability to have a "multi-type" system where types can be annotated with orthogonal information like Perl's taint flag (which your example resembles) or which subset of IO is used for some particular IO x (e.g., which exceptions can be thrown). There's some trickiness with getting multi-types to work right because we'd like to be able to restrict certain functions to only accept sanitized values, or values in a given format, or what have you. That is, just passing the information around at the type level isn't enough, we have to be able to restrict it too. It's certainly a worthy avenue of research to work the kinks out and get a nice workable system. -- Live well, ~wren

On Wed, Dec 9, 2009 at 7:47 PM, wren ng thornton
Andrew Coppin wrote:
What we're really trying to do here is attach additional information to a value - information which exists only in the type checker's head, but has no effect on runtime behaviour (other than determining whether we *get* to runtime). As far as I can tell, Haskell does not provide any way to take an existing type and attach additional information to it in such a way that code which cares about the new information can make use of it, but existing code which doesn't care continues to work. Any comments on this one?
In defense of Haskell, that's not what the semantics of newtype are. The semantics of newtype declare that the, er, new type is unrelated to the old type--- despite however they may be represented at runtime. Thus, allowing functions to ignore the newtype wrapper would defeat the intentions of using newtypes in the first place. I actually find this to be one of the most powerful things in the type system of H98, and I miss having them in every other language where I'm forced to sacrifice performance or correctness.
What caught me about Andrew's idea is that it would allow for kind polymorphism in a useful way. Or at least, in a way that I tend to yearn for it. Do you have anyway to get kind polymorphism? Jason
participants (3)
-
Andrew Coppin
-
Jason Dagit
-
wren ng thornton