type inference and named fields

older
RE: [Haskell-cafe] Re: [Haskell]...

Malcolm Wallace

23 Jun 2005 23 Jun '05

6:01 p.m.

I have discovered something I believe to be a problem in Haskell'98, although it is not a simple bug as such - it has more of the flavour of an unintended mismatch in the interaction of two separate features. Since Haskell is deeply principled language, a feature conflict is extremely rare, and so worthy of note. So here it is: Some attempts to use named field updates are prohibited by type inference, even though the code is perfectly type safe. Here is an example.

...

module Fieldbug where data Fields a = VariantWithTwo { field1 :: a , field2 :: a } | VariantWithOne { field1 :: a }

The key point here is that the data structure with named fields has more than one constructor, and some fields are omitted in one of the variants. Now let's try to write a simple conversion function over values of this type, using only the named-field update style:

...

data Void = Void

voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

It looks simple enough doesn't it? But none of the widely available Haskell implementations will accept this code. Hugs: ERROR "Fieldbug.hs":11 - Inferred type is not general enough *** Expression : voidcast *** Expected type : Fields a -> Fields Void *** Inferred type : Fields Void -> Fields Void ghc: Fieldbug.hs:11:32: Couldn't match the rigid variable `a' against `Void' `a' is bound by the type signature for `voidcast' Expected type: Fields Void Inferred type: Fields a In the record update: v {field1 = Void} In the definition of `voidcast': voidcast (v@(VariantWithOne {})) = v {field1 = Void} nhc98: ====== Errors after type inference/checking: Derived type for Fieldbug.voidcast at 10:1-11:42 does not match due to: given free variable a is bound to Void Derived:((Fields Void) -> (Fields Void)) Given :((Fields a) -> (Fields Void)) As far as I can tell, the problem is that in the final line, the expression v { field1 = Void } is interpreted by the type inference algorithm as if the variable v could contain either constructor VariantWithOne or VariantWithTwo. Obviously in the latter case the expression would indeed be incomplete because it casts only one of the fields, not both. But we know that can never be the case! By pattern-matching, this particular v /must/ be the former constructor. Unfortunately we cannot pass that knowledge into the type inference algorithm. It turns out that it is in fact /impossible/ to write this conversion function using only named field updates. It can only be written by using an explicit constructor on the rhs, which thus forces you to initialise all its fields explicitly. This negates any value from using named fields in the first place. Regards, Malcolm

Show replies by date

Lennart Augustsson

23 Jun 23 Jun

6:12 p.m.

A somewhat similar problem exists even without fields: foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check. -- Lennart Malcolm Wallace wrote:

...

I have discovered something I believe to be a problem in Haskell'98, although it is not a simple bug as such - it has more of the flavour of an unintended mismatch in the interaction of two separate features. Since Haskell is deeply principled language, a feature conflict is extremely rare, and so worthy of note.

Jacques Carette

6:16 p.m.

I was under the impression that, in ghc 6.4 at least, GADTs did just that: use information gained by matching on the type constructor to refine types. I sort-of expected that the extension to pattern matching would follow. Or is that a nice paper waiting to be written? Jacques Lennart Augustsson wrote:

...

A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

-- Lennart

Lennart Augustsson

8:42 p.m.

It's true that GADTs does just that. But only if you use them. And they are not part of Haskell. :) -- Lennart Jacques Carette wrote:

...

I was under the impression that, in ghc 6.4 at least, GADTs did just that: use information gained by matching on the type constructor to refine types. I sort-of expected that the extension to pattern matching would follow.

Or is that a nice paper waiting to be written?

Jacques

Lennart Augustsson wrote:

...
A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

-- Lennart

Henning Thielemann

6:26 p.m.

On Thu, 23 Jun 2005, Lennart Augustsson wrote:

...

A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

This sounds like the discussion about whether it is sensible to distinguish between empty lists of different types. Definitely yes! As well as ([] :: [a]) is not the same as ([] :: [()]) it must be (Right x :: Either a b) different from (Right x :: Either () b).

Jonathan Cast

7:46 p.m.

Lennart Augustsson wrote:

...

A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

Well, not perfectly safe. We already had this discussion (about []), but to say it again: the value bound by any pattern has some particular type. x above doesn't have type (Either a b) for arbitrary a and b, but rather (Either a b) for some specific a and b. Unless that specific a happens to be (), x doesn't have the correct type to be the rhs of foo. It would require a radical change to GHC (moving away from System F as a basis for the intermediate language Core) to implement a language in which this wasn't true. Here is (approximately) the Core translation of foo above:

...

foo :: forall a b. Either a b -> Either () b foo = /\ a b -> \ e -> case e of Left _ -> Left @() @b () Right _ -> e

You call this as (foo @a @b e), where e has type (Either a b). So, the call (foo @a @b (Right @a @b x)) reduces (ignoring types) to (Right @a @b x). This reduces a term of type (Either () b) to one of type (Either a b). That's not type safe. Jon Cast

Lennart Augustsson

11:37 p.m.

Jonathan Cast wrote:

...

Lennart Augustsson wrote:

...
A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

Well, not perfectly safe. We already had this discussion (about []), but to say it again: the value bound by any pattern has some particular type. x above doesn't have type (Either a b) for arbitrary a and b, but rather (Either a b) for some specific a and b. Unless that specific a happens to be (), x doesn't have the correct type to be the rhs of foo. It would require a radical change to GHC (moving away from System F as a basis for the intermediate language Core) to implement a language in which this wasn't true.

I'm very well aware that you cannot translate this to type correct System F. That doesn't mean it's necessarily unsafe. I wasn't suggesting to change GHC nor Haskell, I was just pointing out a similar problem.

...

Here is (approximately) the Core translation of foo above:

...
foo :: forall a b. Either a b -> Either () b foo = /\ a b -> \ e -> case e of Left _ -> Left @() @b () Right _ -> e

You call this as (foo @a @b e), where e has type (Either a b).

So, the call (foo @a @b (Right @a @b x)) reduces (ignoring types) to (Right @a @b x). This reduces a term of type (Either () b) to one of type (Either a b). That's not type safe.

There are no types around at run time so in case foo is being passed a Right argument the representation of the result is identical to the argument. (OK, if you want to be very pedantic, this isn't necessarily true for every possible implementation of Haskell, just all existing ones. :) -- Lennart

Jonathan Cast

24 Jun 24 Jun

1:57 p.m.

...

...
Lennart Augustsson wrote:

<snip>

...

Jonathan Cast wrote:

...
Here is (approximately) the Core translation of foo above:

...
foo :: forall a b. Either a b -> Either () b foo = /\ a b -> \ e -> case e of Left _ -> Left @() @b () Right _ -> e

You call this as (foo @a @b e), where e has type (Either a b).

So, the call (foo @a @b (Right @a @b x)) reduces (ignoring types) to (Right @a @b x). This reduces a term of type (Either () b) to one of type (Either a b). That's not type safe.

There are no types around at run time

We agree on this, but disagree on the reason. I think we may have to agree to disagree on the reason, but I would rather explain my position first (see below).

...

so in case foo is being passed a Right argument the representation of the result is identical to the argument.

So? Types are logical tools we use to structure our programs and guarantee their safety; type safe is a sufficient condition for ``will not segfault (or similarly fail)'', not an equivalent condition. In any case, it doesn't matter. If I say:

...

newtype Plus alpha beta = Plus (Either alpha beta) bat :: Plus alpha beta -> Either alpha beta bar x = x

bar's result is /guaranteed/ by the standard to have the same representation as its argument, but it still doesn't type check. Why? Because ``type safe'' doesn't mean ``uses the right representation at run time''. Types are not a computer science concept; they are a foundation of mathematics concept, invented around the turn of the previous century I believe by Bertrand Russell to get around certain logical problems in set theory, and I believe brought to prominence by Martin-Löf as a foundation for intuitionistic mathematics. Types made their first contact with lambda-terms in Alonzo Church's time, at which point the lambda-calculus was still intended as a system for foundations of mathematics. We use types in computer science because they are useful, but they don't arise from information theory or representation theory or any other branch of computer science. Therefore, genuine types don't exist at run time. In any language. Types are not properties of programs (which are not mathematical objects), but rather of languages (which are); since no machine language (that I know of) is typed, types do not exist at run time. No type theory (that I know of) goes beyond System F in accepting anything like foo. So, given the current state of the art, foo is unconditionally ill-typed. That could change if someone comes up with a /consistent/ type theory that accepts foo, but foo is ill-typed at the moment.

...

(OK, if you want to be very pedantic, this isn't necessarily true for every possible implementation of Haskell, just all existing ones. :)

``Perfectly correct'' is an open invitation to pedanticism :) Jon Cast

Lennart Augustsson

5:16 p.m.

Jonathan Cast wrote:

...

No type theory (that I know of) goes beyond System F in accepting anything like foo. So, given the current state of the art, foo is unconditionally ill-typed. That could change if someone comes up with a /consistent/ type theory that accepts foo, but foo is ill-typed at the moment.

As you say, we use types in programming languages because they help us, not because we want to implement type systems. So if a program makes sense when you ignore the types, it would be nice if the type system accepts it. My little program makes sense, but is not accepted by Haskell. Nor am I proposing that it should be accepted by Haskell. I was only giving an example of something that makes sense, but isn't allowed. There are, of course, type systems where my program works fine. O'Haskell is an example of a language with such a type system. In O'Haskell the Either type is defined like this: data Left a = Left a data Right a = Right a data Either a b > Left a, Right b which means that Either is really a union of the Left and Right types. Now my program type checks. :) foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x And it has type checked in various type systems for quite a while, but not in System F. But System F is not the end all of type systems. -- Lennart

Bulat Ziganshin

25 Jun 25 Jun

5:55 a.m.

New subject: Re[2]: type inference and named fields

Hello Lennart, Friday, June 24, 2005, 9:16:22 PM, you wrote: LA> There are, of course, type systems where my program works fine. LA> O'Haskell is an example of a language with such a type system. LA> In O'Haskell the Either type is defined like this: i like O'Haskell because it's very close to traditional OO languages in its ways to extend the types. why O'Haskell is not really implemented as extension to GHC/Hugs? because it is not compatible with System F? -- Best regards, Bulat mailto:bulatz@HotPOP.com

Thomas Hallgren

27 Jun 27 Jun

12:44 a.m.

Bulat Ziganshin wrote:

...

i like O'Haskell because it's very close to traditional OO languages in its ways to extend the types. why O'Haskell is not really implemented as extension to GHC/Hugs? because it is not compatible with System F?

O'Haskell *is* implemented as an extension of Hugs. Google for ohugs. I don't think you need to abandon System F to support subtyping in a compiler. You just need to make the subtyping coercions explicit, similar to how overloading is supported by the dictionary translation. -- Thomas H

Jonathan Cast

26 Jun 26 Jun

10:40 p.m.

Lennart Augustsson wrote: <snip>

...

There are, of course, type systems where my program works fine. O'Haskell is an example of a language with such a type system. In O'Haskell the Either type is defined like this:

data Left a = Left a data Right a = Right a data Either a b > Left a, Right b

which means that Either is really a union of the Left and Right types. Now my program type checks. :)

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

And it has type checked in various type systems for quite a while, but not in System F. But System F is not the end all of type systems.

OK. I happily yield. Jon Cast

Daniel Brown

24 Jun 24 Jun

2:39 p.m.

Jonathan Cast wrote:

...

Lennart Augustsson wrote:

...
foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

Well, not perfectly safe. We already had this discussion (about []), but to say it again: the value bound by any pattern has some particular type. x above doesn't have type (Either a b) for arbitrary a and b, but rather (Either a b) for some specific a and b. Unless that specific a happens to be (), x doesn't have the correct type to be the rhs of foo. It would require a radical change to GHC (moving away from System F as a basis for the intermediate language Core) to implement a language in which this wasn't true.

I see that the argument to foo has type `Either a b` (unquantified), but why can't x, bound to the value that matches the pattern, have the type of the pattern? In this case, that would let `x :: forall x. Either x b`, which unifies with `Either () b`. Dan

Jonathan Cast

26 Jun 26 Jun

10:56 p.m.

Daniel Brown wrote:

...

Jonathan Cast wrote:

...
Lennart Augustsson wrote:

...
foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Since Haskell type checking doesn't use the information gained by pattern matching to refine types we just have to accept that some perfectly safe programs don't type check.

Well, not perfectly safe. We already had this discussion (about []), but to say it again: the value bound by any pattern has some particular type. x above doesn't have type (Either a b) for arbitrary a and b, but rather (Either a b) for some specific a and b. Unless that specific a happens to be (), x doesn't have the correct type to be the rhs of foo. It would require a radical change to GHC (moving away from System F as a basis for the intermediate language Core) to implement a language in which this wasn't true.

I see that the argument to foo has type `Either a b` (unquantified), but why can't x, bound to the value that matches the pattern, have the type of the pattern?

Because it's bound to a value of the type of the argument, not of the type of the pattern. What about the function

...

bar :: alpha -> beta bar x = x

? The pattern here is x, and its type is (x :: forall tau. tau). But, I think it axiomatic that typing rules have to forbid functions like bar.

...

In this case, that would let `x :: forall x. Either x b`, which unifies with `Either () b`.

Jon Cast

Malcolm Wallace

24 Jun 24 Jun

8:50 a.m.

Lennart Augustsson writes:

...

A somewhat similar problem exists even without fields:

foo :: Either a b -> Either () b foo (Left _) = Left () foo x@(Right _) = x

Your example is less surprising than mine with fields. The expression x on the rhs of the last clause has type Either a b This does not match the signature, which claims Either () b So it is fairly easy to explain why this is rejected. Even if we change the definition of the Either type to omit the Left constructor (and correspondingly omit the first clause of foo's definition), you still get an error. Whereas in the named field example, the rhs expression v {field1=Void} does indeed have the type Fields Void as declared in the signature. The expression explicitly converts all the relevant interior fields to Void. At least, that is how it could appear to a naive programmer like me :-) After all, if I change the definition of the type Fields to omit the constructor VariantWithTwo (and correspondingly omit the first clause of voidcast's definition), then suddenly and miraculously, the second clause passes the type checker. It is a bit weird that adding a new constructor to a datatype can cause existing code to fail to typecheck! Regards, Malcolm

...

data Fields a = VariantWithTwo { field1 :: a , field2 :: a } | VariantWithOne { field1 :: a } data Void = Void

...

voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

Christian Maeder

9:02 a.m.

Malcolm Wallace wrote:

...

Whereas in the named field example, the rhs expression v {field1=Void} does indeed have the type Fields Void as declared in the signature. The expression explicitly converts all the relevant interior fields to Void. At least, that is how it could appear to a naive programmer like me :-)

If v has a second field with the same type of field1, do you really expect that field2 is silently casted to the new type of field1? (This would be unsafe) Christian

Henning Thielemann

23 Jun 23 Jun

6:43 p.m.

On Thu, 23 Jun 2005, Malcolm Wallace wrote:

...

...
module Fieldbug where data Fields a = VariantWithTwo { field1 :: a , field2 :: a } | VariantWithOne { field1 :: a }

The key point here is that the data structure with named fields has more than one constructor, and some fields are omitted in one of the variants. Now let's try to write a simple conversion function over values of this type, using only the named-field update style:

...
data Void = Void

voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

It looks simple enough doesn't it? But none of the widely available Haskell implementations will accept this code.

This is either a point against multiple fields with the same label or against the field update syntax. Against multiple declarations because it is not allowed to define multiple objects with the same name in the same scope. But if you consider multiple occurences of the same field name as distributed declarations of the same function the conflict is resolved. So it seems to be more a point against the record update syntax. What I need quite often is a function which updates a field. This would nicely work also for your example. update_field1 :: (a -> a) -> Fields a -> Fields a update_field1 f (VariantWithTwo x y) = VariantWithTwo (f x) y update_field1 f (VariantWithOne x) = VariantWithOne (f x) (update_field1 (const x)) would replace a field value with a new one. Support for this kind of update would be better than support by the current update syntax.

Christian Maeder

7:08 p.m.

Malcolm Wallace wrote:

...

...
voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

I would not expect that updating only field1 can change the type of v. The right thing is to construct a new value. This looks as follows with record syntax: voidcast VariantWithTwo{} = VariantWithTwo { field1 = Void , field2 = Void } voidcast VariantWithOne{} = VariantWithOne { field1 = Void } Christian

Henning Thielemann

7:23 p.m.

On Thu, 23 Jun 2005, Christian Maeder wrote:

...

Malcolm Wallace wrote:

...
...
voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

I would not expect that updating only field1 can change the type of v. The right thing is to construct a new value. This looks as follows with record syntax:

voidcast VariantWithTwo{} = VariantWithTwo { field1 = Void , field2 = Void } voidcast VariantWithOne{} = VariantWithOne { field1 = Void }

Erm, I have overlooked this, too. So my update function proposal doesn't affect the type aspect of record update syntax.

Tomasz Zielonka

7:54 p.m.

On Thu, Jun 23, 2005 at 09:08:12PM +0200, Christian Maeder wrote:

...

Malcolm Wallace wrote:

...
...
voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

I would not expect that updating only field1 can change the type of v.

But it can. Note that if you change the second field1 to field3 (both in datatype definition and voidcast function), the code will be accepted.

...

The right thing is to construct a new value.

In the report, section 3.15.3, "Updates Using Field Labels", the translation is simply a pattern-match and (re-)construction. There is no requirement that "input" and "output" types are the same. It can be a nice feature actually, I've used it once or twice. Best regards Tomasz

Christian Maeder

24 Jun 24 Jun

8:37 a.m.

Tomasz Zielonka wrote:

...

...
...
...
voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void } I would not expect that updating only field1 can change the type of v.

But it can. Note that if you change the second field1 to field3 (both in datatype definition and voidcast function), the code will be accepted.

...
The right thing is to construct a new value.

In the report, section 3.15.3, "Updates Using Field Labels", the translation is simply a pattern-match and (re-)construction. There is no requirement that "input" and "output" types are the same. It can be a nice feature actually, I've used it once or twice.

Obviously this (magic) reconstruction does not work always (and not in the given case). Even consecutive updates (that could be recognized by the desugarer) don't work: voidcast v@VariantWithTwo{} = v { field1 = Void} {field2 = Void } I think, the type checker of programatica requires that input and output type of updated values are the same (not only because comma-separated field updates are desugared to consecutive updates.) Christian

Malcolm Wallace

10:44 a.m.

Christian Maeder writes:

...

Even consecutive updates (that could be recognized by the desugarer) don't work:

voidcast v@VariantWithTwo{} = v { field1 = Void} {field2 = Void }

Yes, I find it interesting that consecutive updates are not equivalent to a combined update. I believe this is largely because named fields are defined as sugar - they behave in some sense like textual macros in other languages, which can often turn out to have unexpected properties. Here is a strawman proposal, which does not fix the consecutive update problem, but which might address the original typing problem. The Report (section 3.15) defines the following conditions on a named field update: * all labels must be taken from the same datatype * at least one constructor must define all of the labels mentioned in the update * no label may be mentioned more than once * an execution error occurs when the value being updated does not contain all of the specified labels. Now, this last condition is rather nasty - a runtime error - and I would like to eliminate it. I think we can do so by also eliminating the second condition. That is, I would like to be able to use a set of labels that might not be contained solely within a single constructor. The obvious semantics is that if a constructed value does not contain a label of that type, that part of the update is discarded. Hence, I could write a single expression that updated all possible variants of a type, simply by unioning all their possible labels. e.g.

...

data Fields a = VariantWithOne { field1 :: a } | VariantWithTwo { field2 :: a }

voidcast :: Fields a -> Fields Void voidcast v = v { field1 = Void , field2 = Void }

The only change required in the desugaring translation specified in the Report, is to replace the default case match from _ -> error "Update error" to v -> v (The desugaring does not currently seem to implement condition 2, and the original case default only partially addresses condition 4.) What might be the downside of such a change? Well, an update expression (such as in my example above) could confuse the programmer into thinking that the value contains at least two fields, when it in fact contains only one. Likewise, it may suggest strongly that a certain value associated with a label will afterwards be stored within the variable (as now), whereas it is possible that the value afterwards appears nowhere within the variable (because the label was not a member). How do these problems weigh against the increased convenience of the proposal? I think they are very slight, and the benefit is potentially larger. It would break no old programs. It would permit some current programs to be more succinct. Regards, Malcolm

Christian Maeder

12:23 p.m.

Malcolm Wallace wrote:

...

Here is a strawman proposal, which does not fix the consecutive update problem, but which might address the original typing problem.

I think it does not really address the original typing problem. It would allow you to write:

...

voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void , field2 = Void }

Setting field2 only assures type correctness, but the value of field2 would be ignored at runtime. I'ld rather define an empty polymorphic element and update that one: e = VariantWithOne { field1 = undefined } voidcast VariantWithOne{} = e { field1 = Void } In the last line e is (only) instantiated (and not casted).

...

The Report (section 3.15) defines the following conditions on a named field update:

* all labels must be taken from the same datatype * at least one constructor must define all of the labels mentioned in the update * no label may be mentioned more than once * an execution error occurs when the value being updated does not contain all of the specified labels.

Now, this last condition is rather nasty - a runtime error - and I would like to eliminate it. I think we can do so by also eliminating the second condition. That is, I would like to be able to use a set of labels that might not be contained solely within a single constructor. The obvious semantics is that if a constructed value does not contain a label of that type, that part of the update is discarded. Hence, I could write a single expression that updated all possible variants of a type, simply by unioning all their possible labels.

I want to detect errors as early as possible. Christian

Malcolm Wallace

1:38 p.m.

Christian Maeder writes:

...

...
voidcast v@(VariantWithOne{}) = v { field1 = Void , field2 = Void }

Setting field2 only assures type correctness, but the value of field2 would be ignored at runtime.

Exactly what I was proposing.

...

...
I could write a single expression that updated all possible variants of a type, simply by unioning all their possible labels.

I want to detect errors as early as possible.

But you can already write this:

...

...
voidcast v@(VariantWithOne{}) = v { field1 = Void , field2 = Void }

and you do not get a compile time error - it is a runtime error. I am proposing that it should not be thought of as an error at all. Regards, Malcolm

Christian Maeder

27 Jun 27 Jun

9:40 a.m.

Malcolm Wallace wrote:

...

Christian Maeder writes:

...
...
voidcast v@(VariantWithOne{}) = v { field1 = Void , field2 = Void } Setting field2 only assures type correctness, but the value of field2 would be ignored at runtime.

Exactly what I was proposing.

I know, but how "save" is this. i.e. when you mistype/exchange field names?

...

I am proposing that it should not be thought of as an error at all.

I think, it is better to get i.e. an overflow error (at runtime) than another runtime error elsewhere. Christian

Christian Maeder

10:19 a.m.

Christian Maeder wrote:

...

Malcolm Wallace wrote:

...
Christian Maeder writes:

...
...
voidcast v@(VariantWithOne{}) = v { field1 = Void , field2 = Void } Setting field2 only assures type correctness, but the value of field2 would be ignored at runtime. Exactly what I was proposing.

I know, but how "save" is this. i.e. when you mistype/exchange field names?

The criterion should be how "robust" you can program, i.e. can you easily detect the spots that you must change (or need not to be changed) when the data type changed? (One of the advantages of Haskell's type system) Christian

Christian Maeder

10:30 a.m.

Malcolm Wallace wrote:

...

Yes, I find it interesting that consecutive updates are not equivalent to a combined update. I believe this is largely because named fields are defined as sugar - they behave in some sense like textual macros in other languages, which can often turn out to have unexpected properties.

I share your characterization "like textual macros". The question is, should this be eliminated? Would it be cleaner, though less convenient some times, if the types of updated values must not change? (Of course, it is no realistic option to change that in Haskell, since probably too much existing code would break.) Christian

Malcolm Wallace

12:41 p.m.

Malcolm Wallace wrote:

...

...
Yes, I find it interesting that consecutive updates are not equivalent to a combined update. I believe this is largely because named fields are defined as sugar - they behave in some sense like textual macros in other languages, which can often turn out to have unexpected properties.

Christian Maeder writes:

...

I share your characterization "like textual macros". The question is, should this be eliminated?

Although the behaviour is initially surprising, I suppose it is easily explained. After all, an equivalent expression using lambdas would have the same typing problem: data A a = A { one, two :: a } update v@(A{}) = v { one = Void } { two = Void } update' v@(A{}) = (\A one two -> A one Void) $ (\A one two -> A Void two) $ v Because of this, I would be reluctant to define that consecutive field updates can be semantically amalgamated into a single update.

...

Would it be cleaner, though less convenient some times, if the types of updated values must not change?

I believe this is a very different question from the consecutive update one. By analogy with the lambda equivalent, I can see no reason to outlaw a type change, where all the relevant types change at the same time: update_ok v@(A{}) = v { one=Void, two=Void } update_ok' v@(A{}) = (\A one two -> A Void Void) v Regards, Malcolm

Christian Maeder

1:16 p.m.

Malcolm Wallace wrote:

...

I believe this is a very different question from the consecutive update one.

I agree, consecutive and parallel updates are quite different.

...

I can see no reason to outlaw a type change, where all the relevant types change at the same time:

update_ok v@(A{}) = v { one=Void, two=Void } update_ok' v@(A{}) = (\A one two -> A Void Void) v

Because typing fails in your original example: voidcast v@VariantWithOne{} = v {field1 = Void} What would be the "lambda equivalent"?

...

voidcast v@VariantWithOne{} = (\ (VariantWithOne a) -> VariantWithOne Void) v

would go through. Whereas

...

voidcast v@VariantWithOne{} = ( \ x -> case x of VariantWithTwo a b -> VariantWithTwo Void b VariantWithOne a -> VariantWithOne Void) v

fails like the field notation. I think writing "VariantWithOne {field1 = Void}" instead of "v {field1 = Void}" is as clear as writing "Right r" again on the rhs instead of "x" from the pattern "x@(Right r)" Christian

Malcolm Wallace

4:07 p.m.

Christian Maeder writes:

...

Typing fails in your original example: voidcast v@VariantWithOne{} = v {field1 = Void} but not in the "lambda equivalent" voidcast v@VariantWithOne{} = (\ (VariantWithOne a) -> VariantWithOne Void) v

Hmm. Yes, that was my original point.

...

But voidcast v@VariantWithOne{} = ( \ x -> case x of VariantWithTwo a b -> VariantWithTwo Void b VariantWithOne a -> VariantWithOne Void) v fails like the field notation.

Yes, this is a more accurate translation of the field notation to a lambda expression, and therefore suffers the same problem. My proposal was to allow voidcast v = v { field1 = P, field2 = Q } to be interpreted as the lambda expression voidcast v = ( \ x -> case x of VariantWithTwo a b -> VariantWithTwo P Q VariantWithOne a -> VariantWithOne P) v where the information about field2=Q is omitted from the clauses where no field2 exists. (I'm using P and Q instead of Void, to highlight the apparent loss of information on the RHS of the pattern-match.)

...

I think writing "VariantWithOne {field1 = Void}" instead of "v {field1 = Void}" is as clear as writing "Right r" again on the rhs instead of "x" from the pattern "x@(Right r)"

I can only repeat myself, that the field being updated (and type-converted) is only one of many, and all other fields should carry the same value in the updated structure as in the original. There is no good way to write this at the moment. If there were no type-conversion, a field update would work just great. But because of the conversion, one is forced to use explicit construction. Regards, Malcolm

Henning Thielemann

4:15 p.m.

On Mon, 27 Jun 2005, Malcolm Wallace wrote:

...

I can only repeat myself, that the field being updated (and type-converted) is only one of many, and all other fields should carry the same value in the updated structure as in the original. There is no good way to write this at the moment. If there were no type-conversion, a field update would work just great. But because of the conversion, one is forced to use explicit construction.

If there is some field which appears in many constructors but the type is independent from the other ones I would think about bundling all constructors which share the same type. The sub-alternatives could be put into another type.

Malcolm Wallace

24 Jun 24 Jun

10:06 a.m.

Christian Maeder writes:

...

...
...
voidcast :: Fields a -> Fields Void voidcast v@(VariantWithTwo{}) = v { field1 = Void , field2 = Void } voidcast v@(VariantWithOne{}) = v { field1 = Void }

I would not expect that updating only field1 can change the type of v. The right thing is to construct a new value. This looks as follows with record syntax:

voidcast VariantWithTwo{} = VariantWithTwo { field1 = Void , field2 = Void } voidcast VariantWithOne{} = VariantWithOne { field1 = Void }

Ah, but the reason I want to use field update, rather than a new construction on the rhs, is that my type really has lots of other fields (not mentioned here), which are all of fixed definite types (not parametric). It is much nicer to be able to write v { field1 = Void } than VariantWithOne { field1 = Void , field2 = field2 v , field3 = field3 v , field4 = field4 v , field5 = field5 v ... } which has so much more scope for making a mistake, not to mention the extra maintainance work required if I ever add or delete a field from the type. Regards, Malcolm

Christian Maeder

10:20 a.m.

Malcolm Wallace wrote:

...

Ah, but the reason I want to use field update, rather than a new construction on the rhs, is that my type really has lots of other fields (not mentioned here), which are all of fixed definite types (not parametric). It is much nicer to be able to write

v { field1 = Void }

than

VariantWithOne { field1 = Void , field2 = field2 v , field3 = field3 v , field4 = field4 v , field5 = field5 v ... }

There is no problem with the desirable short version of updating if the type of v does not change! Only for a polymorphic record data type suitable map (or update) functions should be considered. Your long version is (of course) wrongly typed if the type of another field depends on the type of field1. Christian

7323

Age (days ago)

7327

Last active (days ago)

List overview

Download

32 comments

10 participants

participants (10)

Bulat Ziganshin
Christian Maeder
Daniel Brown
Henning Thielemann
Jacques Carette
Jonathan Cast
Lennart Augustsson
Malcolm Wallace
Thomas Hallgren
Tomasz Zielonka