TypeHoles: unbound variables as named holes

(I'm again starting a new thread to focus on this issue. It's easier to track that way.) On Thu, Oct 4, 2012 at 11:40 AM, Simon Peyton-Jones wrote:
There is also the small matter, in this example, of distinguishing which `_' is which. The description works, but you have to think about it. I don't have an immediate and simple solution to this. Perhaps the addition of unique labels (e.g. _$1 _$2). But this is not a major problem. It can even wait until some future development/expansion on TypeHoles.
Just to clarify the "small matter" above, I was proposing a way to reference unnamed holes in the warning messages. This was only to solve the relatively minor problem of visually distinguishing the holes if there are multiple. ****
I have a proposal. Someone has already suggested on hackage.haskell.org/trac/ghc/ticket/5910 that an un-bound variable behaves like a hole. Thus, if you say
** **
f x = y****
GHC says “Error: y is not in scope”. But (idea) with -XTypeHoles****
** **
f x = y****
might generate ****
**1. **(renamer) *Warning*: y is not in scope****
**2. **(type) *Error*: Hole “y” has type....****
So that’s like a named hole, in effect. ****
** **
If you say****
f x = 4****
GHC warns about the unused binding for x. But if you say****
f _x = 4****
the unused-binding warning is suppressed. So (idea) if you say****
f x = _y****
maybe we can suppress warning (1). And, voila, named holes.****
** **
Moreover if you add –fdefer-type-errors you can keep going and run the program.****
** **
Any comments? This is pretty easy to do.
I also like the proposal; however, I think it only makes sense if the set of unbound variables with the same name is treated as referring to the same identifier. This was, after all, the main reason for named holes. Roman expected this, and I think everybody who uses the feature will expect it. On Fri, Oct 5, 2012 at 9:24 AM, Roman Cheplyaka wrote:
* Simon Peyton-Jones [2012-10-05 07:14:36+0000]
| Sounds cool. I would also expect that if you have several occurences of | the same unbound identifier, then it gets a unified type.
I thought about this, but I think not. Consider
f x1 = _y g x2 = _y
Do you want _y and _y to be unified, so that f and g are no longer
* Roman Cheplyaka [2012-10-05 10:22:23+0300] polymorphic? I think not. Any more than the "_" holes we have now are unified.
Do you mean polymorphism in their argument? Why would it go away?
I would expect the functions to get types `a -> c` and `b -> c` respectively, and `c` to be reported as the type of the hole.
Ah, I see. Since `c` is universally quantified, it means different things for f and g. Good point.
I would expect `_y' to be unified. I'm not sure how to write the type of `f' or `g' (maybe `forall a. a -> c' and `forall b. b -> c'?), but I would still expect `_y' to have some type `c'. It would be endlessly confusing to have identifiers that look the same but aren't. Here's a thought that just occurred to me, though I'm not yet sure if it makes sense. Treat all expression identifiers _x as unique but report them as one hole with all possible types. Then, you can visually identify the patterns between the types. So:
f x = _y g x = _y 'a'
with some warning like this: Found hole `_y' in multiple locations with the possible types File.hs:##:##: a0 File.hs:##:##: Char -> b0 Now, I know by looking at it that `a0' and `b0' are universally quantified per location, but I can do some mental unification myself. Regards, Sean

On Fri, Oct 5, 2012 at 10:22 AM, Sean Leather
I also like the proposal; however, I think it only makes sense if the set of unbound variables with the same name is treated as referring to the same identifier. This was, after all, the main reason for named holes. Roman expected this, and I think everybody who uses the feature will expect it.
I for one wasn't expecting it, so I guess that makes me fall outside of the scope of everybody. ;) The main thing I like about Simon's proposal is that I could just drop an _foo in my file, and when I open it back up with vim or what have you /_foo to find my way back to it.
Here's a thought that just occurred to me, though I'm not yet sure if it makes sense. Treat all expression identifiers _x as unique but report them as one hole with all possible types. Then, you can visually identify the patterns between the types. So:
f x = _y g x = _y 'a'
with some warning like this:
Found hole `_y' in multiple locations with the possible types File.hs:##:##: a0 File.hs:##:##: Char -> b0
Now, I know by looking at it that `a0' and `b0' are universally quantified per location, but I can do some mental unification myself.
There is the slight complication that the inscope variables shown for each location would be in different unification contexts as well, so your list gets a lot more cluttered, and the in scope variables probably should be grouped with each in turn, so I'm not sure this is any better than listing them separately by the time you factor in that clutter. -Edward

On Fri, Oct 5, 2012 at 10:39 PM, Edward Kmett wrote:
On Fri, Oct 5, 2012 at 10:22 AM, Sean Leather wrote:
I also like the proposal; however, I think it only makes sense if the set of unbound variables with the same name is treated as referring to the same identifier. This was, after all, the main reason for named holes. Roman expected this, and I think everybody who uses the feature will expect it.
I for one wasn't expecting it, so I guess that makes me fall outside of the scope of everybody. ;)
Perhaps not everybody expects it, but I think it's quite natural to expect/want this feature. Once you find that you can get the type of _x (as opposed to _) in one place, why wouldn't you also want the type of _x when it's used in multiple places? The main thing I like about Simon's proposal is that I could just drop an
_foo in my file, and when I open it back up with vim or what have you /_foo to find my way back to it.
I'm not sure what you mean here. The point I was making was about having two (or more) _foo's in your file(s). So, you can still /_foo if you want or grep _foo if you prefer that. Here's a thought that just occurred to me, though I'm not yet sure if it
makes sense. Treat all expression identifiers _x as unique but report them as one hole with all possible types. Then, you can visually identify the patterns between the types. So:
f x = _y g x = _y 'a'
with some warning like this:
Found hole `_y' in multiple locations with the possible types File.hs:##:##: a0 File.hs:##:##: Char -> b0
Now, I know by looking at it that `a0' and `b0' are universally quantified per location, but I can do some mental unification myself.
There is the slight complication that the inscope variables shown for each location would be in different unification contexts as well, so your list gets a lot more cluttered, and the in scope variables probably should be grouped with each in turn, so I'm not sure this is any better than listing them separately by the time you factor in that clutter.
By "inscope variables," I guess you mean the "relevant bindings" in the hole report. Why would you group them with the different locations? I expect only one definition for _y. (If I didn't expect one definition for _y, then I would rename some of the holes.) So, I am only interested in variables that are available wherever that definition can be. If all the _y holes are in one top-level definition, then I probably prefer the local variables (so as not to be overwhelmed with other variables), and the context will make sense. If the _y holes are spread over multiple top-level definitions, then local variables won't help me there anyway. Regards, Sean

I also like the proposal; however, I think it only makes sense if the set of unbound variables with the same name is treated as referring to the same identifier. This was, after all, the main reason for named holes. Roman expected this, and I think everybody who uses the feature will expect it. It would be endlessly confusing to have identifiers that look the same but aren't. I agree that occurrences of an unbound variable with the same name can be thought of as being the same identifier -- but I do not agree that means that each occurrence need have the same type. Consider f x = (undefined && True, ord undefined) There are two occurrences of 'undefined', but one has type Bool, and the other has type Char. is that endlessly confusing? I think that each hole *must* be independent, and effectively must have type (forall a.a), just like undefined. Otherwise you get into real difficulties about how polymorphism is affected because we don't know how far out the sharing happens. I don't think type sharing is a good reason for named holes -- indeed I think that trying to specify what sharing might mean would be plain confusing. Instead, I'm proposing this simple design * If there is no binding for "x", "_x" or "foo_x" then they are unbound variables * With a suitable flag, any unbound variable is treated as a hole, with each occurrence having a separate type. That is, it's just as if there was a top level binding for the variable saying foo_x :: forall a. a; foo_x = error "foo_x is undefined", except that the compiler reports the type of each occurrence of a hole. * Our current holes "_" are just a degenerate case of the above. Simple, uniform, explainable. Simon

Hey all, To report on the current situation: I'm myself still not sure which of the options of sharing/not sharing is the most useful. I can't really estimate which I would want to use the most, so I wouldn't dare to predict which most users would want. However, the module-wide sharing is hard, and as Simon mentioned is problematic with polymorphism. We've looked at this before, and right now I think I don't have the time to finish that. No sharing, the way Simon proposed it, was rather easy to add. What I have now works like this: When -XTypeHoles is on, any unbound identifier that starts with _ is also turned into a hole, with an appropriate message. For example, the module: f = _a _b _ Gives (these are warnings, because -fdefer-type-errors is on): --- holes2.hs:1:5: Warning: Found hole with type: `t0 -> t1 -> t' Arising from: an undeclared identifier `_a' at holes2.hs:1:5-6 Where: `t0' is an ambiguous type variable `t1' is an ambiguous type variable `t' is a rigid type variable bound by the inferred type of f :: t at holes2.hs:1:1 Relevant bindings include f :: t (bound at holes2.hs:1:1) In the expression: _a In the expression: _a _b _ In an equation for `f': f = _a _b _ holes2.hs:1:8: Warning: Found hole with type: `t0' Arising from: an undeclared identifier `_b' at holes2.hs:1:8-9 Where: `t0' is an ambiguous type variable In the first argument of `_a', namely `_b' In the expression: _a _b _ In an equation for `f': f = _a _b _ holes2.hs:1:11: Warning: Found hole with type: `t1' Arising from: a use of `_' at holes2.hs:1:11 Where: `t1' is an ambiguous type variable In the second argument of `_a', namely `_' In the expression: _a _b _ In an equation for `f': f = _a _b _ --- I think this is a good compromise that gives you named holes and easily allows you to turn unbound identifiers into holes. What do you think? Regards, Thijs Alkemade
participants (4)
-
Edward Kmett
-
Sean Leather
-
Simon Peyton-Jones
-
Thijs Alkemade