RFC: rewrite-with-location proposal

Michael Snoyman

25 Feb 2013

6:06 a.m.

Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

Then all usages of `error` would be converted into calls to `errorLoc` by the compiler, passing in the location information of where the call originated from. Our three intended use cases are:

* Locations for failing test cases in a test framework
* Locations for log messages
* assert/error/undefined

Note that the current behavior of the assert function[2] already includes this kind of approach, but it is a special case hard-coded into the compiler. This proposal essentially generalizes that behavior and makes it available for all functions, whether included with GHC or user-defined.

The proposal spells out some details of this approach, and contrasts with other methods being used today for the same purpose, such as TH and CPP.

Michael

[1] https://github.com/sol/rewrite-with-location
[2] http://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Control-Ex...
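As a rough illustration of the rewrite described above, the following sketch writes out by hand what the compiler would produce for a single call to `error`. The `Location` record, its field names, `renderLocation`, `rewrittenCall` and `caughtMessage` are assumptions made for this example and are not taken from the proposal; the use of `unsafePerformIO` to consume the `IO Location` argument is likewise only one possible choice.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical location record; the proposal may define different fields.
data Location = Location
  { locFile :: FilePath
  , locLine :: Int
  }

-- Render a location as "file:line".
renderLocation :: Location -> String
renderLocation loc = locFile loc ++ ":" ++ show (locLine loc)

-- Location-aware counterpart of `error`, with the IO-wrapped argument
-- from the message above. unsafePerformIO is an assumption of this
-- sketch: it is one way to consume an IO Location from pure code.
errorLoc :: IO Location -> String -> a
errorLoc ioLoc msg =
  error (renderLocation (unsafePerformIO ioLoc) ++ ": " ++ msg)

-- What a call `error "boom"` on line 42 of Foo.hs would be rewritten to.
rewrittenCall :: Int
rewrittenCall = errorLoc (return (Location "Foo.hs" 42)) "boom"

-- Force a value and return the message of the ErrorCall it throws, if any.
caughtMessage :: a -> IO (Maybe String)
caughtMessage x = do
  r <- try (evaluate x)
  return $ case r of
    Left (ErrorCall m) -> Just m
    Right _            -> Nothing

main :: IO ()
main = caughtMessage rewrittenCall >>= print
```

Running `main` prints `Just "Foo.hs:42: boom"`; with the pragma, the location argument would be filled in by the compiler rather than written by hand.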


Joachim Breitner

25 Feb

8:57 a.m.

Hi,

On Monday, 25.02.2013, at 08:06 +0200, Michael Snoyman wrote:

...

Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

In light of attempts to split base into a pure part (without IO) and another part, I wonder if the IO wrapping is really necessary. Can you elaborate on the reason why a simple "Location ->" is not enough?

Thanks,
Joachim

--
Joachim "nomeata" Breitner
Debian Developer
nomeata@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C
JID: nomeata@joachim-breitner.de | http://people.debian.org/~nomeata

Simon Hengel

9:13 a.m.

On Mon, Feb 25, 2013 at 09:57:04AM +0100, Joachim Breitner wrote:

...

Hi,

On Monday, 25.02.2013, at 08:06 +0200, Michael Snoyman wrote:

...
Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

in light of attempts to split base into a pure part (without IO) and another part, I wonder if the IO wrapping is really necessary.

Can you elaborate the reason why a simple "Location ->" is not enough?

The IO helps with reasoning. Without it you could write code that does something different depending on the call site. Here is an example:

    someBogusThingy :: Int
    someBogusThingy = ..

    someBogusThingyLoc :: Location -> Int
    someBogusThingyLoc loc
      | (even . getLine) loc = 23
      | otherwise            = someBogusThingy

    {-# REWRITE_WITH_LOCATION someBogusThingy someBogusThingyLoc #-}

Now someBogusThingy behaves differently depending on whether the call site is on an even or odd line number. Admittedly, the example is contrived, but I hope it illustrates the issue.

I do not insist on keeping it. If we, as a community, decide that we do not need the IO here, then I'm fine with dropping it.

Cheers,
Simon
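To make the concern concrete, here is a small runnable sketch of the pure variant. The `Location` record and the placeholder value 42 are assumptions for illustration; the accessor is called `locLine` here rather than `getLine`, only to avoid clashing with the Prelude function of that name.

```haskell
-- Hypothetical location record for this sketch.
data Location = Location
  { locFile :: FilePath
  , locLine :: Int
  }

-- Stand-in for the original definition (the message elides its body).
someBogusThingy :: Int
someBogusThingy = 42

-- Pure location-aware variant: the result depends on the call site.
someBogusThingyLoc :: Location -> Int
someBogusThingyLoc loc
  | even (locLine loc) = 23
  | otherwise          = someBogusThingy

main :: IO ()
main = do
  -- The "same" expression, as it would be rewritten on two adjacent lines.
  print (someBogusThingyLoc (Location "Foo.hs" 10))
  print (someBogusThingyLoc (Location "Foo.hs" 11))
```

The two calls print 23 and 42: two textually identical uses of `someBogusThingy` would no longer be interchangeable once the rewrite has been applied.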

Michael Snoyman

9:16 a.m.

On Mon, Feb 25, 2013 at 11:13 AM, Simon Hengel wrote:

...

On Mon, Feb 25, 2013 at 09:57:04AM +0100, Joachim Breitner wrote:

...
Hi,

On Monday, 25.02.2013, at 08:06 +0200, Michael Snoyman wrote:

...
Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

in light of attempts to split base into a pure part (without IO) and another part, I wonder if the IO wrapping is really necessary.

Can you elaborate the reason why a simple "Location ->" is not enough?

The IO helps with reasoning. Without it you could write code that does something different depending on the call site. Here is an example:

    someBogusThingy :: Int
    someBogusThingy = ..

    someBogusThingyLoc :: Location -> Int
    someBogusThingyLoc loc
      | (even . getLine) loc = 23
      | otherwise            = someBogusThingy

    {-# REWRITE_WITH_LOCATION someBogusThingy someBogusThingyLoc #-}

Now someBogusThingy behaves differently depending on whether the call site is on an even or odd line number. Admittedly, the example is contrived, but I hope it illustrates the issue.

I do not insist on keeping it. If we, as a community, decide, that we do not need the IO here. Then I'm fine with dropping it.

And FWIW, my vote *does* go towards dropping it. I put this proposal in the same category as rewrite rules in general: it's certainly possible for a bad implementation to wreak havoc, but it's the responsibility of the person using the rewrite rules to ensure that doesn't happen.

Michael

Joachim Breitner

9:17 a.m.

Hi,

On Monday, 25.02.2013, at 10:13 +0100, Simon Hengel wrote:

...

On Mon, Feb 25, 2013 at 09:57:04AM +0100, Joachim Breitner wrote:

...
Hi,

On Monday, 25.02.2013, at 08:06 +0200, Michael Snoyman wrote:

...
Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

in light of attempts to split base into a pure part (without IO) and another part, I wonder if the IO wrapping is really necessary.

Can you elaborate the reason why a simple "Location ->" is not enough?

The IO helps with reasoning. Without it you could write code that does something different depending on the call site. Here is an example:

    someBogusThingy :: Int
    someBogusThingy = ..

    someBogusThingyLoc :: Location -> Int
    someBogusThingyLoc loc
      | (even . getLine) loc = 23
      | otherwise            = someBogusThingy

    {-# REWRITE_WITH_LOCATION someBogusThingy someBogusThingyLoc #-}

Now someBogusThingy behaves differently depending on whether the call site is on an even or odd line number. Admittedly, the example is contrived, but I hope it illustrates the issue.

OK, I mentally applied REWRITE_WITH_LOCATION before wondering about reasoning about the code. But you are right that it would be nice if the rewrite rule were valid as well.

I'm not firmly committed either way.

Greetings,
Joachim

Alexander Kjeldaas

9:21 a.m.

Immediately, the alternative of introducing bound variables in the environment that is available to rewrite rules comes to mind as a more general way of doing this.

So this example from the GHC docs:

    {-# RULES
    "map/map"    forall f g xs.  map f (map g xs) = map (f.g) xs
    "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys
      #-}

For some source:

    map f (map g xs)

it is translated into:

    let location = "somefile.hs:234" in map (f.g) xs

So for error:

    {-# RULES
    "error/location" error s = errorLoc location s
      #-}

is translated into:

    let location = "somefile.hs:345" in errorLoc location s

Alexander

Twan van Laarhoven

9:40 a.m.

On 25/02/13 07:06, Michael Snoyman wrote:

...

Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

    error :: String -> a
    errorLoc :: IO Location -> String -> a
    {-# REWRITE_WITH_LOCATION error errorLoc #-}

Then all usages of `error` would be converted into calls to `errorLoc` by the compiler, passing in the location information of where the call originated from. Our three intended use cases are:

I think there is no need to have a separate REWRITE_WITH_LOCATION rule. What if the compiler instead rewrites 'currentLocation' to the current location? Then you'd just define the rule:

    {-# REWRITE "errorLoc" error = errorLoc currentLocation #-}

I'm also pretty sure that something like this has been proposed in the past.

Twan

Simon Hengel

11:46 a.m.

On Mon, Feb 25, 2013 at 10:40:29AM +0100, Twan van Laarhoven wrote:

...

I think there is no need to have a separate REWRITE_WITH_LOCATION rule. What if the compiler instead rewrites 'currentLocation' to the current location? Then you'd just define the rule:

{-# REWRITE "errorLoc" error = errorLoc currentLocation #-}

REWRITE rules are only enabled with -O. Source locations are also useful during development (when you care more about compilation time than efficient code and hence use -O0). So I'm not sure whether it's a good idea to lump those two things together.

Cheers,
Simon

Alexander Kjeldaas

12:15 p.m.

On Mon, Feb 25, 2013 at 12:46 PM, Simon Hengel wrote:

...

On Mon, Feb 25, 2013 at 10:40:29AM +0100, Twan van Laarhoven wrote:

...
I think there is no need to have a separate REWRITE_WITH_LOCATION rule. What if the compiler instead rewrites 'currentLocation' to the current location? Then you'd just define the rule:

{-# REWRITE "errorLoc" error = errorLoc currentLocation #-}

REWRITE rules are only enabled with -O. Source locations are also useful during development (when you care more about compilation time than efficient code and hence use -O0). So I'm not sure whether it's a good idea to lump those two things together.

I could imagine source locations being useful when debugging rewrite rules, for example.

I think your argument makes sense, but why not fix that specifically?

    {-# REWRITE ALWAYS "errorLoc" error = errorLoc currentLocation #-}

Alexander

Michael Snoyman

12:41 p.m.

On Mon, Feb 25, 2013 at 2:15 PM, Alexander Kjeldaas < alexander.kjeldaas@gmail.com> wrote:

...

On Mon, Feb 25, 2013 at 12:46 PM, Simon Hengel wrote:

...
On Mon, Feb 25, 2013 at 10:40:29AM +0100, Twan van Laarhoven wrote:

...
I think there is no need to have a separate REWRITE_WITH_LOCATION rule. What if the compiler instead rewrites 'currentLocation' to the current location? Then you'd just define the rule:

{-# REWRITE "errorLoc" error = errorLoc currentLocation #-}

REWRITE rules are only enabled with -O. Source locations are also useful during development (when you care more about compilation time than efficient code and hence use -O0). So I'm not sure whether it's a good idea to lump those two things together.

I could imagine that source locations being useful when debugging rewrite rules for example.

I think your argument makes sense, but why not fix that specifically?

{-# REWRITE ALWAYS "errorLoc" error = errorLoc currentLocation #-}

At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

Michael

Twan van Laarhoven

1:02 p.m.

On 25/02/13 13:41, Michael Snoyman wrote:

...

At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

You are probably right. Ghc already has some logic in place for doing this with 'assert':

    -- Return an expression for (assertError "Foo.hs:27")
    mkAssertErrorExpr = ..

    finishHsVar name
      = do { ignore_asserts <- goptM Opt_IgnoreAsserts
           ; if ignore_asserts || not (name `hasKey` assertIdKey)
             then return (HsVar name, unitFV name)
             else do { e <- mkAssertErrorExpr
                     ; return (e, unitFV name) } }

So the check is name `hasKey` assertIdKey. I.e. it is a literal check whether the name is assert. Maybe that could be extended to check whether the name is declared as assert-like.

Of course the real solution is to have proper stack traces.

Twan
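The "assert-like" extension suggested above can be modelled outside GHC with a toy renamer. Everything in this sketch is hypothetical: the `Expr` type, the `AssertLike` table and `finishVar` are stand-ins invented for illustration and do not correspond to GHC's actual data structures.

```haskell
-- Toy expression type; GHC's real HsExpr is far richer.
data Expr
  = Var String
  | Lit String
  | App Expr Expr
  deriving (Show, Eq)

-- Hypothetical table of assert-like names:
-- original name -> location-aware variant.
type AssertLike = [(String, String)]

-- Rename one variable occurrence seen at the given source location.
-- Names in the table become the variant applied to the location;
-- all other names are left untouched.
finishVar :: AssertLike -> String -> String -> Expr
finishVar table loc name =
  case lookup name table of
    Just locName -> App (Var locName) (Lit loc)
    Nothing      -> Var name

main :: IO ()
main = do
  let table = [("assert", "assertError"), ("error", "errorLoc")]
  print (finishVar table "Foo.hs:27" "error")
  print (finishVar table "Foo.hs:27" "map")
```

The first call prints `App (Var "errorLoc") (Lit "Foo.hs:27")` and the second prints `Var "map"`, mirroring the idea of replacing a hard-coded check on one name with a lookup over declared names.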

Petr Pudlák

1:02 p.m.

2013/2/25 Michael Snoyman

...

At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

Just a remark: 'currentLocation' is not a function (it's a special keyword) but behaves like one: it returns some kind of value. But it's not referentially transparent, since it returns a different value depending on where it's used. This is something that I really don't expect from Haskell. So having it return `IO Location` therefore seems a much better option. And if someone really wants to get the location as a pure value, (s)he can simply wrap it with `unsafePerformIO`, which signals code readers to be careful with that part.

Best regards,
Petr

Alexander Kjeldaas

2:15 p.m.

My initial thought as I read the proposal was to represent currentLocation as a lexically bound variable, thus "error" is rewritten to the expression:

    let currentLocation = "someplace.hs:123" in errorLoc currentLocation

There is no referential transparency issue in that, because there is no global function "currentLocation"; it's a lexically bound variable in the rewrite environment.

Btw, I'm happy that people want to implement whatever they feel like. Feel free to do whatever makes sense. My comments are not meant to discourage this :-)

Alexander

On Mon, Feb 25, 2013 at 2:02 PM, Petr Pudlák wrote:

...

2013/2/25 Michael Snoyman

...
At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

Just a remark: 'currentLocation' is not a function (it's a special keyword) but behaves like one - it returns some kind of value. But it's not referentially transparent - it returns a different value depending on where it's used. This is something that I really don't expect from Haskell. So having it return `IO Location` seems therefore much better option. And if someone really wants to get the location as a pure value, (s)he can simply wrap it with `unsafePerformIO`, which signals code readers to be careful with that part.

Best regards, Petr

Roman Cheplyaka

3:59 p.m.

* Petr Pudlák [2013-02-25 14:02:28+0100]

...

2013/2/25 Michael Snoyman

...
At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

Just a remark: 'currentLocation' is not a function (it's a special keyword) but behaves like one - it returns some kind of value. But it's not referentially transparent - it returns a different value depending on where it's used. This is something that I really don't expect from Haskell. So having it return `IO Location` seems therefore much better option. And if someone really wants to get the location as a pure value, (s)he can simply wrap it with `unsafePerformIO`, which signals code readers to be careful with that part.

Wrapping it in IO doesn't make it any more referentially transparent, if you think about it.

Roman

Daniel Trstenjak

26 Feb

8:52 a.m.

Hi Michael,

On Mon, Feb 25, 2013 at 02:41:19PM +0200, Michael Snoyman wrote:

...

At that point, we've now made two changes to REWRITE rules:

1. They can take a new ALWAYS parameter.
2. There's a new, special identifier currentLocation available.

What would be the advantage of that approach versus introducing a single new REWRITE_WITH_LOCATION pragma?

The name REWRITE_WITH_LOCATION could suggest that it's just a REWRITE with an additional location, and hide the fact that the compiler applies it in a different way. Perhaps using another word instead of REWRITE would signal that difference.

Greetings,
Daniel

Simon Peyton-Jones

25 Feb

2:42 p.m.

...

I think there is no need to have a separate REWRITE_WITH_LOCATION rule. What if the compiler instead rewrites 'currentLocation' to the current location? Then you'd just define the rule:

    {-# REWRITE "errorLoc" error = errorLoc currentLocation #-}

REWRITE rules are only enabled with -O. Source locations are also useful during development (when you care more about compilation time than efficient code and hence use -O0). So I'm not sure whether it's a good idea to lump those two things together.

I'm afraid the rewrite-rule idea won't work. RULES are applied during optimisation, when tons of inlining has happened and the program has been shaken around a lot. No reliable source location information is available there.

See http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack; and please edit it.

One idea I had, which that page does not yet describe, is to have an implicit parameter, something like ?loc::Location, with

    errLoc :: (?loc :: Location) => String -> a
    errLoc s = error ("At " ++ ?loc ++ "\n" ++ s)

This behaves exactly like an ordinary implicit parameter, EXCEPT that if there is no binding for ?loc::Location, then the current location is used. Thus

    myErr :: (?loc :: Location) => Int -> a
    myErr n = errLoc (show n)

    foo :: Int -> Int
    foo n | n<0 = myErr n
          | otherwise = ...whatever...

When typechecking 'foo' we need ?loc::Location, and so the magic is that we use the location of the call of myErr in foo.

Simon
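The implicit-parameter idea can be approximated with the existing ImplicitParams extension, minus the proposed default. In this sketch `Location` is assumed to be a plain `String`, the body of the `otherwise` branch is a placeholder, and the hand-written `let ?loc = ...` stands in for the binding that the compiler would supply automatically under the suggestion.

```haskell
{-# LANGUAGE ImplicitParams #-}

import Control.Exception (ErrorCall (..), evaluate, try)

-- Assumption: a location is just a "file:line" string.
type Location = String

errLoc :: (?loc :: Location) => String -> a
errLoc s = error ("At " ++ ?loc ++ "\n" ++ s)

-- myErr asks for ?loc itself, so it passes its caller's location on.
myErr :: (?loc :: Location) => Int -> a
myErr n = errLoc (show n)

-- foo has no ?loc constraint. The manual binding below plays the role
-- of the proposed magic: the location of the call to myErr inside foo.
foo :: Int -> Int
foo n
  | n < 0     = let ?loc = "Foo.hs:21" in myErr n
  | otherwise = n + 1

main :: IO ()
main = do
  print (foo 1)
  r <- try (evaluate (foo (-1)))
  case r of
    Left (ErrorCall msg) -> putStrLn msg
    Right v              -> print v
```

Running `main` prints 2, followed by the two-line message `At Foo.hs:21` and `-1`. Because `myErr` carries the constraint, the reported location is the call site in `foo`, not the call to `errLoc` inside `myErr`.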

Kim-Ee Yeoh

5:30 p.m.

On Mon, Feb 25, 2013 at 9:42 PM, Simon Peyton-Jones wrote:

...

One idea I had, which that page does not yet describe, is to have an implicit parameter, something like ?loc::Location

+1

Implicit params have a bad rap in some circles because of counterintuitive behavior when manually binding the parameter (the syntax is partly to blame). Since ?loc is never bound by hand, there should be no problems.

-- Kim-Ee

Michael Snoyman

6:19 p.m.

On Mon, Feb 25, 2013 at 4:42 PM, Simon Peyton-Jones wrote:

...

I'm afraid the rewrite-rule idea won't work. RULES are applied during optimisation, when tons of inlining has happened and the program has been shaken around a lot. No reliable source location information is available there.

Do you mean that the proposal itself won't work, or specifically that implementing this feature in terms of existing rewrite rules won't work?

...

See http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack; and please edit it.

One thing I'd disagree with on that page is point (3). While it's certainly nice to have a full stack trace, implementing just shallow call information is incredibly useful. For logging and test framework usages, it in fact completely covers the use case. And even for debugging, I think it would be a massive step in the right direction.

I'll admit to ignorance on the internals of GHC, but it seems like doing the shallow source location approach would be far simpler than a full trace. I'd hate to lose a very valuable feature because we can't implement the perfect feature.


Simon Peyton-Jones

26 Feb

10:06 a.m.

...

Do you mean that the proposal itself won't work, or specifically implementing this features in terms of existing rewrite rules won't work?

I meant the latter.

I'll admit to ignorance on the internals of GHC, but it seems like doing the shallow source location approach would be far simpler than a full trace. I'd hate to lose a very valuable feature because we can't implement the perfect feature.

I agree with that sentiment. But in fact I suspect that getting a stack is little or no harder than the shallow thing.

My "implicit parameter" suggestion was trying to re-use an existing feature, with a small twist, to do what you want, rather than to implement something brand new.

Simon

Michael Snoyman

12:24 p.m.

On Tue, Feb 26, 2013 at 12:06 PM, Simon Peyton-Jones wrote:

...

Do you mean that the proposal itself won't work, or specifically implementing this features in terms of existing rewrite rules won't work?

I meant the latter.

I'll admit to ignorance on the internals of GHC, but it seems like doing the shallow source location approach would be far simpler than a full trace. I'd hate to lose a very valuable feature because we can't implement the perfect feature.

I agree with that sentiment. But in fact I suspect that getting a stack is little or no harder than the shallow thing.

My "implicit parameter" suggestion was trying to re-use an existing feature, with a small twist, to do what you want, rather than to implement something brand new.

I personally have very little opinion about how this feature is implemented. But would this approach implement the shallow trace, or the full stack trace?

Michael


Gershom Bazerman

3:51 p.m.

On 2/25/13 9:42 AM, Simon Peyton-Jones wrote:

...

I'm afraid the rewrite-rule idea won't work. RULES are applied during optimisation, when tons of inlining has happened and the program has been shaken around a lot. No reliable source location information is available there.

See http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack; and please edit it.

One idea I had, which that page does not yet describe, is to have an implicit parameter, something like ?loc::Location, with

    errLoc :: (?loc :: Location) => String -> a
    errLoc s = error ("At " ++ ?loc ++ "\n" ++ s)

This behaves exactly like an ordinary implicit parameter, EXCEPT that if there is no binding for ?loc::Location, then the current location is used. Thus

I like the general approach of this proposal quite a bit. I'd very much like Location to be not just a string, but a record type. Ideally we could recover not just module name, line and character, but also the name of the function that takes the location.

This would eliminate an entire swath of use-cases for Template Haskell. For example, I've worked out a template-haskell-free version of the Cloud Haskell closure API, which hopefully is getting merged in at some point. The major drawback it has is that the user is required to provide a globally-unique identifier for each closure, ideally stable across compilations. The current TH code solves this by grabbing the function and module name. If we could get direct access to these things without requiring template haskell, that would be quite nice. Other types of RPC libraries I've worked on could similarly benefit from this.

Cheers,
Gershom
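A record-shaped Location along these lines might look like the sketch below. The field names, the `closureId` helper and the example module and function names are all assumptions made for illustration; none of them come from the proposal or from Cloud Haskell.

```haskell
-- Hypothetical record-shaped location, with the fields wished for above.
data Location = Location
  { locModule   :: String  -- module containing the call site
  , locFunction :: String  -- enclosing top-level function
  , locLine     :: Int
  , locColumn   :: Int
  } deriving (Show, Eq)

-- Derive an identifier from module and function name only, so that it
-- stays stable when line and column numbers shift between compilations.
closureId :: Location -> String
closureId loc = locModule loc ++ "." ++ locFunction loc

main :: IO ()
main = putStrLn (closureId (Location "Worker.Tasks" "resizeImage" 40 3))
```

This prints `Worker.Tasks.resizeImage`. Leaving line and column out of the identifier is a design choice of the sketch, aimed at the "stable across compilations" requirement mentioned above.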

Alberto G. Corona

3 Mar 3 Mar

10:41 p.m.

Additionally, another way to include line number information and to improve the readability of the debugging code is to add "verify" as an "assert" with flipped parameters, so we can write:

    let x = head xs `verify` (not $ null xs)

so that the assertions appear on the right, separated from the rest of the code, instead of:

    let x = assert (not $ null xs) (head xs)

2013/2/26 Gershom Bazerman

...

On 2/25/13 9:42 AM, Simon Peyton-Jones wrote:

I'm afraid the rewrite-rule idea won't work. RULES are applied during optimisation, when tons of inlining has happened and the program has been shaken around a lot. No reliable source location information is available there.

See http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack; and please edit it.

One idea I had, which that page does not yet describe, is to have an implicit parameter, something like ?loc::Location, with

errLoc :: (?loc :: Location) => String -> a

errLoc s = error ("At " ++ ?loc ++ "\n" ++ s)

This behaves exactly like an ordinary implicit parameter, EXCEPT that if there is no binding for ?loc::Location, then the current location is used. Thus

I like the general approach of this proposal quite a bit. I'd very much like Location to be not just a string, but a record type. Ideally we could recover not just module name, line and character, but also the name of the function that takes the location. This would eliminate an entire swath of use-cases for Template Haskell. For example, I've worked out a template-haskell-free version of the Cloud Haskell closure API, which hopefully is getting merged in at some point. The major drawback it has is that the user is required to provide a globally-unique identifier for each closure, ideally stable across compilations. The current TH code solves this by grabbing the function and module name. If we could get direct access to these things without requiring template haskell, that would be quite nice. Other types of RPC libraries I've worked on could similarly benefit from this.

Cheers, Gershom


-- Alberto.
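The verify combinator suggested in this message can be sketched as below, assuming nothing more than assert from Control.Exception. The name firstElem is made up for the example. Two caveats: GHC ignores assertions when optimisation is on (-fignore-asserts), and the location reported on failure is presumably the one inside verify, where assert is written, rather than the caller's, which is the very gap this thread is about.

```haskell
import Control.Exception (assert)

-- | 'assert' with its arguments flipped, so the checked condition trails
-- the expression it guards.
verify :: a -> Bool -> a
verify x ok = assert ok x

-- | Hypothetical caller: the value comes first, the precondition second.
firstElem :: [Int] -> Int
firstElem xs = head xs `verify` (not $ null xs)

main :: IO ()
main = print (firstElem [1, 2, 3])
```

Function application binds tighter than any infix operator, so head xs `verify` (...) parses as (head xs) `verify` (...) without extra parentheses.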

Evan Laforge

2 Dec 2 Dec

9:43 p.m.

Hey, whatever happened with this? Is there anything in the way of getting this merged? Is there some way I could help? On Sun, Feb 24, 2013 at 10:06 PM, Michael Snoyman wrote:

...

Quite a while back, Simon Hengel and I put together a proposal[1] for a new feature in GHC. The basic idea is pretty simple: provide a new pragma that could be used like so:

error :: String -> a errorLoc :: IO Location -> String -> a {-# REWRITE_WITH_LOCATION error errorLoc #-}

Then all usages of `error` would be converted into calls to `errorLoc` by the compiler, passing in the location information of where the call originated from. Our three intended use cases are:

* Locations for failing test cases in a test framework * Locations for log messages * assert/error/undefined

Note that the current behavior of the assert function[2] already includes this kind of approach, but it is a special case hard-coded into the compiler. This proposal essentially generalizes that behavior and makes it available for all functions, whether included with GHC or user-defined.

The proposal spells out some details of this approach, and contrasts with other methods being used today for the same purpose, such as TH and CPP.

Michael

[1] https://github.com/sol/rewrite-with-location [2] http://hackage.haskell.org/packages/archive/base/4.6.0.1/doc/html/Control-Ex...


Simon Hengel

10:06 p.m.

Hi Evan! On Mon, Dec 02, 2013 at 01:43:31PM -0800, Evan Laforge wrote:

...

Hey, whatever happened with this?

My code for this is here: https://github.com/sol/ghc/commits/rewrite-with-location Revision 03e63f0a70ec8c0fece4049c2d714ea533494ec2 was fully functional, but it needs to be rebased on current master. The missing feature here is that type checking only happens on rewrite. I just added a wip commit with local modifications that do the type checking earlier, when the module with the rewrite pragma is compiled.

...

Is there anything in the way of getting this merged? Is there some way I could help?

This needs rebasing + I'm not sure if the wip commit currently compiles. I'm somewhat swamped, so I'm not sure when I'll have time to work on this. If you want to help, that would be awesome! I'm happy to help with any questions (solirc on freenode, feel free to say hello in #hspec ;). Cheers, Simon

Simon Peyton-Jones

5 Dec 5 Dec

12:13 p.m.

Simon

Interesting!

There's been a lot of work on this kind of thing, mostly collected here: https://ghc.haskell.org/trac/ghc/wiki/ExplicitCallStack

I didn't know about your work, so I've added it.

I'd be happy to see progress on this front. Tristan's "Finding the needle" stuff was close to "ready" but there were some awkward points (described in his paper) that meant he didn't feel it was done.

To progress this, it'd be helpful to look at his work, articulate what the differences are, perhaps take the best of both, identify weak spots, and figure out what (if anything) should be done about them.

We don't want the best to be the enemy of the good, but it's also worth ensuring that we take advantage of all the land-mine-discovery that earlier work has done.

Simon

| -----Original Message-----
| From: Haskell-Cafe [mailto:haskell-cafe-bounces@haskell.org] On Behalf Of Simon Hengel
| Sent: 02 December 2013 22:06
| To: Evan Laforge
| Cc: Haskell Cafe
| Subject: Re: [Haskell-cafe] RFC: rewrite-with-location proposal
|
| Hi Evan!
|
| On Mon, Dec 02, 2013 at 01:43:31PM -0800, Evan Laforge wrote:
| > Hey, whatever happened with this?
|
| My code for this is here:
|
| https://github.com/sol/ghc/commits/rewrite-with-location
|
| Revision 03e63f0a70ec8c0fece4049c2d714ea533494ec2 was fully functional, but it needs to be rebased on current master. The missing feature here is that type checking only happens on rewrite. I just added a wip commit with local modifications that do the type checking earlier, when the module with the rewrite pragma is compiled.
|
| > Is there anything in the way of getting this merged? Is there some way I could help?
|
| This needs rebasing + I'm not sure if the wip commit currently compiles. I'm somewhat swamped, so I'm not sure when I'll have time to work on this. If you want to help, that would be awesome! I'm happy to help with any questions (solirc on freenode, feel free to say hello in #hspec ;).
|
| Cheers,
| Simon

Evan Laforge

4 Feb 4 Feb

11:40 p.m.

I noticed a recent commit https://phabricator.haskell.org/D578 implements this. This is exciting! But I suppose it's too late for 7.10? Any chance of it making it in? I should note that to me the most interesting part of this has nothing to do with debugging. If you have a logging system, or tests, you need to have file name and line numbers. I've always done it via a custom preprocessor, but this extension would allow me to get rid of the preprocessor, which is very nice. It also seems simpler than the whole call stack thing because I only care about the first entry of the stack. On Thu, Dec 5, 2013 at 4:13 AM, Simon Peyton-Jones wrote:

...

Simon

Interesting!

There's been a lot of work on this kind of thing, mostly collected here: https://ghc.haskell.org/trac/ghc/wiki/ExplicitCallStack

I didn't know about your work, so I've added it.

I'd be happy to see progress on this front. Tristan's "Finding the needle" stuff was close to "ready" but there were some awkward points (described in his paper) that meant he didn't feel it was done.

To progress this, it'd be helpful to look at his work, articulate what the differences are, perhaps take the best of both, identify weak spots, and figure out what (if anything) should be done about them.

We don't want the best to be the enemy of the good, but it's also worth ensuring that we take advantage of all the land-mine-discovery that earlier work has done.

Simon


Eric Seidel

11:51 p.m.

Evan Laforge writes:

...

I noticed a recent commit https://phabricator.haskell.org/D578 implements this. This is exciting! But I suppose it's too late for 7.10? Any chance of it making it in?

Heh, I probably should have made some noise about it... But there are still some questions about to what extent to use it (if at all) in the standard libraries, so I doubt it would be a good candidate for a last minute merge to 7.10. I've actually been meaning to start a discussion about where/whether we should use this feature in base, but it slipped by the wayside. Thanks for the reminder!

...

I should note that to me the most interesting part of this has nothing to do with debugging. If you have a logging system, or tests, you need to have file name and line numbers. I've always done it via a custom preprocessor, but this extension would allow me to get rid of the preprocessor, which is very nice. It also seems simpler than the whole call stack thing because I only care about the first entry of the stack.

Yep, my original motivation was getting access to source locations within embedded DSLs. The call-stack is a nice and easy extension, but I'm not sure how useful it will be in practice, as the first function that doesn't request a CallStack parameter will cut off the stack. This means that the generated stacks will often be quite short, I imagine. Eric

Evan Laforge

5 Feb 5 Feb

12:10 a.m.

On Wed, Feb 4, 2015 at 3:51 PM, Eric Seidel wrote:

...

Yep, my original motivation was getting access to source locations within embedded DSLs. The call-stack is a nice and easy extension, but I'm not sure how useful it will be in practice, as the first function that doesn't request a CallStack parameter will cut off the stack. This means that the generated stacks will often be quite short, I imagine.

Well, as I said, all I really care about is the direct caller. From the example in the commit, it looks like the function with the (?x :: Location) annotation can get its immediate caller, even if that caller doesn't have the annotation. If that's true, that's all that is needed! And from my point of view, it's not just "maybe useful in practice", but absolutely required, to the point where I wrote a custom preprocessor for it. I've been using it for 6 or 7 years and I sort of forgot that other people don't have it. I actually have no idea how other people do logging... just hope the message is unique and grep -n all the time? And for tests, manually give every single assertion a unique name and grep -n again? Enable TH globally? Those all seem impractical if you have or are expecting thousands of modules. I don't think it needs to be used at all in the standard libraries, since logging and testing are not part of base. I can understand if the merge window for 7.10 is closed, but trying to come up with a use in base shouldn't hold it up!

Eric Seidel

12:39 a.m.

Evan Laforge writes:

...

On Wed, Feb 4, 2015 at 3:51 PM, Eric Seidel wrote:

...
Yep, my original motivation was getting access to source locations within embedded DSLs. The call-stack is a nice and easy extension, but I'm not sure how useful it will be in practice, as the first function that doesn't request a CallStack parameter will cut off the stack. This means that the generated stacks will often be quite short, I imagine.

Well, as I said, all I really care about is the direct caller. From the example in the commit, it looks like the function with the (?x :: Location) annotation can get its immediate caller, even if that caller doesn't have the annotation. If that's true, that's all that is needed!

That's correct, though I regrettably forgot to update the Phabricator summary with the rename from Location to CallStack (the actual docs do properly talk about CallStacks). A constraint (?x :: CallStack) will always be solved for the source location that gave rise to it; when it comes from a function signature (as opposed to a use of the implicit param), you'll get the immediate call-site. Furthermore, if that call-site has a CallStack implicit param in scope, the stacks will be appended (this appending of call-stacks is the bit that I'm not sure will see much use). Does that make sense?
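The solving rule described here can be observed with a small sketch. It assumes the HasCallStack alias and the callStack / getCallStack functions that later versions of base export from GHC.Stack, not the raw (?x :: CallStack) form of the patch under discussion; the names stackDepth, plain and annotated are made up for the example.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- | Number of call-sites recorded on the stack at this point.
stackDepth :: HasCallStack => Int
stackDepth = length (getCallStack callStack)

-- | No annotation: the chain is cut off here, so only the call to
-- 'stackDepth' itself is recorded.
plain :: Int
plain = stackDepth

-- | Annotated: the call-site of 'annotated' is appended behind the
-- call-site of 'stackDepth'.
annotated :: HasCallStack => Int
annotated = stackDepth

main :: IO ()
main = do
  print plain      -- one entry on the stack
  print annotated  -- two entries on the stack
```

The immediate call-site is always available, whether or not the caller carries the annotation; the annotation only decides whether the chain continues further up.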

...

And from my point of view, it's not just "maybe useful in practice", but absolutely required, to the point where I wrote a custom preprocessor for it. I've been using it for 6 or 7 years and I sort of forgot that other people don't have it. I actually have no idea how other people do logging... just hope the message is unique and grep -n all the time? And for tests, manually give every single assertion a unique name and grep -n again? Enable TH globally? Those all seem impractical if you have or are expecting thousands of modules.

I don't think it needs to be used at all in the standard libraries, since logging and testing are not part of base. I can understand if the merge window for 7.10 is closed, but trying to come up with a use in base shouldn't hold it up!

You're quite right, there are plenty of reasons to want this functionality beyond error reporting, though I'd personally like to use this for `error`, `undefined`, and `assert` as well!

Evan Laforge

12:52 a.m.

On Wed, Feb 4, 2015 at 4:39 PM, Eric Seidel wrote:

...

A constraint (?x :: CallStack) will always be solved for the source location that gave rise to it, when it comes from a function signature (as opposed to a use of the implicit param) you'll get the immediate call-site. Furthermore, if that call-site has a CallStack implicit param in scope, the stacks will be appended (this appending of call-stacks is the bit that I'm not sure will see much use). Does that make sense?

Sure. One place I can think of where the chaining would be useful is a helper function that then calls the logging function: put the annotation on the helper, and the output can skip that intermediate function. This makes it better than Python or C++ logging, which often provide no way to do that.
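One way the helper-function case could look, as a sketch only. It relies on withFrozenCallStack, which later versions of base export from GHC.Stack and which is an assumption relative to the patch discussed here. The names logAt and warn and the message format are hypothetical.

```haskell
import GHC.Stack
  ( HasCallStack, SrcLoc (..), callStack, getCallStack, withFrozenCallStack )

-- | Hypothetical logger: prefixes the message with the function and
-- location found on top of the call stack.
logAt :: HasCallStack => String -> String
logAt msg = case getCallStack callStack of
  (fn, loc) : _ ->
    fn ++ "@" ++ srcLocFile loc ++ ":" ++ show (srcLocStartLine loc) ++ ": " ++ msg
  [] -> msg

-- | Helper that hides itself: the stack is frozen before calling 'logAt',
-- so the top entry is the call to 'warn', not the call to 'logAt' in here.
warn :: HasCallStack => String -> String
warn msg = withFrozenCallStack (logAt ("WARN " ++ msg))

main :: IO ()
main = do
  putStrLn (logAt "direct call")  -- top entry names logAt, called in main
  putStrLn (warn "via helper")    -- top entry names warn, called in main
```

An alternative with the same effect would be for the logger to skip a fixed number of entries, but freezing keeps that knowledge inside the helper instead of the logger.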

...

You're quite right, there are plenty of reasons to want this functionality beyond error reporting, though I'd personally like to use this for `error`, `undefined`, and `assert` as well!

True enough, though assert kind of already has the feature. In fact, you could remove the assert special case hack, though there are probably people relying on it to get source info. In any case it's up to the GHC people to say what they think. In case it's not already obvious, big +1 for "merge as soon as possible" from me.

Eric Seidel

1:02 a.m.

...

On Feb 4, 2015, at 16:52, Evan Laforge wrote:

...
On Wed, Feb 4, 2015 at 4:39 PM, Eric Seidel wrote: A constraint (?x :: CallStack) will always be solved for the source location that gave rise to it, when it comes from a function signature (as opposed to a use of the implicit param) you'll get the immediate call-site. Furthermore, if that call-site has a CallStack implicit param in scope, the stacks will be appended (this appending of call-stacks is the bit that I'm not sure will see much use). Does that make sense?

Sure. One place I can think of where the chaining would be useful is you can put it on a helper function that then calls the logging function, so the output can skip that intermediate function. This makes it better than python or C++ logging, which often provides no way to do that.

Right, it would also be useful for partial functions like head, so error has access to head's call-site when it produces a stack trace. The thing is, this feature is not free, as it adds an argument to each function that participates. So the question is, to what extent would you want to annotate library code with these CallStack parameters?
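A sketch of what annotating a partial function could look like, assuming the HasCallStack alias from GHC.Stack in later versions of base, where error itself also carries the annotation. The name headLoc is hypothetical and this is not how base defines head.

```haskell
import GHC.Stack (HasCallStack)

-- | A variant of 'head' that asks for its caller's location. The
-- constraint is compiled to an extra argument, which is the cost
-- mentioned above.
headLoc :: HasCallStack => [a] -> a
headLoc (x : _) = x
headLoc []      = error "headLoc: empty list"

main :: IO ()
main = do
  print (headLoc [1, 2, 3 :: Int])
  -- Uncommenting the next line aborts with a message that also lists the
  -- call-site of headLoc, provided 'error' is HasCallStack-aware.
  -- print (headLoc ([] :: [Int]))
```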

Evan Laforge

1:18 a.m.

On Wed, Feb 4, 2015 at 5:02 PM, Eric Seidel wrote:

...

Right, it would also be useful for partial functions like head, so error has access to head's call-site when it produces a stack trace. The thing is, this feature is not free, as it adds an argument to each function that participates. So the question is, to what extent would you want to annotate library code with these CallStack parameters?

I don't actually care, since I never use those functions. I suppose one way to argue would be to say go ahead and add it, since the only people using them are either writing a quick hack where crashing is ok, and thus probably don't care about micro-optimizing performance, or people who don't know any better, who are also not in the micro-optimization business. Or maybe people use them as a micro-optimization, like e.g. unsafeIndex. But I don't know if it actually is a micro-optimization over e.g. 'case xs of [] -> error "..."', since 'head' probably compiles to just that. In any case, I have no horse in the race, but I'd say "put it on everything partial."

Simon Hengel

30 Mar 30 Mar

4:04 a.m.

Hi Evan, On Wed, Feb 04, 2015 at 04:10:56PM -0800, Evan Laforge wrote:

...

On Wed, Feb 4, 2015 at 3:51 PM, Eric Seidel wrote:

...
Yep, my original motivation was getting access to source locations within embedded DSLs. The call-stack is a nice and easy extension, but I'm not sure how useful it will be in practice, as the first function that doesn't request a CallStack parameter will cut off the stack. This means that the generated stacks will often be quite short, I imagine.

Well, as I said, all I really care about is the direct caller. From the example in the commit, it looks like the function with the (?x :: Location) annotation can get its immediate caller, even if that caller doesn't have the annotation. If that's true, that's all that is needed!

I completely agree with you. Logging and failing test cases were my main motivation when I looked at the problem domain (even though I still think the situation with error/undefined is unfortunate too).

...

And from my point of view, it's not just "maybe useful in practice", but absolutely required, to the point where I wrote a custom preprocessor for it. I've been using it for 6 or 7 years and I sort of forgot that other people don't have it. I actually have no idea how other people do logging... just hope the message is unique and grep -n all the time? And for tests, manually give every single assertion a unique name and grep -n again?

hspec-discover adds heuristic source locations by now (without really parsing any Haskell code, so it's pretty robust, worst case is you don't get a source location). This is on a spec item (== test case) granularity. I still want proper source locations, so that we can attach them to individual expectations/assertions.
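Attaching a location to an individual expectation could look roughly like the sketch below. It assumes the HasCallStack API from GHC.Stack in later versions of base; shouldEqual and its Either-based result are hypothetical and are not hspec's actual API.

```haskell
import GHC.Stack (HasCallStack, SrcLoc (..), callStack, getCallStack)

-- | Hypothetical expectation: Right () on success, otherwise a message
-- prefixed with the file and line of the failing call.
shouldEqual :: (HasCallStack, Eq a, Show a) => a -> a -> Either String ()
shouldEqual actual expected
  | actual == expected = Right ()
  | otherwise =
      case getCallStack callStack of
        (_, loc) : _ ->
          Left (srcLocFile loc ++ ":" ++ show (srcLocStartLine loc) ++ ": " ++ failure)
        [] -> Left failure
  where
    failure = "expected " ++ show expected ++ ", but got " ++ show actual

main :: IO ()
main = do
  print (shouldEqual (2 + 2 :: Int) 4)  -- succeeds
  print (shouldEqual (2 + 2 :: Int) 5)  -- fails, tagged with this line
```

Because the location comes from the call to shouldEqual itself, two expectations inside the same spec item get distinct locations, which is the finer granularity asked for here.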

...

Enable TH globally?

For logging-facade I added a TH version[1], but I don't really like it. And again, I'm super eager to use proper source locations here, once the patch gets merged!

...

I don't think it needs to be used at all in the standard libraries, since logging and testing are not part of base. I can understand if the merge window for 7.10 is closed, but trying to come up with a use in base shouldn't hold it up!

As you argued in a later mail, I would be in favor of using it for all partial functions in base, too (but we may want to spend some time looking at how list fusion may be affected). But I think it makes sense to treat this as two separate discussions. Hopefully we get this patch merged ASAP, then we can still discuss to what extent we want to use it in base. Cheers, Simon [1] http://hackage.haskell.org/package/logging-facade-0.0.0/docs/System-Logging-...


32 comments

14 participants

participants (14)

Alberto G. Corona
Alexander Kjeldaas
Daniel Trstenjak
Eric Seidel
Evan Laforge
Gershom Bazerman
Joachim Breitner
Kim-Ee Yeoh
Michael Snoyman
Petr Pudlák
Roman Cheplyaka
Simon Hengel
Simon Peyton-Jones
Twan van Laarhoven