Should exhaustiveness testing be on by default?

newer
is the library iconv necessary for...

older
Job for someone: make a VM image...

Don Stewart

17 May 2009 17 May '09

11:53 p.m.

I'm not sure I'd want -Wall on by default (though being -Wall clean is very good). But exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds. http://ocaml.janestreet.com/?q=node/64 Ron's also wondering why exhaustive pattern checking isn't on ? Anyone know why it isn't the default? -- Don

Show replies by date

Lennart Augustsson

18 May 18 May

1:18 a.m.

The exhaustiveness checker in ghc is not good. It reports non-exhaustive matching a bit too often and also the error messages are often not in terms of the original source but some desugared version. If those things were improved I think it should be always on. On Mon, May 18, 2009 at 12:53 AM, Don Stewart wrote:

...

I'm not sure I'd want -Wall on by default (though being -Wall clean is very good). But exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds.

http://ocaml.janestreet.com/?q=node/64

Ron's also wondering why exhaustive pattern checking isn't on ?

Anyone know why it isn't the default?

-- Don _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Neil Mitchell

8:38 a.m.

I'm not a particular fan of exhaustiveness checking. It just encourages people to write: foo (Just 1) [x:xs] = important case foo _ _ = error "doh!" So now when the program crashes, instead of getting a precise and guaranteed correct error message, I get "doh!" - not particularly helpful for debugging. I think it's not that useful for writing personal code, but for writing production code (which you're going to ship off) it's probably important that you cover all cases. However, for that it would be much better to use a tool such as Catch (http://community.haskell.org/~ndm/catch), which actually guarantees you won't crash, rather than a simple syntactic check. As such, I think it's perfect to be included in a set of warnings, but not great to be a default. I also think that if we change our policies every time someone writes a critical blog we'll be going round in circles :-) I'm also not a fan of the overlapping patterns warning being on by default, as I often write: foo (Case1 x y z) = ... foo (Bar x) = ... foo x = error $ show ("Unhandled in foo",x) This is overlapping, and for a very good reason. Thanks Neil On Mon, May 18, 2009 at 2:18 AM, Lennart Augustsson wrote:

...

The exhaustiveness checker in ghc is not good. It reports non-exhaustive matching a bit too often and also the error messages are often not in terms of the original source but some desugared version.

If those things were improved I think it should be always on.

On Mon, May 18, 2009 at 12:53 AM, Don Stewart wrote:

...
I'm not sure I'd want -Wall on by default (though being -Wall clean is very good). But exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds.

http://ocaml.janestreet.com/?q=node/64

Ron's also wondering why exhaustive pattern checking isn't on ?

Anyone know why it isn't the default?

-- Don _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Peter Hercek

9:45 a.m.

Neil Mitchell wrote:

...

I'm not a particular fan of exhaustiveness checking. It just encourages people to write:

foo (Just 1) [x:xs] = important case foo _ _ = error "doh!"

So now when the program crashes, instead of getting a precise and guaranteed correct error message, I get "doh!" - not particularly helpful for debugging Is there some compile option to automatically annotate error call with its source code location (so that one dos not need to mention it in the string argument)?

Peter.

Simon Peyton-Jones

20 May 20 May

7:36 a.m.

Yes indeed http://www.haskell.org/ghc/docs/latest/html/users_guide/assertions.html Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Peter Hercek | Sent: 18 May 2009 10:46 | To: glasgow-haskell-users@haskell.org | Subject: Re: Should exhaustiveness testing be on by default? | | Neil Mitchell wrote: | > I'm not a particular fan of exhaustiveness checking. It just | > encourages people to write: | > | > foo (Just 1) [x:xs] = important case | > foo _ _ = error "doh!" | > | > So now when the program crashes, instead of getting a precise and | > guaranteed correct error message, I get "doh!" - not particularly | > helpful for debugging | Is there some compile option to automatically annotate error call with | its source | code location (so that one dos not need to mention it in the string | argument)? | | Peter. | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Peter Hercek

10:27 a.m.

I was looking for something which works in optimized builds too. I know I could do it with preprocessor or (I think) template haskell too but these tools seem to heavy for such a simple goal. The point is the exhaustiveness check saves me from some errors sometimes, but I often need to switch it off for a specific case statement too. Adding an catch all alternative and an error call would be cool way to do it if there is a way to automatically add source code location. This way I could get the best error telling me where it happend and also why I thought the other alternatives should not happen (the error call argument). Thanks, Peter. Simon Peyton-Jones wrote:

...

Yes indeed http://www.haskell.org/ghc/docs/latest/html/users_guide/assertions.html

Simon

| -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Peter Hercek | Sent: 18 May 2009 10:46 | To: glasgow-haskell-users@haskell.org | Subject: Re: Should exhaustiveness testing be on by default? | | Neil Mitchell wrote: | > I'm not a particular fan of exhaustiveness checking. It just | > encourages people to write: | > | > foo (Just 1) [x:xs] = important case | > foo _ _ = error "doh!" | > | > So now when the program crashes, instead of getting a precise and | > guaranteed correct error message, I get "doh!" - not particularly | > helpful for debugging | Is there some compile option to automatically annotate error call with | its source | code location (so that one dos not need to mention it in the string | argument)? | | Peter. | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Claus Reinke

12:33 p.m.

...

I was looking for something which works in optimized builds too.

{-# OPTIONS_GHC -fno-ignore-asserts #-} overrides the -O default setting -fignore-asserts.

...

I know I could do it with preprocessor or (I think) template haskell too but these tools seem to heavy for such a simple goal.

Given how long http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack has been under discussion, it is probably time to provide a short-term workaround in GHC, just a token to be replaced by the current source location. Then assert could be redefined in terms of it, so GHC wouldn't be any more complicated than it is, and users would have access to the same functionality for their uses, while the more useful variations are still being discussed. Below is another hacked-up version, this time using quasiquoting to generate a piece of code that will trigger an error with source location, which will only be forced when we need the source location info:-) It is reasonably easy to use (though one should trim the part of the error message one is not interested in, and I don't like that I can't simply call error in 'f', because that would trigger the nested error before printing 'f's message), but a standard solution in the libraries would be a lot better. Claus ------------------------------------------ {-# LANGUAGE QuasiQuotes #-} {-# OPTIONS_GHC -fno-ignore-asserts #-} import Control.Exception(assert) import SrcLocQQ import Debug.Trace f srcloc Nothing = "okay" f srcloc _ = trace "error: f applied to not-Nothing in: " srcloc main = do print $ f [$srcloc||] Nothing print $ f [$srcloc||] (Just ()) print $ assert False True ------------------------------------------ {-# LANGUAGE TemplateHaskell #-} module SrcLocQQ where import Language.Haskell.TH.Quote import Language.Haskell.TH srcloc = QuasiQuoter (\_->return $ CaseE (ConE 'True) [Match (ConP 'False []) (NormalB (LitE (StringL "srcloc"))) []]) (error "pattern srclocs not supported") ------------------------------------------ $ ghc -e main srcloc.hs "okay" error: f applied to not-Nothing in: <interactive>: srcloc.hs:13:12-22: Non-exhaustive patterns in case

Peter Hercek

2:03 p.m.

Claus Reinke wrote:

...

Given how long http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack has been under discussion, it is probably time to provide a short-term workaround in GHC, just a token to be replaced by the current source location.

This would be the best solution. Although -fno-ignore-asserts is acceptable till I do not need asserts for what they are actually supposed to be used for. The second solution requires QuasiQuotes, so I do not know. If I would want to compile with a different compiler it would break. If srcloc can be defined as a simple token (not requiring special extensions at places where it is used) then I could define it to an empty string in some low level module if trying to compile with a different haskell compiler which does not know srcloc. Thanks for the tips, Peter.

Claus Reinke

3:18 p.m.

...

The second solution requires QuasiQuotes, so I do not know. If I would want to compile with a different compiler it would break. If srcloc can be defined as a simple token (not requiring special extensions at places where it is used) then I could define it to an empty string in some low level module if trying to compile with a different haskell compiler which does not know srcloc.

You can do better than that, if you combine the QuasiQuotes hack with the CPP hack (I've also simplified the srcloc handling by adding a version of error that adds source location info, moving the exception manipulation out into SrcLocQQ, avoiding the need for Debug.Trace alltogether). The portable version does get a bit uglier because you need macros, not functions (you'll probably want to check for GHC version or -better, but not supported- QuasiQuotes availability). Also, CPP only gives you the line number, not the position, but that is better than nothing, and often sufficient. Still, it would be much nicer if GHC inserted the location info at the call sites if a pragma at the definition site asked it to do so. Then this {-# SRCLOC f #-} f Nothing = "okay" f _ = error "f applied to not-Nothing in: " could be equivalent to the code below, without QuasiQuotes or CPP or ERRORSRC all over the place. But such niceties are on hold while the discussion of even nicer help is ongoing.. (which is partly justified because we cannot easily build nicer abstractions over a barebones solution, due to the macro vs function issue, so the design does need thought). Perhaps the code below is sufficient as an interim workaround. Claus ----------------------------- {-# LANGUAGE CPP #-} {-# LANGUAGE QuasiQuotes #-} #ifdef __GLASGOW_HASKELL__ #define SRCLOC [$srcloc||] #define ERRORSRC [$errorSrc||] #else #define SRCLOC (show (__FILE__,__LINE__)) #define ERRORSRC (\msg->error $ msg++SRCLOC) #endif import SrcLocQQ f errorSrc Nothing = "okay" f errorSrc _ = errorSrc "f applied to not-Nothing in: " main = do print $ f ERRORSRC Nothing print $ f ERRORSRC (Just ()) print $ SRCLOC ----------------------------- {-# LANGUAGE TemplateHaskell #-} module SrcLocQQ where import Language.Haskell.TH.Quote import Language.Haskell.TH import Control.Exception srcloc = QuasiQuoter (\_->[| mapException (\(PatternMatchFail fail)-> let srcloc = reverse (dropWhile (/=':') (reverse fail)) in PatternMatchFail srcloc) $ case True of False -> "srcloc" |]) (error "pattern srclocs not supported") errorSrc = QuasiQuoter (\_->[| \msg->mapException (\(PatternMatchFail fail)-> let srcloc = reverse (dropWhile (/=':') (reverse fail)) in PatternMatchFail (msg++srcloc)) $ case True of False -> "srcloc" |]) (error "pattern srclocs not supported") -----------------------------

Peter Hercek

21 May 21 May

1:15 p.m.

Ok, I went with the preprocessor solution only. It is simple, stupid and works well enough ... and template haskell alternative needs it anyway not to be too unportable. Both template haskell alternatives reported "Pattern match(es) are non-exhaustive" of their own. The second alternative moreover needs a change of '$ case True of False -> "srcloc"' to '$ case True of False -> undefined' to be usable. The warning problem is critical by its own since the goal of using it is to selectively disable the very same warning for a specific case statement. Although the warning can be eliminated probably in the first template haskell alternative. Not sure since I do not know template haskell. As well as I still do not know how to write a haskell function in C-- which is the reason there is no :next command in ghci yet :) Thanks, Peter. Claus Reinke wrote:

...

...
The second solution requires QuasiQuotes, so I do not know. If I would want to compile with a different compiler it would break. If srcloc can be defined as a simple token (not requiring special extensions at places where it is used) then I could define it to an empty string in some low level module if trying to compile with a different haskell compiler which does not know srcloc.

You can do better than that, if you combine the QuasiQuotes hack with the CPP hack (I've also simplified the srcloc handling by adding a version of error that adds source location info, moving the exception manipulation out into SrcLocQQ, avoiding the need for Debug.Trace alltogether). The portable version does get a bit uglier because you need macros, not functions (you'll probably want to check for GHC version or -better, but not supported- QuasiQuotes availability). Also, CPP only gives you the line number, not the position, but that is better than nothing, and often sufficient.

Still, it would be much nicer if GHC inserted the location info at the call sites if a pragma at the definition site asked it to do so. Then this

{-# SRCLOC f #-} f Nothing = "okay" f _ = error "f applied to not-Nothing in: "

could be equivalent to the code below, without QuasiQuotes or CPP or ERRORSRC all over the place. But such niceties are on hold while the discussion of even nicer help is ongoing.. (which is partly justified because we cannot easily build nicer abstractions over a barebones solution, due to the macro vs function issue, so the design does need thought). Perhaps the code below is sufficient as an interim workaround.

Claus

----------------------------- {-# LANGUAGE CPP #-} {-# LANGUAGE QuasiQuotes #-}

#ifdef __GLASGOW_HASKELL__ #define SRCLOC [$srcloc||] #define ERRORSRC [$errorSrc||] #else #define SRCLOC (show (__FILE__,__LINE__)) #define ERRORSRC (\msg->error $ msg++SRCLOC) #endif

import SrcLocQQ

f errorSrc Nothing = "okay" f errorSrc _ = errorSrc "f applied to not-Nothing in: "

main = do print $ f ERRORSRC Nothing print $ f ERRORSRC (Just ()) print $ SRCLOC

----------------------------- {-# LANGUAGE TemplateHaskell #-} module SrcLocQQ where import Language.Haskell.TH.Quote import Language.Haskell.TH import Control.Exception

srcloc = QuasiQuoter (\_->[| mapException (\(PatternMatchFail fail)-> let srcloc = reverse (dropWhile (/=':') (reverse fail)) in PatternMatchFail srcloc) $ case True of False -> "srcloc" |]) (error "pattern srclocs not supported") errorSrc = QuasiQuoter (\_->[| \msg->mapException (\(PatternMatchFail fail)-> let srcloc = reverse (dropWhile (/=':') (reverse fail)) in PatternMatchFail (msg++srcloc)) $ case True of False -> "srcloc" |]) (error "pattern srclocs not supported") -----------------------------

John Meacham

23 May 23 May

12:05 a.m.

JHC has had this for a while, but it calls the pragma 'SRCLOC_ANNOTATE'. It is actually mentioned on this page: http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack I am not entirely sure whether I will keep the current syntax (it doesn't really make sense for operators), but if ghc implements something related, then it would be good to be compatible. John -- John Meacham - ⑆repetae.net⑆john⑈ - http://notanumber.net/

Claus Reinke

1:53 p.m.

...

JHC has had this for a while, but it calls the pragma 'SRCLOC_ANNOTATE'.

It is actually mentioned on this page: http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack

Yes, I know, but the discussion on that page wanted to go beyond this (possibly triggered by your demonstration that one can go beyond plain SRCLOC;-), which is why GHC still has neither of SRCLOC or SRCLOC_ANNOTATE (apart from the various SRCLOC hacks). What is really frustrating is that GHC has the machinery to do this trivially (RULES, soon core2core phase plugins as well), only that this machinery gets applied when source location information is no longer available, so it is useless for this problem:-( One thing that wasn't available when this discussion was last active is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions' would be useful). For instance, appended below is the example from that wiki page, with entirely local transformations to add source locations and to use that info to augment 'ErrorCall' exceptions (we should really augment 'PatternMatchFail' exception messages as well..). $ ghc -e main callstack.hs <interactive>: hd: empty list ("callstack.hs",25) ("callstack.hs",21) ("callstack.hs",16) ("callstack.hs",13) JHC could probably easily adapt its SRCLOC_ANNOTATE scheme to use a mapException-based scheme instead?-) So one could use the original code, and just add {-# SRCLOC_mapException e, f, hd #-} With GHC, I'm always tempted to write something like {-# RULES "hd->loc(hd)" hd = mapError SRCLOC . hd #-} but that uses the source location of the rule and, worse, when the rule is actually applied, 'hd's source location info is no longer available, so one cannot simply augment the 'RULES' mechanism to supply the source location of its main left-hand side symbol.. Claus ------------------------------ {-# LANGUAGE CPP #-} import Control.Exception #define SRCLOC (show (__FILE__,__LINE__)) #define ERRORSRC (\msg->error $ msg++SRCLOC) mapError src = mapException (\(ErrorCall e)->ErrorCall $ e++"\n"++src) errorSrc src = error . (++"\n"++src) main = print d d :: Int d = mapError SRCLOC (e []) {- line 13 -} e :: [Int] -> Int e = mapError SRCLOC . f 10 {- line 16 -} f :: Int -> [Int] -> Int f = \x -> case fac x < 10 of True -> \_ -> 3 False -> mapError SRCLOC . hd {- line 21 -} hd :: [a] -> a hd = \x -> case x of [] -> errorSrc SRCLOC "hd: empty list" {- line 25 -} (x:_) -> x fac :: Int -> Int fac = \n -> case n == 0 of True -> 1 False -> n * fac (n - 1)

Claus Reinke

10:34 p.m.

New subject: error stack traces/source locations (was: Should exhaustiveness testing be on by default?)

...

What is really frustrating is that GHC has the machinery to do this trivially (RULES, soon core2core phase plugins as well), only that this machinery gets applied when source location information is no longer available, so it is useless for this problem:-(

I'd be happy to be proven wrong in this, of course!-) I had the feeling that there'd been some recent work on all this and, indeed, reddit has: http://www.reddit.com/r/haskell/comments/8mbar/finding_the_needle_stack_trac... Right on topic, and I assume the authors are on this list!-) It appears that the bulk of the paper is concerned with performance improvements - if we just want to recreate the user view, the mapException approach would appear to be a lot simpler (perhaps the paper could discuss it as a poor man's alternative, in particular since its source rewrites seem less intrusive?). Of course, having any support for this in GHC will be an improvement. However, it is interesting that the paper suggests an implementation using a core2core transformation, *with access to source location l*: [|x_l|]s = x deb (push l s), if x has (Debugged ’x deb) ann (Figure 1, page 4) Does that mean it would be possible to add an extension to RULES that would allow things like: {-# RULES "f->f_" forall x. f{SRCLOC} x = mapError SRCLOC (f_ x) #-} ie, referring to the source location of 'f' in the right hand side? That would seem to be the smallest possible change permitting a user-level implementation of error stack traces. And it might be more generally useful as well. Attached is another example source, again doing the expansion by hand ( (f x) -> mapError SRCLOC (f x); error ".." -> errorSrc SRCLOC "..") and relying on mapException to collect the traces, with a few recursive examples suggested by the paper, and three variations of eliding recursive call sites from the stack traces (note that this only elides when presenting the trace, the real implementation from the paper elides when constructing the trace). Claus ------------------------------------ example output: $ ghc -e main stacktraces.hs -DPLAIN -- fib 5 fib with non-positive number: 0 ("stacktraces.hs",46) ("stacktraces.hs",45) ("stacktraces.hs",45) ("stacktraces.hs",45) ("stacktraces.hs",45) ("stacktraces.hs",36) -- odd_ 5 odd_: no match ("stacktraces.hs",51) ("stacktraces.hs",54) ("stacktraces.hs",50) ("stacktraces.hs",54) ("stacktraces.hs",50) ("stacktraces.hs",38) -- firstLetters2 Prelude.head: empty list ("stacktraces.hs",58) ("stacktraces.hs",62) ("stacktraces.hs",62) ("stacktraces.hs",62) $ ghc -e main stacktraces.hs -- fib 5 fib with non-positive number: 0 ("stacktraces.hs",46) ... ("stacktraces.hs",45) ("stacktraces.hs",36) -- odd_ 5 odd_: no match ("stacktraces.hs",51) ... ("stacktraces.hs",54) ("stacktraces.hs",50) ("stacktraces.hs",38) -- firstLetters2 Prelude.head: empty list ("stacktraces.hs",58) ... ("stacktraces.hs",62)

Isaac Dupree

11:11 p.m.

New subject: error stack traces/source locations

Claus Reinke wrote:

...

the mapException approach

I'm afraid it won't work as you hope for functions that return lazy data structures. The 'evaluate' implicit in mapException only catches top-level errors/exceptions, like 'seq' and not like 'deepSeq'. -Isaac

Claus Reinke

24 May 24 May

9:49 a.m.

New subject: error stack traces/source locations

...

...
the mapException approach

I'm afraid it won't work as you hope for functions that return lazy data structures. The 'evaluate' implicit in mapException only catches top-level errors/exceptions, like 'seq' and not like 'deepSeq'.

Have a look at 'firstLetters2' in the example I provided. Perhaps it is easier to see if you change the 'map_' to 'map', or at least put the definition on two lines (since the CPP-based SRCLOC doesn't include columns), but it is there in any case. The initial call to 'map_' is not part of the stack trace, nor are any of the recursive calls. Nor should they be, I think - if 'map' delivers as much as '[]' or 'undefined : undefined' without raising an error, it has done its job. If evaluating any part of the non-strict data structure runs into trouble, the inspection, not the generation, is to blame. That may be inconvenient when one actually wants to see where the parts of the data structure came from, but there are established techniques for forcing strictness, or one can annotate data with producer info (in ways similar to what Hood does: make sure the producer info is there if the part is ever inspected, but don't actually force the part). Here is a slightly rewritten 'firstLetters2', to illustrate: firstLetters2 = mapError SRCLOC $ -- 58 map_ (mapError SRCLOC . head) ["hi", "world", "", "!"] -- 59 map_ f [] = [] map_ f (h:t) = f1 h : mapError SRCLOC (map_ f2 t) -- 62 where f1 = mapError SRCLOC . f -- 63 f2 = mapError SRCLOC . f -- 64 and here is the relevant part of the output (using -DPLAIN): -- firstLetters2 Prelude.head: empty list ("stacktraces.hs",59) ("stacktraces.hs",64) ("stacktraces.hs",64) ("stacktraces.hs",63) Neither 58 nor 62 appear in the trace (there is nothing wrong with 'map_'). The appearance of 63 and 64 in the trace gives one simple example of how information can be added without prematurely forcing evaluation (eg, 'head firstLetters2' gives no trace at all). Claus

Simon Peyton-Jones

26 May 26 May

1:58 p.m.

New subject: error stack traces/source locations (was: Should exhaustiveness testing be on by default?)

| I'd be happy to be proven wrong in this, of course!-) I had the feeling | that there'd been some recent work on all this and, indeed, reddit has: | | http://www.reddit.com/r/haskell/comments/8mbar/finding_the_needle_stack_trac... | hc_pdf/ Sorry I've been very buried recently. Some brief rejoinders to this thread. - Yes, Tristan's paper "Finding the needle" http://research.microsoft.com/~simonpj/papers/stack-trace/DebugTraces.pdf describes his intern work, and represents a good stab at the problem. I think he'd be very interested in feedback. Is this the best way to tackle the problem? Are there better alternatives? - It's not in GHC HEAD, because of the unresolved issues that are discussed towards the end of the paper. - Template Haskell does let you find the current source location, thus: loc :: Q Exp loc = do { l <- location ; lift (loc_module l ++ ":" ++ show (loc_start l)) } Now you can say (error $(loc)), and you'll get the source location of the call to $loc. So that's pretty close to what was originally asked for. Simon

Simon Marlow

28 May 28 May

11:07 a.m.

On 23/05/2009 14:53, Claus Reinke wrote:

...

...
JHC has had this for a while, but it calls the pragma 'SRCLOC_ANNOTATE'. It is actually mentioned on this page: http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack

Yes, I know, but the discussion on that page wanted to go beyond this (possibly triggered by your demonstration that one can go beyond plain SRCLOC;-), which is why GHC still has neither of SRCLOC or SRCLOC_ANNOTATE (apart from the various SRCLOC hacks).

What is really frustrating is that GHC has the machinery to do this trivially (RULES, soon core2core phase plugins as well), only that this machinery gets applied when source location information is no longer available, so it is useless for this problem:-(

One thing that wasn't available when this discussion was last active is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions' would be useful). For instance, appended below is the example from that wiki page, with entirely local transformations to add source locations and to use that info to augment 'ErrorCall' exceptions (we should really augment 'PatternMatchFail' exception messages as well..).

$ ghc -e main callstack.hs <interactive>: hd: empty list ("callstack.hs",25) ("callstack.hs",21) ("callstack.hs",16) ("callstack.hs",13)

Your example is giving you the dynamic call stack. Is that really what you want? The dynamic call stack will often be quite different from the structure of your program, may well be surprising, and may even differ depending on compiler flags. My personal preference would be to fix cost center stacks to do the right thing here. Currently we have +RTS -xc which shows the call stack when an exception is raised, available when your program is profiled. Sometimes it gives odd results because it has bugs, but the idea is that it gives you a *lexical* call stack, which corresponds exactly to the code that you wrote, not the unpredictable evaluation strategy of the compiler. Obviously the down side of CCSs is that you have to compile your code and libraries for profiling, but OTOH you don't have to add any mapExceptions, pragmas, or other explicit annotation stuff. And it could be added to GHCi by default, so when working in GHCi you could have full lexical call stacks for all interpreted code. Cheers, Simon

Claus Reinke

2:09 p.m.

...

...
One thing that wasn't available when this discussion was last active is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions' would be useful). For instance, appended below is the example from that wiki page, with entirely local transformations to add source locations and to use that info to augment 'ErrorCall' exceptions (we should really augment 'PatternMatchFail' exception messages as well..).

$ ghc -e main callstack.hs <interactive>: hd: empty list ("callstack.hs",25) ("callstack.hs",21) ("callstack.hs",16) ("callstack.hs",13)

Your example is giving you the dynamic call stack. Is that really what you want? The dynamic call stack will often be quite different from the structure of your program, may well be surprising, and may even differ depending on compiler flags.

Given the source locations, the lexical _position_ is obvious, so for mere traces, dynamic seems to be the choice (with an option of pseudo-cbv or the real dynamic stack). What is neither obvious nor provided (dynamic or static, in any of the proposed stack traces) are the parameters for the stack. If those are available, as in a debugger, the balance might shift to favour lexical stack, especially if I'm investigating "what is there?", rather than "who asked for this, and why?", "where am I?" or "how did I get there?". Both static and dynamic stack can be useful - different queries, different answers. And without automated annotation support, it is difficult to test how useful either stack traces are (apart from: better than nothing;-). I have the feeling that dynamic stack trace is the right _initial_ answer: if that induces a need to know about the construction environment, one can annotate construction Hood-style (preferred), without changing strictness, or make it strict (not useful if there are many constructions and only one failing access), so that construction and first observation coincide, or pretend to have made it strict while evaluating non-strictly (the pseudo-cbv stack traces). Or one could switch tools, from dynamic to a lexical stack, and from mere stack trace to debugger (where I would like good lexical and dynamic stacks with parameters:-).

...

My personal preference would be to fix cost center stacks to do the right thing here. Currently we have +RTS -xc which shows the call stack when an exception is raised, available when your program is profiled. Sometimes it gives odd results because it has bugs, but the idea is that it gives you a *lexical* call stack, which corresponds exactly to the code that you wrote, not the unpredictable evaluation strategy of the compiler.

The more information one can get, the more queries about one's program one can answer, the more puzzles/bugs one can solve. So, by all means, lets have lexical stack, pseudo-call-by-value stack, and dynamic stack choices available. And lexical stack with parameters in the debugger. Here are the +RTS -xc and mapException outputs together (when I remove the mapError annotations, only the first <..> is printed, so that is the part to focus on, the rest is confusion) - they seem to complement each other (-xc has no locations, but names for the lexical stack; mapError has no names, but locations for the dynamic stack; we're still missing the parameters for either stack): $ ghc --make -prof -auto-all stacktraces.hs [1 of 1] Compiling Main ( stacktraces.hs, stacktraces.o ) Linking stacktraces.exe ... $ ./stacktraces.exe +RTS -xc -- fib 5

...

fib with non-positive number: 0 ("stacktraces.hs",46) ... ("stacktraces.hs",45) ("stacktraces.hs",36)

-- odd_ 5 odd_: no match ("stacktraces.hs",51) ... ("stacktraces.hs",54) ("stacktraces.hs",50) ("stacktraces.hs",38) -- firstLetters2 Prelude.head: empty list ("stacktraces.hs",59) ... ("stacktraces.hs",64) ("stacktraces.hs",63)

...

Obviously the down side of CCSs is that you have to compile your code and libraries for profiling, but OTOH you don't have to add any mapExceptions, pragmas, or other explicit annotation stuff. And it could be added to GHCi by default, so when working in GHCi you could have full lexical call stacks for all interpreted code.

If RULES right hand sides could be given access to the source locations of the code that their left hand sides have matched, the annotation could be automated, and would seem to be less intrusive than the finding-the- needle transformation. Also, this seems to be the smallest possible change to GHC (I just don't know whether it is possible, or how to do it). But, as I said, the more info, the better. And getting some info without annotations would be great (currently, +RTC -sc is no use with GHCi, and even with printE replaced with print, -fbreak-on-error, and :trace main, the GHCi debugger doesn't seem to offer much help for this example). Claus

Claus Reinke

6:05 p.m.

...

Here are the +RTS -xc and mapException outputs together (when I remove the mapError annotations, only the first <..> is printed, so that is the part to focus on, the rest is confusion)

Actually, it looks as if the implementation of mapException modifies the stack that +RTS -xc prints. Not entirely surprising, perhaps. We can use that to illustrate the RULES for call site annotation suggestion, by rewriting call sites of functions to be traced to prefix 'mapException id' (see attached source). That won't change the error message, because the RULES don't have the source location we'd need there, but it will put a stack frame with lexical stack info onto the stack. On error, +RTS -xc prints the stack frames, corresponding to a dynamic stack trace, each stack frame giving a lexical stack for an annotated call site. 1) no annotations, no +RTS -xc: $ ghc -e main stacktracesXcHack.hs -- fib 5 fib with non-positive number: 0 -- odd_ 5 odd_: no match -- firstLetters2 Prelude.head: empty list 2) no annotations, +RTS -xc only: $ ghc --make -prof -auto-all stacktracesXcHack.hs -DHACK [1 of 1] Compiling Main ( stacktracesXcHack.hs, stacktracesXcHack.o ) Linking stacktracesXcHack.exe ... $ ./stacktracesXcHack.exe +RTS -xc -- fib 5 fib with non-positive number: 0 -- odd_ 5 odd_: no match -- firstLetters2 Prelude.head: empty list 3) RULES-based call site annotations, +RTS -xc (reformatted for readability): $ ghc --make -prof -auto-all stacktracesXcHack.hs -DHACK -fenable-rewrite-rules [1 of 1] Compiling Main ( stacktracesXcHack.hs, stacktracesXcHack.o ) Linking stacktracesXcHack.exe ... $ ./stacktracesXcHack.exe +RTS -xc -- fib 5 fib with non-positive number: 0 -- odd_ 5 odd_: no match -- firstLetters2 Prelude.head: empty list Would be much better if RULES could provide source location info from their application sites, or if at least +RTS -xc would contain source locations (shouldn't the parameters belonging to each stack frame be accessible, too?-). And is there an easier way to specify apply-once RULES than this double naming business? But perhaps it help to illustrate the potential, both for combining dynamic and lexical stack, and for using RULES for call site annotations.. Claus

Simon Marlow

2 Jun 2 Jun

11:13 a.m.

On 28/05/2009 15:09, Claus Reinke wrote:

...

...
...
One thing that wasn't available when this discussion was last active is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions' would be useful). For instance, appended below is the example from that wiki page, with entirely local transformations to add source locations and to use that info to augment 'ErrorCall' exceptions (we should really augment 'PatternMatchFail' exception messages as well..).

$ ghc -e main callstack.hs <interactive>: hd: empty list ("callstack.hs",25) ("callstack.hs",21) ("callstack.hs",16) ("callstack.hs",13)

Your example is giving you the dynamic call stack. Is that really what you want? The dynamic call stack will often be quite different from the structure of your program, may well be surprising, and may even differ depending on compiler flags.

Given the source locations, the lexical _position_ is obvious,

You mean the call site? Yes, most methods will point you to the call site, since the top element of the dynamic stack is often the same as the top element of the dynamic stack (but not always). The call site on its own is often not enough, however - it's just one level of the stack.

...

so for mere traces, dynamic seems to be the choice (with an option of pseudo-cbv or the real dynamic stack).

I don't know what pseudo-cbv is. And I claim the dynamic stack is almost never what you want. Ok, so there's one place you might want to see the dynamic stack: when catching exceptions raised by pure code. Then it really makes a difference whether something is evaluated or not, and you need a handle on the demand context in order to find out why an exception was raised. On the other hand, I think this case is relatively rare compared to cases of the form "finding out why my program called head []".

...

What is neither obvious nor provided (dynamic or static, in any of the proposed stack traces) are the parameters for the stack. If those are available, as in a debugger, the balance might shift to favour lexical stack, especially if I'm investigating "what is there?", rather than "who asked for this, and why?", "where am I?" or "how did I get there?".

Sure, providing access to free variables is certainly desirable, but I think it's an orthogonal issue. There are engineering difficulties, such as introduction of space leaks.

...

Here are the +RTS -xc and mapException outputs together (when I remove the mapError annotations, only the first <..> is printed, so that is the part to focus on, the rest is confusion) - they seem to complement each other (-xc has no locations, but names for the lexical stack; mapError has no names, but locations for the dynamic stack; we're still missing the parameters for either stack):

I'm not claiming that +RTS -xc in its present form is what you want. I'm interested in finding an underlying mechanism that allows the right information to be obtained; things like source locations and free variables are just decoration. Now, most of the existing methods have problems with CAFs. I doubt that the problems with CAFs are fixable using the source-to-source transformation methods, but I think they might be fixable using cost-centre stacks. Cheers, Simon

Peter Hercek

7:34 p.m.

Simon Marlow wrote:

...

On 28/05/2009 15:09, Claus Reinke wrote:

...
so for mere traces, dynamic seems to be the choice (with an option of pseudo-cbv or the real dynamic stack).

I don't know what pseudo-cbv is. And I claim the dynamic stack is almost never what you want.

Ok, so there's one place you might want to see the dynamic stack: when catching exceptions raised by pure code.

Would it not help also when finding out why some code is not as lazy as it is hoped for? Now, I do not know how often this problem happens, I did not have it yet, but it looks like it would help. I remember I also wanted it when I was trying to understand how uulib works. I would expect it to be useful anytime laziness is critical for efficient program execution. If the stacks do not come with variables in the scope available then both are useful about the same from my rather unexperienced point of view. Anyway, after I'd learned to use GhciExt (thank you both for helping me out with it), the next command became more important to me than the stack. That is for my code, if/when I get to uulib again I may change my mind quickly :-D Peter.

Claus Reinke

7 Jun 7 Jun

11:23 a.m.

...

...
...
...
One thing that wasn't available when this discussion was last active is 'mapException' (btw, similar to 'catch'/'catches', a 'mapExceptions' would be useful).

...

...
so for mere traces, dynamic seems to be the choice (with an option of pseudo-cbv or the real dynamic stack).

I don't know what pseudo-cbv is.

oops, forward reference - explained later in the same email: pretend that evaluation is call-by-value and produce stack for that (instead of the actuall call-by-need stack which can be confusing).

...

And I claim the dynamic stack is almost never what you want.

Maybe so - I'd be interested in an example on which this claim is based. Perhaps I've been misunderstanding what you mean by "lexical stack"? "lexical" to me implies only scope information, nothing related to run time call chains, which would be "dynamic". In the "dynamic" case, one can then distinguish between call-by-need stack (what actually happens in GHC) and call-by-value stack (pretend that everything is strict). What the cost-centre stack delivers appears to be more than scopes, and less than a full static call graph (which would have to include non deterministic branches, since the actual choice of branches depends on runtime information) - it seems to use runtime information to give a slice of the full call graph (eg, not all call sites that could call the current function, but only the one that did so in the current run)?

...

...
Here are the +RTS -xc and mapException outputs together (.. - they seem to complement each other (-xc has no locations, but names for the lexical stack; mapError has no names, but locations for the dynamic stack; we're still missing the parameters for either stack):

I'm not claiming that +RTS -xc in its present form is what you want. I'm interested in finding an underlying mechanism that allows the right information to be obtained; things like source locations and free variables are just decoration.

And I'm saying that adding mapException annotations is a way to get there, with very little extra effort (just the get-the-source-location part of the finding-the-needle transformation would be sufficient).

...

Now, most of the existing methods have problems with CAFs. I doubt that the problems with CAFs are fixable using the source-to-source transformation methods, but I think they might be fixable using cost-centre stacks.

One of the nice things about my suggestion to mix an "annotate with mapException" transformation with cost-centre stacks is that it would cover CAFs as well. As another example, I tried the nofib-buggy:anna test case discussed at http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack/StackTraceExperie... which does have just such a CAF problem (the error is caused in a local CAF, and raised in a standard library CAF), amongst other nasty features (no CPP/sed only transformation will handle infix applications, the initial error message doesn't even tell us where to start annotating). Starting from the cost-centre approach (which fails on this example), I proceeded as follows: - the initial error is "Main: divide by zero" - this doesn't tell us where the issue is, so we have to annotate all calls to 'div', which we do by wrapping in 'mapException', adding some location information to any exceptions raised (*) - the changes are $ darcs wh hunk ./real/anna/SmallerLattice.hs 229 - = let diff = ((n2 * 10) `div` n1) - scaleup_ratio - scaleFact = n2 `div` 10 + = let diff = mapError "local_cost:diff" $ ((n2 * 10) `div` n1) - scaleup_ratio + scaleFact = mapError "local_cost:scaleFact" $ n2 `div` 10 hunk ./real/anna/Utils.hs 38 +import Control.Exception hunk ./real/anna/Utils.hs 105 - = let k = s1 `div` 53668 + = let k = mapError "rands:k" $ s1 `div` 53668 hunk ./real/anna/Utils.hs 109 - k' = s2 `div` (s1' - s1') + k' = mapError "rands:k'" $ s2 `div` (s1' - s1') hunk ./real/anna/Utils.hs 635 + +mapError src = mapException (\(SomeException e)->ErrorCall $ show e++"\n"++src) +errorSrc src = error . (++"\n"++src) - I added these annotations by hand - any automated transformation could easily be more precise about the source location messages - output of annotated version (reformatted, as for the others on that page): Main.exe: divide by zero rands:k' This appears to be in the same ballpark as what the more complex transformations (hat, finding-the-needle) deliver. Claus (*) I took a shortcut here, by mapping all exceptions to 'ErrorCalls'; if the original code contained exception handlers, one would have to be more careful, generically extending all exceptions without changing their types; but this has to be done only once, and the extended 'mapError' would go into the standard debug libraries.

Simon Marlow

2:32 p.m.

Claus Reinke wrote:

...

Perhaps I've been misunderstanding what you mean by "lexical stack"? "lexical" to me implies only scope information, nothing related to run time call chains, which would be "dynamic". In the "dynamic" case, one can then distinguish between call-by-need stack (what actually happens in GHC) and call-by-value stack (pretend that everything is strict).

Ah, ok. Terminology mismatch. My "lexical call stack" and your "pseudo-cbv" are almost the same thing, I think. The way a cost-centre stack is built is described in the docs: http://www.haskell.org/ghc/docs/latest/html/users_guide/profiling.html#prof-...

...

What the cost-centre stack delivers appears to be more than scopes, and less than a full static call graph (which would have to include non deterministic branches, since the actual choice of branches depends on runtime information) - it seems to use runtime information to give a slice of the full call graph (eg, not all call sites that could call the current function, but only the one that did so in the current run)?

I'm not sure what you mean here. e.g. "non-deterministic branches"? Obviously the shape of the call stack depends upon values at runtime.

...

...
...
Here are the +RTS -xc and mapException outputs together (.. - they seem to complement each other (-xc has no locations, but names for the lexical stack; mapError has no names, but locations for the dynamic stack; we're still missing the parameters for either stack):

I'm not claiming that +RTS -xc in its present form is what you want. I'm interested in finding an underlying mechanism that allows the right information to be obtained; things like source locations and free variables are just decoration.

And I'm saying that adding mapException annotations is a way to get there, with very little extra effort (just the get-the-source-location part of the finding-the-needle transformation would be sufficient).

mapException exposes information about the call-by-need stack, which is not what you want (I claim).

...

...
Now, most of the existing methods have problems with CAFs. I doubt that the problems with CAFs are fixable using the source-to-source transformation methods, but I think they might be fixable using cost-centre stacks.

One of the nice things about my suggestion to mix an "annotate with mapException" transformation with cost-centre stacks is that it would cover CAFs as well. As another example, I tried the nofib-buggy:anna test case discussed at http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack/StackTraceExperie...

which does have just such a CAF problem (the error is caused in a local CAF, and raised in a standard library CAF), amongst other nasty features (no CPP/sed only transformation will handle infix applications, the initial error message doesn't even tell us where to start annotating).

The CAF problem I'm referring to is a bit different - the goal is to get a good stack trace without affecting performance by more than a constant factor. i.e. CAFs have to be evaluated no more than once, even when doing stack tracing. This turns out to be quite hard, especially when using a source-to-source transformation. The CCS implementation currently errs on the side of not giving you much information, but without re-evaluating CAFs. Hence you get a not-very-helpful call stack. However, it wouldn't be difficult to report the call stack from the site that first evaluated the CAF.

...

- the initial error is "Main: divide by zero" - this doesn't tell us where the issue is, so we have to annotate all calls to 'div', which we do by wrapping in 'mapException',

So yes, you can use mapException to get the dynamic call stack. I'm not keen on this approach though, because I think the dynamic call stack is not what you want. Also, mapExceptionn is not helpful for doing profiling or displaying the call stack at a breakpoint, and I think a general call-stack mechanism should enable both of those. Cheers, Simon

Claus Reinke

18 May 18 May

11:06 a.m.

...

I'm not sure I'd want -Wall on by default (though being -Wall clean is very good). But exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds.

http://ocaml.janestreet.com/?q=node/64

Ron's also wondering why exhaustive pattern checking isn't on ?

Anyone know why it isn't the default?

The answer to the latter is probably just that it is imprecise. Don't read on if that's all you wanted to hear;-) But before anyone goes suggesting to change the defaults, consider that it would be solving the wrong problem - exhaustiveness of Haskell matching is not decidable, so you end up with an approximation; if you want to enforce this particular approximation, you end up with "defensive programming" - defensive programming may look good in blame-driven development methods, but is actually bad for fault-tolerance (google for "Erlang defensive programming") - if you are serious about correctness, rather than the appearance of correctness, there are other approaches: - non-defensive programming with external recovery management is the "industry-standard" (well, in Erlang industries, at least;-); the idea being that trying to handle all cases everywhere makes for unreadable code that doesn't address the problem, while the combination of checking cases on entry/handling the expected case locally/handling unexpected cases non-locally has been demonstrated to work in high-reliability scenarios. - checking for non-crashing rather than syntactic coverage is slightly better, but will likely only demonstrate that there are cases where you can't put in a useful answer locally; you can add local information to the error before raising it (which would still be reported as a crash), but trying to handle a case locally (just to avoid the local crash) when you don't have the information to do so is just going to hide bugs (which is not what a coding style or warning should encourage); the way forward, if you believe Erlangers, is to embrace localised crashes (google for Erlang motto: "let it crash";-), and have the infrastructure in place to automatically add useful information to crash messages [1], to programatically deal with crashes (preferably in a separate process on an independent machine), and to factor out the recovery code from the business-as-usual code (which is why Erlang programs with good fault-tolerance characteristics are often shorter than non-Erlang programs without). The added benefit is that when a crash happens, in spite of serious efforts spent on proving/testing/model checking, the system doesn't just go "oops!". - if you want coverage to have value semantically, you need to restrict the expressiveness of matching. In Haskell/OCaml, that would probably mean extensible variants/records, or possibly GADTs, (and no guards), so that you get a type error when *using* a value that is not covered (instead of a syntax error when not covering a value that may not ever be used). In particular, if you really care about such things, you should check for the good cases once, then convert to types that do not have the bad cases. That still doesn't help you with the entry check, where the best option for the unexpected case might still be to raise an exception, but such a typing strategy makes exhaustiveness checking slightly more useful. In management brief: enforcing exhaustiveness testing without any other measures just ensures that your team members will never tell you about things they do not know how to handle, so the first sign of trouble you see is when everything comes down. Establishing a non-local framework for dealing with local non-exhaustiveness and encouraging team members to raise alarms when they see trouble gives your team a chance to notice and recover from issues before they become critical. Perhaps syntactic exhaustiveness testing should be renamed to certified-everything-swept-under-the-rug testing?-) Note: this is no excuse for using things like 'head' unguardedly in a large code base! Just that the solution is not to add an empty case to 'head', but to ensure that 'head' is never called with an empty list, so "exhaustiveness everywhere" is the wrong target. Adding an empty case to 'head' is to raise an alert with useful info, working around current weaknesses in the error-reporting infrastructure [1], after we've already run into an unhandled unexpected case. If we get a "non-exhaustive" warning, it might mean a variety of things (you might want to document things like 3/4 in the types): 1 nothing (error in exhaustiveness checking) 2 nothing (approximation in exhaustiveness checking) 3 nothing (missing cases are proven never to arrive here) 4 nothing (exceptional cases crash here, are dealt with properly elsewhere) 5 wrong type used (you shouldn't even try to make this match exhaustive, you should use a type that encodes only the cases you handle) beside the naive expectation - missing case (need to cover the case) These warnings need to be used with care, and with a conscious choice of strategy/framework, not by default. If you have a good strategy/framework in place, switch them on, even make them errors, but doing so by default would only encourage bad coding practices. Just another opinion (should I've put it on a blog instead?-) Claus "Did you want to talk about the weather or were you just making chit-chat?" Weather man, Groundhog Day [1] http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack

Simon Marlow

11:23 a.m.

On 18/05/2009 12:06, Claus Reinke wrote:

...

...
I'm not sure I'd want -Wall on by default (though being -Wall clean is very good). But exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds.

http://ocaml.janestreet.com/?q=node/64

Ron's also wondering why exhaustive pattern checking isn't on ?

Anyone know why it isn't the default?

The answer to the latter is probably just that it is imprecise. Don't read on if that's all you wanted to hear;-)

Two^HThree^HFour reasons really: 1. it's not very good (as Lennart pointed out) 1.1 it's not possible to make it accurate in general (as you point out) 1.1.1. it's not possible to disable it per-definition 2. not everyone wants it (as Neil pointed out) actually reason (2) is the guiding principle we use for whether a warning should be on by default or not. If there's a clear concensus for having a warning on, then we're more than happy to turn it on. Cheers, Simon

Norman Ramsey

8 p.m.

...

... exhaustive pattern checking might well help out a lot of people coming from untyped backgrounds...

Or even people from typed backgrounds. I worship at the altar of exhaustiveness checking.

...

Anyone know why it isn't the default?

I have been bleating to GHC Central about the generally low level of the default warnings. I even accused them of being tarred with the same brush as Richard Stallman! Their (reasonable) response was that they would rather wait for the user community to reach a consensus on exactly *which* warnings should be on by default. I myself like to compile with -Wall -fno-warn-name-shadowing, but I recognize that these are questions about which reasonable people could differ. I think if we could reach a consensus on the list, we might see a stronger level of warnings in 6.12. Norman P.S. The exhaustiveness checker does need improvement, and it is completely unaware of GADT's. My code is littered with pattern matches where the last case is foo _ = can't match where can't_match :: a can't_match = panic "the GADT pattern matcher is too stupid to live" I would really like to get rid of these. I hate wildcard matches, but I can't put in the constructors because they don't typecheck. And if I put in nothing, the exhaustiveness checker bleats. And I typically compile with -Werror.

Robert Greayer

19 May 19 May

2:59 a.m.

On Mon, May 18, 2009 at 4:00 PM, Norman Ramsey wrote:

...

P.S. The exhaustiveness checker does need improvement...

Is it documented somewhere what deficiencies the exhaustiveness checker has (where it can report problems that don't exist or fails to report problems that do...), and which deficiencies can't be resolved? Rob

Alexander Dunlap

3:56 a.m.

On Mon, May 18, 2009 at 7:59 PM, Robert Greayer wrote:

...

On Mon, May 18, 2009 at 4:00 PM, Norman Ramsey wrote:

...
P.S. The exhaustiveness checker does need improvement...

Is it documented somewhere what deficiencies the exhaustiveness checker has (where it can report problems that don't exist or fails to report problems that do...), and which deficiencies can't be resolved?

Rob

One problem is that it reports erroneous errors when combined with ViewPatterns, which certainly ought to be fixed if -fwarn-incomplete-patterns and -fwarn-overlapping-patterns are switched on by default. This isn't the bug others are referring to, though; I don't know where you'd find that information. Alex

Simon Peyton-Jones

7:37 a.m.

No, the shortcomings are not documented I'm afraid. It's a squishy question because when you add guards and view patterns it's undecidable whether patterns overlap or are exhaustive. Still, GHC's current implementation is poor. It's a well-contained project that is awaiting a competent implementor. see http://hackage.haskell.org/trac/ghc/wiki/ProjectSuggestions Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell- | users-bounces@haskell.org] On Behalf Of Robert Greayer | Sent: 19 May 2009 04:00 | Cc: glasgow-haskell-users@haskell.org | Subject: Re: Should exhaustiveness testing be on by default? | | On Mon, May 18, 2009 at 4:00 PM, Norman Ramsey wrote: | > P.S. The exhaustiveness checker does need improvement... | | Is it documented somewhere what deficiencies the exhaustiveness | checker has (where it can report problems that don't exist or fails to | report problems that do...), and which deficiencies can't be resolved? | | | Rob | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Simon Marlow

9:04 a.m.

On 19/05/2009 08:37, Simon Peyton-Jones wrote:

...

No, the shortcomings are not documented I'm afraid. It's a squishy question because when you add guards and view patterns it's undecidable whether patterns overlap or are exhaustive.

There are various open bugs regarding exhaustiveness/incompleteness though. Start here: http://hackage.haskell.org/trac/ghc/ticket/595 and follow the links in the comments. Also http://hackage.haskell.org/trac/ghc/ticket/2395 for view patterns. Cheers, Simon

...

Still, GHC's current implementation is poor. It's a well-contained project that is awaiting a competent implementor. see http://hackage.haskell.org/trac/ghc/wiki/ProjectSuggestions

Simon

| -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell- | users-bounces@haskell.org] On Behalf Of Robert Greayer | Sent: 19 May 2009 04:00 | Cc: glasgow-haskell-users@haskell.org | Subject: Re: Should exhaustiveness testing be on by default? | | On Mon, May 18, 2009 at 4:00 PM, Norman Ramsey wrote: |> P.S. The exhaustiveness checker does need improvement... | | Is it documented somewhere what deficiencies the exhaustiveness | checker has (where it can report problems that don't exist or fails to | report problems that do...), and which deficiencies can't be resolved? | | | Rob | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Neil Mitchell

11:01 a.m.

...

> ... exhaustive pattern checking might well help out a lot of > people coming from untyped backgrounds...

Or even people from typed backgrounds. I worship at the altar of exhaustiveness checking.

Do you really want exhaustiveness, or is what you actually want safety? With -fwarn-incomplete-patterns: test1 = head [] test2 = x where (x:xs) = [] test3 = (\(x:xs) -> 1) [] test4 = f [] where f [] = 1 GHC reports that test4 has incomplete patterns, but the others don't. However, test4 is safe, but the others aren't. Exhaustiveness is a poor approximation of safety. GHC's exhaustiveness checker is a poor approximation of exhaustiveness. 2 is a poor approximation of pi :-) Using Catch, it reports that test1..3 were faulty, but test4 is safe. Thanks Neil

Lennart Augustsson

11:18 a.m.

Excellent, is there a -fuse-catch flag for ghc? :) On Tue, May 19, 2009 at 12:01 PM, Neil Mitchell wrote:

...

...
> ... exhaustive pattern checking might well help out a lot of > people coming from untyped backgrounds...

Or even people from typed backgrounds. I worship at the altar of exhaustiveness checking.

Do you really want exhaustiveness, or is what you actually want safety?

With -fwarn-incomplete-patterns:

test1 = head []

test2 = x where (x:xs) = []

test3 = (\(x:xs) -> 1) []

test4 = f [] where f [] = 1

GHC reports that test4 has incomplete patterns, but the others don't. However, test4 is safe, but the others aren't. Exhaustiveness is a poor approximation of safety. GHC's exhaustiveness checker is a poor approximation of exhaustiveness. 2 is a poor approximation of pi :-)

Using Catch, it reports that test1..3 were faulty, but test4 is safe.

Thanks

Neil _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Neil Mitchell

11:21 a.m.

...

Excellent, is there a -fuse-catch flag for ghc? :)

No, but there is for Yhc. If you write to the Haskell standard (minus a little bit), don't like libraries and can get Yhc to compile (good luck!) then it's just a few command lines away. If GHC (or GHC + some scripts) could produce a single Core file representing a whole program, including all necessary libraries, then implementing Catch would be a weekends work. Thanks Neil

...

On Tue, May 19, 2009 at 12:01 PM, Neil Mitchell wrote:

...
...
> ... exhaustive pattern checking might well help out a lot of > people coming from untyped backgrounds...

Or even people from typed backgrounds. I worship at the altar of exhaustiveness checking.

Do you really want exhaustiveness, or is what you actually want safety?

With -fwarn-incomplete-patterns:

test1 = head []

test2 = x where (x:xs) = []

test3 = (\(x:xs) -> 1) []

test4 = f [] where f [] = 1

GHC reports that test4 has incomplete patterns, but the others don't. However, test4 is safe, but the others aren't. Exhaustiveness is a poor approximation of safety. GHC's exhaustiveness checker is a poor approximation of exhaustiveness. 2 is a poor approximation of pi :-)

Using Catch, it reports that test1..3 were faulty, but test4 is safe.

Thanks

Neil _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Norman Ramsey

20 May 20 May

9:41 p.m.

...

...
> ... exhaustive pattern checking might well help out a lot of > people coming from untyped backgrounds...

Or even people from typed backgrounds. I worship at the altar of exhaustiveness checking.

Do you really want exhaustiveness, or is what you actually want safety?

I want both. Exhaustiveness checking now and forever, because it's a modular property. Safety when somebody gets around to implementing whole-program analysis in the compiler I use, when I feel like waiting around for a whole-program analysis to complete, and when I'm not making local modifications to somebody else's enormous, unsafe Haskell program. Needless to say, safety analysis should identify 'assert False' and confirm at compile time that there are no assertion failures. Norman

Neil Mitchell

21 May 21 May

1:50 p.m.

...

> Do you really want exhaustiveness, or is what you actually want safety?

I want both. Exhaustiveness checking now and forever, because it's a modular property. Safety when somebody gets around to implementing whole-program analysis in the compiler I use, when I feel like waiting around for a whole-program analysis to complete, and when I'm not making local modifications to somebody else's enormous, unsafe Haskell program.

Exhaustiveness is handy if every function is exhaustive, then it's a local property contributing to global safety. If you have functions like head floating around, then local exhaustiveness does not equal global safety. I can see why some people want it, but I'm not one of them (which in my mind makes it perfect for a flag).

...

Needless to say, safety analysis should identify 'assert False' and confirm at compile time that there are no assertion failures.

Catch already does assertion checking (1). Its runtime on moderate to small programs (HsColour in particular) is far less than the time GHC takes to compile them, and I still have no idea what its runtime is on enormous programs (2). An analysis can be whole program and can be slow, one does not imply the other. Thanks Neil 1) To the extent that it can.... It certainly tries to prove the assertions can't fail, and reports each one it fails to prove. 2) I think HsColour was fairly near to the largest program Yhc ever compiled...

Don Stewart

1:54 p.m.

...

Catch already does assertion checking (1). Its runtime on moderate to small programs (HsColour in particular) is far less than the time GHC takes to compile them, and I still have no idea what its runtime is on enormous programs (2). An analysis can be whole program and can be slow, one does not imply the other.

But the primary problem with Catch is that its analysis not well defined. I have no guarantee regarding the existence or not of false positives or false negatives, as Catch has no underlying formal logic to guide such reasoning. Despite this, it is a useful tool. -- Don

Neil Mitchell

2:55 p.m.

...

...
Catch already does assertion checking (1). Its runtime on moderate to small programs (HsColour in particular) is far less than the time GHC takes to compile them, and I still have no idea what its runtime is on enormous programs (2). An analysis can be whole program and can be slow, one does not imply the other.

But the primary problem with Catch is that its analysis not well defined. I have no guarantee regarding the existence or not of false positives or false negatives, as Catch has no underlying formal logic to guide such reasoning.

If Catch says your program will not crash, then it will not crash. I even gave an argument for correctness in the final appendix of my thesis http://community.haskell.org/~ndm/thesis/ (pages 175-207). Of course, there are engineering concerns (perhaps your Haskell compiler will mis-translate the program to Core, perhaps the libraries will be wrong, perhaps a bit in RAM will flip due to electrical interference), but Catch has a formal basis. I guarantee that there are safe programs Catch can't prove safe, for a proof see Turing 1937 :-) Thanks Neil

Don Stewart

3:32 p.m.

ndmitchell:

...

...
...
Catch already does assertion checking (1). Its runtime on moderate to small programs (HsColour in particular) is far less than the time GHC takes to compile them, and I still have no idea what its runtime is on enormous programs (2). An analysis can be whole program and can be slow, one does not imply the other.

But the primary problem with Catch is that its analysis not well defined. I have no guarantee regarding the existence or not of false positives or false negatives, as Catch has no underlying formal logic to guide such reasoning.

If Catch says your program will not crash, then it will not crash. I even gave an argument for correctness in the final appendix of my thesis http://community.haskell.org/~ndm/thesis/ (pages 175-207). Of course, there are engineering concerns (perhaps your Haskell compiler will mis-translate the program to Core, perhaps the libraries will be wrong, perhaps a bit in RAM will flip due to electrical interference), but Catch has a formal basis.

Oh, very good! I wasn't aware you'd tried this. I imagine you do something like: * identify all partial functions * bubble that information outwards, crossing off partial functions that are actually total due to tests in callers that effectively reduce the possible inhabitants of the types passed to the partial function * and you have some argument for why your travesal doesn't miss, or mislabel constraints. Is it possible for Catch to print out its reasoning for why some function 'f' is total, such that I could check it (with another tool)? -- Don

Neil Mitchell

3:49 p.m.

...

...
If Catch says your program will not crash, then it will not crash. I even gave an argument for correctness in the final appendix of my thesis http://community.haskell.org/~ndm/thesis/ (pages 175-207). Of course, there are engineering concerns (perhaps your Haskell compiler will mis-translate the program to Core, perhaps the libraries will be wrong, perhaps a bit in RAM will flip due to electrical interference), but Catch has a formal basis.

Oh, very good! I wasn't aware you'd tried this. I imagine you do something like:

* identify all partial functions * bubble that information outwards, crossing off partial functions that are actually total due to tests in callers that effectively reduce the possible inhabitants of the types passed to the partial function * and you have some argument for why your travesal doesn't miss, or mislabel constraints.

Nope, not at all. I assume all missing case branches are replaced with calls to error (as both GHC and Yhc Core do), then prove that: satE $ pre e ==> not $ isBottom $ eval e If the preconditions generated by Catch on an expression are a tautology, that implies the evaluation of e won't contain any _|_ terms at any level. If the precondition Catch generates is const True, then that implies the evaluation is never bottom. I then proceed by induction with a few lemmas, and fusing things - the "proof" is all at the level of Haskell equational reasoning.

...

Is it possible for Catch to print out its reasoning for why some function 'f' is total, such that I could check it (with another tool)?

It already does. My plan was always to output the reasoning into an ESC/Haskell file, and then have the "Catch" process run the Catch algorithm, and then check the results with ESC/Haskell - this way I hoped to avoid writing a proof for Catch... Things didn't quite turn out that way, as I needed to submit my thesis, I didn't have a copy of ESC/Haskell good enough to do what I wanted, and every thesis needs a nice proof in the appendix. Thanks, Neil

Don Stewart

3:56 p.m.

ndmitchell:

...

...
...
If Catch says your program will not crash, then it will not crash. I even gave an argument for correctness in the final appendix of my thesis http://community.haskell.org/~ndm/thesis/ (pages 175-207). Of course, there are engineering concerns (perhaps your Haskell compiler will mis-translate the program to Core, perhaps the libraries will be wrong, perhaps a bit in RAM will flip due to electrical interference), but Catch has a formal basis.

Oh, very good! I wasn't aware you'd tried this. I imagine you do something like:

* identify all partial functions * bubble that information outwards, crossing off partial functions that are actually total due to tests in callers that effectively reduce the possible inhabitants of the types passed to the partial function * and you have some argument for why your travesal doesn't miss, or mislabel constraints.

Nope, not at all. I assume all missing case branches are replaced with calls to error (as both GHC and Yhc Core do), then prove that:

satE $ pre e ==> not $ isBottom $ eval e

If the preconditions generated by Catch on an expression are a tautology, that implies the evaluation of e won't contain any _|_ terms at any level. If the precondition Catch generates is const True, then that implies the evaluation is never bottom. I then proceed by induction with a few lemmas, and fusing things - the "proof" is all at the level of Haskell equational reasoning.

OK. i'm just trying to get an intuition for the analysis. What is the nature of the preconditions generated? In order for them to cancel out the calls to error, are they constructed from information in the caller (as I speculated) about how the function under analysis is used? Obviously, you're also using a restricted notion of "bottom". -- Don

Neil Mitchell

4:14 p.m.

...

OK. i'm just trying to get an intuition for the analysis.

Catch is defined by a small Haskell program. You can write a small Haskell evaluation for a Core language. The idea is to write the QuickCheck style property, then proceed using Haskell style proof steps. The checker is recursive - it assigns a precondition to an expression based on the precondition of subexpressions, therefore I just induct upwards proving that each step is correct individually. There wasn't an intention of trying to move partiality around, but perhaps it has worked out that way.

...

What is the nature of the preconditions generated?

A precondition is a proposition of pairs of (expression, constraint), where an pair is satisfied if the expression satisfies the constraint. I prove that given a couple of lemmas about the constraint language, the analysis is correct. I then prove that 2 particular constraint languages have these necessary lemmas. As a side note, I'm pretty certain that the constraint languages I give aren't the best choice. The second one (MP constraints) is good, but lacks an obvious normal form, which makes the fixed point stuff a little more fragile/slow. I'm sure someone could come up with something better, and providing it meets a few simple lemmas, it's suitable for Catch.

...

In order for them to cancel out the calls to error, are they constructed from information in the caller (as I speculated) about how the function under analysis is used?

Yes, information from case branches add to the constraint, so: case xs of [] -> [] _:_ -> head xs becomes: xs == [] \/ safe (head xs) xs == [] \/ xs == _:_ True

...

Obviously, you're also using a restricted notion of "bottom".

For my purposes, bottom = call to error. Things such as missing case branches and asserts are translated to error. I actually consider non-termination to be a safe outcome, for example Catch says: assert (last xs) True This is safe if all elements of the list are True (Catch approximates here), or if the list xs is infinite. Thanks Neil

Don Stewart

4:17 p.m.

ndmitchell:

...

...
OK. i'm just trying to get an intuition for the analysis.

Catch is defined by a small Haskell program. You can write a small Haskell evaluation for a Core language. The idea is to write the QuickCheck style property, then proceed using Haskell style proof steps. The checker is recursive - it assigns a precondition to an expression based on the precondition of subexpressions, therefore I just induct upwards proving that each step is correct individually. There wasn't an intention of trying to move partiality around, but perhaps it has worked out that way.

...
What is the nature of the preconditions generated?

A precondition is a proposition of pairs of (expression, constraint), where an pair is satisfied if the expression satisfies the constraint. I prove that given a couple of lemmas about the constraint language, the analysis is correct. I then prove that 2 particular constraint languages have these necessary lemmas.

As a side note, I'm pretty certain that the constraint languages I give aren't the best choice. The second one (MP constraints) is good, but lacks an obvious normal form, which makes the fixed point stuff a little more fragile/slow. I'm sure someone could come up with something better, and providing it meets a few simple lemmas, it's suitable for Catch.

...
In order for them to cancel out the calls to error, are they constructed from information in the caller (as I speculated) about how the function under analysis is used?

Yes, information from case branches add to the constraint, so:

case xs of [] -> [] _:_ -> head xs

becomes: xs == [] \/ safe (head xs) xs == [] \/ xs == _:_ True

...
Obviously, you're also using a restricted notion of "bottom".

For my purposes, bottom = call to error. Things such as missing case branches and asserts are translated to error. I actually consider non-termination to be a safe outcome, for example Catch says:

assert (last xs) True

This is safe if all elements of the list are True (Catch approximates here), or if the list xs is infinite.

Thanks, that's helpful, and much what I expected. -- Don

5886

Age (days ago)

5907

Last active (days ago)

List overview

Download

41 comments

12 participants

participants (12)

Alexander Dunlap
Claus Reinke
Don Stewart
Isaac Dupree
John Meacham
Lennart Augustsson
Neil Mitchell
Norman Ramsey
Peter Hercek
Robert Greayer
Simon Marlow
Simon Peyton-Jones

Should exhaustiveness testing be on by default?

tags

participants (12)