[GHC] #13593: Unexpected behavior from Data.List.groupBy

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core | Version: 8.0.1 Libraries | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Incorrect result Unknown/Multiple | at runtime Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- I was hoping that {{{#!hs let notBoth1 a b = not (a == 1 && b == 1) in groupBy notBoth1 [1,1,2,3,1,1,4,5,6,1] }}} would give me {{{ [[1],[1,2,3,1],[1,4,5,6,1]] }}} but instead I get {{{ [[1],[1,2,3],[1],[1,4,5,6],[1]] }}} It seems that groupBy assumes transitivity in the argument function. I have a new implementation that does not make this assumption. Of course, the implications of changing this function's behavior are troubling. {{{#!hs groupBy' :: (a -> a -> Bool) -> [a] -> [[a]] groupBy' p (x : xs) = go [x] xs where go (x : xs) (y : zs) | p x y = go (y : x : xs) zs go g (y : zs) = reverse g : go [y] zs go g [] = [reverse g] }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): Also needs a clause {{{groupBy' _ [] = []}}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): To be fair, the documentation says "supply your own equality function", and what I'm supplying above is not an equality function. But a better interface might be something like {{{#!hs groupOn :: Eq b => (a -> b) -> [a] -> [[a]] groupOn f xs = groupBy ((==) `on` f) xs }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): The behavior of `groupBy` is defined by the [[https://www.haskell.org/onlinereport/list.html|Haskell Report]]. We really can't unilaterally change it in GHC. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): Of course. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: closed Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: wontfix | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by dsf): * status: new => closed * resolution: => wontfix -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: bug | Status: closed Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: wontfix | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by Iceland_jack): Hold on, it's at least worth documenting this infelicity. Ideally the user would be directed to a correct version -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by dsf): * status: closed => new * failure: Incorrect result at runtime => Documentation bug * resolution: wontfix => * type: bug => task -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): I would add: It is assumed that the argument function is transitive. A function like `not (a == 1 && b == 1)` will give a result you might not expect: {{{#!hs
let notBoth1 a b = not (a == 1 && b == 1) groupBy notBoth1 [1,1,2,3,1,1,4,5,6,1] [[1],[1,2,3],[1],[1,4,5,6],[1]] }}} rather than {{{#!hs [[1],[1,2,3,1],[1,4,5,6,1]] }}}
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

When the "By" function replaces an Eq context by a binary predicate, the
#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by svenpanne): The Haskell report is very explicit about these functions: predicate is assumed to define an equivalence; when the "By" function replaces an Ord context by abinary predicate, the predicate is assumed to define a total ordering. And the `Data.List` documentation repeats these requirements, so I propose to close this issue as "invalid". As a side note, I wouldn't call this an "infelicity": If your `Eq`/`Ord` instances don't obey the required laws, you would have similar fun with plain `group`. The `By` functions just make an implicit argument explicit, and I would be very surprised if they behaved differently. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): Yes, the Haskell report does contain this specification. I'm just trying to improve the documentation which is typically used by someone who is in the process of building Haskell code and is perhaps somewhat removed (at that moment) from the design decisions discussed in the Haskell report. I have found that the requirements of the reference documentation used while writing code is very different from those of the documents defining the language. The name and signature of groupBy give the impression that what I tried to do above would work, so I'm assuming that my mistake and consequent loss of time might be one that others would also make. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Indeed, we should absolutely update the documentation to reflect this. I'll add a variant of the language in comment:8 to the Haddocks. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by svenpanne): Hmmm, does that mean that every single function under 'The "By" operations' will get the sentence 'The predicate is assumed to define an equivalence.' '/ The function is assumed to define a total ordering.'? It's a little bit like repeating similar sentences for every function taking an Eq/Ord context... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Well, my plan was just to point this out for `groupBy` (see Phab:D3475), but I suppose you are right that this theme applies to many of the `*By` functions. Frankly, Haskell's core library documentation is often accused of being too sparse, so I don't consider a bit (perhaps slightly repetitious) detail to be a problem. However, if we did want to avoid repeating ourselves we could just add a general discussion of the expectations of the `By` functions to the documentation at the head of the module. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): Replying to [comment:12 svenpanne]:
Hmmm, does that mean that every single function under 'The "By" operations' will get the sentence 'The predicate is assumed to define an equivalence.' '/ The function is assumed to define a total ordering.'? It's a little bit like repeating similar sentences for every function taking an Eq/Ord context...
Definitely not - this is a recipe for teaching people to ignore stuff. Elsewhere I said it makes sense to put it in the header, but consider that the particularly unexpected results come from groupBy. In the other cases you probablyu wouldn't consider using a non-transitive function argument: {{{#!hs -- What do these mean?
nubBy notBoth1 [1,1,2,3,1,1,4,5,6,1] [1,1,1,1,1] deleteBy notBoth1 3 [1,1,2,3,1,1,4,5,6,1] [1,2,3,1,1,4,5,6,1] }}}
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by svenpanne): I still claim that `groupBy` is by no means special: We have literally tons of functions which expect that some argument, be it implicit or explicit, obeys some laws, and those laws are not enforced by the type (so you can actually write down the type without having a Ph.D. in theoretical CS ;-). Take e.g. `Functor`: We expect that the Functor laws are obeyed by all instances. If they're broken, all odds are off and probably most of the function expecting such an argument do strange things. Non-transitive comparison for `sortBy`? Have fun... etc. etc. Nevertheless, we don't repeat such law-breaking examples all over the library documentation. Do we really know which of the 10 `fooBy` functions do something remotely sensible if they don't get an equivalence/total ordering? I doubt that. Without looking at the implementation I would have a hard time figuring that out. Putting the `notBoth1` example below the 'User-supplied equality (replacing an Eq context)' header and just repeating the equivalence requirement in each function documentation is probably the right thing. Similarly for `Ord`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dsf): That seems satisfactory. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13593: Unexpected behavior from Data.List.groupBy -------------------------------------+------------------------------------- Reporter: dsf | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Core Libraries | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Documentation | Unknown/Multiple bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): I haven't read these comments through to their conclusion but perhaps someone could offer a patch implementing what was agreed upon? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13593#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC