instance {Semigroup, Monoid} (Bag a) ?

Hi devs, In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag. The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap. I favor adding them, and slipped them into !5509 (a big commit with lots of other stuff). Alfredo rightly wondered whether this decision deserved more scrutiny, and so I'm asking the question here. What do we think? Thanks, Richard

I wonder if it would be possible to have an hlint rule to check for mempty
instead of emptyBag.
On Wed, 14 Apr 2021 at 19:27, Richard Eisenberg
Hi devs,
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I favor adding them, and slipped them into !5509 (a big commit with lots of other stuff). Alfredo rightly wondered whether this decision deserved more scrutiny, and so I'm asking the question here.
What do we think?
Thanks, Richard _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Hi Richard, I've been guilty of slipping in similar instances myself. In fact, I like OrdList better than Bag precisely because it has more instances and thus a far better interface. Not being able to see whether mempty denotes a Bag should be as simple as a mouse hover with HLS set up. So a +99 from me. Cheers, Sebastian Am Mi., 14. Apr. 2021 um 20:28 Uhr schrieb Richard Eisenberg < rae@richarde.dev>:
Hi devs,
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I favor adding them, and slipped them into !5509 (a big commit with lots of other stuff). Alfredo rightly wondered whether this decision deserved more scrutiny, and so I'm asking the question here.
What do we think?
Thanks, Richard _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Wed, Apr 14, 2021 at 06:26:38PM +0000, Richard Eisenberg wrote:
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
I agree that the new Monoid is appropriate.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I don't see the possibility of writing `mempty` as an issue. I find myself not infrequently writing `mempty` for, e.g., empty ByteStrings, rather than ByteString.empty, because while there are lots of type-specific "empties", they often need to be used qualified, while the polymorphic `mempty` is both clear and flexible. If anything, what's atypical here is that "emptyBag" has the type in its name. With many other types we have: empty :: ByteString empty :: ByteString.Builder empty :: Map k v empty :: Set a empty :: Seq a empty :: Text empty :: Vector a ... when the type is a Monoid, it is much simpler to just use mempty. -- Viktor.

+1 to add from me, seems sensible
On Wed, Apr 14, 2021, 14:31 Viktor Dukhovni
On Wed, Apr 14, 2021 at 06:26:38PM +0000, Richard Eisenberg wrote:
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
I agree that the new Monoid is appropriate.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I don't see the possibility of writing `mempty` as an issue. I find myself not infrequently writing `mempty` for, e.g., empty ByteStrings, rather than ByteString.empty, because while there are lots of type-specific "empties", they often need to be used qualified, while the polymorphic `mempty` is both clear and flexible.
If anything, what's atypical here is that "emptyBag" has the type in its name. With many other types we have:
empty :: ByteString empty :: ByteString.Builder empty :: Map k v empty :: Set a empty :: Seq a empty :: Text empty :: Vector a ...
when the type is a Monoid, it is much simpler to just use mempty.
-- Viktor. _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Viktor Dukhovni
On Wed, Apr 14, 2021 at 06:26:38PM +0000, Richard Eisenberg wrote:
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
I agree that the new Monoid is appropriate.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I don't see the possibility of writing `mempty` as an issue. I find myself not infrequently writing `mempty` for, e.g., empty ByteStrings, rather than ByteString.empty, because while there are lots of type-specific "empties", they often need to be used qualified, while the polymorphic `mempty` is both clear and flexible.
I think the "clear" is where some might disagree. It has been argued in the past (fairly convincingly, in my opinion) that there is value in being explicit what concrete type an expression has. Afterall, understanding GHC is already hard enough; there's no reason to make it harder by forcing the reader to do type inference. This is why GHC has historically preferred concrete interfaces to using things like mempty. Cheers, - Ben

Richard Eisenberg
Hi devs,
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I favor adding them, and slipped them into !5509 (a big commit with lots of other stuff). Alfredo rightly wondered whether this decision deserved more scrutiny, and so I'm asking the question here.
My sense is that adding the instances is a Good Thing. However, I do think that we probably ought to refrain from using (<>) and mempty where more specific functions would do. Adding a lint would be one way to accomplish this. Hiding the functions from GhcPrelude would be another. Cheers, - Ben

Good morning all,
I have now split Richard's commit in two, so that it's explicit in the
commit history that we are adding those instances:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5509/diffs?commit_id=2cb...
*Ben*: I have briefly tried to add the Hlint annotations, but it turned out
to be trickier than anticipated. I couldn't find any easy way in HLint to
add a custom hint on a *typeclass instance*, and adding something like this
into `compiler/.hlint.yaml` didn't change anything:
- modules:
- {name: GHC.Data.Bag, badidents: [mempty], message: "Use emptyBag as
it's more descriptive"}
- {name: GHC.Data.Bag, badidents: [(<>)], message: "Use unionBags as it's
more descriptive"}
(I suspect that's not what's the badidents syntax is for). I have also
tried to add two `ANN` to the definitions, like this:
{-# ANN (<>) ("HLint: use unionBags as it's more descriptive" :: String) #-}
instance Semigroup (Bag a) where
(<>) = unionBags
However, this won't work as (<>) is imported qualified and there is no way
in the ANN syntax to specify a fully qualified identifier. However, I
suspect this is not the correct syntax either, as `(<>)` is really the
*typeclass* method, but we want to target the particular instance
implementation.
*Having said that*, Richard carefully defined `(<>) = unionBags` and
`mempty = emptyBag`, so I agree that using `(<>)` and `mempty` would be
more opaque but at the end of the day it should have the asymptotic
complexity than using `unionBags` and `emptyBag` (modulo dictionary
passing, but I hope that won't each our lunch).
A.
On Thu, 15 Apr 2021 at 02:20, Ben Gamari
Richard Eisenberg
writes: Hi devs,
In the work on simplifying the error-message infrastructure (heavy lifting by Alfredo, in cc), I've been tempted (twice!) to add
instance Semigroup (Bag a) where (<>) = unionBags
instance Monoid (Bag a) where mempty = emptyBag
to GHC.Data.Bag.
The downside to writing these is that users might be tempted to write e.g. mempty instead of emptyBag, while the latter gives more information to readers and induces less manual type inference (to a human reader). The upside is that it means Bags work well with Monoid-oriented functions, like foldMap.
I favor adding them, and slipped them into !5509 (a big commit with lots of other stuff). Alfredo rightly wondered whether this decision deserved more scrutiny, and so I'm asking the question here.
My sense is that adding the instances is a Good Thing. However, I do think that we probably ought to refrain from using (<>) and mempty where more specific functions would do. Adding a lint would be one way to accomplish this. Hiding the functions from GhcPrelude would be another.
Cheers,
- Ben
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
participants (7)
-
Alan & Kim Zimmerman
-
Alfredo Di Napoli
-
Ben Gamari
-
chessai
-
Richard Eisenberg
-
Sebastian Graf
-
Viktor Dukhovni