A more useful Monoid instance for Data.Map

Hi all, Currently Data.Map has a Monoid instance, but it's rather lossy and not as general as it could be: instance (Ord k) => Monoid (Map k v) where mempty = empty mappend = union mconcat = unions The instance would be much nicer if it required a Monoid on v and used unionWith mappend instead of just union. The current behavior could be emulated by using First/Last as the Monoid (or ideally a semigroup, but that's or another discussion), but other more interesting Monoid instances could also be used for the values. I realize that changing instances could break code, but I'd be curious to see how many people even use the current monoid instance. Does anyone have any system for testing hypotheses like this (by typechecking a large randomized chunk of hackage or something)? It looks like the idea has been around for at least five years: http://hackage.haskell.org/trac/ghc/ticket/1460 but the proposal was abandoned, so I wanted to see if I could get people to start talking about it again. Thanks, Dan

On 4/27/12 9:04 PM, Daniel Peebles wrote:
Hi all,
Currently Data.Map has a Monoid instance, but it's rather lossy and not as general as it could be:
instance (Ord k) => Monoid (Map k v) where mempty = empty mappend = union mconcat = unions
The instance would be much nicer if it required a Monoid on v and used unionWith mappend instead of just union.
I'm inclined to agree as well.
I realize that changing instances could break code, but I'd be curious to see how many people even use the current monoid instance. Does anyone have any system for testing hypotheses like this (by typechecking a large randomized chunk of hackage or something)?
The thing I'd be more worried about is silent changes to semantics. At least if it doesn't typecheck then you know something went wrong; but there are lots of monoids and so it's very likely to typecheck but to alter semantics. When I've been curious about things like this before, I've just used grep on my local copy of Hackage. But that only works for things with relatively unique names; it'd be completely unhelpful for something type-directed like this. You could try removing the instance entirely and then see how much of Hackage you can get to compile. That way you'll detect all uses, not just the uses for non-monoidal value types. You should be able to get your hands on one of the build-all-of-Hackage scripts people've used for this sort of thing in the past. -- Live well, ~wren

On Fri, 27 Apr 2012, wren ng thornton wrote:
On 4/27/12 9:04 PM, Daniel Peebles wrote:
Currently Data.Map has a Monoid instance, but it's rather lossy and not as general as it could be:
instance (Ord k) => Monoid (Map k v) where mempty = empty mappend = union mconcat = unions
The instance would be much nicer if it required a Monoid on v and used unionWith mappend instead of just union.
I'm inclined to agree as well.
I do not know whether I used the Monoid instance already, but I know that several times I used fromList by accident where I had to use fromListWith mappend. I think I would also prefer your instance because the current one silently drops information.
I realize that changing instances could break code, but I'd be curious to see how many people even use the current monoid instance. Does anyone have any system for testing hypotheses like this (by typechecking a large randomized chunk of hackage or something)?
The thing I'd be more worried about is silent changes to semantics. At least if it doesn't typecheck then you know something went wrong; but there are lots of monoids and so it's very likely to typecheck but to alter semantics.
I am also worried about such changes, at least I think that the major version number of 'containers' must be increased in order to raise awareness. There is at least a certain chance that uses of the old instance are detected, since the new instance requires a Monoid constraint on the 'v' type that was not there before.
You could try removing the instance entirely and then see how much of Hackage you can get to compile. That way you'll detect all uses, not just the uses for non-monoidal value types. You should be able to get your hands on one of the build-all-of-Hackage scripts people've used for this sort of thing in the past.
This sounds like a good idea. However I suspect that a large portion of packages cannot be build together, because some of them rely on old compiler versions and other ones on new versions and even more packages rely transitively on installed foreign libraries. Why is the Monoid instance important? Instead of mappend you can always use explicitly Map.union or (Map.unionWith mappend). I think the instance is important when combining different structures. E.g. in a Writer monad the written value must a Monoid. Thus a criterion for a useful Monoid instance could be: What is most useful for a Map as target value in a Writer monad?

I realize that changing instances could break code, but I'd be curious to see how many people even use the current monoid instance. Does anyone have any system for testing hypotheses like this (by typechecking a large randomized chunk of hackage or something)?
I use both definitions. I have some data types where new values should replace old ones, and some where the values themselves should be joined. I do the latter with a utility function 'Map.mappend = Map.unionWith Monoid.mappend'. I think both definitions are useful and switching one for the other would just break a lot of code without a lot of benefit.

Why not be explicit about the replacement strategy by injecting your values
into First/Last? My point is that in terms of functionality, using mappend
on the values is strictly more general than the current instance. It seems
unfortunate to be stuck with the current instance for historical reasons,
but I guess that's how a lot of this stuff works :/
On Fri, Apr 27, 2012 at 9:52 PM, Evan Laforge
I realize that changing instances could break code, but I'd be curious to see how many people even use the current monoid instance. Does anyone have any system for testing hypotheses like this (by typechecking a large randomized chunk of hackage or something)?
I use both definitions. I have some data types where new values should replace old ones, and some where the values themselves should be joined. I do the latter with a utility function 'Map.mappend = Map.unionWith Monoid.mappend'.
I think both definitions are useful and switching one for the other would just break a lot of code without a lot of benefit.

On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/
Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives.

On Sat, 28 Apr 2012, Evan Laforge wrote:
On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/
Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives.
The version containers-0.6 may not have a Monoid Map instance at all, such that all people have to note the change and then containers-0.7 should contain the new instance. :-)

I don't actually think there are any rules/optimizations for fmap of
newtype constructors or extractors in general. Luckily, unsafeCoerce is
explicitly specified to be safe, in this kind of situation (assuming the
Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/
Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives.

Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v)
instance I finally went ahead and did a practical test concerning its
current usage.
After removing the Monoid instance for Map and IntMap, each reverse
dependency of containers was separately compiled under a standard setup of
GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse
dependencies, I could get 545 to compile. However, only the following
packages fail because of Monoid instance issues:
- caledon
- data-default
- dom-lt
- EnumMap
- i18n
- semigroups
- unamb-custom
- vacuum
- stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears
to be a private abandoned clone with uploads only on 24/12/08,
stringtable-atom fails to build because of a previous API change for
updateMax, and the rest only use the instance internally for saying mempty
instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for
containers 0.6.0.0 does not seem to introduce any semantic breakage at all.
I have CCed the maintainers of the lastly mentioned packages.
Let's do it!
Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/
Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Great job!
To be fair, there might be code which is not released on hackage (e.g.
proprietary), but the current instance is so bad that we really should
take a chance to fix it.
But there's another problem... The "right" instance should really be
based on Semigroup, not Monoid, but Semigroup is not currently in the
base. And adding a dependency on semigroups to containers is hardly an
option.
So maybe we should push the change that was discussed in another thread
to move Semigroup into base, and only then change this instance.
Roman
* Christian Sattler
Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v) instance I finally went ahead and did a practical test concerning its current usage.
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears to be a private abandoned clone with uploads only on 24/12/08, stringtable-atom fails to build because of a previous API change for updateMax, and the rest only use the instance internally for saying mempty instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for containers 0.6.0.0 does not seem to introduce any semantic breakage at all. I have CCed the maintainers of the lastly mentioned packages.
Let's do it! Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/
Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Irrespective of the eventual replacement, I propose that we remove the current Monoid instance right now. This would also give proprietary codebases more time to adapt to the semantic change. Christian On 2013-01-05 21:00, Roman Cheplyaka wrote:
Great job!
To be fair, there might be code which is not released on hackage (e.g. proprietary), but the current instance is so bad that we really should take a chance to fix it.
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
So maybe we should push the change that was discussed in another thread to move Semigroup into base, and only then change this instance.
Roman
* Christian Sattler
[2013-01-05 20:07:02+0100] Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v) instance I finally went ahead and did a practical test concerning its current usage.
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears to be a private abandoned clone with uploads only on 24/12/08, stringtable-atom fails to build because of a previous API change for updateMax, and the rest only use the instance internally for saying mempty instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for containers 0.6.0.0 does not seem to introduce any semantic breakage at all. I have CCed the maintainers of the lastly mentioned packages.
Let's do it! Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/ Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't
On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

But that would unnecessarily break the code which only uses mempty.
Roman
* Christian Sattler
Irrespective of the eventual replacement, I propose that we remove the current Monoid instance right now. This would also give proprietary codebases more time to adapt to the semantic change.
Christian
On 2013-01-05 21:00, Roman Cheplyaka wrote:
Great job!
To be fair, there might be code which is not released on hackage (e.g. proprietary), but the current instance is so bad that we really should take a chance to fix it.
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
So maybe we should push the change that was discussed in another thread to move Semigroup into base, and only then change this instance.
Roman
* Christian Sattler
[2013-01-05 20:07:02+0100] Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v) instance I finally went ahead and did a practical test concerning its current usage.
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears to be a private abandoned clone with uploads only on 24/12/08, stringtable-atom fails to build because of a previous API change for updateMax, and the rest only use the instance internally for saying mempty instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for containers 0.6.0.0 does not seem to introduce any semantic breakage at all. I have CCed the maintainers of the lastly mentioned packages.
Let's do it! Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: Why not be explicit about the replacement strategy by injecting your values into First/Last? My point is that in terms of functionality, using mappend on the values is strictly more general than the current instance. It seems unfortunate to be stuck with the current instance for historical reasons, but I guess that's how a lot of this stuff works :/ Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't
On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

As I understand, that would break anyway later on, as it is used only in contexts where there is no Semigroup instance and Map.empty should have been used in the first place. On 2013-01-06 10:41, Roman Cheplyaka wrote:
But that would unnecessarily break the code which only uses mempty.
Roman
* Christian Sattler
[2013-01-06 08:08:14+0100] Irrespective of the eventual replacement, I propose that we remove the current Monoid instance right now. This would also give proprietary codebases more time to adapt to the semantic change.
Christian
On 2013-01-05 21:00, Roman Cheplyaka wrote:
Great job!
To be fair, there might be code which is not released on hackage (e.g. proprietary), but the current instance is so bad that we really should take a chance to fix it.
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
So maybe we should push the change that was discussed in another thread to move Semigroup into base, and only then change this instance.
Roman
* Christian Sattler
[2013-01-05 20:07:02+0100] Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v) instance I finally went ahead and did a practical test concerning its current usage.
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears to be a private abandoned clone with uploads only on 24/12/08, stringtable-atom fails to build because of a previous API change for updateMax, and the rest only use the instance internally for saying mempty instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for containers 0.6.0.0 does not seem to introduce any semantic breakage at all. I have CCed the maintainers of the lastly mentioned packages.
Let's do it! Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
wrote: > Why not be explicit about the replacement strategy by injecting your values > into First/Last? My point is that in terms of functionality, using mappend > on the values is strictly more general than the current instance. It seems > unfortunate to be stuck with the current instance for historical reasons, > but I guess that's how a lot of this stuff works :/ Yeah, I suppose it would be a bit more regular that way. I'm always reluctant to map newtypes over things other than lists because I don't trust there to be a RULES that will eliminate it, but I suppose for Map there must be. I guess I wouldn't mind updating my code if the definition changed. It's hard to change a general purpose method though, simply because searching for it in your code will turn up so many false positives. _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Hm, you are right.
Still, breaking it later would give an opportunity to introduce an
appropriate Semigroup instance, while now the only solution would be, as
you say, to use Map.empty and Map.union.
Moreover, if we remove the instance now, that might force people to
define an instance of their own, which will break once again when we
actually introduce the standard instance back. (Why would people do it?
Well, imagine some function that expects a monoid like
Data.Foldable.fold.)
And if it's going to break anyway, I don't see the benefit of doing it
earlier.
Roman
* Christian Sattler
As I understand, that would break anyway later on, as it is used only in contexts where there is no Semigroup instance and Map.empty should have been used in the first place.
On 2013-01-06 10:41, Roman Cheplyaka wrote:
But that would unnecessarily break the code which only uses mempty.
Roman
* Christian Sattler
[2013-01-06 08:08:14+0100] Irrespective of the eventual replacement, I propose that we remove the current Monoid instance right now. This would also give proprietary codebases more time to adapt to the semantic change.
Christian
On 2013-01-05 21:00, Roman Cheplyaka wrote:
Great job!
To be fair, there might be code which is not released on hackage (e.g. proprietary), but the current instance is so bad that we really should take a chance to fix it.
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
So maybe we should push the change that was discussed in another thread to move Semigroup into base, and only then change this instance.
Roman
* Christian Sattler
[2013-01-05 20:07:02+0100] Hi all,
After repeated frustration over the wrong Monoid (Data.Map.Map k v) instance I finally went ahead and did a practical test concerning its current usage.
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
EnumMap has containers <0.3, semigroups declares <0.6, unamb-custom appears to be a private abandoned clone with uploads only on 24/12/08, stringtable-atom fails to build because of a previous API change for updateMax, and the rest only use the instance internally for saying mempty instead of Data.Map.empty.
Under these circumstances, fixing the Monoid instance mistake for containers 0.6.0.0 does not seem to introduce any semantic breakage at all. I have CCed the maintainers of the lastly mentioned packages.
Let's do it! Christian
2012/4/28 Daniel Peebles
I don't actually think there are any rules/optimizations for fmap of newtype constructors or extractors in general. Luckily, unsafeCoerce is explicitly specified to be safe, in this kind of situation (assuming the Map is actually parametric in its value type, which it is)! ;)
On Sat, Apr 28, 2012 at 12:31 PM, Evan Laforge
wrote: >On Fri, Apr 27, 2012 at 7:00 PM, Daniel Peebles
>wrote: >>Why not be explicit about the replacement strategy by injecting your >values >>into First/Last? My point is that in terms of functionality, using >mappend >>on the values is strictly more general than the current instance. It >seems >>unfortunate to be stuck with the current instance for historical >reasons, >>but I guess that's how a lot of this stuff works :/ >Yeah, I suppose it would be a bit more regular that way. I'm always >reluctant to map newtypes over things other than lists because I don't >trust there to be a RULES that will eliminate it, but I suppose for >Map there must be. I guess I wouldn't mind updating my code if the >definition changed. It's hard to change a general purpose method >though, simply because searching for it in your code will turn up so >many false positives. > _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Roman Cheplyaka
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
just to be clear, do you propose the instance to read instance (Ord k, Semigroup v) => Monoid (Map k v) or rather instance (Ord k, Semigroup v) => Semigroup (Map k v) ? cheers, hvr

* Herbert Valerio Riedel
Roman Cheplyaka
writes: But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
just to be clear, do you propose the instance to read
instance (Ord k, Semigroup v) => Monoid (Map k v)
or rather
instance (Ord k, Semigroup v) => Semigroup (Map k v)
The former, where mempty = Map.empty. (But the latter would also be present, since every monoid is (mathematically) a semigroup.) Roman

Roman Cheplyaka
Great job!
To be fair, there might be code which is not released on hackage (e.g. proprietary), but the current instance is so bad that we really should take a chance to fix it.
But there's another problem... The "right" instance should really be based on Semigroup, not Monoid, but Semigroup is not currently in the base. And adding a dependency on semigroups to containers is hardly an option.
So as I understand it,
1) There are few packages on Hackage that would break as a result of
replacing Map's existing Monoid instance.
2) However, we may want to first cut a release with no Monoid instance
at all to alert users outside of Hackage of the change in
semantics. This was suggested for containers-0.6.
3) Dropping the Monoid instance would hurt current users
relying only on mempty. Moreover, those who do need a full Monoid
instance will be forced to define it themselves, incurring two
breakages (once on removal of the current instance and again on
reintroduction of the "correct" Monoid instance). This ultimately
reflects the current problems in refactoring our typeclass
structure.
4) As Johan points out in [1], refactoring the typeclass hierarchy is
currently painful. He says this needs to be fixed before this sort
of refactoring is attempted.
5) There are proposals[2] seeking to fix this issue but, as Edward
points out, it's unclear whether they will adequately solve the
problem[3]
6) As a result, the present proposal regarding Map's Monoid instance
is stuck
Sadly, I can see only a few places where this impass can be broken,
1) We decide it's not necessary to first drop the Monoid instance this
is unlikely to affect the current users on Hackage, this could be
very frustrating for external users. This would obviously need to
be very well advertised.
2) We decide it is acceptable to break users code multiple times, drop
the Monoid instance and reintroduce the new instance after some
delay. The length of this delay could range from no delay at all
(allowing folks to quickly move to the new instance, although
potentially unwittingly) to several months (hoping that most users
will realize the change during this window).
3) We miraculously resolve the typeclass refactoring problem, allowing
us to add Semigroup to base, add a Semigroup instance for Map, and
fix its Monoid instance in one fell swoop
While I agree with hvr that incurring more breakage events than
necessary is not desirable, I nevertheless think we should again
consider option (2) as,
* Semantic changes like (1) seem too reckless to undertake like this
* The (3) depends upon our ability to rectify deep problems in how
typeclasses are defined and refactored and will not be solved in the
near future
We should take advantage of this opportunity to begin an easy cleanup
and accept that there are still other issues that will need to be dealt
with once the tools for doing so exist.
Thoughts?
Cheers,
- Ben
[1]

On Tue, Mar 12, 2013 at 8:06 AM, Ben Gamari
2) We decide it is acceptable to break users code multiple times, drop the Monoid instance and reintroduce the new instance after some delay. The length of this delay could range from no delay at all (allowing folks to quickly move to the new instance, although potentially unwittingly) to several months (hoping that most users will realize the change during this window).
Before we even consider breaking user code we should see how much code will be broken. If someone with spare cycles could download a copy of Hackage and grep for uses of mappend on Data.Map that would be great.

Johan Tibell
On Tue, Mar 12, 2013 at 8:06 AM, Ben Gamari
wrote: 2) We decide it is acceptable to break users code multiple times, drop the Monoid instance and reintroduce the new instance after some delay. The length of this delay could range from no delay at all (allowing folks to quickly move to the new instance, although potentially unwittingly) to several months (hoping that most users will realize the change during this window).
Before we even consider breaking user code we should see how much code will be broken. If someone with spare cycles could download a copy of Hackage and grep for uses of mappend on Data.Map that would be great.
It seems like Christian did a pretty good approximation to this[1] in January, no?
After removing the Monoid instance for Map and IntMap, each reverse
dependency of containers was separately compiled under a standard setup of
GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse
dependencies, I could get 545 to compile. However, only the following
packages fail because of Monoid instance issues:
- caledon
- data-default
- dom-lt
- EnumMap
- i18n
- semigroups
- unamb-custom
- vacuum
- stringtable-atom
Given that the Monoid instance is completely gone in this test, this
list should be a superset of the packages that will break due to the
semantic change.
Cheers,
- Ben
[1]

Breaking the semigroups instance breaks transitively almost every package
I've written. Did you attempt to fix each of those and see what was broken
downstream? I have 60+ packages that depend on semigroups alone, though I'm
not sure how many use the union instance publicly.
I am really not a fan of the idea of an interim release where no instance
exists, because it forces folks who do heavily use union to make a bunch of
orphans that cannot work together or abandon this functionality entirely.
It effectively makes it impossible for users like me who need to support a
wide-array of platforms.
If this change goes forward I'm going to have to go make a lot of code
uglier, as a currently useful Monoid will be dead to me for years, until I
can rely on it again portably across enough platforms to cover the window
I've agreed to support for my many of my packages.
If that is the consensus I'll go along with it, but I'll miss the
functionality.
-Edward
On Tue, Mar 12, 2013 at 11:19 AM, Ben Gamari
Johan Tibell
writes: On Tue, Mar 12, 2013 at 8:06 AM, Ben Gamari
wrote: 2) We decide it is acceptable to break users code multiple times, drop the Monoid instance and reintroduce the new instance after some delay. The length of this delay could range from no delay at all (allowing folks to quickly move to the new instance, although potentially unwittingly) to several months (hoping that most users will realize the change during this window).
Before we even consider breaking user code we should see how much code will be broken. If someone with spare cycles could download a copy of Hackage and grep for uses of mappend on Data.Map that would be great.
It seems like Christian did a pretty good approximation to this[1] in January, no?
After removing the Monoid instance for Map and IntMap, each reverse dependency of containers was separately compiled under a standard setup of GHC 7.6.1 in order to avoid shared dependency problems. Out of 1440 reverse dependencies, I could get 545 to compile. However, only the following packages fail because of Monoid instance issues:
- caledon - data-default - dom-lt - EnumMap - i18n - semigroups - unamb-custom - vacuum - stringtable-atom
Given that the Monoid instance is completely gone in this test, this list should be a superset of the packages that will break due to the semantic change.
Cheers,
- Ben
[1]

Hi all,
-----Original message----- From: Ben Gamari
Sent: 12 Mar 2013, 11:06 <snip>
We should take advantage of this opportunity to begin an easy cleanup and accept that there are still other issues that will need to be dealt with once the tools for doing so exist.
IIRC, the instances we are talking about are Current: instance (Ord k) => Monoid (Map k v) where mempty = empty mappend = union mconcat = unions Proposed: instance (Ord k, Monoid v) => Monoid (Map k v) where mempty = empty mappend map0 map1 = unionWith mappend map0 map1 mconcat maps = foldr (unionWith mappend) empty maps or even instance (Ord k, Semigroup v) => Monoid (Map k v) where mempty = empty mappend map0 map1 = unionWith <> map0 map1 mconcat maps = foldr (unionWith <>) empty maps I am not convinced that the proposed instance is universally the best one. I like the current instance -- I sometimes use it on Map k v where v is a "primitive" type like Int or Double for which no Monoid instance exists. Personally, I am also not sure that fromList [(1, "A")] `mappend` fromList [(1, "B") should really be fromList [(1, "AB")] The price for any change is very high: a) silent semantic change for replacing the instance is a no go, consider for example for Map k String or Map k ByteString. b) removing and then adding the instance means we will have two major version bumps. That places a burden on every user of the library to at least upgrade upper bounds. And as Edward Kmett wrote earlier today, the users that want to use the Monoid instance must revert to some kind of conditional compilation. I am not saying we should never do any breaking changes, but in this case, the benefit seems rather small to me. So -1 on changing the Monoid instance from me. Cheers, Milan Straka

On Tue, 12 Mar 2013, Milan Straka wrote:
Hi all,
-----Original message----- From: Ben Gamari
Sent: 12 Mar 2013, 11:06 <snip>
We should take advantage of this opportunity to begin an easy cleanup and accept that there are still other issues that will need to be dealt with once the tools for doing so exist.
IIRC, the instances we are talking about are
Current: instance (Ord k) => Monoid (Map k v) where mempty = empty mappend = union mconcat = unions
Proposed: instance (Ord k, Monoid v) => Monoid (Map k v) where mempty = empty mappend map0 map1 = unionWith mappend map0 map1 mconcat maps = foldr (unionWith mappend) empty maps or even instance (Ord k, Semigroup v) => Monoid (Map k v) where mempty = empty mappend map0 map1 = unionWith <> map0 map1 mconcat maps = foldr (unionWith <>) empty maps
I am not convinced that the proposed instance is universally the best one. I like the current instance -- I sometimes use it on Map k v where v is a "primitive" type like Int or Double for which no Monoid instance exists. Personally, I am also not sure that fromList [(1, "A")] `mappend` fromList [(1, "B") should really be fromList [(1, "AB")]
The price for any change is very high: a) silent semantic change for replacing the instance is a no go, consider for example for Map k String or Map k ByteString. b) removing and then adding the instance means we will have two major version bumps. That places a burden on every user of the library to at least upgrade upper bounds. And as Edward Kmett wrote earlier today, the users that want to use the Monoid instance must revert to some kind of conditional compilation.
I am not saying we should never do any breaking changes, but in this case, the benefit seems rather small to me.
First of all, I do not know, why people require the current semantics of "mappend" for Data.Map and why they do not just use "Map.union" in order to explicitly tell that they drop duplicate elements intentionally. The best semantics for "mappend" should be driven by the question what is best for working with monoids (e.g. as writer type in the Writer monad). The original proposal gave the good argument that by choosing a monoid type for the Map element you can choose how elements for duplicate keys are merged. My experience with Map.union and Map.fromList is that I had bugs that were very hard to debug because I did not kept the case of duplicate keys in mind. Now I have added warnings to HLint for every use of Map.union and Map.fromList. Either duplicate keys are allowed, then I should explicitely state how to combine them, or keys must be distinct, then I should use Map.unionWith (error "duplicate keys"). Unfortunately, my HLint checks are bypassed if the 'Map.union' is disguised as a 'mappend'. I think that changing 'mappend' in the proposed way has the advantage to make programmers think about duplicate keys. The answer could be to replace 'mappend' by 'Map.union' or choose an element Monoid for combining the elements. Additionally Semigroup/Monoid types like First and Last are more explicit than the current behaviour of "mappend" where you cannot guess whether the left or the right element is kept when keys clash.

On Tue, Mar 12, 2013 at 11:06:22AM -0400, Ben Gamari wrote:
2) We decide it is acceptable to break users code multiple times, drop the Monoid instance and reintroduce the new instance after some delay. The length of this delay could range from no delay at all (allowing folks to quickly move to the new instance, although potentially unwittingly) to several months (hoping that most users will realize the change during this window).
If we're talking about containers, if you want a meaningful delay, you'd probably need at least one GHC+HP major release with no instance, and and then the new instance would come with the next GHC+HP major release. i.e. the delay would be at least a year. Thanks Ian

And then another 2 years before I can use it again in code that I have to
maintain that will need to be compatible with the GHC version released in
the meantime.
The 3 years I'd be unable to use it weigh strongly against it for me.
-1 from me.
On Tue, Mar 12, 2013 at 1:18 PM, Ian Lynagh
On Tue, Mar 12, 2013 at 11:06:22AM -0400, Ben Gamari wrote:
2) We decide it is acceptable to break users code multiple times, drop the Monoid instance and reintroduce the new instance after some delay. The length of this delay could range from no delay at all (allowing folks to quickly move to the new instance, although potentially unwittingly) to several months (hoping that most users will realize the change during this window).
If we're talking about containers, if you want a meaningful delay, you'd probably need at least one GHC+HP major release with no instance, and and then the new instance would come with the next GHC+HP major release. i.e. the delay would be at least a year.
Thanks Ian
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Edward Kmett
And then another 2 years before I can use it again in code that I have to maintain that will need to be compatible with the GHC version released in the meantime.
The 3 years I'd be unable to use it weigh strongly against it for me.
-1 from me.
When the matter is framed this way I agree that the cost definitely outweighs the benefit. I suppose this means that we need to resign ourselves to living with the instance to eternity? Cheers, - Ben

-----Original message----- From: Johan Tibell
Sent: 12 Mar 2013, 13:00 On Tue, Mar 12, 2013 at 12:13 PM, Ben Gamari
wrote: I suppose this means that we need to resign ourselves to living with the instance to eternity?
Most likely. Fortunately there's an infinite amount of other things we can work on, like writing new libraries! :)
One way to go would be to define a new set of API for containers instead of modifying the current one, e.g., create Data.Containers.Ord{Set,Map}. There are several issues with the current API -- wrong Monoid instance being one of those, maybe lack of class hierarchy, wrong module names -- Data.Map is really a Data.OrdMap, some people being unhappy with the API as it is, and so on. The problem is that devising new API that would make everybody happy is quite a challenging task :) But it wouldn't be so bad probably, the current API is not terribly wrong. Cheers, Milan

-----Original message----- From: Johan Tibell
Sent: 12 Mar 2013, 13:00 On Tue, Mar 12, 2013 at 12:13 PM, Ben Gamari
wrote: I suppose this means that we need to resign ourselves to living with the instance to eternity?
Most likely. Fortunately there's an infinite amount of other things we can work on, like writing new libraries! :) One way to go would be to define a new set of API for containers instead of modifying the current one, e.g., create Data.Containers.Ord{Set,Map}.
There are several issues with the current API -- wrong Monoid instance being one of those, maybe lack of class hierarchy, wrong module names -- Data.Map is really a Data.OrdMap, some people being unhappy with the API as it is, and so on.
The problem is that devising new API that would make everybody happy is quite a challenging task :) But it wouldn't be so bad probably, the current API is not terribly wrong.
Cheers, Milan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries I've been following this thread for a while, and I have to say I like Milan's suggestion. It is backwards compatible for those who need the original libraries, and the people who like the new implementation can start using it. This could involve some duplicate work between the two branches for a while, but the old one can be depreciated whenever the
On 3/13/2013 4:53 AM, Milan Straka wrote: maintainer(s) like. Jeff

Roman Cheplyaka
So maybe we should push the change that was discussed in another thread to move Semigroup into base, and only then change this instance.
Are you proposing that (.++.) /= mappend? I think this sort of inconsistency would be a mistake, especially given the evidence that suggests that relatively little code will be affected by the change in mappend's semantics. Cheers, - Ben
participants (13)
-
Ben Gamari
-
Christian Sattler
-
Daniel Peebles
-
Edward Kmett
-
Evan Laforge
-
Henning Thielemann
-
Herbert Valerio Riedel
-
Ian Lynagh
-
Jeff Shaw
-
Johan Tibell
-
Milan Straka
-
Roman Cheplyaka
-
wren ng thornton