
I am pleased to announce the first release of signed-multiset, which implements an abstract datatype for multisets with negative membership. The package can be obtained from http://hackage.haskell.org/package/signed-multiset-0.1 As always, feedback is welcome. Cheers, Stefan signed-multiset-0.1 ------------------- Multisets (or bags) are sets in which elements may occur more than once. The number of times an element occurs in a multiset is called its multiplicity. This package provides an efficient implementation of so-called signed multisets, which generalise multisets by allowing for negative membership. That is, elements in a signed multiset can have negative multiplicities. See also: Wayne D. Blizard. Negative membership. Notre Dame Journal of Formal Logic/, 31(3):346--368, 1990.

On 18 April 2012 14:26, Stefan Holdermans
I am pleased to announce the first release of signed-multiset, which implements an abstract datatype for multisets with negative membership.
The package can be obtained from
http://hackage.haskell.org/package/signed-multiset-0.1
As always, feedback is welcome.
Cheers,
Stefan
signed-multiset-0.1 -------------------
Multisets (or bags) are sets in which elements may occur more than once. The number of times an element occurs in a multiset is called its multiplicity.
This package provides an efficient implementation of so-called signed multisets, which generalise multisets by allowing for negative membership. That is, elements in a signed multiset can have negative multiplicities.
Wait, something can be not in the multiset more than once? :o
See also: Wayne D. Blizard. Negative membership. Notre Dame Journal of Formal Logic/, 31(3):346--368, 1990. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com http://IvanMiljenovic.wordpress.com

Ivan,
This package provides an efficient implementation of so-called signed multisets, which generalise multisets by allowing for negative membership. That is, elements in a signed multiset can have negative multiplicities.
Wait, something can be not in the multiset more than once? :o
That's correct. :) Cheers, Stefan

Stefan Holdermans
This package provides an efficient implementation of so-called signed multisets, which generalise multisets by allowing for negative membership.
SignedMultiset a = Data.Map.Map a Integer so what do I gain by using your library? (what is the API difference?) Perhaps you could state this clearly at the top of the package description (visible on hackage). I sometimes find I want a type "Map with default" (the default value is stored when the Map is constructed, and you never put keys in there that map to this default). Best - J.W.

Johannes,
This package provides an efficient implementation of so-called signed multisets, which generalise multisets by allowing for negative membership.
SignedMultiset a = Data.Map.Map a Integer
Indeed.
so what do I gain by using your library? (what is the API difference?)
The library essentially provides some abstractions for recurring patterns when interpreting maps from elements to integers as signed multisets or free abelian groups: cardinality, union, additiveUnion, intersection, difference. Additionally it provides dedicated parsing and pretty printing utilities. It's just like Data.Set provides abstractions from the patterns that arise when interpreting values of type Data.Map.Map a Bool (for an element type a) as sets.
I sometimes find I want a type "Map with default" (the default value is stored when the Map is constructed, and you never put keys in there that map to this default).
You could think of SignedMultiset as an instantiation of that type (i.e., the one obtained by fixing the default to 0. HTH, Stefan

Release early, release often. Here's a second version: http://hackage.haskell.org/package/signed-multiset-0.2 Changes: * Added the functions Data.SignedMultiset.isPositive and Data.SignedMultiset.isNegative, which return whether all elements in a signed multiset have respectively nonnegative or nonpositive multiplicity. * Added the functions Data.SignedMultiset.shadow, Data.SignedMultiset.modulus, Data.SignedMultiset.signum, and Data.SignedMultiset.unitstep which return respecitively the additive inverse, modulus, signum, and left-continuous unit step of a signed multiset. * Added the functions Data.SignedMultiset.filter, Data.SignedMultiset.partition, and Data.SignedMultiset.split for filtering and partitioning signed multisets. * Added the functions Data.SignedMultiset.foldr' and Data.SignedMultiset.foldl', which are strict versions of Data.SignedMultiset.foldr and Data.SignedMultiset.foldl. * Changed the behaviour of both the function Data.SignedMultiset.fromList and the Read instance of Data.SignedMultiset: it multiple element/multiplicity pairs are provided for the same element, the multiplicity of the element in the constructed multiset is now taken to be the sum of the provided multiplicities (rather than the rightmost multiplicity). * Added the functions Data.SignedMultiset.toLists and Data.SignedMultiset.fromLists, which convert between signed multisets and pairs of lists of elements with positive and negative multiplicity. * Made the dependency on the base and containers packages more restrictive. The dependency for base now carries the version constraint == 4.5.*, while the dependency for containers now comes with the constraint >= 0.4.2 && < 0.5. Cheers, Stefan

On 19/04/2012 4:10 AM, Stefan Holdermans wrote:
Release early, release often. Here's a second version:
Very cool. In the literature, we have found [1] that these are sometimes also called 'hybrid sets'. They do have some rather interesting applications, at least when paired up with an appropriate notion of partition (shameless plug warning). Jacques PS: the Haddock did not work for 0.2, but did for 0.1, you might want to look into that. [1] http://dl.acm.org/citation.cfm?id=1894501 but also http://www.cas.mcmaster.ca/~carette/publications/calculemus10-final.pdf

Jacques,
In the literature, we have found [1] that these are sometimes also called 'hybrid sets'.
They also go by the name of 'shadow sets', which also has a cool ring to it. ;)
PS: the Haddock did not work for 0.2, but did for 0.1, you might want to look into that.
As I just uploaded the 0.2 version today, I don't think the documentation on Hackage was built yet. Locally, Haddock runs fine on it for me, but let me know if encounter any issues.
but also http://www.cas.mcmaster.ca/~carette/publications/calculemus10-final.pdf
Interesting! Added it to my reading list. Cheers, Stefan

Signed multisets are unfamiliar to most of us, and I for one found the paper a little fast-paced. Can you put a bit more into the documentation? Just for starters, I found it confusing when the documentation talks about "an element with multiplicity zero", because in the _paper_ a value that has multiplicity zero in an mzset is NOT an element of it. For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what? I think anyone is going to be confused that a non-empty mzset can have 'cardinality' 0, so a little more reassurance that cardinality/=size is intentional might not go amiss. Perhaps an additional name such as totalMultiplicity could make things a little easier for writers and readers, even though cardinality is correct. Are you _sure_ that the definitions of isSubmultisetOf and isProperSubmultisetOf are correct? Proper sub<thing> is normally taken as requiring that only *one* member of the larger collection occur strictly fewer times in the smaller; here you require that for *all*. At least once, "multiplicities" is spelled "multiplicites".

On 4/19/12 7:02 PM, Richard O'Keefe wrote:
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
Good question. I'd suppose that they should be parametrized by any (Abelian?) group on the weights/multiplicities, where (+) is the canonical one since we're talking about "negative" membership. Though it'd be nice to get clarification, and to actually have the group/etc be a parameter (i.e., by type class constraint) rather than hard-coded into the structure. -- Live well, ~wren

Wren,
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
Good question. I'd suppose that they should be parametrized by any (Abelian?) group on the weights/multiplicities, where (+) is the canonical one since we're talking about "negative" membership.
Any groupoid on the multiplicities would do, I guess. As I wrote in my answer to Richard, max seems a better choise, as it nicely generalises mapping on sets. Cheers, Stefan

I don't think max would be a good choice, as that would mean that the default multiplicity would have to be negative infinity (the identity element of max) instead of 0, and then using Int as the type for multiplicity would not cut it. greetings, Sjoerd On Apr 20, 2012, at 11:02 AM, Stefan Holdermans wrote:
Wren,
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
Good question. I'd suppose that they should be parametrized by any (Abelian?) group on the weights/multiplicities, where (+) is the canonical one since we're talking about "negative" membership.
Any groupoid on the multiplicities would do, I guess.
As I wrote in my answer to Richard, max seems a better choise, as it nicely generalises mapping on sets.
Cheers,
Stefan _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Sjoerd Visscher sjoerd@w3future.com

Sjoerd,
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
I don't think max would be a good choice, as that would mean that the default multiplicity would have to be negative infinity (the identity element of max) instead of 0, and then using Int as the type for multiplicity would not cut it.
Why would one need such an identity element for map? Note that the monoidal structure isn't defined over the "maximum of multiplicites" but over the "maximum of *nonzero* multiplicites". (For ordinary sets and ordinary multisets these operations happen to coincide.) Cheers, Stefan

This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird. You'll have to call norm everywhere to make it work, f.e. you would expect this to work: empty' = insert () $ delete () empty but: empty' <> delete () empty == empty' while: empty <> delete () empty == delete () empty (I couldn't test it as I don't have base 4.5, so I hope I didn't make a mistake here.) greetings, Sjoerd On Apr 23, 2012, at 2:07 PM, Stefan Holdermans wrote:
Sjoerd,
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
I don't think max would be a good choice, as that would mean that the default multiplicity would have to be negative infinity (the identity element of max) instead of 0, and then using Int as the type for multiplicity would not cut it.
Why would one need such an identity element for map?
Note that the monoidal structure isn't defined over the "maximum of multiplicites" but over the "maximum of *nonzero* multiplicites". (For ordinary sets and ordinary multisets these operations happen to coincide.)
Cheers,
Stefan

Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
empty' = insert () $ delete () empty
but:
empty' <> delete () empty == empty'
while:
empty <> delete () empty == delete () empty
(I couldn't test it as I don't have base 4.5, so I hope I didn't make a mistake here.)
*Data.SignedMultiset> let empty' = insert () $ delete () empty *Data.SignedMultiset> empty' `union` delete () empty == empty' False *Data.SignedMultiset> empty `union` delete () empty == delete () empty True Cheers, Stefan

On Apr 23, 2012, at 3:18 PM, Stefan Holdermans wrote:
Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
Then why would you want that?
*Data.SignedMultiset> let empty' = insert () $ delete () empty
*Data.SignedMultiset> empty' `union` delete () empty == empty' False
*Data.SignedMultiset> empty `union` delete () empty == delete () empty True
Ah, I missed the check in insertMany. What about the same with let empty' = multiply 0 $ delete () empty greetings, Sjoerd

Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
Then why would you want that?
You don't have to. (SignedMultiset a, additiveUnion, empty) gives you the Monoid that you seem to have a preference for. The library supplies it through the Additive wrapper. The point is that you have a choice: different applications may ask for different monoidal structures.
*Data.SignedMultiset> let empty' = insert () $ delete () empty
*Data.SignedMultiset> empty' `union` delete () empty == empty' False
*Data.SignedMultiset> empty `union` delete () empty == delete () empty True
Ah, I missed the check in insertMany.
What about the same with
let empty' = multiply 0 $ delete () empty
*Data.SignedMultiset> let empty' = multiply 0 $ delete () empty *Data.SignedMultiset> empty' `union` delete () empty == empty' True *Data.SignedMultiset> empty `union` delete () empty == delete () empty True Cheers, Stefan

On Apr 23, 2012, at 4:34 PM, Stefan Holdermans wrote:
Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
Then why would you want that?
You don't have to. (SignedMultiset a, additiveUnion, empty) gives you the Monoid that you seem to have a preference for. The library supplies it through the Additive wrapper. The point is that you have a choice: different applications may ask for different monoidal structures.
Agreed. But I just can't imagine that the other instance is in any way useful. You basically define a function max': max' :: Int -> Int -> Int max' 0 b = b max' a 0 = a max' a b = max a b i.e. max' -2 -1 = -1 max' -2 0 = -2 max' -2 1 = 1 Wouldn't you agree that if you saw this defined in some code, you'd think something is wrong?
*Data.SignedMultiset> let empty' = multiply 0 $ delete () empty
*Data.SignedMultiset> empty' `union` delete () empty == empty' True
*Data.SignedMultiset> empty `union` delete () empty == delete () empty True
And this doesn't bother you? greetings, Sjoerd

Sjoerd,
Then why would you want that?
You don't have to. (SignedMultiset a, additiveUnion, empty) gives you the Monoid that you seem to have a preference for. The library supplies it through the Additive wrapper. The point is that you have a choice: different applications may ask for different monoidal structures.
Agreed. But I just can't imagine that the other instance is in any way useful. You basically define a function max':
max' :: Int -> Int -> Int max' 0 b = b max' a 0 = a max' a b = max a b
i.e.
max' -2 -1 = -1 max' -2 0 = -2 max' -2 1 = 1
Wouldn't you agree that if you saw this defined in some code, you'd think something is wrong?
If max' is supposed to implement the maximum of two nonzero values, I wouldn't be the slightest bit concerned. Seriously: if this is what people have agreed on to be a sensible semantics for hybrid sets, I am fine implementing it like this.
*Data.SignedMultiset> let empty' = multiply 0 $ delete () empty
*Data.SignedMultiset> empty' `union` delete () empty == empty' True
*Data.SignedMultiset> empty `union` delete () empty == delete () empty True
And this doesn't bother you?
Of course it does; it pinpoints a bug in multiply. It's fixed now: *Data.SignedMultiset> let empty' = multiply 0 $ delete () empty *Data.SignedMultiset> empty' `union` delete () empty == empty' False *Data.SignedMultiset> empty `union` delete () empty == delete () empty True Cheers, Stefan

On Apr 23, 2012, at 7:04 PM, Stefan Holdermans wrote:
if this is what people have agreed on to be a sensible semantics for hybrid sets, I am fine implementing it like this.
I have a hard time believing you have implemented the semantics that people have agreed on to be a sensible semantics for hybrid sets. I would expect that multiplicity k (m `union` n) == multiplicity k m `max` multiplicity k n. Which means that an element of m with negative multiplicity is not a member of m `union` n if it is not a member of n. So that would mean that union would have to be implemented something like this: SMS m `union` SMS n = SMS $ Map.intersectionWith max m n `Map.union` Map.filter (>= 0) ((m Map.\\ n) `Map.union` (n Map.\\ m)) greetings, Sjoerd

On 4/23/12 9:18 AM, Stefan Holdermans wrote:
Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
Why? Whenever I've dealt with bags, the appropriate way to handle collision on union has been to use addition--- not maximization. Ditto for all the other set ops. Addition seems far more natural as the extension from basic sets. The fact that you have negative multiplicities only underscores the naturality of addition[1]. Why do you think max is natural? [1] With addition, {a:-2} `union` {a:3} == {a:1} and so "negation" is exactly that, and zero multiplicity is exactly the case of not being an element because zero is the identity for addition. Whereas with maximization {a:-2} `union` {a:3} == {a:3} so "negation" doesn't actually mean negation. What you actually mean is that your multiplicities are linearly ordered but do not have a bottom element (as they would for bags or sets). This can be useful for when there's no sound conception of "zero", as when dealing with interval scale variables in statistics http://en.wikipedia.org/wiki/Level_of_measurement. And if you left it at that, this would be sensible, though I'm not sure I'd have much use for it in my own work. However, then you're tacking on that even though there's no zero in your measurement scale, for some reason one particular level is considered special and is used to mean that an element is not in the set. Consequently, if I start with some set {a:x} and union it with {a:1} repeatedly, then I'll find that a is in the set more and more often (which is good) except potentially at some point where suddenly a is not in the set, but then it'll be in the set again on the next turn. This seems extremely unnatural to me. It can have practical utility in certain contexts (e.g., if used in conjunction with addition to form the log-Viterbi semiring), but I don't see any justification for it being a natural extension of sets and bags. -- Live well, ~wren

Wren, Sjoerd,
This is not just about map, but it also a problem for the Monoid instance. You are basically adding an extra identity element, 0, to the max monoid, which works but is weird.
Still that's how union is typically defined for hybrid sets. It's what happens if want union and empty to behave as generalisations of these concepts for ordinary (multi)sets.
Why? Whenever I've dealt with bags, the appropriate way to handle collision on union has been to use addition--- not maximization. Ditto for all the other set ops. Addition seems far more natural as the extension from basic sets. The fact that you have negative multiplicities only underscores the naturality of addition[1]. Why do you think max is natural?
The union of two sets is typically defined as the smallest set that is a superset of both the operands; this definition extends nicely for multisets and hybrid sets [1,2,3]. I do not recall claiming maximisation to be "natural"; what's natural for one application, may not be for another. The library provides both union (based on maximisation) and additive union, and furthermore gives Monoid instances over both of them. Not having a preference for either one of them, I thought that having both of them in the library made sense. That said, my goal was never to upload a package that only leads to confusion and false expectations. As the common opinion seems to be that a maximising union is bad, I will drop it from the library and deprecate the package on Hackage. Thanks for your feedback, Stefan [1] Wayne D. Blizard: Multiset Theory. Notre Dame Journal of Formal Logic 30(1): 36-66 (1989) [2] Wayne D. Blizard: Negative Membership. Notre Dame Journal of Formal Logic 31(3): 346-368 (1990) [3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358

On Apr 25, 2012, at 11:39 AM, Stefan Holdermans wrote:
The union of two sets is typically defined as the smallest set that is a superset of both the operands; this definition extends nicely for multisets and hybrid sets [1,2,3].
[3] differs from [1] and [2] (and your implementation). [3] defines the union as h(u) = max(f(u), g(u)) where f, g and h are multiplicity functions. [4] does the same. But [2] says on page 352: "For binary union, take the maximum of nonzero multiplicities", and this is also what your implementation does. (Map.unionWith max does not do a max if the key is not in one of the maps, i.e. if one of the multiplicities is 0.)
I do not recall claiming maximisation to be "natural"; what's natural for one application, may not be for another. The library provides both union (based on maximisation) and additive union, and furthermore gives Monoid instances over both of them. Not having a preference for either one of them, I thought that having both of them in the library made sense.
That said, my goal was never to upload a package that only leads to confusion and false expectations. As the common opinion seems to be that a maximising union is bad, I will drop it from the library and deprecate the package on Hackage.
I think if your union would follow the definition with max as in [3] and [4] it is not confusing. But then empty would not be the identity of that union (that would be the signed multiset with multiplicity negative infinity for all elements), so it would still not be a good choice for the Monoid instance. greetings, Sjoerd
[1] Wayne D. Blizard: Multiset Theory. Notre Dame Journal of Formal Logic 30(1): 36-66 (1989) [2] Wayne D. Blizard: Negative Membership. Notre Dame Journal of Formal Logic 31(3): 346-368 (1990) [3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358 [4] A Cavalcanti: Theoretical Aspects of Computing

Sjoerd, I am sorry, as I already wrote, I decided to deprecate the package.
[3] defines the union as h(u) = max(f(u), g(u)) where f, g and h are multiplicity functions.
Which is the same, as [3] is about multisets, not signed multisets.
[...] and this is also what your implementation does. (Map.unionWith max does not do a max if the key is not in one of the maps, i.e. if one of the multiplicities is 0.)
I am aware of that. Why are you explaining this?
I think if your union would follow the definition with max as in [3] and [4] it is not confusing. But then empty would not be the identity of that union (that would be the signed multiset with multiplicity negative infinity for all elements), so it would still not be a good choice for the Monoid instance.
Let's agree to disagree. This is not constructive and, moreover, going nowhere. Cheers, Stefan
[3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358

On Apr 26, 2012, at 12:54 AM, Stefan Holdermans wrote:
Sjoerd,
I am sorry, as I already wrote, I decided to deprecate the package.
That's too bad, I really love these kind of data structures. (That's why I keep ranting about it, sorry about that.)
[3] defines the union as h(u) = max(f(u), g(u)) where f, g and h are multiplicity functions.
Which is the same, as [3] is about multisets, not signed multisets.
Chapter 3 of [3] is about Hybrid Sets.
[...] and this is also what your implementation does. (Map.unionWith max does not do a max if the key is not in one of the maps, i.e. if one of the multiplicities is 0.)
I am aware of that. Why are you explaining this?
Because we can't seem to get to an agreement. Usually this is because of a misunderstanding, so I try be as clear as possible, so you can understand my position, or point out my mistake.
I think if your union would follow the definition with max as in [3] and [4] it is not confusing. But then empty would not be the identity of that union (that would be the signed multiset with multiplicity negative infinity for all elements), so it would still not be a good choice for the Monoid instance.
Let's agree to disagree. This is not constructive and, moreover, going nowhere.
I'd prefer to keep arguing actually. ;-) But if you have something better to do, I won't bother you anymore. greetings, Sjoerd
[3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358

Sjoerd,
[3] defines the union as h(u) = max(f(u), g(u)) where f, g and h are multiplicity functions.
Which is the same, as [3] is about multisets, not signed multisets.
Chapter 3 of [3] is about Hybrid Sets.
And there the union is defined by taking the *minimum* of multiplicities, which is even more strange.
[...] and this is also what your implementation does. (Map.unionWith max does not do a max if the key is not in one of the maps, i.e. if one of the multiplicities is 0.)
I am aware of that. Why are you explaining this?
Because we can't seem to get to an agreement. Usually this is because of a misunderstanding, so I try be as clear as possible, so you can understand my position, or point out my mistake.
I don't think you are mistaken. I think that 1. you're happy with the monoid over additiveUnion; 2. you're not happy with the definition of union. I'm not a mathematician, but I can see value in how union is defined now: it nicely generalises familiar concepts and associated properties (!) from sets and bags, and there seems to be support for it in literature.
Let's agree to disagree. This is not constructive and, moreover, going nowhere.
I'd prefer to keep arguing actually. ;-) But if you have something better to do, I won't bother you anymore.
I do enjoy this kind of discussion, but I think neither one of us will be able to convince the other. I am totally okay with that. That is not to say that I cannot be convinced by new, more compelling arguments. Cheers, Stefan

On 4/25/12 7:27 PM, Stefan Holdermans wrote:
Sjoerd,
[3] defines the union as h(u) = max(f(u), g(u)) where f, g and h are multiplicity functions.
Which is the same, as [3] is about multisets, not signed multisets.
Chapter 3 of [3] is about Hybrid Sets.
And there the union is defined by taking the *minimum* of multiplicities, which is even more strange.
FWIW, I've run into the minimum version of bags/multisets as well. It's a bit strange notionally, but mathematically it actually works out pretty well as I recall. Do note that if we consider multisets to be sets with extra structure, then some minimisation is implicit in the maximization definition as well. That is, consider the definition in Syropoulos where a ("real") multiset is a tuple (X,p) of X a set and p an equivalence relation on X. Given two multisets (X,p) and (Y,q), let (Z,r) be their max-union. Since we're thinking of these as sets with extra structure, we'd like for Z to be the union of X and Y. In order for multiplicities to work out right, that means we must require that X and Y be "maximally non-disjoint". That is, consider (W,r') where W is the intersection of X and Y, and r' is the restriction of r to W. We want that the multiplicities in (W,r') are the *minimum* of the multiplicities in (X,p) and (Y,q). That guarantees that elements are shared whenever possible, and hence that Z = X `union` Y is as small as possible while being consistent with the multiplicities on (X,p) and (Y,q). -- Live well, ~wren

On 4/25/12 5:39 AM, Stefan Holdermans wrote:
The union of two sets is typically defined as the smallest set that is a superset of both the operands;
Or, the smallest set containing all the elements of both/all operands. The two definitions coincide for sets. They diverge for bags/multisets: yours uses maximization whereas mine uses addition[1]. Notably, both operations are monoids with zero being the identity element. Thus it is sensible for zero to denote non-membership in both versions. Both are natural extensions of the case for sets; which is "more natural" is subject to debate and context. (I've always considered "bags" and "multisets" to be synonymous, though perhaps others mean them to distinguish these two different ideas of generalizing sets.) They continue to diverge when multiplicities can be negative; here the use of zero becomes an issue. For the additive definition, zero is still okay for denoting non-membership because zero is the identity for addition on integers (as well as the identity for addition on naturals). This is a natural extension of the additive case for bags, we're just adding closure for subtraction. However, while zero is the identity for maximization on naturals, there is no identity for maximization on the integers. You'd have to use the integers plus negative infinity in order to have a monoid. [1] Actually, maximization only works out if we assume that the arguments are "maximally non-disjoint" ---i.e., share elements whenever possible--- whereas addition only works out if we assume the arguments are disjoint. These are the two extreme cases, and therefore the most interesting, though it's possible to get any multiplicity between those extremes. Assuming sets to be disjoint is a common thing, however; whereas I don't know of any set-theoretic way of defining "maximally non-disjoint" to be anything meaningful.
I do not recall claiming maximisation to be "natural"; what's natural for one application, may not be for another.
I'm only using natural in the mathematical sense (i.e., as the "obvious" extension of something). And you did claim maximization was the obvious/appropriate way to extend the case for sets. I'm fine with the claim that maximization is natural for bags/multisets. It's just that addition is natural there too. Sets are where lattice theory and ring theory meet; so there's no a priori reason to say that generalizations of sets belong more to one theory than the other. But the one thing both theories agree on is the importance of monoids. The problem with the further extension from bags/multisets to signed multisets is that the difference between lattices and rings becomes a lot more apparent. You can make maximization work in a sensible way, and it's a desirable thing to have; you just can't make it work with zero because zero has no special significance to the maximization function on integers.
That said, my goal was never to upload a package that only leads to confusion and false expectations. As the common opinion seems to be that a maximising union is bad, I will drop it from the library and deprecate the package on Hackage.
Deprecating the package seems a bit excessive. Why not just extend the documentation to explain exactly what properties the two variants have and why different people might want one over the other? Why not just rename "union" (and "map") to something else, thereby giving neither variant the default status? Users can always define union = theUnionIWant.
[1] Wayne D. Blizard: Multiset Theory. Notre Dame Journal of Formal Logic 30(1): 36-66 (1989) [2] Wayne D. Blizard: Negative Membership. Notre Dame Journal of Formal Logic 31(3): 346-368 (1990) [3] Apostolos Syropoulos: Mathematics of Multisets. WMP 2000: 347-358
If you take a look at Syropoulos's definition 12 (the subhybridset definition of Loeb) you can see that there's something funny going on. In particular, the definition of (<<) cannot be correct because Syropoulos says: i << j = (i <= j) && (i < 0 || 0 <= j) But that's identical to just using (<=) in the first place! I think what Syropoulos/Loeb means to say instead is that: i << j = (abs i <= abs j) && (i < 0 || 0 <= j) In which case you actually get a partial order where zero does have significance. Though it's a rather peculiar ordering, so maybe they mean something else again. Also, lest you think the rest of us are crazy, all the other authors I've seen use "union" to mean what Syropoulos calls "sum" (defn.7), and use some other terminology like "minimal-union" to mean what Syropoulos calls "union" (defn.9). E.g., http://www.cs.utexas.edu/~moore/acl2/workshop-2002/contrib/martin-alonso-hid... Which came up when trying to locate the papers you mentioned. -- Live well, ~wren

Richard, Thanks for your excellent feedback! Very helpful!
Signed multisets are unfamiliar to most of us, and I for one found the paper a little fast-paced. Can you put a bit more into the documentation?
Definitely. That's what weekends are for... :) I'll add some sections to the documentation that explain the main concepts and intuitions.
Just for starters, I found it confusing when the documentation talks about "an element with multiplicity zero", because in the _paper_ a value that has multiplicity zero in an mzset is NOT an element of it.
I see... It seems that with multisets every value of the appropriate type is an element of the set, but some values (i.e., those with multiplicities > 0) are more of an element than others (i.e., those with multiplicity 0). In the paper this is reflected by having the "element-of" relation a ternary relation between elements, sets, and multiplicities (with a "functional dependency" from elements and sets to multiplicities). Confusingly, "not-an-element-of" notation is used as an abbreviation for "element-of with multiplicity zero". The library labels elements with multiplicities > 0 as "members".
For a specific example, I haven't the faintest intuition about what 'map' should do. Suppose we have {(k1)x1, (k2)x2} and f x1 == f x2 = y. Should the value of map f {...} be {(k1+k2)y} or {(k1`max`k2)y} or what?
"Should" is a tricky term here. It depends on what you consider to be the structure of a multiset. Currently, map applies its function argument copywise: {x1_1, x1_2, ..., x1_k1, x2_1, x2_2, ..., x2_k2 } | | | | | | | f | f | f | f | f | f v v v v v v {y_1, y_2, ..., y_k1, y_k1+1, y_k1+2, ..., y_k1+k2}∑ That is, we preserve the additive structure. But perhaps this is inconsistent: the Monoid instance for SignedMultiset operates on the extremal structure rather than the additive structure. A wrapper type Additive, newtype Additive a = Additive (SignedMultiset a) supplies a monoid instance that operates on the additive structure. So, perhaps, it's better to have map as it is implemented now, operate on Additive rather than SignedMultiset and have a map for SignedMultiset that uses max rather than (+). A rationale for this would be that, this way, map would be a proper generalisation of ordinary sets. (The same rationale has been applied to union.) Pondering over this, I realise that the difference operator also doesn't property generalise the concept for ordinary sets. However, that concept indeed seems a bit more tricky to generalise.
Are you _sure_ that the definitions of isSubmultisetOf and isProperSubmultisetOf are correct? Proper sub<thing> is normally taken as requiring that only *one* member of the larger collection occur strictly fewer times in the smaller; here you require that for *all*.
You are right: isProperSubmultisetOf is too restrictive. I'll fix it. Thanks, Stefan
participants (7)
-
Ivan Lazar Miljenovic
-
Jacques Carette
-
Johannes Waldmann
-
Richard O'Keefe
-
Sjoerd Visscher
-
Stefan Holdermans
-
wren ng thornton