
All, Forgive me if this has been brought up before, but I was just thinking about the fact that there's no Map structure in the standard library that doesn't pretty much require an Ord instance for its keys. I say "pretty much" because, while you can declare a Map with keys of a type that's not an Ord instance, you can't do hardly anything with it. E.g., you can't do an insert. Makes it kind of pointless. Now, I understand the fact that this is for efficiency purposes, but it does seem a bit inconsistent. Why allow the construction of a Map you can't do anything with? Here are two possible more consistent ways of handling this: Suppose I have some data structure that is inherently unordered, and I want to use it as a key into a Map. How would I get around this limitation on the client side? Well, the most obvious way is to create some nonsensical Ord instance, like compare _ _ = Eq. Is that the recommended solution? If so, we could create unordered versions of all the ordered methods that assume this 'ordering'. If not, what is the recommended way, and why not create unordered versions that use that? The other way to make the library more consistent, perhaps, would be to simply move the Ord requirement up to that data structure: that is, make it Ord k => Map k a, and then have a new data structure like UnorderedMap (or, to use a more standard term, Dictionary). Thoughts? Am I completely missing something?

Hi
The other way to make the library more consistent, perhaps, would be to simply move the Ord requirement up to that data structure: that is, make it Ord k => Map k a, and then have a new data structure like UnorderedMap (or, to use a more standard term, Dictionary).
Have you tried to do this? You get errors all over the place, the monomorphism restriction kicks in, and it goes really wrong. Consider Data.Map.empty, you'd have to pass an Ord dictionary for the key, which often isn't known at that point. Suddenly the code: newMap = Data.Map.empty Stops working! Having too much polymorphism at random places breaks it. In essence, putting a context on a data type is a really bad idea. Haskell's solution with Data.Map is perfectly fine, and seems logical once you realise that its just the Haskell encoding of Ord k => Map k a Thanks Neil

Ah, good call. I don't have an interpreter handy, but I'm sure you're
right. What about the other method of making the library more
consistent? One that assumes some default (non-)ordering?
On Tue, Jul 29, 2008 at 2:05 PM, Neil Mitchell
Hi
The other way to make the library more consistent, perhaps, would be to simply move the Ord requirement up to that data structure: that is, make it Ord k => Map k a, and then have a new data structure like UnorderedMap (or, to use a more standard term, Dictionary).
Have you tried to do this? You get errors all over the place, the monomorphism restriction kicks in, and it goes really wrong. Consider Data.Map.empty, you'd have to pass an Ord dictionary for the key, which often isn't known at that point. Suddenly the code:
newMap = Data.Map.empty
Stops working! Having too much polymorphism at random places breaks it.
In essence, putting a context on a data type is a really bad idea. Haskell's solution with Data.Map is perfectly fine, and seems logical once you realise that its just the Haskell encoding of Ord k => Map k a
Thanks
Neil

Andrew Wagner wrote:
Ah, good call. I don't have an interpreter handy, but I'm sure you're right. What about the other method of making the library more consistent? One that assumes some default (non-)ordering?
Any such built-in ordering would have to be provided by the compiler, and I bet would be hard to implement in a referentially transparent way. Remember that the only thing that's common to all types in Haskell is _|_, so there's nothing a polymorphic function could use for ordering. Pete
On Tue, Jul 29, 2008 at 2:05 PM, Neil Mitchell
wrote: Hi
The other way to make the library more consistent, perhaps, would be to simply move the Ord requirement up to that data structure: that is, make it Ord k => Map k a, and then have a new data structure like UnorderedMap (or, to use a more standard term, Dictionary). Have you tried to do this? You get errors all over the place, the monomorphism restriction kicks in, and it goes really wrong. Consider Data.Map.empty, you'd have to pass an Ord dictionary for the key, which often isn't known at that point. Suddenly the code:
newMap = Data.Map.empty
Stops working! Having too much polymorphism at random places breaks it.
In essence, putting a context on a data type is a really bad idea. Haskell's solution with Data.Map is perfectly fine, and seems logical once you realise that its just the Haskell encoding of Ord k => Map k a
Thanks
Neil

Well, it wouldn't need to literally implement Ord. It would just need
to do the same operations as Data.Map does, except without the
efficiency you get from Data.Map because you can assume an Ord
instance. One way to do it is to simply internally store the data as
an ordered [(a,k)], for example. Not efficient, but you get the same
interface as Map.
On Tue, Jul 29, 2008 at 4:26 PM, Peter Gavin
Andrew Wagner wrote:
Ah, good call. I don't have an interpreter handy, but I'm sure you're right. What about the other method of making the library more consistent? One that assumes some default (non-)ordering?
Any such built-in ordering would have to be provided by the compiler, and I bet would be hard to implement in a referentially transparent way. Remember that the only thing that's common to all types in Haskell is _|_, so there's nothing a polymorphic function could use for ordering.
Pete
On Tue, Jul 29, 2008 at 2:05 PM, Neil Mitchell
wrote: Hi
The other way to make the library more consistent, perhaps, would be to simply move the Ord requirement up to that data structure: that is, make it Ord k => Map k a, and then have a new data structure like UnorderedMap (or, to use a more standard term, Dictionary).
Have you tried to do this? You get errors all over the place, the monomorphism restriction kicks in, and it goes really wrong. Consider Data.Map.empty, you'd have to pass an Ord dictionary for the key, which often isn't known at that point. Suddenly the code:
newMap = Data.Map.empty
Stops working! Having too much polymorphism at random places breaks it.
In essence, putting a context on a data type is a really bad idea. Haskell's solution with Data.Map is perfectly fine, and seems logical once you realise that its just the Haskell encoding of Ord k => Map k a
Thanks
Neil

On 2008-07-29, Andrew Wagner
Well, it wouldn't need to literally implement Ord. It would just need to do the same operations as Data.Map does, except without the efficiency you get from Data.Map because you can assume an Ord instance. One way to do it is to simply internally store the data as an ordered [(a,k)], for example. Not efficient, but you get the same interface as Map.
Yes, but then you need to check tha you have the right one, which means you need the context "Eq a". If you're going to make users write an equality function, making them write an ordering adds little effort, and a reasonable amount of gain. Usually. -- Aaron Denney -><-

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Aaron Denney wrote:
If you're going to make users write an equality function, making them write an ordering adds little effort, and a reasonable amount of gain. Usually.
Then why is there a distinction between e.g. Map and SortedMap (and Set and SortedSet) in the Java libraries? Yes yes I know Haskell is not Java etc. but they must have given this some thought. (Of course them making everything an instance of Eq and Hash is a design error but that's not the point here.) The practical problem with Haskell's type class mechanism is that all instances (of Eq, Ord) are global, so if one library (Data.Map) requires them, then you're stuck with these instances for all of your program. Of course the same thing holds for Java's "implements Comparable<>" but they have local types. Well, Haskell has newtype, but that's (module-)global. Best regards, J.W. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkiS4vYACgkQDqiTJ5Q4dm+GAACfZ6sedaApVtDaErb/1A0IL650 e80AoKnIQQfvdOhBUgtct7WqkVxtg4ps =AV1r -----END PGP SIGNATURE-----

On 2008-08-01, Johannes Waldmann
Aaron Denney wrote:
If you're going to make users write an equality function, making them write an ordering adds little effort, and a reasonable amount of gain. Usually.
Then why is there a distinction between e.g. Map and SortedMap (and Set and SortedSet) in the Java libraries?
Yes yes I know Haskell is not Java etc. but they must have given this some thought. (Of course them making everything an instance of Eq and Hash is a design error but that's not the point here.)
You would have to ask the Java designers.
The practical problem with Haskell's type class mechanism is that all instances (of Eq, Ord) are global,
This is a feature, not a bug. It helps ensure that manipulations on maps will always use compatible functions. If, instead, you constructed maps by passing in a comparison function, what happens when you merge two maps? Which function gets used? Normally you would be able to re-use the structure of each map in combining them. But if the functions they used are different, than they have to be resorted according to the kept function. Or, you have an obscure, difficult to track down bug, rather than an error at compile time.
so if one library (Data.Map) requires them, then you're stuck with these instances for all of your program. Of course the same thing holds for Java's "implements Comparable<>" but they have local types. Well, Haskell has newtype, but that's (module-)global.
They don't have local types. They have inner classes, and objects get passed. Newtypes do work. The newtype deriving extension works even better. If something has multiple ways of comparing, then the context in which they are used should be tagged differently. Haskell types are cheap, both in the compiler, and when typing, because it's nowhere near as verbose as Java. -- Aaron Denney -><-

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
They [Java] don't have local types. They have inner classes, and objects get passed.
The net effect is that I can make an inner class that implements some interface, and in the implementation I can refer to things defined in some enclosing scope (not just in the global scope). Sure, I can only refer to static things, but in Haskell everything is static, no?
Newtypes do work.
I agree, to some extent. Using a newtype to simulate the above is like lambda-lifting: you have to add the information on the local things you want to use. The Java compiler does that for you. Best regards, J.W. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkiS8v4ACgkQDqiTJ5Q4dm+rvwCeN+R7aRh4EwBXIKlf3Mhc9nc5 wRAAoKc+SlkCkEaybN6jIBHDSTu8J/yC =8Ndx -----END PGP SIGNATURE-----

On Aug 1, 2008, at 6:18 AM, Johannes Waldmann wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Aaron Denney wrote:
If you're going to make users write an equality function, making them write an ordering adds little effort, and a reasonable amount of gain. Usually.
Then why is there a distinction between e.g. Map and SortedMap (and Set and SortedSet) in the Java libraries?
Yes yes I know Haskell is not Java etc. but they must have given this some thought. (Of course them making everything an instance of Eq and Hash is a design error but that's not the point here.)
Au contraire, it's *exactly* the point! Java uses the hash code to implement collections that only require equality and hashing, but no ordering. Haskell, as a functional language, instead prefers equality and ordering---because trees admit efficient pure update, whereas hash tables generally do not. Two different languages, two different approaches to implementing collections---one more imperative, the other more functional. Of course, if one is simply looking for INefficient collections that require only (Eq a), this probably doesn't matter. But in that case it's hard to do much better than [(a,b)] and lookup (it's possible to do better, using e.g. unboxed tuple arrays and seekrit mutation, but probably only worth it for stuff that's big enough that we should have been using a proper map anyway). -Jan-Willem Maessen

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
[...] Two different languages, two different approaches to implementing collections---one more imperative, the other more functional.
It's "easier" to implement Hash than Ord, because for Ord, you need transitivity. while a hash function just needs to be a function. (Difficult in Java, impossible to get wrong in Haskell.) If you take a bad hash function, you're "only" hurting performance, not correctness. In fact I sometimes make a wrapper type sth. like Wrap { hash = hash x, contents = x } deriving ( Eq, Ord ) (hoping for the left-to-right comparison) and then use Data.Map. Best regards, J.W. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkiTNckACgkQDqiTJ5Q4dm+FXgCeNLHcnFHCB+Bq9xJyv+qY1UE+ P+4Anja7agfdrTpDHcN9GAT3hkzav7jc =Jv1z -----END PGP SIGNATURE-----

On Fri, Aug 1, 2008 at 1:19 PM, Jan-Willem Maessen
On Aug 1, 2008, at 6:18 AM, Johannes Waldmann wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Aaron Denney wrote:
If you're going to make users write an equality function, making them write an ordering adds little effort, and a reasonable amount of gain. Usually.
Then why is there a distinction between e.g. Map and SortedMap (and Set and SortedSet) in the Java libraries?
Yes yes I know Haskell is not Java etc. but they must have given this some thought. (Of course them making everything an instance of Eq and Hash is a design error but that's not the point here.)
Au contraire, it's *exactly* the point! Java uses the hash code to implement collections that only require equality and hashing, but no ordering. Haskell, as a functional language, instead prefers equality and ordering---because trees admit efficient pure update, whereas hash tables generally do not. Two different languages, two different approaches to
You know, this isn't actually true. You can implement an immutable HashMap as newtype HashMap a b = HashMap (IntMap [(a, b)]) Where you basically store association lists of things with the same hash code as they values of the IntMap. This has fairly good merge and update properties, and performs quite reasonably. I haven't tried this with the Haskell implementations, but I've benchmarked my Scala implementation of the same idea and, while not as fast as a normal array based HashMap, it seems to be a significance performance improvement over the red black tree based maps I've compared it against and supports pretty comparable amounts of sharing.

Andrew Wagner wrote:
Suppose I have some data structure that is inherently unordered, and I want to use it as a key into a Map. How would I get around this limitation on the client side? Well, the most obvious way is to create some nonsensical Ord instance, like compare _ _ = Eq. Is that the recommended solution?
instance Ord ... where compare _ _ = EQ is a bad idea, because it makes all elements equal and therefore all maps will be empty or singletons. (In order to construct such maps the order is not needed.) The recommended solution is to put "deriving (Eq, Ord)" after your data types for keys. (At least an Eq instance is needed for association lists, i.e. "Prelude.lookup") Cheers Christian
participants (8)
-
Aaron Denney
-
Andrew Wagner
-
Christian Maeder
-
David MacIver
-
Jan-Willem Maessen
-
Johannes Waldmann
-
Neil Mitchell
-
Peter Gavin