Announce: EnumMap-0.0.1

Hi List, I've uploaded a first version of EnumMap to hackage. EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.) If this package competes in speed (especially with an Int key) with IntMap, I'd expect this could be merged into the containers package as an IntMap replacement. The source comments/copyright/license hasn't been changed since this is a derivative work of the original IntMap package. http://hackage.haskell.org/package/EnumMap Comments/benchmarks/criticism is appreciated. Merging this in with containers eventually would be more appreciated. :) John Van Enk

On Sat, 8 Aug 2009, John Van Enk wrote:
Hi List,
I've uploaded a first version of EnumMap to hackage.
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.)
Could that be implemented as wrapper around IntMap?

That's originally how I was thinking about doing it, but I think that requires one to re-implement all the functions available in Data.IntMap as simple wrappers that do the toEnum/fromEnum conversion. I think making it into its own module is a little cleaner. The conversion from EnumMap to IntMap is substantially cleaner than from IntMap to EnumMap:
type IntMap v = EnumMap Int v
/jve
On Sat, Aug 8, 2009 at 4:41 PM, Henning
Thielemann
On Sat, 8 Aug 2009, John Van Enk wrote:
Hi List,
I've uploaded a first version of EnumMap to hackage.
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.)
Could that be implemented as wrapper around IntMap?

On Sat, 8 Aug 2009, John Van Enk wrote:
That's originally how I was thinking about doing it, but I think that requires one to re-implement all the functions available in Data.IntMap as simple wrappers that do the toEnum/fromEnum conversion. I think making it into its own module is a little cleaner. The conversion from EnumMap to IntMap is substantially cleaner than from IntMap to EnumMap:
type IntMap v = EnumMap Int v
Can you implement EnumMap in terms of the Enum methods, without many conversions to Int? I mean, if you often convert to Int and back then you could achieve the same on top of IntMap. Generally I prefer the strategy "from simple to complex". I consider Enum to be a "wrapper" around Int. http://haskell.org/haskellwiki/Simple_to_complex

What if we say that Enum a generalization, rather than a wrapper, of Int?
If the benchmarks are even, is there a reason to use the more specific
structure rather than the general one? I don't know if Enum being
"more complex" outweighs the benefits of it being "more general" (if
the EnumMap matches IntMap for speed).
Thoughts?
On Sat, Aug 8, 2009 at 6:11 PM, Henning
Thielemann
On Sat, 8 Aug 2009, John Van Enk wrote:
That's originally how I was thinking about doing it, but I think that requires one to re-implement all the functions available in Data.IntMap as simple wrappers that do the toEnum/fromEnum conversion. I think making it into its own module is a little cleaner. The conversion from EnumMap to IntMap is substantially cleaner than from IntMap to EnumMap:
type IntMap v = EnumMap Int v
Can you implement EnumMap in terms of the Enum methods, without many conversions to Int? I mean, if you often convert to Int and back then you could achieve the same on top of IntMap. Generally I prefer the strategy "from simple to complex". I consider Enum to be a "wrapper" around Int. http://haskell.org/haskellwiki/Simple_to_complex

There exists a small but measurable performance hit for at least one
test case (using Int as keys, obviously). Perhaps the bias would be
the other way if we were comparing EnumMap to an IntMap wrapped with
to/from Enum.
Thomas
-- Using Data.IntMap
[tommd@Mavlo Test]$ ghc --make -O2 im.hs
[1 of 1] Compiling Main ( im.hs, im.o )
Linking im ...
[tommd@Mavlo Test]$ ./im
buildMap: 0.625563s
lookupMap: 0.176478s
[tommd@Mavlo Test]$ ./im
buildMap: 0.613668s
lookupMap: 0.174151s
[tommd@Mavlo Test]$ ./im
buildMap: 0.607961s
lookupMap: 0.175584s
-- Using Data.EnumMap
[tommd@Mavlo Test]$ vi im.hs
[tommd@Mavlo Test]$ ghc --make -O2 im.hs
[1 of 1] Compiling Main ( im.hs, im.o )
Linking im ...
[tommd@Mavlo Test]$ ./im
buildMap: 0.705458s
lookupMap: 0.229307s
[tommd@Mavlo Test]$ ./im
buildMap: 0.71757s
lookupMap: 0.231273s
[tommd@Mavlo Test]$ ./im
buildMap: 0.685333s
lookupMap: 0.23883s
Code (sorry, its ugly I know)
{-# LANGUAGE BangPatterns #-}
module Main where
import Data.Time
import qualified Data.EnumMap as E
type IntMap = E.EnumMap Int
-- import qualified Data.IntMap as E
-- type IntMap = E.IntMap
main = do
bench "buildMap" buildMap
!e <- buildMap
bench "lookupMap" (lookupMap e)
bench str func = do
start <- getCurrentTime
!x <- func
finish <- getCurrentTime
let diff = diffUTCTime finish start
putStrLn $ str ++ ":\t" ++ (show diff)
keys = [0..1000000]
buildMap :: IO (IntMap Int)
buildMap = do
return $ go keys keys E.empty
where
go [] _ !m = m
go _ [] !m = m
go (k:ks) (e:es) m = go ks es (E.insert k e m)
lookupMap m = do
check keys m
where
check [] _ = return ()
check (k:ks) m =
if (E.lookup k m /= Just k)
then error "blah"
else check ks m
On Sat, Aug 8, 2009 at 4:02 PM, John Van Enk
What if we say that Enum a generalization, rather than a wrapper, of Int?
If the benchmarks are even, is there a reason to use the more specific structure rather than the general one? I don't know if Enum being "more complex" outweighs the benefits of it being "more general" (if the EnumMap matches IntMap for speed).
Thoughts?
On Sat, Aug 8, 2009 at 6:11 PM, Henning Thielemann
wrote: On Sat, 8 Aug 2009, John Van Enk wrote:
That's originally how I was thinking about doing it, but I think that requires one to re-implement all the functions available in Data.IntMap as simple wrappers that do the toEnum/fromEnum conversion. I think making it into its own module is a little cleaner. The conversion from EnumMap to IntMap is substantially cleaner than from IntMap to EnumMap:
type IntMap v = EnumMap Int v
Can you implement EnumMap in terms of the Enum methods, without many conversions to Int? I mean, if you often convert to Int and back then you could achieve the same on top of IntMap. Generally I prefer the strategy "from simple to complex". I consider Enum to be a "wrapper" around Int. http://haskell.org/haskellwiki/Simple_to_complex
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here. -- Felipe.

On Sat, Aug 8, 2009 at 5:30 PM, Felipe Lessa
On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here.
Actually I tried that by adding SPECIALIZE to insert, insertN and lookup. it seemed to make the insert benchmark competitive but not lookup. Thomas

How bad is the lookup compared to normal?
On Sat, Aug 8, 2009 at 9:02 PM, Thomas
DuBuisson
On Sat, Aug 8, 2009 at 5:30 PM, Felipe Lessa
wrote: On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here.
Actually I tried that by adding SPECIALIZE to insert, insertN and lookup. it seemed to make the insert benchmark competitive but not lookup.
Thomas _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Inflating the number of elements in the test, I see:
IntMap
inserts: 5.3 seconds
lookups: 2.0 seconds
EnumMap
inserts: 6.1 sec (15% slower)
lookups: 2.5 sec (25% slower)
EnumMap with SPECIALIZE of:
{-# SPECIALIZE join :: Prefix -> EnumMap Int a -> Prefix -> EnumMap
Int a -> EnumMap Int a #-}
{-# SPECIALIZE lookup :: Int -> EnumMap Int a -> Maybe a #-}
{-# SPECIALIZE lookupN :: Nat -> EnumMap Int a -> Maybe a #-}
{-# SPECIALIZE insert :: Int -> a -> EnumMap Int a -> EnumMap Int a #-}
inserts: 5.3 seconds (dead on)
lookups: 2.6 seconds (owch!)
Additionally specializing the functions used in lookup{,N} doesn't
help. I tried inlining (via INLINE) a couple things but that only
made performance notably worse.
Thomas
On Sat, Aug 8, 2009 at 8:29 PM, John Van Enk
How bad is the lookup compared to normal?
On Sat, Aug 8, 2009 at 9:02 PM, Thomas DuBuisson
wrote: On Sat, Aug 8, 2009 at 5:30 PM, Felipe Lessa
wrote: On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here.
Actually I tried that by adding SPECIALIZE to insert, insertN and lookup. it seemed to make the insert benchmark competitive but not lookup.
Thomas _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

http://github.com/sw17ch/EnumMap/tree/master
Perhaps you could patch what I have? :)
On Sun, Aug 9, 2009 at 12:08 AM, Thomas
DuBuisson
Inflating the number of elements in the test, I see:
IntMap inserts: 5.3 seconds lookups: 2.0 seconds
EnumMap inserts: 6.1 sec (15% slower) lookups: 2.5 sec (25% slower)
EnumMap with SPECIALIZE of: {-# SPECIALIZE join :: Prefix -> EnumMap Int a -> Prefix -> EnumMap Int a -> EnumMap Int a #-} {-# SPECIALIZE lookup :: Int -> EnumMap Int a -> Maybe a #-} {-# SPECIALIZE lookupN :: Nat -> EnumMap Int a -> Maybe a #-} {-# SPECIALIZE insert :: Int -> a -> EnumMap Int a -> EnumMap Int a #-} inserts: 5.3 seconds (dead on) lookups: 2.6 seconds (owch!)
Additionally specializing the functions used in lookup{,N} doesn't help. I tried inlining (via INLINE) a couple things but that only made performance notably worse.
Thomas
On Sat, Aug 8, 2009 at 8:29 PM, John Van Enk
wrote: How bad is the lookup compared to normal?
On Sat, Aug 8, 2009 at 9:02 PM, Thomas DuBuisson
wrote: On Sat, Aug 8, 2009 at 5:30 PM, Felipe Lessa
wrote: On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here.
Actually I tried that by adding SPECIALIZE to insert, insertN and lookup. it seemed to make the insert benchmark competitive but not lookup.
Thomas _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Inlining natFromInt and intFromNat improves things considerably. Using Thomas DuBuisson's benchmarking code (with a larger number of keys): IntMap: buildMap: 12.2 lookupMap: 2.7 Original EnumMap: buildMap: 13.0s lookupMap: 3.6s EnumMap built with -O2 (not sure the implications of building libraries with -O2) buildMap: 12.9s lookupMap: 2.7s effectively the same time for inserts compared with the original EnumMap improved performance for lookups (effectivly the same as IntMap) EnumMap built with -O2, inline on natFromInt and intFromNat and specialize for insert, and join: (doesn't appear to get a speedup with specialize on lookup and lookupN if EnumMap is built with -O2) buildMap: 12.2s lookupMap: 2.7s The same performance as IntMap! I'm kinda dissapointed ghc can't figure this out automatically, fromEnum/toEnum for Ints is just id. Oh well, a few more annotations in the library and wouldn't see any reason why EnumMap would be slower that IntMap. I'll submit a patch I also tried EnumMap but with a newtyped Int for the keys. I get the same performance for lookups as Int, but the inserts are slower. buildMap: 12.8s lookupMap: 2.7s My guess is that the specialize pragma on insert isn't getting triggered on the newtype (which I think it should) So I have a suggestion for a ghc optimization: Unwrap newtypes before specialization so that the specializer can take advantage of the underlying type. - Job On Sun, Aug 9, 2009 at 12:08 AM, Thomas DuBuisson < thomas.dubuisson@gmail.com> wrote:
Inflating the number of elements in the test, I see:
IntMap inserts: 5.3 seconds lookups: 2.0 seconds
EnumMap inserts: 6.1 sec (15% slower) lookups: 2.5 sec (25% slower)
EnumMap with SPECIALIZE of: {-# SPECIALIZE join :: Prefix -> EnumMap Int a -> Prefix -> EnumMap Int a -> EnumMap Int a #-} {-# SPECIALIZE lookup :: Int -> EnumMap Int a -> Maybe a #-} {-# SPECIALIZE lookupN :: Nat -> EnumMap Int a -> Maybe a #-} {-# SPECIALIZE insert :: Int -> a -> EnumMap Int a -> EnumMap Int a #-} inserts: 5.3 seconds (dead on) lookups: 2.6 seconds (owch!)
Additionally specializing the functions used in lookup{,N} doesn't help. I tried inlining (via INLINE) a couple things but that only made performance notably worse.
Thomas
On Sat, Aug 8, 2009 at 8:29 PM, John Van Enk
wrote: How bad is the lookup compared to normal?
On Sat, Aug 8, 2009 at 9:02 PM, Thomas DuBuisson
wrote: On Sat, Aug 8, 2009 at 5:30 PM, Felipe Lessa
wrote: On Sat, Aug 08, 2009 at 04:14:15PM -0700, Thomas DuBuisson wrote:
There exists a small but measurable performance hit for at least one test case (using Int as keys, obviously). Perhaps the bias would be the other way if we were comparing EnumMap to an IntMap wrapped with to/from Enum.
Perhaps some SPECIALIZE pragmas would help here.
Actually I tried that by adding SPECIALIZE to insert, insertN and lookup. it seemed to make the insert benchmark competitive but not lookup.
Thomas _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Sun, Aug 9, 2009 at 11:44 AM, Job Vranish
Inlining natFromInt and intFromNat improves things considerably. EnumMap built with -O2 (not sure the implications of building libraries with -O2)
Nice work Job, for some reason it didn't occur to me that the .cabal omitted -O2. The remaining question I have about the specialize pragma is where does this end? Should it only be specialized for Int? Perhaps there are other common instance of Enum that people will end up using for which we should add a SPECIALIZE. With N types and M functions we are talking about N x M rules, which could get ugly fast. Thomas

I've been trying to implement EnumMap as a wrapper for IntMap. Here's the first problem I ran into:
IntMap.insertWithKey :: (Key -> a -> a -> a) -> Key -> a -> IntMap a -> IntMap a
I'd like to translate this to something like:
EnumMap.insertWithKey :: Enum k => (Key k -> a -> a -> a) -> Key k -> a -> EnumMap k a -> EnumMap k a
My initial thought was just to make it a normal wrapper:
EnumMap.insertWithKey f k v m = IntMap.insertWithKey f (unKey k) v (unEnumMap m)
The obvious problem here is the type of `f' expected by the wrapper
function and the internal function. Either we force the wrapper to
take a function that takes an Int as the first parameter, or we
rewrite the logic of insertWithKey to allow us to use the proper (Key
k) type.
I don't see an obvious way around this--am I missing something?
On Sat, Aug 8, 2009 at 4:41 PM, Henning
Thielemann
On Sat, 8 Aug 2009, John Van Enk wrote:
Hi List,
I've uploaded a first version of EnumMap to hackage.
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.)
Could that be implemented as wrapper around IntMap?

Allow me to answer my own question:
EnumMap.insertWithKey f k v m = IntMap.insertWithKey (f . Key . toEnum) (unKey k) v (unEnumMap m)
On Mon, Aug 10, 2009 at 2:32 PM, John Van Enk
I've been trying to implement EnumMap as a wrapper for IntMap. Here's the first problem I ran into:
IntMap.insertWithKey :: (Key -> a -> a -> a) -> Key -> a -> IntMap a -> IntMap a
I'd like to translate this to something like:
EnumMap.insertWithKey :: Enum k => (Key k -> a -> a -> a) -> Key k -> a -> EnumMap k a -> EnumMap k a
My initial thought was just to make it a normal wrapper:
EnumMap.insertWithKey f k v m = IntMap.insertWithKey f (unKey k) v (unEnumMap m)
The obvious problem here is the type of `f' expected by the wrapper function and the internal function. Either we force the wrapper to take a function that takes an Int as the first parameter, or we rewrite the logic of insertWithKey to allow us to use the proper (Key k) type.
I don't see an obvious way around this--am I missing something?
On Sat, Aug 8, 2009 at 4:41 PM, Henning Thielemann
wrote: On Sat, 8 Aug 2009, John Van Enk wrote:
Hi List,
I've uploaded a first version of EnumMap to hackage.
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.)
Could that be implemented as wrapper around IntMap?

For the sake of testing both options, here's a branch with the wrapper
code instead of the replacement/integrated code:
http://github.com/sw17ch/EnumMap/tree/wrapper
On Sat, Aug 8, 2009 at 4:41 PM, Henning
Thielemann
On Sat, 8 Aug 2009, John Van Enk wrote:
Hi List,
I've uploaded a first version of EnumMap to hackage.
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance, but it still passes all the tests that ship with IntMap. (My guess is that performance will be similar/identical unless I've missed something.)
Could that be implemented as wrapper around IntMap?

John Van Enk
EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance,
Will it have an impact on correctness? There are some funky Enum instances around: Prelude> map fromEnum [1,1.5,2] [1,1,2] Prelude Data.Int> fromEnum (10000000000 :: Int64) *** Exception: Enum.fromEnum{Int64}: value (10000000000) is outside of Int's bounds (-2147483648,2147483647) -k -- If I haven't seen further, it is by standing in the footprints of giants

On Mon, Aug 10, 2009 at 03:24:36PM +0200, Ketil Malde wrote:
John Van Enk
writes: EnumMap is a generalization of IntMap that constrains the key to Enum rather than forcing it to be Int. I have no idea what impact this has on performance,
Will it have an impact on correctness? There are some funky Enum instances around:
IMO it's implicit that keys overwrite eachother whenever their 'fromEnum' is equal, however that may be spoken in the docs. Depending on what you want you may or may not desire to have this behaviour, but this is how this data structure works. So, yes, it's correct, just different. :) -- Felipe.

Felipe Lessa
There are some funky Enum instances around:
IMO it's implicit that keys overwrite eachother whenever their 'fromEnum' is equal, however that may be spoken in the docs.
I couldn't find anything explicit in the documentation. I'd suggest a clear note at the top, dismissing the (IMO natural) notion that "EnumMap k v" behaves like "Map k v" (which was true for IntMap and Map Int, I believe). And perhaps also note that you will get exceptions for values outside the Enum range. It strikes me that using Bits instead of Enum might be more likely to be what people want in many cases - but perhaps that would be too slow? Also that Enum really should map to Integer, but again, that's a speed issue.¹ One could also question the sanity of using e.g. floating point values as keys, but Map supports this, so who am I to judge. (Also, a minor documentation niggle is that Haskell only guarantees 30 bits for an Int, it's GHC that uses Ints of 32 and 64 bits. One could argue that it is the report that should be fixed here, unless one can imagine a program that depends on correct modulo arithmetic with an unknown quotient.) -k ¹ This of course migth give the careless reader of the Report the impression that the Haskell community values speed over correctness, and thus that we actually are aiming for popularity and mainstream recognition after all. -- If I haven't seen further, it is by standing in the footprints of giants

On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
Felipe Lessa
writes: There are some funky Enum instances around:
IMO it's implicit that keys overwrite eachother whenever their 'fromEnum' is equal, however that may be spoken in the docs.
I couldn't find anything explicit in the documentation. I'd suggest a clear note at the top, dismissing the (IMO natural) notion that "EnumMap k v" behaves like "Map k v" (which was true for IntMap and Map Int, I believe).
I haven't updated any of the documentation from IntMap over s/IntMap/EnumMap/g.
And perhaps also note that you will get exceptions for values outside the Enum range.
I'd think that part is obvious.
It strikes me that using Bits instead of Enum might be more likely to be what people want in many cases - but perhaps that would be too slow? Also that Enum really should map to Integer, but again, that's a speed issue.¹
I think Enum is a little more natural--perhaps I'm wrong.

On Wed, Aug 12, 2009 at 9:16 AM, John Van Enk
On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
wrote: And perhaps also note that you will get exceptions for values outside the Enum range.
I'd think that part is obvious.
That depends on what "outside the Enum range" means. You'll get an
exception if you somehow get an Int key in the map which doesn't
correspond to any value in the enum, but you don't get an exception if
you try to pass in, say, a large Integer.
Prelude> fromEnum (2^32)
0
In essence, you're using Enum as a hash function, but not making any
provision for hash collisions.
--
Dave Menendez

On Wed, Aug 12, 2009 at 11:34 AM, David Menendez
On Wed, Aug 12, 2009 at 9:16 AM, John Van Enk
wrote: On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
wrote: And perhaps also note that you will get exceptions for values outside the Enum range.
I'd think that part is obvious.
That depends on what "outside the Enum range" means. You'll get an exception if you somehow get an Int key in the map which doesn't correspond to any value in the enum...
We should be protected by the type system against unmatched Int's in the map as long as you have sane Enum instances.
... but you don't get an exception if you try to pass in, say, a large Integer. Prelude> fromEnum (2^32) 0
In essence, you're using Enum as a hash function, but not making any provision for hash collisions.
Unless I'm mistaken, the Enum typeclass _is_ a hash function who's keyspace is the range of Int that doesn't make any provisions for collisions.

On Wed, Aug 12, 2009 at 12:07 PM, John Van Enk
On Wed, Aug 12, 2009 at 11:34 AM, David Menendez
wrote: On Wed, Aug 12, 2009 at 9:16 AM, John Van Enk
wrote: On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
wrote: And perhaps also note that you will get exceptions for values outside the Enum range.
I'd think that part is obvious.
That depends on what "outside the Enum range" means. You'll get an exception if you somehow get an Int key in the map which doesn't correspond to any value in the enum...
We should be protected by the type system against unmatched Int's in the map as long as you have sane Enum instances.
One would hope so.
... but you don't get an exception if you try to pass in, say, a large Integer. Prelude> fromEnum (2^32) 0
In essence, you're using Enum as a hash function, but not making any provision for hash collisions.
Unless I'm mistaken, the Enum typeclass _is_ a hash function who's keyspace is the range of Int that doesn't make any provisions for collisions.
Hash functions traditionally map integers to integers, so I would
describe fromEnum as "like" a hash function, but that's not important.
And yes, hash functions collide, which is why hash tables employ
various methods for distinguishing keys that hash to the same values.
EnumMap silently passes this responsibility to the user, without even
a note in the documentation.
--
Dave Menendez

EnumMap silently passes this responsibility to the user, without even a note in the documentation.
Like I've said, I made no modifications to the documentation other
than replacing IntMap with EnumMap. Should the community show more
interest in the EnumMap, such a change will show up in the docs.
On Wed, Aug 12, 2009 at 12:58 PM, David Menendez
On Wed, Aug 12, 2009 at 12:07 PM, John Van Enk
wrote: On Wed, Aug 12, 2009 at 11:34 AM, David Menendez
wrote: On Wed, Aug 12, 2009 at 9:16 AM, John Van Enk
wrote: On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
wrote: And perhaps also note that you will get exceptions for values outside the Enum range.
I'd think that part is obvious.
That depends on what "outside the Enum range" means. You'll get an exception if you somehow get an Int key in the map which doesn't correspond to any value in the enum...
We should be protected by the type system against unmatched Int's in the map as long as you have sane Enum instances.
One would hope so.
... but you don't get an exception if you try to pass in, say, a large Integer. Prelude> fromEnum (2^32) 0
In essence, you're using Enum as a hash function, but not making any provision for hash collisions.
Unless I'm mistaken, the Enum typeclass _is_ a hash function who's keyspace is the range of Int that doesn't make any provisions for collisions.
Hash functions traditionally map integers to integers, so I would describe fromEnum as "like" a hash function, but that's not important.
And yes, hash functions collide, which is why hash tables employ various methods for distinguishing keys that hash to the same values. EnumMap silently passes this responsibility to the user, without even a note in the documentation.
-- Dave Menendez
http://www.eyrie.org/~zednenem/

On Wed, Aug 12, 2009 at 01:03:55PM -0400, John Van Enk wrote:
EnumMap silently passes this responsibility to the user, without even a note in the documentation.
Like I've said, I made no modifications to the documentation other than replacing IntMap with EnumMap. Should the community show more interest in the EnumMap, such a change will show up in the docs.
I'm showing interest in EnumMap :). I have some IntMap's that provide no type safety about what is inserted and EnumMap solves that problem as nicely as possible. Right now I don't really care about the doumentation problems, but I would like to see the SPECIALIZE and INLINE improvements, when will there be a new version? Thanks a lot! -- Felipe.

I'll hunt down those changes and push something new to hackage. :)
On Mon, Sep 14, 2009 at 12:27 PM, Felipe Lessa
On Wed, Aug 12, 2009 at 01:03:55PM -0400, John Van Enk wrote:
EnumMap silently passes this responsibility to the user, without even a note in the documentation.
Like I've said, I made no modifications to the documentation other than replacing IntMap with EnumMap. Should the community show more interest in the EnumMap, such a change will show up in the docs.
I'm showing interest in EnumMap :). I have some IntMap's that provide no type safety about what is inserted and EnumMap solves that problem as nicely as possible.
Right now I don't really care about the doumentation problems, but I would like to see the SPECIALIZE and INLINE improvements, when will there be a new version?
Thanks a lot!
-- Felipe. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

http://hackage.haskell.org/package/EnumMap
Changes pushed. Job Vranish added the SPECIALIZE pragmas, and I believe he
has more data concerning how much this helps.
Note, the git repo is here: http://github.com/sw17ch/EnumMap
/jve
On Mon, Sep 14, 2009 at 12:27 PM, Felipe Lessa
On Wed, Aug 12, 2009 at 01:03:55PM -0400, John Van Enk wrote:
EnumMap silently passes this responsibility to the user, without even a note in the documentation.
Like I've said, I made no modifications to the documentation other than replacing IntMap with EnumMap. Should the community show more interest in the EnumMap, such a change will show up in the docs.
I'm showing interest in EnumMap :). I have some IntMap's that provide no type safety about what is inserted and EnumMap solves that problem as nicely as possible.
Right now I don't really care about the doumentation problems, but I would like to see the SPECIALIZE and INLINE improvements, when will there be a new version?
Thanks a lot!
-- Felipe. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Mon, Sep 14, 2009 at 11:32:04PM -0400, John Van Enk wrote:
http://hackage.haskell.org/package/EnumMap
Changes pushed. Job Vranish added the SPECIALIZE pragmas, and I believe he has more data concerning how much this helps.
Thanks a lot! :) -- Felipe.

David Menendez
That depends on what "outside the Enum range" means. You'll get an exception if you somehow get an Int key in the map which doesn't correspond to any value in the enum, but you don't get an exception if you try to pass in, say, a large Integer.
Prelude> fromEnum (2^32) 0
Yes, but: Prelude Data.Int> fromEnum (2^32 :: Int64) *** Exception: Enum.fromEnum{Int64}: value (4294967296) is outside of Int's bounds (-2147483648,2147483647) so apparently, different Enum instances deal with this differently.
From GHC.Num:
instance Enum Integer where [...] fromEnum n = I# (toInt# n)
From GHC.Int:
instance Enum Int64 where [...] fromEnum x@(I64# x#) | x >= fromIntegral (minBound::Int) && x <= fromIntegral (maxBound::Int) = I# (int64ToInt# x#) | otherwise = fromEnumError "Int64" x -k -- If I haven't seen further, it is by standing in the footprints of giants

It strikes me that using Bits instead of Enum might be more likely to be what people want in many cases
It wouldn't hurt for you (or someone) to implement this. I'm sure it
would be useful.
On Wed, Aug 12, 2009 at 2:09 AM, Ketil Malde
Felipe Lessa
writes: There are some funky Enum instances around:
IMO it's implicit that keys overwrite eachother whenever their 'fromEnum' is equal, however that may be spoken in the docs.
I couldn't find anything explicit in the documentation. I'd suggest a clear note at the top, dismissing the (IMO natural) notion that "EnumMap k v" behaves like "Map k v" (which was true for IntMap and Map Int, I believe).
And perhaps also note that you will get exceptions for values outside the Enum range.
It strikes me that using Bits instead of Enum might be more likely to be what people want in many cases - but perhaps that would be too slow? Also that Enum really should map to Integer, but again, that's a speed issue.¹
One could also question the sanity of using e.g. floating point values as keys, but Map supports this, so who am I to judge.
(Also, a minor documentation niggle is that Haskell only guarantees 30 bits for an Int, it's GHC that uses Ints of 32 and 64 bits. One could argue that it is the report that should be fixed here, unless one can imagine a program that depends on correct modulo arithmetic with an unknown quotient.)
-k
¹ This of course migth give the careless reader of the Report the impression that the Haskell community values speed over correctness, and thus that we actually are aiming for popularity and mainstream recognition after all. -- If I haven't seen further, it is by standing in the footprints of giants _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (7)
-
David Menendez
-
Felipe Lessa
-
Henning Thielemann
-
Job Vranish
-
John Van Enk
-
Ketil Malde
-
Thomas DuBuisson