holy frijoles: constant factors matter.

these differences in jhc runtimes were the result of a single line change. < total time = 44.32 secs (2216 ticks @ 20 ms) < total alloc = 9,372,965,276 bytes (excludes profiling overheads) ---
total time = 36.74 secs (1837 ticks @ 20 ms) total alloc = 6,621,233,076 bytes (excludes profiling overheads)
the odd thing is, the change should not have changed the order of the algorithm at all. the change was (effectivly) Map.fromAscList . map f . Map.toAscList being changed to Map.map f both are O(n). but those constant factors matter. a lot. in any case, the reason I mention it is because if there were routines for converting between Set and Map that preserved the binary tree structure, I think their use could dramatically speed up many routines. In jhc (and other programs), I convert between them all the time and I'd hate to have to fall back on type Set a = Map a (). Map already has a keysSet, but it does the same inefficient thing building an intermediate list. all that is needed is a setToMap :: (a -> b) -> Set a -> Map a b the keysSet could be made more efficient in the next point release, but I guess adding setToMap would have to wait til the next major one. John -- John Meacham - ⑆repetae.net⑆john⑈

On Wed, Oct 12, 2005 at 07:22:07PM -0700, John Meacham wrote:
these differences in jhc runtimes were the result of a single line change.
< total time = 44.32 secs (2216 ticks @ 20 ms) < total alloc = 9,372,965,276 bytes (excludes profiling overheads) ---
total time = 36.74 secs (1837 ticks @ 20 ms) total alloc = 6,621,233,076 bytes (excludes profiling overheads)
the odd thing is, the change should not have changed the order of the algorithm at all. the change was (effectivly)
Map.fromAscList . map f . Map.toAscList being changed to Map.map f
Note that fromAscList is misnamed: it actually assumes a non-descending list and checks for duplicates. It's fromDistinctAscList that assumes an ascending list.
in any case, the reason I mention it is because if there were routines for converting between Set and Map that preserved the binary tree structure, I think their use could dramatically speed up many routines.
The old Data.Set was a wrapper around the finite map type, which made that sort of thing easy to do. The new one is a specialized copy for efficiency, at the cost of extra code. It's possible, but messy. Will keysSet and setToMap be enough, or will we want intersection and two versions each of difference and isSubmapOf too?
Map already has a keysSet, but it does the same inefficient thing building an intermediate list. all that is needed is a
setToMap :: (a -> b) -> Set a -> Map a b
This would be a useful abstraction even with the simplistic definition setToMap :: (a -> b) -> Set a -> Map a b setToMap f s = Data.Map.fromDistinctAscList [(k, f k) | k <- Data.Set.toAscList s]

On 13/10/05, Ross Paterson
On Wed, Oct 12, 2005 at 07:22:07PM -0700, John Meacham wrote:
these differences in jhc runtimes were the result of a single line change.
< total time = 44.32 secs (2216 ticks @ 20 ms) < total alloc = 9,372,965,276 bytes (excludes profiling overheads) ---
total time = 36.74 secs (1837 ticks @ 20 ms) total alloc = 6,621,233,076 bytes (excludes profiling overheads)
the odd thing is, the change should not have changed the order of the algorithm at all. the change was (effectivly)
Map.fromAscList . map f . Map.toAscList being changed to Map.map f
Note that fromAscList is misnamed: it actually assumes a non-descending list and checks for duplicates. It's fromDistinctAscList that assumes an ascending list.
in any case, the reason I mention it is because if there were routines for converting between Set and Map that preserved the binary tree structure, I think their use could dramatically speed up many routines.
The old Data.Set was a wrapper around the finite map type, which made that sort of thing easy to do. The new one is a specialized copy for efficiency, at the cost of extra code. It's possible, but messy. Will keysSet and setToMap be enough, or will we want intersection and two versions each of difference and isSubmapOf too?
Map already has a keysSet, but it does the same inefficient thing building an intermediate list. all that is needed is a
setToMap :: (a -> b) -> Set a -> Map a b
This would be a useful abstraction even with the simplistic definition
setToMap :: (a -> b) -> Set a -> Map a b setToMap f s = Data.Map.fromDistinctAscList [(k, f k) | k <- Data.Set.toAscList s]
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
I'd just like to mention that I proposed these a while back :) (The post was titled "Data.Map, Data.Set working together") Here are some things which I'd like to see: Restrict a function to a finite map: restrict :: (Ord a) => (a -> b) -> (Set a) -> (Map a b) Restrict one map to another: restrictMap :: (Ord a) => (Map a b) -> (Set a) -> (Map a b) (maybe a typeclass for the above two?) Domain of a map: domain :: (Ord a) => (Map a b) -> (Set a) Range of a map: range :: (Ord a, Ord b) => (Map a b) -> (Set b) It would also be nice to have things like join :: (Ord a) => Set (Set a) -> Set a (this is union, will correspond to monadic join once that's possible :) and powerSet :: Set a -> Set (Set a). - Cale

On Thu, Oct 13, 2005 at 05:28:05AM -0400, Cale Gibbard wrote:
I'd just like to mention that I proposed these a while back :) (The post was titled "Data.Map, Data.Set working together")
Here are some things which I'd like to see:
Restrict a function to a finite map: restrict :: (Ord a) => (a -> b) -> (Set a) -> (Map a b)
Restrict one map to another: restrictMap :: (Ord a) => (Map a b) -> (Set a) -> (Map a b) (maybe a typeclass for the above two?)
Domain of a map: domain :: (Ord a) => (Map a b) -> (Set a)
Range of a map: range :: (Ord a, Ord b) => (Map a b) -> (Set b)
It would also be nice to have things like join :: (Ord a) => Set (Set a) -> Set a (this is union, will correspond to monadic join once that's possible :) and powerSet :: Set a -> Set (Set a).
These all seem useful too. I was thinking that the actual data declarations for Set and Map could be split off to their own module Data.Containers.Private? GHC.SetMap? and Data.Set and Data.Map could export their fully abstract versions but would be able to get at each others internal structure. another thing that is sorely missing is monadic versions of the mapping functions and a FunctorM instance. John -- John Meacham - ⑆repetae.net⑆john⑈
participants (3)
-
Cale Gibbard
-
John Meacham
-
Ross Paterson