containers and maps

Hello, The recent discussion regarding One Container Class To Rule Them All got me to thinking about container classes and map functions. ListLike (which is really a pretty nice package, and I would like to see more people use it) provides two map functions, a regular map with type
map :: (item -> item') -> full -> full'
and also rigidMap:
rigidMap :: (item -> item) -> full -> full
I think the utility of rigidMap is clear for anyone who's tried to solve the container class problem. Anyway, I had the idea of taking both maps out of ListLike entirely, and replace them with this:
class (ListLike full item, ListLike full' item') => Map full item full' item' where map :: (item -> item') -> full -> full'
Wouldn't this be better? I see the following benefits: -instances for polymorphic types like List are trivial -instances for monomorphic types (e.g. ByteString) can use the native map implementation. -Instances for types with class restrictions on the elements (e.g. UVector) can also be written. On the downside, the class context is much longer, but you'd only need to write the whole thing when you're actually using map, and then it'll include the ListLike instance automatically. Obviously this isn't so much one container class to rule them all as it is a set of classes that all work together, but as long as it's sensible that's fine with me. Thoughts? John Lato

The monoids package offers something similar to this: mapReduce :: (Generator c, Reducer e m) => (Elem c -> e) -> c -> m If we take (Elem c) to be (item), (e) to be (item'), (c) to be (full), and (m) to be (full'), it's basically the same thing, and offers the same advantages as the ones you listed, as far as I can tell. - Jake

Jake McArthur wrote:
The monoids package offers something similar to this:
mapReduce :: (Generator c, Reducer e m) => (Elem c -> e) -> c -> m
If we take (Elem c) to be (item), (e) to be (item'), (c) to be (full), and (m) to be (full'), it's basically the same thing, and offers the same advantages as the ones you listed, as far as I can tell.
Your example about uvector inspired me to try writing out the necessary instances for uvector: instance UA a => Monoid (UArr a) where mempty = emptyU mappend = appendU instance UA a => Reducer a (UArr a) where unit = singletonU snoc = snocU cons = consU instance UA a => Generator (UArr a) where type Elem (UArr a) = a mapTo f = foldlU (\a -> snoc a . f) - Jake

On 8/13/09, Jake McArthur
Jake McArthur wrote:
The monoids package offers something similar to this:
mapReduce :: (Generator c, Reducer e m) => (Elem c -> e) -> c -> m
If we take (Elem c) to be (item), (e) to be (item'), (c) to be (full), and (m) to be (full'), it's basically the same thing, and offers the same advantages as the ones you listed, as far as I can tell.
Your example about uvector inspired me to try writing out the necessary instances for uvector:
instance UA a => Monoid (UArr a) where mempty = emptyU mappend = appendU
instance UA a => Reducer a (UArr a) where unit = singletonU snoc = snocU cons = consU
instance UA a => Generator (UArr a) where type Elem (UArr a) = a mapTo f = foldlU (\a -> snoc a . f)
This looks to be essentially the same as the 'map' function in ListLike, and suffers from the same problem. It won't have the performance characteristics of the native map functions. Using e.g. ByteStrings, you're recreating a ByteString by snoc'ing elements. This might work with UVector (I intend to try it this evening); I don't know how well the fusion framework will hold up in class dictionaries. Still, the monoids package is very powerful (and I'd completely forgotten it). Perhaps there's another approach that would work? John

John Lato wrote:
This looks to be essentially the same as the 'map' function in ListLike, and suffers from the same problem. It won't have the performance characteristics of the native map functions. Using e.g. ByteStrings, you're recreating a ByteString by snoc'ing elements.
Oh, I see now what you are after. You're right. This wouldn't play nice when creating ByteStrings (which is probably why there is no instance for Reducer Char ByteString).
This might work with UVector (I intend to try it this evening); I don't know how well the fusion framework will hold up in class dictionaries.
Do report back, as I am curious as well.
Still, the monoids package is very powerful (and I'd completely forgotten it). Perhaps there's another approach that would work?
Yay, something to mull over! :) - Jake

On Thu, Aug 13, 2009 at 1:51 PM, Jake McArthur
John Lato wrote:
This might work with UVector (I intend to try it this evening); I don't know how well the fusion framework will hold up in class dictionaries.
Do report back, as I am curious as well.
I have just recently hacked together a small test. The code is at http://inmachina.net/~jwlato/haskell/testUVector.hs The task is to generate a list of 1 million Ints (really 1e6-1), map a multiply by 2, and check to see if any values equal 0. The uvector code is (essentially):
let res = anyU (== ( 0:: Int)) . mapU (* 2) . enumFromToU 1 $ 1000000
and by itself runs in a small fraction of a second. Technically the start and stop values are specified on the command line, but these are the values I predominantly used. Here are results for some other modes: ListLike.map: significantly slower, and runs out of memory (memory request failed) ListLike.rigidMap: appears to be the same as using mapU directly mapReduce: significantly slower (the 1e6 element test completed after about an hour), but ran in a very small amount of memory. It looks like GHC was able to see through the rigidMap function at least, but I don't know if this would still work with the instance defined in another file. Cheers, John

Yeah, my answer on the ByteString front is to make a bunch of Reducers for a
Builder data type that works like the one in Data.Binary.Builder. It
generates a much more suitable Reducer and can reduce Chars, Strings, Strict
and Lazy UTF8-encoded Bytestrings, etc. I'm using that inside of my toy
compiler right now when I need to generate output, like a Lazy ByteString of
source. (My variant Builder -- ok, my mostly cut and pasted Builder, has
been made slightly smarter and is now tuned to work with UTF8 encoded data,
which is what I've been feeding it lately.) I may migrate it into the
monoids library from my current project if there is interest, but I'm trying
to avoid ballooning that up too far in size again and re-entering dependency
hell.
The monoids library drew a pretty hard distinction between things that build
up nicely (Monoids/Reducers) and things that tear down easily (Generators)
but only works with a few things that do both efficiently (i.e.
Seq/FingerTree) For my purposes this has worked rather well, but raw strict
(and to a lesser degree, lazy) ByteStrings make abysmal Reducers. ;)
In the end, bolting a list-like interface on something that can be _made_ to
implement all of the operations but can't be made to implement them remotely
efficiently, seems like it is asking for trouble, bug reports about
performance, and pain.
On the other hand, FingerTrees of wrapped strict ByteStrings work nicely as
a monoidal lazy bytestring-like structure that provides an efficient snoc,
indexing operation, and very efficient slicing for extracting token values
with maximal sharing. I'll be giving a talk at the next Boston Haskell User
Group in a few days about my abuse of those in a terrible hodge-podge
solution of Iteratees, Parsec 3 and monoids, which yields an efficient
parallel/incremental lexer.
I'll see about posting the slides afterwards.
-Edward Kmett
On Thu, Aug 13, 2009 at 4:43 PM, John Lato
On Thu, Aug 13, 2009 at 1:51 PM, Jake McArthur
wrote: John Lato wrote:
This might work with UVector (I intend to try it this evening); I don't know how well the fusion framework will hold up in class dictionaries.
Do report back, as I am curious as well.
I have just recently hacked together a small test. The code is at http://inmachina.net/~jwlato/haskell/testUVector.hs
The task is to generate a list of 1 million Ints (really 1e6-1), map a multiply by 2, and check to see if any values equal 0. The uvector code is (essentially):
let res = anyU (== ( 0:: Int)) . mapU (* 2) . enumFromToU 1 $ 1000000
and by itself runs in a small fraction of a second. Technically the start and stop values are specified on the command line, but these are the values I predominantly used.
Here are results for some other modes:
ListLike.map: significantly slower, and runs out of memory (memory request failed) ListLike.rigidMap: appears to be the same as using mapU directly mapReduce: significantly slower (the 1e6 element test completed after about an hour), but ran in a very small amount of memory.
It looks like GHC was able to see through the rigidMap function at least, but I don't know if this would still work with the instance defined in another file.
Cheers, John _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (3)
-
Edward Kmett
-
Jake McArthur
-
John Lato