Can containers depend on Template Haskell?

Hi all, I'm working on a proposal for adding a Data.HashMap and Data.HashSet type to containers. These types could share most of their implementation with Data.IntMap (and Data.IntSet) but can be (somewhat) more efficiently implemented by specializing IntMap from data IntMap a = Nil | Tip {-# UNPACK #-} !Key a | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) to data FullList k v = FL !k v (List k v) data List k v = Nil | Cons !k v (List k v) data IntMap a = Nil | Tip {-# UNPACK #-} !Hash {-# UNPACK #-} !(FullList k v) | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a) Unpacking the FullList into the Tip constructor saves one indirection and two words of overhead per key/value pair. One way to achieve this specialization would be to duplicate the implementation of Data.IntMap, but it would be nice to avoid the duplication. Template Haskell would allow most of the implementation to be shared without sacrificing performance. Can containers use Template Haskell? I think the answer is no, but I thought I'd check anyway. Johan

Hi,
Hi all,
I'm working on a proposal for adding a Data.HashMap and Data.HashSet type to containers. These types could share most of their implementation with Data.IntMap (and Data.IntSet) but can be (somewhat) more efficiently implemented by specializing IntMap from
data IntMap a = Nil | Tip {-# UNPACK #-} !Key a | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a)
to
data FullList k v = FL !k v (List k v)
data List k v = Nil | Cons !k v (List k v)
data IntMap a = Nil | Tip {-# UNPACK #-} !Hash {-# UNPACK #-} !(FullList k v) | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a)
Unpacking the FullList into the Tip constructor saves one indirection and two words of overhead per key/value pair.
One way to achieve this specialization would be to duplicate the implementation of Data.IntMap, but it would be nice to avoid the duplication. Template Haskell would allow most of the implementation to be shared without sacrificing performance.
Can containers use Template Haskell? I think the answer is no, but I thought I'd check anyway.
Well, there would be another benefit -- if the containers exports the implementation of also Map and Set as a Q [Dec] (that is what you would need for IntMap, right?), then the clients could easily create specialised datatypes for Set and Map, ie. SetWord working as Set Word, containing unboxed Words. Generally it would just need some tweaking in the Q [Dec]. I was considering doing this in some containers-exposed-by-th package, but if the containers could do it by themselves, that would be awesome. Of course the package could be still compiled by other compilers, it would just not export the implementations. Cheers, Milan

On 11/23/10 4:37 AM, Milan Straka wrote:
I'm working on a proposal for adding a Data.HashMap and Data.HashSet type to containers. These types could share most of their implementation with Data.IntMap (and Data.IntSet) but can be (somewhat) more efficiently implemented by specializing IntMap from [...]
Well, there would be another benefit -- if the containers exports the implementation of also Map and Set as a Q [Dec] (that is what you would need for IntMap, right?), then the clients could easily create specialised datatypes for Set and Map, ie. SetWord working as Set Word, containing unboxed Words. Generally it would just need some tweaking in the Q [Dec].
It sounds like TH is a no-go, but it would be nice to try to make the situation a bit better for this sort of thing. For example, the bytestring-trie package[1] has to duplicate all the bit-twiddling logic of IntMap for (essentially) defining a Word8Map, because Data.IntMap doesn't make that logic public. By and large the logic[2] is independent of the actual size of the thing being tried on, and it features some non-trivial optimizations like using shiftRL# and the benchmarking behind the highestBitMask implementation (which is one of the size-dependent parts). Though perhaps updating containers to make the bit-twiddling functions public should be a different proposal. One thing I'd like to see in the proposed Data.HashMap/HashSet implementations is something sufficiently general that the underlying Data.Map/Set can be abstracted out. For instance, Data.Trie is more performant than Map ByteString; and while HashMap ByteString is more performant still, it ends up failing over to Map ByteString which is suboptimal. It'd be nice to have the hooks set up so that it's trivial to implement HashTrie which is a HashMap ByteString that fails over to Trie instead. HashTrie couldn't take advantage of any of the trie structure for manipulating the map, but it could still share the equality/ordering testing to make lookup/insertion faster. [1] Which needs some loving in order to incorporate the latest optimizations from containers. Hopefully this will happen soon :) [2] http://hackage.haskell.org/packages/archive/bytestring-trie/0.2.2/doc/html/s... -- Live well, ~wren

Milan Straka schrieb:
I was considering doing this in some containers-exposed-by-th package, but if the containers could do it by themselves, that would be awesome. Of course the package could be still compiled by other compilers, it would just not export the implementations.
That's not possible - a package at a specific version must export the same API independent of the compiler, operating system and so on.

On Dec 4, 2010, at 3:07 AM, Henning Thielemann wrote:
That's not possible - a package at a specific version must export the same API independent of the compiler, operating system and so on.
I agree that that is desirable, but I don't understand why it is a "must". I can easily imagine packages where, one or two extra functions are available depending on the arch, os, or compiler version. We currently have the functionality in cabal and GHC to have conditional compilation based on such factors. So, we should recognize that it is possible, and at times, a reasonable design compromise. Or at the very least, even if we believe it is a bad design choice, we should recognize that other library developers may think otherwise. - Mark

Mark Lentczner schrieb:
On Dec 4, 2010, at 3:07 AM, Henning Thielemann wrote:
That's not possible - a package at a specific version must export the same API independent of the compiler, operating system and so on.
I agree that that is desirable, but I don't understand why it is a "must". I can easily imagine packages where, one or two extra functions are available depending on the arch, os, or compiler version. We currently have the functionality in cabal and GHC to have conditional compilation based on such factors. So, we should recognize that it is possible, and at times, a reasonable design compromise. Or at the very least, even if we believe it is a bad design choice, we should recognize that other library developers may think otherwise.
How can I safely import a package A with a specific version, if it exports different APIs according to a bunch of other conditions? I have to copy all those conditions from A.cabal into my own cabal file, even more the conditions of all such packages with an API that depends on more than the version. This is really a FAQ, but I cannot find earlier discussion easily.

On Dec 4, 2010, at 3:25 PM, Henning Thielemann wrote:
How can I safely import a package A with a specific version, if it exports different APIs according to a bunch of other conditions? I have to copy all those conditions from A.cabal into my own cabal file, even more the conditions of all such packages with an API that depends on more than the version.
Yes, that's how. I'm not being dense here. I understand the implications of such a choice in a library. But I can easily imagine libraries where the developers will decide to offer slightly different APIs on different platforms, Since we have template haskell and cpp preprocessing available, these scenarios will arise. - Mark

On Sat, 4 Dec 2010, Mark Lentczner wrote:
I'm not being dense here. I understand the implications of such a choice in a library. But I can easily imagine libraries where the developers will decide to offer slightly different APIs on different platforms, Since we have template haskell and cpp preprocessing available, these scenarios will arise.
Why not putting those things into separate packages like ghc (compiler dependent), unix, alsa (OS dependent), cpuid (processor dependent)? Clean and easy, no need for hacks or complicated Cabal files.

On Sat, 4 Dec 2010, Mark Lentczner wrote:
I'm not being dense here. I understand the implications of such a choice in a library. But I can easily imagine libraries where the developers will decide to offer slightly different APIs on different platforms, Since we have template haskell and cpp preprocessing available, these scenarios will arise.
Unfortunately it happened too often already. Especially if a package depends on user flags, the importer of that package has no chance to find out, what API of the actual installation of the package is. I have written this into http://www.haskell.org/haskellwiki/Cabal/Developer-FAQ#Enabling_additional_f... Given the problems people already have with Cabal and conflicting versions, I really really plead for simple packages. One package for each task. If a package can be compiled on different platforms, that's great. But it should provide a unified API in order to simplify the work for other package authors.

On 23/11/2010, at 09:25, Johan Tibell wrote:
Can containers use Template Haskell? I think the answer is no, but I thought I'd check anyway.
I don't think so. When bulding GHC, the libraries are compiled by ghc-stage1 but TH isn't available until stage 2. Moreover, template-haskell depends on containers. Roman

containers compiles cleanly with jhc right now, so if you make it depend on
non h2010 features then it would be good to protect them with an #ifdef or
some other mechanism.
John
On Tue, Nov 23, 2010 at 1:25 AM, Johan Tibell
Hi all,
I'm working on a proposal for adding a Data.HashMap and Data.HashSet type to containers. These types could share most of their implementation with Data.IntMap (and Data.IntSet) but can be (somewhat) more efficiently implemented by specializing IntMap from
data IntMap a = Nil | Tip {-# UNPACK #-} !Key a | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a)
to
data FullList k v = FL !k v (List k v)
data List k v = Nil | Cons !k v (List k v)
data IntMap a = Nil | Tip {-# UNPACK #-} !Hash {-# UNPACK #-} !(FullList k v) | Bin {-# UNPACK #-} !Prefix {-# UNPACK #-} !Mask !(IntMap a) !(IntMap a)
Unpacking the FullList into the Tip constructor saves one indirection and two words of overhead per key/value pair.
One way to achieve this specialization would be to duplicate the implementation of Data.IntMap, but it would be nice to avoid the duplication. Template Haskell would allow most of the implementation to be shared without sacrificing performance.
Can containers use Template Haskell? I think the answer is no, but I thought I'd check anyway.
Johan -- Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Johan Tibell schrieb:
Can containers use Template Haskell? I think the answer is no, but I thought I'd check anyway.
I repeat my idea that it would be nice to have a static meta-programming or generic programming feature. That is, you write a piece of meta-code in a pragma and using a tool, the meta-code is expanded to Haskell code. This way you will need the metaprogramming feature only when you change the meta-code. The complete source code however may be even Haskell 98 and can be compiled on a lot of compilers. http://www.haskell.org/pipermail/haskell-cafe/2010-January/072139.html
participants (8)
-
Henning Thielemann
-
Henning Thielemann
-
Johan Tibell
-
John Meacham
-
Mark Lentczner
-
Milan Straka
-
Roman Leshchinskiy
-
wren ng thornton