
On Thu, Aug 12, 2010 at 11:28 AM, Simon Marlow
On 11/08/2010 17:03, Johan Tibell wrote:
Inspired by the generic maps example at
http://www.haskell.org/haskellwiki/GHC/Indexed_types
I tried to use associated data types to create a generic finite map that unpacks both the key and value into the leaf data constructor.
What you're trying to do is have the compiler generate a whole module for you, including a datatype specialised to certain type paramters, and operations over that type. Just defining a few of the operations isn't enough: they need to be inlined everywhere, essentially you need to recompile Data.Map for each instance.
There needs to be some amount of code generation, but much of the implementation can still be shared. I previously tried to defined the type class as {-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-} module Ex2 where import Prelude hiding (lookup) data MapView k v = TipView | BinView {-# UNPACK #-} !Size !k !v !(Map k v) !(Map k v) class Unbox k v where data Map k v :: * tip :: Map k v bin :: Size -> k -> v -> Map k v -> Map k v -> Map k v view :: Map k v -> MapView k v type Size = Int lookup :: (Ord k, Unbox k v) => k -> Map k v -> Maybe v lookup k m = case view m of TipView -> Nothing BinView _ kx x l r -> case compare k kx of LT -> lookup k l GT -> lookup k r EQ -> Just x {-# INLINE lookup #-} Calling lookup from a different module at a know type gives exactly the Core you'd like to see (unpacked types, no MapView constructors). The compiler would still have to generate the associated data type instance and the method implementations. So I agree it would be nice if this happened automatically, behind the
scenes, by virtue of just mentioning "Map Int Double" (though it would still have to be a typeclass of course, so that you can write polymorphic functions over Maps). Automatic specialisation of this kind can be done by JIT runtimes (e.g. the .NET CLR), because there the code generation and caching of instances can be put under control of the runtime. Here we would have to do it in the compiler, and the difficulty is that the compiler needs to support separate compilation.
C++ supports automatic instantiation and separate compilation. We'd have to included the needed information in the .hi files so we can generate the right instances at the usage site. Perhaps there are other problems (that are somehow solved by the C++ compiler) that I'm not considering.
Rather than try to solve this problem in one go, I would go for a low-tech approach for now: write a TH library to generate the code, and ask the user to declare the versions they need. To make a particular version, the user would say something like
module MapIntDouble (module MapIntDouble) where import TibbeMagicMapGenerator make_me_a_map ...
there's no type class of course, so you can't write functions that work over all specialised Maps. But this at least lets you generate optimised maps for only a little boilerplate, and get the performance boost you were after.
This doesn't quite work though as two MapIntDouble defined in two different libraries are incompatible. This is essentially the same problem as with instance collisions. Cheers, Johan