Implementing a Python-style dictionary in Haskell

Dear all,

I am trying to implement a Python-style dictionary in Haskell.

A Python dictionary is a data structure that maps one key to one value. For instance, the Python dictionary d = {'a':1, 'b':2} maps key 'a' to 1 and key 'b' to 2. A Python dictionary allows updates, e.g. the statement d['a'] = 3 changes the value associated with 'a' from 1 to 3.

Internally, the Python dictionary is implemented using a hash table.

My first attempt was to use Data.HashTable. However, it was immediately abandoned, as I realized the memory usage was unreasonably huge.

== SECOND ATTEMPT ==

My second attempt was to use Data.Map

{-# OPTIONS_GHC -fglasgow-exts #-}
module Main where

import qualified Data.HashTable as HT
import qualified Data.IntMap as IM
import qualified Data.Map as DM
import qualified Data.ByteString.Char8 as S
import Data.Char

-- the Dict type class
class Dict d k v | d -> k v where
  empty  :: d
  insert :: k -> v -> d -> d
  lookup :: k -> d -> Maybe v
  update :: k -> v -> d -> d

-- Let's use String as the key
type Key = String

-- insert key-value pairs into a dictionary
fromList :: Dict d k a => [(k,a)] -> d
fromList l = foldl (\d (key,val) -> insert key val d) empty l

-- an implementation of Dict using Data.Map with ByteString keys
instance Dict (DM.Map S.ByteString a) Key a where
  empty = DM.empty
  insert key val dm =
    let packed_key = S.pack key
    in DM.insert packed_key val dm
  lookup key dm =
    let packed_key = S.pack key
    in DM.lookup packed_key dm
  update key val dm =
    let packed_key = S.pack key
    in DM.update (\x -> Just val) packed_key dm

This kind of works; however, since Map is implemented as a balanced tree, inserting a new key-value pair takes longer and longer as the dictionary grows.

== THIRD ATTEMPT ==

My third attempt was to use Data.IntMap

-- an implementation of Dict using IntMap
instance Dict (IM.IntMap a) Key a where
  empty = IM.empty
  insert key val im =
    let int_key = fromIntegral (HT.hashString key)
    in IM.insert int_key val im
  lookup key im =
    let int_key = fromIntegral (HT.hashString key)
    in IM.lookup int_key im
  update key val im =
    let int_key = fromIntegral (HT.hashString key)
    in IM.update (\x -> Just val) int_key im

This implementation is faster than the Map approach; however, it can't handle collisions: two keys which hash to the same integer will overwrite each other.

== FOURTH ATTEMPT ==

My fourth implementation uses a Trie. The idea is to split a string (a key) into a list of 4-character chunks. Each chunk can be mapped to a 32-bit integer without collision. We then insert the value, keyed by this list of chunks, into the Trie.

-- an implementation of Dict using Trie
instance Dict (Trie a) Key a where
  empty = emptyTrie
  insert key val trie =
    let key_chain = chain key
    in insertTrie key_chain val trie
  lookup key trie =
    let key_chain = chain key
    in lookupTrie key_chain trie
  update key val trie =
    let key_chain = chain key
    in updateTrie key_chain val trie

-- an auxiliary function that "splits" a string into small pieces,
-- 4 characters per piece, 4 chars = 32 bit
chain :: Key -> [Key]
chain k
  | length k > 4 = let (k', ks) = splitAt 4 k in (k' : chain ks)
  | otherwise    = [k]

-- a collision-free hash function which turns four chars into Int32
safehash :: [Char] -> Int
safehash cs
  | length cs > 4 = error "safehash failed."
  | otherwise     = sum [ (ord c)*(256^i) | (c,i) <- zip cs [0..3] ]

-- a trie datatype
data Trie a = Trie [a] (IM.IntMap (Trie a))

-- the empty trie
emptyTrie = Trie [] (IM.empty)

-- insert a value into the trie
insertTrie :: [String] -> a -> Trie a -> Trie a
insertTrie [] i (Trie is maps) = Trie (i:is) maps
insertTrie (word:words) i (Trie is maps) =
  let key = safehash word in
  case IM.lookup key maps of
    Just trie ->
      let trie' = insertTrie words i trie
          maps' = IM.update (\x -> Just trie') key maps
      in Trie is maps'
    Nothing ->
      let trie  = emptyTrie
          trie' = insertTrie words i trie
          maps' = IM.insert key trie' maps
      in Trie is maps'

-- look up a value from the trie
lookupTrie :: [String] -> Trie a -> Maybe a
lookupTrie [] (Trie vs _) =
  case vs of
    []    -> Nothing
    (x:_) -> Just x
lookupTrie (word:words) (Trie is maps) =
  let key = safehash word in
  case IM.lookup key maps of
    Just trie -> lookupTrie words trie
    Nothing   -> Nothing

-- update the trie with the given value.
updateTrie :: [String] -> a -> Trie a -> Trie a
-- we only update the first value and leave the rest unchanged.
updateTrie [] y (Trie (x:xs) maps) = Trie (y:xs) maps
updateTrie (word:words) v (Trie is maps) =
  let key = safehash word in
  case IM.lookup key maps of
    Just trie ->
      let trie' = updateTrie words v trie
          maps' = IM.update (\x -> Just trie') key maps
      in Trie is maps'
    Nothing -> Trie is maps

== BENCHMARK ==

I have a main function which builds a dictionary from a text file. Each line of the file is a key-value pair separated by a space, e.g.

key1 1
key2 2
...

main :: IO ()
main = do { content <- readFile "in.txt"
          ; let -- change the following type annotation
                -- to switch the dictionary type
                -- dict :: DM.Map S.ByteString Int
                -- dict :: IM.IntMap Int
                dict :: Trie Int
                dict = fromList (map parse_a_line (lines content))
          ; case Main.lookup "key256" dict of
              { Just v  -> putStrLn (show v)
              ; Nothing -> putStrLn "Not found"
              }
            -- read a line here so that we can pause the program
            -- and look at the memory usage.
          ; v <- readLn
          ; putStrLn v
          }
  where parse_a_line :: String -> (Key,Int)
        parse_a_line line = case words line of
                              [key,val] -> (key, read val)
                              _         -> error " parse error. "

I tested all three implementations by building a dictionary of size 1000000. The results show that the Map and the Trie approaches handle collisions well, but the IntMap approach does not.

Here is a comparison of memory usage:

Map    : 345 MB
IntMap : 146 MB
Trie   : 282 MB
Python :  94 MB

Here is a comparison of execution time (on an Intel dual core 2.0 GHz):

Map    : 26 sec
IntMap :  9 sec
Trie   : 12 sec
Python : 2.24 sec

The above numbers show that my implementations of a Python-style dictionary are space/time inefficient compared to Python's.

Can someone point out what's wrong with my implementations?

I've attached my code in the tgz file.

Cheers,
Kenny

Hello kenny, Tuesday, November 18, 2008, 1:37:36 PM, you wrote:
The above numbers show that my implementations of a Python-style dictionary are space/time inefficient compared to Python's.
thanks, interesting code

1. why do you think your code should be faster? the Python implementation is probably written in C, since the dictionary is one of its core data structures

2. you can solve the IntMap collision problem by storing a list of values with the same hash in the tree's nodes

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
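For concreteness, a minimal sketch of Bulat's second suggestion (the module and helper names here are illustrative, not from the thread): keep a bucket of (original key, value) pairs under each hash, so colliding keys no longer overwrite each other.

module HashedBuckets where

import qualified Data.IntMap as IM
import qualified Data.HashTable as HT   -- only for HT.hashString

type Key    = String
type Dict v = IM.IntMap [(Key, v)]

hash :: Key -> Int
hash = fromIntegral . HT.hashString

-- insert replaces any existing binding for the key within its bucket
insert :: Key -> v -> Dict v -> Dict v
insert k v = IM.insertWith combine (hash k) [(k, v)]
  where combine new old = new ++ filter ((/= k) . fst) old

-- lookup finds the bucket by hash, then the entry by the original key
lookupD :: Key -> Dict v -> Maybe v
lookupD k d = IM.lookup (hash k) d >>= lookup k

The extra list per node costs some space, which is the trade-off Kenny raises in his reply below.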

Hi Bulat,
1. why do you think your code should be faster? the Python implementation is probably written in C, since the dictionary is one of its core data structures
I am not hoping that my code will be faster, but it should at least not be as slow as it currently is. Basically I am looking for an implementation whose performance is close to the one in Python.
2. you can solve the IntMap collision problem by storing a list of values with the same hash in the tree's nodes
Yeah, that would probably speed up building the dictionary. However, storing a list of values in the tree nodes requires storing their original keys as well, so that the one-key-to-one-value semantics is maintained. This would take up more space compared to the Trie approach. Regards, Kenny

Hello kenny, Tuesday, November 18, 2008, 2:34:25 PM, you wrote:
I am not hoping that my code will be faster, but it should at least not be as slow as it currently is. Basically I am looking for an implementation whose performance is close to the one in Python.
well, if Haskell allowed you to produce code no slower than C, it would be the world's best language :) unfortunately, you have to pay a lot for its elegance -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

main :: IO ()
main = do { content <- readFile "in.txt"
          ; let -- change the following type annotation
                -- to switch the dictionary type
                -- dict :: DM.Map S.ByteString Int
                -- dict :: IM.IntMap Int
                dict :: Trie Int
                dict = fromList (map parse_a_line (lines content))
          ..
  where parse_a_line :: String -> (Key,Int)
        parse_a_line line = case words line of
                              [key,val] -> (key, read val)
                              _         -> error " parse error. "
Maps tend to be strict in their keys, but not in their values. You might be storing a lot of thunks with unparsed Strings instead of plain Int values. Something like this might make a difference wrt memory usage:

  [key,val] -> ((,) key) $! (read val)

Hth, Claus
Here is a comparison of memory usage
Map    : 345 MB
IntMap : 146 MB
Trie   : 282 MB
Python :  94 MB
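A small sketch of the strictness fix Claus describes, assuming the values are plain Ints (the strictFromList helper is illustrative, not an existing library function): force each parsed value before it goes into the map, so the map holds evaluated Ints rather than "read val" thunks that keep the input Strings alive.

module StrictBuild where

import qualified Data.Map as DM
import Data.List (foldl')

-- build the map left to right, forcing each value to WHNF as it is inserted
strictFromList :: Ord k => [(k, Int)] -> DM.Map k Int
strictFromList = foldl' step DM.empty
  where step m (k, v) = v `seq` DM.insert k v m

-- Claus's parse suggestion: $! forces the Int before the pair is built
parse_a_line :: String -> (String, Int)
parse_a_line line = case words line of
  [key, val] -> ((,) key) $! read val
  _          -> error " parse error. "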

kenny lu wrote:
Internally, python dictionary is implemented using hash table.
My first attempt is to use Data.HashTable. However it was immediately abandoned, as I realize the memory usage is unreasonably huge.
Why should a Haskell hash table need more memory than a Python hash table? I've heard that Data.HashTable is bad, so maybe writing a good one could be an option.
Python dictionary allows for update. e.g. the statement d['a'] = 3 changes the value pointed by 'a' from 1 to 3.
I understand "changes" in the sense of an destructive update: The hash table stays the same (in terms of "object identity"), but the content of the memory cell storing the value of d['a'] is changed in place. That means that the old hash table, with d['a'] == 1, doesn't exist anymore.
-- the Dict type class
class Dict d k v | d -> k v where
  empty  :: d
  insert :: k -> v -> d -> d
  lookup :: k -> d -> Maybe v
  update :: k -> v -> d -> d
But here you want to create a new d, e.g. a whole new hash table, which contains mostly the same content, but one memory cell different. The old hash table still exists in memory. That is a totally different operation which is quite likely to need more memory. Tillmann
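A tiny example of the difference Tillmann describes (names are illustrative): the pure "update" produces a new map that shares structure with the old one, and both remain usable, whereas a Python-style destructive update would leave only the new contents.

module PersistenceDemo where

import qualified Data.Map as DM

demo :: (Maybe Int, Maybe Int)
demo =
  let d  = DM.fromList [('a', 1), ('b', 2)]
      d' = DM.insert 'a' 3 d              -- the pure analogue of d['a'] = 3
  in (DM.lookup 'a' d, DM.lookup 'a' d')  -- (Just 1, Just 3): the old map survives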

Hello Tillmann, Tuesday, November 18, 2008, 2:46:47 PM, you wrote:
Why should a Haskell hash table need more memory than a Python hash table? I've heard that Data.HashTable is bad, so maybe writing a good one could be an option.
about Data.HashTable: it uses one huge array to store all the entries. the catch is that the GC needs to scan the entire array on every (major) GC. using an array of hashtables may improve the situation a lot

plus check GC times for every version: +RTS -Soutfile

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
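A rough sketch of the "array of hashtables" idea, written against the old Data.HashTable API used elsewhere in this thread (the subtable count and all helper names are made up for illustration): split the keys across many small tables so that no single huge array has to be rescanned.

module SubTables where

import qualified Data.HashTable as HT
import Data.Array (Array, listArray, (!))

numSub :: Int
numSub = 256

data Dict v = Dict (Array Int (HT.HashTable String v))

newDict :: IO (Dict v)
newDict = do
  subs <- mapM (const (HT.new (==) HT.hashString)) [1 .. numSub]
  return (Dict (listArray (0, numSub - 1) subs))

-- pick a subtable by hash; mod with a positive divisor is never negative
subFor :: Dict v -> String -> HT.HashTable String v
subFor (Dict arr) k = arr ! (fromIntegral (HT.hashString k) `mod` numSub)

insert :: Dict v -> String -> v -> IO ()
insert d k v = HT.insert (subFor d k) k v

lookupD :: Dict v -> String -> IO (Maybe v)
lookupD d k = HT.lookup (subFor d k) k

Whether this actually helps depends on the GC behaviour Jan-Willem discusses below.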

On Nov 18, 2008, at 7:03 AM, Bulat Ziganshin wrote:
Hello Tillmann,
Tuesday, November 18, 2008, 2:46:47 PM, you wrote:
Why should a Haskell hash table need more memory than a Python hash table? I've heard that Data.HashTable is bad, so maybe writing a good one could be an option.
about Data.HashTable: it uses one huge array to store all the entries. the catch is that the GC needs to scan the entire array on every (major) GC.
Actually, the scan on every major (full) GC is unavoidable. What *can* be avoided is a scan on every *minor* GC that occurs after an update. I forget what the exact strategy is here, but I know that one write used to cause the entire array to be re-scanned; what I don't remember is when/if the array transitions back to a state where it isn't being scanned by minor GC anymore.
using array of hashtables may improve situation a lot
Yes, this would be worth trying. Understanding the current GC strategy would make it easier to make the right tradeoffs here; we expect n insertions will touch O(n) subtables, so repeated insertion will make life worse if we're not careful. -Jan-Willem Maessen
plus check GC times for every version: +RTS -Soutfile
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Tillmann Rendel
Why should a Haskell hash table need more memory than a Python hash table? I've heard that Data.HashTable is bad, so maybe writing a good one could be an option.
One problem is that Haskell collections are lazy by default. I'm aware of a few use cases where laziness lets you formulate a very elegant recursive population of a collection, but I think that in general, strictness is what you want, and further, that if you want lazy, you store your data as one-tuples:

  data Lazy a = Lazy a

(If there's a workaround/solution in the other direction, I'd really like to hear it.)

I'm therefore tempted to suggest that collections should be strict by default, and in particular, that there should be strict arrays for arbitrary types, not just the ones that happen to be unboxable. Unboxing should be an optimization for (some) strict arrays.

-k

-- If I haven't seen further, it is by standing in the footprints of giants
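To make the one-tuple idea concrete, a small sketch (insert' is an illustrative value-strict wrapper around Data.Map.insert, not an existing library function): values are forced by default, and wrapping them in Lazy opts back out, because WHNF of the wrapper is just the constructor.

module LazyWrapper where

import qualified Data.Map as DM

data Lazy a = Lazy { force :: a }

-- hypothetical "strict by default" insert: the value is forced to WHNF
-- when the insertion itself is evaluated
insert' :: Ord k => k -> v -> DM.Map k v -> DM.Map k v
insert' k v m = v `seq` DM.insert k v m

-- strict use: 6 * 7 is reduced to 42 as soon as the map is demanded
strictMap :: DM.Map String Int
strictMap = insert' "answer" (6 * 7) DM.empty

-- lazy opt-out: the expensive sum stays a thunk until someone calls force
lazyMap :: DM.Map String (Lazy Int)
lazyMap = insert' "answer" (Lazy (sum [1 .. 1000000])) DM.empty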

One problem is that Haskell collections are lazy by default. I'm aware of a few use cases where laziness lets you formulate a very elegant recursive population of a collection, but I think that in general, strictness is what you want,
While I strongly agree with the gist of your post, I never liked this particular kind of argument - different people have different needs.
and further, that if you want lazy, you store your data as one-tuples: data Lazy a = Lazy a
(If there's a workaround/solution in the other direction, I'd really like to hear it).
That is the crunch point. If we can't choose between strict and non-strict usage, the default becomes the only option, and it simply doesn't fit all use cases. Workarounds are currently somewhat random:

- for indexed collections (which tend to be strict in the keys), one can use strict pairs when building from lists of pairs, but that doesn't work for all operations (at least not with the current APIs)
- some collections, for a few operations, provide strict alternatives, but that doesn't cover much of the API, and isn't even consistent over variations of the same kind of collection
- for some types, strict alternatives are provided as separate packages

That means that not only are the usage patterns different, for the same problem, in different contexts (making switching types harder), but one can only get the solutions for some operations in some types: one is limited to the spartan and incomplete subset of the data type API that supports switching between strict and non-strict (check which operations in Data.Map support both modes, then check what happens when you want to move from Data.Map to Data.IntMap..). Not to mention that the strict alternatives aren't made obvious (you have to know that you should look for '-ed variants of operation names to get them, sometimes; that goes back all the way to the popular foldl with strict accumulator trap) and are documented in names instead of specified in types.

Using a single set of packages and operations, with standardized ways of instantiating a collection/type as either strict or non-strict, would be much better (well, usable, for a start;-), IMHO.

- using a strict default, optionally disabled via non-strict one-tuples, sounds workable (perhaps the one-tuple should be standardized, to give the compiler an indication that this is really an annotation, and to get similar benefits as from newtype deriving).
- using strictness annotations in types seems simpler, but has the drawback that the annotations are really not part of the types; types just happen to be a nice place to put annotations that amount to invariants ('!t' in some context saying: "I always want whnf in this position, not unevaluated thunks") that the compiler can use for modifying strictness analysis and code generation.
- an in-between step might be to parameterize collections by a type constructor, with two pre-defined constructors ("Strict" and "Id") that effectively play the role of annotations while being proper parts of types (type constructors), thereby documenting intent and enabling (pre-)specialization of code without creating programming or efficiency overheads (at least not for the library author - not so sure about library users).

One important point is composition of contexts: if one wants, say, a "Data.Map.Map Key (Data.Map.Map Key type)", it should be possible to specify that one wants both Maps element-strict, and hence the composed structure strict in "type", without obfuscating the code with 'seq's or 'Lazy's all over the place. I can see that working with an annotation + code-generation/specialization approach, but how close would a type-tag approach come to supporting this?
I'm therefore tempted to suggest that collections should be strict by default, and in particular, that there should be strict arrays for arbitrary types, not just the ones that happen to be unboxable. Unboxing should be an optimization for (some) strict arrays.
Another suggestion I find myself agreeing with: when I'm looking for strict variants of standard data types, unboxing is a natural follow-on optimization, but if unboxing isn't possible, that shouldn't keep me from getting the benefits of strictness where I need them. Claus
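A rough sketch of the "parameterize by a type constructor" option Claus mentions above, with made-up names (Box, Strict, Id): the same insert works at either strictness, and the choice is recorded in the element type of the map.

module Boxed where

import qualified Data.Map as DM

class Box f where
  box   :: a -> f a
  unbox :: f a -> a

newtype Id     a = Id a       -- non-strict: stores the thunk unchanged
data    Strict a = Strict !a  -- strict: forcing the box forces the payload

instance Box Id where
  box          = Id
  unbox (Id x) = x

instance Box Strict where
  box              = Strict
  unbox (Strict x) = x

-- forcing the box to WHNF before inserting forces the payload for Strict,
-- but leaves it untouched for Id
insertB :: (Box f, Ord k) => k -> v -> DM.Map k (f v) -> DM.Map k (f v)
insertB k v m = let b = box v in b `seq` DM.insert k b m

lookupB :: (Box f, Ord k) => k -> DM.Map k (f v) -> Maybe v
lookupB k m = fmap unbox (DM.lookup k m)

This does not answer Claus's composition question, though: nesting maps would still mean nesting boxes by hand.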

2008/11/18 kenny lu
Dear all,
I am trying to implement the python-style dictionary in Haskell.
Python dictionary is a data structure that maps one key to one value. For instance, a python dictionary d = {'a':1, 'b':2 } maps key 'a' to 1, 'b' to 2. Python dictionary allows for update. e.g. the statement d['a'] = 3 changes the value pointed by 'a' from 1 to 3.
Internally, python dictionary is implemented using hash table.
My first attempt is to use Data.HashTable. However it was immediately abandoned, as I realize the memory usage is unreasonably huge.
...
I tested all three implementations by building a dictionary of size 1000000. The result shows that the Map and the Trie approaches handle collision well, but the IntMap approach does not.
Here is a comparison of memory usage
Map    : 345 MB
IntMap : 146 MB
Trie   : 282 MB
Python :  94 MB
Here is a comparison of execution time (on an intel dual core 2.0G)
Map    : 26 sec
IntMap :  9 sec
Trie   : 12 sec
Python : 2.24 sec
The above number shows that my implementations of python style dictionary are space/time in-efficient as compared to python.
Can some one point out what's wrong with my implementations?
First of all, you use Strings. That's a very bad thing when you care about memory restrictions. Fire up ghci and type something like this:
let aas = replicate (1024*1024*10) 'a'   -- 22 Mb memory usage
length aas
10485760                                 -- 270 Mb memory usage

A 10 Mb string caused as much as a 250 Mb increase in ghci's memory consumption.
My guess? Use hashtables with ByteStrings. I rewrote part of your code. Results are quite promising.

Haskell: 121 Mb total memory in use

  INIT  time    0.02s  (  0.00s elapsed)
  MUT   time    0.84s  (  1.00s elapsed)
  GC    time    1.97s  (  2.02s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    2.83s  (  3.02s elapsed)
  %GC time      69.6%  (66.8% elapsed)

Python:

$ time python dict.py
256

real    0m2.278s
user    0m2.233s
sys     0m0.078s

memory: 101 Mb (as reported by Windows' Task Manager).

The code:

--- cut ---
import qualified Data.ByteString.Lazy.Char8 as BS
import Data.Int
import Data.Bits
(...)

parse_a_line_BS :: BS.ByteString -> (BS.ByteString,Int)
parse_a_line_BS line = case BS.words line of
  [key,val] -> (key, (read . BS.unpack) val)
  _         -> error " parse error. "

main :: IO ()
main = do
  dict <- HT.new (==) hashByteString
  indata <- (map parse_a_line_BS `fmap` BS.lines `fmap` BS.readFile "in.txt")
  mapM_ (\(k,v) -> HT.insert dict k v) indata
  HT.lookup dict (BS.pack "key256") >>= \v -> case v of
    Just vv -> putStrLn (show vv)
    Nothing -> putStrLn ("Not found")

-- derived from Data.HashTable.hashString
hashByteString :: BS.ByteString -> Int32
hashByteString = BS.foldl' f golden
  where f m c = fromIntegral (ord c) * magic + hashInt32 m
        magic = 0xdeadbeef

hashInt32 :: Int32 -> Int32
hashInt32 x = mulHi x golden + x

mulHi :: Int32 -> Int32 -> Int32
mulHi a b = fromIntegral (r `shiftR` 32)
  where r :: Int64
        r = fromIntegral a * fromIntegral b

golden :: Int32
golden = 1013904242 -- = round ((sqrt 5 - 1) * 2^32) :: Int32
--- cut ---

I had to rewrite hashString to work for ByteStrings - basically it's just using a different foldl'.

All best
Christopher Skrzętnicki

Which version of GHC and which version of the Data.ByteString library? There was an inlining bug related to Data.Map/Data.IntMap performance fixed between the 6.8.x release and the current bytestring release. In testing, Data.Map with strict bytestring keys matched the python (C implemented) dictionary, after I fixed the inlining for word lookups. You'll need to be using bytestring 0.9.1.x though.

-- Don
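For reference, a minimal sketch of the setup Don describes, assuming the same "in.txt" format as Kenny's benchmark (helper names are illustrative): strict ByteString keys in a Data.Map, with values parsed by readInt and forced on insertion so no String or read thunks are retained.

module Main where

import qualified Data.ByteString.Char8 as S
import qualified Data.Map as DM
import Data.List (foldl')

parse_a_line :: S.ByteString -> (S.ByteString, Int)
parse_a_line l = case S.words l of
  [k, v] -> case S.readInt v of
              Just (n, _) -> (k, n)
              Nothing     -> error " parse error. "
  _      -> error " parse error. "

main :: IO ()
main = do
  content <- S.readFile "in.txt"
  let dict = foldl' (\m (k, v) -> v `seq` DM.insert k v m)
                    DM.empty
                    (map parse_a_line (S.lines content))
  print (DM.lookup (S.pack "key256") dict)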

Dear Don,
I am using GHC 6.8.1
Regards,
Kenny
On Tue, Nov 18, 2008 at 11:33 PM, Don Stewart
Which version of GHC and which version of the Data.ByteString library? There was an inlining bug related to Data.Map /Data.IntMap performance fixed between the 6.8.x release and the current bytestring release.
In testing, Data.Map with strict bytestring keys matched the python (C implemented) dictionary, after I fixed the inlining for word lookups.
You'll need to be using bytestring 0.9.1.x though.
-- Don

Great. Assuming you're following the advice to use bytestrings, please install the newest bytestring library version, from here: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bytestring

Data.Map or Data.IntMap with bytestrings should be quite efficient (or use a trie if more precision is needed).

-- Don
Dear Don, I am using GHC 6.8.1
Regards, Kenny
On Tue, Nov 18, 2008 at 11:33 PM, Don Stewart <[1]dons@galois.com> wrote:
Which version of GHC and which version of the Data.ByteString library? There was an inlining bug related to Data.Map /Data.IntMap performance fixed between the 6.8.x release and the current bytestring release.
In testing, Data.Map with strict bytestring keys matched the python (C implemented) dictionary, after I fixed the inlining for word lookups.
You'll need to be using bytestring 0.9.1.x though.
-- Don

2008/11/18 kenny lu
Here is a comparison of memory usage
Map    : 345 MB
IntMap : 146 MB
Trie   : 282 MB
Python :  94 MB
Here is a comparison of execution time (on an intel dual core 2.0G)
Map    : 26 sec
IntMap :  9 sec
Trie   : 12 sec
Python : 2.24 sec
The above number shows that my implementations of python style dictionary are space/time in-efficient as compared to python.
Can some one point out what's wrong with my implementations?
This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)

--
Dave Menendez

On Tue, Nov 18, 2008 at 10:37 AM, David Menendez
This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)
But when these persistent data structures are used in a single-threaded way, why should we not hope for the performance to be comparable? It may not be easy, but just saying "they are persistent" is not really an excuse. Luke

On Tue, Nov 18, 2008 at 3:46 PM, Luke Palmer
On Tue, Nov 18, 2008 at 10:37 AM, David Menendez
wrote: This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)
But when these persistent data structures are used in a single-threaded way, why should we not hope for the performance to be comparable?
It may not be easy, but just saying "they are persistent" is not really an excuse.
I guess that depends on what you mean by "comparable". Chris Okasaki demonstrated that, for some data structures, a persistent implementation could be made with the same asymptotic bounds as an ephemeral implementation, but I would still expect the persistent version to be worse by a constant factor when used ephemerally. Ephemeral data structures are naturally optimized for ephemeral use cases. (I would also expect the reverse to be true.)

--
Dave Menendez

On Tue, Nov 18, 2008 at 12:46 PM, Luke Palmer
But when these persistent data structures are used in a single-threaded way, why should we not hope for the performance to be comparable?
If you can guarantee single-threaded use, then you can just use ST and implement the ephemeral structure, right?
It may not be easy, but just saying "they are persistent" is not really an excuse.
You can generally make a persistent data structure with the same asymptotic bounds as the ephemeral structure, but the constant hidden inside the O() will generally be worse.
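As a concrete version of the ST point, a minimal ephemeral hash table confined to ST (the bucket count, the hash function and all names are illustrative; a real implementation would also resize): inserts destructively update an STArray of association-list buckets.

module STHash where

import Control.Monad.ST (ST, runST)
import Data.Array.ST (STArray, newArray, readArray, writeArray)

type Table s v = STArray s Int [(String, v)]

numBuckets :: Int
numBuckets = 1024

-- a simple string hash, reduced to a bucket index (mod keeps it in range)
bucket :: String -> Int
bucket = (`mod` numBuckets) . foldl (\h c -> 33 * h + fromEnum c) 5381

newTable :: ST s (Table s v)
newTable = newArray (0, numBuckets - 1) []

insert :: Table s v -> String -> v -> ST s ()
insert t k v = do
  let i = bucket k
  xs <- readArray t i
  writeArray t i ((k, v) : filter ((/= k) . fst) xs)   -- in-place overwrite

lookupT :: Table s v -> String -> ST s (Maybe v)
lookupT t k = fmap (lookup k) (readArray t (bucket k))

-- single-threaded use stays hidden behind runST
demo :: Maybe Int
demo = runST (do
  t <- newTable
  mapM_ (\(k, v) -> insert t k v) [("key1", 1), ("key2", 2)]
  lookupT t "key2")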

On Tue, Nov 18, 2008 at 8:51 PM, Ryan Ingram
On Tue, Nov 18, 2008 at 12:46 PM, Luke Palmer
wrote: But when these persistent data structures are used in a single-threaded way, why should we not hope for the performance to be comparable?
If you can guarantee single-threaded use, then you can just use ST and implement the ephemeral structure, right?
But that requires a special reimplementation.
It may not be easy, but just saying "they are persistent" is not really an excuse.
You can generally make a persistent data structure with the same asymptotic bounds as the ephemeral structure, but the constant hidden inside the O() will generally be worse.
I say this as a goal. If we're in a performance competition, we can't say "well, it's okay that Haskell is slower because its data structures can be used persistently". Python's dictionaries can be used persistently too, by making explicit copies. In this use case Python performs better, and we should strive to perform as well as it does. Persistence has no bearing on this, because the persistence is not used. I'm not saying it's always possible to perform just as well. But persistence *by itself* is not a valid argument for poor performance. Luke

Ryan Ingram wrote:
On Tue, Nov 18, 2008 at 12:46 PM, Luke Palmer
wrote: But when these persistent data structures are used in a single-threaded way, why should we not hope for the performance to be comparable?
If you can guarantee single-threaded use, then you can just use ST and implement the ephemeral structure, right?
It may not be easy, but just saying "they are persistent" is not really an excuse.
You can generally make a persistent data structure with the same asymptotic bounds as the ephemeral structure, ...
I would be very careful with the "generally" here. At least, I am not aware that this has been proved to always be possible. Also, in assertions about "the same asymptotic bounds", in this and a previous post in this thread, a distinction is important between worst-case and amortized costs. Just to complete the picture... Ciao, Janis. -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

On Sat, Nov 22, 2008 at 5:33 AM, Janis Voigtlaender
You can generally make a persistent data structure with the same asymptotic bounds as the ephemeral structure, ...
I would be very careful with the "generally" here. At least, I am not aware that this has been proved to always be possible.
Here's an informal proof: You can use an IntMap to emulate pointers, to turn any ephemeral data structure into a persistent one. That adds at most a log-n factor to lookups and updates. For many structures, this is enough to prove asymptotic bounds equivalence. However, the standard 'pointer' model for ephemeral structures makes the assumption that memory size is limited; otherwise you have to add a log(n) factor there anyway, both to hold the large pointer values that get generated and to actually send the bits to the memory bus. Given this assumption you can take log(memory size) as a constant and argue that pointer lookup and update via the map is O(1).
Also, in assertions about "the same asymptotic bounds", in this and a previous post in this thread, a distinction is important between worst-case and amortized costs. Just to complete the picture...
That's true, but I think the more important distinction is the constant attached to the big O; persistent data structures tend to have much worse constant factors and those factors translate to a general 2x-3x slowdown. It's often true that a worse asymptotic cost algorithm is better because the constant factors are much better and the expected N in your program is small enough. -- ryan
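To make the pointer-emulation argument concrete, a toy "heap" built on an IntMap (all names here are illustrative): alloc, deref and write each cost one IntMap operation, which is where the extra log factor in the informal proof comes from.

module FakeHeap where

import qualified Data.IntMap as IM

data Heap a = Heap { nextAddr :: Int, cells :: IM.IntMap a }

emptyHeap :: Heap a
emptyHeap = Heap 0 IM.empty

-- "malloc": hand out the next address and remember the cell's contents
alloc :: a -> Heap a -> (Int, Heap a)
alloc x (Heap n m) = (n, Heap (n + 1) (IM.insert n x m))

-- pointer dereference
deref :: Int -> Heap a -> a
deref p (Heap _ m) = m IM.! p

-- "destructive" write, done persistently
write :: Int -> a -> Heap a -> Heap a
write p x (Heap n m) = Heap n (IM.insert p x m)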

On Sat, Nov 22, 2008 at 1:20 PM, Jason Dusek
Ryan Ingram
wrote: ...persistent data structures tend to have much worse constant factors and those factors translate to a general 2x-3x slowdown.
Can you explain why that is, or provide a citation for it? It's not something I've found easy to Google.
Consider insertion into a simple binary tree (no balancing condition). The persistent algorithm is:

insert :: Key -> Tree -> Tree
insert k Tip = Node k Tip Tip
insert k (Node k' l r)
  | k < k'    = Node k' (insert k l) r
  | otherwise = Node k' l (insert k r)

The ephemeral algorithm is:

insert :: Key -> IORef Tree -> IO ()
insert k p = do
  t <- readIORef p
  case t of
    Tip -> do
      l <- newIORef Tip
      r <- newIORef Tip
      writeIORef p (Node k l r)
    Node k' l r -> insert k $ if k < k' then l else r

The big difference between these two algorithms is the amount of allocation and copying going on. Both dereference basically the same number of pointers. The ephemeral algorithm allocates exactly one new node and modifies exactly one pointer in memory. The persistent algorithm, on the other hand, copies the entire path being traversed down the tree, allocating that many nodes as well. (All of the "Tip" nodes can be shared; it can be treated like "NULL" in C.)

Unfortunately, I don't have any references; the 2-3x is an intuitive number from my past experience. It's worse for algorithms where you need to explicitly simulate pointers with maps because the structure is inherently ephemeral, and better for simple structures like the aforementioned binary tree.

-- ryan

Ryan Ingram wrote:
On Sat, Nov 22, 2008 at 5:33 AM, Janis Voigtlaender
wrote: You can generally make a persistent data structure with the same asymptotic bounds as the ephemeral structure, ...
I would be very careful with the "generally" here. At least, I am not aware that this has been proved to always be possible.
Here's an informal proof:
You can use an intmap to emulate pointers, to turn any ephemeral data structure into a persistent one. That adds at most a log-n factor to lookups and updates. For many structures, this is enough to prove asymptotic bounds equivalence.
However, the standard 'pointer' model for ephemeral structures makes the assumption that memory size is limited; otherwise you have to add a log(n) factor there anyways, both to hold the large pointer values that get generated and to actually send the bits to the memory bus. Given this assumption you can take log(memory size) as a constant and argue that pointer lookup and update via the map is O(1).
Ah, that makes sense. -- Dr. Janis Voigtlaender http://wwwtcs.inf.tu-dresden.de/~voigt/ mailto:voigt@tcs.inf.tu-dresden.de

dave:
2008/11/18 kenny lu
: Here is a comparison of memory usage
Map    : 345 MB
IntMap : 146 MB
Trie   : 282 MB
Python :  94 MB
Here is a comparison of execution time (on an intel dual core 2.0G)
Map    : 26 sec
IntMap :  9 sec
Trie   : 12 sec
Python : 2.24 sec
The above number shows that my implementations of python style dictionary are space/time in-efficient as compared to python.
Can some one point out what's wrong with my implementations?
This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)
Strings, not ByteStrings. That's the difference. -- Don

On Tue, Nov 18, 2008 at 3:52 PM, Don Stewart
dave:
2008/11/18 kenny lu
: The above number shows that my implementations of python style dictionary are space/time in-efficient as compared to python.
Can some one point out what's wrong with my implementations?
This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)
Strings, not ByteStrings. that's the difference.
Is that in response to what I wrote, or to the original question?
Speaking of ByteStrings and tries, has anyone implemented a Patricia Trie for ByteStrings? I started putting one together a while back, but I got distracted and never finished it.

--
Dave Menendez

dave:
On Tue, Nov 18, 2008 at 3:52 PM, Don Stewart
wrote: dave:
2008/11/18 kenny lu
: The above number shows that my implementations of python style dictionary are space/time in-efficient as compared to python.
Can some one point out what's wrong with my implementations?
This isn't really a fair comparison. Map, IntMap, and Trie are persistent data structures, and Python dictionaries are ephemeral. (That is, when you "add" a key to a Map, you actually create a new one that shares structure with the old one, and both can be used in subsequent code. In Python, you would have to copy the dictionary.)
Strings, not ByteStrings. that's the difference.
Is that in response to what I wrote, or to the original question?
Speaking of ByteStrings and tries, has anyone implemented a Patricia Trie for ByteStrings? I started putting one together a while back, but I got distracted and never finished it.
I started putting one together a while back but I got distracted and never finished it. I think it's a couple of days of polishing.

-- Don
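Not the Patricia trie being discussed, but for anyone curious, a minimal byte-at-a-time trie over strict ByteStrings (names are illustrative) showing the general shape such a structure takes; a Patricia version would additionally collapse runs of single-child nodes.

module ByteTrie where

import qualified Data.ByteString as B
import qualified Data.IntMap as IM

data Trie a = Trie (Maybe a) (IM.IntMap (Trie a))

empty :: Trie a
empty = Trie Nothing IM.empty

insert :: B.ByteString -> a -> Trie a -> Trie a
insert k v (Trie mv kids) = case B.uncons k of
  Nothing        -> Trie (Just v) kids
  Just (w, rest) ->
    let child  = IM.findWithDefault empty (fromIntegral w) kids
        child' = insert rest v child
    in  Trie mv (IM.insert (fromIntegral w) child' kids)

lookupT :: B.ByteString -> Trie a -> Maybe a
lookupT k (Trie mv kids) = case B.uncons k of
  Nothing        -> mv
  Just (w, rest) -> IM.lookup (fromIntegral w) kids >>= lookupT rest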
participants (13)
- Bulat Ziganshin
- Claus Reinke
- David Menendez
- Don Stewart
- Jan-Willem Maessen
- Janis Voigtlaender
- Jason Dusek
- kenny lu
- Ketil Malde
- Krzysztof Skrzętnicki
- Luke Palmer
- Ryan Ingram
- Tillmann Rendel