
I'm trying to create a hash table. Yeah, I know, don't use hash tables, but I need to create something I'm familiar with, not something I've never worked with before. What's wrong with this code? Michael ==================== import Prelude hiding (lookup) import Data.HashTable data MyHashTable = HashTable String Int dummy:: String -> Int dummy s = 7 ht = MyHashTable.new (==) dummy ==================== [michael@localhost ~]$ ghci hash1 GHCi, version 6.10.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. [1 of 1] Compiling Main ( hash1.hs, interpreted ) hash1.hs:9:5: Not in scope: `MyHashTable.new' Failed, modules loaded: none. Prelude>

Hello michael, Tuesday, November 17, 2009, 10:16:50 PM, you wrote:
I'm trying to create a hash table. Yeah, I know, don't use hash tables, but I need to create something I'm familiar with, not something I've never worked with before. What's wrong with this code?
ht = MyHashTable.new (==) dummy
MyHashTable should be a name of module. seems that you try to use your OOP instinct here :) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Am Dienstag 17 November 2009 20:16:50 schrieb michael rice:
I'm trying to create a hash table. Yeah, I know, don't use hash tables, but I need to create something I'm familiar with, not something I've never worked with before. What's wrong with this code?
Michael
====================
import Prelude hiding (lookup) import Data.HashTable
data MyHashTable = HashTable String Int
dummy:: String -> Int dummy s = 7
ht = MyHashTable.new (==) dummy
====================
MyHashTable.new is parsed as a qualified function, 'new' from the module MyHashTable. But there's no module MyHashTable imported, hence there's no function 'new' from that module in scope.
[michael@localhost ~]$ ghci hash1 GHCi, version 6.10.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. [1 of 1] Compiling Main ( hash1.hs, interpreted )
hash1.hs:9:5: Not in scope: `MyHashTable.new' Failed, modules loaded: none. Prelude>
If we look at the type of Data.HashTable.new: new :: (key -> key -> Bool) -> (key -> GHC.Int.Int32) -> IO (HashTable key val) we see that new (==) dummy (or Data.HashTable.new (==) dummy, but we don't need to qualify new) has type IO (HashTable String val), so is an IO-action returning a hashtable. What you probably wanted was type MyHashTable = HashTable String Int -- not data MyHashTable ht <- new (==) dummy :: IO MyHashTable then ht is a hashtable of type MyHashTable.

Am Dienstag 17 November 2009 20:36:46 schrieb Daniel Fischer:
What you probably wanted was
type MyHashTable = HashTable String Int -- not data MyHashTable
Just in case it's not clear:
ht <- new (==) dummy :: IO MyHashTable
only works at the prompt or in an IO do-block, not at the top level of the module.
then ht is a hashtable of type MyHashTable.

Hi Daniel,
Thanks for the IO monad reminder.
What is GHC.Int.Int32? Can't find it on Hoogle.
Michael
==================
*Main> ht <- new (==) dummy :: IO MyHashTable
<interactive>:1:15:
Couldn't match expected type `GHC.Int.Int32'
against inferred type `Int'
In the second argument of `new', namely `dummy'
In a stmt of a 'do' expression:
ht <- new (==) dummy :: IO MyHashTable
*Main> :t dummy
dummy :: String -> Int
*Main>
--- On Tue, 11/17/09, Daniel Fischer
What you probably wanted was
type MyHashTable = HashTable String Int -- not data MyHashTable
Just in case it's not clear:
ht <- new (==) dummy :: IO MyHashTable
only works at the prompt or in an IO do-block, not at the top level of the module.
then ht is a hashtable of type MyHashTable.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Am Dienstag 17 November 2009 21:09:26 schrieb michael rice:
What is GHC.Int.Int32? Can't find it on Hoogle.
Int32 is the guaranteed-to-be-32-bits integer type, it's defined in the module GHC.Int, typically it's imported via Data.Int (because, naturally, using GHC's own modules isn't portable). Due to the way ghci's :type command works, such types are printed qualified. The hash function must have a return type of fixed and specified width, or porting the app between 32-bit and 64-bit platforms could have enormous performance impact, so it can't be plain Int.

Hello Daniel, Tuesday, November 17, 2009, 11:21:18 PM, you wrote:
The hash function must have a return type of fixed and specified width, or porting the app between 32-bit and 64-bit platforms could have enormous performance impact, so it can't be plain Int.
i think that the problem with use of Int instead of Int32 here would be different results with incorrect comparison functions, nothing more -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Look in Data.Int for a list of the Int types. Basically you generally only use Int unless you are working at a lower- level which has an explicit requirement for the number of bits. In this case, HashTable has such a requirement, so you have two choices: you can either change the type annotation on dummy: import Data.Int dummy:: String -> Int32 dummy s = 7 or you can just leave it off entirely, and GHC will automatically infer the correct type (without you needing to import Data.Int): dummy s = 7 On Nov 17, 2009, at 12:09 PM, michael rice wrote:
Hi Daniel,
Thanks for the IO monad reminder.
What is GHC.Int.Int32? Can't find it on Hoogle.
Michael
==================
*Main> ht <- new (==) dummy :: IO MyHashTable
<interactive>:1:15: Couldn't match expected type `GHC.Int.Int32' against inferred type `Int' In the second argument of `new', namely `dummy' In a stmt of a 'do' expression: ht <- new (==) dummy :: IO MyHashTable *Main> :t dummy dummy :: String -> Int *Main>
--- On Tue, 11/17/09, Daniel Fischer
wrote: From: Daniel Fischer
Subject: Re: [Haskell-cafe] Simple hash table creation To: haskell-cafe@haskell.org Date: Tuesday, November 17, 2009, 2:45 PM Am Dienstag 17 November 2009 20:36:46 schrieb Daniel Fischer:
What you probably wanted was
type MyHashTable = HashTable String Int -- not data MyHashTable
Just in case it's not clear:
ht <- new (==) dummy :: IO MyHashTable
only works at the prompt or in an IO do-block, not at the top level of the module.
then ht is a hashtable of type MyHashTable.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi Gregory,
I was wondering about that, because of the following:
[1 of 1] Compiling Main ( hash1.hs, interpreted )
Ok, modules loaded: Main.
*Main> ht <- new (==) dummy :: IO MyHashTable
*Main> dummy "mike"
7
*Main> dummy "michael"
7
*Main> insert ht "mike" 1
*Main> toList ht
[("mike",1)]
*Main> insert ht "michael" 2
*Main> toList ht
[("michael",2),("mike",1)]
*Main> insert ht "miguel" 3
*Main> toList ht
[("miguel",3),("michael",2),("mike",1)]
*Main> :t dummy "miguel"
dummy "miguel" :: Int32
*Main>
It seems my dummy function is being ignored. I figured I would only be able to store a single value with a hash function that always returns 7. Why ask for a hash function and not use it?
Also, it's said that it is good programming practice to include type
information in function definitions, so I always try to do that, even
though it usually leads to my code being rejected. I need to step back
and use the :t and module functions defs to figure out what types are
returned and required as arguments, instead of trying to puzzle it out
myself.
Like juggling, there's a lot of balls in the air w/Haskell, lots of things to remember, but it's the most intriguing computer language I've looked at in a long time.
Thanks for your input.
Michael
--- On Tue, 11/17/09, Gregory Crosswhite
What you probably wanted was
type MyHashTable = HashTable String Int -- not data MyHashTable
Just in case it's not clear:
ht <- new (==) dummy :: IO MyHashTable
only works at the prompt or in an IO do-block, not at the top level of the module.
then ht is a hashtable of type MyHashTable.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Tue, Nov 17, 2009 at 4:00 PM, michael rice
Hi Gregory,
I was wondering about that, because of the following:
[1 of 1] Compiling Main ( hash1.hs, interpreted ) Ok, modules loaded: Main. *Main> ht <- new (==) dummy :: IO MyHashTable *Main> dummy "mike" 7 *Main> dummy "michael" 7 *Main> insert ht "mike" 1 *Main> toList ht [("mike",1)] *Main> insert ht "michael" 2 *Main> toList ht [("michael",2),("mike",1)] *Main> insert ht "miguel" 3 *Main> toList ht [("miguel",3),("michael",2),("mike",1)] *Main> :t dummy "miguel" dummy "miguel" :: Int32 *Main>
It seems my dummy function is being ignored. I figured I would only be able to store a single value with a hash function that always returns 7. Why ask for a hash function and not use it?
[...] Most hash tables deal with collisions, i.e. the case when two keys stored in the table hash to the same value. In the case of your 'dummy' hash function, which always returns 7, every key hashes to the same value, hence collisions galore. One way to deal with collisions is to hash a key to a bucket (i.e. list) of items, then walk down the list looking for the given key. In such an implementation (and I believe for hash tables in general), the quality of the hash function greatly affects the performance of the hash table operations. I am not sure what implementation Data.HashTable uses. However, I believe Data.HashTable is not all that good: for multi-million element tables from Int to Int, Data.IntMap runs many times faster than Data.HashTable. I have no wish to start a flame war here (this topic has in the past), but the state of affairs regarding hash tables in Haskell is not good. Sincerely, Brad

So, what you're telling me is, my dummy hash function IS being used, and because of collisions the other values are placed in different locations?
Michael
--- On Tue, 11/17/09, Brad Larsen
Hi Gregory,
I was wondering about that, because of the following:
[1 of 1] Compiling Main ( hash1.hs, interpreted ) Ok, modules loaded: Main. *Main> ht <- new (==) dummy :: IO MyHashTable *Main> dummy "mike" 7 *Main> dummy "michael" 7 *Main> insert ht "mike" 1 *Main> toList ht [("mike",1)] *Main> insert ht "michael" 2 *Main> toList ht [("michael",2),("mike",1)] *Main> insert ht "miguel" 3 *Main> toList ht [("miguel",3),("michael",2),("mike",1)] *Main> :t dummy "miguel" dummy "miguel" :: Int32 *Main>
It seems my dummy function is being ignored. I figured I would only be able to store a single value with a hash function that always returns 7. Why ask for a hash function and not use it?
[...] Most hash tables deal with collisions, i.e. the case when two keys stored in the table hash to the same value. In the case of your 'dummy' hash function, which always returns 7, every key hashes to the same value, hence collisions galore. One way to deal with collisions is to hash a key to a bucket (i.e. list) of items, then walk down the list looking for the given key. In such an implementation (and I believe for hash tables in general), the quality of the hash function greatly affects the performance of the hash table operations. I am not sure what implementation Data.HashTable uses. However, I believe Data.HashTable is not all that good: for multi-million element tables from Int to Int, Data.IntMap runs many times faster than Data.HashTable. I have no wish to start a flame war here (this topic has in the past), but the state of affairs regarding hash tables in Haskell is not good. Sincerely, Brad

On Tue, Nov 17, 2009 at 4:22 PM, michael rice
So, what you're telling me is, my dummy hash function IS being used, and because of collisions the other values are placed in different locations?
Michael
[...] If Data.HashTable is implemented using separate chaining, then all the key/value pairs would be hashed to the same bucket (hash value 7). If a different scheme is used, then hash collisions would be resolved in a different way, e.g., through linear probing. Regardless of the collision resolution scheme used, excessive hash collisions are bad, and are what can cause the worst-case time complexity of hash table operations (in most implementations) to be O(n). The Wikipedia page on hash tables isn't bad: http://en.wikipedia.org/wiki/Hash_table. Sincerely, Brad

Hello michael, Wednesday, November 18, 2009, 12:00:58 AM, you wrote: *Main>> toList ht
[("miguel",3),("michael",2),("mike",1)]
It seems my dummy function is being ignored.
i wonder why you think so? your ht has all 3 pairs you ever inserted inside, they all are inside the same bucket since hash function returns the same value for them ... hmm, i guess that you expect something like low-level hashtable from other languages - i.e. it should have just one slot for all values hashed to 7 and save here last value it works other way - it has just one slot for all your values but it stores here LIST of your pairs. so nothing is lost. if you want to save only the last value, replace (==) to (\a b -> dummy a==dummy b) the whole story is that hash function used to select slot, and then key part of pair is compared using (==) with keys of all values in the slot. it will replace existing pair if key1==key2, otherwise added to list so, hashing allows to quickly find pair by its key, but it still full map as far as you pass a usual (==) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Hi Bulat,
I was just looking for a simple hashing function. I was going to use the length function but a constant int is even simpler. I'm supposing that had I used the length function "mike" and "fred" would end up in the same bucket.
This is the first time I've tried to do anything in Haskell with data structures other than lists, so I needed to see how things work.
Thanks, everyone, for the help.
Michael
--- On Tue, 11/17/09, Bulat Ziganshin
[("miguel",3),("michael",2),("mike",1)]
It seems my dummy function is being ignored.
i wonder why you think so? your ht has all 3 pairs you ever inserted inside, they all are inside the same bucket since hash function returns the same value for them ... hmm, i guess that you expect something like low-level hashtable from other languages - i.e. it should have just one slot for all values hashed to 7 and save here last value it works other way - it has just one slot for all your values but it stores here LIST of your pairs. so nothing is lost. if you want to save only the last value, replace (==) to (\a b -> dummy a==dummy b) the whole story is that hash function used to select slot, and then key part of pair is compared using (==) with keys of all values in the slot. it will replace existing pair if key1==key2, otherwise added to list so, hashing allows to quickly find pair by its key, but it still full map as far as you pass a usual (==) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Thanks all,
Got it! type rather than data and <- rather than = (should have remembered this from monad stuff). Also, don't need the qualification.
Onward and upward.
Thanks,
Michael
--- On Tue, 11/17/09, Daniel Fischer
I'm trying to create a hash table. Yeah, I know, don't use hash tables, but I need to create something I'm familiar with, not something I've never worked with before. What's wrong with this code?
Michael
====================
import Prelude hiding (lookup) import Data.HashTable
data MyHashTable = HashTable String Int
dummy:: String -> Int dummy s = 7
ht = MyHashTable.new (==) dummy
====================
MyHashTable.new is parsed as a qualified function, 'new' from the module MyHashTable. But there's no module MyHashTable imported, hence there's no function 'new' from that module in scope.
[michael@localhost ~]$ ghci hash1 GHCi, version 6.10.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. [1 of 1] Compiling Main ( hash1.hs, interpreted )
hash1.hs:9:5: Not in scope: `MyHashTable.new' Failed, modules loaded: none. Prelude>
If we look at the type of Data.HashTable.new: new :: (key -> key -> Bool) -> (key -> GHC.Int.Int32) -> IO (HashTable key val) we see that new (==) dummy (or Data.HashTable.new (==) dummy, but we don't need to qualify new) has type IO (HashTable String val), so is an IO-action returning a hashtable. What you probably wanted was type MyHashTable = HashTable String Int -- not data MyHashTable ht <- new (==) dummy :: IO MyHashTable then ht is a hashtable of type MyHashTable. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

You don't need to create a new type for a String -> Int hashtable, you already get it for free because HashTable is a parameterized type. Also, although you are apparently trying to make life simpler for yourself using a HashTable, you are actually making like more complicated because the Data.HashTable implementation can only be worked with inside the IO monad. So what you'd need is something like: ==================== import Prelude hiding (lookup) import Data.HashTable dummy s = 7 main = do ht <- new (==) dummy insert ht "Foo" 12 value <- lookup ht "Foo" putStrLn . show $ value ==================== Note that I didn't have to label any of the types; Haskell is smart enough to mostly infer all of them automatically. The main reason to include types is if there is some ambiguity. So for example, if you actually wanted to map Strings to Floats, then you would need to explicitly tell it somewhere that the values it is storing are floats. You could do this by either pinning down explicitly the HashTable type: ==================== import Prelude hiding (lookup) import Data.HashTable dummy s = 7 main = do ht <- new (==) dummy :: IO (HashTable String Float) insert ht "Foo" 12 value <- lookup ht "Foo" putStrLn . show $ value ==================== Or just by pinning down the type of a value that you insert into it: ==================== import Prelude hiding (lookup) import Data.HashTable dummy s = 7 main = do ht <- new (==) dummy insert ht "Foo" (12 :: Float) value <- lookup ht "Foo" putStrLn . show $ value ==================== Again, the downside though is that you can't work with HashTable outside of the IO monad. If what you want is just a map from strings to values, then you probably are better off using Data.Map: ==================== import Prelude hiding (lookup) import Data.Map my_map = empty :: Map String Int my_map_after_adding_key = insert "Foo" 12 my_map value_associated_with_Foo = lookup "Foo" my_map_after_adding_key main = putStrLn . show $ value_associated_with_Foo ==================== Cheers, Greg On Nov 17, 2009, at 11:16 AM, michael rice wrote:
I'm trying to create a hash table. Yeah, I know, don't use hash tables, but I need to create something I'm familiar with, not something I've never worked with before. What's wrong with this code?
Michael
====================
import Prelude hiding (lookup) import Data.HashTable
data MyHashTable = HashTable String Int
dummy:: String -> Int dummy s = 7
ht = MyHashTable.new (==) dummy
====================
[michael@localhost ~]$ ghci hash1 GHCi, version 6.10.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. [1 of 1] Compiling Main ( hash1.hs, interpreted )
hash1.hs:9:5: Not in scope: `MyHashTable.new' Failed, modules loaded: none. Prelude>
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (5)
-
Brad Larsen
-
Bulat Ziganshin
-
Daniel Fischer
-
Gregory Crosswhite
-
michael rice