hashmap withdrawal and poor haskell style

Hello everyone, I just wrote my first haskell program. I started with a simple python program and tried to see if I could port it to haskell. The program reads text from stdin and prints out a histogram of all the letters: """ alphabet = 'abcdefghjiklmnopqrstuvwxyz' def letter_count(lines): res = {} for line in lines: for char in line.lower(): if char not in alphabet: continue if res.has_key(char): res[char] += 1 else: res[char] = 1 return res if __name__=="__main__": import sys hist = letter_count(sys.stdin) for letter in alphabet: print letter, "*" * hist.get(letter, 0) """ Kid stuff, but it took me about 10 hours to port it... :) Here's what I came up with: """ module Main where alphabet = "abcdefghijklmnopqrstuvwxyz" count ch str = length [c | c <- str , c == ch] hist str = [count letter str | letter <- alphabet] oneline ch str = [ch] ++ " " ++ stars (count ch str) stars x = if x == 0 then "" else "*" ++ stars ( x - 1 ) report str ch = do putStrLn ( oneline ch str ) loop f (h:t) = if t == [] then f h else do f h loop f t main = do content <- getContents let rpt letter = report content letter loop rpt alphabet """ Other than ignoring upper case letters, and being really really slow, it seems to work fine in hugs.... One thing I really missed was a hash / dictionary. I tried for about an hour to use Assoc following the examples from PLEAC: http://pleac.sourceforge.net/pleac_haskell/hashes.html ... But I never got it working: :> module Main where :> import Assoc (empty) :> main :: IO() :> main = do line <- getContents :> let w = length line :> count:: AssocDefault String Int :> count = w.foldl (\a s -> a.update s (+1)) empty :> print x -> ERROR "alphahist.hs":6 - Undefined type constructor "AssocDefault" (I couldn't get Assoc to work either, even with "Import Assoc (Assoc)" though I looked in the Assoc.hs file and could see it in there.. This was the most frustrating part of the experiment) Can anyone point me in the right direction here? Also, I'd really like to here anyone's thoughts on the code I have above, especially concercing what I could have done better. :) Thanks! Cheers, - Michal http://www.sabren.net/ sabren@manifestation.com ------------------------------------------------------------ Give your ideas the perfect home: http://www.cornerhost.com/ cvs - weblogs - php - linux shell - perl/python/cgi - java ------------------------------------------------------------

On Wed, 3 Apr 2002, Michal Wallace wrote:
module Main where alphabet = "abcdefghijklmnopqrstuvwxyz" count ch str = length [c | c <- str , c == ch] hist str = [count letter str | letter <- alphabet] oneline ch str = [ch] ++ " " ++ stars (count ch str) stars x = if x == 0 then "" else "*" ++ stars ( x - 1 ) report str ch = do putStrLn ( oneline ch str ) loop f (h:t) = if t == [] then f h else do f h loop f t main = do content <- getContents let rpt letter = report content letter loop rpt alphabet """
Other than ignoring upper case letters, and being really really slow, it seems to work fine in hugs....
I'm a bit confused how this can have worked... in Haskell `let' is used in the context of a `let ..<1>.. in ..<2>..' where the code ellided in <1> binds some names to values which are then used in the expression <2> (as in `let x=sqrt 2 in exp x') and so the contents of main isn't (unless I'm missing something) syntactically Haskell. Overall the code looks like a reasonable transliteration into Haskell; as you get more competent with Haskell you'll find various higher level functions that replace some of your stuff, e.g., stars can be written as stars x = take x (repeat '*'). Regarding Assoc, try just `import Assoc' without trying to cut down on which entities are imported; if it works then you've missed some needed elements from your import specification (the bit in ()'s after the module name); if it doesn't then it probably can't find Assoc at all. HTH ___cheers,_dave_________________________________________________________ www.cs.bris.ac.uk/~tweed/|`...heat generated by its microprocessors will email:tweed@cs.bris.ac.uk|slope upward exponentially, reaching the power work tel:(0117) 954-5250 |density of a nuclear reactor before 2010'-Intel

At 13:27 03-04-02 +0100, D. Tweed wrote:
On Wed, 3 Apr 2002, Michal Wallace wrote:
module Main where alphabet = "abcdefghijklmnopqrstuvwxyz" count ch str = length [c | c <- str , c == ch] hist str = [count letter str | letter <- alphabet] oneline ch str = [ch] ++ " " ++ stars (count ch str) stars x = if x == 0 then "" else "*" ++ stars ( x - 1 ) report str ch = do putStrLn ( oneline ch str ) loop f (h:t) = if t == [] then f h else do f h loop f t main = do content <- getContents let rpt letter = report content letter loop rpt alphabet """
Other than ignoring upper case letters, and being really really slow, it seems to work fine in hugs....
I'm a bit confused how this can have worked... in Haskell `let' is used in the context of a `let ..<1>.. in ..<2>..' where the code ellided in <1> binds some names to values which are then used in the expression <2> (as in `let x=sqrt 2 in exp x') and so the contents of main isn't (unless I'm missing something) syntactically Haskell.
It's correct Haskell. Have a look at http://www.haskell.org/onlinereport/exps.html#sect3.14 Rijk-Jan

On Wed, 3 Apr 2002, D. Tweed wrote:
main = do content <- getContents let rpt letter = report content letter loop rpt alphabet
I'm a bit confused how this can have worked... in Haskell `let' is used in the context of a `let ..<1>.. in ..<2>..' where the code ellided in <1> binds some names to values which are then used in the expression <2> (as in `let x=sqrt 2 in exp x') and so the contents of main isn't (unless I'm missing something) syntactically Haskell.
Beats me, but it was the only (or at least the first) way I could find to create a function inside a do block...
Overall the code looks like a reasonable transliteration into Haskell; as you get more competent with Haskell you'll find various higher level functions that replace some of your stuff, e.g., stars can be written as stars x = take x (repeat '*').
Neat! :)
Regarding Assoc, try just `import Assoc' without trying to cut down on which entities are imported; if it works then you've missed some needed elements from your import specification (the bit in ()'s after the module name); if it doesn't then it probably can't find Assoc at all.
In the latest version of hugs, if you try "import Assoc", you get: runhugs: Error occurred Reading file "assoc.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "/usr/share/hugs/lib/exts/EdisonPrelude.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/Monad.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/exts/ListSeq.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Parsing ERROR "/usr/share/hugs/lib/exts/Assoc.hs":51 - Haskell 98 does not support multiple parameter classes At least, *I* get that. :) And the -98 flag is just as bad: [~/work/haskell]: runhugs -98 assoc.hs runhugs: Error occurred Reading file "assoc.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "/usr/share/hugs/lib/exts/EdisonPrelude.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/Monad.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/exts/ListSeq.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "assoc.hs": ERROR "assoc.hs" - Entity "foldl1" imported from module "Assoc" already defined in module "Prelude" Which is why I restricted what was being imported. :) Anyway, thanks for the response! :) Cheers, - Michal http://www.sabren.net/ sabren@manifestation.com ------------------------------------------------------------ Give your ideas the perfect home: http://www.cornerhost.com/ cvs - weblogs - php - linux shell - perl/python/cgi - java ------------------------------------------------------------

It's correct Haskell. Have a look at http://www.haskell.org/onlinereport/exps.html#sect3.14
Thanks; serves me right for being lazy and not having actually read a version of the report since the various elements of monad syntax were introduced... ___cheers,_dave_________________________________________________________ www.cs.bris.ac.uk/~tweed/|`...heat generated by its microprocessors will email:tweed@cs.bris.ac.uk|slope upward exponentially, reaching the power work tel:(0117) 954-5250 |density of a nuclear reactor before 2010'-Intel

On Wed, 3 Apr 2002, D. Tweed wrote:
main = do content <- getContents let rpt letter = report content letter loop rpt alphabet
I'm a bit confused how this can have worked... in Haskell `let' is used in the context of a `let ..<1>.. in ..<2>..' where the code ellided in <1> binds some names to values which are then used in the expression <2> (as in `let x=sqrt 2 in exp x') and so the contents of main isn't (unless I'm missing something) syntactically Haskell.
Beats me, but it was the only (or at least the first) way I could find to create a function inside a do block...
You don't need to create a function, you could use partial parametrization instead: | main = | do content <- getContents | loop (report content) alphabet [snip] HTH, Jan de Wit

Hi Michal, Maybe you should use "accumArray" to get your histogram.
accumArray :: Ix a => (b -> c -> b) -> b -> (a,a) -> [(a,c)] -> Array a b
This functions takes a function to combine elements with the same index "a", an initial element "b", and the range of indices "(a,a)". We can now write a histogram function as follows:
import Array
type Histogram = Array Char Int
histogram :: String -> Histogram histogram input = accumArray (+) 0 (minBound,maxBound) [(c,1) | c <- input]
(The "minBound" and "maxBound" functions are overloaded from the "Bounded" class and give the minimal and maximal character). The histogram can be formatted as (leaving out zero entries):
format :: Histogram -> String format histogram = unlines [formatLine char count | (char,count) <- assocs histogram, count > 0]
formatLine :: Char -> Int -> String formatLine char count = [char] ++ " " ++ replicate count '*'
And tested:
test :: FilePath -> IO () test fname = do{ input <- readFile fname ; putStr (format (histogram input)) }
The "accumArray" can supposedly be implemented efficiently.
You can read more about that if you want at:
http://www.research.avayalabs.com/user/wadler/papers/array/array.ps
All the best,
Daan.
----- Original Message -----
From: "Michal Wallace"
On Wed, 3 Apr 2002, D. Tweed wrote:
main = do content <- getContents let rpt letter = report content letter loop rpt alphabet
I'm a bit confused how this can have worked... in Haskell `let' is used in the context of a `let ..<1>.. in ..<2>..' where the code ellided in <1> binds some names to values which are then used in the expression <2> (as in `let x=sqrt 2 in exp x') and so the contents of main isn't (unless I'm missing something) syntactically Haskell.
Beats me, but it was the only (or at least the first) way I could find to create a function inside a do block...
Overall the code looks like a reasonable transliteration into Haskell; as you get more competent with Haskell you'll find various higher level functions that replace some of your stuff, e.g., stars can be written as stars x = take x (repeat '*').
Neat! :)
Regarding Assoc, try just `import Assoc' without trying to cut down on which entities are imported; if it works then you've missed some needed elements from your import specification (the bit in ()'s after the module name); if it doesn't then it probably can't find Assoc at all.
In the latest version of hugs, if you try "import Assoc", you get:
runhugs: Error occurred Reading file "assoc.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "/usr/share/hugs/lib/exts/EdisonPrelude.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/Monad.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/exts/ListSeq.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Parsing ERROR "/usr/share/hugs/lib/exts/Assoc.hs":51 - Haskell 98 does not support multiple parameter classes
At least, *I* get that. :) And the -98 flag is just as bad:
[~/work/haskell]: runhugs -98 assoc.hs runhugs: Error occurred Reading file "assoc.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "/usr/share/hugs/lib/exts/EdisonPrelude.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/Monad.hs": Reading file "/usr/share/hugs/lib/exts/Sequence.hs": Reading file "/usr/share/hugs/lib/exts/ListSeq.hs": Reading file "/usr/share/hugs/lib/exts/Assoc.hs": Reading file "assoc.hs": ERROR "assoc.hs" - Entity "foldl1" imported from module "Assoc" already defined in module "Prelude"
Which is why I restricted what was being imported. :)
Anyway, thanks for the response! :)
Cheers,
- Michal http://www.sabren.net/ sabren@manifestation.com ------------------------------------------------------------ Give your ideas the perfect home: http://www.cornerhost.com/ cvs - weblogs - php - linux shell - perl/python/cgi - java ------------------------------------------------------------
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Michal, As Daan Leijen has suggested, `accumArray` is probably the best solution to your simple histogramming problem. However, you might also be interested to learn about "finite maps", which are often the appropriate functional analogue to "hash maps". Finite maps are typically implemented with balanced trees and exhibit O(log n) time cost for insertion and lookup. (The extra cost over imperative languages' O(1) cost is the price paid for persistence.) The following version of your program runs on Hugs 98 and GHC. (For GHC you must specify "-package data" to access the `FiniteMap` module.)
import Char (isAlpha, toLower) import FiniteMap
main :: IO () main = interact (report . histogram)
type Histo = FiniteMap Char Int
histogram :: String -> Histo histogram = foldl tally emptyFM
tally :: Histo -> Char -> Histo tally histo ch = if isAlpha ch then addToFM_C (+) histo (toLower ch) 1 else histo
report :: Histo -> String report histo = unlines [ line ch | ch <- ['a'..'z'] ] where line ch = ch : " " ++ replicate (count ch) '*' count ch = lookupWithDefaultFM histo 0 ch
Dean Herington

ons 2002-04-03 klockan 15.51 skrev Daan Leijen:
import Array
type Histogram = Array Char Int
histogram :: String -> Histogram histogram input = accumArray (+) 0 (minBound,maxBound) [(c,1) | c <- input]
(The "minBound" and "maxBound" functions are overloaded from the "Bounded" class and give the minimal and maximal character).
Take care here though, since Char might be unicode, you may be allocating a BIG array. ghci: Ix.rangeSize (minBound :: Char, maxBound) 1114112 hugs: Ix.rangeSize (minBound :: Char, maxBound) 256 (1114112 is 17*2^16) Regards, Martin -- Martin Norbäck d95mback@dtek.chalmers.se Kapplandsgatan 40 +46 (0)708 26 33 60 S-414 78 GÖTEBORG http://www.dtek.chalmers.se/~d95mback/ SWEDEN OpenPGP ID: 3FA8580B

On Wed, Apr 03, 2002 at 07:13:03AM -0500, Michal Wallace wrote:
Hello everyone,
I just wrote my first haskell program. I started with a simple python program and tried to see if I could port it to haskell. The program reads text from stdin and prints out a histogram of all the letters:
Well, this is not my first Haskell-program, but I'm definitely still a Haskell-newbie :-D I had saved a few hours of work some time ago so I only had to adapt my countWords.hs to countLetters.hs to do The Right Thing(tm) It took me about five minutes (this time...) ;-p I hope the way I used FiniteMap (at least GHC has it...) isn't considered cheating :-) -------------------- module Main where import Char import FiniteMap printCount :: (Char, Integer) -> IO () printCount (letter, count) = putStrLn $ letter : replicate count '*' countLetters :: String -> [(Char, Integer)] countLetters letters = fmToList $ addListToFM_C (+) emptyFM $ zip letters (repeat 1) main = getContents >>= mapM_ printCount . countLetters . filter isAlpha -------------------- Happy Hacking Remi -- See the light and feel my warm desire, Run through your veins like the evening sun It will live but no eyes will see it, I'll bless your name before I die. Key fingerprint = CC90 A1BA CF6D 891C 5B88 C543 6C5F C469 8F20 70F4

Take care here though, since Char might be unicode, you may be allocating a BIG array.
Whooa, that would be bad. In the unicode case, a FiniteMap would do great
since it is extended on the fly with new entries.
-- Daan.
----- Original Message -----
From: "Martin Norbäck"
import Array
type Histogram = Array Char Int
histogram :: String -> Histogram histogram input = accumArray (+) 0 (minBound,maxBound) [(c,1) | c <- input]
(The "minBound" and "maxBound" functions are overloaded from the "Bounded" class and give the minimal and maximal character).
Take care here though, since Char might be unicode, you may be allocating a BIG array. ghci: Ix.rangeSize (minBound :: Char, maxBound) 1114112 hugs: Ix.rangeSize (minBound :: Char, maxBound) 256 (1114112 is 17*2^16) Regards, Martin -- Martin Norbäck d95mback@dtek.chalmers.se Kapplandsgatan 40 +46 (0)708 26 33 60 S-414 78 GÖTEBORG http://www.dtek.chalmers.se/~d95mback/ SWEDEN OpenPGP ID: 3FA8580B
participants (8)
-
D. Tweed
-
Daan Leijen
-
Dean Herington
-
Jan de Wit
-
Martin Norbäck
-
Michal Wallace
-
Remi Turk
-
Rijk J. C. van Haaften