[Newbie] What to improve in my code

13 Jul 2010

First of all: I'm not sure if this question is allowed here. If not, I
apologize.

I'm trying to solve the following problem: for each unique word in a text,
find the number of occurrences of that word.

I've come up with the following steps to solve this:
 * remove all characters except letters and whitespace, and make the text lowercase
 * find all unique words in the text
 * for each unique word, count the number of occurrences.

This has resulted in the following code:
import Data.Char (toLower)

removePunctuation :: [Char] -> [Char]
removePunctuation str = filter (\c -> elem c (['a'..'z'] ++ ['A'..'Z'] ++
    ['\t', ' ', '\n'])) str

process :: [Char] -> [String]
process str = words (map toLower (removePunctuation str))

unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs) = [x] ++ unique (filter (\s -> x /= s) xs)

occurenceCount :: (Eq a) => a -> [a] -> Int
occurenceCount _ [] = 0
occurenceCount x (y:ys)
	| x == y = 1 + occurenceCount x ys
	| otherwise = occurenceCount x ys

occurenceCount' :: [String] -> [String] -> [(String, Int)]
occurenceCount' [] _ = [("", 0)]
occurenceCount' (u:us) xs = [(u, occurenceCount u xs)] ++
    occurenceCount' us xs

Please remember I've only been playing with Haskell for three afternoons now,
and I'm happy that the above code is working correctly.

However, I've got three questions:
1) occurenceCount' [] _ = [("", 0)] is plain ugly and also adds a useless
tuple to the end of the result. Is there a better way to solve this?
2) I'm forcing elements into a singleton list on two occasions, both in my
unique function and in my occurenceCount' function. Once again this seems
ugly, and I'm wondering if there is a better solution.
3) The whole process as I'm doing it now feels pretty imperative (I've been
working for years as a Java / PHP programmer). I've got a feeling that
the occurenceCount' function could be implemented using a mapping function.
What ways are there to make this more "functional"?
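
One possible direction for the three questions above, as a minimal sketch
rather than a definitive answer. It keeps the function names from the code
above; wordCounts and wordCountsMap are hypothetical helpers that are not
part of the original code, and the Data.Map variant assumes the containers
package that ships with GHC is available.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map

-- Same cleaning step as in the post: keep ASCII letters and whitespace,
-- lowercase everything, then split on whitespace.
process :: String -> [String]
process = words . map toLower . filter (`elem` allowed)
  where allowed = ['a'..'z'] ++ ['A'..'Z'] ++ "\t \n"

-- Question 2: (:) prepends a single element, so no singleton list is needed.
unique :: Eq a => [a] -> [a]
unique []     = []
unique (x:xs) = x : unique (filter (/= x) xs)

-- Counting is "keep the matches, then take the length".
occurenceCount :: Eq a => a -> [a] -> Int
occurenceCount x = length . filter (== x)

-- Questions 1 and 3: mapping over the unique words removes the explicit
-- recursion, and an empty list of words simply yields [] (no dummy tuple).
occurenceCount' :: [String] -> [String] -> [(String, Int)]
occurenceCount' us xs = map (\u -> (u, occurenceCount u xs)) us

-- Hypothetical helper (not in the post) tying the steps together.
wordCounts :: String -> [(String, Int)]
wordCounts text = occurenceCount' (unique ws) ws
  where ws = process text

-- Alternative sketch: build the counts in one pass with a Map.
-- Results come back keyed and ordered by word, not by first appearance.
wordCountsMap :: String -> Map.Map String Int
wordCountsMap = Map.fromListWith (+) . map (\w -> (w, 1)) . process
```

Loaded in GHCi, wordCounts "The cat, the hat." should give
[("the",2),("cat",1),("hat",1)]. The unique function behaves like nub from
Data.List, which could be used instead. The list-based version rescans the
whole text once per unique word, while the Map version avoids that at the
cost of losing the order in which words first appear.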

-- 
Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

Participants (5): Frank1981, Daniel Fischer, Ketil Malde, David Virebayre, Dougal Stanton