really difficult for a beginner like me...

Hi everyone,

This is my first post here. What drove me here is the quest to know more, and I anticipate a lot of help and direction in Haskell, which is a quite new and different environment for me. I have a paper that I'm working on, and I need to solve the following scenarios/cases based on some sample code.

Scenarios/cases:

1) Allow words to be hyphenated and treat the hyphenated word as a single word (including the hyphen).
2) As for no. 1, but if the hyphen is the last character on a line, treat the hyphenated word as a single word without the hyphen.
3) Treat a capitalised word (one or more capital letters) the same as lower case, i.e. only the lower case word appears in the index.
4) Treat a word ending in an ‘s’ as a plural and thus the same as the singular, i.e. only the singular appears in the index.
5) As for no. 4, but (a) treat the suffix ‘ss’ as not a plural; and (b) treat the plural suffixes ‘sses’, ‘zzes’, ‘oes’, ‘xes’, ‘shes’, ‘ches’ the same as the singular, i.e. without the ‘es’, e.g. “branches” (except for 4- and 5-letter plurals with the suffixes ‘oes’ and ‘ches’, e.g. “floes”, and 4-letter plurals with the suffix ‘xes’); and (c) treat the plural suffix ‘ies’ (except for 4-letter plurals, e.g. “pies”) as the singular suffix ‘y’.
6) Make the output more readable in the form of an index table.
7) Use an output file.
8) Include a user-friendly menu by which the user can choose input and output file names.

This is the code I'm supposed to modify, and in some cases I need to create new functions to support it. I also need some explanation of the various approaches to solving them.

The function makeIndex, given a document, produces a list of entries.
Each entry is a word and a list of line numbers (for words > 4 letters).

Type definitions:

    import Prelude -- hiding (Word)   -- predefined Word hidden, so we can define ours
    -- type String = [Char]           defined in Prelude
    type Doc  = String
    type Line = String
    type Word = String                -- our version

    makeIndex :: Doc -> [ ([Int], Word) ]

A data-directed design considers a sequence of functions (i.e. using the composition operator '.') to transform the document of type Doc into an index of type [ ([Int], Word) ]:

    splitUp      the document, doc, into a list of lines, [Line].
    numLines     pairs each line with a line number, [(Int, Line)].
    allNumWords  splits lines into words and line no., [(Int, Word)].
    sortLs       sorts words into alphabetical order, [(Int, Word)].
    makeLists    makes a list for each line number, [([Int], Word)].
    amalgamate   nos. into a list of nos. for each word, [([Int], Word)].
    shorten      into a list for words > 4 letters, [([Int], Word)].

    makeIndex
      = shorten .        -- [([Int], Word)] -> [([Int], Word)]
        amalgamate .     -- [([Int], Word)] -> [([Int], Word)]
        makeLists .      -- [(Int, Word)]   -> [([Int], Word)]
        sortLs .         -- [(Int, Word)]   -> [(Int, Word)]
        allNumWords .    -- [(Int, Line)]   -> [(Int, Word)]
        numLines .       -- [Line]          -> [(Int, Line)]
        splitUp          -- Doc             -> [Line]
        Last             -- [a]             -> [a ]

splitUp function:

    splitUp :: Doc -> [Line]
    splitUp [] = []
    splitUp text
      = takeWhile (/='\n') text :     -- first line
        (splitUp .                    -- splitup other lines
         dropWhile (=='\n') .         -- delete 1st newline(s)
         dropWhile (/='\n')) text     -- other lines

Example:

    splitUp "hello world\n\nnext world" => ["hello world", "next world"]

numLines function:

    numLines :: [Line] -> [(Int, Line)]
    numLines lines                        -- list of pairs of
      = zip [1 .. length lines] lines     -- line no. & line

Example:

    numLines ["hello world", "next world"] => [(1, "hello world"), (2, "next world")]

splitWords function:

    -- for each line
    -- a) split into words
    -- b) attach line no. to each word

    splitWords :: Line -> [Word]                      -- a)
    splitWords [ ] = [ ]
    splitWords line
      = takeWhile isLetter line :                     -- first word in line
        (splitWords .                                 -- split other words
         dropWhile (not.isLetter && Last =='-') .     -- delete separators
         dropWhile isLetter) line                     -- other words
      where
        isLetter ch
          = ('a'<=ch) && (ch<='z') || ('A'<=ch) && (ch<='Z')

Example:

    splitWords "hello world" => ["hello", "world"]

allNumWords function:

    numWords :: (Int, Line) -> [(Int, Word)]     -- b)
    numWords (number, line)
      = map addLineNum (splitWords line)         -- all line pairs
      where
        addLineNum word = (number, word)         -- a pair

    allNumWords :: [(Int, Line)] -> [(Int, Word)]
    allNumWords = concat . map numWords          -- doc pairs

Examples:

    addLineNum "hello" => (1, "hello")
    numWords (1, "hello world") => [(1, "hello"), (1, "world")]
    allNumWords [(1, "hello world"), (2, "next world")]
      => [(1, "hello"), (1, "world"), (2, "next"), (2, "world")]

sortLs function:

    sortLs :: [(Int, Word)] -> [(Int, Word)]
    sortLs [ ] = [ ]
    sortLs (a:x)
      = sortLs [b | b <- x, compare b a]     -- sort 1st half
        ++ [a] ++                            -- 1st in middle
        sortLs [b | b <- x, compare a b]     -- sort 2nd half
      where
        compare (n1, w1) (n2, w2)
          = (w1 < w2)                        -- 1st word less
            || (w1 == w2 && n1 < n2)         -- check no.

Example:

    sortLs [(1, "hello"), (1, "world"), (2, "next"), (2, "world")]
      => [(1, "hello"), (2, "next"), (1, "world"), (2, "world")]

makeLists function:

    makeLists :: [(Int, Word)] -> [([Int], Word)]
    makeLists = map mk                         -- all pairs
      where
        mk (num, word) = ([num], word)         -- list of single no.
Examples:

    mk (1, "hello") => ([1], "hello")
    makeLists [(1, "hello"), (2, "next"), (1, "world"), (2, "world")]
      => [([1], "hello"), ([2], "next"), ([1], "world"), ([2], "world")]

amalgamate function:

    amalgamate :: [([Int], Word)] -> [([Int], Word)]
    amalgamate [ ] = [ ]
    amalgamate [a] = [a]
    amalgamate ((n1, w1) : (n2, w2) : rest)               -- pairs of pairs
      | w1 /= w2  = (n1, w1) : amalgamate ((n2, w2) : rest)
      | otherwise = amalgamate ((n1 ++ n2, w1) : rest)
        -- if words are same grow list of numbers

Example:

    amalgamate [([1], "hello"), ([2], "next"), ([1], "world"), ([2], "world")]
      => [([1], "hello"), ([2], "next"), ([1, 2], "world")]

shorten function:

    shorten :: [([Int], Word)] -> [([Int], Word)]
    shorten = filter long                        -- keep pairs >4
      where
        long (num, word) = length word > 4       -- check word >4

Example:

    shorten [([1], "hello"), ([2], "next"), ([1, 2], "world")]
      => [([1], "hello"), ([1, 2], "world")]

On 2008/5/4, Ivan Amarquaye wrote:
> Hi everyone,
> this is my first post here. What drove me here is the quest to know more, and I anticipate a lot of help and direction in Haskell, which is a quite new and different environment for me. I have a paper that I'm working on, and I need to solve the following scenarios/cases based on some sample code:

You might receive better help if you asked smaller, more specific questions. This looks like homework, and even if it's not, we are a homework-friendly crowd, meaning: nobody is going to write code for you. We will answer questions in words, point you to useful library functions, give you feedback on approaches you outline to us, etc.

So break up the problem. Try no. 1 by yourself, and if you can't do it, then describe what you tried and how it didn't work. Giving an outline of how you think you should approach the problem from a purely functional perspective will help, so we can help you modify and correct that idea.

The more thought you put in by yourself before asking us, the more you will get out of our responses. But be much more specific. I cannot answer a question this large.

Luke

Thanks for the tip there. It's been four gruelling days, and I just can't seem to understand how to implement some of the changes or create new functions, because I'm so new to Haskell and functional programming.

For the very first case, allowing hyphenated words to be treated as single words, I managed to do that by extending the definition of the splitWords function to also accept the character "-", and it worked perfectly when I ran it.

The next case has been a headache for me; I have been on it for 3 days now. From my understanding, it covers the situation where you are writing a sentence, reach the end of the line in the middle of a word, put a hyphen there, and continue the word on the next line. So the case demands that a word that ends with a hyphen at the end of a line and continues on the next line loses the hyphen and becomes a single word on the first line. This is how I foresee the input in Hugs:

Input: makeIndex "these are the very same stuff they tell each-\nother"

Output should be: [[1]these],[[1]eachother]

The 1 indicates they are on the same line; the other words are left out because the index only takes words longer than 4 characters. I have been struggling with this ever since. I tried several times to change the splitWords function to dropWhile a "-" is found in the words, but that produced an error. I also tried creating a new function to do it, which didn't succeed either. Can anybody help me out in this regard?

On 2008/5/4, Ivan Amarquaye wrote:
> Thanks for the tip there. It's been four gruelling days, and I just can't seem to understand how to implement some of the changes or create new functions, because I'm so new to Haskell and functional programming.
>
> For the very first case, allowing hyphenated words to be treated as single words, I managed to do that by extending the definition of the splitWords function to also accept the character "-", and it worked perfectly when I ran it.
>
> The next case has been a headache for me; I have been on it for 3 days now.
>
> From my understanding, it covers the situation where you are writing a sentence, reach the end of the line in the middle of a word, put a hyphen there, and continue the word on the next line. So the case demands that a word that ends with a hyphen at the end of a line and continues on the next line loses the hyphen and becomes a single word on the first line. This is how I foresee the input in Hugs:
>
> Input: makeIndex "these are the very same stuff they tell each-\nother"
>
> Output should be: [[1]these],[[1]eachother]
>
> The 1 indicates they are on the same line; the other words are left out because the index only takes words longer than 4 characters. I have been struggling with this ever since. I tried several times to change the splitWords function to dropWhile a "-" is found in the words, but that produced an error. I also tried creating a new function to do it, which didn't succeed either. Can anybody help me out in this regard?
There are many ways of doing this, of course. Perhaps you need to write a function like so:

    -- fixes up hyphenated words
    fixupHyphens :: [ (Int, Word) ] -> [ (Int, Word) ]
    fixupHyphens ( (line1, word1) : (line2, word2) : xs )
      | ... check if word1 ends with hyphen and line2 /= line1 ...
          = ( line1, ... something ... ) : fixupHyphens xs
      | otherwise
          = (line1, word1) : fixupHyphens ( (line2, word2) : xs )
    fixupHyphens xs = xs

Then you can insert this function in the appropriate place in the makeIndex function (probably before sorting, as you depend on the words showing up in order).

--
Sebastian Sylvan
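One way the outline above could be filled in is sketched below. It is only a sketch under stated assumptions: it assumes splitWords already keeps '-' as part of a word (so a split word arrives as "each-"), that "something" means the first half without its hyphen joined to the second half, and that the helper name endsWithHyphen is made up for illustration.

```haskell
import Prelude hiding (Word)

type Word = String

-- Hypothetical helper: does the word end in a hyphen?
endsWithHyphen :: Word -> Bool
endsWithHyphen w = not (null w) && last w == '-'

-- Join a word that ends in '-' with the first word of a later line,
-- keeping the line number of the first half.
fixupHyphens :: [(Int, Word)] -> [(Int, Word)]
fixupHyphens ((line1, word1) : (line2, word2) : xs)
  | endsWithHyphen word1 && line2 /= line1
      = (line1, init word1 ++ word2) : fixupHyphens xs
  | otherwise
      -- re-examine word2, since it may itself end a line with a hyphen
      = (line1, word1) : fixupHyphens ((line2, word2) : xs)
fixupHyphens xs = xs
```

With these assumptions, fixupHyphens [(1, "tell"), (1, "each-"), (2, "other")] gives [(1, "tell"), (1, "eachother")], while a hyphenated word in the middle of a line such as (2, "well-known") is left alone.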

Ivan Amarquaye wrote:
> Thanks for the tip there. It's been four gruelling days, and I just can't seem to understand how to implement some of the changes or create new functions, because I'm so new to Haskell and functional programming.
>
> For the very first case, allowing hyphenated words to be treated as single words, I managed to do that by extending the definition of the splitWords function to also accept the character "-", and it worked perfectly when I ran it.
>
> The next case has been a headache for me; I have been on it for 3 days now. From my understanding, it covers the situation where you are writing a sentence, reach the end of the line in the middle of a word, put a hyphen there, and continue the word on the next line. So the case demands that a word that ends with a hyphen at the end of a line and continues on the next line loses the hyphen and becomes a single word on the first line. This is how I foresee the input in Hugs:
>
> Input: makeIndex "these are the very same stuff they tell each-\nother"
>
> Output should be: [[1]these],[[1]eachother]
>
> The 1 indicates they are on the same line; the other words are left out because the index only takes words longer than 4 characters. I have been struggling with this ever since. I tried several times to change the splitWords function to dropWhile a "-" is found in the words, but that produced an error. I also tried creating a new function to do it, which didn't succeed either. Can anybody help me out in this regard?
Did you happen to find a solution to your problem? I have been trying for more than a week now, but I can't do it.

saaJamal

-----Original Message-----
From: saaj
Sent: 24.03.2010 13:13:29
To: haskell-cafe@haskell.org
Subject: [Haskell-cafe] Re: really difficult for a beginner like me...

> saaJamal writes:
> > Did you happen to find a solution to your problem? I have been trying for more than a week now, but I can't do it.
>
> I tried many tutorials, but they weren't of any help.
>
> As per the above case study, what I need to do is:
>
> 1) Allow words to be hyphenated and treat a hyphenated word as a single word. However, for those words which are split over two lines, treat a split word as a single word without the hyphen.
>
> For this I tried:
>
>     fixupHyphens :: [ (Int, Word) ] -> [ (Int, Word ) ]
>     fixupHyphens ( (line1, word1):(line2:word2):xs )
>       | if (word1, line2) /= line1 = ( line1,word2 ) : fixupHyphens xs
>       | otherwise = (line1, word1):(line2:word2): fixupHyphens xs
>     fixupHyphens xs = xs
It's probably easier to treat the hyphens before pairing the words with the line-numbers. If you needn't care about words containing a hyphen (like line-numbers above), a simple preprocessor

    fixHyphens :: String -> String
    fixHyphens ('-':'\n':more) = (move rest of the word before the newline and remove hyphen)
    fixHyphens (c:cs) = c : fixHyphens cs
    fixHyphens "" = ""

should do the trick.
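A minimal sketch of how that first clause might be completed follows. It assumes "the rest of the word" is the run of alphabetic characters right after the newline (taken with isAlpha from Data.Char), and that the newline should stay in the text so later line numbers are unchanged; both are guesses at the intent, not something stated in the thread.

```haskell
import Data.Char (isAlpha)

-- Preprocess the whole document before it is split into lines:
-- a '-' directly followed by a newline is dropped, and the start of
-- the next line is pulled up in front of the newline.
fixHyphens :: String -> String
fixHyphens ('-':'\n':more) = rest ++ "\n" ++ fixHyphens after
  where
    -- rest: letters continuing the split word; after: remainder of the text
    (rest, after) = span isAlpha more
fixHyphens (c:cs) = c : fixHyphens cs
fixHyphens ""     = ""
```

Under those assumptions, fixHyphens "tell each-\nother" evaluates to "tell eachother\n", and a hyphen that is not at the end of a line is passed through untouched. The function would be composed in front of splitUp in makeIndex.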
> And for including hyphens I added this to the code:
>
>     splitWords :: Line -> [Word]            -- a)
>     splitWords [] = []
>     splitWords line
>       = takeWhile isLetter line :           -- first word in line
>         (splitWords .                       -- split other words
>          dropWhile (not.isLetter) .         -- delete separators
>          dropWhile isLetter) line           -- other words
>       where
>         isLetter ch
>           = (('a'<=ch) && (ch<='z')) || (('A'<=ch) && (ch<='Z')) || ('-' == ch)
>
> 2) Treat a capitalised word (one or more capitals) as being different from the word in all lower case (but they should still be sorted alphabetically), unless it is at the start of a sentence with only the initial letter capitalised.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Thank you. I will try it.

What about the second part, the capitalisation thing? Can you help me with that as well?

"Treat a capitalised word (one or more capitals) as being different from the word in all lower case unless it is at the start of a sentence with only the initial letter capitalised."

-----Original Message-----
From: saaj
Sent: 24.03.2010 15:17:19
To: haskell-cafe@haskell.org
Subject: [Haskell-cafe] Re: really difficult for a beginner like me...

> Thank you. I will try it.
>
> What about the second part, the capitalisation thing? Can you help me with that as well?
>
> "Treat a capitalised word (one or more capitals) as being different from the word in all lower case unless it is at the start of a sentence with only the initial letter capitalised."
Well, the obvious idea is to do that by lower-casing the words which fit these criteria before further processing. Identifying the start of a sentence is difficult if you have to take all possibilities for dots, question marks and exclamation marks into account, but probably treating just the semi-obvious cases is enough (check whether you have e.g. an abbreviation instead of an end-of-sentence dot), that's not too hard.
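As a rough illustration of that idea, the sketch below lower-cases a token only when it starts a sentence and has just its initial letter capitalised. It assumes the text has been split with words, so punctuation is still attached to the tokens, and it treats any token ending in '.', '?' or '!' as the end of a sentence; abbreviations such as "e.g." are not handled. The names initialCapOnly, endsSentence and lowerSentenceStarts are invented for this example.

```haskell
import Data.Char (isUpper, toLower)

-- Only the first character is a capital, e.g. "Hello" but not "NASA".
initialCapOnly :: String -> Bool
initialCapOnly (c:cs) = isUpper c && not (any isUpper cs)
initialCapOnly []     = False

-- Crude end-of-sentence test on a token with punctuation still attached.
endsSentence :: String -> Bool
endsSentence w = not (null w) && last w `elem` ".?!"

-- Walk the tokens, remembering whether the previous one ended a sentence.
lowerSentenceStarts :: [String] -> [String]
lowerSentenceStarts = go True
  where
    go _ [] = []
    go atStart (w:ws)
      | atStart && initialCapOnly w = map toLower w : go (endsSentence w) ws
      | otherwise                   = w             : go (endsSentence w) ws
```

For example, lowerSentenceStarts (words "Hello world. The END is near. NASA rocks") yields ["hello","world.","the","END","is","near.","NASA","rocks"]: the sentence-initial "Hello" and "The" are lower-cased, while "END" and "NASA" stay distinct from their lower-case forms.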
participants (6)
- Daniel Fischer
- Ivan Amarquaye
- Luke Palmer
- saaj
- saaJamal
- Sebastian Sylvan