1) Allow words to be hyphenated and treat the hyphenated word as a single word (including the hyphen).
2) As for no. 2 but if the hyphen is the last character on a line treat the hyphenated word as a single word without the hyphen.
3) Treat a capitalised word (one or more capital letters) the same as lower case, i.e. only the lower case word appears in the index.
4) Treat a word ending in an ‘s’ as a plural and thus the same as the singular, i.e. only the singular appears in the index.
5) As for no. 5 but (a) treat suffix ‘ss’ as not a plural; and (b) treat the plural suffixes ‘sses’, ‘zzes’, ‘oes’, ‘xes’, ‘shes’, ‘ches’ the same as the singular, i.e. without the ‘es’, e.g. “branches” (except for 4- and 5-letter plurals with suffices ‘oes’ and ‘ches’, e.g. “floes”, and 4-letter plural suffix ‘xes’); and (c) treat the plural suffix ‘ies’ (except for 4-letter plurals, e.g. “pies”) as the singular suffix ‘y’.
8) Include a user-friendly menu by which the user can choose input and output file names.
This is the code i'm supposed tomodify and in some cases create new functions to support.I also need some explanation as the various approaches in solving them..............................................................................................
The function makeIndex given a document produces a list of entries.
Each entry is a word and a list of line numbers (for words > 4 letters)
Type definitions:
import Prelude -- hiding (Word)
-- predefined Word hidden, so we can define ours
-- type String = [Char] defined in Prelude
type Doc = String
type Line = String
type Word = String -- our version
makeIndex :: Doc -> [ ([Int], Word) ]
A data-directed design considers a sequence of functions (i.e using composition operator ‘.’) to transform the document of type, Doc, into an index of type, [ ([Int], Word) ].
splitUp the document, doc, into a list of lines, [Line].
numLines pairs each line with a line number, [(Int, Line)].
allNumWords splits lines into words and line no., [(Int, Word)].
sortLs sorts words into alphabetical order, [(Int, Word)].
makeLists makes a list for each line number, [([Int], Word)].
amalgamate nos. into a list of nos. for each word, [([Int], Word)].
shorten into a list for words > 4 letters, [([Int], Word)].
makeIndex
= shorten . -- [([Int], Word)] -> [([Int], Word)]
amalgamate . -- [([Int], Word)] -> [([Int], Word)]
makeLists . -- [(Int, Word)] -> [([Int], Word)]
sortLs . -- [(Int, Word)] -> [(Int, Word)]
allNumWords . -- [(Int, Line)] -> [(Int, Word)]
numLines . -- [Line] -> [(Int, Line)]
splitUp -- Doc -> [Line]
Last -- [a] -> [a ]
splitUp function
splitUp :: Doc -> [Line]
splitUp [] = []
splitUp text
= takeWhile (/='\n') text : -- first line
(splitUp . -- splitup other lines
dropWhile (==’\n’) . -- delete 1st newline(s)
dropWhile (/='\n')) text -- other lines
Example:
splitUp “hello world\n\nnext world”
=> [“hello world”, “next world”]
numLines function:
numLines :: [Line] -> [(Int, Line)]
numLines lines -- list of pairs of
= zip [1 .. length lines] lines -- line no. & line
Example:
numLines [“hello world”, “next world”]
=> [(1, “hello world”), (2, “next world”)]
splitWords function:
-- for each line
-- a) split into words
-- b) attach line no. to each word
splitWords :: Line -> [Word] -- a)
splitWords [ ] = [ ]
splitWords line
= takeWhile isLetter line : -- first word in line
(splitWords . -- split other words
dropWhile (not.isLetter && Last ==’-’) . -- delete separators
dropWhile isLetter) line -- other words
where
isLetter ch
= (‘a’<=ch) && (ch<=’z’)
|| (‘A’<=ch) && (ch<=’Z’)
Example:
splitWords “hello world” => [“hello”, “world”]
allNumWords function:
numWords :: (Int, Line) -> [(Int, Word)] -- b)
numWords (number, line)
= map addLineNum ( splitWords line) -- all line pairs
where
addLineNum word = (number, word) -- a pair
allNumWords :: [(Int, Line)] -> [(Int, Word)]
allNumWords = concat . map numWords -- doc pairs
Examples:
addLineNum “hello” => (1, “hello”)
numWords (1, “hello world”) => [(1, “hello”), (1, “world”)]
allNumWords [(1, “hello world”), (2, “next world”)]
=> [(1, “hello”), (1, “world”), (2, “next”), (2, “world”)]
SortLs function:
sortLs :: [(Int, Word)] -> [(Int, Word)]
sortLs [ ] = [ ]
sortLs (a:x)
= sortLs [b | b <- x, compare b a] -- sort 1st half
++ [a] ++ -- 1st in middle
sortLs [b | b <- x, compare a b] -- sort 2nd half
where
compare (n1, w1) (n2, w2)
= (w1 < w2) -- 1st word less
|| (w1 == w2 && n1 < n2) -- check no.
Example:
sortLs [(1, “hello”), (1, “world”), (2, “next”), (2, “world”)]
=> [(1, “hello”), (2, “next”), (1, “world”), (2, “world”)]
makeLists function:
makeLists :: [(Int, Word)] -> [([Int], Word)]
makeLists
= map mk -- all pairs
where mk (num, word) = ([num], word)
-- list of single no.
Examples:
mk (1, “hello”) => ([1], “hello”)
makeLists [(1, “hello”), (2, “next”), (1, “world”), (2, “world”)]
=> [([1], “hello”), ([2], “next”), ([1], “world”), ([2], “world”)]
Amalgamate function:
amalgamate :: [([Int], Word)] -> [([Int], Word)]
amalgamate [ ] = [ ]
amalgamate [a] = [a]
amalgamate ((n1, w1) : (n2, w2) : rest) -- pairs of pairs
| w1 /= w2 = (n1, w1) : amalgamate ((n2, w2) : rest)
| otherwise = amalgamate ((n1 ++ n2, w1) : rest)
-- if words are same grow list of numbers
Example:
amalgamate [([1], “hello”), ([2], “next”), ([1], “world”), ([2], “world”)]
=> [([1], “hello”), ([2], “next”), ([1, 2], “world”)]
Shorten function:
shorten :: [([Int], Word)] -> [([Int], Word)]
shorten
= filter long -- keep pairs >4
where
long (num, word) = length word > 4 -- check word >4
Example:
shorten [([1], “hello”), ([2], “next”), ([1, 2], “world”)]
=> [([1], “hello”), ([1, 2], “world”)]