Defining 'words' in terms of 'span'

I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html And I have been trying some questions from them. I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps A word is a sequence of alphabetic characters, which you can recognise using the standard function isAlpha :: Char -> Bool Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example, words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"] words "2+3" == [] words "1 by 1" == ["by"] Can anyone give me a clue how to start? -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================

You might start with looking into
spanhttp://haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Prelude.html#...
.
span :: (a -> Bool) -> [a] -> ([a], [a])
For example try these,
span ('a'==) "abcd"
span ('a'==) "aaabcd"
span ('c'==) "abcd"
span ('c'/=) "abcd"
span ('c'/=) "abcdcdcd"
On 16 March 2010 17:08, Roger Whittaker
I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html
And I have been trying some questions from them.
I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps
A word is a sequence of alphabetic characters, which you can recognise using the standard function
isAlpha :: Char -> Bool
Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,
words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]
words "2+3" == []
words "1 by 1" == ["by"]
Can anyone give me a clue how to start?
-- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ======================== _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners
-- Ozgur Akgun

Am Dienstag 16 März 2010 18:23:33 schrieb Ozgur Akgun:
You might start with looking into span<http://haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Prel ude.html#v:span> .
span :: (a -> Bool) -> [a] -> ([a], [a])
And then, using span, write a function that splits the first word off a String. And a function to use when the String doesn't start with a word (like "", " made glorious summer by this son of York.", "2+3", ...).
For example try these,
span ('a'==) "abcd" span ('a'==) "aaabcd" span ('c'==) "abcd" span ('c'/=) "abcd" span ('c'/=) "abcdcdcd"
On 16 March 2010 17:08, Roger Whittaker
wrote: I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html
And I have been trying some questions from them.
I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.p s
A word is a sequence of alphabetic characters, which you can recognise using the standard function
isAlpha :: Char -> Bool
Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,
words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]
words "2+3" == []
words "1 by 1" == ["by"]
Can anyone give me a clue how to start?
-- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ======================== _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

"Roger Whittaker"
I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html
And I have been trying some questions from them.
I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps
A word is a sequence of alphabetic characters, which you can recognise using the standard function
isAlpha :: Char -> Bool
Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,
words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]
words "2+3" == []
words "1 by 1" == ["by"]
Can anyone give me a clue how to start?
words' = filter (all isAlpha) . words ...but since they specifically want span in the answer I'd guess they want some sort of recursive function words' s = wGen s [] where wGen s a = let (n,ws) = span (\x -> not (isAlpha x)) s (w,ss) = span (isAlpha) ws in case (w=="",ss="") of True, _ -> reverse a False, True -> reverse (w:a) False, False -> wGen ss (w:a) something ugly like that maybe.

Am Dienstag 16 März 2010 21:54:24 schrieb Tim Attwood:
"Roger Whittaker"
wrote in message news:20100316170815.GA19787@disruptive.org.uk... I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html
And I have been trying some questions from them.
I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.p s
A word is a sequence of alphabetic characters, which you can recognise using the standard function
isAlpha :: Char -> Bool
Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,
words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]
words "2+3" == []
words "1 by 1" == ["by"]
Can anyone give me a clue how to start?
words' = filter (all isAlpha) . words
Prelude> words' "Now is the winter of our discontent!" ["Now","is","the","winter","of","our"] Maybe you thought of words'' = map (filter isAlpha) . words , but that doesn't do the required thing for e.g. "oops,forgot the space after the comma!"
...but since they specifically want span in the answer I'd guess they want some sort of recursive function
Sure. At the bottom, such a function must use recursion. The question is, can you employ some combinators like fold*, or do you have to recur explicitly.
words' s = wGen s [] where wGen s a = let (n,ws) = span (\x -> not (isAlpha x)) s (w,ss) = span (isAlpha) ws in case (w=="",ss="") of True, _ -> reverse a False, True -> reverse (w:a) False, False -> wGen ss (w:a)
something ugly like that maybe.
More like words' s = case span isAlpha (dropWhile (not . isAlpha) s) of ("",_) -> [] (w,rest) -> w:words' rest

Thanks for the suggestions - I think the idea was that you had to use a recursive definition in terms of span. I eventually did this, but didn't feel it was at all nice or elegant: import Data.Char mywords :: [Char] -> [[Char]] mywords l | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) == "") = [] | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) /= "") = [fst (span isAlpha l)] | (fst (span isAlpha l)) == "" = mywords (tail (snd (span isAlpha l))) | otherwise = [fst (span isAlpha l)] ++ mywords (tail (snd (span isAlpha l))) -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================

Roger Whittaker wrote:
Thanks for the suggestions - I think the idea was that you had to use a recursive definition in terms of span.
I eventually did this, but didn't feel it was at all nice or elegant:
import Data.Char
mywords :: [Char] -> [[Char]]
mywords l | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) == "") = [] | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) /= "") = [fst (span isAlpha l)] | (fst (span isAlpha l)) == "" = mywords (tail (snd (span isAlpha l))) | otherwise = [fst (span isAlpha l)] ++ mywords (tail (snd (span isAlpha l)))
You can perform a few simple steps to improve this definition. First of all, introduce a variable for the the common span isAlpha part and use pattern matching instead of fst and snd . mywords xs | r == "" && l == "" = [] | r == "" && l /= "" = [l] | l == "" = mywords (tail r) | otherwise = [l] ++ mywords (tail r) where (l,r) = span isAlpha xs There, much better already. Then, you can group the four different cases, noting that they are actually independent of each other: the result is always a concatenation of either [] or [l] with either [] or mywords (tail r) mywords xs = y ++ ys where (l,r) = span isAlpha xs y = if l == "" then [] else [l] ys = if r == "" then [] else mywords (tail r) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

On Wed, Mar 17, 2010 at 10:10:27AM +0100, Heinrich Apfelmus wrote:
Then, you can group the four different cases, noting that they are actually independent of each other: the result is always a concatenation of either [] or [l] with either [] or mywords (tail r)
mywords xs = y ++ ys where (l,r) = span isAlpha xs y = if l == "" then [] else [l] ys = if r == "" then [] else mywords (tail r)
That's a great help - thank you. -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================
participants (5)
-
Daniel Fischer
-
Heinrich Apfelmus
-
Ozgur Akgun
-
Roger Whittaker
-
Tim Attwood