Defining 'words' in terms of 'span'

Roger Whittaker

16 Mar 2010 16 Mar '10

5:08 p.m.

I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html And I have been trying some questions from them. I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps A word is a sequence of alphabetic characters, which you can recognise using the standard function isAlpha :: Char -> Bool Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example, words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"] words "2+3" == [] words "1 by 1" == ["by"] Can anyone give me a clue how to start? -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================

Show replies by date

Ozgur Akgun

16 Mar 16 Mar

5:23 p.m.

You might start with looking into spanhttp://haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Prelude.html#... . span :: (a -> Bool) -> [a] -> ([a], [a]) For example try these, span ('a'==) "abcd" span ('a'==) "aaabcd" span ('c'==) "abcd" span ('c'/=) "abcd" span ('c'/=) "abcdcdcd" On 16 March 2010 17:08, Roger Whittaker wrote:

...

I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html

And I have been trying some questions from them.

I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps

A word is a sequence of alphabetic characters, which you can recognise using the standard function

isAlpha :: Char -> Bool

Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,

words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]

words "2+3" == []

words "1 by 1" == ["by"]

Can anyone give me a clue how to start?

-- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ======================== _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

-- Ozgur Akgun

Daniel Fischer

6:36 p.m.

Am Dienstag 16 März 2010 18:23:33 schrieb Ozgur Akgun:

...

You might start with looking into span<http://haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Prel ude.html#v:span> .

span :: (a -> Bool) -> [a] -> ([a], [a])

And then, using span, write a function that splits the first word off a String. And a function to use when the String doesn't start with a word (like "", " made glorious summer by this son of York.", "2+3", ...).

...

For example try these,

span ('a'==) "abcd" span ('a'==) "aaabcd" span ('c'==) "abcd" span ('c'/=) "abcd" span ('c'/=) "abcdcdcd"

On 16 March 2010 17:08, Roger Whittaker wrote:

...
I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html

And I have been trying some questions from them.

I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.p s

A word is a sequence of alphabetic characters, which you can recognise using the standard function

isAlpha :: Char -> Bool

Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,

words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]

words "2+3" == []

words "1 by 1" == ["by"]

Can anyone give me a clue how to start?

-- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ======================== _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

Tim Attwood

8:54 p.m.

"Roger Whittaker" wrote in message news:20100316170815.GA19787@disruptive.org.uk...

...

I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html

And I have been trying some questions from them.

I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.ps

A word is a sequence of alphabetic characters, which you can recognise using the standard function

isAlpha :: Char -> Bool

Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,

words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]

words "2+3" == []

words "1 by 1" == ["by"]

Can anyone give me a clue how to start?

words' = filter (all isAlpha) . words ...but since they specifically want span in the answer I'd guess they want some sort of recursive function words' s = wGen s [] where wGen s a = let (n,ws) = span (\x -> not (isAlpha x)) s (w,ss) = span (isAlpha) ws in case (w=="",ss="") of True, _ -> reverse a False, True -> reverse (w:a) False, False -> wGen ss (w:a) something ugly like that maybe.

Daniel Fischer

9:22 p.m.

Am Dienstag 16 März 2010 21:54:24 schrieb Tim Attwood:

...

"Roger Whittaker" wrote in message news:20100316170815.GA19787@disruptive.org.uk...

...
I found some exam papers linked from this page: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/external.html

And I have been trying some questions from them.

I'm currently baffled by question 2(b) on this one: http://www.cs.chalmers.se/Cs/Grundutb/Kurser/d1pt/d1pta/tenta2000-04.p s

A word is a sequence of alphabetic characters, which you can recognise using the standard function

isAlpha :: Char -> Bool

Using span, define a function words :: String -> [String] which finds a list of the words occurring in a string. For example,

words "Now is the winter of our discontent!" == ["Now","is","the","winter","of","our","discontent"]

words "2+3" == []

words "1 by 1" == ["by"]

Can anyone give me a clue how to start?

words' = filter (all isAlpha) . words

Prelude> words' "Now is the winter of our discontent!" ["Now","is","the","winter","of","our"] Maybe you thought of words'' = map (filter isAlpha) . words , but that doesn't do the required thing for e.g. "oops,forgot the space after the comma!"

...

...but since they specifically want span in the answer I'd guess they want some sort of recursive function

Sure. At the bottom, such a function must use recursion. The question is, can you employ some combinators like fold*, or do you have to recur explicitly.

...

words' s = wGen s [] where wGen s a = let (n,ws) = span (\x -> not (isAlpha x)) s (w,ss) = span (isAlpha) ws in case (w=="",ss="") of True, _ -> reverse a False, True -> reverse (w:a) False, False -> wGen ss (w:a)

something ugly like that maybe.

More like words' s = case span isAlpha (dropWhile (not . isAlpha) s) of ("",_) -> [] (w,rest) -> w:words' rest

Roger Whittaker

10:02 p.m.

Thanks for the suggestions - I think the idea was that you had to use a recursive definition in terms of span. I eventually did this, but didn't feel it was at all nice or elegant: import Data.Char mywords :: [Char] -> [[Char]] mywords l | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) == "") = [] | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) /= "") = [fst (span isAlpha l)] | (fst (span isAlpha l)) == "" = mywords (tail (snd (span isAlpha l))) | otherwise = [fst (span isAlpha l)] ++ mywords (tail (snd (span isAlpha l))) -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================

Heinrich Apfelmus

17 Mar 17 Mar

9:10 a.m.

Roger Whittaker wrote:

...

Thanks for the suggestions - I think the idea was that you had to use a recursive definition in terms of span.

I eventually did this, but didn't feel it was at all nice or elegant:

import Data.Char

mywords :: [Char] -> [[Char]]

mywords l | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) == "") = [] | ((snd (span isAlpha l)) == "" && (fst (span isAlpha l)) /= "") = [fst (span isAlpha l)] | (fst (span isAlpha l)) == "" = mywords (tail (snd (span isAlpha l))) | otherwise = [fst (span isAlpha l)] ++ mywords (tail (snd (span isAlpha l)))

You can perform a few simple steps to improve this definition. First of all, introduce a variable for the the common span isAlpha part and use pattern matching instead of fst and snd . mywords xs | r == "" && l == "" = [] | r == "" && l /= "" = [l] | l == "" = mywords (tail r) | otherwise = [l] ++ mywords (tail r) where (l,r) = span isAlpha xs There, much better already. Then, you can group the four different cases, noting that they are actually independent of each other: the result is always a concatenation of either [] or [l] with either [] or mywords (tail r) mywords xs = y ++ ys where (l,r) = span isAlpha xs y = if l == "" then [] else [l] ys = if r == "" then [] else mywords (tail r) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Roger Whittaker

6:03 p.m.

On Wed, Mar 17, 2010 at 10:10:27AM +0100, Heinrich Apfelmus wrote:

...

Then, you can group the four different cases, noting that they are actually independent of each other: the result is always a concatenation of either [] or [l] with either [] or mywords (tail r)

mywords xs = y ++ ys where (l,r) = span isAlpha xs y = if l == "" then [] else [l] ys = if r == "" then [] else mywords (tail r)

That's a great help - thank you. -- ======================== Roger Whittaker roger@disruptive.org.uk http://disruptive.org.uk ========================

5590

Age (days ago)

5591

Last active (days ago)

List overview

Download

7 comments

5 participants

participants (5)

Daniel Fischer
Heinrich Apfelmus
Ozgur Akgun
Roger Whittaker
Tim Attwood