
On 22 Oct 2007, at 10:50 pm, Neil Mitchell wrote:
wordsBy :: (a -> Bool) -> [a] -> [[a]] wordsBy p s = case dropWhile p s of [] -> [] s':rest -> (s':w) : wordsBy p (drop 1 s'') where (w, s'') = break p rest
For things like this, I think it may be easier to just write down a finite state automaton, so that it is *obvious* that each character is fetched and tested exactly once. wordsBy :: (a -> Bool) -> [a] -> [[a]] wordsBy sep str = s_skip str where s_skip [] = [] s_skip (c:cs) = if sep c then s_skip cs else s_word cs [c] s_word [] w = [reverse w] s_word (c:cs) w = if sep c then reverse w : s_skip cs else s_word cs (c:w) -- s_skip state: not in a word, skipping separators -- s_word state: in a word, accumulating "letters" Of course the s_skip state corresponds to the dropWhile call, but is there any guarantee about how often dropWhile or break do their tests? The FSA approach remains easy to use in more complex cases where dropWhile and break are awkward.