Optimising words

9 Jul 2007

      Hi

While benchmarking a word count program I found that it wasn't running
as fast as it could. I traced this back to the original definition of
words, which isn't as good as it could be:

This is the version in Base:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s' -> w : words s''
				      where (w, s'') =
                                             break isSpace s'

We can instantiate s' with s:ss, since we already pattern match:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s:ss -> w : words s''
				      where (w, s'') =
                                             break isSpace (s:ss)

Now we know that s is not a space, since if it was the dropWhile would
have removed it. This means that we don't need to test it, and can put
it straight on to w:

words			:: String -> [String]
words s			=  case dropWhile isSpace s of
				"" -> []
				s:ss -> (s:w) : words s''
				      where (w, s'') = break isSpace ss

Now we have eliminated an extra isSpace test per character, and as it
happens, isSpace is very slow.

Is my reasoning correct? If so, can we make this optimisation?

Thanks

Neil

Neil Mitchell

Neil Mitchell

Duncan Coutts

Neil Mitchell

Duncan Coutts

dons＠cse.unsw.edu.au

tags

participants (3)