
On 28 March 2011 17:55, malcolm.wallace
Does anyone else think it odd that Prelude.words will break a string at a non-breaking space?
    Prelude> words "abc def\xA0ghi"
    ["abc","def","ghi"]
I think it's predictable: isSpace (which words is based on) is based on generalCategory, which returns the proper Unicode category:

    λ> generalCategory '\xa0'
    Space

So:

    -- | Selects white-space characters in the Latin-1 range.
    -- (In Unicode terms, this includes spaces and some control characters.)
    isSpace :: Char -> Bool
    -- isSpace includes non-breaking space
    -- Done with explicit equalities both for efficiency, and to avoid a tiresome
    -- recursion with GHC.List elem
    isSpace c = c == ' '  ||
                c == '\t' ||
                c == '\n' ||
                c == '\r' ||
                c == '\f' ||
                c == '\v' ||
                c == '\xa0' ||
                iswspace (fromIntegral (ord c)) /= 0
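
If you do want splitting that leaves a non-breaking space inside a word, here is a minimal sketch. The names isBreakingSpace and wordsBreaking are made up for illustration and are not part of base. It assumes the only characters to exclude are the three no-break spaces in Unicode category Zs (U+00A0, U+2007, U+202F); everything else is left to isSpace.

```haskell
import Data.Char (isSpace)

-- Hypothetical helper (not in base): True for white space that is allowed
-- to separate words, i.e. anything isSpace accepts except the no-break
-- spaces U+00A0, U+2007 (figure space) and U+202F (narrow no-break space).
isBreakingSpace :: Char -> Bool
isBreakingSpace c = isSpace c && c `notElem` noBreakSpaces
  where
    noBreakSpaces = "\xA0\x2007\x202F"

-- Hypothetical variant of Prelude.words that splits only on breaking
-- white space. Same shape as words: skip leading separators, take one
-- word, recurse on the rest.
wordsBreaking :: String -> [String]
wordsBreaking s =
  case dropWhile isBreakingSpace s of
    "" -> []
    s' ->
      let (w, rest) = break isBreakingSpace s'
      in  w : wordsBreaking rest
```

With this, wordsBreaking "abc def\xA0ghi" gives ["abc","def\160ghi"], while Prelude.words still gives ["abc","def","ghi"] for the same input.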