
Hello,

I'm trying out X.A.Search, and as far as I can see it doesn't work with non-Latin characters out of the box. For instance, Cyrillic text gets converted into some hex-looking mess. The following fixes it for me, but I'm not sure if we can add a dependency on utf8-string:

+ import Codec.Binary.UTF8.String ( utf8Encode )
...
- searchEngine name site = searchEngineF name (\s -> site ++ (escape s))
+ searchEngine name site = searchEngineF name (\s -> site ++ (utf8Encode (escape s)))

On Tue, Dec 1, 2009 at 12:59 PM, Konstantin Sobolev wrote:
> Hello,
> I'm trying out X.A.Search, and as far as I can see it doesn't work with non-Latin characters out of the box

Doesn't surprise me at all.

> For instance, Cyrillic text gets converted into some hex-looking mess. The following fixes it for me, but I'm not sure if we can add a dependency on utf8-string:

We can, since it's already there: http://hackage.haskell.org/package/xmonad-contrib

> + import Codec.Binary.UTF8.String ( utf8Encode )
> ...
> - searchEngine name site = searchEngineF name (\s -> site ++ (escape s))
> + searchEngine name site = searchEngineF name (\s -> site ++ (utf8Encode (escape s)))

The problem is that encoding is apparently not a cost-free solution, and we've run into many issues in the past: http://www.google.com/search?q=UTF-8+site:http://www.haskell.org/pipermail/xmonad/&start=10&sa=N

From what I remember of the spawn & prompt discussions, someone said using Encode can mess up still other strings and there's no way to know whether or not to use Encode.
-- gwern
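
One way the "can mess up still other strings" hazard might show up is double encoding: if a string has already been turned into UTF-8 bytes somewhere upstream, encoding it again changes it. The following is a minimal sketch of that effect, not code from xmonad-contrib. `utf8EncodeSketch` is a hypothetical hand-rolled stand-in for `utf8Encode` so that the example needs only `base`, and the assumption that the earlier problems were caused by double encoding is mine, not something the thread confirms.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)

-- | Hypothetical stand-in for utf8Encode: every Char of the result
-- holds one byte of the UTF-8 encoding of the input.
-- (Surrogate code points are not treated specially in this sketch.)
utf8EncodeSketch :: String -> String
utf8EncodeSketch = concatMap (map chr . encodeChar . ord)
  where
    encodeChar c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)

-- Cyrillic "pe" (U+043F), written as an escape to avoid source-encoding issues.
once, twice :: String
once  = utf8EncodeSketch "\1087"  -- "\208\191"          (bytes D0 BF)
twice = utf8EncodeSketch once     -- "\195\144\194\191"  (bytes C3 90 C2 BF)

-- Pure ASCII input is unaffected, which is why the problem is easy to miss.
asciiUnchanged :: Bool
asciiUnchanged = utf8EncodeSketch "xmonad" == "xmonad"
```

Because `once` and `twice` differ, a caller that cannot tell whether its input is already encoded has no safe default, which matches the concern raised above.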

>> + import Codec.Binary.UTF8.String ( utf8Encode )
>> ...
>> - searchEngine name site = searchEngineF name (\s -> site ++ (escape s))
>> + searchEngine name site = searchEngineF name (\s -> site ++ (utf8Encode (escape s)))
> The problem is that encoding is apparently not a cost-free solution, and we've run into many issues in the past: http://www.google.com/search?q=UTF-8+site:http://www.haskell.org/pipermail/xmonad/&start=10&sa=N

Here's a refined search: http://www.google.com/search?q=search+-patch+UTF-8+site:http://www.haskell.o...

> From what I remember of the spawn & prompt discussions, someone said using Encode can mess up still other strings and there's no way to know whether or not to use Encode.

As I understand it, this might not work on systems with a non-Unicode locale? I've ended up with this workaround in my own config:

utf8SearchEngine :: SearchEngine -> SearchEngine
utf8SearchEngine (SearchEngine name site) = searchEngineF name (utf8Encode . site)

google' = utf8SearchEngine google
wikipedia' = utf8SearchEngine wikipedia
...
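
The workaround above is an instance of a more general pattern: wrap an existing engine so that the URL it builds is post-processed by some function. Here is a self-contained sketch of that pattern. The `Name`, `Site`, `SearchEngine`, and `searchEngineF` definitions are local stand-ins that only mirror the shapes used in the message, `mapUrl` and `urlFor` are hypothetical helpers, and the example URL is a placeholder. In a real config one would presumably pass `utf8Encode` instead of the toy transformation used here.

```haskell
import Data.Char (toUpper)

-- Local stand-ins mirroring the shapes used in the thread;
-- these are NOT the definitions from XMonad.Actions.Search.
type Name = String
type Site = String -> String
data SearchEngine = SearchEngine Name Site

searchEngineF :: Name -> Site -> SearchEngine
searchEngineF = SearchEngine

-- | Hypothetical helper: post-process the URL an engine produces.
-- With f = utf8Encode this has the same shape as utf8SearchEngine.
mapUrl :: (String -> String) -> SearchEngine -> SearchEngine
mapUrl f (SearchEngine name site) = searchEngineF name (f . site)

-- | Hypothetical helper: build the URL for a query.
urlFor :: SearchEngine -> String -> String
urlFor (SearchEngine _ site) = site

-- Toy engine with a placeholder URL, for illustration only.
demo :: SearchEngine
demo = searchEngineF "demo" ("http://example.invalid/?q=" ++)

-- A visible transformation standing in for an encoder.
shouted :: SearchEngine
shouted = mapUrl (map toUpper) demo
```

Keeping the transformation as a parameter leaves the choice of whether to encode to each user's config, which sidesteps the question of whether encoding is safe for every locale.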
participants (2): Gwern Branwen, Konstantin Sobolev