Dealing with encodings

Hello!
How can I teach hxt to handle the "KOI8-R" encoding of an input file?
Also, it seems that many great packages (like hxt, feed, curl) use String.
Is there any work in progress to port them to Text?
--
Best regards, Dmitry Bogatov

On 5 August 2014 22:00, Dmitry Bogatov wrote:
> Hello!
> How can I teach hxt to handle the "KOI8-R" encoding of an input file?
http://hackage.haskell.org/package/text-icu ?
> Also, it seems that many great packages (like hxt, feed, curl) use String. Is there any work in progress to port them to Text?
Because no-one has changed them to do so. Some of them probably also predate the rise in popularity of text.
> -- Best regards, Dmitry Bogatov, Free Software supporter, esperantisto and netiquette guardian. GPG: 54B7F00D
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com http://IvanMiljenovic.wordpress.com

Please, don't do that. The overabundance of Cyrillic encodings caused great pain in the past; don't help this genie out of the bottle again. Especially since KOI8-R is the worst of the Cyrillic encodings.
On 05 Aug 2014, at 16:00, Dmitry Bogatov wrote:
> How can I teach hxt to handle the "KOI8-R" encoding of an input file?
> Also, it seems that many great packages (like hxt, feed, curl) use String. Is there any work in progress to port them to Text?

* MigMit [2014-08-05 19:12:19+0400]:
> Please, don't do that. The overabundance of Cyrillic encodings caused great pain in the past; don't help this genie out of the bottle again. Especially since KOI8-R is the worst of the Cyrillic encodings.
I totally agree that using anything but UTF-8 is a crime. But a fact is
a fact: I need to parse an HTML page that is KOI8-encoded. What should I do?
--
Best regards, Dmitry Bogatov

Use text-icu and text to decode it into Text, then use the standard tools.
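A minimal sketch of that suggestion, assuming the text-icu, text and bytestring packages are installed (and the ICU C library that text-icu binds to). The helper name decodeKoi8R is made up for illustration; open and toUnicode come from Data.Text.ICU.Convert.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.ICU.Convert as ICU

-- | Decode KOI8-R bytes into Text using an ICU converter.
-- (decodeKoi8R is a hypothetical helper name, not part of text-icu.)
decodeKoi8R :: B.ByteString -> IO T.Text
decodeKoi8R raw = do
  -- Nothing = keep ICU's default setting for fallback mappings
  conv <- ICU.open "KOI8-R" Nothing
  -- toUnicode converts bytes in the converter's encoding to Text
  return (ICU.toUnicode conv raw)
```

Something like `B.readFile "page.html" >>= decodeKoi8R` (the file name is a placeholder) would then give a Text value; `T.unpack` turns it into a String for libraries that still want one.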
On Tue, Aug 5, 2014 at 1:06 PM, Dmitry Bogatov wrote:
> * MigMit [2014-08-05 19:12:19+0400]:
>> Please, don't do that. The overabundance of Cyrillic encodings caused great pain in the past; don't help this genie out of the bottle again. Especially since KOI8-R is the worst of the Cyrillic encodings.
> I totally agree that using anything but UTF-8 is a crime. But a fact is a fact: I need to parse an HTML page that is KOI8-encoded. What should I do?

05.08.2014 16:00, Dmitry Bogatov wrote:
> Hello!
> How can I teach hxt to handle the "KOI8-R" encoding of an input file?
> Also, it seems that many great packages (like hxt, feed, curl) use String. Is there any work in progress to port them to Text?
Prelude> :search encoding
Searching for: encoding
package encoding
Data.Text.Encoding module Data.Text.Encoding
Data.Text.Lazy.Encoding module Data.Text.Lazy.Encoding
GHC.IO.Encoding module GHC.IO.Encoding
System.IO hGetEncoding :: Handle -> IO (Maybe TextEncoding)
GHC.IO.Handle hGetEncoding :: Handle -> IO (Maybe TextEncoding)
System.IO hSetEncoding :: Handle -> TextEncoding -> IO ()
GHC.IO.Handle hSetEncoding :: Handle -> TextEncoding -> IO ()
System.IO localeEncoding :: TextEncoding
GHC.IO.Encoding localeEncoding :: TextEncoding
System.IO mkTextEncoding :: String -> IO TextEncoding
GHC.IO.Encoding mkTextEncoding :: String -> IO TextEncoding
You can read a file in any encoding available on the system and later convert it into Text.
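As a rough sketch of how the functions listed above fit together: mkTextEncoding looks up an encoding by name, hSetEncoding applies it to a handle, and the decoded String is packed into Text. The helper name readTextWithEncoding is hypothetical, and which encoding names are accepted depends on the platform (on Unix-like systems GHC typically delegates to iconv), so "KOI8-R" being available is an assumption.

```haskell
import Control.Exception (evaluate)
import qualified Data.Text as T
import System.IO

-- | Read a file using the named encoding and return its contents as Text.
-- (readTextWithEncoding is a hypothetical helper name.)
readTextWithEncoding :: String -> FilePath -> IO T.Text
readTextWithEncoding encName path = do
  -- mkTextEncoding may fail if the name is unknown on this system
  enc <- mkTextEncoding encName
  withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    contents <- hGetContents h
    -- hGetContents is lazy: force the whole string before withFile closes h
    _ <- evaluate (length contents)
    return (T.pack contents)
```

For example, `readTextWithEncoding "KOI8-R" "page.html"` (file name is a placeholder) would yield the decoded page as Text.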

* Danilov Alexander [2014-08-06 09:25:54+0400]:
> Prelude> :search encoding
> Searching for: encoding
> package encoding
> Data.Text.Encoding module Data.Text.Encoding
> Data.Text.Lazy.Encoding module Data.Text.Lazy.Encoding
> GHC.IO.Encoding module GHC.IO.Encoding
> System.IO hGetEncoding :: Handle -> IO (Maybe TextEncoding)
> GHC.IO.Handle hGetEncoding :: Handle -> IO (Maybe TextEncoding)
> System.IO hSetEncoding :: Handle -> TextEncoding -> IO ()
> GHC.IO.Handle hSetEncoding :: Handle -> TextEncoding -> IO ()
> System.IO localeEncoding :: TextEncoding
> GHC.IO.Encoding localeEncoding :: TextEncoding
> System.IO mkTextEncoding :: String -> IO TextEncoding
> GHC.IO.Encoding mkTextEncoding :: String -> IO TextEncoding
The problem is that I have an XML file from an unknown source. I do not know
the encoding a priori; it is specified in the file itself. And to get it, I need
to parse the XML.
--
Best regards, Dmitry Bogatov

06.08.2014 16:14, Dmitry Bogatov wrote:
> The problem is that I have an XML file from an unknown source. I do not know the encoding a priori; it is specified in the file itself. And to get it, I need to parse the XML.
Usually the encoding is specified in the XML file, and the XML parser may recode the text data itself. I showed you the API for recoding text in case the XML parser is unable to recode it.
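If the parser cannot recode the data itself, one possible way around the chicken-and-egg problem is to read the declared encoding from the raw bytes first. The XML declaration uses only ASCII characters, so for ASCII-compatible encodings (KOI8-R, ISO-8859-x, UTF-8) it can be inspected before any decoding happens. The sketch below is an illustration only: declaredEncoding is a hypothetical helper, it uses just the bytestring package, and it does not handle byte-order marks or UTF-16 input.

```haskell
import Control.Monad (guard)
import qualified Data.ByteString.Char8 as BC
import Data.Char (isSpace)

-- | Extract the encoding name from a leading <?xml ... ?> declaration,
-- looking only at raw bytes. (declaredEncoding is a hypothetical helper.)
declaredEncoding :: BC.ByteString -> Maybe String
declaredEncoding bytes = do
  guard (BC.pack "<?xml" `BC.isPrefixOf` bytes)
  let (decl, _) = BC.breakSubstring (BC.pack "?>") (BC.drop 5 bytes)
      (_, rest) = BC.breakSubstring (BC.pack "encoding") decl
  -- rest is empty when no "encoding" attribute is present, so uncons fails
  ('=', afterEq) <- BC.uncons (BC.dropWhile isSpace (BC.drop 8 rest))
  (quote, value) <- BC.uncons (BC.dropWhile isSpace afterEq)
  guard (quote == '"' || quote == '\'')
  let name = BC.takeWhile (/= quote) value
  guard (not (BC.null name))
  return (BC.unpack name)
```

The resulting name can then be passed to whichever recoding API is used; when the result is Nothing, falling back to UTF-8 matches the XML default for documents without a declaration or byte-order mark.
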
participants (5)
- Carter Schonwald
- Danilov Alexander
- Dmitry Bogatov
- Ivan Lazar Miljenovic
- MigMit