Another approach would be to use the Data.Text.IO.hGetContents function on a file handle that explicitly sets its character encoding to UTF-8. This is what we do in rio:

https://www.stackage.org/haddock/lts-13.10/rio-0.1.8.0/src/RIO.Prelude.IO.html#readFileUtf8
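
For readers who do not want to depend on rio, here is a minimal sketch of the same idea using only base and text. The names readFileUtf8Sketch and readFileUtf8Either are hypothetical, not part of rio or text, and this is not necessarily how rio implements readFileUtf8. It assumes that the strict Data.Text.IO.hGetContents reads and decodes the whole file before returning, so a decoding failure is raised inside the IO action where try or catch can see it.

```haskell
import Control.Exception (SomeException, try)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (IOMode (ReadMode), hSetEncoding, utf8, withFile)

-- Hypothetical helper: open the file, force the handle to UTF-8,
-- and read the whole contents strictly.
readFileUtf8Sketch :: FilePath -> IO T.Text
readFileUtf8Sketch path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    TIO.hGetContents h

-- Hypothetical wrapper in the style of readCDFile: turn any exception
-- (missing file, invalid byte sequence) into a Left.
-- Catching SomeException mirrors the original post; narrower exception
-- types are usually preferable in real code.
readFileUtf8Either :: FilePath -> IO (Either String T.Text)
readFileUtf8Either path = do
  r <- try (readFileUtf8Sketch path)
  pure $ case r of
    Left e  -> Left ("MyError: " ++ show (e :: SomeException))
    Right t -> Right t
```

With this shape the decoding happens while the handle is being read, so there is no lazy thunk left over to fail later in print.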

On Mon, Mar 4, 2019 at 3:36 PM David Fox <dsf@seereason.com> wrote:
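The exception escapes because return $ Right $ TE.decodeUtf8 bufferStrict only returns an unevaluated thunk: the decode does not run until print res forces it, after catch has already returned. This fixes it by forcing the evaluation of the decode where it can be caught: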

return $ Right $! TE.decodeUtf8 bufferStrict

or 

Right <$> evaluate (TE.decodeUtf8 bufferStrict)
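
To make the difference concrete, here is a small self-contained sketch that works on in-memory bytes instead of a file. The names badBytes, decodeForced and decodeExplicit are hypothetical and exist only for illustration. decodeForced follows the evaluate variant above; decodeExplicit is a different technique, using TE.decodeUtf8' from Data.Text.Encoding, which reports the failure as a Left value instead of throwing.

```haskell
import Control.Exception (SomeException, evaluate, try)
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical sample input: "hi" followed by a stray 0xa0 byte,
-- which is not valid UTF-8 on its own.
badBytes :: BS.ByteString
badBytes = BS.pack [0x68, 0x69, 0xa0]

-- Force the decode inside IO so that try can observe the exception.
decodeForced :: BS.ByteString -> IO (Either String T.Text)
decodeForced bs = do
  r <- try (evaluate (TE.decodeUtf8 bs))
  pure $ case r of
    Left e  -> Left ("MyError: " ++ show (e :: SomeException))
    Right t -> Right t

-- Alternative without exceptions: decodeUtf8' returns
-- Either UnicodeException Text, so no forcing tricks are needed.
decodeExplicit :: BS.ByteString -> Either String T.Text
decodeExplicit bs = either (Left . show) Right (TE.decodeUtf8' bs)
```

Both functions return a Left for badBytes. The second one keeps the error in the type, which avoids depending on when a lazy value happens to be evaluated.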



On Mon, Mar 4, 2019 at 3:50 AM Kees Bleijenberg <K.Bleijenberg@lijbrandt.nl> wrote:

Hi all,

 

My program reads lots of small text files. readCDFile handles the encoding. Below is the simplest version of readCDFile.

If I call readCDFile "/home/kees/freeDB/inputError/" "blah" (the file blah does not exist) I get:

Left "MyError: /home/kees/freeDB/inputError/blah: openBinaryFile: does not exist (No such file or directory)". The exception is caught by exceptionHandler

If I call readCDFile "/home/kees/freeDB/inputError/" "67129209" I get:

freeDB: Cannot decode byte '\xa0': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

This exception is not caught by exceptionHandler (there is no "MyError: " in front). The file 67129209 is indeed badly encoded.

I am using SomeException. Still, this "bad encoding" exception is not caught. Why?

 

Kees

 

import qualified Data.Text as T
import System.FilePath.Posix
import qualified Data.Text.Encoding as TE
import qualified Data.ByteString.Lazy as B
import Prelude hiding (catch)
import Control.Exception

main :: IO ()
main = do
          res <- readCDFile "/home/kees/freeDB/inputError/" "67129209"
          print res

readCDFile :: FilePath -> FilePath -> IO (Either String T.Text)
readCDFile baseDir fn = do
  catch ( do
            buffer <- B.readFile (combine baseDir fn)
            let bufferStrict = B.toStrict buffer
            return $ Right $ TE.decodeUtf8 bufferStrict
         ) exceptionHandler

exceptionHandler :: SomeException -> IO (Either String T.Text)
exceptionHandler e = do let err = show e
                        return $ Left $ "MyError: " ++ err

 

 


_______________________________________________
Haskell-Cafe mailing list
To (un)subscribe, modify options or view archives go to:
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Only members subscribed via the mailman list are allowed to post.