Hi,

I'm trying to parse a tab-delimited file using cassava/Data.Csv in Haskell. However, I run into problems if there are "strange" (Unicode) characters in my CSV file: I then get a parse error (endOfInput).

According to the command-line tool "file", my file has a "UTF-8 Unicode text" encoding. My Haskell code looks like this:

    {-# LANGUAGE ScopedTypeVariables #-}
    {-# LANGUAGE OverloadedStrings #-}

    import qualified Data.ByteString as C
    import qualified System.IO.UTF8 as U
    import qualified Data.ByteString.UTF8 as UB
    import qualified Data.ByteString.Lazy.Char8 as DL
    import qualified Codec.Binary.UTF8.String as US
    import qualified Data.Text.Lazy.Encoding as EL
    import qualified Data.ByteString.Lazy as L
    import Data.Text.Encoding as E

    -- Handle CSV / TSV files with ...
    import Data.Csv
    import qualified Data.Vector as V
    import Data.Char -- ord

    csvFile :: FilePath
    csvFile = "myFile.txt"

    -- Set delimiter to \t (tabulator)
    myOptions = defaultDecodeOptions {
        decDelimiter = fromIntegral (ord '\t')
      }

    main :: IO ()
    main = do
      csvData <- L.readFile csvFile
      case EL.decodeUtf8' csvData of
        Left err -> print err
        Right dat ->
          case decodeWith myOptions NoHeader $ EL.encodeUtf8 dat of
            Left err -> putStrLn err
            Right v -> V.forM_ v $ \ (category :: String,
                                       user :: String,
                                       date :: String,
                                       time :: String,
                                       message :: String) -> do
              print message

I tried using decodeUtf8', preprocessing (filtering) the input with predicates from Data.Char, and much more. However, the endOfInput error persists.

My CSV file looks like this:

    a - - - RT USE " Kenny" • Hahahahahahahahaha. #Emmen #Brandstapel
    a - - - Uhm .. wat dan ook ????!!!! 👋

Or, more literally:

    a\t-\t-\t-\tRT USE " Kenny" • Hahahahahahahahaha. #Emmen #Brandstapel
    a\t-\t-\t-\tUhm .. wat dan ook ????!!!! 👋

The problematic characters are the 👋 and the • (and in my complete file there are many more similar characters).

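One way to narrow down which rows actually trigger the error is to decode the file one line at a time and report the lines that fail. The following is only a minimal sketch, not a fix: it assumes that no field contains an embedded newline, it reuses the tab delimiter from the program above, and the names tsvOptions and failingLines are hypothetical helpers introduced just for this illustration.

```haskell
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as LC
import qualified Data.Vector as V
import Data.Char (ord)
import Data.Csv (DecodeOptions (..), HasHeader (..), decodeWith,
                 defaultDecodeOptions)

-- Same tab delimiter as myOptions in the program above.
tsvOptions :: DecodeOptions
tsvOptions = defaultDecodeOptions { decDelimiter = fromIntegral (ord '\t') }

-- Hypothetical helper: decode every line on its own and collect
-- (line number, error message) for each line cassava rejects.
-- Assumes records never span several lines.
failingLines :: L.ByteString -> [(Int, String)]
failingLines contents =
  [ (n, err)
  | (n, line) <- zip [1 ..] (LC.lines contents)
  , Left err <- [decodeRow line]
  ]
  where
    -- Fields are kept as raw bytes, so no text decoding is involved here.
    decodeRow :: L.ByteString -> Either String (V.Vector [L.ByteString])
    decodeRow = decodeWith tsvOptions NoHeader

main :: IO ()
main = do
  contents <- L.readFile "myFile.txt"
  mapM_ (\(n, err) -> putStrLn ("line " ++ show n ++ ": " ++ err))
        (failingLines contents)
```

Splitting on the newline byte is safe for UTF-8 input, because that byte never occurs inside a multi-byte sequence. Comparing the reported lines with the ones that decode cleanly should show whether the failures really coincide with the Unicode characters or with something else in those rows.
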
What can I do so that cassava / Data.Csv can read my file properly?

I also posted this question on Stack Overflow a few days ago:
http://stackoverflow.com/questions/26499831/parse-csv-tsv-file-in-haskell-unicode-characters

Best,
Volker
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe