
For future reference, here is a hacky patch against bytestring-csv-0.1.2 that fixes this. The cost center (SCC) annotations were used to verify, anecdotally, that the change doesn't significantly impact performance. (Interestingly, the Alex lexer in bytestring-csv appears to allocate 1.5GB while lexing a 1.6MB CSV file!?)

Text/CSV/ByteString.hs
65c65
< fields = [ unquote s | Item s <- line ]
---
> fields = [ unquote s | Item s <- pline line]
76a77,86
> pline fs@(Item x : []) = fs
> pline (Item x : Comma : []) = {-# SCC "plinea" #-} Item x : Comma : Item S.empty : []
> pline (Item x : Comma : rs) = {-# SCC "plineb" #-} Item x : Comma : pline rs
> pline (Comma : []) = {-# SCC "plinec" #-} Comma : Item S.empty : Comma : Item S.empty : []
> pline (Comma : rs) = {-# SCC "plined" #-} Item S.empty : Comma : pline rs
> pline (Newline : rs ) = []
> pline [] = []
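
To try the idea outside the package, here is a minimal standalone sketch. It assumes a stand-in Token type with only the three constructors the patch matches on (the real lexer type in bytestring-csv may differ), it drops the unquote step, and it adds a final catch-all clause that is not part of the patch:

```haskell
import qualified Data.ByteString.Char8 as S

-- Hypothetical stand-in for the lexer's token type. Only the three
-- constructors that the patch pattern-matches on are modelled here.
data Token = Item S.ByteString | Comma | Newline
  deriving (Eq, Show)

-- Same clauses as the patch, without the SCC annotations. An empty Item
-- is inserted wherever a Comma is not preceded or followed by an Item.
pline :: [Token] -> [Token]
pline fs@(Item _ : [])      = fs
pline (Item x : Comma : []) = Item x : Comma : Item S.empty : []
pline (Item x : Comma : rs) = Item x : Comma : pline rs
pline (Comma : [])          = Comma : Item S.empty : Comma : Item S.empty : []
pline (Comma : rs)          = Item S.empty : Comma : pline rs
pline (Newline : _)         = []
pline []                    = []
-- Not in the patch: pass through token shapes the clauses above do not
-- cover (for example two adjacent Items) instead of failing at runtime.
pline ts                    = ts

-- Field extraction as in the patched comprehension, minus unquote.
fields :: [Token] -> [S.ByteString]
fields line = [ s | Item s <- pline line ]
```

With these definitions, fields [Item (S.pack "1"), Comma, Comma, Item (S.pack "9")] evaluates to ["1","","9"], so the empty middle field from the row 1,,9 is kept.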
On 17 February 2012 23:16, Tom Doris wrote:
The bytestring-csv package appears to drop empty fields from a row completely, unlike Text.CSV, which returns an empty field in the parse result. I'd argue this is a bug in bytestring-csv. Does anyone know whether this has been raised before, or know of a workaround?
Prelude Data.Maybe Data.List Text.CSV.ByteString Data.ByteString.Char8> parseCSV $ pack "a,b,c\n1,2,3\n1,,9\n"
Just [["a","b","c"],["1","2","3"],["1","9"]]
-- the last row has two fields    ^
Prelude Text.CSV> parseCSV "/tmp/err" "a,b,c\n1,2,3\n1,,9\n"
Right [["a","b","c"],["1","2","3"],["1","","9"],[""]]