Telling Cassava to ignore lines

I'm happily using Cassava to parse CSV, only to discover that non-conforming lines in the input data are causing the parser to error out.

    let e = decodeByName y' :: Either String (Header, Vector Person)

chugs along fine until line 461 of the input when

    "parse error (endOfInput) at ..."

Ironically when my Person (ha) data type was all fields of :: Text it just worked, but now that I've specified one or two of the fields as Int or Float or whatever, it's mis-parsing.

Is there a way to tell it to just ignore lines that don't parse, rather than it killing the whole run? Cassava understands skipping the *header* line (and indeed using it to do the -by-name field mapping).

Otherwise the only thing I can see is going back to all the fields being :: Text, and then running over that as an intermediate structure and validating whether or not things parse to i.e. float.

AfC
Sydney
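The fallback described in the last paragraph (decode everything as Text, then validate afterwards) could look roughly like the sketch below. The thread never shows the real Person type or the column names, so RawPerson, Person, the "name" and "age" columns, and the helper names validate and decodeValid are all hypothetical. The numeric check uses Data.Text.Read from the text package rather than anything in Cassava.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import Data.Csv (FromNamedRecord (..), decodeByName, (.:))
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Read as TR
import Data.Vector (Vector)
import qualified Data.Vector as V

-- Hypothetical intermediate type: every field stays Text, so Cassava's
-- field conversion cannot fail on a malformed number.
data RawPerson = RawPerson { rawName :: !Text, rawAge :: !Text }

instance FromNamedRecord RawPerson where
    parseNamedRecord r = RawPerson <$> r .: "name" <*> r .: "age"

-- Hypothetical validated type with the typed field the question wants.
data Person = Person { name :: !Text, age :: !Int }
    deriving (Show, Eq)

-- Accept a row only if the age column is a complete decimal number.
validate :: RawPerson -> Maybe Person
validate (RawPerson n a) =
    case TR.decimal a of
        Right (v, rest) | T.null rest -> Just (Person n v)
        _                             -> Nothing

-- Decode as Text first, then silently drop the rows that do not validate.
decodeValid :: BL.ByteString -> Either String (Vector Person)
decodeValid input = V.mapMaybe validate . snd <$> decodeByName input

main :: IO ()
main =
    case decodeValid "name,age\nalice,30\nbob,unknown\ncarol,41\n" of
        Left err     -> putStrLn ("decode failed: " ++ err)
        Right people -> V.mapM_ print people
```

The cost of this route is a second record type and a second pass over the data; a structurally broken CSV file still makes decodeByName return Left for the whole input.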

Hi,

It depends on what you mean by "doesn't parse". From your message I assume the CSV is valid, but some of the actual values fail to convert (using FromField). There are a couple of things you could try:

1. Define a data type for your field whose FromField instance calls runParser using e.g. the Int parser and, if that fails, returns some other value. I should probably add an Either instance that covers this case, but there's none there now.

       data MaybeInt = JustI !Int | ParseFailed

       instance FromField MaybeInt where
           parseField s = case runParser (parseField s) of
               Left _err -> pure ParseFailed
               Right n   -> pure (JustI n)

   (This is from memory, so I might have gotten some of the details wrong.)

2. Use the Streaming module, which lets you skip whole records that fail to parse (see the docs for the Cons constructor).

-- Johan

On Tue, Sep 17, 2013 at 6:43 PM, Andrew Cowie <andrew@operationaldynamics.com> wrote:
> I'm happily using Cassava to parse CSV, only to discover that non-conforming lines in the input data are causing the parser to error out.
>
>     let e = decodeByName y' :: Either String (Header, Vector Person)
>
> chugs along fine until line 461 of the input when
>
>     "parse error (endOfInput) at ..."
>
> Ironically when my Person (ha) data type was all fields of :: Text it just worked, but now that I've specified one or two of the fields as Int or Float or whatever, it's mis-parsing.
>
> Is there a way to tell it to just ignore lines that don't parse, rather than it killing the whole run? Cassava understands skipping the *header* line (and indeed using it to do the -by-name field mapping).
>
> Otherwise the only thing I can see is going back to all the fields being :: Text, and then running over that as an intermediate structure and validating whether or not things parse to i.e. float.
>
> AfC
> Sydney
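For option 2, a minimal sketch of skipping bad records with Data.Csv.Streaming might look like the following. It assumes a hypothetical Person with "name" and "age" columns, since the real type is not shown in the thread; partitionRecords and decodePeople are illustrative helper names, not part of Cassava. The sketch walks the Cons and Nil constructors of Records directly, keeping the per-record conversion errors so they can be logged instead of aborting the run.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import Data.Csv (FromNamedRecord (..), (.:))
import qualified Data.Csv.Streaming as S
import Data.Text (Text)

-- Hypothetical record type; the thread does not show the real Person.
data Person = Person { name :: !Text, age :: !Int }
    deriving (Show, Eq)

instance FromNamedRecord Person where
    parseNamedRecord r = Person <$> r .: "name" <*> r .: "age"

-- Walk the Records structure, separating conversion errors from the
-- records that converted successfully. A Nil carrying an error means
-- the CSV itself could not be parsed past that point.
partitionRecords :: S.Records a -> ([String], [a])
partitionRecords (S.Cons (Left err) rest) =
    let (errs, xs) = partitionRecords rest in (err : errs, xs)
partitionRecords (S.Cons (Right x) rest) =
    let (errs, xs) = partitionRecords rest in (errs, x : xs)
partitionRecords (S.Nil (Just err) _) = ([err], [])
partitionRecords (S.Nil Nothing _)    = ([], [])

-- Left is returned only when the header line itself cannot be parsed.
decodePeople :: BL.ByteString -> Either String ([String], [Person])
decodePeople input = partitionRecords . snd <$> S.decodeByName input

main :: IO ()
main =
    case decodePeople "name,age\nalice,30\nbob,unknown\ncarol,41\n" of
        Left err -> putStrLn ("header error: " ++ err)
        Right (errs, people) -> do
            mapM_ (putStrLn . ("skipped: " ++)) errs
            mapM_ print people
```

If the error messages are not needed, the Foldable instance of Records should give the same list of good records more briefly (for example via Data.Foldable.toList), since it is documented as skipping records that failed to convert.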
participants (2): Andrew Cowie, Johan Tibell