Parsec and Validation

I'm attempting to use Parsec to write a parser for times and dates. I'm aware of Data.Time.Format, but it doesn't offer the flexibility I need for this project. My current code (see below) uses Control.Monad.guard to validate the numeric values for hours and minutes inside the parser.

The problem I'm running into is that checking for errors in the parsing code causes Text.Parsec.Combinator.choice (in the time function below) to fail and return an error if it cannot parse input using the first option (tTimeHourMin), even if the input should match the second option (t24hrClock).

I have several questions:

- Why is choice failing in the time function below?
- Is there a better way to do numeric validation while parsing?
- Should I only use Parsec to validate that the input is syntactically correct and do the numeric validation elsewhere?
- What are some good examples of validating input using Parsec?

Here is a bit of the code I'm working with:

    time :: Parser TimeOfDay
    time = choice [ tTimeHourMin, t24hrClock ]

    tTimeHourMin :: Parser TimeOfDay
    tTimeHourMin = do
        hour <- range (0, 23)
        oneOf " :,."
        min <- range (0, 59)
        return (TimeOfDay hour min 0)

    t24hrClock :: Parser TimeOfDay
    t24hrClock = do
        (h, m) <- splitAt 2 <$> count 4 digit
        let hour = read h
        let min = read m
        guard (hour >= 0 && hour <= 23 && min >= 0 && min <= 59)
            <?> printf "24hr time represented as hhmm"
        return (TimeOfDay hour min 0)

    range :: (Int,Int) -> Parser Int
    range (lower, upper) = do
        t <- read <$> many1 digit
        guard (t >= lower && t <= upper)
            <?> printf "integer in range [%d,%d]" lower upper
        return t

Thanks for any suggestions,
vladimir

Could you try it with try:

    time :: Parser TimeOfDay
    time = choice $ map try [ tTimeHourMin, t24hrClock ]

Very informally and loosely: if tTimeHourMin consumes some input before failing, Parsec gives up instead of moving on to the next alternative.

Best,
Ozgur
On 31 July 2010 20:57, Vladimir Solmon wrote:
> time :: Parser TimeOfDay
> time = choice [ tTimeHourMin, t24hrClock ]
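
The point about consumed input can be checked with a small experiment. This is only a sketch: hourMin and clock24 are simplified stand-ins for the parsers in the question (no range checks, plain pairs instead of TimeOfDay), and run is a hypothetical helper that discards the error message.

```haskell
import Text.Parsec (choice, count, digit, many1, oneOf, parse, try)
import Text.Parsec.String (Parser)

-- Stand-in for tTimeHourMin: digits, one separator, digits.
hourMin :: Parser (Int, Int)
hourMin = do
  h <- read <$> many1 digit
  _ <- oneOf " :,."
  m <- read <$> many1 digit
  return (h, m)

-- Stand-in for t24hrClock: exactly four digits, split as hhmm.
clock24 :: Parser (Int, Int)
clock24 = do
  (h, m) <- splitAt 2 <$> count 4 digit
  return (read h, read m)

-- Same alternatives, without and with backtracking on the first one.
withoutTry, withTry :: Parser (Int, Int)
withoutTry = choice [hourMin, clock24]
withTry    = choice [try hourMin, clock24]

-- Hypothetical helper: Nothing on a parse error, Just the result otherwise.
run :: Parser a -> String -> Maybe a
run p = either (const Nothing) Just . parse p ""
```

With these definitions, run withoutTry "1234" should fail, because hourMin has already consumed the four digits when it misses the separator, while run withTry "1234" should fall through to clock24. Both variants should accept "12:34".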

On 7/31/10 15:57, Vladimir Solmon wrote:
> time :: Parser TimeOfDay
> time = choice [ tTimeHourMin, t24hrClock ]
If the parse of tTimeHourMin fails after consuming some characters (most probably at the oneOf, because it has been fed a t24hrClock value), those characters remain consumed. Because the failure came after input was consumed, choice reports it without going on to t24hrClock; and even if t24hrClock were run from where tTimeHourMin stopped, it would itself fail because all the digits were already read by the many1 in range. To prevent this, resetting to where tTimeHourMin started its parse, wrap it in a try:

    time = choice [ try tTimeHourMin , t24hrClock ]
-- 
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
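
One way to see that the characters really have been consumed is to look at the position attached to the parse error. The sketch below again uses simplified stand-ins (hourMin, clock24) rather than the original parsers, and failureColumn is a hypothetical helper, not part of Parsec.

```haskell
import Text.Parsec (choice, count, digit, errorPos, many1, oneOf, parse, sourceColumn)
import Text.Parsec.String (Parser)

-- Stand-in for tTimeHourMin: digits, one separator, digits.
hourMin :: Parser (Int, Int)
hourMin = do
  h <- read <$> many1 digit
  _ <- oneOf " :,."
  m <- read <$> many1 digit
  return (h, m)

-- Stand-in for t24hrClock: exactly four digits, split as hhmm.
clock24 :: Parser (Int, Int)
clock24 = do
  (h, m) <- splitAt 2 <$> count 4 digit
  return (read h, read m)

-- Hypothetical helper: the column at which a parse failed,
-- or Nothing if the parse succeeded.
failureColumn :: Parser a -> String -> Maybe Int
failureColumn p input =
  case parse p "" input of
    Left err -> Just (sourceColumn (errorPos err))
    Right _  -> Nothing
```

For failureColumn (choice [hourMin, clock24]) "1234" the reported column should lie past the four digits, which is where hourMin was looking for its separator, rather than at the start of the input where clock24 would have needed to begin.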

Hello Vladimir

In general it is better to avoid the try combinator and left factor the grammar instead. Because parsers are functions you can abuse left factoring a bit: parse the prefix in a top-level combinator and supply the prefix to the sub-parsers:

    leftFactoredTime :: Parser TimeOfDay
    leftFactoredTime = do
        hh  <- width2Num
        sep <- optionMaybe (oneOf " :,.")
        case sep of
            Just _  -> tTimeHourMin hh
            Nothing -> t24hrClock hh

    tTimeHourMin :: Int -> Parser TimeOfDay
    tTimeHourMin hh = do
        mm <- width2Num
        return (TimeOfDay hh mm 0)

    t24hrClock :: Int -> Parser TimeOfDay
    t24hrClock hh = do
        mm <- width2Num
        return (TimeOfDay hh mm 0)

However, in this case the 24 hour clock and TimeHourMin are identical functions; the separator is a MacGuffin [*], so:

    betterTime :: Parser TimeOfDay
    betterTime = do
        hh   <- rangeP 0 23 width2Num
        _sep <- optionMaybe (oneOf " :,.")
        mm   <- rangeP 0 59 width2Num
        return (TimeOfDay hh mm 0)

To parse ranges I would make a new combinator that takes a range plus a number parser and returns a new number parser:

    rangeP :: Int -> Int -> Parser Int -> Parser Int
    rangeP lo hi p = do
        a <- p
        if lo <= a && a <= hi then return a else fail "out-of-range"

Finally, avoiding read is good when using Parsec. Parsec has token parsers to read numbers, but here read can be avoided with this one:

    width2Num :: Parser Int
    width2Num = do
        a <- digit
        b <- digit
        return $ (10 * digitToInt a) + digitToInt b

digitToInt is in the Data.Char module.

[*] A plot device used in Alfred Hitchcock films to throw the viewer off the scent.
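
For anyone wanting to try the left-factored version, here is one way the pieces might be assembled into a single module. It is a sketch under a few assumptions: the imports are my guess at what the snippets need, rangeP takes its bounds as lower then upper, the separator set is the one from the original question, and parseTime is a hypothetical wrapper added only for experimenting.

```haskell
import Data.Char (digitToInt)
import Data.Time (TimeOfDay (..))
import Text.Parsec (digit, oneOf, optionMaybe, parse)
import Text.Parsec.String (Parser)

-- Exactly two digits, converted without read.
width2Num :: Parser Int
width2Num = do
  a <- digit
  b <- digit
  return (10 * digitToInt a + digitToInt b)

-- Run a number parser and fail unless lo <= result <= hi (assumed argument order).
rangeP :: Int -> Int -> Parser Int -> Parser Int
rangeP lo hi p = do
  a <- p
  if lo <= a && a <= hi then return a else fail "out-of-range"

-- Two-digit hour, optional separator, two-digit minute.
betterTime :: Parser TimeOfDay
betterTime = do
  hh   <- rangeP 0 23 width2Num
  _sep <- optionMaybe (oneOf " :,.")
  mm   <- rangeP 0 59 width2Num
  return (TimeOfDay hh mm 0)

-- Hypothetical wrapper: Nothing on any parse error.
parseTime :: String -> Maybe TimeOfDay
parseTime = either (const Nothing) Just . parse betterTime ""
```

Under these assumptions parseTime "1234" and parseTime "12:34" should both give 12:34:00, while "2460" and "25:00" should be rejected. Note that width2Num demands two digits, so an input such as "9:30" is rejected as well, and that betterTime does not check for end of input.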
participants (4)
- Brandon S Allbery KF8NH
- Ozgur Akgun
- Stephen Tetley
- Vladimir Solmon