Parsec bug, or...?

13 Oct 2009

      a brain fart?

Hi, cafe, I've been playing a little bit with a small command
processor, and I decided it'd be nice to allow the user to not have to
enter a complete command, but to recognize a unique prefix of it. So I
started with the list of allowed commands, used filter and isPrefixOf,
and was happy. But then I increased the complexity a little bit and it
got hairier, so I decided to rewrite the parser for this bit in
parsec. The function I came up with is

parsePrefixOf n str =
  string (take n str) >> opts (drop n str) >> return str
  where opts [] = return ()
        opts (c:cs) = optional (char c >> opts cs)

which I call as

parseFoo = parsePrefixOf 1 "foo"

and it recognizes all of "f", "fo", and "foo" as "foo".

OK so far, this also seems to work fine. But during the course of
writing this, I made a stupid mistake at one point, and the result of
that seemed odd. Consider the following program. It's stupid because
the required prefix of "frito" is only 2 characters, which isn't
enough to actually distinguish this from the next one, "fromage". (And
if I change that to 2 to 3 characters, everything works fine.) So
here's the complete program

module Main where

import Prelude
import System
import Text.ParserCombinators.Parsec as TPCP

myPrefixOf n str =
  string (take n str) >> opts (drop n str) >> return str
  where opts [] = return ()
        opts (c:cs) = optional (char c >> opts cs)

myTest = myPrefixOf 1 "banana"
      <|> myPrefixOf 1 "chocolate"
      <|> TPCP.try (myPrefixOf 2 "frito")
      <|> myPrefixOf 3 "fromage"

myBig = spaces >> myTest >>= (\g -> spaces >> eof >> return g)

parseTry input =
  case parse myBig "test" input of
       Left err -> return (show err)
       Right val -> return ("success: '" ++ val ++ "'")

main = getArgs >>= (\a -> parseTry (a !! 0)) >>= putStrLn

If I compile this, say as program "opry", and run it as shown below, I
expect the results I get for all but the last one:

% ./opry b
success: 'banana'

% ./opry c
success: 'chocolate'

% ./opry fr
success: 'frito'

% ./opry fri
success: 'frito'

% ./opry fro
"test" (line 1, column 3):
unexpected "o"
expecting "i", white space or end of input

Sooo... why do I get that last one? My expectation was that  parsec
would try the string "fro" with the parser for "frito", it would fail,
having consumed 2 characters, but then the TPCP.try which is wrapped
around all of that should restore everything, and then the final
parser for "fromage" should succeed. The same reasoning seems to me to
apply if I specify 3 characters as the required initial portion for
"frito", and if I do that it does succeed as I expect.

So is this a bug in parsec, or a bug in my brain?

thanks... Uwe

Uwe Hollerbach

Derek Elkins

Uwe Hollerbach

Brandon S. Allbery KF8NH

Uwe Hollerbach

Martijn van Steenbergen

Uwe Hollerbach

S. Doaitse Swierstra

Uwe Hollerbach

S. Doaitse Swierstra

wren ng thornton

Christian Maeder

tags

participants (7)