Re: [Haskell-cafe] Is this Parsec code idiomatic?

5 Sep 2009

      Am Freitag 04 September 2009 21:23:35 schrieb Serge LE HUITOUZE:
...
Hi Haskellers,
I'm asking some advice on a small piece of code representing a
simplified version of a treatment I need to perform.
I have a line-oriented string/file, from which I want to extract
only a substring of those lines starting with char '+' (the detail
of the extraction is irrelevant here, I'll just return what follows
the '+').
[I also simplified the "eol" parser for shorter code.]
I came out with the code below.
The line parser returns a "Maybe String".
The complete parser return a "[Maybe String]" by mere concatenation.
The main function filters the 'Nothing' with 'catMaybes'.
...
import Text.ParserCombinators.Parsec
import Data.Maybe
maybePlusFile :: GenParser Char st [Maybe String]
maybePlusFile = endBy maybePlusLine eol
maybePlusLine :: GenParser Char st (Maybe String)
maybePlusLine = try (do char('+')
No need for try here, char '+' either fails without consuming input or the whole branch 
succeeds.
...
...
result <- many (noneOf "\n")
                        return $ Just result)
            <|> do many (noneOf "\n")
                   return $ Nothing
eol = char '\n'
maybePlusLine = do
        char '+'
        fmap Just $ manyTill anyChar eol
    <|> do
        skipMany (noneOf "\n")
        return Nothing

maybePlusFile = many maybePlusLine
...
...
selectPlus :: String -> Either ParseError [String]
selectPlus input =
  case parse maybePlusFile "(input)" input of
       Left e -> Left e
       Right mblist -> Right $ catMaybes mblist
This works as expected (or so it seems), as the ghci dump shows:
...
GHCi, version 6.10.1: http://www.haskell.org/ghc/  :? for help
...
Prelude> :l selectPlus.hs
[1 of 1] Compiling Main             ( selectPlus.hs, interpreted )
Ok, modules loaded: Main.
*Main> selectPlus "abc\n+123\ndef\n+456\n"
Loading package parsec-2.1.0.1 ... linking ... done.
Right ["123","456"]
*Main>
I'd like to know if this code is good style, and how you would
possibly improve it.
Except for the superfluous try, it's good.
Unless you can't guarantee that the input ends with a newline. Then you should take care 
of that (for example

eol = char '\n' <|> (eof >> return '\n')
).

If it's such a simple extraction, there's no need for a parser, however,

map tail . filter startsPlus . lines $ input
    where
      startsPlus ('+':_) = True
      startsPlus _ = False

will do fine.
...
Thanks in advance.
--Serge

Re: [Haskell-cafe] Is this Parsec code idiomatic?

Daniel Fischer