
Am Freitag 04 September 2009 21:23:35 schrieb Serge LE HUITOUZE:
Hi Haskellers,
I'm asking some advice on a small piece of code representing a simplified version of a treatment I need to perform. I have a line-oriented string/file, from which I want to extract only a substring of those lines starting with char '+' (the detail of the extraction is irrelevant here, I'll just return what follows the '+'). [I also simplified the "eol" parser for shorter code.]
I came out with the code below. The line parser returns a "Maybe String". The complete parser return a "[Maybe String]" by mere concatenation. The main function filters the 'Nothing' with 'catMaybes'.
import Text.ParserCombinators.Parsec import Data.Maybe
maybePlusFile :: GenParser Char st [Maybe String] maybePlusFile = endBy maybePlusLine eol
maybePlusLine :: GenParser Char st (Maybe String) maybePlusLine = try (do char('+')
No need for try here, char '+' either fails without consuming input or the whole branch succeeds.
result <- many (noneOf "\n") return $ Just result) <|> do many (noneOf "\n") return $ Nothing
eol = char '\n'
maybePlusLine = do char '+' fmap Just $ manyTill anyChar eol <|> do skipMany (noneOf "\n") return Nothing maybePlusFile = many maybePlusLine
selectPlus :: String -> Either ParseError [String] selectPlus input = case parse maybePlusFile "(input)" input of Left e -> Left e Right mblist -> Right $ catMaybes mblist
This works as expected (or so it seems), as the ghci dump shows:
GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help ... Prelude> :l selectPlus.hs [1 of 1] Compiling Main ( selectPlus.hs, interpreted ) Ok, modules loaded: Main. *Main> selectPlus "abc\n+123\ndef\n+456\n" Loading package parsec-2.1.0.1 ... linking ... done. Right ["123","456"] *Main>
I'd like to know if this code is good style, and how you would possibly improve it.
Except for the superfluous try, it's good. Unless you can't guarantee that the input ends with a newline. Then you should take care of that (for example eol = char '\n' <|> (eof >> return '\n') ). If it's such a simple extraction, there's no need for a parser, however, map tail . filter startsPlus . lines $ input where startsPlus ('+':_) = True startsPlus _ = False will do fine.
Thanks in advance.
--Serge