
I am using Parsec as my parsing library and quite liking it: though, I am unsure on how to properly tackle a 'free element' problem. Let me explain:
In my file, there are some elements which look like this:
this is some

Hi Franco
The best "solution" is really to work out a grammar of text strings
and write simpler productions that handle it.
Otherwise you can treat it as a "lexing" problem but then the results
get messy as you have found out.
It's a bit late in the UK and I though I've looked at the code I
haven't worked out an answer yet, I'll have a proper look tomorrow if
no one else has answered but here is my first step, this is a "lexing"
solution but written directly rather than with Parsec. It is easier to
write a "lexing" solution this as two mutually recursive functions for
the lexer states - consuming free text, or consuming a format string.
data Text1 = FreeText String | Formatted String
deriving (Eq,Ord,Show)
type Text = [Text1]
-- The type of /accumulator/.
type Acc = ShowS
-- We want to grow Strings from the right.
snoc :: Acc -> Char -> Acc
snoc ss c = ss . (c:)
toString :: Acc -> String
toString = ($ "")
empty :: Acc
empty = id
runText :: String -> Text
runText = text empty
-- Minor problem - generates empty FreeText if the accumulator is
-- empty, this can be easily fixed at some loss of clarity.
--
text :: Acc -> String -> Text
text ac [] = [FreeText (toString ac)]
text ac ('<':cs) = FreeText (toString ac) : formatted empty cs
text ac (c:cs) = text (ac `snoc` c) cs
formatted :: Acc -> String -> Text
formatted _ [] = error "missing terminator for formatting"
formatted ac ('>':cs) = Formatted (toString ac) : text empty cs
formatted ac (c:cs) = formatted (ac `snoc` c) cs
demo01 = runText "[ someconditions | this is some

Hi Franco
Actually the simple case of finding formatting tags in free text was
easier in Parsec than my email last night suggested. Perhaps it is
artificially easy because you can identify tag start and ends with a
single character so you can use `satisfy`.
import Text.Parsec
import Text.Parsec.String
import Control.Applicative hiding ( (<|>), many )
data Text1 = FreeText String | Formatted String
deriving (Eq,Ord,Show)
type Text = [Text1]
runText :: String -> Either ParseError Text
runText = runP lexer () "no-input"
notLAngle :: Char -> Bool
notLAngle = (/= '<')
notRAngle :: Char -> Bool
notRAngle = (/= '>')
lexer :: Parser Text
lexer = many (formatted <|> free)
formatted :: Parser Text1
formatted = Formatted <$>
between (char '<') (char '>') (many1 (satisfy notRAngle))
free :: Parser Text1
free = FreeText <$> many1 (satisfy notLAngle)
demo01 = runText "[ someconditions | this is some

Indeed the second solution is more elegant (and the examples very simple to follow, they should be added to Parsec's documentation!).
For the records, before reading this I was using the ductape solution below:
161
162 -- take a string till t, on that runs parser p.
163 -- The last parameters sets wheter t will be consumed or
164 -- not. FILE HANDLING FOR ERRORS?
165 parseTill :: Parser a -> Parser b -> Parser b
166 parseTill ter p = manyTill anyChar ter >>= \sndPar ->
167 case parse p "" sndPar of
168 Left a -> fail "todo: check how error msg are propagted"
169 Right b -> return b
Thanks again
-F
On Sun, 11 Mar 2012 10:53:50 +0000
Stephen Tetley
Hi Franco
Actually the simple case of finding formatting tags in free text was easier in Parsec than my email last night suggested. Perhaps it is artificially easy because you can identify tag start and ends with a single character so you can use `satisfy`.
import Text.Parsec import Text.Parsec.String import Control.Applicative hiding ( (<|>), many )
data Text1 = FreeText String | Formatted String deriving (Eq,Ord,Show)
type Text = [Text1]
runText :: String -> Either ParseError Text runText = runP lexer () "no-input"
notLAngle :: Char -> Bool notLAngle = (/= '<')
notRAngle :: Char -> Bool notRAngle = (/= '>')
lexer :: Parser Text lexer = many (formatted <|> free)
formatted :: Parser Text1 formatted = Formatted <$> between (char '<') (char '>') (many1 (satisfy notRAngle))
free :: Parser Text1 free = FreeText <$> many1 (satisfy notLAngle)
demo01 = runText "[ someconditions | this is some
text.]" demo02 = runText " more text."
--
Franco
participants (2)
-
Franco
-
Stephen Tetley