
Hi all,

I've used Parsec to "tokenize" data from a text file. It was actually quite easy, and everything is correctly identified.

So now I have a list/stream of self-defined "Tokens", and now I'm stuck: I need to write my own parsec-token-parsers to parse this token stream in a context-sensitive way.

Uhm, how do I do that then?

Günther

A Token is something like:

data Token = ZE String
           | OPS
           | OPSShort String
           | OPSLong String
           | Other String
           | ZECd String
           deriving Show

Hi, Günther, you could write functions that pattern-match on various
sequences of tokens in a list, you could for example have a look at
the file Evaluator.hs in my scheme interpreter haskeem, or you could
build up more-complex data structures entirely within parsec, and for
this I would point you at the file Parser.hs in my accounting program
umm; both are on hackage. Undoubtedly there are many more and probably
better examples, but I think these are at least a start...
regards, Uwe
On 1/11/10, Günther Schmidt
[Snip...] I need to write my own parsec-token-parsers to parse this token stream in a context-sensitive way.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

2010/1/12 Günther Schmidt
[Snip...] I need to write my own parsec-token-parsers to parse this token stream in a context-sensitive way.
Uhm, how do I that then?
Hi Günther

Get the Parsec manual from Daan Leijen's home page, then see section '2.11 Advanced: Separate scanners'.

Though it is rarely mentioned, Parsec in its regular mode is a scannerless parser. Unless you have complex formatting problems (e.g. indentation sensitivity, as in Python's or Haskell's syntax), scannerless parsers are often much more convenient than parsers + lexers (see the grammar formalism SDF for many examples).

For Parsec, if you want a separate scanner, there's quite a lot of boilerplate you need to manufacture to use the technique from section 2.11. Usually I can get by with the Token and Language modules, or do a few tricks with the 'symbol' parser instead.

Parsec is monadic, so (>>=) allows you to write context-sensitive parsers; see section '3.1 Parsec Prim' for a discussion and example. Again, writing a context-sensitive parser can often be more trouble than studying the format of the input and working out a context-free grammar (if there is one).

Best wishes

Stephen
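[Editorial sketch, not part of the original thread: the point about (>>=) enabling context-sensitivity can be illustrated with the classic length-prefixed-string example, where a later parser depends on the result of an earlier one. The parser name `counted` and the sample input are my own choices.]

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Context-sensitive via (>>=): first read a decimal count, then read
-- exactly that many letters.  No context-free grammar can express
-- "exactly n, for the n we just read" directly.
counted :: Parser String
counted = do
  n <- read <$> many1 digit   -- result of the first parser ...
  count n letter              -- ... steers the second one

-- parse counted "" "3abcrest"  ==>  Right "abc"
```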

2010/1/12 Günther Schmidt
[Snip...] I need to write my own parsec-token-parsers to parse this token stream in a context-sensitive way.
Maybe this can be of help (though it's for Parsec 2): http://therning.org/magnus/archives/367

It's not the only example of this either; tagsoup-parsec is available on Hackage.

/M

--
Magnus Therning (OpenPGP: 0xAB4DFBA4)
magnus@therning.org Jabber: magnus@therning.org
http://therning.org/magnus identi.ca|twitter: magthe
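[Editorial sketch, not part of the original thread: the general approach of parsing a custom token list, as in the article linked above, can be shown against the Token type from the original post using Parsec 3's tokenPrim. The no-op position-update function and the `ze` parser are my own simplifications.]

```haskell
import Text.Parsec

data Token = ZE String | OPS | OPSShort String | OPSLong String
           | Other String | ZECd String
  deriving (Show, Eq)

type TokParser = Parsec [Token] ()

-- Accept a ZE token and return its payload.  tokenPrim takes a
-- pretty-printer for error messages, a position updater (here a no-op,
-- which keeps the sketch short at the cost of useless positions), and
-- a match function returning Maybe.
ze :: TokParser String
ze = tokenPrim show (\pos _ _ -> pos) match
  where match (ZE s) = Just s
        match _      = Nothing

-- parse ze "" [ZE "hello", OPS]  ==>  Right "hello"
```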

In a message of 12 January 2010, 03:35:10, Günther Schmidt wrote:
[Snip...] I need to write my own parsec-token-parsers to parse this token stream in a context-sensitive way.
Uhm, how do I that then?
That's pretty easy, actually. You can use the function `token' to define your own primitive parsers; it's defined in Text.Parsec.Prim, if I remember correctly. You may also want to add source-position information to your lexemes. Here is some code to illustrate the usage:
-- | Language lexeme
data LexemData = Ident String
               | Number Double
               | StringLit String
               | None
               | EOL
               deriving (Show, Eq)

data Lexem = Lexem
  { lexemPos  :: SourcePos
  , lexemData :: LexemData
  } deriving Show

type ParserLex = Parsec [Lexem] ()

num :: ParserLex Double
num = token (show . lexemData) lexemPos (comp . lexemData)
  where
    comp (Number x) = Just x
    comp _          = Nothing
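[Editorial sketch, not part of the original thread: the snippet above, stitched into a self-contained module so it can actually be run. The imports, the `demo` driver, and the dummy position built with initialPos are my additions, not from the post.]

```haskell
import Text.Parsec
import Text.Parsec.Pos (initialPos)

data LexemData = Ident String | Number Double | StringLit String | None | EOL
  deriving (Show, Eq)

data Lexem = Lexem { lexemPos :: SourcePos, lexemData :: LexemData }

type ParserLex = Parsec [Lexem] ()

-- token takes a pretty-printer (for error messages), a position
-- accessor, and a match function returning Maybe.
num :: ParserLex Double
num = token (show . lexemData) lexemPos (comp . lexemData)
  where comp (Number x) = Just x
        comp _          = Nothing

-- Feed the parser a hand-made lexeme stream instead of a String.
demo :: Either ParseError Double
demo = parse num "<demo>" [Lexem (initialPos "<demo>") (Number 42)]
-- demo  ==>  Right 42.0
```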
participants (5)
- Günther Schmidt
- Khudyakov Alexey
- Magnus Therning
- Stephen Tetley
- Uwe Hollerbach