
Lazy parsing has been the default for the last ten years in uulib, and is now available in the simple uu-parsinglib (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uu-parsinglib ). The whole design of the latter in described in a technical report to which references are given on the web page. It provides also error correction, the ability to use several different kinds of input tokens, and (with some help) ambiguities. If speed is an issue you can insert extra hints which locally change the breadth-first parsing process locally into a somewhat more depth-first form. When compared with Parsec the good news is that usually you do not have to put annotations to get nice results. The older uulib version also performs an abstract interpretation which basically changes the search for which alternative to take from a linear to a logarithmic complexity, but does not provide a monadic structure, in which you use results recognised thus far to construct new parsers. Both the old uulib version and the new version have always had an applicative interface. In the near future elements of the abstract interpretation of the old uulib version will migrate into the new version. It is the advent of GADT's which made this new version feasable. An example of the error correction at work at the following example code: pa, pb, paz :: P_m (Str Char) [Char] pa = lift <$> pSym 'a' pb = lift <$> pSym 'b' p <++> q = (++) <$> p <*> q pa2 = pa <++> pa pa3 = pa <++> pa2 pCount p = (\ a b -> b+1) <$> p <*> pCount p <<|> pReturn 0 pExact 0 p = pReturn [] pExact n p = (:) <$> p <*> pExact (n-1) p paz = pMany (pSym ('a', 'z')) paz' = pSym (\t -> 'a' <= t && t <= 'z', "a .. z", 'k') main :: IO () main = do print (test pa "a") print (test pa "b") print (test pa2 "bbab") print (test pa "ba") print (test pa "aa") print (test (do l <- pCount pa pExact l pb) "aaacabbb") print (test (amb ( (++) <$> pa2 <*> pa3 <|> (++) <$> pa3 <*> pa2)) "aaabaa") print (test paz "ab1z7") print (test paz' "m") print (test paz' "") is loeki:~ doaitse$ ghci -package uu-parsinglib GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer ... linking ... done. Loading package base ... linking ... done. Loading package syb ... linking ... done. Loading package array-0.2.0.0 ... linking ... done. Loading package filepath-1.1.0.1 ... linking ... done. Loading package old-locale-1.0.0.1 ... linking ... done. Loading package old-time-1.0.0.1 ... linking ... done. Loading package unix-2.3.1.0 ... linking ... done. Loading package directory-1.0.0.2 ... linking ... done. Loading package process-1.0.1.1 ... linking ... done. Loading package random-1.0.0.1 ... linking ... done. Loading package haskell98 ... linking ... done. Loading package uu-parsinglib-2.0.0 ... linking ... done. Prelude> :m Text.ParserCombinators.UU.Examples Prelude Text.ParserCombinators.UU.Examples> main ("a",[]) ("a",[ Deleted 'b' at position 0 expecting one of ["'a'"], Inserted 'a' at position 1 expecting one of ["'a'"]]) ("aa",[ Deleted 'b' at position 0 expecting one of ["'a'"], Deleted 'b' at position 1 expecting one of ["'a'"], Deleted 'b' at position 3 expecting one of ["'a'"], Inserted 'a' at position 4 expecting one of ["'a'"]]) ("a",[ Deleted 'b' at position 0 expecting one of ["'a'"]]) ("a",[ The token 'a'was not consumed by the parsing process.]) (["b","b","b","b"],[ Deleted 'c' at position 3 expecting one of ["'a'","'b'"], Inserted 'b' at position 8 expecting one of ["'b'"]]) (["aaaaa"],[ Deleted 'b' at position 3 expecting one of ["'a'","'a'"]]) ("abz",[ Deleted '1' at position 2 expecting one of ["'a'..'z'"], The token '7'was not consumed by the parsing process.]) ('m',[]) ('k',[ Inserted 'k' at position 0 expecting one of ["a .. z"]]) Prelude Text.ParserCombinators.UU.Examples> Doaitse Swierstra On 27 mei 2009, at 01:52, Günther Schmidt wrote:
Hi all,
is it possible to do lazy parsing with Parsec? I understand that one can do that with polyparse, don't know about uulib, but I happen to be already somewhat familiar with Parsec, so before I do switch to polyparse I rather make sure I actually have to.
The files it has to parse is anywhere from 500 MB to 5 GB.
Günther
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe