
Am Mittwoch, 28. März 2007 11:57 schrieb Dmitri O.Kondratiev:
Daniel, I am still trying to figure out the order of function applications in the parser returning list of objects (I attached again the code to the end of this message for convenience).
You wrote: (>*>) associates to the right, hence p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
I don't understand where (p >*> succeed []) comes from?
The final 'succeed []' comes from a) the definition of pList p as pList p = succeed [] `alt` ((p >*> pList p) `build` (uncurry (:))) plus b) the assumption that p should be a parser which doesn't succed on an empty input and that the input is finite (though the second point is not necessary). Let us unfold a little: pList dig "12ab" === succeed [] "12ab" ++ (((dig >*> pList dig) `build` (uncurry (:))) "12ab") === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- pList dig "2ab"] -- since dig "12ab" = [('1',"2ab")] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- (succed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "2ab"] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- ([([],"2ab")] ++ [('2' : ds2,rem2) | (ds2,rem2) <- pList dig "ab"])] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- (succeed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "ab"] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- ([([],"ab")] ++ (((dig >*> pList dig) `build` (uncurry (:))) "ab"))] -- now 'dig' and hence 'dig >*> pList dig' fail on the input "ab", thus === [([],"12ab"),("1","2ab"),("12","ab")] Hum, I find that a bit hard to read myself, so let's introduce an alias for 'alt', call it (<+>) and a new combinator which joins (>*>) and 'build (uncurry (:))' : (<:>) :: Parser a b -> Parser a [b] -> Parser a [b] p1 <:> p2 = \inp -> [(x:ys,rem2) | (x,rem1) <- p1 inp, (ys,rem2) <- p2 rem1] -- or p1 <:> p2 = build (p1 >*> p2) (uncurry (:)) Then we have (because p1 <:> (p2 <+> p3) === (p1 <:> p2) <+> (p1 <:> p3)) pList p === succeed [] <+> (p <:> pList p) === succeed [] <+> (p <:> (succeed [] <+> (p <:> pList p))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> pList p)) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> (succeed [] <+> (p <:> pList p)))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> succeed [])) <+> (p <:> (p <:> (p <:> succeed []))) <+> (p <:> (p <:> (p <:> (p <:> pList p)))) and so on. And when we request more p's than the input provides, pList p isn't reached anymore and recursion stops (e.g. with p = dig and input "123" or "123a45", the last line will fail because it demands four digits from the beginning of the input, but there are only three). If p would succeed on an empty input, e.g. p = succeed 1 or the input is an infinite list of successes for p, e.g. p = dig and input = cycle "123", the unfolding would never stop, producing an infinite list of results, but each of these results wolud come from a finite chain of p's ended by a 'succeed []'. So the order of evaluation of pList p input = (succeed [] <+> (p <:> pList p)) input = succeed [] input ++ (p <:> pList p) input is 1. evaluate the first argument of (++), succeed [] input == [([],input)] Since this is not _|_, we need also the second argument of (++), so 2. evaluate (p <:> pList p) input (to whnf first, more if required) 3. evaluate (++) as far as needed 2. is further split, 2.1. evaluate p input, giving a list of (obj,rem) pairs -- if that's empty, we're done, also if that produces _|_ 2.2. (partially) evaluate pList p rem (goto 1.) giving a list of (objlist,rem2); [([],rem),([obj2],rem'),([obj2,obj3],rem'')...] 2.3. return the list of (obj:objlist,rem2) pairs
Yet, if the order is as you describe, everything goes well, for example:
comp1 = dig >*> dig has type - Parser char (char, char) comp2 = dig >*> (succeed []) has type - Parser char (char, [a]) pl1 = comp2 `build` (uncurry (:)) has type - Parser char (char, [char])
pl1 has type Parser Char [Char] because 'uncurry (:)' has type (a,[a]) -> [a]
At first run (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
should be: [] ++ ((p >*> pList p) `build` (uncurry (:)))
(succeed [] `alt` ((p >*> pList p) `build` (uncurry (:)))) input gives [([],input)] ++ ((p >*> pList p) `build` (uncurry (:))) input
so how we get: (p >*> succeed []) ?
Thanks, Dima
Anytime, Daniel