Re: [Haskell-cafe] Newbie: a parser for a list of objects?

-----Ursprüngliche Nachricht----- Von: "Dmitri O.Kondratiev"
Gesendet: 26.03.07 16:44:12 An: haskell-cafe@haskell.org Betreff: [Haskell-cafe] Newbie: a parser for a list of objects?
Please see my questions inside comments {-- --} : Thanks!
--- module Parser where
import Data.Char
type Parse a b = [a] -> [(b, [a])]
{-- Newbie: a parser for a list of objects?
I am working with the section 17.5 "Case study: parsing expressions" of the book "Haskell The Craft of Functional Programming", where a parser for a list of objects is defined. I called this function pList in order to avoid confusion with 'list' as a term for data structure.
Please help me to understand how pList works (please, see the rest of the code at the end of this message): --}
pList :: Parse a b -> Parse a [b] pList p = (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
{-- First of all, I don't quite understand why there must be a choice ('alt') between the function ('succeed') that always returns an empty list and the other part? This results in adding [] to the front, why?
Well, if the parser p doesn't succeed, we don't want the whole thing to fail. And p will (almost certainly) fail when the end of input is reached. So without the alternative 'succeed []', we'd get pL1 dig "12" = [(('1':y),rem) | (y,rem) <- pL1 dig "2"] = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) | (z,rem2) <- pL1 dig ""]] = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) | (z,rem2) <- []] = [(('1':y),rem) | (y,rem) <- []] = [] because dig "" = []
I thought that 'simplified' version of pList should still work fine. Trying to prove this I wrote : --}
pL1 :: Parse a b -> Parse a [b] pL1 p = (p >*> pL1 p) `build` (uncurry (:))
{-- Which, as expected, does not work correctly - just gives an empty list [] - but I don't understand why:
because the parser eventually fails when the end of input is reached.
*Parser> t1 "12345" [] *Parser>
Also, I don't understand why the textbook version of pList gives this result:
*Parser> test "12345" [("","12345"),("1","2345"),("12","345"),("123","45"),("1234","5"),("12345","")]
That's because of the order of alt's arguments: (succeed [] `alt` p) inp = [([],inp)] ++ (p inp) with pList p = ((p >*> pList p) `build` (uncurry (:))) `alt` succeed [] the resulting list woulde be reversed.
*Parser>
In particular, I don't understand where the first element ("","12345") of the resulting list comes from?
I am trying to figure out how pList recursively unfolds. To my mind operators in the expression:
(succeed []) `alt`((p >*> pList p) `build` (uncurry (:)))
has the following execution order: 1) >*> 2) 'build' 3) 'alt'
No, the first argument of alt gets evaluated first, because (p1 `alt` p2) inp = (p1 inp) ++ (p2 inp), thus we need p1 inp first. Then we see we haven't hit bottom, so we need the second argument of (++) (resp. alt). So next we need to evaluate p, then pList p, combine the results of those with the second argument of build, uncurry (:).
It seems that operation >*> should be done as many times as many elements the input list has. Right?
Unfortunately not. Let's stay with pList dig. Say your input starts with n digits.
From the example above you can conjecture that length (pList dig inp) == (n+1). Now in the outermost (dig >*> pList dig) branch, you apply (pList dig) to an input beginning with (n-1) digits, returning a list of length n, to each element of this list you adjoin the first digit, resulting in n + (n-1) + ... + 1 = n*(n+1)/2 applications of (>*>). (Lesson: you need an exclusive choice, using the second parser only if the first one fails and a maximal munch combinator in your library, too)
Signature:
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
implies that second argument of the expression:
p >*> pList p
should be of type 'Parse a c' but in this application it is of type 'Parse a b -> Parse a [b]'
c is [b], so p >*> pList p has type Parse a (b,[b]), then (p >*> pList p) `build` (uncurry (:)) has type Parse a [b]
How can that be? How recursion termination conditinon is expressed in pList?
recursion terminates when p fails. HTH, Daniel
--}
none :: Parse a b none inp = []
succeed :: b -> Parse a b succeed val inp = [(val, inp)]
suc:: b -> [a] -> [(b, [a])]
suc val inp = [(val, inp)]
spot :: (a -> Bool) -> Parse a a spot p [] = [] spot p (x:xs) | p x = [(x, xs)] | otherwise = []
alt :: Parse a b -> Parse a b -> Parse a b alt p1 p2 inp = p1 inp ++ p2 inp
bracket = spot (=='(') dash = spot (== '-') dig = spot isDigit alpha = spot isAlpha
infixr 5 >*>
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
(>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2 rem1]
build :: Parse a b -> (b -> c) -> Parse a c build p f inp = [ (f x, rem) | (x, rem) <- p inp]
test = pList dig t1 = pL1 dig
-----------------------------------------------------------------

Thanks Daniel!
Things are getting more in shape, yet I still can not fully comprehend the
expression:
((p >*> pList p) `build` (uncurry (:)))
where
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
(>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2
rem1]
build :: Parse a b -> (b -> c) -> Parse a c
build p f inp = [ (f x, rem) | (x, rem) <- p inp]
So in fact recursive application:
p >*> pList p
should unfold in something like:
((p >*> p) >*> p) >*> p ...
and *all* iterations of
p >*> pList p
will be done *before* 'build' will be applied?
Correct?
Thanks,
Dima
On 3/26/07, Daniel Fischer
-----Ursprüngliche Nachricht----- Von: "Dmitri O.Kondratiev"
Gesendet: 26.03.07 16:44:12 An: haskell-cafe@haskell.org Betreff: [Haskell-cafe] Newbie: a parser for a list of objects? Please see my questions inside comments {-- --} : Thanks!
--- module Parser where
import Data.Char
type Parse a b = [a] -> [(b, [a])]
{-- Newbie: a parser for a list of objects?
I am working with the section 17.5 "Case study: parsing expressions" of the book "Haskell The Craft of Functional Programming", where a parser for a list of objects is defined. I called this function pList in order to avoid confusion with 'list' as a term for data structure.
Please help me to understand how pList works (please, see the rest of the code at the end of this message): --}
pList :: Parse a b -> Parse a [b] pList p = (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
{-- First of all, I don't quite understand why there must be a choice ('alt') between the function ('succeed') that always returns an empty list and the other part? This results in adding [] to the front, why?
Well, if the parser p doesn't succeed, we don't want the whole thing to fail. And p will (almost certainly) fail when the end of input is reached. So without the alternative 'succeed []', we'd get
pL1 dig "12" = [(('1':y),rem) | (y,rem) <- pL1 dig "2"] = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) | (z,rem2) <- pL1 dig ""]] = [(('1':y),rem) | (y,rem) <- [(('2':z),rem2) | (z,rem2) <- []] = [(('1':y),rem) | (y,rem) <- []] = []
because dig "" = []
I thought that 'simplified' version of pList should still work fine. Trying to prove this I wrote : --}
pL1 :: Parse a b -> Parse a [b] pL1 p = (p >*> pL1 p) `build` (uncurry (:))
{-- Which, as expected, does not work correctly - just gives an empty list [] - but I don't understand why:
because the parser eventually fails when the end of input is reached.
*Parser> t1 "12345" [] *Parser>
Also, I don't understand why the textbook version of pList gives this
result:
*Parser> test "12345"
[("","12345"),("1","2345"),("12","345"),("123","45"),("1234","5"),("12345","")]
That's because of the order of alt's arguments:
(succeed [] `alt` p) inp = [([],inp)] ++ (p inp)
with pList p = ((p >*> pList p) `build` (uncurry (:))) `alt` succeed [] the resulting list woulde be reversed.
*Parser>
In particular, I don't understand where the first element ("","12345")
of the resulting list comes from?
I am trying to figure out how pList recursively unfolds. To my mind
operators in the expression:
(succeed []) `alt`((p >*> pList p) `build` (uncurry (:)))
has the following execution order: 1) >*> 2) 'build' 3) 'alt'
No, the first argument of alt gets evaluated first, because (p1 `alt` p2) inp = (p1 inp) ++ (p2 inp), thus we need p1 inp first. Then we see we haven't hit bottom, so we need the second argument of (++) (resp. alt). So next we need to evaluate p, then pList p, combine the results of those with the second argument of build, uncurry (:).
It seems that operation >*> should be done as many times as many elements the input list has. Right?
Unfortunately not. Let's stay with pList dig. Say your input starts with n digits. From the example above you can conjecture that length (pList dig inp) == (n+1). Now in the outermost (dig >*> pList dig) branch, you apply (pList dig) to an input beginning with (n-1) digits, returning a list of length n, to each element of this list you adjoin the first digit, resulting in n + (n-1) + ... + 1 = n*(n+1)/2 applications of (>*>). (Lesson: you need an exclusive choice, using the second parser only if the first one fails and a maximal munch combinator in your library, too)
Signature:
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
implies that second argument of the expression:
p >*> pList p
should be of type 'Parse a c' but in this application it is of type
'Parse a b -> Parse a [b]'
c is [b], so p >*> pList p has type Parse a (b,[b]), then (p >*> pList p) `build` (uncurry (:)) has type Parse a [b]
How can that be? How recursion termination conditinon is expressed in pList?
recursion terminates when p fails.
HTH, Daniel
--}
none :: Parse a b none inp = []
succeed :: b -> Parse a b succeed val inp = [(val, inp)]
suc:: b -> [a] -> [(b, [a])]
suc val inp = [(val, inp)]
spot :: (a -> Bool) -> Parse a a spot p [] = [] spot p (x:xs) | p x = [(x, xs)] | otherwise = []
alt :: Parse a b -> Parse a b -> Parse a b alt p1 p2 inp = p1 inp ++ p2 inp
bracket = spot (=='(') dash = spot (== '-') dig = spot isDigit alpha = spot isAlpha
infixr 5 >*>
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
(>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2 rem1]
build :: Parse a b -> (b -> c) -> Parse a c build p f inp = [ (f x, rem) | (x, rem) <- p inp]
test = pList dig t1 = pL1 dig
-----------------------------------------------------------------

Am Dienstag, 27. März 2007 12:15 schrieb Dmitri O.Kondratiev:
Thanks Daniel! Things are getting more in shape, yet I still can not fully comprehend the expression:
((p >*> pList p) `build` (uncurry (:)))
where
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c) (>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2 rem1]
build :: Parse a b -> (b -> c) -> Parse a c build p f inp = [ (f x, rem) | (x, rem) <- p inp]
So in fact recursive application:
p >*> pList p
should unfold in something like:
((p >*> p) >*> p) >*> p ...
(>*>) associates to the right, hence p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
and *all* iterations of
p >*> pList p
will be done *before* 'build' will be applied?
Correct?
Think so. Though it might be conceivable that 'build' would be partially applied before. After p has parsed the first item x1, leaving the remainder rem of the input, we can see that the result will be [(x1:lst,rem2) | (lst,rem2) <- pList p rem] and we know that pList p never fails, due to 'succeed []', so that would be more efficient than constructing and destroying a lot of pairs. I've no idea whether a compiler would do that transformation, though I'd be interested to know.
Thanks, Dima
Cheers, Daniel

Daniel,
I am still trying to figure out the order of function applications in the
parser returning list of objects (I attached again the code to the end of
this message for convenience).
You wrote:
(>*>) associates to the right, hence
p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
I don't understand where (p >*> succeed []) comes from?
Yet, if the order is as you describe, everything goes well, for example:
comp1 = dig >*> dig has type - Parser char (char, char)
comp2 = dig >*> (succeed []) has type - Parser char (char, [a])
pl1 = comp2 `build` (uncurry (:)) has type - Parser char (char, [char])
At first run
(succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
should be:
[] ++ ((p >*> pList p) `build` (uncurry (:)))
so how we get:
(p >*> succeed []) ?
Thanks,
Dima
---
module MyParser where
import Data.Char
type Parse a b = [a] -> [(b, [a])]
none :: Parse a b
none = \inp -> []
succeed :: b -> Parse a b
succeed val = \inp -> [(val, inp)]
spot :: (a -> Bool) -> Parse a a
spot p = \inp -> case inp of
[] -> []
(x:xs) -> if (p x) then [(x, xs)] else []
alt :: Parse a b -> Parse a b -> Parse a b
alt p1 p2 = \inp -> p1 inp ++ p2 inp
bracket = spot (=='(')
dash = spot (== '-')
dig = spot isDigit
alpha = spot isAlpha
infixr 5 >*>
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c)
(>*>) p1 p2 = \inp -> [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2
rem1]
build :: Parse a b -> (b -> c) -> Parse a c
build p f = \inp -> [ (f x, rem) | (x, rem) <- p inp]
pList :: Parse a b -> Parse a [b]
pList p = (succeed []) `alt`
((p >*> pList p) `build` (uncurry (:)))
comp1 = dig >*> dig
comp2 = dig >*> (succeed [])
pl1 = comp2 `build` (uncurry (:))
test = pList dig
On 3/28/07, Daniel Fischer
Thanks Daniel! Things are getting more in shape, yet I still can not fully comprehend
Am Dienstag, 27. März 2007 12:15 schrieb Dmitri O.Kondratiev: the
expression:
((p >*> pList p) `build` (uncurry (:)))
where
(>*>) :: Parse a b -> Parse a c -> Parse a (b, c) (>*>) p1 p2 inp = [((x,y), rem2) |(x, rem1) <- p1 inp, (y, rem2) <- p2 rem1]
build :: Parse a b -> (b -> c) -> Parse a c build p f inp = [ (f x, rem) | (x, rem) <- p inp]
So in fact recursive application:
p >*> pList p
should unfold in something like:
((p >*> p) >*> p) >*> p ...
(>*>) associates to the right, hence p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
and *all* iterations of
p >*> pList p
will be done *before* 'build' will be applied?
Correct?
Think so. Though it might be conceivable that 'build' would be partially applied before. After p has parsed the first item x1, leaving the remainder rem of the input, we can see that the result will be [(x1:lst,rem2) | (lst,rem2) <- pList p rem] and we know that pList p never fails, due to 'succeed []', so that would be more efficient than constructing and destroying a lot of pairs. I've no idea whether a compiler would do that transformation, though I'd be interested to know.
Thanks, Dima
Cheers, Daniel

Am Mittwoch, 28. März 2007 11:57 schrieb Dmitri O.Kondratiev:
Daniel, I am still trying to figure out the order of function applications in the parser returning list of objects (I attached again the code to the end of this message for convenience).
You wrote: (>*>) associates to the right, hence p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
I don't understand where (p >*> succeed []) comes from?
The final 'succeed []' comes from a) the definition of pList p as pList p = succeed [] `alt` ((p >*> pList p) `build` (uncurry (:))) plus b) the assumption that p should be a parser which doesn't succed on an empty input and that the input is finite (though the second point is not necessary). Let us unfold a little: pList dig "12ab" === succeed [] "12ab" ++ (((dig >*> pList dig) `build` (uncurry (:))) "12ab") === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- pList dig "2ab"] -- since dig "12ab" = [('1',"2ab")] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- (succed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "2ab"] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- ([([],"2ab")] ++ [('2' : ds2,rem2) | (ds2,rem2) <- pList dig "ab"])] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- (succeed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "ab"] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- ([([],"ab")] ++ (((dig >*> pList dig) `build` (uncurry (:))) "ab"))] -- now 'dig' and hence 'dig >*> pList dig' fail on the input "ab", thus === [([],"12ab"),("1","2ab"),("12","ab")] Hum, I find that a bit hard to read myself, so let's introduce an alias for 'alt', call it (<+>) and a new combinator which joins (>*>) and 'build (uncurry (:))' : (<:>) :: Parser a b -> Parser a [b] -> Parser a [b] p1 <:> p2 = \inp -> [(x:ys,rem2) | (x,rem1) <- p1 inp, (ys,rem2) <- p2 rem1] -- or p1 <:> p2 = build (p1 >*> p2) (uncurry (:)) Then we have (because p1 <:> (p2 <+> p3) === (p1 <:> p2) <+> (p1 <:> p3)) pList p === succeed [] <+> (p <:> pList p) === succeed [] <+> (p <:> (succeed [] <+> (p <:> pList p))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> pList p)) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> (succeed [] <+> (p <:> pList p)))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> succeed [])) <+> (p <:> (p <:> (p <:> succeed []))) <+> (p <:> (p <:> (p <:> (p <:> pList p)))) and so on. And when we request more p's than the input provides, pList p isn't reached anymore and recursion stops (e.g. with p = dig and input "123" or "123a45", the last line will fail because it demands four digits from the beginning of the input, but there are only three). If p would succeed on an empty input, e.g. p = succeed 1 or the input is an infinite list of successes for p, e.g. p = dig and input = cycle "123", the unfolding would never stop, producing an infinite list of results, but each of these results wolud come from a finite chain of p's ended by a 'succeed []'. So the order of evaluation of pList p input = (succeed [] <+> (p <:> pList p)) input = succeed [] input ++ (p <:> pList p) input is 1. evaluate the first argument of (++), succeed [] input == [([],input)] Since this is not _|_, we need also the second argument of (++), so 2. evaluate (p <:> pList p) input (to whnf first, more if required) 3. evaluate (++) as far as needed 2. is further split, 2.1. evaluate p input, giving a list of (obj,rem) pairs -- if that's empty, we're done, also if that produces _|_ 2.2. (partially) evaluate pList p rem (goto 1.) giving a list of (objlist,rem2); [([],rem),([obj2],rem'),([obj2,obj3],rem'')...] 2.3. return the list of (obj:objlist,rem2) pairs
Yet, if the order is as you describe, everything goes well, for example:
comp1 = dig >*> dig has type - Parser char (char, char) comp2 = dig >*> (succeed []) has type - Parser char (char, [a]) pl1 = comp2 `build` (uncurry (:)) has type - Parser char (char, [char])
pl1 has type Parser Char [Char] because 'uncurry (:)' has type (a,[a]) -> [a]
At first run (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
should be: [] ++ ((p >*> pList p) `build` (uncurry (:)))
(succeed [] `alt` ((p >*> pList p) `build` (uncurry (:)))) input gives [([],input)] ++ ((p >*> pList p) `build` (uncurry (:))) input
so how we get: (p >*> succeed []) ?
Thanks, Dima
Anytime, Daniel

Daniel,
New combinator (<:>) that you introduced helps a lot to understand the whole
thing. I think that your explanation should be included in the next edition
of the "Haskell. The Craft of Functional Programming", I really mean it.
To summarize how I now understand the parser:
Using your combinators, for example:
pList dig "123"
unfolds into:
succeed []
<+> (dig <:> succeed [])
<+> (dig <:> (dig <:> succeed []))
<+> (dig <:> (dig <:> (dig <:> succeed [])))
<+> (dig <:> (dig <:> (dig <:> (dig <:> pList dig))))
where:
succeed [] ~~> [("", "123")]
(dig <:> succeed []) ~~> [("1", "23")]
(dig <:> (dig <:> succeed [])) ~~> [("12", "3")]
(dig <:> (dig <:> (dig <:> succeed []))) ~~> [("123", "")]
(dig <:> (dig <:> (dig <:> (dig <:> pList dig)))) ~~> []
the last one returns [] because:
(dig >*> dig >*> dig >*> dig) "123" ~~> []
As a result we get:
[("", "123")] ++ [("1", "23")] ++ [("12", "3")] ++ [("123", "")] ++ []
~~> [("", "123"), ("1", "23"), ("12", "3"), ("123", "")]
Thanks again Daniel for your great help!
Dima
On 3/28/07, Daniel Fischer
Daniel, I am still trying to figure out the order of function applications in
Am Mittwoch, 28. März 2007 11:57 schrieb Dmitri O.Kondratiev: the
parser returning list of objects (I attached again the code to the end of this message for convenience).
You wrote: (>*>) associates to the right, hence p >*> (p >*> (p >*> (... >*> (p >*> succeed [])...)))
I don't understand where (p >*> succeed []) comes from?
The final 'succeed []' comes from a) the definition of pList p as pList p = succeed [] `alt` ((p >*> pList p) `build` (uncurry (:))) plus b) the assumption that p should be a parser which doesn't succed on an empty input and that the input is finite (though the second point is not necessary).
Let us unfold a little:
pList dig "12ab" === succeed [] "12ab" ++ (((dig >*> pList dig) `build` (uncurry (:))) "12ab") === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- pList dig "2ab"] -- since dig "12ab" = [('1',"2ab")] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- (succed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "2ab"] === [([],"12ab")] ++ [('1' : ds,rem) | (ds,rem) <- ([([],"2ab")] ++ [('2' : ds2,rem2) | (ds2,rem2) <- pList dig "ab"])] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- (succeed [] `alt` (((dig >*> pList dig) `build` (uncurry (:))))) "ab"] === [([],"12ab"),("1","2ab")] ++ [('1' : '2' : ds2,rem2) | (ds2,rem2) <- ([([],"ab")] ++ (((dig >*> pList dig) `build` (uncurry (:))) "ab"))] -- now 'dig' and hence 'dig >*> pList dig' fail on the input "ab", thus === [([],"12ab"),("1","2ab"),("12","ab")]
Hum, I find that a bit hard to read myself, so let's introduce an alias for 'alt', call it (<+>) and a new combinator which joins (>*>) and 'build (uncurry (:))' : (<:>) :: Parser a b -> Parser a [b] -> Parser a [b] p1 <:> p2 = \inp -> [(x:ys,rem2) | (x,rem1) <- p1 inp, (ys,rem2) <- p2 rem1] -- or p1 <:> p2 = build (p1 >*> p2) (uncurry (:))
Then we have (because p1 <:> (p2 <+> p3) === (p1 <:> p2) <+> (p1 <:> p3)) pList p === succeed [] <+> (p <:> pList p) === succeed [] <+> (p <:> (succeed [] <+> (p <:> pList p))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> pList p)) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> (succeed [] <+> (p <:> pList p)))) === succeed [] <+> (p <:> succeed []) <+> (p <:> (p <:> succeed [])) <+> (p <:> (p <:> (p <:> succeed []))) <+> (p <:> (p <:> (p <:> (p <:> pList p)))) and so on. And when we request more p's than the input provides, pList p isn't reached anymore and recursion stops (e.g. with p = dig and input "123" or "123a45", the last line will fail because it demands four digits from the beginning of the input, but there are only three). If p would succeed on an empty input, e.g. p = succeed 1 or the input is an infinite list of successes for p, e.g. p = dig and input = cycle "123", the unfolding would never stop, producing an infinite list of results, but each of these results wolud come from a finite chain of p's ended by a 'succeed []'.
So the order of evaluation of pList p input = (succeed [] <+> (p <:> pList p)) input = succeed [] input ++ (p <:> pList p) input is 1. evaluate the first argument of (++), succeed [] input == [([],input)] Since this is not _|_, we need also the second argument of (++), so 2. evaluate (p <:> pList p) input (to whnf first, more if required) 3. evaluate (++) as far as needed
2. is further split, 2.1. evaluate p input, giving a list of (obj,rem) pairs -- if that's empty, we're done, also if that produces _|_ 2.2. (partially) evaluate pList p rem (goto 1.) giving a list of (objlist,rem2); [([],rem),([obj2],rem'),([obj2,obj3],rem'')...] 2.3. return the list of (obj:objlist,rem2) pairs
Yet, if the order is as you describe, everything goes well, for example:
comp1 = dig >*> dig has type - Parser char (char, char) comp2 = dig >*> (succeed []) has type - Parser char (char, [a]) pl1 = comp2 `build` (uncurry (:)) has type - Parser char (char, [char])
pl1 has type Parser Char [Char] because 'uncurry (:)' has type (a,[a]) -> [a]
At first run (succeed []) `alt` ((p >*> pList p) `build` (uncurry (:)))
should be: [] ++ ((p >*> pList p) `build` (uncurry (:)))
(succeed [] `alt` ((p >*> pList p) `build` (uncurry (:)))) input gives [([],input)] ++ ((p >*> pList p) `build` (uncurry (:))) input
so how we get: (p >*> succeed []) ?
Thanks, Dima
Anytime, Daniel
-- Dmitri O Kondratiev dokondr@gmail.com http://www.geocities.com/dkondr
participants (2)
-
Daniel Fischer
-
Dmitri O.Kondratiev