ANNOUNCE: incremental-parser library package

The first version of incremental-parser has been released on Hackage [1]. It's yet another parser combinator library, providing the usual set of Applicative and Monad combinators. Apart from this, it has three twists that make it unique. First, the parser is incremental. That means it can be fed its input in chunks, and in proper circumstances it can also provide the parsed output in chunks. For this to be possible the result type must be a Monoid. The complete parsing result is then a concatenation of the partial results. In order to make the incremental parsing easier, the combinator set is optimized for monoidal results. The usual combinator many1, for example, assumes the result type is a monoid and concatenates its components instead of constructing a list. In Parsec:
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
In incremental-parser:
many1 :: (Monoid s, Monoid r) => Parser s r -> Parser s r
The second weirdness is that the the parser is generic in its input stream type, but this type is parameterized in a holistic way. There is no separate token type. Primitive parsers that need to peek into the input require its type to be an instance of a monoid subclass. In Parsec:
string :: Stream s m Char => String -> ParsecT s u m String char :: Stream s m Char => Char -> ParsecT s u m Char anyToken :: (Stream s m t, Show t) => ParsecT s u m t
In Attoparsec:
string :: ByteString -> Parser ByteString word8 :: Word8 -> Parser Word8 anyWord8 :: Parser Word8
In incremental-parser:
string :: (LeftCancellativeMonoid s, MonoidNull s) => s -> Parser s s token :: (Eq s, FactorialMonoid s) => s -> Parser s s anyToken :: FactorialMonoid s => Parser s s
The monoid subclasses referenced above provide methods for analyzing and subdividing the input stream. The classes are not particularly demanding, and any reasonable input stream should be able to accommodate them easily. The library comes with instances for lists, ByteString, and Text.
class Monoid m => MonoidNull m where mnull :: m -> Bool
class Monoid m => LeftCancellativeMonoid m where mstripPrefix :: m -> m -> Maybe m
class Monoid m => FactorialMonoid m where factors :: m -> [m] primePrefix :: m -> m ...
Finally, the library being implemented on the basis of Brzozowski derivatives, it can provide both the symmetric and the left-biased choice, <|> and <<|>. This is the same design choice made by Text.ParserCombinators.ReadP and uu-parsinglib. Parsec and its progeny on the other hand provide only the faster left-biased choice, at some cost to the expressiveness of the combinator language. [1] http://hackage.haskell.org/package/incremental-parser-0.1

On Tue, Mar 22, 2011 at 3:14 PM, Mario Blažević
The first version of incremental-parser has been released on Hackage [1]. It's yet another parser combinator library, providing the usual set of Applicative and Monad combinators. Apart from this, it has three twists that make it unique.
First, the parser is incremental. That means it can be fed its input in chunks, and in proper circumstances it can also provide the parsed output in chunks. For this to be possible the result type must be a Monoid. The complete parsing result is then a concatenation of the partial results.
In order to make the incremental parsing easier, the combinator set is optimized for monoidal results. The usual combinator many1, for example, assumes the result type is a monoid and concatenates its components instead of constructing a list.
In Parsec:
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
In incremental-parser:
many1 :: (Monoid s, Monoid r) => Parser s r -> Parser s r
The second weirdness is that the the parser is generic in its input stream type, but this type is parameterized in a holistic way. There is no separate token type. Primitive parsers that need to peek into the input require its type to be an instance of a monoid subclass.
In Parsec:
string :: Stream s m Char => String -> ParsecT s u m String char :: Stream s m Char => Char -> ParsecT s u m Char anyToken :: (Stream s m t, Show t) => ParsecT s u m t
In Attoparsec:
string :: ByteString -> Parser ByteString word8 :: Word8 -> Parser Word8 anyWord8 :: Parser Word8
In incremental-parser:
string :: (LeftCancellativeMonoid s, MonoidNull s) => s -> Parser s s token :: (Eq s, FactorialMonoid s) => s -> Parser s s anyToken :: FactorialMonoid s => Parser s s
The monoid subclasses referenced above provide methods for analyzing and subdividing the input stream. The classes are not particularly demanding, and any reasonable input stream should be able to accommodate them easily. The library comes with instances for lists, ByteString, and Text.
class Monoid m => MonoidNull m where mnull :: m -> Bool
class Monoid m => LeftCancellativeMonoid m where mstripPrefix :: m -> m -> Maybe m
class Monoid m => FactorialMonoid m where factors :: m -> [m] primePrefix :: m -> m ...
Finally, the library being implemented on the basis of Brzozowski derivatives, it can provide both the symmetric and the left-biased choice, <|> and <<|>. This is the same design choice made by Text.ParserCombinators.ReadP and uu-parsinglib. Parsec and its progeny on the other hand provide only the faster left-biased choice, at some cost to the expressiveness of the combinator language.
[1] http://hackage.haskell.org/package/incremental-parser-0.1
This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Work is punishment for failing to procrastinate effectively.

This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
Good question. Basically, I see MonadPlus as a union of Monad and Alternative. The class should not exist at all. But as long as it does, I figured I should provide an instance, and I made it different from the Monoid+Alternative combination because otherwise it would be useless. My second choice would be to remove the instance completely.

2011/3/22 Mario Blažević
This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
Good question. Basically, I see MonadPlus as a union of Monad and Alternative. The class should not exist at all. But as long as it does, I figured I should provide an instance, and I made it different from the Monoid+Alternative combination because otherwise it would be useless. My second choice would be to remove the instance completely.
I have to admit I really do not like having Applicative and MonadPlus with different behavior. Yes, one is redundant, but that is more an artifact of language evolution, than an intentional opportunity for diverging behavior. Every library I am aware of to date, save of course this one, has maintained their compatibility. If the instance for Alternative satisfies the underspecified MonadPlus laws, I'd just as soon have the 'useless redundant' instance. The power of MonadPlus is in the combinators that are built on top of it. Not in the primitives themselves. If the Alternative instance would not be a legal MonadPlus instance, then I'd feel much less queasy with your second scenario, and it simply removed. -Edward

I have to admit I really do not like having Applicative and MonadPlus with different behavior. Yes, one is redundant, but that is more an artifact of language evolution, than an intentional opportunity for diverging behavior. Every library I am aware of to date, save of course this one, has maintained their compatibility.
Another way of thinking about it is that for the different behaviours to be useful, a programmer has to know about them and intentionally use one or the other. And then the person reading it later has to be aware of that difference and keep in mind the different behaviour implied by the use of different classes. That's a pretty subtle thing to have to be aware of.

Can you explain what are the advantages of your library over the online version of all applicative parsers in the uu-parsinglib, which are not restricted to the monoidal results? On 22 mrt 2011, at 15:14, Mario Blažević wrote:
The first version of incremental-parser has been released on Hackage [1]. It's yet another parser combinator library, providing the usual set of Applicative and Monad combinators. Apart from this, it has three twists that make it unique.
First, the parser is incremental. That means it can be fed its input in chunks, and in proper circumstances it can also provide the parsed output in chunks. For this to be possible the result type must be a Monoid. The complete parsing result is then a concatenation of the partial results.
In order to make the incremental parsing easier, the combinator set is optimized for monoidal results. The usual combinator many1, for example, assumes the result type is a monoid and concatenates its components instead of constructing a list.
In Parsec:
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
In incremental-parser:
many1 :: (Monoid s, Monoid r) => Parser s r -> Parser s r
The second weirdness is that the the parser is generic in its input stream type, but this type is parameterized in a holistic way. There is no separate token type. Primitive parsers that need to peek into the input require its type to be an instance of a monoid subclass.
In Parsec:
string :: Stream s m Char => String -> ParsecT s u m String char :: Stream s m Char => Char -> ParsecT s u m Char anyToken :: (Stream s m t, Show t) => ParsecT s u m t
In Attoparsec:
string :: ByteString -> Parser ByteString word8 :: Word8 -> Parser Word8 anyWord8 :: Parser Word8
In incremental-parser:
string :: (LeftCancellativeMonoid s, MonoidNull s) => s -> Parser s s token :: (Eq s, FactorialMonoid s) => s -> Parser s s anyToken :: FactorialMonoid s => Parser s s
The monoid subclasses referenced above provide methods for analyzing and subdividing the input stream. The classes are not particularly demanding, and any reasonable input stream should be able to accommodate them easily. The library comes with instances for lists, ByteString, and Text.
class Monoid m => MonoidNull m where mnull :: m -> Bool
class Monoid m => LeftCancellativeMonoid m where mstripPrefix :: m -> m -> Maybe m
class Monoid m => FactorialMonoid m where factors :: m -> [m] primePrefix :: m -> m ...
Finally, the library being implemented on the basis of Brzozowski derivatives, it can provide both the symmetric and the left-biased choice, <|> and <<|>. This is the same design choice made by Text.ParserCombinators.ReadP and uu-parsinglib. Parsec and its progeny on the other hand provide only the faster left-biased choice, at some cost to the expressiveness of the combinator language.
[1] http://hackage.haskell.org/package/incremental-parser-0.1
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Sat, Apr 2, 2011 at 7:45 AM, S. Doaitse Swierstra
Can you explain what are the advantages of your library over the online version of all applicative parsers in the uu-parsinglib, which are not restricted to the monoidal results?
To tell you the truth, even though I've read the uu-parsinglib documentation I wasn't even aware of different parser types it allowed. The library documentation does not exactly advertise online parsers. I'm reading the "A Short Tutorial" paper now ( http://www.cs.uu.nl/research/techreps/repo/CS-2008/2008-044.pdf). It is interesting, but very much misnamed: it is neither short nor a tutorial. It does a good job of explaining the library implementation, but there's very little in the way of usage. The Demo.Examples module on the other hand does not provide any example of incremental parsing. Would you mind providing a short example of use of an uu-parsinglib online parser that actually takes advantage of incremental parsing? Here is such an example of incremental-parser use that you can adapt: {-# LANGUAGE OverloadedStrings #-}
module Main where
import Prelude hiding (null) import Data.ByteString.Char8 (ByteString, hGet, null, unpack) import System.Environment (getArgs) import System.IO (hIsEOF, IOMode(ReadMode), withFile) import Text.ParserCombinators.Incremental
main= getArgs >>= mapM_ incremental
incremental filename = withFile filename ReadMode (flip processHandle testParser) where processHandle h p = do chunk <- hGet h 1024 if null chunk then putStrLn "EOF" >> extract (feedEof p) else extract (feed chunk p) >>= processHandle h extract p = let (r, p') = resultPrefix p in print r >> return p'
testParser :: Parser ByteString [Int] testParser = many0 (fmap (\digits-> [read $ unpack digits]) (takeWhile1 (\c-> c >= "0" && c <= "9")) <<|> skip anyToken)
This simple example will read a text file containing integers, in kilobyte chunks, and print out the list of integers. The parser is simplistic, but it serves to illustrate the main points of the interface: - the parser is fed input in chunks, - the parsed results are read in chunks, and - the input chunks may overlap with result components.

On 2011-04-02 13:45, S. Doaitse Swierstra wrote:
Can you explain what are the advantages of your library over the online version of all applicative parsers in the uu-parsinglib, which are not restricted to the monoidal results?
I would expect that the parsers based on Brzozowski derivatives support left recursion. -- /NAD
participants (6)
-
Edward Kmett
-
Evan Laforge
-
Gábor Lehel
-
Mario Blažević
-
Nils Anders Danielsson
-
S. Doaitse Swierstra