ANNOUNCE: incremental-parser library package

The first version of incremental-parser has been released on Hackage [1]. It's yet another parser combinator library, providing the usual set of Applicative and Monad combinators. Apart from this, it has three twists that make it unique. First, the parser is incremental. That means it can be fed its input in chunks, and in proper circumstances it can also provide the parsed output in chunks. For this to be possible the result type must be a Monoid. The complete parsing result is then a concatenation of the partial results. In order to make the incremental parsing easier, the combinator set is optimized for monoidal results. The usual combinator many1, for example, assumes the result type is a monoid and concatenates its components instead of constructing a list. In Parsec:
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
In incremental-parser:
many1 :: (Monoid s, Monoid r) => Parser s r -> Parser s r
The second weirdness is that the the parser is generic in its input stream type, but this type is parameterized in a holistic way. There is no separate token type. Primitive parsers that need to peek into the input require its type to be an instance of a monoid subclass. In Parsec:
string :: Stream s m Char => String -> ParsecT s u m String char :: Stream s m Char => Char -> ParsecT s u m Char anyToken :: (Stream s m t, Show t) => ParsecT s u m t
In Attoparsec:
string :: ByteString -> Parser ByteString word8 :: Word8 -> Parser Word8 anyWord8 :: Parser Word8
In incremental-parser:
string :: (LeftCancellativeMonoid s, MonoidNull s) => s -> Parser s s token :: (Eq s, FactorialMonoid s) => s -> Parser s s anyToken :: FactorialMonoid s => Parser s s
The monoid subclasses referenced above provide methods for analyzing and subdividing the input stream. The classes are not particularly demanding, and any reasonable input stream should be able to accommodate them easily. The library comes with instances for lists, ByteString, and Text.
class Monoid m => MonoidNull m where mnull :: m -> Bool
class Monoid m => LeftCancellativeMonoid m where mstripPrefix :: m -> m -> Maybe m
class Monoid m => FactorialMonoid m where factors :: m -> [m] primePrefix :: m -> m ...
Finally, the library being implemented on the basis of Brzozowski derivatives, it can provide both the symmetric and the left-biased choice, <|> and <<|>. This is the same design choice made by Text.ParserCombinators.ReadP and uu-parsinglib. Parsec and its progeny on the other hand provide only the faster left-biased choice, at some cost to the expressiveness of the combinator language. [1] http://hackage.haskell.org/package/incremental-parser-0.1

On Tue, Mar 22, 2011 at 3:14 PM, Mario Blažević
The first version of incremental-parser has been released on Hackage [1]. It's yet another parser combinator library, providing the usual set of Applicative and Monad combinators. Apart from this, it has three twists that make it unique.
First, the parser is incremental. That means it can be fed its input in chunks, and in proper circumstances it can also provide the parsed output in chunks. For this to be possible the result type must be a Monoid. The complete parsing result is then a concatenation of the partial results.
In order to make the incremental parsing easier, the combinator set is optimized for monoidal results. The usual combinator many1, for example, assumes the result type is a monoid and concatenates its components instead of constructing a list.
In Parsec:
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
In incremental-parser:
many1 :: (Monoid s, Monoid r) => Parser s r -> Parser s r
The second weirdness is that the the parser is generic in its input stream type, but this type is parameterized in a holistic way. There is no separate token type. Primitive parsers that need to peek into the input require its type to be an instance of a monoid subclass.
In Parsec:
string :: Stream s m Char => String -> ParsecT s u m String char :: Stream s m Char => Char -> ParsecT s u m Char anyToken :: (Stream s m t, Show t) => ParsecT s u m t
In Attoparsec:
string :: ByteString -> Parser ByteString word8 :: Word8 -> Parser Word8 anyWord8 :: Parser Word8
In incremental-parser:
string :: (LeftCancellativeMonoid s, MonoidNull s) => s -> Parser s s token :: (Eq s, FactorialMonoid s) => s -> Parser s s anyToken :: FactorialMonoid s => Parser s s
The monoid subclasses referenced above provide methods for analyzing and subdividing the input stream. The classes are not particularly demanding, and any reasonable input stream should be able to accommodate them easily. The library comes with instances for lists, ByteString, and Text.
class Monoid m => MonoidNull m where mnull :: m -> Bool
class Monoid m => LeftCancellativeMonoid m where mstripPrefix :: m -> m -> Maybe m
class Monoid m => FactorialMonoid m where factors :: m -> [m] primePrefix :: m -> m ...
Finally, the library being implemented on the basis of Brzozowski derivatives, it can provide both the symmetric and the left-biased choice, <|> and <<|>. This is the same design choice made by Text.ParserCombinators.ReadP and uu-parsinglib. Parsec and its progeny on the other hand provide only the faster left-biased choice, at some cost to the expressiveness of the combinator language.
[1] http://hackage.haskell.org/package/incremental-parser-0.1
This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Work is punishment for failing to procrastinate effectively.

This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
Good question. Basically, I see MonadPlus as a union of Monad and Alternative. The class should not exist at all. But as long as it does, I figured I should provide an instance, and I made it different from the Monoid+Alternative combination because otherwise it would be useless. My second choice would be to remove the instance completely.

2011/3/22 Mario Blažević
This seems very interesting. One question:
The MonadPlus and the Alternative instance differ: the former's mplus combinator equals the asymmetric <<|> choice.
Why?
Good question. Basically, I see MonadPlus as a union of Monad and Alternative. The class should not exist at all. But as long as it does, I figured I should provide an instance, and I made it different from the Monoid+Alternative combination because otherwise it would be useless. My second choice would be to remove the instance completely.
I have to admit I really do not like having Applicative and MonadPlus with different behavior. Yes, one is redundant, but that is more an artifact of language evolution, than an intentional opportunity for diverging behavior. Every library I am aware of to date, save of course this one, has maintained their compatibility. If the instance for Alternative satisfies the underspecified MonadPlus laws, I'd just as soon have the 'useless redundant' instance. The power of MonadPlus is in the combinators that are built on top of it. Not in the primitives themselves. If the Alternative instance would not be a legal MonadPlus instance, then I'd feel much less queasy with your second scenario, and it simply removed. -Edward

I have to admit I really do not like having Applicative and MonadPlus with different behavior. Yes, one is redundant, but that is more an artifact of language evolution, than an intentional opportunity for diverging behavior. Every library I am aware of to date, save of course this one, has maintained their compatibility.
Another way of thinking about it is that for the different behaviours to be useful, a programmer has to know about them and intentionally use one or the other. And then the person reading it later has to be aware of that difference and keep in mind the different behaviour implied by the use of different classes. That's a pretty subtle thing to have to be aware of.
participants (4)
-
Edward Kmett
-
Evan Laforge
-
Gábor Lehel
-
Mario Blažević