
Hi Ly-Xia, Compl, I would argue that `Maybe` does not scale well to debugging issues in big files. When parsing a large dataset you want: 1. Get precise coordinates and description of each fault so you can fix it. 1. Be able to parse it as far as possible to get overview of entire dataset before it can be handled in error-free way. Because of this I suggest adding resilience at the level of collections: 1. Each list of declarations, record fields etc, can be parsed with resilient version of `listOf` or `sepBy` combinators and return valid entries. In context where it replaces `forM` or `mapM` I named it `forData`: 2. Invalid entries should be put into logging monad (`WriterT [Error]` that keeps a list of errors and their coordinates). This approach can be used in monad that processes a list of records, as well as in the parser: ``` forData :: (a -> m (Either Error b)) -> WriterT [Error] m [b] type Parser e -- error type m -- monad a -- result = WriterT [e] m b parseList :: (a -> Parser m b) -> Parser m [b] ``` Argument is expanded in my Haskell.Love presentation: https://www.youtube.com/watch?v=KY27LsV11Rg&t=1281s When you want to gradually expand information on errors you can also use contravariant logging operator: ``` withErrorInfo :: (e -> f) -> Parser e m a -> Parser f m a data InFile e = InFile FilePath e parseFile filename = do addErrorInfo (InFile filename) $ do input <- liftIO (readFile filename) parseFile input ``` This gradual enrichment of error messages is frequently seen in IOHK project's state machines if you enjoy reading their source code. (If you want to make it more efficient by immediately processing errors instead of tagging them along in alist, please look at contravariant logging, But I usually recommend error aggregation as a last stage of processing to discover the most serious issues.) -- Cheers Michał ________________________________ Hi Compl, At least, for the example you gave on this list, it can be fixed by returning Nothing instead of using the facility for failure baked into (mega)parsec. (Proposed diff for reference: https://github.com/complyue/dcp/pull/3) "Returning Nothing" can be seen as adding a new channel for errors, turning the Parser monad into `MaybeT Parser`. Then `return Nothing` is how `empty` is defined by `MaybeT`, allowing that error to be caught and recovered from at the point where it was thrown, no backtracking. (And the original failure mode of Parser becomes `lift empty`.) Does that address your problem? Cheers, Li-yao On 10/28/2020 5:18 AM, Compl Yue wrote:
I'm still not fully clear about the confusion regarding megaparsec's behavior that I posted lately here. But now comes to my mind that it may have some problem rooted in the lacking of recoverability semantic with respect to parser combinators, some quoting from http://hackage.haskell.org/package/parser-combinators/docs/Control-Applicati...
The *A note on backtracking* section
Combinators in this module are defined in terms Applicative and Alternative operations.
And `empty`'s doc:
This parser fails unconditionally without providing any information about the cause of the failure.
Clearly `empty` is used to express failure, but there is seemingly no device to explicitly express whether a failure is recoverable. Then I observed megaparsec's implicit rule as currently implemented is like:
*) a failure with no input consumed can be recovered by rest parsers *) a failure with some input consumed can not be recovered by rest parsers
This works to great extent, but I would think the expressiveness can be further extended for a parser from the application, to tell the library that some input induces recoverable failure.
I have no expertise to suggest whether `MonadPlus` and/or `MonadFail` are suitable devices to be considered, but as megaparsec has implemented instances for them, I do feel some tweaks would be possible and meaningful.
On Thu, Oct 29, 2020 at 1:01 PM
Send Haskell-Cafe mailing list submissions to haskell-cafe@haskell.org
To subscribe or unsubscribe via the World Wide Web, visit http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe or, via email, send a message with subject or body 'help' to haskell-cafe-request@haskell.org
You can reach the person managing the list at haskell-cafe-owner@haskell.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Haskell-Cafe digest..." Today's Topics:
1. Re: Consider adding recoverability to the vocabulary of parser combinators (Li-yao Xia) 2. Re: Consider adding recoverability to the vocabulary of parser combinators (Compl Yue) 3. Re: Consider adding recoverability to the vocabulary of parser combinators (Compl Yue)
---------- Forwarded message ---------- From: Li-yao Xia
To: Compl Yue Cc: "Haskell Café" Bcc: Date: Wed, 28 Oct 2020 08:19:52 -0400 Subject: Re: [Haskell-cafe] Consider adding recoverability to the vocabulary of parser combinators Hi Compl, At least, for the example you gave on this list, it can be fixed by returning Nothing instead of using the facility for failure baked into (mega)parsec. (Proposed diff for reference: https://github.com/complyue/dcp/pull/3)
"Returning Nothing" can be seen as adding a new channel for errors, turning the Parser monad into `MaybeT Parser`. Then `return Nothing` is how `empty` is defined by `MaybeT`, allowing that error to be caught and recovered from at the point where it was thrown, no backtracking. (And the original failure mode of Parser becomes `lift empty`.)
Does that address your problem?
Cheers, Li-yao
On 10/28/2020 5:18 AM, Compl Yue wrote:
(sorry for repost, seems GMail's html processing on my last email has rendered it barely readable, so again with plain text here)
Dear Cafe,
I'm still not fully clear about the confusion regarding megaparsec's behavior that I posted lately here. But now comes to my mind that it may have some problem rooted in the lacking of recoverability semantic with respect to parser combinators, some quoting from http://hackage.haskell.org/package/parser-combinators/docs/Control-Applicati...
The *A note on backtracking* section
Combinators in this module are defined in terms Applicative and Alternative operations.
And `empty`'s doc:
This parser fails unconditionally without providing any information about the cause of the failure.
Clearly `empty` is used to express failure, but there is seemingly no device to explicitly express whether a failure is recoverable. Then I observed megaparsec's implicit rule as currently implemented is like:
*) a failure with no input consumed can be recovered by rest parsers *) a failure with some input consumed can not be recovered by rest parsers
This works to great extent, but I would think the expressiveness can be further extended for a parser from the application, to tell the library that some input induces recoverable failure.
I have no expertise to suggest whether `MonadPlus` and/or `MonadFail` are suitable devices to be considered, but as megaparsec has implemented instances for them, I do feel some tweaks would be possible and meaningful.
Best regards, Compl
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
---------- Forwarded message ---------- From: Compl Yue
To: Li-yao Xia Cc: "Haskell Café" Bcc: Date: Wed, 28 Oct 2020 21:37:11 +0800 Subject: Re: [Haskell-cafe] Consider adding recoverability to the vocabulary of parser combinators Hi Li-yao, I appreciate your help especially the PR as a working fix.
But personally I don't like the overall method of your solution, I can see Monad Transformers are powerful enough to tackle similar problems, but I'm not satisfied by the ergonomics in composing monads with transformers. `ParsecT` had already caused me much pain to get started in the beginning, and I'm still not fluent (comfortable) in transforming monads, especially I'm afraid I will have to transform much of the standard combinator functions, in order to get the real case parser working, as its resulting AST is much more complex.
I still have faith in the improvement of megaparsec as a well known parser combinator library (I regard it as the best for engineering needs among other libraries), and I must admit megaparsec already elegantly works 99% out of my current use cases, and the very issue we are talking about is a nice-to-have rather than must-to-have, so I would think we still have time to anticipate more options to come out.
And I particularly like to see parser combinators have this issue addressed in its own design space.
Thanks again with best regards, Compl
On Wed, Oct 28, 2020 at 8:19 PM Li-yao Xia
wrote: Hi Compl,
At least, for the example you gave on this list, it can be fixed by returning Nothing instead of using the facility for failure baked into (mega)parsec. (Proposed diff for reference: https://github.com/complyue/dcp/pull/3)
"Returning Nothing" can be seen as adding a new channel for errors, turning the Parser monad into `MaybeT Parser`. Then `return Nothing` is how `empty` is defined by `MaybeT`, allowing that error to be caught and recovered from at the point where it was thrown, no backtracking. (And the original failure mode of Parser becomes `lift empty`.)
Does that address your problem?
Cheers, Li-yao
On 10/28/2020 5:18 AM, Compl Yue wrote:
(sorry for repost, seems GMail's html processing on my last email has rendered it barely readable, so again with plain text here)
Dear Cafe,
I'm still not fully clear about the confusion regarding megaparsec's behavior that I posted lately here. But now comes to my mind that it may have some problem rooted in the lacking of recoverability semantic with respect to parser combinators, some quoting from http://hackage.haskell.org/package/parser-combinators/docs/Control-Applicati...
The *A note on backtracking* section
Combinators in this module are defined in terms Applicative and Alternative operations.
And `empty`'s doc:
This parser fails unconditionally without providing any information about the cause of the failure.
Clearly `empty` is used to express failure, but there is seemingly no device to explicitly express whether a failure is recoverable. Then I observed megaparsec's implicit rule as currently implemented is like:
*) a failure with no input consumed can be recovered by rest parsers *) a failure with some input consumed can not be recovered by rest parsers
This works to great extent, but I would think the expressiveness can be further extended for a parser from the application, to tell the library that some input induces recoverable failure.
I have no expertise to suggest whether `MonadPlus` and/or `MonadFail` are suitable devices to be considered, but as megaparsec has implemented instances for them, I do feel some tweaks would be possible and meaningful.
Best regards, Compl
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
---------- Forwarded message ---------- From: Compl Yue
To: Li-yao Xia Cc: "Haskell Café" Bcc: Date: Wed, 28 Oct 2020 21:48:37 +0800 Subject: Re: [Haskell-cafe] Consider adding recoverability to the vocabulary of parser combinators (sorry I forgot the text color again, maybe https://darkreader.org is to blame, anyway plain text mode I should use) Hi Li-yao,
I appreciate your help especially the PR as a working fix.
But personally I don't like the overall method of your solution, I can see Monad Transformers are powerful enough to tackle similar problems, but I'm not satisfied by the ergonomics in composing monads with transformers. `ParsecT` had already caused me much pain to get started in the beginning, and I'm still not fluent (comfortable) in transforming monads, especially I'm afraid I will have to transform much of the standard combinator functions, in order to get the real case parser working, as its resulting AST is much more complex.
I still have faith in the improvement of megaparsec as a well known parser combinator library (I regard it as the best for engineering needs among other libraries), and I must admit megaparsec already elegantly works 99% out of my current use cases, and the very issue we are talking about is a nice-to-have rather than must-to-have, so I would think we still have time to anticipate more options to come out.
And I particularly like to see parser combinators have this issue addressed in its own design space.
Thanks again with best regards, Compl
On Wed, Oct 28, 2020 at 8:19 PM Li-yao Xia
wrote: Hi Compl,
At least, for the example you gave on this list, it can be fixed by returning Nothing instead of using the facility for failure baked into (mega)parsec. (Proposed diff for reference: https://github.com/complyue/dcp/pull/3)
"Returning Nothing" can be seen as adding a new channel for errors, turning the Parser monad into `MaybeT Parser`. Then `return Nothing` is how `empty` is defined by `MaybeT`, allowing that error to be caught and recovered from at the point where it was thrown, no backtracking. (And the original failure mode of Parser becomes `lift empty`.)
Does that address your problem?
Cheers, Li-yao
On 10/28/2020 5:18 AM, Compl Yue wrote:
(sorry for repost, seems GMail's html processing on my last email has rendered it barely readable, so again with plain text here)
Dear Cafe,
I'm still not fully clear about the confusion regarding megaparsec's behavior that I posted lately here. But now comes to my mind that it may have some problem rooted in the lacking of recoverability semantic with respect to parser combinators, some quoting from http://hackage.haskell.org/package/parser-combinators/docs/Control-Applicati...
The *A note on backtracking* section
Combinators in this module are defined in terms Applicative and Alternative operations.
And `empty`'s doc:
This parser fails unconditionally without providing any information about the cause of the failure.
Clearly `empty` is used to express failure, but there is seemingly no device to explicitly express whether a failure is recoverable. Then I observed megaparsec's implicit rule as currently implemented is like:
*) a failure with no input consumed can be recovered by rest parsers *) a failure with some input consumed can not be recovered by rest parsers
This works to great extent, but I would think the expressiveness can be further extended for a parser from the application, to tell the library that some input induces recoverable failure.
I have no expertise to suggest whether `MonadPlus` and/or `MonadFail` are suitable devices to be considered, but as megaparsec has implemented instances for them, I do feel some tweaks would be possible and meaningful.
Best regards, Compl
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
-- Pozdrawiam Michał