
I have a couple of questions about my first use of Parsec, which is trying to read morse code symbols from a string. I have a map of symbols: import qualified Data.Map as M morsemap = M.fromList [('A', ".-") ... , ('Z', "--..")] a string to parse, like test = "...---..-....-" and a Parser to read a single morse letter: morse1 :: GenParser Char st String morseletter = try $ M.fold ((<|>) . string) (string "") morsemap which I am hoping would be equivalent to try (string ".-") <|> (string "-.") ...etc <|> string "" but I don't know what the base of the fold should be (string "" is wrong I suppose) - what is the unit of <|>? Is there a readymade combinator for this? It fails anyway: *Main> run morse1 test parse error at (line 1, column 1): unexpected "." expecting ".-" I suppose this is because the strings I'm looking for overlap but I thought the use of try..<|> would avoid this error...? Thanks, -- View this message in context: http://www.nabble.com/Parsec-beginners-problem-tf3657821.html#a10219707 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

On 27/04/07, Jim Burton
I have a couple of questions about my first use of Parsec, which is trying to read morse code symbols from a string. I have a map of symbols:
import qualified Data.Map as M
morsemap = M.fromList [('A', ".-") ... , ('Z', "--..")]
a string to parse, like
test = "...---..-....-"
This may be relevant or not, but I thought morse required a delimiting character between letters, because otherwise the message was ambiguous? I seem to recall somewhere that Parsec didn't handle non-deterministic parsings very well (or at all). D.

Dougal Stanton wrote:
This may be relevant or not, but I thought morse required a delimiting character between letters, because otherwise the message was ambiguous? I seem to recall somewhere that Parsec didn't handle non-deterministic parsings very well (or at all).
D. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
You could be right about delimiters, but handling the ambiguous instances is the challenge in this case, which is a Ruby Quiz [http://www.rubyquiz.com/quiz121.html] - I thought it would be a good use for Parsec, and the user guide talks about try..<|> as the tool for it, as in testOr2 = try (string "(a)") <|> string "(b)" -- View this message in context: http://www.nabble.com/Parsec-beginners-problem-tf3657821.html#a10220154 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

Jim Burton wrote:
Dougal Stanton wrote:
This may be relevant or not, but I thought morse required a delimiting character between letters, because otherwise the message was ambiguous? I seem to recall somewhere that Parsec didn't handle non-deterministic parsings very well (or at all).
D. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
You could be right about delimiters, but handling the ambiguous instances is the challenge in this case, which is a Ruby Quiz [http://www.rubyquiz.com/quiz121.html] - I thought it would be a good use for Parsec, and the user guide talks about try..<|> as the tool for it, as in
testOr2 = try (string "(a)") <|> string "(b)"
After posting I realised the difference between parsing "(a)" <|> "(b)" and parsing "a" <|> "aa" ... so Parsec doesn't do the latter well or at all? -- View this message in context: http://www.nabble.com/Parsec-beginners-problem-tf3657821.html#a10220156 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

Jim Burton
After posting I realised the difference between parsing "(a)" <|> "(b)" and parsing "a" <|> "aa" ... so Parsec doesn't do the latter well or at all?
Exactly. Parsec is designed to avoid backtracking altogether, and to give only one answer, so it is the wrong tool for the job. You could look into alternative parser combinators like the Hutton Meijer ones from way back - they are non-deterministic and backtracking, and can explicitly represent all possible parses as a list, which I think is what you want. You can find them in the polyparse package: http://www.cs.york.ac.uk/fp/polyparse Regards, Malcolm

Jim Burton wrote:
After posting I realised the difference between parsing "(a)" <|> "(b)" and parsing "a" <|> "aa" ... so Parsec doesn't do the latter well or at all?
It should do (try "aa") <|> "a" just fine. If you mean a general sequence of "a"s then (many1 (char "a")) should do. The Morse Code problem is a little harder though. To handle the ambiguity you may need to simultaneously consider possible parses of the dots and dashes. I suggest you look into using the List monad. As for the actual parsing of the dots and dashes, Parsec may be overkill.

Jim Burton wrote:
You could be right about delimiters, but handling the ambiguous instances is the challenge in this case, which is a Ruby Quiz [http://www.rubyquiz.com/quiz121.html] - I thought it would be a good use for Parsec, and the user guide talks about try..<|> as the tool for it, as in
testOr2 = try (string "(a)") <|> string "(b)"
There are other modules that come with Haskell than Text.ParserCombinators.Parsec such as Text.ParserCombinators.ReadP The solution with ReadP makes for a very short 'parse' function. Note that reader is built in a recursive manner.
module Morse where
import Control.Monad(guard) import Text.ParserCombinators.ReadP
parse = map fst . readP_to_S reader where reader = done <++ choice (map pairToReader table) done = look >>= guard . null >> return [] pairToReader (s,c) = string s >> fmap (c:) reader
table = (".-",'A'): ("-...",'B'): ("-.-.",'C'): ("-..",'D'): (".",'E'): ("..-.",'F'): ("--.",'G'): ("....",'H'): ("..",'I'): (".---",'J'): ("-.-",'K'): (".-..",'L'): ("--",'M'): ("-.",'N'): ("---",'O'): (".--.",'P'): ("--.-",'Q'): (".-.",'R'): ("...",'S'): ("-",'T'): ("..-",'U'): ("...-",'V'): (".--",'W'): ("-..-",'X'): ("-.--",'Y'): ("--..",'Z'): []
The table is sorted which means the result of 'parse' is in sorted alphabetical order. It also does not check which are dictionary words, so it finds many solutions: *Morse> length $ parse "...---..-....-" 5104 *Morse> take 10 $ parse "...---..-....-" ["EEAGAEEEA","EEAGAEEEET","EEAGAEEIT","EEAGAEEU","EEAGAEIA","EEAGAEIET","EEAGAEST","EEAGAEV","EEAGAHT","EEAGAIEA"] -- Chris

Chris Kuklewicz wrote: [snip]
There are other modules that come with Haskell than Text.ParserCombinators.Parsec such as Text.ParserCombinators.ReadP
The solution with ReadP makes for a very short 'parse' function. Note that reader is built in a recursive manner.
module Morse where
import Control.Monad(guard) import Text.ParserCombinators.ReadP
parse = map fst . readP_to_S reader where reader = done <++ choice (map pairToReader table) done = look >>= guard . null >> return [] pairToReader (s,c) = string s >> fmap (c:) reader
table = (".-",'A'): ("-...",'B'): ("-.-.",'C'): ("-..",'D'): (".",'E'): ("..-.",'F'): ("--.",'G'): ("....",'H'): ("..",'I'): (".---",'J'): ("-.-",'K'): (".-..",'L'): ("--",'M'): ("-.",'N'): ("---",'O'): (".--.",'P'): ("--.-",'Q'): (".-.",'R'): ("...",'S'): ("-",'T'): ("..-",'U'): ("...-",'V'): (".--",'W'): ("-..-",'X'): ("-.--",'Y'): ("--..",'Z'): []
The table is sorted which means the result of 'parse' is in sorted alphabetical order.
It also does not check which are dictionary words, so it finds many solutions:
*Morse> length $ parse "...---..-....-" 5104
*Morse> take 10 $ parse "...---..-....-" ["EEAGAEEEA","EEAGAEEEET","EEAGAEEIT","EEAGAEEU","EEAGAEIA","EEAGAEIET","EEAGAEST","EEAGAEV","EEAGAHT","EEAGAIEA"]
Thanks Chris, that's a neat solution and an eye opener for me -- I need to investigate the Text package I think.
participants (5)
-
Al Falloon
-
Chris Kuklewicz
-
Dougal Stanton
-
Jim Burton
-
Malcolm Wallace