
On Fri, Feb 17, 2023 at 01:32:48PM -0400, Pedro B. wrote:
I am developing a program to parse diff output taken from stdin (as in diff file1 file2 | myApp) or from a file. I am reading the input as a ByteString in either case and parsing it with Attoparsec. My question is: should I use Data.Attoparsec.ByteString.Char8 or Data.Attoparsec.ByteString?
So far, I've been using Data.Attoparsec.ByteString.Char8, and it works for my sample files, which are in UTF-8, Latin-1, or the default Windows encoding.
What do you suggest?
Because the underlying ByteString data type is the same (Data.ByteString ~ Data.ByteString.Char8), you can use either or both sets of combinators as you see fit. The Char8 combinators match the parsed ByteStrings against Char predicates, while the base ByteString combinators match against Word8 predicates. The below is valid:

    import Data.Attoparsec.ByteString as A8
    import Data.Attoparsec.ByteString.Char8 as AC
    ...
    myParser :: ...
    myParser ... = do
        ...
        -- parse a Word8 byte followed by an 8-bit Char
        w <- A8.anyWord8
        c <- AC.anyChar
        ...

--
    Viktor.
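
For a slightly fuller illustration of mixing the two modules, here is a minimal sketch that parses one line of classic diff output ("< text" or "> text"). The DiffLine type and the diffLine parser are hypothetical names made up for this example, not part of attoparsec, and the sketch assumes the classic (non-unified) diff format. Note that the Char8 combinators treat every byte as one Char in the range 0-255, so a multi-byte UTF-8 sequence shows up as several Chars; since the diff markers themselves are ASCII, that should not affect the structural parsing.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.Attoparsec.ByteString       as A8
import qualified Data.Attoparsec.ByteString.Char8 as AC
import           Data.ByteString                  (ByteString)

-- Hypothetical result type for this sketch: one changed line of
-- classic diff output, with the line body kept as raw bytes.
data DiffLine
  = Removed ByteString   -- a "< text" line
  | Added   ByteString   -- a "> text" line
  deriving (Eq, Show)

-- Both modules share the same Parser type, so their combinators
-- can be mixed freely inside a single parser.
diffLine :: A8.Parser DiffLine
diffLine = do
  -- Char8 combinator: match the marker with a Char predicate
  marker <- AC.satisfy (\c -> c == '<' || c == '>')
  -- Word8 combinator: match exactly one space byte (0x20)
  _ <- A8.word8 0x20
  -- Word8 combinator: take raw bytes up to (not including) the newline,
  -- without interpreting the text encoding at all
  body <- A8.takeWhile (/= 0x0A)
  pure (if marker == '<' then Removed body else Added body)

main :: IO ()
main = do
  print (A8.parseOnly diffLine "< old line\n")
  print (A8.parseOnly diffLine "> new line\n")
```

Keeping the line body as an undecoded ByteString is a design choice in this sketch: the parser only needs to recognise the ASCII markers, and decoding the text (UTF-8, Latin-1, or otherwise) can be left to a later step.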