Re: [Haskell-cafe] Attoparsec.ByteString.Char8 or Attoparsec.ByteString for diff output?

20 Feb 2023

On 20/2/2023 at 1:43 p.m., Viktor Dukhovni wrote:
...
On Mon, Feb 20, 2023 at 10:46:38AM -0400, Pedro B. wrote:
...
Thanks Li-yao . As I mentioned in my answer to Viktor, I am now using
the ByteString functions except when I want to parse Char8's, for
example to parse an 'a' with Data.Attoparsec.ByteString.Char8.char 'a'.
FWIW, you can often avoid the Char8 combinators, e.g. for matching a
specific 8-bit (ASCII) character, at a modest loss of readability,
you can just match its Word8 code point:
     0x0a <--- '\n'
     0x0d <--- '\r'
     0x20 <--- ' '
     0x30 <--- '0'
     0x41 <--- 'A'
     0x61 <--- 'a'
     ...
I am comfortable with the raw hex values of various "interesting"
characters, but you can also define aliases:
     import Data.Char (ord)
     import Data.Word (Word8)

     char_nl, char_cr, char_sp, char_0, char_A, char_a :: Word8
     char_nl = fromIntegral $ ord '\n'
     char_cr = fromIntegral $ ord '\r'
     char_sp = fromIntegral $ ord ' '
     ...
I am using the Data.Word8 module provided by the word8 package, which
defines _lf, _tab, _cr, and so on, and even _a.._z, _0.._9, etc. For
example, I may use (==_tab) as the argument for
Data.Attoparsec.ByteString.takeTill.
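
A minimal sketch of that takeTill usage, assuming the attoparsec,
bytestring and word8 packages are available. The name fieldBeforeTab and
the sample input are hypothetical, chosen only for illustration:

```haskell
import qualified Data.Attoparsec.ByteString as A
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.Word8 (_tab)

-- Hypothetical helper: consume bytes up to, but not including, the
-- first tab. takeTill never fails; with no tab it takes all the input.
fieldBeforeTab :: A.Parser ByteString
fieldBeforeTab = A.takeTill (== _tab)

-- Hypothetical sample input: a file name, a tab, then a date.
sampleField :: Either String ByteString
sampleField =
  A.parseOnly fieldBeforeTab (BC.pack "old/file.txt\t2023-02-20")
```

Here sampleField evaluates to Right "old/file.txt"; the tab and the rest
of the line are left unconsumed.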

You made me realize that I can use "word8 _a" instead of "char 'a'" and
have almost no need for the Char8 combinators. I'll probably do that and
only use "decimal" from Char8 to parse integers, which I need to parse
line ranges such as "2,10".
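
Roughly what I have in mind, as a sketch rather than my actual code. The
names lineRange and rangeThenA are hypothetical, and treating a lone
number as a one-line range is an assumption made for the example:

```haskell
import qualified Data.Attoparsec.ByteString as A
import Data.Attoparsec.ByteString.Char8 (decimal)
import qualified Data.ByteString.Char8 as BC
import Data.Word8 (_a)

-- Hypothetical: parse "2,10" as (2,10). A lone "5" is read as (5,5),
-- which is an assumption made for this example.
lineRange :: A.Parser (Int, Int)
lineRange = do
  start <- decimal
  -- 0x2c is the byte for ','; if the comma part does not match,
  -- fall back to the start line.
  end <- A.option start (A.word8 0x2c *> decimal)
  pure (start, end)

-- Hypothetical: a range followed by the byte for 'a', using
-- "word8 _a" where "char 'a'" would otherwise appear.
rangeThenA :: A.Parser (Int, Int)
rangeThenA = lineRange <* A.word8 _a

-- Small helper so the parsers can be tried on String literals.
parseWith :: A.Parser a -> String -> Either String a
parseWith p = A.parseOnly p . BC.pack
```

With these definitions, parseWith lineRange "2,10" gives Right (2,10),
and parseWith rangeThenA "2,10a" gives the same pair after consuming
the 'a'.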

I still have a question, though: given that I only match specific
characters generated by diff, do I gain anything by not using Char8?
Performance, perhaps?

Regards,

Pedro