ByteString comparison question (was: another Newbie performance question)

Hi,

I'm changing my CSV program to use ByteStrings, but I have problems with this:

    readCSVLine :: String     -- line as String
                -> [String]   -- line broken down into the value Strings
    readCSVLine = unfoldr builder
      where
        builder [] = Nothing
        builder xs = Just $ readField xs
        readField [] = ([],[])
        readField (',':xs) = ([],xs)
        readField ('"':xs) = (field,rest)
          where (field,'"':rest) = break (== '"') xs

So far, I have something like this - doesn't look too good to me, and doesn't compile:

    import qualified Data.ByteString as B
    import qualified Data.ByteString.Char8 as C8

    type CSV = [[B.ByteString]]

    readCSVLine :: B.ByteString     -- line as String
                -> [B.ByteString]   -- line broken down into the value Strings
    readCSVLine = unfoldr builder
      where
        builder xs | xs == B.empty = Nothing
                   | otherwise = Just $ readField xs
        readField xs | xs == B.empty = ([],[])
                     | B.head xs == ',' = ([], B.tail xs)
                     | B.head xs == '"' = (field, B.tail rest)
          where (field,rest) = B.break (== '"') xs

I do not know how to compare a Word8 to a Char. Or maybe I don't need to?

Regards
Philip

Philip Müller wrote:
> I do not know how to compare a Word8 to a Char. Or maybe I don't need to?

You don't need to, just use ByteString.Char8 or ByteString.Lazy.Char8.

--
(c) this sig last receiving data processing entity. Inspect headers for past copyright information. All rights reserved. Unauthorised copying, hiring, renting, public performance and/or broadcasting of this signature prohibited.

Achim Schneider wrote:
> Philip Müller wrote:
>> I do not know how to compare a Word8 to a Char. Or maybe I don't need to?
> You don't need to, just use ByteString.Char8 or ByteString.Lazy.Char8.

Could you give a short example of how to check whether a ByteString starts with a comma (',')?

I tried this

    import qualified Data.ByteString.Char8 as B

    readCSVLine :: B.ByteString     -- line as String
                -> [B.ByteString]   -- line broken down into the value Strings
    readCSVLine = unfoldr builder
      where
        builder xs | xs == B.empty = Nothing
                   | otherwise = Just $ readField xs
        readField xs | xs == B.empty = (B.empty,B.empty)
                     | B.head xs == ',' = (B.empty, B.tail xs)
                     | B.head xs == '"' = (field, B.tail rest)
          where (field,rest) = B.break (== '"') xs

But now I get strange errors when compiling, like

    Main.o: In function `rTD_info':
    (.text+0x5a): undefined reference to `bytestringzm0zi9zi1zi0_DataziByteStringziInternal_zdf2_closure'

Is this a problem with my code or is it just GHC unable to find the ByteString package?

Regards
Philip

Philip Müller wrote:
> Achim Schneider wrote:
>> Philip Müller wrote:
>>> I do not know how to compare a Word8 to a Char. Or maybe I don't need to?
>> You don't need to, just use ByteString.Char8 or ByteString.Lazy.Char8.
> Could you give a short example of how to check whether a ByteString starts with a comma (',')?

    import Text.Parsec
    import Text.Parsec.ByteString.Lazy
    import Data.ByteString.Lazy (ByteString)

    check :: ByteString -> Either ParseError Char
    check s = runParser (char ',') () "" s

;)

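For a direct check without a parser, a minimal sketch using Data.ByteString.Char8 could look like the following. The names startsWithComma and startsWithComma' are illustrative only and do not come from the thread.

```haskell
import qualified Data.ByteString.Char8 as B

-- True when the input begins with a comma; False for empty input.
startsWithComma :: B.ByteString -> Bool
startsWithComma s = B.pack "," `B.isPrefixOf` s

-- The same check by pattern matching on the first character.
-- B.uncons returns Nothing for empty input, so no partial function
-- such as B.head is ever applied to an empty ByteString.
startsWithComma' :: B.ByteString -> Bool
startsWithComma' s = case B.uncons s of
  Just (',', _) -> True
  _             -> False
```

Either form avoids the separate `xs == B.empty` guard that a bare `B.head` needs, because both already handle the empty case.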
> But now I get strange errors when compiling, like
> Main.o: In function `rTD_info':
> (.text+0x5a): undefined reference to `bytestringzm0zi9zi1zi0_DataziByteStringziInternal_zdf2_closure'
> Is this a problem with my code or is it just GHC unable to find the ByteString package?

No, it can find it, or you would have gotten errors beforehand. The linker can't find it because GHC can be kind of demented when it comes to remembering which packages it used while compiling, depending on options/processing mode.

Use

    ghc -package bytestring

or, preferably,

    ghc --make

Philip Müller wrote:
> import qualified Data.ByteString as B
> import qualified Data.ByteString.Char8 as C8

Note that these use the same underlying data structure, but Char8 interprets the contents as Char instead of Word8. So the B.head and B.break calls should be C8.head and C8.break - for consistency's sake, you could replace them all and drop the "B" import.
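
To illustrate the point about the shared data structure, here is a small sketch. The names sample, firstByte, firstChar and commaByte are made up for this example. It assumes only that both modules export the same ByteString type, as described above.

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as C8
import           Data.Word (Word8)

-- Built with the Char8 API but typed with the Word8 module's name:
-- both modules export the same ByteString type.
sample :: B.ByteString
sample = C8.pack ",abc"

-- The Word8 view of the first element: the raw byte 44 (ASCII comma).
firstByte :: Word8
firstByte = B.head sample

-- The Char8 view of the same element: the character ','.
firstChar :: Char
firstChar = C8.head sample

-- If the Word8 API is kept, the Char has to be converted explicitly.
commaByte :: Word8
commaByte = fromIntegral (fromEnum ',')
```

This is why `B.head xs == ','` is rejected by the type checker when B is Data.ByteString, while the same expression is accepted with the Char8 module.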

> readField xs | xs == B.empty = ([],[])
>              | B.head xs == ',' = ([], B.tail xs)
>              | B.head xs == '"' = (field, B.tail rest)
>   where (field,rest) = B.break (== '"') xs

-k
--
If I haven't seen further, it is by standing in the footprints of giants
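
Putting the suggestions in this thread together, one possible version of readCSVLine using only Data.ByteString.Char8 is sketched below. It is not code posted by any participant. It assumes that unquoted fields should be accepted too and that the comma after a closing quote is a separator rather than an empty field. The helper dropComma is hypothetical. Escaped quotes inside quoted fields are not handled, and a trailing empty field (input ending in a comma) is dropped.

```haskell
import           Data.List (unfoldr)
import qualified Data.ByteString.Char8 as B

readCSVLine :: B.ByteString     -- one line of input
            -> [B.ByteString]   -- the field values on that line
readCSVLine = unfoldr builder
  where
    -- Stop when nothing is left, otherwise split off one field.
    builder xs
      | B.null xs = Nothing
      | otherwise = Just (readField xs)

    readField xs = case B.uncons xs of
      -- Quoted field: take everything up to the closing quote,
      -- then skip that quote and the separator after it, if any.
      Just ('"', rest) ->
        let (field, after) = B.break (== '"') rest
        in  (field, dropComma (B.drop 1 after))
      -- Unquoted (possibly empty) field: take everything up to the
      -- next comma and skip the comma itself.
      _ ->
        let (field, after) = B.break (== ',') xs
        in  (field, B.drop 1 after)

    -- Remove a single leading comma, if there is one.
    dropComma ys = case B.uncons ys of
      Just (',', rest) -> rest
      _                -> ys
```

Loaded in GHCi, `readCSVLine (B.pack "\"a\",\"b\"")` evaluates to a list of the two fields a and b. Using `B.null` rather than `xs == B.empty` states the emptiness test directly.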

participants (3)
- Achim Schneider
- Ketil Malde
- Philip Müller