
Hi everyone,

I'm working on a digital forensics application that takes a file with lines of the following format:

    MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime

Each line represents the metadata associated with a particular file in the filesystem. I created a data type to represent the information that I will need to perform my analysis:

    data Event = Event {
        fn     :: B.ByteString,
        mftNum :: B.ByteString,
        ft     :: B.ByteString,
        fs     :: Integer,
        time   :: Integer,
        at     :: AccessType,
        mt     :: AccessType,
        ct     :: AccessType,
        crt    :: AccessType
    } deriving (Show)

    data AccessType = ATime | MTime | CTime | CrTime
        deriving (Show)

I would like to create a function that takes the ByteString representing the file and returns a list of Events:

    createEvents :: B.ByteString -> [Event]

(For now I'm creating a list, but depending on the type of analysis I decide to do, I may change this data structure.)

I understand that I can use the Parsec library to do this. I read RWH and noticed it has the endBy and sepBy combinators, but my issue is that using these functions performs too many transformations on the data. As I understand it, endBy will return a list of strings, one per line, each of which is then split by sepBy, giving a [[ByteString]] that I then have to iterate through to create the final [Event].

What I would like to do is define a custom parser that goes from the ByteString to the [Event] without the overhead of those intermediate steps. This function needs to be as fast as possible, as these files can be rather large and I will be performing many different tests and analyses on the data. I don't want the parsing to be a bottleneck.

I'm under the impression that the Parsec library will allow me to define a custom parser to do this, but I'm having problems understanding the library and its documentation. A gentle shove in the right direction would be greatly appreciated.

Thanks for your help,
Jimmy
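
For illustration, here is a minimal sketch of a Parsec parser that builds one record per line directly, with no intermediate [[ByteString]]. It relies on endBy being generic in its element parser, so the element can be a record parser rather than a string parser. Several things here are assumptions and not taken from the post: the FileMeta record is a hypothetical, simplified stand-in for Event (the post does not say how the time and AccessType fields should be filled), the names field, number, fileMeta, fileMetas and parseFileMetas are invented for the example, the size and the four timestamps are assumed to be non-negative decimal integers, and no field is assumed to contain '|' or a newline. It is a sketch of the approach, not a tuned implementation.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Parsec
import Text.Parsec.ByteString (Parser)

-- Hypothetical, simplified record (NOT the Event type from the post).
data FileMeta = FileMeta
  { metaName   :: B.ByteString
  , metaInode  :: B.ByteString
  , metaMode   :: B.ByteString
  , metaSize   :: Integer
  , metaATime  :: Integer
  , metaMTime  :: Integer
  , metaCTime  :: Integer
  , metaCrTime :: Integer
  } deriving (Show, Eq)

-- A text field: everything up to the next '|' or end of line.
field :: Parser B.ByteString
field = B.pack <$> many (noneOf "|\n")

-- A non-negative decimal integer (assumed format for size and times).
number :: Parser Integer
number = read <$> many1 digit

-- The '|' separator between fields.
sep :: Parser Char
sep = char '|'

-- One line: MD5|name|inode|mode_as_string|UID|GID|size|atime|mtime|ctime|crtime
fileMeta :: Parser FileMeta
fileMeta = do
  _     <- field <* sep   -- MD5, not kept in this sketch
  name  <- field <* sep
  inode <- field <* sep
  mode  <- field <* sep
  _     <- field <* sep   -- UID, not kept
  _     <- field <* sep   -- GID, not kept
  size  <- number <* sep
  a     <- number <* sep
  m     <- number <* sep
  c     <- number <* sep
  cr    <- number
  return (FileMeta name inode mode size a m c cr)

-- The whole file: newline-terminated records, then end of input.
fileMetas :: Parser [FileMeta]
fileMetas = (fileMeta `endBy` newline) <* eof

-- Run the parser over a strict ByteString.
parseFileMetas :: B.ByteString -> Either ParseError [FileMeta]
parseFileMetas = parse fileMetas "(input)"
```

Here each field is still collected through a String before being packed, so this shows the structure rather than the fastest possible version. If profiling shows parsing to be the bottleneck, a ByteString-oriented parser library such as attoparsec is a commonly used alternative; that is a swapped-in suggestion, not something the post mentions.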