
Ah, I thought I might have to resort to one of the ByteStrings modules. I've heard of them but was more or less avoiding them due to some complexities with installing extra libraries in my current dev environment. I'll try to work that out with the sysadmins and try it out. Thanks for the advice -----Original Message----- From: Tillmann Rendel [mailto:rendel@rbg.informatik.tu-darmstadt.de] Sent: Thursday, July 19, 2007 8:48 PM To: Re, Joseph (IT) Cc: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] Speedy parsing Re, Joseph (IT) wrote:
At this point I'm out of ideas, so I was hoping someone could identify
something stupid I've done (I'm still novice of FP in general, let alone for high performance) or direct me to a guide,website,paper,library, or some other form of help.
Two ideas about your aproaches: (1) try to avoid explicit recursion by using some standard library functions instead. it's easier (once you learned the library) and may be faster (since the library may be written in a easy to optimize style). (2) try lazy ByteStrings, they should be faster. http://www.cse.unsw.edu.au/~dons/fps.html As an example, sorting of the individual lines of a csv files by key. csv parses the csv format, uncsv produces it. these functions can't handle '=' in the key or ',' in the key or value. treesort sorts by inserting stuff into a map and removing it in ascending order:
import System.Environment import qualified Data.ByteString.Lazy.Char8 as B import qualified Data.Map as Map import Control.Arrow (second)
csv = (map $ map $ second B.tail . B.break (== '=')) . (map $ B.split ',') . (B.split '\n')
uncsv = (B.join $ B.pack "\n") . (map $ B.join $ B.pack ",") . (map $ map $ \(key, val) -> B.concat [key, B.pack "=", val])
treesort = Map.toAscList . Map.fromList
main = B.interact $ uncsv . map treesort . csv
Tillmann -------------------------------------------------------- NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error.