Re: [Haskell-cafe] Strings in Haskell

Neil Mitchell wrote:
Hi Alexy,
Now I'm reading a Haskell book which states the same. Is there a more efficient Haskell string-handling method? Which functional language is the most suitable for text processing?
There are the Data.ByteString things, which are great, and have much less overhead.
But remember that Haskell is lazy. If you are thinking "well, I have to process a 50Mb file", Haskell will read and process that file lazily, so only a small portion is ever in memory at a time, which substantially reduces the memory requirements.
Or you can get the best of both worlds by using Data.ByteString.Lazy :) Even with laziness, all the indirections that String causes hurt performance.
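For concreteness, here is a minimal sketch (not from the thread itself) of the two approaches being compared, plain lazy String I/O versus Data.ByteString.Lazy, counting the lines of a large file; the file name below is just a placeholder:

import qualified Data.ByteString.Lazy.Char8 as L

-- Plain String: readFile streams the file lazily, so only a small part
-- is resident at once, but every character is a boxed Char in a list cell.
countLinesString :: FilePath -> IO Int
countLinesString path = do
    contents <- readFile path
    return (length (lines contents))

-- Lazy ByteString: still streamed in chunks, but each chunk is a packed
-- byte array, so the per-character overhead largely disappears.
countLinesBytes :: FilePath -> IO Int
countLinesBytes path = do
    contents <- L.readFile path
    return (length (L.lines contents))

main :: IO ()
main = countLinesBytes "big.log" >>= print   -- "big.log" is a placeholder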

On Mon, Jan 22, 2007 at 09:37:21PM -0500, Bryan Donlan wrote:
Or you can get the best of both worlds by using Data.ByteString.Lazy :) Even with laziness, all the indirections that String causes hurt performance.
Actually, strictness analysis is really good at unboxing things like this, so the indirections probably hurt less than one would initially think.

I think the main issue with string processing speed is not so much the representation of characters; it is that the natural way to express algorithms in Haskell is always a character at a time, rather than working on chunks of text at once. Of course, Data.ByteString can let you do this too, but you start to diverge from idiomatic Haskell. Not that that is inherently the case forever.

John

--
John Meacham - ⑆repetae.net⑆john⑈
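To make the character-at-a-time versus chunk-at-a-time contrast concrete, a small sketch (not part of the original mail) of the same operation written both ways:

import qualified Data.ByteString.Char8 as B

-- Character at a time: idiomatic, but walks a [Char] one cons cell at
-- a time, forcing a boxed Char at each step.
countSpaces :: String -> Int
countSpaces = length . filter (== ' ')

-- Chunk at a time: B.count scans the packed buffer directly, without
-- ever materialising a list of characters.
countSpacesBS :: B.ByteString -> Int
countSpacesBS = B.count ' '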

John Meacham wrote:
On Mon, Jan 22, 2007 at 09:37:21PM -0500, Bryan Donlan wrote:
Or you can get the best of both worlds by using Data.ByteString.Lazy :) Even with laziness, all the indirections that String causes hurt performance.
Actually, strictness analysis is really good at unboxing things like this, so the indirections probably hurt less than one would initially think.
One thing that's impressed me with the Haskell application I'm currently working on (as I am trying to bully the code into handling 10M rows of data, which must all be kept in memory at once, without keeling over) is how often adding explicit strictness annotations *hasn't* improved performance. I guess this means that GHC's strictness analysis isn't much worse than my own. (and that my algorithms are such crap that issues of laziness/strictness are the least of my problems... :-)
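For reference, this is the kind of explicit strictness annotation being discussed (a sketch, not Seth's actual code): forcing an accumulator with a bang pattern so it cannot build up a chain of thunks. With optimisation on, GHC's strictness analyser usually infers this on its own, which matches the observation that adding the annotation often changes nothing.

{-# LANGUAGE BangPatterns #-}

-- Lazy accumulator: acc grows as a chain of (+) thunks until the end
-- of the list is reached.
sumLazy :: [Int] -> Int
sumLazy = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- Strict accumulator: the bang pattern forces acc at every step, so the
-- fold runs in constant space even without optimisation.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []     = acc
    go !acc (x:xs) = go (acc + x) xs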
participants (3)
- Bryan Donlan
- John Meacham
- Seth Gordon