
On Fri, Jun 03, 2005 at 04:02:09PM +0100, Duncan Coutts wrote:
On Fri, 2005-06-03 at 10:53 +0200, Gracjan Polak wrote:
As intern behaves like id and does not have any side effects, I thought its interface should be purely functional. But I do not see any way to do it :( I'll end up with a monad, probably.
On a related question: does anybody here have experience, benchmarks, or tests showing whether PackedString is better (uses less memory) than String in parsing tasks?
GHC itself uses a rather low-level thing it calls FastString, which is basically a pointer into a character array with a length and a unique id. The unique ids are allocated by entering each FastString into a global hash table, which also provides sharing if the same string is seen more than once (like your intern feature).
It is all very low-level and GHC-specific, however, and probably only makes sense in a compiler-like application.
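For illustration only, here is a minimal pure sketch of the interning idea discussed above: a table maps each string to a unique id, and a string seen twice gets the same id back. This is not GHC's FastString code. The names Atom, Table, intern and internAll are made up for this example, and a Data.Map stands in for the global hash table. The table is threaded by hand, which is the plumbing a state monad would otherwise hide.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (mapAccumL)

-- | An interned string: a unique id plus the original text.
-- (Hypothetical type for this sketch, not GHC's FastString.)
data Atom = Atom
  { atomId   :: !Int
  , atomText :: String
  }

-- Atoms compare by id only; after interning, equality is one Int comparison.
instance Eq Atom where
  a == b = atomId a == atomId b

-- | The intern table: the next free id and the strings seen so far.
-- A pure Map is used here in place of a global mutable hash table.
data Table = Table
  { nextId :: !Int
  , seen   :: !(Map.Map String Atom)
  }

emptyTable :: Table
emptyTable = Table 0 Map.empty

-- | Return the existing Atom for a string, or allocate a fresh id for it.
intern :: Table -> String -> (Table, Atom)
intern t s =
  case Map.lookup s (seen t) of
    Just a  -> (t, a)
    Nothing ->
      let a = Atom (nextId t) s
      in (Table (nextId t + 1) (Map.insert s a (seen t)), a)

-- | Intern a list of strings, threading the table from left to right.
internAll :: Table -> [String] -> (Table, [Atom])
internAll = mapAccumL intern
```

Interning ["foo", "bar", "foo"] starting from emptyTable gives the first and third atoms the same id. The cost of keeping the interface pure is that every caller has to pass the table along.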
jhc has something very similar in its Atom and PackedString modules. The advantages are that it always stores strings in UTF8, so the type is a CPR type rather than a union and hence can be optimized much better (in particular, it can be {-# UNPACK #-}ed). I have not done any formal comparisons though.

darcs also has its own similar thing, which I believe is faster but uses FFI calls to C code rather than being pure ghc-haskell.

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈