
For a contrary point of view, there is a footnote at the bottom of page 20 in "Parsec, a fast combinator parser" by Daan Leijen, the creator of Parsec: "I have to warn the reader though that experience with the HaskellLight compiler has shown that it hardly pays off in practice to use special identifier representations instead of normal strings" Thomas Conway wrote:
On 10/10/07, Michael Vanier
wrote: Is there an implementation of a symbol type in Haskell i.e. a string which has a constant-time comparison operation?
To borrow Prolog terminology, it sounds like you're looking for an "atom" data type.
I've not done it, but I've plotted to implement a module according to the following sketch:
module Data.Atom where
data Atom ....
atom :: String -> Atom -- or ByteString
name :: Atom -> String -- or ByteString
instance Eq Atom where ...
instance Ord Atom where ...
The constructor function would do hash-consing using unsafePerformIO internally to build a [hash] table of extant Atoms. If ByteString is used for the internal "name", the hash consing means that you can use the Ptr for O(1) equality tests. The implementation of compare would still need to do a normal string comparison, after doing an initial equality test.
If you do the O(1) equality test before doing a full compare, the performance will be very good in many situations, since non-equal comparisons tend to terminate quickly. The exception of course is strings with long common prefixes (e.g. URLs). For symbol names in a compiler, this is unlikely to be a significant problem.
cheers, T.