
David House wrote:
On 02/07/06, Brian Hulley
wrote: I can see that an "unsafe" global ref to a Trie of Char with Unique as the "value" of a node would allow me to implement fromString, toString, and instance Eq Atom, but I've got no idea how to implement instance Ord Atom so that the order is independent of the order in which Atoms are created and exactly the same as the lexicographic ordering of the String without being O(n) where n is the min of the lengths of the Atoms being compared.
Isn't compare on strings O(n) anyway? I suppose it would actually be O(d), where d is the index of the first difference, but that's O(n) worst-case. Even equality (a special case of compare) on strings (well, [Char], to be more precise) involves traversing the list and so is O(n) worst-case, no?
If an Atom was just represented by an Int, then construction would be O(n) where n is the length of the String (using a Trie with Int as the data associated with each node), and == and /= would be O(1). If the Int's could somehow be chosen clairvoyantly then < would also be O(1). However since we can't see into the future (to know which new Atoms will be created) afaics the problem of correctly choosing these Int's is completely impossible! Of course an Integer could be used (since it can be of any length) but it's likely that any systematic way of assigning them would simply encode the string in the Integer leading back to O(n) so in terms of speed for < there would be no advantage over ByteString's. So perhaps my original spec is impossible to implement, though it is an open question whether some very clever encoding (with corresponding implementation of <) could be found which would lead to a better average performance (whatever that means). An alternative design for an atom module could be: create :: MonadIO m => String -> m Atom toString :: Atom -> String instance Eq Atom -- O(1) instance Ord Atom -- O(1) but depends on creation order but here the < would not be lexicographic, so although it would be useful for implementing symbol tables, environments etc it's not ideal for GUI use (eg when displaying a tree of modules where everything should be listed alphabetically). As an aside, if the monad was removed then the result of atom "a" < atom "b" (atom :: String -> Atom) could not be determined by analysis of the program. It would depend on the evaluation order chosen by the compiler, but in a sense this doesn't matter because whatever the result is, it would be the same at any future time during the same run of the program so the use of Atoms as keys would still be safe. But is this still "functional"? Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com