Re: [GHC] #913: instance Ord (StableName a)

#913: instance Ord (StableName a) -------------------------------------+------------------------------------- Reporter: ekarttun | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: 6.10 branch Component: libraries/base | Version: 6.4.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ekmett): Up until now the fact that you can't care about the arbitrary ordering that the `StableName`s have has been viewed as a feature. Pragmatically, if you need to use them as keys in a container `unordered-containers` should properly treat them as `Hashable`, and rather explicitly considers the container "unordered". That said, one could argue that the same reasoning should apply to `Data.Unique` in that it should be `Hashable`/`Eq` only, and not provide the `Ord` instance it already does, so to your point there is at least an appearance of inconsistency between these two positions. There is some difference between those two scenarios though. It is worth noting that nothing in our current API says a `StableName#` gets any sort of actual _unique_ integer assigned to it during creation. It just lives on the heap. Pretending we can gives me pause. You can't compare pointers to them using an ordering, any more than you can compare `MutVar#`'s for `Ord` because GC can change the relative position of them, and we don't want to inflate them with a superfluous costs to support an operation that isn't fundamental to their existence. Now, we do have a `stableNameToInt#` operation, but it rather carefully doesn't claim to produce a unique name. Mind you it doesn't make this claim largely due to concerns about GC happening in the meantime, but we've had this degree of freedom for a long time, and could abuse it more in the internals without anybody actually being able to care. @simonmar's initial suggestion that we can use the `Ord` on the `stableNameToInt#` used by `hashStableName` to compare `StableName#`s seems a little racy to me: the integer isn't guaranteed to be unique due to the fact that `StableName` IDs can be reused, but due to the fact that I could construct two stable names, converting one to an integer through `stableNameToInt#`, have a gc happen in which the now 'dead' `StableName` gets recollected, have someone else get assigned to the same id, then force another stablename that was constructed in the meantime, which might resolve to the same id. While it seems like a reasonably far fetched scenario that that might go wrong, the chain of reasoning to prove that it always goes right is delicate. e.g. I don't see what prevents you from constructing the second stable name using an `unsafePerformIO` in between the unwrapping of the first and unwrapping the second, it would be constructed by the act of forcing the `StableName` data constructor. Do I think this will really happen in user code? Maybe not, but the fact that I could even see a path towards it, and Jan-Willem's experience report, combined with my previous efforts along these lines building the `intern` library and basically giving up on the cleanup once problems get large enough due to issues of this sort, gives me a great deal of pause. But there isn't really any particular reason other than the current implementation why these Ids are small unique numbers. In a different world, they could just be heap objects with some ID created during initialization, say, based on their initial allocation address. They could then be "moved" like any other heap object. The current API doesn't require uniqueness of keys, so this would pass everything about the `StableName` API, much like how my [https://github.com/ekmett/unique/blob/master/src/Control/Concurrent/Unique.h... Control.Concurrent.Unique] matches the shape of the `Data.Unique` API, but can only supply `Hashable`, not `Ord`, as the integer of initial allocation location associated with each `Unique` value could be shared across several such `Unique` values. This has the benefit of avoiding _any_ coordination during allocation of these identifiers across threads and so scales better than our current `Data.Unique` system. I could see a world in which we'd want to refactor the internals of `System.Mem.StableName` at some point to offer a more efficient construction. The current API offers a lot of possible implementation approaches. Adding an `Ord` constraint limits those choices, and reasoning about its soundness takes us into relatively dubious territory as seen above. On a more theoretical basis, to phrase this another way, not everything that can be broken into equivalence classes breaks up into _ordered_ equivalence classes. If these were merely unique objects on the heap, then merely being able to compare these for equality vs. having the power to sort them is analogous to the problem of 'pointer discrimination' mentioned by Fritz Henglein in [https://pdfs.semanticscholar.org/f425/7af9221ca7fe21dc84a049a8545a28a874ae.p... Generic Top-down Discrimination for Sorting and Partitioning in Linear Time]. All of these issues taken together leaves me inclined to let the old `wontfix` status of this issue stand. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/913#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC