Proposal: Tidy up and export PSQ from base

The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.) Discussion period: 1 week. Cheers, Edward

On Fri, Apr 29, 2011 at 10:10 PM, Edward Z. Yang
The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.)
Why base, and not containers? Besides this, +1 blessed priority search queue. Cheers, -- Felipe.

Excerpts from Felipe Almeida Lessa's message of Fri Apr 29 21:18:09 -0400 2011:
Why base, and not containers?
This is because the event manager still needs it to use it, and presently base does not have containers as a dependency (I don't think that would work anyway.) Perhaps one alternative possibility is to put the event manager in its own package, but I think it too provides the dependencies for a lot of modules we traditionally associate with base. Edward

On Fri, Apr 29, 2011 at 8:10 PM, Edward Z. Yang
The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.)
Discussion period: 1 week.
What will you also export GHC.Event.Unique?
Cheers, Edward
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Excerpts from Antoine Latter's message of Fri Apr 29 21:25:13 -0400 2011:
What will you also export GHC.Event.Unique?
My inclination is no, although I'm willing to be convinced otherwise. The PSQ in this case would be more like an IntMap (call it IntPSQ?) One consideration is the fact that GHC.Event.Unique currently unconditionally uses Int64, where it seems to me that you'd want to use whatever the native Int is per platform--someone who worked on the new event manager care to elucidate? Edward

On 04/29/11 21:10, Edward Z. Yang wrote:
The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.)
(After code cleanup), you *could* re-export base:GHC.Event.PSQ as containers:Data.PSQ ... then if base ever stops needing it, you can copy the implementation into containers (where I guess it logically belongs) without breaking everything? -Isaac

Excerpts from Isaac Dupree's message of Sat Apr 30 12:24:47 -0400 2011:
(After code cleanup), you *could* re-export base:GHC.Event.PSQ as containers:Data.PSQ ... then if base ever stops needing it, you can copy the implementation into containers (where I guess it logically belongs) without breaking everything?
Yes, that would be a sensible option, which I quite like. So I amend this proposal to also re-export this structure from containers. Cheers, Edward

On Sat, Apr 30, 2011 at 9:24 AM, Isaac Dupree < ml@isaac.cedarswampstudios.org> wrote:
(After code cleanup), you *could* re-export base:GHC.Event.PSQ as containers:Data.PSQ ... then if base ever stops needing it, you can copy the implementation into containers (where I guess it logically belongs) without breaking everything?
I don't think it makes much sense to do anything with the PSQ code in GHC.Event. When we were writing the event manager, we tore out every piece of non-essential PSQ code in order to keep the mental footprint of base as small as we reasonably could. You'll find a trimmed-down and specialised copy of IntMap in the event manager subtree, too. Tying general-purpose data structures to base in any way would be a mistake, I think, as it would constrain them to a slow and laborious maintenance and improvement cycle. If you want evidence, witness the fitful progress of all the good work on the containers package over the past 18 months. Indeed, to avoid the containers morass, I'd recommend putting that PSQ implementation in a package of its own and just leaving it at that (assuming there's not already a copy on Hackage).

I understand the sentiment. As I mentioned earlier, the reason why I'm interested in getting some amount of code reuse is that Hoopl (which is also in the bootstrap libraries) needs priority queue implementation. If it really is that specialized, it might be a better idea to just put another copy of it in Hoopl (though I'd not be too happy about that!) Edward Excerpts from Bryan O'Sullivan's message of Sat Apr 30 13:48:05 -0400 2011:
On Sat, Apr 30, 2011 at 9:24 AM, Isaac Dupree < ml@isaac.cedarswampstudios.org> wrote:
(After code cleanup), you *could* re-export base:GHC.Event.PSQ as containers:Data.PSQ ... then if base ever stops needing it, you can copy the implementation into containers (where I guess it logically belongs) without breaking everything?
I don't think it makes much sense to do anything with the PSQ code in GHC.Event. When we were writing the event manager, we tore out every piece of non-essential PSQ code in order to keep the mental footprint of base as small as we reasonably could. You'll find a trimmed-down and specialised copy of IntMap in the event manager subtree, too.
Tying general-purpose data structures to base in any way would be a mistake, I think, as it would constrain them to a slow and laborious maintenance and improvement cycle. If you want evidence, witness the fitful progress of all the good work on the containers package over the past 18 months. Indeed, to avoid the containers morass, I'd recommend putting that PSQ implementation in a package of its own and just leaving it at that (assuming there's not already a copy on Hackage).

On Sat, Apr 30, 2011 at 3:10 AM, Edward Z. Yang
The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.)
I just wanted to note that the PSQ was made monomorphic for performance reasons (about a 30% improvement) and I be hesitant to use a more generic version in the event manager unless it performs close to the current implementation. Johan

Excerpts from Johan Tibell's message of Sun May 01 04:26:52 -0400 2011:
I just wanted to note that the PSQ was made monomorphic for performance reasons (about a 30% improvement) and I be hesitant to use a more generic version in the event manager unless it performs close to the current implementation.
Just to clarify, here you mean monomorphic in the key and priority values? Edward

On Sun, May 1, 2011 at 11:52 AM, Edward Z. Yang
Excerpts from Johan Tibell's message of Sun May 01 04:26:52 -0400 2011:
I just wanted to note that the PSQ was made monomorphic for performance reasons (about a 30% improvement) and I be hesitant to use a more generic version in the event manager unless it performs close to the current implementation.
Just to clarify, here you mean monomorphic in the key and priority values?
Yes. And the values as well (i.e. monomorphic in the keys, priorities, and values).

Excerpts from Johan Tibell's message of Sun May 01 15:19:55 -0400 2011:
Yes. And the values as well (i.e. monomorphic in the keys, priorities, and values).
Wait, values too? Doesn't look like it in the source... -- | @E k p@ binds the key @k@ with the priority @p@. data Elem a = E { key :: {-# UNPACK #-} !Key , prio :: {-# UNPACK #-} !Prio , value :: a } deriving (Eq, Show) Cheers, Edward

On May 1, 2011 9:30 PM, "Edward Z. Yang"
Excerpts from Johan Tibell's message of Sun May 01 15:19:55 -0400 2011:
Yes. And the values as well (i.e. monomorphic in the keys, priorities, and values).
Wait, values too? Doesn't look like it in the source...
-- | @E k p@ binds the key @k@ with the priority @p@. data Elem a = E { key :: {-# UNPACK #-} !Key , prio :: {-# UNPACK #-} !Prio , value :: a } deriving (Eq, Show)
Cheers, Edward
Seems like I remembered it wrong. :)

OK, to summarize the current discussion: - It would be nice to have a general-purpose priority queue in containers. I'm not interested in this goal per se, but I do view it as the cleanest way to get what I want. - The IO event manager, as a primary client of PSQ, is concerned about the following possible implications of moving PSQ entirely to containers: - As a general purpose interface, it would be more difficult to performance tune and/or specialize the queue for its application. (What a limitation of GHC Haskell! We should fix that :-) - Specialization on keys and priority values is necessary for good performance with the event manager. - There is a fear that if PSQ gets in containers, updating it will be much more difficult for the IO event manager. First, a philosophical statement: I hate code duplication. I'm already pretty unhappy with the fact that base grew its own copy of IntMap, the only real functional difference being that this IntMap appears to be strict in values. It might be the only way to achieve other desirable properties (for example, having IntMap in containers and not in base), but it makes me die a little inside. One reason why I'm so adamant about this is that each of these modules contains nontrivial algorithmic content, and as recent events with Data.Map have shown, it's still possible for us to have gotten it wrong (and subtle fixes to be necessary). Thus, any contributor who wants to submit a performance improvement to IntMap also has to know that base has its own private copy of IntMap. That just seems poor. Though, it seems I'm already too late to the party: GHC.Event.PSQ is stolen from the PSQueue Hackage package, with some modifications (I didn't realize this when I originally made my proposal). So I hereby amend my proposal: I propose we move PSQueue into containers, hereby putting it on even ground with IntMap. Its notability for inclusion is hereby established by its use in the IO Event Manager and its projected use in Hoopl. On a completely theoretical note, I wonder if there is a way to take advantage of the fact that keys are integers to improve the PSQueue structure, much in the same way IntMap improves on Map. Cheers, Edward

On Sun, May 01, 2011 at 07:58:37PM -0400, Edward Z. Yang wrote:
Though, it seems I'm already too late to the party: GHC.Event.PSQ is stolen from the PSQueue Hackage package, with some modifications (I didn't realize this when I originally made my proposal). So I hereby amend my proposal:
I propose we move PSQueue into containers, hereby putting it on even ground with IntMap. Its notability for inclusion is hereby established by its use in the IO Event Manager and its projected use in Hoopl.
Why merge the two libraries, rather than just adding PSQueue to the set of bootlibs? But either way, this won't help the IO manager use the shared code, as it is in base. Personally I think breaking base up further is the right solution, including pulling most of the IO code out into its own package (you need to leavea little bit right at the bottom of the dependency hierarchy, so that you can define things like "error"), but that's had opposition in the past. Thanks Ian

On Mon, May 2, 2011 at 2:26 AM, Ian Lynagh
But either way, this won't help the IO manager use the shared code, as it is in base. Personally I think breaking base up further is the right solution, including pulling most of the IO code out into its own package (you need to leavea little bit right at the bottom of the dependency hierarchy, so that you can define things like "error"), but that's had opposition in the past.
I think this is the way to go as well. Data structures (like Map) and simple data types (like Maybe) needs to be in the bottom of the hierarchy, so the rest of the libraries (i.e. base) can use them. I found it quite frustrating to make the I/O manager GHC specific, by e.g. changing all Data.Maybe to GHC.Maybe, when it could be compiler neutral. I also don't like to have to duplicate things like PSQ and IntMap. Johan

Excerpts from Ian Lynagh's message of Sun May 01 20:26:10 -0400 2011:
On Sun, May 01, 2011 at 07:58:37PM -0400, Edward Z. Yang wrote:
Though, it seems I'm already too late to the party: GHC.Event.PSQ is stolen from the PSQueue Hackage package, with some modifications (I didn't realize this when I originally made my proposal). So I hereby amend my proposal:
I propose we move PSQueue into containers, hereby putting it on even ground with IntMap. Its notability for inclusion is hereby established by its use in the IO Event Manager and its projected use in Hoopl.
Why merge the two libraries, rather than just adding PSQueue to the set of bootlibs?
I suppose the other works as well. Cheers, Edward

On Mon, 2011-05-02 at 01:26 +0100, Ian Lynagh wrote:
On Sun, May 01, 2011 at 07:58:37PM -0400, Edward Z. Yang wrote:
Though, it seems I'm already too late to the party: GHC.Event.PSQ is stolen from the PSQueue Hackage package, with some modifications (I didn't realize this when I originally made my proposal). So I hereby amend my proposal:
I propose we move PSQueue into containers, hereby putting it on even ground with IntMap. Its notability for inclusion is hereby established by its use in the IO Event Manager and its projected use in Hoopl.
Why merge the two libraries, rather than just adding PSQueue to the set of bootlibs?
Adding a priority queue to the containers package seems to make more sense than putting it in it's own package that is also distributed with ghc and the Haskell Platform. If it's going to be exposed and part of the HP anyway, why not just put it in with the other containers. Duncan

I think we should extend the discussion period for another week. Cheers, Edward Excerpts from Edward Z. Yang's message of Fri Apr 29 21:10:41 -0400 2011:
The new IO event manager implements a priority search queue internally. This is a kind of handy data structure; in particular, some algorithmic improvements in Hoopl would require a decent priority queue implementation. I propose that we move GHC.Event.PSQ to Data.PSQ, cleaning up the dependency on GHC.Event.Unique, and export it unconditionally (right now it is only available on non-Windows.)
Discussion period: 1 week.
Cheers, Edward

After the reviewing the discussion, it looks like this is a no-go. I'll hack up something just for Hoopl, and at some later point in time I'll re-propose a more expansive set of data structures which probably ought to be added to containers. Edward
participants (8)
-
Antoine Latter
-
Bryan O'Sullivan
-
Duncan Coutts
-
Edward Z. Yang
-
Felipe Almeida Lessa
-
Ian Lynagh
-
Isaac Dupree
-
Johan Tibell