Re: Data.Graph transitive closure

... reasonably efficient transitive closure for Data.Graph ...
well, since we have type Graph = Array Vertex [Vertex] efficiency is already questionable? We cannot quickly check whether an edge is present, we cannot quickly get all predecessors of a vertex. But perhaps we don't need to. The underlying assumptions (cost of elementary operations) of http://www.cs.hut.fi/~enu/thesis.html are not immediately clear. The cited Agda library has newtype Graph n e = Graph (Map n (Map n e)) Another option is to store back edges as well, as in https://github.com/haskell-works/relation/blob/master/src/Data/Relation/Inte... FGL fuses these two maps type GraphRep a b = IntMap (Context' a b) type Context' a b = (IntMap [b], a, IntMap [b]) https://hackage.haskell.org/package/fgl-5.7.0.1/docs/src/Data.Graph.Inductiv... this saves some look-ups (if you need successors and predecessors of the very same node) but this can't be used for relations, where start and end of an edge have different types. But for transitive closure, these types would agree anyway. What to make of this? Perhaps Data.Graph should be moved out of containers. The design space is just too large to single out one specific point? If we keep it, it would be very nice to document the complexity of algorithms. Section 7 of the King/Launchbury paper (cited in the API docs) claims "DFS in O(V+E)", backed up by experiments. This seems to be the main motivation for this library (DFS, with application: SCC). It's not clear whether the underlying design should be recommended for a general graph library. - Johannes

I've actually wondered why Data.Graph existed since it is obviously not written for serious/heavy usage. So I do see an argument for deprecating it instead of polishing it. On Fri, Jun 21, 2019 at 10:10 AM Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
... reasonably efficient transitive closure for Data.Graph ...
well, since we have
type Graph = Array Vertex [Vertex]
efficiency is already questionable? We cannot quickly check whether an edge is present, we cannot quickly get all predecessors of a vertex. But perhaps we don't need to. The underlying assumptions (cost of elementary operations) of http://www.cs.hut.fi/~enu/thesis.html are not immediately clear.
The cited Agda library has
newtype Graph n e = Graph (Map n (Map n e))
Another option is to store back edges as well, as in
https://github.com/haskell-works/relation/blob/master/src/Data/Relation/Inte...
FGL fuses these two maps
type GraphRep a b = IntMap (Context' a b) type Context' a b = (IntMap [b], a, IntMap [b])
https://hackage.haskell.org/package/fgl-5.7.0.1/docs/src/Data.Graph.Inductiv...
this saves some look-ups (if you need successors and predecessors of the very same node) but this can't be used for relations, where start and end of an edge have different types. But for transitive closure, these types would agree anyway.
What to make of this?
Perhaps Data.Graph should be moved out of containers. The design space is just too large to single out one specific point? If we keep it, it would be very nice to document the complexity of algorithms. Section 7 of the King/Launchbury paper (cited in the API docs) claims "DFS in O(V+E)", backed up by experiments. This seems to be the main motivation for this library (DFS, with application: SCC). It's not clear whether the underlying design should be recommended for a general graph library.
- Johannes
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Yes, Data.Graph is a weird, difficult-to-polish corner of containers. I'm
all for removing it, but there's one sticky point: it's managed to
accumulate some extremely important reverse dependencies. In particular,
GHC, Cabal, and Agda all use it. So if we want to deprecate it, we need a
really good plan for what happens next.
On Fri, Jun 21, 2019, 12:37 PM Elliot Cameron
I've actually wondered why Data.Graph existed since it is obviously not written for serious/heavy usage. So I do see an argument for deprecating it instead of polishing it.
On Fri, Jun 21, 2019 at 10:10 AM Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
... reasonably efficient transitive closure for Data.Graph ...
well, since we have
type Graph = Array Vertex [Vertex]
efficiency is already questionable? We cannot quickly check whether an edge is present, we cannot quickly get all predecessors of a vertex. But perhaps we don't need to. The underlying assumptions (cost of elementary operations) of http://www.cs.hut.fi/~enu/thesis.html are not immediately clear.
The cited Agda library has
newtype Graph n e = Graph (Map n (Map n e))
Another option is to store back edges as well, as in
https://github.com/haskell-works/relation/blob/master/src/Data/Relation/Inte...
FGL fuses these two maps
type GraphRep a b = IntMap (Context' a b) type Context' a b = (IntMap [b], a, IntMap [b])
https://hackage.haskell.org/package/fgl-5.7.0.1/docs/src/Data.Graph.Inductiv...
this saves some look-ups (if you need successors and predecessors of the very same node) but this can't be used for relations, where start and end of an edge have different types. But for transitive closure, these types would agree anyway.
What to make of this?
Perhaps Data.Graph should be moved out of containers. The design space is just too large to single out one specific point? If we keep it, it would be very nice to document the complexity of algorithms. Section 7 of the King/Launchbury paper (cited in the API docs) claims "DFS in O(V+E)", backed up by experiments. This seems to be the main motivation for this library (DFS, with application: SCC). It's not clear whether the underlying design should be recommended for a general graph library.
- Johannes
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

I've actually wondered why Data.Graph existed since it is obviously not written for serious/heavy usage.
But it was! As I understand, it was originally part of GHC (ghc-0.29/ghc/compiler/utils/Digraph.lhs) for computing SCCs (in the graph of dependencies of declarations). So we can assume that it does this well. And I guess it's used for the very same purpose in Cabal and Agda. - J.

The big problem is that when it was broken off into containers, there was no attempt to add appropriate abstraction barriers. So it's ... awkward. On Fri, Jun 21, 2019, 2:10 PM Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
I've actually wondered why Data.Graph existed since it is obviously not written for serious/heavy usage.
But it was!
As I understand, it was originally part of GHC (ghc-0.29/ghc/compiler/utils/Digraph.lhs) for computing SCCs (in the graph of dependencies of declarations).
So we can assume that it does this well.
And I guess it's used for the very same purpose in Cabal and Agda.
- J. _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

i've definitely used the efficient SCC bits of Data.Graph in the past.
graphs are just such a rich area that it seems doubtful that there'll ever
be a one true library
On Fri, Jun 21, 2019 at 2:21 PM David Feuer
The big problem is that when it was broken off into containers, there was no attempt to add appropriate abstraction barriers. So it's ... awkward.
On Fri, Jun 21, 2019, 2:10 PM Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
I've actually wondered why Data.Graph existed since it is obviously not written for serious/heavy usage.
But it was!
As I understand, it was originally part of GHC (ghc-0.29/ghc/compiler/utils/Digraph.lhs) for computing SCCs (in the graph of dependencies of declarations).
So we can assume that it does this well.
And I guess it's used for the very same purpose in Cabal and Agda.
- J. _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Dear Johannes, Carter, David, libraries, As a formal methodist who likes and uses functional languages a lot, I looked at doing this in a "principled" way a long time ago. The idea was to formally specify an abstract graph datatype and various graph operations (create, query, modify, transitive closure, etc, etc...) and then show correctness of of simple but possibly inefficient implementations. Then the plan was to develop efficient operations, including use of appropriate graph representation types, and show formally that they were refinements of the inefficient ones. It rapidly became *very* clear that the best/only? way to do something efficiently was to decide which operations were important, and specialise for those. Woe betide you if you left out an operator that later would be important, because its efficient implementation could well force you to do complete re-design. Basically, it is not feasible to design a large/general-purpose graph library that does everything efficiently. I then got distracted by something else..... Regards, Andrew
On 21 Jun 2019, at 19:48, Johannes Waldmann
wrote: On 21.06.19 20:44, Carter Schonwald wrote:
... it seems doubtful that there'll ever be a one true [graph] library
Yes. My point was that Data.Graph isn't a "graph library" (it was never meant to be) - it's a "DFS/SCC library".
- J. _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
-------------------------------------------------------------------- Andrew Butterfield Tel: +353-1-896-2517 Fax: +353-1-677-2204 Lero@TCD, Head of Foundations & Methods Research Group School of Computer Science and Statistics, Room G.39, O'Reilly Institute, Trinity College, University of Dublin http://www.scss.tcd.ie/Andrew.Butterfield/ --------------------------------------------------------------------

Hi,
I looked at doing this in a "principled" way a long time ago. I then got distracted by something else.....
You don't say :-) Well, short-term suggestion (for containers:Data.Graph) don't change existing code, but improve documentation and do some benchmarking https://github.com/haskell/containers/issues/648 https://github.com/haskell/containers/issues/649 @David: perhaps you can label these issues as "nice to have" so it does not look like "bugs that need fixing". Benchmarking would make for a nice student project? In fact, I will pitch this to my current class. - Johannes

I have been bit before by Data.Graph... I think it's that since it's a
type synonym, it inherits Eq from Array, but it doesn't have a
normalized form, so Eq is misleading in that topologically equal
graphs are not Eq... something like that. And it can't be fixed
because it's just a type synonym over Array. That's probably the "no
attempt to add abstraction barriers" alluded to above. But the reason
I used it in the first place was I wanted a graph, and behold there is
Data.Graph already installed. So it's presence in containers gives it
the impression of authority, but it's definitely not as well developed
as the other types in containers.
Since it seems like it can't be moved or made into a proper type
without breaking things, we could at least have some caveats in the
documentation, saying what it's appropriate for and what it's not.
And maybe warning about that Eq thing. So I guess I'm seconding the
"just add documentation" suggestion.
Also while I'm here it's weird and confusing how it silently
re-exports Data.Tree...
On Mon, Jun 24, 2019 at 4:02 AM Johannes Waldmann
Hi,
I looked at doing this in a "principled" way a long time ago. I then got distracted by something else.....
You don't say :-)
Well, short-term suggestion (for containers:Data.Graph) don't change existing code, but improve documentation and do some benchmarking https://github.com/haskell/containers/issues/648 https://github.com/haskell/containers/issues/649
@David: perhaps you can label these issues as "nice to have" so it does not look like "bugs that need fixing". Benchmarking would make for a nice student project? In fact, I will pitch this to my current class.
- Johannes _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

If we could boot it out of containers and into its own package, that would
make me happier; then I wouldn't have such a weird, special-purpose
structure sitting around in an otherwise general-purpose library.
On Mon, Jun 24, 2019, 1:38 PM Evan Laforge
I have been bit before by Data.Graph... I think it's that since it's a type synonym, it inherits Eq from Array, but it doesn't have a normalized form, so Eq is misleading in that topologically equal graphs are not Eq... something like that. And it can't be fixed because it's just a type synonym over Array. That's probably the "no attempt to add abstraction barriers" alluded to above. But the reason I used it in the first place was I wanted a graph, and behold there is Data.Graph already installed. So it's presence in containers gives it the impression of authority, but it's definitely not as well developed as the other types in containers.
Since it seems like it can't be moved or made into a proper type without breaking things, we could at least have some caveats in the documentation, saying what it's appropriate for and what it's not. And maybe warning about that Eq thing. So I guess I'm seconding the "just add documentation" suggestion.
Also while I'm here it's weird and confusing how it silently re-exports Data.Tree...
On Mon, Jun 24, 2019 at 4:02 AM Johannes Waldmann
wrote: Hi,
I looked at doing this in a "principled" way a long time ago. I then got distracted by something else.....
You don't say :-)
Well, short-term suggestion (for containers:Data.Graph) don't change existing code, but improve documentation and do some benchmarking https://github.com/haskell/containers/issues/648 https://github.com/haskell/containers/issues/649
@David: perhaps you can label these issues as "nice to have" so it does not look like "bugs that need fixing". Benchmarking would make for a nice student project? In fact, I will pitch this to my current class.
- Johannes _______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

On 24.06.19 20:22, David Feuer wrote:
If we could boot it out of containers and into its own package,
https://hackage.haskell.org/package/GraphSCC-1.0.4/docs/Data-Graph-SCC.html ?

If Iavor wants to adopt Data.Graph into his package, I'm fine with that. I'm not exactly clear on what his package adds to the mix, but if it's not my problem it's not my problem. GHC HQ needs to sign off on adding it as a boot package, but its dependencies (base, array, containers) should make that acceptable. On Mon, Jun 24, 2019, 2:25 PM Johannes Waldmann < johannes.waldmann@htwk-leipzig.de> wrote:
On 24.06.19 20:22, David Feuer wrote:
If we could boot it out of containers and into its own package,
https://hackage.haskell.org/package/GraphSCC-1.0.4/docs/Data-Graph-SCC.html ?
participants (6)
-
Andrew Butterfield
-
Carter Schonwald
-
David Feuer
-
Elliot Cameron
-
Evan Laforge
-
Johannes Waldmann