
At the moment, there are at least three ways people use graph data-structures in Haskell: * Data.Graph from containers * Data.Graph.Inductive from FGL * A custom job (usually using something like IntMap). If we look on Hackage, there are a number of graph-related packages there, each of which uses one of the above approaches. However, for the more "generic" packages that operate _on_ graphs (rather than using graphs as an internal data structure), there is (as far as I can tell, anyway) usually no reason why at least most of that package cannot work on _any_ of these types of graphs. Examples include (in no particular order; found by looking for "graph" on the main hackage package list page): * hgal {uses Data.Graph} * vacuum and related packages {custom, [(a, [a])]} * PkgGraph {custom, a -> [b] AFAICT} * cabalgraph {custom, a -> b AFAICT} * graphviz {uses FGL} * SceneGraph {FGL} * data-reify {custom} * hsc3 and related packages {custom, not sure how} * dotgen {custom} * fenfire {custom} * GraphSCC {Data.Graph} * Graphalyze {FGL} * SourceGraph {FGL via Graphalyze} Just to give you an idea, out of these the following implement some form of graph -> .dot function (for GraphViz): * vacuum * PkgGraph * cabalgraph * graphviz (well, that's the point of this package... used by Graphalyze and SourceGraph) * SceneGraph * hsc3-dot * dotgen (From its description: "This package provides a simple interface for building .dot graph files...") Why do we duplicate so much work and effort? Not just in terms of generating Dot representations for use in GraphViz, but also algorithm development? For example, I can't see any reason why hgal shouldn't also work for any generic graph type, possibly even GraphSCC (note that sometimes specialisation isn't always avoidable: my Graphalyze library uses the inductive nature of FGL for its algorithms). I thus propose that we work out a generic graph class that can be used by the various libraries we have and use, to avoid this duplication of effort (I have already proposed that I intended to add such functionality to the graphviz library, but I'm throwing open the design of such a class to the general community). This means that even if you have to use some custom graph-like data structure in your program, you can take advantage of one of the libraries available (e.g. graphviz) without having to write your own graph functions for common tasks. Here is my current thinking on how such a class could be defined. Note that I'm defining it for _directed_ graphs, as all graph definitions used seem to match this mould (and we can emulate undirected graphs using directed graphs but not the other way around). ====================================================================== type Vertex = Int class GenericGraph g where vertices :: g -> [Vertex] edges :: g -> [(Vertex, Vertex)] order :: g -> Int order = length . vertices size :: g -> Int size = length . edges -- All vertices v' such that there's an edge (v',v) adjacentIn :: g -> Vertex -> [Vertex] adjacentIn g v = map fst . filter ((==) v . snd) $ edges g degIn :: g -> Vertex -> Int degIn g = length . adjacentIn g -- All vertices v' such that there's an edge (v,v') adjacentOut :: g -> Vertex -> [Vertex] adjacentOut g v = map snd . filter ((==) v . fst) $ edges g degOut :: g -> Vertex -> Int degOut g = length . adjacentIn g adjacent :: g -> Vertex -> [Vertex] adjacent g v = adjacentIn g v ++ adjacentOut g v degree :: g -> Vertex -> [Vertex] degree g = length . adjacent g -- For this function, maybe use as a default something based on the -- hidden isUndir function in graphviz's Data.GraphViz isDirected :: g -> Bool ====================================================================== I'm not saying that the above class is perfect (though I can't think of any other generic graph-related functions we might want to include in there off the top of my head). However, I do think it's a good starting point. Once we have defined such a class, how should we package it? In particular, how should we define instances for pre-packaged (i.e. Data.Graph and FGL) graph data structures? Method 1: Define the instances in the same package. --------------------------------------------------- If we include the instances in the same package, then all libraries that use this class will require FGL, etc. to be installed (e.g. if you want to use the graphviz library for your own custom class, then thanks to the static linking used by GHC you will still have an unwanted FGL dependency in your program). Method 2: Have seperate instance packages. ------------------------------------------ No unwanted dependencies, but this means we have what are basically useless packages containing nothing but an orphan instance. Method 3: Define the instance in Data.Graph and FGL --------------------------------------------------- I'm not sure if the maintainers of these packages (let alone the GHC maintainers who require containers) will appreciate us giving them another dependency for their packages. Furthermore, this limits users of libraries that use this class to using versions of the data structures contained in libraries new enough to have this instance defined in them. Method 4: Give a default instance in the documentation. ------------------------------------------------------- Users can copy/paste the instance definition from the documentation and put it in their code. This will allow them to tweak the definitions if they so desire, but will lead to code duplication. Please tell me any ideas/criticisms you have of this proposal. My main reason for wanting this at the moment is to define it in graphviz, though I can think of some other uses I might have of it. If we all agree to define this type of class in a custom package (rather than me just defining it in graphviz and leaving it up to other users to define instances). I'd like to have this resolved sooner rather than later, as I want to work on graphviz soonish so I can start working on Graphalyze and SourceGraph again. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On Wed, Jun 24, 2009 at 3:21 AM, Ivan Lazar
Miljenovic
there, each of which uses one of the above approaches. However, for the more "generic" packages that operate _on_ graphs (rather than using graphs as an internal data structure),
...
I thus propose that we work out a generic graph class that can be used by the various libraries we have and use, to avoid this duplication of effort (I have already proposed that I intended to add such functionality to the graphviz library, but I'm throwing open the design of such a class to the general community). This means that even if you have to use some custom graph-like data structure in your program, you can take advantage of one of the libraries available (e.g. graphviz) without having to write your own graph functions for common tasks.
This is a good idea and one I support. (I think I've been told before that this has been tried w/o a lot of success, but, well...) My primary concern is this: you built your class for things that operate "on" graphs...but there isn't a great distinction. There are too many useful graph algorithms that require modification, or at least marking of vertices/edges (as taken, seen, by distance, color...think and you'll notice this happens everywhere.) Thus, it'd be very nice if the Graph class could have a concept of: a) some amount of modification--new vertex, additional edge, what have you... b) Labeling of vertices/edges, ideally parameterized by label type. c) some amount of modification of those marks, so we can run, say, DFS, Floyd-Warshall, Dijsktra, Prims without cumbersome external management of secondary data structures. This might require the definition of a GraphAlgorithm monad, which I've been toying with for a while--I'll see if I can dig up the code if there's desire. AHH

Yay, someone read my proposal! :p
2009/6/25 Andrew Hunter
This is a good idea and one I support. (I think I've been told before that this has been tried w/o a lot of success, but, well...) My primary concern is this: you built your class for things that operate "on" graphs...but there isn't a great distinction. There are too many useful graph algorithms that require modification, or at least marking of vertices/edges (as taken, seen, by distance, color...think and you'll notice this happens everywhere.) Thus, it'd be very nice if the Graph class could have a concept of:
I was thinking about this, because graphviz at the moment uses the label field of nodes to determine which cluster it belongs to, etc. However, the problem with this is that Data.Graph doesn't have any labels...
a) some amount of modification--new vertex, additional edge, what have you...
Data.Graph can't really be modified in terms of adding a new vertex, unless you go and expand the array it uses and copy everything over (adding a new vertex, however, is indeed possible).
b) Labeling of vertices/edges, ideally parameterized by label type.
As I said, Data.Graph doesn't have any concept of labels; besides, this will require MultiParamTypeClasses and FunDeps AFAICT (and we should probably try to make this compatible with Haskell98 rather than using extensions).
c) some amount of modification of those marks, so we can run, say, DFS, Floyd-Warshall, Dijsktra, Prims without cumbersome external management of secondary data structures. This might require the definition of a GraphAlgorithm monad, which I've been toying with for a while--I'll see if I can dig up the code if there's desire.
My original thoughts (which I didn't include with the proposal) was that algorithms _would_ use internal state. My main impetus of thinking about this class was graphviz and hgal, which already use an internal state anyway (well, maybe not hgal as much; I've been trying to work my way through it a bit at a time now and then trying to improve its efficiency for what I need). Admittedly, it might be nice to have these extra features; it just might not be practical if we want the widest possible "audience" of the class. The other alternative is if we have a second class that allows for updates, etc. but requires the first class as a dependency. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com Casey Stengel - "All right everyone, line up alphabetically according to your height." - http://www.brainyquote.com/quotes/authors/c/casey_stengel.html

Some of us on #haskell last night (well, night for me :p) were
discussing this, and we're going to start a new project to implement an
extended version of my proposal. The working project name is simply
"graph" (hey, we couldn't think of anything better!). If you want to
join in the fun, talk to either myself (ivanm), Cale or mmorrow on
#haskell.
Ivan Miljenovic
Yay, someone read my proposal! :p
2009/6/25 Andrew Hunter
: This is a good idea and one I support. (I think I've been told before that this has been tried w/o a lot of success, but, well...) My primary concern is this: you built your class for things that operate "on" graphs...but there isn't a great distinction. There are too many useful graph algorithms that require modification, or at least marking of vertices/edges (as taken, seen, by distance, color...think and you'll notice this happens everywhere.) Thus, it'd be very nice if the Graph class could have a concept of:
I was thinking about this, because graphviz at the moment uses the label field of nodes to determine which cluster it belongs to, etc. However, the problem with this is that Data.Graph doesn't have any labels...
a) some amount of modification--new vertex, additional edge, what have you...
Data.Graph can't really be modified in terms of adding a new vertex, unless you go and expand the array it uses and copy everything over (adding a new vertex, however, is indeed possible).
b) Labeling of vertices/edges, ideally parameterized by label type.
As I said, Data.Graph doesn't have any concept of labels; besides, this will require MultiParamTypeClasses and FunDeps AFAICT (and we should probably try to make this compatible with Haskell98 rather than using extensions).
c) some amount of modification of those marks, so we can run, say, DFS, Floyd-Warshall, Dijsktra, Prims without cumbersome external management of secondary data structures. This might require the definition of a GraphAlgorithm monad, which I've been toying with for a while--I'll see if I can dig up the code if there's desire.
My original thoughts (which I didn't include with the proposal) was that algorithms _would_ use internal state. My main impetus of thinking about this class was graphviz and hgal, which already use an internal state anyway (well, maybe not hgal as much; I've been trying to work my way through it a bit at a time now and then trying to improve its efficiency for what I need).
Admittedly, it might be nice to have these extra features; it just might not be practical if we want the widest possible "audience" of the class. The other alternative is if we have a second class that allows for updates, etc. but requires the first class as a dependency.
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com
participants (3)
-
Andrew Hunter
-
Ivan Lazar Miljenovic
-
Ivan Miljenovic