Generic Graph Class

24 Jun 2009

      At the moment, there are at least three ways people use graph
data-structures in Haskell:

    * Data.Graph from containers
    * Data.Graph.Inductive from FGL
    * A custom job (usually using something like IntMap).

If we look on Hackage, there are a number of graph-related packages
there, each of which uses one of the above approaches.  However, for the
more "generic" packages that operate _on_ graphs (rather than using
graphs as an internal data structure), there is (as far as I can tell,
anyway) usually no reason why at least most of that package cannot work
on _any_ of these types of graphs.  Examples include (in no particular
order; found by looking for "graph" on the main hackage package list
page):

    * hgal {uses Data.Graph}
    * vacuum and related packages {custom, [(a, [a])]}
    * PkgGraph {custom, a -> [b] AFAICT}
    * cabalgraph {custom, a -> b AFAICT}
    * graphviz {uses FGL}
    * SceneGraph {FGL}
    * data-reify {custom}
    * hsc3 and related packages {custom, not sure how}
    * dotgen {custom}
    * fenfire {custom}
    * GraphSCC {Data.Graph}
    * Graphalyze {FGL}
    * SourceGraph {FGL via Graphalyze}

Just to give you an idea, out of these the following implement some form
of graph -> .dot function (for GraphViz):

    * vacuum
    * PkgGraph
    * cabalgraph
    * graphviz (well, that's the point of this package... used by
                      Graphalyze and SourceGraph)
    * SceneGraph
    * hsc3-dot
    * dotgen (From its description: "This package provides a simple
                       interface for building .dot graph files...")

Why do we duplicate so much work and effort?  Not just in terms of
generating Dot representations for use in GraphViz, but also algorithm
development?  For example, I can't see any reason why hgal shouldn't
also work for any generic graph type, possibly even GraphSCC (note that
sometimes specialisation isn't always avoidable: my Graphalyze library
uses the inductive nature of FGL for its algorithms).

I thus propose that we work out a generic graph class that can be used
by the various libraries we have and use, to avoid this duplication of
effort (I have already proposed that I intended to add such
functionality to the graphviz library, but I'm throwing open the design
of such a class to the general community).  This means that even if you
have to use some custom graph-like data structure in your program, you
can take advantage of one of the libraries available (e.g. graphviz)
without having to write your own graph functions for common tasks.

Here is my current thinking on how such a class could be defined.  Note
that I'm defining it for _directed_ graphs, as all graph definitions
used seem to match this mould (and we can emulate undirected graphs
using directed graphs but not the other way around).

======================================================================

type Vertex = Int

class GenericGraph g where
    vertices :: g -> [Vertex]
    edges :: g -> [(Vertex, Vertex)]

    order :: g -> Int
    order = length . vertices

    size :: g -> Int
    size = length . edges

    -- All vertices v' such that there's an edge (v',v)
    adjacentIn     :: g -> Vertex -> [Vertex]
    adjacentIn g v = map fst . filter ((==) v . snd)
                     $ edges g

    degIn   :: g -> Vertex -> Int
    degIn g = length . adjacentIn g

    -- All vertices v' such that there's an edge (v,v')
    adjacentOut     :: g -> Vertex -> [Vertex]
    adjacentOut g v = map snd . filter ((==) v . fst)
                      $ edges g

    degOut   :: g -> Vertex -> Int
    degOut g = length . adjacentIn g

    adjacent     :: g -> Vertex -> [Vertex]
    adjacent g v = adjacentIn g v ++ adjacentOut g v

    degree   :: g -> Vertex -> [Vertex]
    degree g = length . adjacent g

    -- For this function, maybe use as a default something based on the
    -- hidden isUndir function in graphviz's Data.GraphViz
    isDirected :: g -> Bool

======================================================================

I'm not saying that the above class is perfect (though I can't think of
any other generic graph-related functions we might want to include in
there off the top of my head).  However, I do think it's a good starting
point.

Once we have defined such a class, how should we package it?  In
particular, how should we define instances for pre-packaged
(i.e. Data.Graph and FGL) graph data structures?  

Method 1: Define the instances in the same package.
---------------------------------------------------
If we include the instances in the same package, then all
libraries that use this class will require FGL, etc. to be installed
(e.g. if you want to use the graphviz library for your own custom class,
then thanks to the static linking used by GHC you will still have an
unwanted FGL dependency in your program).

Method 2: Have seperate instance packages.
------------------------------------------
No unwanted dependencies, but this means we have what are basically
useless packages containing nothing but an orphan instance.

Method 3: Define the instance in Data.Graph and FGL
---------------------------------------------------
I'm not sure if the maintainers of these packages (let alone the GHC
maintainers who require containers) will appreciate us giving them
another dependency for their packages.  Furthermore, this limits users
of libraries that use this class to using versions of the data
structures contained in libraries new enough to have this instance
defined in them.

Method 4: Give a default instance in the documentation.
-------------------------------------------------------
Users can copy/paste the instance definition from the documentation and
put it in their code.  This will allow them to tweak the definitions if
they so desire, but will lead to code duplication.

Please tell me any ideas/criticisms you have of this proposal.  My main
reason for wanting this at the moment is to define it in graphviz,
though I can think of some other uses I might have of it.  If we all
agree to define this type of class in a custom package (rather than me
just defining it in graphviz and leaving it up to other users to define
instances).  I'd like to have this resolved sooner rather than later, as
I want to work on graphviz soonish so I can start working on Graphalyze
and SourceGraph again.

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com

Ivan Lazar Miljenovic

Andrew Hunter

Ivan Miljenovic

Ivan Lazar Miljenovic

tags

participants (3)