What do _you_ want to see in FGL?

Ivan Lazar Miljenovic

26 Apr 2010 26 Apr '10

2:44 a.m.

Since I've volunteered myself to help maintain/upgrade FGL, what do the people in the community want to see happen with it? Here are some ideas that I have regarding FGL: * I had already started working on a new generic graph class [1] (with initial draft at [2]) to act as a wrapper around FGL and other graph implementations. However, if I'm going to be updating FGL anyway I might as well combine the two and generalise/update FGL instead. Generally speaking, this will involve the following items: - Splitting and expanding the typeclasses involved; most class methods would have default definitions but this will allow people to override the defaults if there's a more efficient way for their datatype. - Having a separate parameter (using associated types?) for the node type rather than just using Int. - Providing more "fundamental" graph operations that may be of use. However, the downside of this is that datatypes such as Data.Graph in containers won't match this anymore due to the lack of node and edge labels. I think this fair price to pay, especially if the mapping aspects are split off into another typeclass. [1]: http://www.haskell.org/pipermail/haskell-cafe/2009-June/063402.html [2]: http://code.haskell.org/~ivanm/Graph.hs * Better fundamental data structures: one of the things that has always annoyed me about FGL is how much it uses tuples; I propose re-defining the Context type to be a record-based data structure. Also, usage of Sets, Maps, etc. where applicable. * Proper instances: make graphs instantiate Eq, Ord, Show and Read (where read . show == id). The Show and Read instances will probably usually be like how Data.Map has them, so helper functions can be defined to help out with this. * Split off the instances: maybe keep one basic graph datatype in FGL, but then have separate packages for others (which means its easier to add new datatypes without needing a new release, etc.). * Split off the various Data.Graph.Inductive.Query functions/algorithms, and merge in the ones I have defined in Graphalyze [3]. [3]: http://hackage.haskell.org/package/Graphalyze * Scrap Data.Graph.Inductive.Graphviz and recommend usage of my graphviz library [4] instead. [4]: http://hackage.haskell.org/package/graphviz * Data.Graph.Inductive.Monad: does anyone actually use this? * Benchmarks and tests. What do other people think? Any other suggestions? Also, once I release the next version of graphviz (hopefully later today), I was going to work on improving SourceGraph [5] by providing command line options, etc. to it; would people prefer I work on FGL instead? [5]: http://hackage.haskell.org/package/SourceGraph -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Show replies by date

Christian Maeder

26 Apr 26 Apr

8:50 a.m.

Ivan Lazar Miljenovic schrieb:

...

- Having a separate parameter (using associated types?) for the node type rather than just using Int.

"Just Int" for nodes was disappointing. It should have been at least a "newtype". I would vote against these experimental features like associated types or MPTC and FD. I prefer plain type parameters for data types, but that does not fit well together with type classes.

...

* Better fundamental data structures: one of the things that has always annoyed me about FGL is how much it uses tuples; I propose re-defining the Context type to be a record-based data structure. Also, usage of Sets, Maps, etc. where applicable.

Right, these tuples annoyed me, too. (If this changes, it will break a lot of our existing code, though.) I also did not like the list of links, that let me redefine the context data type (although it does not fit well into the current class): newtype Gr a b = Gr { convertToMap :: Map.IntMap (GrContext a b) } data GrContext a b = GrContext { nodeLabel :: a , nodeSuccs :: Map.IntMap [b] , loops :: [b] , nodePreds :: Map.IntMap [b] } http://trac.informatik.uni-bremen.de:8080/hets/browser/trunk/Common/Lib/Grap... Maybe "Ord b" should be required for edge labels "b" (in order to use sets), although it may make sense to have many edges with the same label between the same two nodes. However, for this special case the user could provide an additional index or a counter.

...

* Data.Graph.Inductive.Monad: does anyone actually use this?

I did not use it. (I've got no opinion about the other points you made.) Cheers Christian

Ivan Lazar Miljenovic

9 a.m.

Christian Maeder writes:

...

Ivan Lazar Miljenovic schrieb:

...
- Having a separate parameter (using associated types?) for the node type rather than just using Int.

"Just Int" for nodes was disappointing. It should have been at least a "newtype".

This then loses you all of the advantages of using Int (pre-defined data type with known space usage, ordering, etc. and the ability to use IntMap and IntSet which out-perform the normal ones).

...

I would vote against these experimental features like associated types or MPTC and FD. I prefer plain type parameters for data types, but that does not fit well together with type classes.

Why don't you like extensions? I used to feel the same way, but then someone pointed out to me that just because Haskell98 doesn't have them doesn't mean they aren't good/useful, and we should be coding for _modern_ Haskell.

...

...
* Better fundamental data structures: one of the things that has always annoyed me about FGL is how much it uses tuples; I propose re-defining the Context type to be a record-based data structure. Also, usage of Sets, Maps, etc. where applicable.

Right, these tuples annoyed me, too. (If this changes, it will break a lot of our existing code, though.)

All of my proposals will break existing code though, but I believe for the better. We can't be afraid to innovate/change our libraries, otherwise they can't improve. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Christian Maeder

9:33 a.m.

Ivan Lazar Miljenovic schrieb:

...

...
I would vote against these experimental features like associated types or MPTC and FD. I prefer plain type parameters for data types, but that does not fit well together with type classes.

Why don't you like extensions? I used to feel the same way, but then someone pointed out to me that just because Haskell98 doesn't have them doesn't mean they aren't good/useful, and we should be coding for _modern_ Haskell.

My feeling is, that the last words about fancy type class extension are not spoken yet. Already the simple classes have disadvantages (wrt multiple instances), that i.e. parameterized modules solve differently (but maybe also not good enough, yet). Parametrized data types, hofs, purity, etc. are proven concepts also found in other functional languages that I prefer to stick to. Cheers Christian

Neil Brown

9:25 a.m.

Hi, Primarily I want to see in FGL: documentation, documentation and more documentation. The library has lots of undocumented functions (especially the queries, e.g. http://hackage.haskell.org/packages/archive/fgl/5.4.2.2/doc/html/Data-Graph-... has no documentation at all), and the only reference is the original paper. I've seen several people who didn't read the paper (or didn't read the right bits) get confused by the inductive way the graph is constructed, when they try to de-construct it. I would also like to see the library reduced in size/scope if possible; there's a lot of modules in FGL, but all I've ever used are D.G.I.Graph, D.G.I.Tree, D.G.I.Graphviz and D.G.I.Query.DFS. I consider this to probably be the most common usage of the library by others too (if you throw in D.G.I.Query.BFS). So by the sounds of it, if you split Graphviz and Query off to separate libraries, I probably only use the core module. Ivan Lazar Miljenovic wrote:

...

Here are some ideas that I have regarding FGL:

* Better fundamental data structures: one of the things that has always annoyed me about FGL is how much it uses tuples; I propose re-defining the Context type to be a record-based data structure. Also, usage of Sets, Maps, etc. where applicable.

I think this is worth doing, even if it does break code. If you're doing Context, you may as well do LNode and LEdge too.

...

* Data.Graph.Inductive.Monad: does anyone actually use this?

No, now that I look at it, it's not what I would expect. It seems to be a lot of operations that could be constructed by sticking liftM before the normal operations. For example: nodesM :: GraphM m gr => m (gr a b) -> m [Node] I would have thought the useful operations for graphs and monads would have been something more like: nodesM :: GraphM m gr => gr (m a) (m b) -> m [Node] On that node, I would like to see some instances for the graph types: particularly Functor, Foldable and Traversable, for both (Gr a) and (Flip Gr b) -- where Flip is http://hackage.haskell.org/packages/archive/TypeCompose/latest/doc/html/Cont... or http://hackage.haskell.org/packages/archive/category-extras/latest/doc/html/... If you put FGL in a repository somewhere, I might be able to find some time to write those instances and perhaps contribute a bit of documentation (no promises, though!). Thanks, Neil.

Ben Franksen

17 May 17 May

2:57 p.m.

Neil Brown wrote:

...

Primarily I want to see in FGL: documentation, documentation and more documentation.

+1 Cheers Ben

Stephen Tetley

26 Apr 26 Apr

10:07 a.m.

Hello Ivan What would your thoughts be on freezing FGL as it is and putting changes into a new package "FGL2" or "NewFGL"? The implementation technique for FGL is independently interesting; Martin Erwig expanded on it in other papers ('Metamorphic Programming') but no one else seems to have picked up on it. As I think there is more mileage in the idiom and hopefully someone will pick up on it at some point, it would be nice if it survives in the code. Best wishes Stephen

Ivan Lazar Miljenovic

11:35 a.m.

Stephen Tetley writes:

...

What would your thoughts be on freezing FGL as it is and putting changes into a new package "FGL2" or "NewFGL"?

That's another possibility. However, I was planning on keeping the fundamental layout and design of FGL. I quite like and have used the inductive approach of graph construction/deconstruction in my own code; I just plan on updating/modernising the layout and design (after all, large all-in-one packages are _so_ pre-Hackage :p). Whilst freezing it is an option, I feel that this will lead to the same problems that we already face with mtl: most people agree/know that the approach/design is bad, but we keep using it because there's no (one) viable alternative to be used (and thus mtl stays in the Platform, which means more people use it, and thus we have this vicious cycle). By using the same name we can break this cycle: there is no implicit "loyalty" to an old package that people still use because its familiar (of course, this leaves it open to the problems with QuickCheck and Parsec, yet they are being resolved and more and more people are moving to QuickCheck-2 and Parsec-3).

...

The implementation technique for FGL is independently interesting; Martin Erwig expanded on it in other papers ('Metamorphic Programming') but no one else seems to have picked up on it. As I think there is more mileage in the idiom and hopefully someone will pick up on it at some point, it would be nice if it survives in the code.

Hmmm.... I haven't come across "Metamorphic Programming" before, will have to have a read through it (which isn't that appealing at the moment, since I'm in the middle of the literature review section of my PhD :s). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

27 Apr 27 Apr

1:59 p.m.

Ivan Lazar Miljenovic schrieb:

...

Since I've volunteered myself to help maintain/upgrade FGL, what do the people in the community want to see happen with it?

I was not happy with the way FGL handles lables so far: http://www.haskell.org/pipermail/libraries/2008-February/009241.html

Ivan Lazar Miljenovic

10:05 p.m.

Henning Thielemann writes:

...

I was not happy with the way FGL handles lables so far: http://www.haskell.org/pipermail/libraries/2008-February/009241.html

Not sure I follow what you want there: you want to remove the whole concept of labels and replace it with the node type? What about edge labels then? Also, what happens if I want a label to be a function? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

10:48 p.m.

Ivan Lazar Miljenovic schrieb:

...

Henning Thielemann writes:

...
I was not happy with the way FGL handles lables so far: http://www.haskell.org/pipermail/libraries/2008-February/009241.html

Not sure I follow what you want there: you want to remove the whole concept of labels and replace it with the node type? What about edge labels then?

You can label edges using a Map (i,i) label.

...

Also, what happens if I want a label to be a function?

I don't see a problem, functions can be elements of an Array or a Map.

Ivan Miljenovic

28 Apr 28 Apr

midnight

On 28 April 2010 08:48, Henning Thielemann wrote:

...

Ivan Lazar Miljenovic schrieb:

...
Henning Thielemann writes:

...
I was not happy with the way FGL handles lables so far: http://www.haskell.org/pipermail/libraries/2008-February/009241.html

Not sure I follow what you want there: you want to remove the whole concept of labels and replace it with the node type? What about edge labels then?

You can label edges using a Map (i,i) label.

So you don't want the labels to be part of the actual datatype? And for users to then have to deal with any labels they want themselves? If so, I don't think this is feasible; some of the nice parts of FGL IMHO are how it deals with labels (admittedly, I've had to write and use my own "((Int,a) -> a') -> g a b -> g a' b" function because it doesn't have one...). Removing this would be a step backwards. How exactly is it bad/a pain to have to deal with specifying "g () ()", especially since there are some pre-defined "unlabelled" graph type and function aliases? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

10:39 a.m.

Ivan Miljenovic schrieb:

...

So you don't want the labels to be part of the actual datatype? And for users to then have to deal with any labels they want themselves?

No, you would continue to provide labelled and unlabelled graphs, where unlabelled graphs (or just Graphs) are the base type and labelled graphs are data LabelledGraph node edge = LabelledGraph Graph (Map Node node) (Map (Node,Node) edge) This is a matter of separation of concerns. Sure, it means that you need to split the graph algorithms into their parts: each algorithm into an unlabelled and a labelled part. If there are algorithms that make no sense on labelled graphs, then you need only the first part.

...

If so, I don't think this is feasible; some of the nice parts of FGL IMHO are how it deals with labels (admittedly, I've had to write and use my own "((Int,a) -> a') -> g a b -> g a' b" function because it doesn't have one...). Removing this would be a step backwards.

How exactly is it bad/a pain to have to deal with specifying "g () ()", especially since there are some pre-defined "unlabelled" graph type and function aliases?

For problems that do not need labels, why shall I cope with them? I expect that you quickly run into the need for type extensions, if you define a graph type class that have only unlabelled graphs as instance. For instance:

...

instance SpecialGraph (gr () ()) where

is not Haskell 98, instead

...

instance (IsUnit a) => SpecialGraph (gr a a) where

class IsUnit a where toUnit :: a -> () instance IsUnit () where toUnit = id

would be Haskell 98, but is certainly more complicated. This may also answer your question, how hard you should try to stay Haskell 98. My experience is, that with a proper design of a library you can reduce the need for type extensions. This makes your code more portable and easier to understand.

Ivan Lazar Miljenovic

10:48 a.m.

Henning Thielemann writes:

...

Ivan Miljenovic schrieb:

...
So you don't want the labels to be part of the actual datatype? And for users to then have to deal with any labels they want themselves?

No, you would continue to provide labelled and unlabelled graphs, where unlabelled graphs (or just Graphs) are the base type and labelled graphs are

data LabelledGraph node edge = LabelledGraph Graph (Map Node node) (Map (Node,Node) edge)

This is a matter of separation of concerns. Sure, it means that you need to split the graph algorithms into their parts: each algorithm into an unlabelled and a labelled part. If there are algorithms that make no sense on labelled graphs, then you need only the first part.

I'm hesitant to do such a thing for the simple reason that it will involve duplicate work... It might be possible, by having two classes that do the same thing (Foo and FooLabelled), but I'm not sure how well this would scale.

...

...
If so, I don't think this is feasible; some of the nice parts of FGL IMHO are how it deals with labels (admittedly, I've had to write and use my own "((Int,a) -> a') -> g a b -> g a' b" function because it doesn't have one...). Removing this would be a step backwards.

How exactly is it bad/a pain to have to deal with specifying "g () ()", especially since there are some pre-defined "unlabelled" graph type and function aliases?

For problems that do not need labels, why shall I cope with them?

Ummm... I fail to see how having labels would make FGL harder to use just because you have to do "gr () ()" rather than just "gr" in your types.

...

I expect that you quickly run into the need for type extensions,

I was planning on using associated types to state what the node type was.

...

if you define a graph type class that have only unlabelled graphs as instance. For instance:

...
instance SpecialGraph (gr () ()) where

is not Haskell 98, instead

...
instance (IsUnit a) => SpecialGraph (gr a a) where

class IsUnit a where toUnit :: a -> () instance IsUnit () where toUnit = id

would be Haskell 98, but is certainly more complicated.

Not sure what you're doing here... But isn't that what newtypes are for?

...

This may also answer your question, how hard you should try to stay Haskell 98. My experience is, that with a proper design of a library you can reduce the need for type extensions. This makes your code more portable and easier to understand.

I don't plan on going overboard, but I am not going to go out of my way to avoid extensions (sane ones; there are no plans on using IncoherentInstances or something like that!). Yes, it means that it won't compile on a non-GHC Haskell implementation; but how many people actually use any other compiler full time? My understanding was that the other Haskell compilers still being worked on (JHC and UHC) didn't fully implement Haskell98 either. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

10 May 10 May

12:01 p.m.

On Wed, 28 Apr 2010, Ivan Miljenovic wrote:

...

So you don't want the labels to be part of the actual datatype? And for users to then have to deal with any labels they want themselves?

Recently I wrote cabal-sort using FGL http://hackage.haskell.org/package/cabal-sort It sorts cabal packages topologically according to their dependencies. However, I was neither happy with the way FGL currently works, nor with the way I proposed recently (splitting into unlabelled and labelled graphs). I like to use the package name as node identifier. I do not need any label, but I need a node type different from Int. With current FGL I need to maintain a Map PkgName Int. Would it be sensible to generalize the Node type to any Ord type or does FGL use optimizations specific to Int? In another example I used FGL for finding all topological orderings of a set of database transactions. In this case I used an enumeration type as node type. Are there other applications for alternative Node types?

Ivan Lazar Miljenovic

12:27 p.m.

Henning Thielemann writes:

...

On Wed, 28 Apr 2010, Ivan Miljenovic wrote:

...
So you don't want the labels to be part of the actual datatype? And for users to then have to deal with any labels they want themselves?

Recently I wrote cabal-sort using FGL http://hackage.haskell.org/package/cabal-sort

It sorts cabal packages topologically according to their dependencies. However, I was neither happy with the way FGL currently works,

In what sense?

...

nor with the way I proposed recently (splitting into unlabelled and labelled graphs). I like to use the package name as node identifier. I do not need any label, but I need a node type different from Int. With current FGL I need to maintain a Map PkgName Int. Would it be sensible to generalize the Node type to any Ord type or does FGL use optimizations specific to Int? In another example I used FGL for finding all topological orderings of a set of database transactions. In this case I used an enumeration type as node type. Are there other applications for alternative Node types?

We're considering doing this. Pros for allowing you to use a custom node type: * Matches your data better * No need for extra lookup maps when converting your data to FGL form Cons: * Makes type-sigs uglier/more verbose * Restricts the ability to optimise Using Int gives us a fixed-size data type with known good comparison performance and with a range that should suit most purposes. As an alternative, see how I have Graphalyze create an FGL graph from [n] and [(n,n,e)]: http://hackage.haskell.org/packages/archive/Graphalyze/0.9.0.0/doc/html/Data... We're also considering using MPTCs or ATs to let you place restrictions on the label types (when defining a new graph instance). However, before you again state how you don't want labels, consider this: * It's easier to make a labelled graph act as an unlabelled one than the other way around. * We don't want to implement two "types" of graphs (labelled and unlabelled) as that would involve duplicate work and there would be difficulty in switching between the two. * Graph labels provide you with a great deal of flexibility: how else would you do graph colouring (both vertex and edge)? Clustering? etc. The current "type Node = Int" alias is only there to provide a unique index type for referencing nodes; the actual Ints don't really represent the nodes IMHO, the labels do (in conjunction with the index for colouring, etc.). In your example case of using PkgName (I assume you mean PackageName or PackageIdentifier?), can you guarantee that each PkgName value is unique before you go blindly using it (what about having the same library installed in both the global and user database?)? It's a lot easier to do this kind of stuff (and do generic algorithms on FGL graphs) when you know what the type of the index type is. That's not to say that there aren't cases where having a generic type wouldn't be useful, especially for quick-and-dirty hacks. But in general, I think that using Int for the index type makes more sense (unless you have more vertices than the number of Int values allowed). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

1:10 p.m.

Ivan Lazar Miljenovic schrieb:

...

Pros for allowing you to use a custom node type: * Matches your data better * No need for extra lookup maps when converting your data to FGL form

Cons: * Makes type-sigs uglier/more verbose

Unlabelled graphs with custom node type would have only one type parameter. :-)

...

* Restricts the ability to optimise

That's my question. As far as I know the set of node identifiers in a graph is not contiguous, thus you cannot use Arrays but you must use IntMap or so.

...

Using Int gives us a fixed-size data type with known good comparison performance and with a range that should suit most purposes.

Replacing IntMap by Map is not much slower, is it?

...

As an alternative, see how I have Graphalyze create an FGL graph from [n] and [(n,n,e)]: http://hackage.haskell.org/packages/archive/Graphalyze/0.9.0.0/doc/html/Data...

We're also considering using MPTCs or ATs to let you place restrictions on the label types (when defining a new graph instance). However, before you again state how you don't want labels, consider this:

* It's easier to make a labelled graph act as an unlabelled one than the other way around.

Sure, it's the same argument that in MatLab everything is a complex-valued matrix. They even represent Bool by a 1x1 complex-valued matrix. Possible, but clean? From today's viewpoint separation of labelled and unlabelled graphs is additional work, but I'm afraid there will arise problems with this design. Unfortunately, I can't tell them today.

...

The current "type Node = Int" alias is only there to provide a unique index type for referencing nodes; the actual Ints don't really represent the nodes IMHO, the labels do (in conjunction with the index for colouring, etc.). In your example case of using PkgName (I assume you mean PackageName or PackageIdentifier?), can you guarantee that each PkgName value is unique before you go blindly using it (what about having the same library installed in both the global and user database?)?

Of course you must choose data as key, that is actually a key. But this is not the problem of FGL. And with Node = Int, I also have to choose a key for my dictionary (Map Something Node). In my case I consider packages that are to be installed, so no distinction between globally and locally installed packages. I want really the package name as key.

Ivan Lazar Miljenovic

1:25 p.m.

Henning Thielemann writes:

...

Ivan Lazar Miljenovic schrieb:

...
Pros for allowing you to use a custom node type: * Matches your data better * No need for extra lookup maps when converting your data to FGL form

Cons: * Makes type-sigs uglier/more verbose

Unlabelled graphs with custom node type would have only one type parameter. :-)

Not quite: we'd be using ATs, so you'd have to have functions that have something like "Show (Vertex g)" as a constraint in their type signatures.

...

...
* Restricts the ability to optimise

That's my question. As far as I know the set of node identifiers in a graph is not contiguous, thus you cannot use Arrays but you must use IntMap or so.

We're considering a vector-backed instance (we're not quite sure how this will cope with match and &, but using a rope approach might work).

...

...
Using Int gives us a fixed-size data type with known good comparison performance and with a range that should suit most purposes.

Replacing IntMap by Map is not much slower, is it?

IntMap is faster than (Map Int) (which is why PatriciaTree has better performance than Tree).

...

...
* It's easier to make a labelled graph act as an unlabelled one than the other way around.

Sure, it's the same argument that in MatLab everything is a complex-valued matrix. They even represent Bool by a 1x1 complex-valued matrix. Possible, but clean? From today's viewpoint separation of labelled and unlabelled graphs is additional work, but I'm afraid there will arise problems with this design. Unfortunately, I can't tell them today.

Well, we'd need to have duplicate classes, duplicate instances, duplicate graph operation functions, etc. And for what? To avoid typing "()" a few times?

...

...
The current "type Node = Int" alias is only there to provide a unique index type for referencing nodes; the actual Ints don't really represent the nodes IMHO, the labels do (in conjunction with the index for colouring, etc.). In your example case of using PkgName (I assume you mean PackageName or PackageIdentifier?), can you guarantee that each PkgName value is unique before you go blindly using it (what about having the same library installed in both the global and user database?)?

Of course you must choose data as key, that is actually a key. But this is not the problem of FGL. And with Node = Int, I also have to choose a key for my dictionary (Map Something Node). In my case I consider packages that are to be installed, so no distinction between globally and locally installed packages. I want really the package name as key.

So if I have "network-2.2.1.7" installed in both global package ID and user package ID, what happens? If I have two different versions of network installed, what happens? One of the things I'm considering doing with FGL is porting some of the stuff from Graphalyze over, such that the actual user-level stuff just considers the labels, and only the underlying functions deal with the vertex type. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

1:36 p.m.

Ivan Lazar Miljenovic wrote:

...

Henning Thielemann writes:

...
Recently I wrote cabal-sort using FGL http://hackage.haskell.org/package/cabal-sort

It sorts cabal packages topologically according to their dependencies. However, I was neither happy with the way FGL currently works, nor with the way I proposed recently (splitting into unlabelled and labelled graphs). I like to use the package name as node identifier. I do not need any label, but I need a node type different from Int. With current FGL I need to maintain a Map PkgName Int. Would it be sensible to generalize the Node type to any Ord type or does FGL use optimizations specific to Int? In another example I used FGL for finding all topological orderings of a set of database transactions. In this case I used an enumeration type as node type. Are there other applications for alternative Node types?

We're considering doing this.

Pros for allowing you to use a custom node type: * Matches your data better * No need for extra lookup maps when converting your data to FGL form

Cons: * Makes type-sigs uglier/more verbose * Restricts the ability to optimise

Using Int gives us a fixed-size data type with known good comparison performance and with a range that should suit most purposes.

I have the same gripe as Henning, though I'm not sure I concur with his proposal. Here a snippet from a quick & dirty 'make' implementation that I use for building my website: data Rule = Rule { ins :: [FilePath], out :: FilePath, effect :: IO () } rules2Graph :: [Rule] -> G.Gr (IO ()) () rules2Graph rules = G.mkGraph nodes' edges' where nodes = [(out r, conditionally (effect r) (ins r) (out r)) | r <- rules] edges = [(i, out r, ()) | r <- rules, i <- ins r, i `Map.member` nodeMap] -- ignore source nodes nodeMap = Map.fromList nodes index k = Map.findIndex k nodeMap nodes' = map (\(a,b) -> (index a, b)) nodes edges' = map (\(a,b,c) -> (index a, index b, c)) edges The nodes are file paths, labeled with a corresponding IO action to create the file. The nodes are created from a list of rules that specify how to create an output file from several input files. As you can see, I had to use Data.Map to convert file paths into node indexes. Ideally, the Data.Graph.Inductive.NodeMap module should be used for that, but after some frustration, I found it completely unsuitable for this task because it insists on using the graph label as node identifier. I am particularly unhappy about the definitions of nodes' and edges' , the clever use of Map.findIndex to translate indexes while keeping track of a label and the need for mapping the indexes myself. I'm not sure what the right solution is, but I think it definitely involves catering for different node types. For instance, the library could operate on a type newtype Graph node a b = Graph (Gr a b, Data.Map.Map Int node) or it could offer a more useful NodeMap type and make the Node type abstract. Some systematic and simple abstractions to manage nodes is needed anyway. Also, hard-coding Node = Int feels like the wrong kind of flexibility: the user is able to do anything with it, just in case the library forgot some important functionality. Which is exactly what happened in my case when I used Map.findIndex . I prefer the library to get it right. PS: While we're at it, I think newNodes should return an infinite list of Node instead of requiring a fixed number to be specified in advance? Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

1:50 p.m.

Heinrich Apfelmus writes:

...

I'm not sure what the right solution is, but I think it definitely involves catering for different node types. For instance, the library could operate on a type

newtype Graph node a b = Graph (Gr a b, Data.Map.Map Int node)

or it could offer a more useful NodeMap type and make the Node type abstract. Some systematic and simple abstractions to manage nodes is needed anyway.

As I said, we're considering using an Associated Type to let users choose what type they want to use (probably with a default Map instance for this). However, we'd recommend/push the Int-based one.

...

Also, hard-coding Node = Int feels like the wrong kind of flexibility: the user is able to do anything with it, just in case the library forgot some important functionality. Which is exactly what happened in my case when I used Map.findIndex . I prefer the library to get it right.

What do you mean by "the library forgot some important functionality"?

...

PS: While we're at it, I think newNodes should return an infinite list of Node instead of requiring a fixed number to be specified in advance?

Well, if we let the vertex type be _anything_ (that is an instance of Ord; we'll probably require that much at least, though maybe just Eq would make sense for list-based graphs), then how do we generate newNodes? Require Enum? Bounded? Really, performance aside, this is my biggest possible problem with generic label types is that it may make it harder to define various algorithms on graphs because you can no longer guarantee what you can do with the vertex types; as such people may resort to requring the vertex type to be Int or something to use a specific algorithm. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

1:58 p.m.

On Mon, 10 May 2010, Ivan Lazar Miljenovic wrote:

...

As I said, we're considering using an Associated Type to let users choose what type they want to use (probably with a default Map instance for this). However, we'd recommend/push the Int-based one.

I do not see why there is the need for any type extension, at all. Consider cabal-sort, a very basic program, that is Haskell-98 today, will no longer run in Hugs and JHC (untested so far) because it uses FGL's topological sort when FGL upgraded to associated types.

...

Well, if we let the vertex type be _anything_ (that is an instance of Ord; we'll probably require that much at least, though maybe just Eq would make sense for list-based graphs), then how do we generate newNodes? Require Enum? Bounded?

You might consider a new type class. I argued in the past, that using Ord as type class for Set and Map was not the best choice, but in order to stay consistent with the 'containers' package you may define a sub-class of Ord as class for node types.

...

Really, performance aside, this is my biggest possible problem with generic label types is that it may make it harder to define various algorithms on graphs because you can no longer guarantee what you can do with the vertex types; as such people may resort to requring the vertex type to be Int or something to use a specific algorithm.

Interesting, what graph algorithms rely on nodes being represented by Ints? Matrix based graph algorithms? But they have to compress the actual set of Node values in a graph to a sequence 0..n, too.

Ivan Lazar Miljenovic

2:09 p.m.

Henning Thielemann writes:

...

On Mon, 10 May 2010, Ivan Lazar Miljenovic wrote:

...
As I said, we're considering using an Associated Type to let users choose what type they want to use (probably with a default Map instance for this). However, we'd recommend/push the Int-based one.

I do not see why there is the need for any type extension, at all. Consider cabal-sort, a very basic program, that is Haskell-98 today, will no longer run in Hugs and JHC (untested so far) because it uses FGL's topological sort when FGL upgraded to associated types.

How should it be able to specify a fixed type value for the vertex type? We can't specify that g :: * -> * -> * -> * because the vertex type should _not_ be a parameter in that sense (since for many graphs it won't be, and we need some way of specifying what it is). And, to be honest, I don't really care about Hugs (JHC I do in the sense that it sounds cool but I haven't even downloaded the source yet let alone tried it).

...

...
Well, if we let the vertex type be _anything_ (that is an instance of Ord; we'll probably require that much at least, though maybe just Eq would make sense for list-based graphs), then how do we generate newNodes? Require Enum? Bounded?

You might consider a new type class. I argued in the past, that using Ord as type class for Set and Map was not the best choice, but in order to stay consistent with the 'containers' package you may define a sub-class of Ord as class for node types.

Huh? What's wrong with Ord? The only reason I said maybe Eq is in case someone stupidly decided to make [(a, [a])] a graph.

...

...
Really, performance aside, this is my biggest possible problem with generic label types is that it may make it harder to define various algorithms on graphs because you can no longer guarantee what you can do with the vertex types; as such people may resort to requring the vertex type to be Int or something to use a specific algorithm.

Interesting, what graph algorithms rely on nodes being represented by Ints? Matrix based graph algorithms? But they have to compress the actual set of Node values in a graph to a sequence 0..n, too.

None require Ints per-se; it's knowing what you _can_ do with the vertices that could be a problem. For example, I've used [1..] a few times when dealing with vertices; what should I replace that with? "enumFrom minBound" (thus requiring the vertex type to be an instance of Bounded and Enum)? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

2:16 p.m.

On Tue, 11 May 2010, Ivan Lazar Miljenovic wrote:

...

Henning Thielemann writes:

...
I do not see why there is the need for any type extension, at all. Consider cabal-sort, a very basic program, that is Haskell-98 today, will no longer run in Hugs and JHC (untested so far) because it uses FGL's topological sort when FGL upgraded to associated types.

How should it be able to specify a fixed type value for the vertex type? We can't specify that g :: * -> * -> * -> * because the vertex type should _not_ be a parameter in that sense (since for many graphs it won't be, and we need some way of specifying what it is).

If I understand you right, you say that explicit type parameters for labels are ok, because I can set them to () if not needed, but for an alternative node type you find an explicit type parameter inappropriate?

Ivan Miljenovic

11 May 11 May

12:55 a.m.

On 11 May 2010 00:16, Henning Thielemann wrote:

...

On Tue, 11 May 2010, Ivan Lazar Miljenovic wrote:

...
Henning Thielemann writes:

...
I do not see why there is the need for any type extension, at all. Consider cabal-sort, a very basic program, that is Haskell-98 today, will no longer run in Hugs and JHC (untested so far) because it uses FGL's topological sort when FGL upgraded to associated types.

How should it be able to specify a fixed type value for the vertex type? We can't specify that g :: * -> * -> * -> * because the vertex type should _not_ be a parameter in that sense (since for many graphs it won't be, and we need some way of specifying what it is).

If I understand you right, you say that explicit type parameters for labels are ok, because I can set them to () if not needed, but for an alternative node type you find an explicit type parameter inappropriate?

An explicit type parameter for the vertex type is not appropriate for this reason: you don't want to change it. If we had that g :: * -> * -> * -> *, then you'd have to explicitly carry around your vertex type with you everywhere for all functions with the hint that it might be possible to change (like the label types are). Now, when using a generic Map-based graph representation, this is unavoidable; but when using a custom type with a given fixed vertex type, it should be implicit what that vertex type is without having to carry it around and specify it in the type signature. By specifying it as an Associated Type when defining the instance, that type is accessible to functions that need it and can be ignored for those that don't. For labels, however, we _have_ to have them as type parameters to be able to have mapping functions (how else do you indicate that the type of the labels has changed?). What you seem to want is an explicit hierarchy of graphs where labels are an "extra". There are two (feasible) options that I see to this: 1) My so-far-mainly-vapourware generic graph class (see http://code.haskell.org/~ivanm/Graph.hs for a draft) has a base graph class that specifies what the label types are as ATs but allows you to fix them when defining an instance (you still need to set that the label type is (), but there will be convenience functions for use for labels of that type). This has the added advantage of you being able to treat Cabal's PackageIndex type as a graph in its own right (as it is). Currently we're taking a few ideas from this for FGL, but FGL will probably always require the double-label kind that it currently has (with the idea being that FGL is a "nice" wrapper around the set of classes). The main problem with this at the moment is that there's no real way to bridge the gap: the current type checker doesn't allow equality constraints in superclass contexts (so there's no way of stating that the VLabel type of a graph corresponds to its first type parameter, etc.). 2) Edward Kmett has some interesting notions in terms of _annotated_ graphs. I'm not sure if I follow exactly how it will work, but from what I understand it might be possible to add labels to a graph as an annotated extension. Whichever way ends up getting implemented and chosen, it will not affect FGL: its graphs will always require two type parameters for the labels (however, you _will_ be able to create a new instance which forces those labels to always be () if you so wish). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

12 May 12 May

9:33 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
I'm not sure what the right solution is, but I think it definitely involves catering for different node types. For instance, the library could operate on a type

newtype Graph node a b = Graph (Gr a b, Data.Map.Map Int node)

or it could offer a more useful NodeMap type and make the Node type abstract. Some systematic and simple abstractions to manage nodes is needed anyway.

As I said, we're considering using an Associated Type to let users choose what type they want to use (probably with a default Map instance for this). However, we'd recommend/push the Int-based one.

For my make example, I prefer a plain parameter because it would be too verbose to define a new class that has FilePaths as node types. I'd use the Int instead, but this defeats the point: why offer flexible node types when it's too much of a burden to use them?

...

An explicit type parameter for the vertex type is not appropriate for this reason: you don't want to change it.

It's true that I don't want to change the node type, but I want to curry it. If I don't feel like writing out the node type every time, I can use a type synonym: type MyGraph a b = Graph FilePath a b Graphs with different node types don't behave differently; graphs are parametric with respect to the node type, just like lists don't behave differently on different element types.

...

...
Also, hard-coding Node = Int feels like the wrong kind of flexibility: the user is able to do anything with it, just in case the library forgot some important functionality. Which is exactly what happened in my case when I used Map.findIndex . I prefer the library to get it right.

What do you mean by "the library forgot some important functionality"?

Ah, I'm comparing Node = Int to a Node type that is entirely abstract. After all, conceptually, Node is not an integer, it's a unique identifier. If Node is abstract, then you will probably miss things like the [1..] that you mentioned. Nonetheless, I would like to see Node to become abstract. This means that the graph library should include a library that deals with unique identifiers Node . The [1..] pattern would correspond to a function freshNodes :: () -> [Node]

...

...
PS: While we're at it, I think newNodes should return an infinite list of Node instead of requiring a fixed number to be specified in advance?

Well, if we let the vertex type be _anything_ (that is an instance of Ord; we'll probably require that much at least, though maybe just Eq would make sense for list-based graphs), then how do we generate newNodes? Require Enum? Bounded?

Ah, that suggestion was for Node = Int or Node = abstract . If the library user uses his own node type, it's him who is responsible for allocating new Nodes .

...

Really, performance aside, this is my biggest possible problem with generic label types is that it may make it harder to define various algorithms on graphs because you can no longer guarantee what you can do with the vertex types; as such people may resort to requring the vertex type to be Int or something to use a specific algorithm.

Ah, you mean algorithms that create new nodes on the fly? I don't think that Node = Int works for them either, because the user might have his own ideas about which Ints can appear as graph vertexes. For instance, he might only use even numbers to denote vertexes and will be surprised by a library algorithm that suddenly creates odd vertexes. In short, Node needs to be abstract for that. Other than that, I don't see much of a difference between custom vertex types and Int . Internally, you can always use Int to reference nodes and keep the association between the custom vertex type and Int in a separate map, like this data Graph node a b = Graph { internal :: Gr a b , nodes :: Map node a } Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

10:08 a.m.

Heinrich Apfelmus writes:

...

Ivan Lazar Miljenovic wrote:

...
As I said, we're considering using an Associated Type to let users choose what type they want to use (probably with a default Map instance for this). However, we'd recommend/push the Int-based one.

For my make example, I prefer a plain parameter because it would be too verbose to define a new class that has FilePaths as node types. I'd use the Int instead, but this defeats the point: why offer flexible node types when it's too much of a burden to use them?

The point is so that you can define new graph types with that node type.

...

...
An explicit type parameter for the vertex type is not appropriate for this reason: you don't want to change it.

It's true that I don't want to change the node type, but I want to curry it. If I don't feel like writing out the node type every time, I can use a type synonym:

type MyGraph a b = Graph FilePath a b

Graphs with different node types don't behave differently; graphs are parametric with respect to the node type, just like lists don't behave differently on different element types.

There will be a Map-based graph available that will have the node type parameter, but the variant that's currently called PatriciaTree will most likely be the preferred default one (as it will have better performance due to its use of IntMap). We can't require the class to have the vertex type as a type parameter for when we want a graph (such as PatriciaTree) _with_ a fixed vertex type.

...

...
...
Also, hard-coding Node = Int feels like the wrong kind of flexibility: the user is able to do anything with it, just in case the library forgot some important functionality. Which is exactly what happened in my case when I used Map.findIndex . I prefer the library to get it right.

What do you mean by "the library forgot some important functionality"?

Ah, I'm comparing Node = Int to a Node type that is entirely abstract. After all, conceptually, Node is not an integer, it's a unique identifier.

If Node is abstract, then you will probably miss things like the [1..] that you mentioned.

Nonetheless, I would like to see Node to become abstract. This means that the graph library should include a library that deals with unique identifiers Node . The [1..] pattern would correspond to a function

freshNodes :: () -> [Node]

Well, I don't use [1..] to get new nodes, I use them to map over nodes (depending on how I construct the graph) or to create a new graph completely from scratch. Also, that type signature doesn't make sense; something like "freshNodes :: (Graph g) => g -> [Vertex g]" might, but the problem with a generic node type is that its not really possible to do such a thing in general; AFAICT the full type signature will need to be: freshNodes :: ( Graph g, Enum (Vertex g), Bounded (Vertext g) , Ord (Vertex g)) => g -> [Vertex g] where the Bounded is needed for empty graphs (i.e. use minBound); if the graph isn't empty, then take the maximum (hence Ord). This of course will be tricky to implement for the general case (if we have vertices [1,2,4], should it return 3? What happens about overflows?). If we define such a function, it will most likely not be part of the class definition since we wouldn't want to put these restrictions on the vertex type (since I might want Integer as my vertex type, which isn't an instance of Bounded; maybe having an unsafe variant of this that assumes the graph is non-empty makes sense).

...

...
Really, performance aside, this is my biggest possible problem with generic label types is that it may make it harder to define various algorithms on graphs because you can no longer guarantee what you can do with the vertex types; as such people may resort to requring the vertex type to be Int or something to use a specific algorithm.

Ah, you mean algorithms that create new nodes on the fly? I don't think that Node = Int works for them either, because the user might have his own ideas about which Ints can appear as graph vertexes. For instance, he might only use even numbers to denote vertexes and will be surprised by a library algorithm that suddenly creates odd vertexes. In short, Node needs to be abstract for that.

Actually, I've looked through my code and it appears that (apart from verboseness), there won't be too much of a problem with removing the assumption of vertex type == Int. However, I can't see any reason why someone would only want to use even Int values. As I think I've said before (I've been making these arguments in various threads and discussions, so I'm not sure if I've said it here): the vertex type is just an _index_ to ensure consistency, etc; it is _not_ IMHO meant to represent the actual data: that's what the labels are for.

...

Other than that, I don't see much of a difference between custom vertex types and Int . Internally, you can always use Int to reference nodes and keep the association between the custom vertex type and Int in a separate map, like this

data Graph node a b = Graph { internal :: Gr a b , nodes :: Map node a }

Custom vertex types will work; it's just that using Int will probably prove to be in general more efficient and easier to use. I haven't said we'll disallow custom vertex types, but I don't plan on going on the extreme of having optional labels, or of making the vertex type a type parameter for all graphs (since as I've said, you don't/can't always want/assume that). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

13 May 13 May

7:09 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
Graphs with different node types don't behave differently; graphs are parametric with respect to the node type, just like lists don't behave differently on different element types.

There will be a Map-based graph available that will have the node type parameter, but the variant that's currently called PatriciaTree will most likely be the preferred default one (as it will have better performance due to its use of IntMap).

We can't require the class to have the vertex type as a type parameter for when we want a graph (such as PatriciaTree) _with_ a fixed vertex type.

Ah, ok, you want graphs that only work with one node type. If there is at most one such graph for each node type, you could make a data type family and retain the parameter, though data family Graph node :: * -> * data family Graph Int a b = PatriciaTree a b data family Graph node a b = GenericTree But it seems that this doesn't work because the cases overlap.

...

Actually, I've looked through my code and it appears that (apart from verboseness), there won't be too much of a problem with removing the assumption of vertex type == Int.

However, I can't see any reason why someone would only want to use even Int values. As I think I've said before (I've been making these arguments in various threads and discussions, so I'm not sure if I've said it here): the vertex type is just an _index_ to ensure consistency, etc; it is _not_ IMHO meant to represent the actual data: that's what the labels are for.

Yes, the integers are just indexes. Of course, the example with the even integers is a bit silly; but if the integers are actually indexes, then it's conceptually cleaner to make them abstract, i.e. data Node -- constructors are not exported and provide combinators to operate on these abstract indexes, including a corresponding Data.Graph.Inductive.NodeMap module. I'd like to see such an abstract Node type, because then the library will provide all operations I need. It took me some time to figure out how to best use Int as indexes in my example code; an abstract Node type and a good NodeMap module would have made my life much easier.

...

...
Other than that, I don't see much of a difference between custom vertex types and Int . Internally, you can always use Int to reference nodes and keep the association between the custom vertex type and Int in a separate map, like this

data Graph node a b = Graph { internal :: Gr a b , nodes :: Map node a }

Custom vertex types will work; it's just that using Int will probably prove to be in general more efficient and easier to use. I haven't said we'll disallow custom vertex types, but I don't plan on going on the extreme of having optional labels, or of making the vertex type a type parameter for all graphs (since as I've said, you don't/can't always want/assume that).

Darn, I meant data Graph node a b = Graph { internal :: Graph Int a b, nodes :: Map Int a } The idea is to use Ints internally and only store a loose association to the custom vertex type. In particular, no Map a Int is required, only from Int to a . Now, I realize that the other way round is required as well for querying the context of a node in a graph. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Miljenovic

7:20 a.m.

On 13 May 2010 17:09, Heinrich Apfelmus wrote:

...

Ah, ok, you want graphs that only work with one node type. If there is at most one such graph for each node type, you could make a data type family and retain the parameter, though

data family Graph node :: * -> * data family Graph Int a b = PatriciaTree a b data family Graph node a b = GenericTree

But it seems that this doesn't work because the cases overlap.

Yup: PatriciaTree, VectorGraph, etc.

...

...
However, I can't see any reason why someone would only want to use even Int values. As I think I've said before (I've been making these arguments in various threads and discussions, so I'm not sure if I've said it here): the vertex type is just an _index_ to ensure consistency, etc; it is _not_ IMHO meant to represent the actual data: that's what the labels are for.

Yes, the integers are just indexes. Of course, the example with the even integers is a bit silly; but if the integers are actually indexes, then it's conceptually cleaner to make them abstract, i.e.

data Node -- constructors are not exported

and provide combinators to operate on these abstract indexes, including a corresponding Data.Graph.Inductive.NodeMap module.

I'd like to see such an abstract Node type, because then the library will provide all operations I need. It took me some time to figure out how to best use Int as indexes in my example code; an abstract Node type and a good NodeMap module would have made my life much easier.

I'm not sure I understand what you're saying here: first you said you wanted to be able to specify a vertex type, now you're saying that you don't want to know what the vertex type even is (except that it's some abstract Node type)? Whilst this would make graph usage safer/more robust, this seems to contradict your earlier arguments...

...

Darn, I meant

data Graph node a b = Graph { internal :: Graph Int a b, nodes :: Map Int a }

The idea is to use Ints internally and only store a loose association to the custom vertex type. In particular, no Map a Int is required, only from Int to a . Now, I realize that the other way round is required as well for querying the context of a node in a graph.

What's the point of that useless node type parameter then? And how does the nodes map differ from just getting the graph label? -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

14 May 14 May

7:35 a.m.

Ivan Miljenovic wrote:

...

Heinrich Apfelmus wrote:

...
Yes, the integers are just indexes. Of course, the example with the even integers is a bit silly; but if the integers are actually indexes, then it's conceptually cleaner to make them abstract, i.e.

data Node -- constructors are not exported

and provide combinators to operate on these abstract indexes, including a corresponding Data.Graph.Inductive.NodeMap module.

I'd like to see such an abstract Node type, because then the library will provide all operations I need. It took me some time to figure out how to best use Int as indexes in my example code; an abstract Node type and a good NodeMap module would have made my life much easier.

I'm not sure I understand what you're saying here: first you said you wanted to be able to specify a vertex type, now you're saying that you don't want to know what the vertex type even is (except that it's some abstract Node type)? Whilst this would make graph usage safer/more robust, this seems to contradict your earlier arguments...

I'd be happy with either one. :) In both cases, I want to specify a custom vertex type. I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example. Or I get an abstract Node type and the library provides just the right functions that make it easy to manage a custom vertex type myself. I had hoped that the Data.Graph.Inductive.NodeMap module provides this, which it doesn't. In other words, the abstractness of Node forces the library to provide a well-designed set of functions to work with them, and that's what I'm after. In my make example, I spent the most time thinking about how to manage the Int nodes, finally settling with Data.Map.findIndex , and I prefer the library to think about that for me.

...

...
Darn, I meant

data Graph node a b = Graph { internal :: Graph Int a b, nodes :: Map Int a }

The idea is to use Ints internally and only store a loose association to the custom vertex type. In particular, no Map a Int is required, only from Int to a . Now, I realize that the other way round is required as well for querying the context of a node in a graph.

What's the point of that useless node type parameter then? And how does the nodes map differ from just getting the graph label?

You're right, I now realize that this design doesn't work. But you asked for wishes, so I wished for something. ;) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

7:50 a.m.

Heinrich Apfelmus writes:

...

I'd be happy with either one. :) In both cases, I want to specify a custom vertex type.

Except an abstract type isn't a custom vertex type...

...

I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example.

Why?

...

Or I get an abstract Node type and the library provides just the right functions that make it easy to manage a custom vertex type myself. I had hoped that the Data.Graph.Inductive.NodeMap module provides this, which it doesn't.

Not sure I follow what you're saying here; then again, my graph stuff has typically been to create the graph and then do stuff to it _as_ a graph (and not wanting/needing to get a specific node based upon its label, etc.).

...

You're right, I now realize that this design doesn't work. But you asked for wishes, so I wished for something. ;)

Heh, fair enough, except I didn't ask for wishes, just what people would like to see ;-) -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

15 May 15 May

8:19 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
I'd be happy with either one. :) In both cases, I want to specify a custom vertex type.

Except an abstract type isn't a custom vertex type...

...
I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example.

Why?

I was under the impression that I would have to define a new graph data type with FilePath as vertex type and make that an instance of Graph ? In that case, it would be much shorter for me to stick with the clumsy nodeMap = Map.fromList nodes index k = Map.findIndex k nodeMap nodes' = map (\(a,b) -> (index a, b)) nodes edges' = map (\(a,b,c) -> (index a, index b, c)) edges

...

...
Or I get an abstract Node type and the library provides just the right functions that make it easy to manage a custom vertex type myself. I had hoped that the Data.Graph.Inductive.NodeMap module provides this, which it doesn't.

Not sure I follow what you're saying here; then again, my graph stuff has typically been to create the graph and then do stuff to it _as_ a graph (and not wanting/needing to get a specific node based upon its label, etc.).

In the make example, I didn't need to get a node based on its label either. But the graph was a graph of FilePaths and I still have to implement an association between that and Int . (In fact, I don't know of any graph whose nodes are unique integers conceptually.) In other words, I have to make sure that every FilePath is mapped to a unique integer which I can then glue into a graph. This is not hard to do with a Data.Map and the four lines of code above do exactly that. However, I still had to think about it and it took me way too long to come up with these four lines. What I would like to see is that the *library* has thought about that for me already. A good way to ensure that the library has thought about that is to make the Node type abstract. This way, the library has to provide the functionality to create and manage nodes, which I would otherwise cobble together on my own by messing with Int . One possibility is to offer a function mkGraph :: Ord node => [(node,a)] -> [(node,node,b)] -> (Gr a b, NodeMap node) that accepts a custom vertex type, creates all the necessary unique integers internally and also returns an association between the newly created Nodes and the custom node in case I want to refer to the nodes in the graph with the custom type. The NodeMap type - while implemented as a Data.Map - is abstract as well and has the primitives empty :: NodeMap n insert :: n -> NodeMap n -> NodeMap n lookup :: n -> NodeMap n -> Maybe Node lookupNode :: Node -> NodeMap n -> Maybe n delete :: Node -> NodeMap n -> NodeMap n This is all you need to manage the association node <-> Node . In particular, insert creates new Nodes on the fly. And since the nodes in the graph and the NodeMap will usually come in pairs, we can as well give the pair a new name: type Graph node a b = (Gr a b, NodeMap node) To summarize: an abstract Node type relieves me from thinking about the association between my conceptual node type and unique identifiers. I'd be happy with anything along these lines, the interface above is just a suggestion. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

16 May 16 May

5:58 a.m.

Heinrich Apfelmus writes:

...

Ivan Lazar Miljenovic wrote:

...
Heinrich Apfelmus writes:

...
I'd be happy with either one. :) In both cases, I want to specify a custom vertex type.

Except an abstract type isn't a custom vertex type...

...
I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example.

Why?

I was under the impression that I would have to define a new graph data type with FilePath as vertex type and make that an instance of Graph ? In that case, it would be much shorter for me to stick with the clumsy

nodeMap = Map.fromList nodes index k = Map.findIndex k nodeMap nodes' = map (\(a,b) -> (index a, b)) nodes edges' = map (\(a,b,c) -> (index a, index b, c)) edges

Well, we'll provide a Map-based one that lets you specify the vertex type as a type parameter; this functionality (type parameter being ued for the vertex type) won't be required since not all graphs will be able to chop and choose which vertex type to use.

...

In the make example, I didn't need to get a node based on its label either. But the graph was a graph of FilePaths and I still have to implement an association between that and Int . (In fact, I don't know of any graph whose nodes are unique integers conceptually.)

In other words, I have to make sure that every FilePath is mapped to a unique integer which I can then glue into a graph. This is not hard to do with a Data.Map and the four lines of code above do exactly that. However, I still had to think about it and it took me way too long to come up with these four lines. What I would like to see is that the *library* has thought about that for me already.

Right; I'm going to look at merging this import-like functionality from Graphalyze (which already does this).

...

[snip]

To summarize: an abstract Node type relieves me from thinking about the association between my conceptual node type and unique identifiers. I'd be happy with anything along these lines, the interface above is just a suggestion.

Well, you can consider the current Int vertex type to be a (not-so-) abstract vertex type; it might be an idea to wrap this up so people don't mess with it themselves though. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

17 May 17 May

7:51 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
I was under the impression that I would have to define a new graph data type with FilePath as vertex type and make that an instance of Graph ? [..]

Well, we'll provide a Map-based one that lets you specify the vertex type as a type parameter; this functionality (type parameter being ued for the vertex type) won't be required since not all graphs will be able to chop and choose which vertex type to use.

Ah, that would indeed make it easy. Maybe make the map based graph generic, so that it can be used with any primitive graph implementation? data VertexGraph node gr a b = VG (gr a b) (BiMap node (Vertex gr)) instance (Ord node, Graph gr) => Graph VertexGraph node gr where But this is probably not worth the hassle because if you really want performance, you shouldn't choose another gr but rather make your own instance with a custom map. I think it's fine to supply a default choice for gr .

...

...
In the make example, I didn't need to get a node based on its label either. But the graph was a graph of FilePaths and I still have to implement an association between that and Int . (In fact, I don't know of any graph whose nodes are unique integers conceptually.)

In other words, I have to make sure that every FilePath is mapped to a unique integer which I can then glue into a graph. This is not hard to do with a Data.Map and the four lines of code above do exactly that. However, I still had to think about it and it took me way too long to come up with these four lines. What I would like to see is that the *library* has thought about that for me already.

Right; I'm going to look at merging this import-like functionality from Graphalyze (which already does this).

...
[snip]

To summarize: an abstract Node type relieves me from thinking about the association between my conceptual node type and unique identifiers. I'd be happy with anything along these lines, the interface above is just a suggestion.

Well, you can consider the current Int vertex type to be a (not-so-) abstract vertex type; it might be an idea to wrap this up so people don't mess with it themselves though.

That would be much appreciated. :) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

9:06 a.m.

Heinrich Apfelmus writes:

...

Ivan Lazar Miljenovic wrote:

...
Heinrich Apfelmus writes:

...
I was under the impression that I would have to define a new graph data type with FilePath as vertex type and make that an instance of Graph ? [..]

Well, we'll provide a Map-based one that lets you specify the vertex type as a type parameter; this functionality (type parameter being ued for the vertex type) won't be required since not all graphs will be able to chop and choose which vertex type to use.

Ah, that would indeed make it easy. Maybe make the map based graph generic, so that it can be used with any primitive graph implementation?

data VertexGraph node gr a b = VG (gr a b) (BiMap node (Vertex gr))

instance (Ord node, Graph gr) => Graph VertexGraph node gr where

Not sure I understand what the point of the `gr' is there; what I meant was something like: newtype MapGraph node a b = MG (Map node (Map node b, a, Map node b))

...

...
Well, you can consider the current Int vertex type to be a (not-so-) abstract vertex type; it might be an idea to wrap this up so people don't mess with it themselves though.

That would be much appreciated. :)

We'll consider this then. (*sigh* it's not enough that we're modernising this library, but we also have to hold the users hand for them in case they accidentally use the wrong Int? Gimme a break! :p) Actually, this might solve a problem that Thomas and I were discussing earlier where a Context can refer to vertices not in the graph (it won't completely remove the problem as you can still get the value from a different state of the graph, but will help remove obvious user error when manually constructing Context values). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

18 May 18 May

9:07 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
Ivan Lazar Miljenovic wrote:

...
Well, we'll provide a Map-based one that lets you specify the vertex type as a type parameter; this functionality (type parameter being ued for the vertex type) won't be required since not all graphs will be able to chop and choose which vertex type to use.

Ah, that would indeed make it easy. Maybe make the map based graph generic, so that it can be used with any primitive graph implementation?

data VertexGraph node gr a b = VG (gr a b) (BiMap node (Vertex gr))

instance (Ord node, Graph gr) => Graph VertexGraph node gr where

Not sure I understand what the point of the `gr' is there; what I meant was something like:

newtype MapGraph node a b = MG (Map node (Map node b, a, Map node b))

Yes; what I mean is that you can retrofit a custom vertex type to any graph implementation that uses a fixed vertex type. So, let's say that data Gr a b = .. -- graph with vertex type Vertex Gr = Int then type Gr' node a b = CustomVertex node Gr a b data CustomVertex node gr a b = CV (gr a b) (Map node (Vertex gr)) is a graph with custom vertex type node . Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

9:17 a.m.

Heinrich Apfelmus writes:

...

Yes; what I mean is that you can retrofit a custom vertex type to any graph implementation that uses a fixed vertex type. So, let's say that

data Gr a b = .. -- graph with vertex type Vertex Gr = Int

then

type Gr' node a b = CustomVertex node Gr a b

data CustomVertex node gr a b = CV (gr a b) (Map node (Vertex gr))

is a graph with custom vertex type node .

Sounds like it's more complicated than it's worth tbh ;-) -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

19 May 19 May

8:02 a.m.

Ivan Lazar Miljenovic wrote:

...

Heinrich Apfelmus writes:

...
Yes; what I mean is that you can retrofit a custom vertex type to any graph implementation that uses a fixed vertex type. So, let's say that

data Gr a b = .. -- graph with vertex type Vertex Gr = Int

then

type Gr' node a b = CustomVertex node Gr a b

data CustomVertex node gr a b = CV (gr a b) (Map node (Vertex gr))

is a graph with custom vertex type node .

Sounds like it's more complicated than it's worth tbh ;-)

Yup. ;) Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Henning Thielemann

14 May 14 May

8:24 a.m.

Heinrich Apfelmus schrieb:

...

Ivan Miljenovic wrote:

...

...
I'm not sure I understand what you're saying here: first you said you wanted to be able to specify a vertex type, now you're saying that you don't want to know what the vertex type even is (except that it's some abstract Node type)? Whilst this would make graph usage safer/more robust, this seems to contradict your earlier arguments...

I'd be happy with either one. :) In both cases, I want to specify a custom vertex type.

I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example.

Or I get an abstract Node type and the library provides just the right functions that make it easy to manage a custom vertex type myself. I had hoped that the Data.Graph.Inductive.NodeMap module provides this, which it doesn't.

In other words, the abstractness of Node forces the library to provide a well-designed set of functions to work with them, and that's what I'm after. In my make example, I spent the most time thinking about how to manage the Int nodes, finally settling with Data.Map.findIndex , and I prefer the library to think about that for me.

Full acknowledge!

Ivan Lazar Miljenovic

8:37 a.m.

Henning Thielemann writes:

...

Heinrich Apfelmus schrieb:

...
Ivan Miljenovic wrote:

...
...
I'm not sure I understand what you're saying here: first you said you wanted to be able to specify a vertex type, now you're saying that you don't want to know what the vertex type even is (except that it's some abstract Node type)? Whilst this would make graph usage safer/more robust, this seems to contradict your earlier arguments...

I'd be happy with either one. :) In both cases, I want to specify a custom vertex type.

I can either do that directly if the library permits, though I think the solution with associated types is too cumbersome to be useful for my make example.

Or I get an abstract Node type and the library provides just the right functions that make it easy to manage a custom vertex type myself. I had hoped that the Data.Graph.Inductive.NodeMap module provides this, which it doesn't.

In other words, the abstractness of Node forces the library to provide a well-designed set of functions to work with them, and that's what I'm after. In my make example, I spent the most time thinking about how to manage the Int nodes, finally settling with Data.Map.findIndex , and I prefer the library to think about that for me.

Full acknowledge!

I have no idea what this is meant to mean... -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

13 May 13 May

8:14 a.m.

Heinrich Apfelmus schrieb:

...

Yes, the integers are just indexes. Of course, the example with the even integers is a bit silly;

... might be useful for bipartite graphs

Ivan Miljenovic

8:18 a.m.

On 13 May 2010 18:14, Henning Thielemann wrote:

...

Heinrich Apfelmus schrieb:

...
Yes, the integers are just indexes. Of course, the example with the even integers is a bit silly;

... might be useful for bipartite graphs

So, a K_{0,n} bipartite graph? :p -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

10 May 10 May

1:51 p.m.

On Mon, 10 May 2010, Heinrich Apfelmus wrote:

...

The nodes are file paths, labeled with a corresponding IO action to create the file. The nodes are created from a list of rules that specify how to create an output file from several input files.

That is, in principle you could also use an unlabelled graph with FilePath as node type and you could manage a (Map FilePath (IO ())) yourselve and FGL does even not know about its existence.

Ivan Lazar Miljenovic

2:02 p.m.

Henning Thielemann writes:

...

On Mon, 10 May 2010, Heinrich Apfelmus wrote:

...
The nodes are file paths, labeled with a corresponding IO action to create the file. The nodes are created from a list of rules that specify how to create an output file from several input files.

That is, in principle you could also use an unlabelled graph with FilePath as node type and you could manage a (Map FilePath (IO ())) yourselve and FGL does even not know about its existence.

Yes, but why? That's part of the convenience of FGL: the labels are handled _for you_. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

2:08 p.m.

On Tue, 11 May 2010, Ivan Lazar Miljenovic wrote:

...

Henning Thielemann writes:

...
That is, in principle you could also use an unlabelled graph with FilePath as node type and you could manage a (Map FilePath (IO ())) yourselve and FGL does even not know about its existence.

Yes, but why? That's part of the convenience of FGL: the labels are handled _for you_.

Because looking up the Map is already very convenient. Why shall I go via the graph? In the Make example, the graph represents relations between files. It is not important what particular shell commands must be run for generating one file from other ones. Why bothering the topological sort algorithm with that information then?

Ivan Miljenovic

2:17 p.m.

On 11 May 2010 00:08, Henning Thielemann wrote:

...

Because looking up the Map is already very convenient. Why shall I go via the graph? In the Make example, the graph represents relations between files. It is not important what particular shell commands must be run for generating one file from other ones. Why bothering the topological sort algorithm with that information then?

You're splitting apart related data into _three_ different data structures (the graph, vertex labels and edge labels)? _That_ doesn't make sense. For starters, you have to ensure you keep them in sync (which FGL will already do for you). As for the Make example; who said you had to put the shell commands as labels? But I think that a label that indicates which cluster a particular node belongs to makes sense to be part of the graph itself rather than in an auxiliary data structure. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Henning Thielemann

2:22 p.m.

On Tue, 11 May 2010, Ivan Miljenovic wrote:

...

You're splitting apart related data into _three_ different data structures (the graph, vertex labels and edge labels)? _That_ doesn't make sense.

There are no edge labels, only vertex labels. And yes, I find separation of data structures for separation of concerns a good strategy.

Ivan Miljenovic

11 May 11 May

12:42 a.m.

On 11 May 2010 00:22, Henning Thielemann wrote:

...

On Tue, 11 May 2010, Ivan Miljenovic wrote:

...
You're splitting apart related data into _three_ different data structures (the graph, vertex labels and edge labels)? _That_ doesn't make sense.

There are no edge labels, only vertex labels. And yes, I find separation of data structures for separation of concerns a good strategy.

Think about it this way: you want to delete a vertex. With your approach, you have to do the following: 1) Delete it from the graph (which takes care of deleting any edges incident with that vertex) 2) Delete it from your vertex-label mapping 3) If you also have edge labels, delete it from your edge-label mapping for all edges incident with that vertex. As such, you have to do a lot of bookkeeping to keep your graph + associated mappings in a consistent state which FGL currently provides for you _for free_. These mappings are an integral part _of_ the graph, not just something extra that tags along. Keeping them all together helps avoid an inconsistent state. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Heinrich Apfelmus

12 May 12 May

9:40 a.m.

Henning Thielemann wrote:

...

Ivan Miljenovic wrote:

...
You're splitting apart related data into _three_ different data structures (the graph, vertex labels and edge labels)? _That_ doesn't make sense.

There are no edge labels, only vertex labels. And yes, I find separation of data structures for separation of concerns a good strategy.

It appears to me that the concerns of labels and vertexes are not separate enough. After all, the point is that they have to be kept in sync. Keeping them in sync should be the business of the graph library, not of the user. It doesn't have to be baked into the graph type, though, an abstract Node type might work as well. Regards, Heinrich Apfelmus -- http://apfelmus.nfshost.com

Ivan Lazar Miljenovic

10:16 a.m.

Heinrich Apfelmus writes:

...

It appears to me that the concerns of labels and vertexes are not separate enough. After all, the point is that they have to be kept in sync. Keeping them in sync should be the business of the graph library, not of the user. It doesn't have to be baked into the graph type, though, an abstract Node type might work as well.

An abstract vertex type is orthoganal to the label type (though it took me a while to convince Thomas of that; he wanted to somehow embed the vertex type inside the label :s). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

5521

Age (days ago)

5544

Last active (days ago)

List overview

Download

48 comments

9 participants

participants (9)

Ben Franksen
Christian Maeder
Heinrich Apfelmus
Henning Thielemann
Henning Thielemann
Ivan Lazar Miljenovic
Ivan Miljenovic
Neil Brown
Stephen Tetley