Re: [Haskell-cafe] Graph diagram tools?

Wow, you really have resurrected an old thread!
On 17 April 2015 at 03:12, Ivan Zakharyaschev
Hi,
I have some feedback on the monadic API of the graphviz library (resulting from my explorations written down at http://mathoverflow.net/a/203099/13991 ).
Le jeudi 23 juin 2011 04:38:21 UTC+4, Ivan Lazar Miljenovic a écrit :
On 23 June 2011 02:48, Stephen Tetley wrote:
Or Andy Gill's Dotgen - simple and stable:
Within the next month, I should hopefully finally finish the new version of graphviz. Various improvements include:
As such, I would greatly appreciate knowing what it is that makes you
want to use a different library (admittedly the graphviz API isn't as stable as the others, but that's because I keep trying to improve it, and typically state in the Changelog exactly what has changed).
### graphviz Haskell library and other ones
An alternative to the graphviz Haskell package, mentioned on [haskell-cafe](https://groups.google.com/d/msg/haskell-cafe/ZfZaw2E9a18/xZ0OeHCGzVgJ), is [dotgen](http://hackage.haskell.org/package/dotgen).
In [a follow-up](https://groups.google.com/d/msg/haskell-cafe/ZfZaw2E9a18/9P-dazcd0FsJ) to the post mentioning `dotgen`, the author of graphviz gives some comparison between them (and other similar Haskell libs). I assume his "plans" (about a monadic interface) have been implemented already:
Yup: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T...
Within the next month, I should hopefully finally finish the new version of graphviz. Various improvements include:
...
* A Dot graph representation based loosely upon **dotgen**'s monadic interface (with Andy's blessing) but with the various Attributes being used rather than (String, String). I think I'm going to be able to make it such that you can define a graph using the monadic interface that will almost look identical to actual Dot code.
...
I would like to stress to people considering using other bindings to Graphviz/Dot (such as **dotgen**, language-dot, or their own cobbled-together interface): be very careful about quoting, etc. I have spent a _lot_ of time checking how to properly escape different values and ensuring correctness under the hood (i.e. there is no need to pre-escape your Text/String values; graphviz will do that for you when generating the actual Dot code). This, after all, is the point of having existing libraries rather than rolling your own each time.
Both points are related. (So graphviz's monadic interface is a safer improvement upon dotgen's.)
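To make the quoting concern concrete, here is a minimal sketch of what a binding has to do before emitting a string value. It relies on the rule from the Dot language documentation that, inside a double-quoted string, only the double quote itself is escaped. `quoteDotId` and `renderAttr` are hypothetical helpers written for this illustration; they are not part of graphviz or dotgen, and graphviz's own escaping covers far more cases than this.

```haskell
-- Hypothetical helper (not a graphviz or dotgen function):
-- wrap a value in double quotes and escape embedded double quotes.
quoteDotId :: String -> String
quoteDotId s = "\"" ++ concatMap escape s ++ "\""
  where
    escape '"' = "\\\""   -- a double quote becomes backslash + double quote
    escape c   = [c]      -- every other character is passed through unchanged

-- Hypothetical helper: render one (key, value) pair as key="value".
renderAttr :: (String, String) -> String
renderAttr (k, v) = k ++ "=" ++ quoteDotId v

main :: IO ()
main = putStrLn (renderAttr ("label", "say \"hi\""))
```

A (String, String) interface leaves this step to the caller; a typed attribute interface can do it once, under the hood.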
### Considering dotgen vs graphviz closer
But looking into the examples, I see that `dotgen` can use "Haskell ids" to identify created nodes, whereas in graphviz's monad (see the example above) one must supply extra strings as the unique ids (by which we refer to the nodes).
I used Strings as an example, as I was directly converting an existing piece of Dot code; the original can be found here: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T... But, you can use any type you like for the node identifiers, as long as you make them an instance of the PrintDot class. That's where the `n` in the `Dot n` type comes in.
I like the first approach more ("Haskell ids").
I admittedly don't have any ability in graphviz to create new identifiers for you. I could (just add a StateT to the internal monadic stack which keeps track of the next unused node identifier) but I think that would _reduce_ the flexibility of being able to use your own type (it would either only work for `Dot Int`, or even if you could apply a mapping function to use something like `GraphID`, but that has a problem if you have a `Double` with the same value - and hence same textual representation - as your Int). The way I see it, graphviz is usually used for converting existing Haskell values into Dot code and then processing with dot, neato, etc. the Monadic interface exists so that you can still use the library for static pre-specified graphs (I wrote the module for a specific use case, but in practice found it not as useful as I thought it would be as I typically don't have a need for static graphs in my Haskell code).
Cf. dotgen (from https://github.com/ku-fpg/dotgen/blob/master/test/DotTest.hs):
    module Main (main) where

    import Text.Dot

    -- data Animation = Start

    src, box, diamond :: String -> Dot NodeId
    src     label = node $ [ ("shape","none"),("label",label) ]
    box     label = node $ [ ("shape","box"),("style","rounded"),("label",label) ]
    diamond label = node $ [("shape","diamond"),("label",label),("fontsize","10")]

    main :: IO ()
    main = putStrLn $ showDot $ do
        attribute ("size","40,15")
        attribute ("rankdir","LR")
        refSpec <- src "S"
        tarSpec <- src "T"
        same [refSpec,tarSpec]

        c1 <- box "S"
        c2 <- box "C"
        c3 <- box "F"
        same [c1,c2,c3]

        refSpec .->. c1
        tarSpec .->. c2
        tarSpec .->. c3

        m1 <- box "x"
        m2 <- box "y"
        ntm <- box "z"
        same [m1,m2,ntm]
        c1 .->. m1
        c2 .->. m2

        xilinxSynthesis <- box "x"
        c3 .->. xilinxSynthesis

        gns <- box "G"
        xilinxSynthesis .->. gns
        gns .->. ntm

        ecs <- sequence
                [ diamond "E"
                , diamond "E"
                , diamond "Eq"
                ]
        same ecs

        m1 .->. (ecs !! 0)
        m1 .->. (ecs !! 1)
        m2 .->. (ecs !! 0)
        m2 .->. (ecs !! 2)
        ntm .->. (ecs !! 1)
        ntm .->. (ecs !! 2)

        _ <- sequence [ do evidence <- src "EE"
                           n .->. evidence
                      | n <- ecs
                      ]

        edge refSpec tarSpec [("label","Engineering\nEffort"),("style","dotted")]

        () <- scope $ do v1 <- box "Hello"
                         v2 <- box "World"
                         v1 .->. v2

        (x,()) <- cluster $
                  do v1 <- box "Hello"
                     v2 <- box "World"
                     v1 .->. v2

        -- x .->. m2
        -- for hpc
        () <- same [x,x]
        v <- box "XYZ"
        v .->. v
        () <- attribute ("rankdir","LR")

        let n1 = userNodeId 1
        let n2 = userNodeId (-1)

        () <- n1 `userNode` [ ("shape","box")]
        n1 .->. n2

        _ <- box "XYZ"
        _ <- box "(\n\\n)\"(/\\)"

        netlistGraph (\ a -> [("label","X" ++ show a)])
                     (\ a -> [succ a `mod` 10,pred a `mod` 10])
                     [ (n,n) | n <- [0..9] :: [Int] ]

        return ()
My preference - and hence overall design with graphviz - is that you would generate the graph first, and _then_ convert it to a Dot representation en masse.
Cf. graphviz with string ids:
A short example of the monadic notation from [the documentation](http://hackage.haskell.org/package/graphviz-2999.16.0.0/docs/Data-GraphViz-T...):
That version is a tad out of date, but shouldn't affect this.
    digraph (Str "G") $ do

        cluster (Int 0) $ do
            graphAttrs [style filled, color LightGray]
            nodeAttrs [style filled, color White]
            "a0" --> "a1"
            "a1" --> "a2"
            "a2" --> "a3"
            graphAttrs [textLabel "process #1"]

        cluster (Int 1) $ do
            nodeAttrs [style filled]
            "b0" --> "b1"
            "b1" --> "b2"
            "b2" --> "b3"
            graphAttrs [textLabel "process #2", color Blue]

        "start" --> "a0"
        "start" --> "b0"
        "a1" --> "b3"
        "b2" --> "a3"
        "a3" --> "end"
        "b3" --> "end"

        node "start" [shape MDiamond]
        node "end" [shape MSquare]
Thanks for the packages, and best wishes, Ivan Z.
--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com

Hello!
2015-04-17 4:23 UTC+03:00, Ivan Lazar Miljenovic
### Considering dotgen vs graphviz closer
But looking into the examples, I see that `dotgen` can use "Haskell ids" to identify created nodes, whereas in graphviz's monad (see the
To bring more clear context for any readers, I put here a short excerpt from that dotgen example:
    refSpec <- src "S"
    c1 <- box "S"
    refSpec .->. c1
example above) one must supply extra strings as the unique ids (by which we refer to the nodes).
Short example:
"start" --> "a0"
node "start" [shape MDiamond]
I used Strings as an example, as I was directly converting an existing piece of Dot code; the original can be found here: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T...
But, you can use any type you like for the node identifiers, as long as you make them an instance of the PrintDot class. That's where the `n` in the `Dot n` type comes in.
Ok, thanks for the valuable information!
I like the first approach more ("Haskell ids").
I admittedly don't have any ability in graphviz to create new identifiers for you. I could (just add a StateT to the internal monadic stack which keeps track of the next unused node identifier)
Since the API is already monadic, adding another monad into the stack wouldn't impose big difficulties for the users of the API, because they won't need to restructure their code (as they would when moving from pure functional code to monadic code).
but I think that would _reduce_ the flexibility of being able to use your own type (it would either only work for `Dot Int`, or even if you could apply a mapping function to use something like `GraphID`, but that has a problem if you have a `Double` with the same value - and hence same textual representation - as your Int).
I see: [GraphID](http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T...) can have distinct values with the same textual representation. But if we are thinking about automatically creating new IDs, then this problem can simply be treated in the code for tracking which IDs have already been used. There could be two APIs: a "flexible" one with user-supplied IDs, and an "automatic" API. The "automatic" one is implemented on top of the "flexible" one.
The way I see it, graphviz is usually used for converting existing Haskell values into Dot code and then processing with dot, neato, etc.
My preference - and hence overall design with graphviz - is that you would generate the graph first, and _then_ convert it to a Dot representation en masse.
If the Haskell representation of the graph doesn't already have unique IDs for the nodes, then such an "automatic" layer would be useful as an intermediate step in the conversion. So it seems it won't be useless even in your standard scenarios.
***
You name flexibility for the user as an advantage of the existing approach. As for some advantages of the other approach (with using Haskell ids for the nodes): the compiler could catch more errors.
For example, if I make a typo in an identifier when introducing an edge, then the Haskell compiler would report this as an unknown identifier.
Also the compiler would catch name clashes, if you accidentally give the same id to two different nodes.
A potential disadvantage is then an increased verbosity: first, create the nodes, then use them for the edges. Meaning three actions instead of your single one:
"a0" --> "a1"
Still, even in the "automatic ids" approach, this can be written compactly in a single line in the spirit of:
bindM2 (-->) (node [textLabel "a0"]) (node [textLabel "a1"])
without explicitly giving Haskell ids to the two nodes.
Perhaps, this is not important stuff, because--as you write--one is supposed to use Haskell representations of graphs and then convert them with graphviz... (I might simply not want to learn another language for representing graphs apart from dot, that's why I'd like to use the monadic API: because it closely follows the known dot format.)
My last line of code already looks similar to code constructing a Haskell representation of a graph.
I'm just writing down my comments concerning the API, not that I'm confident that I know a definite way to make it better.
Well, after writing this post and thinking it all over while writing, I tend to come to a conclusion resonating with your opinion stating that the monadic API turned out not as useful as you used to think:
it seems that while imposing the monadic style onto the programmer, it doesn't give the advantages a monad could give (like generating unique ids automatically and catching errors with undefined or clashing ids). Without this stateful feature, much else can be done purely with dedicated graph structures.
What do you think about these comments?
As for dotgen: my wishes could be satisfied simply with the dotgen package, but--as you wrote--it is not safe w.r.t. quoting/escaping user-supplied values.
Best regards,
-- Ivan

On 18 April 2015 at 01:48, Ivan Zakharyaschev
Hello!
2015-04-17 4:23 UTC+03:00, Ivan Lazar Miljenovic:
### Considering dotgen vs graphviz closer
But looking into the examples, I see that `dotgen` can use "Haskell ids" to identify created nodes, whereas in graphviz's monad (see the
To bring more clear context for any readers, I put here a short excerpt from that dotgen example:
    refSpec <- src "S"
    c1 <- box "S"
    refSpec .->. c1
It should be noted that src and box are custom functions and not part of dotgen.
example above) one must supply extra strings as the unique ids (by which we refer to the nodes).
Short example:
"start" --> "a0"
node "start" [shape MDiamond]
I used Strings as an example, as I was directly converting an existing piece of Dot code; the original can be found here: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T...
But, you can use any type you like for the node identifiers, as long as you make them an instance of the PrintDot class. That's where the `n` in the `Dot n` type comes in.
Ok, thanks for the valuable information!
I like the first approach more ("Haskell ids").
I admittedly don't have any ability in graphviz to create new identifiers for you. I could (just add a StateT to the internal monadic stack which keeps track of the next unused node identifier)
Since the API is already monadic, adding another monad into the stack wouldn't impose big difficulties for the users of the API, because they won't need to restructure their code (as they would when moving from pure functional code to monadic code).
Sure, this bit itself isn't a problem.
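For readers following along, the "keep track of the next unused node identifier" idea can be sketched roughly as below. This is only an illustration under stated assumptions: `NodeId`, `Builder`, `Gen`, `freshNode` and the local `-->` are hypothetical names invented here, not graphviz's or dotgen's API, and the state monad is taken from the transformers package rather than from graphviz's internal stack.

```haskell
import Control.Monad.Trans.State (State, get, put, modify, execState)

-- Hypothetical node handle. In a real library the constructor would be
-- kept private so that handles can only come from freshNode.
newtype NodeId = NodeId Int
  deriving (Eq, Show)

-- Hypothetical builder state: next unused id plus everything recorded so far.
data Builder = Builder
  { nextId :: Int
  , nodes  :: [(NodeId, String)]
  , edges  :: [(NodeId, NodeId)]
  } deriving (Show)

type Gen = State Builder

-- Allocate a fresh identifier and record a node with the given label.
freshNode :: String -> Gen NodeId
freshNode label = do
  b <- get
  let n = NodeId (nextId b)
  put b { nextId = nextId b + 1, nodes = nodes b ++ [(n, label)] }
  return n

-- Record an edge between two previously created nodes.
(-->) :: NodeId -> NodeId -> Gen ()
a --> b = modify (\s -> s { edges = edges s ++ [(a, b)] })

example :: Gen ()
example = do
  start <- freshNode "start"
  a0    <- freshNode "a0"
  start --> a0

main :: IO ()
main = do
  let b = execState example (Builder 0 [] [])
  print (nodes b)
  print (edges b)
```

With this shape, a misspelled node variable is an out-of-scope name at compile time rather than a silently created extra node.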
but I think that would _reduce_ the flexibility of being able to use your own type (it would either only work for `Dot Int`, or even if you could apply a mapping function to use something like `GraphID`, but that has a problem if you have a `Double` with the same value - and hence same textual representation - as your Int).
I see: [GraphID](http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz-T...) can have distinct values with the same textual representation.
But if we are thinking about automatically creating new IDs, then this problem can simply be treated in the code for tracking which IDs have already been used.
Possibly a bit more complicated than it's worth: "OK, when I convert this ID to a textual one it appears to be the same as one we've already seen" would require a lot more bookkeeping, and won't help prevent errors from explicit user-defined node IDs defined later (unless we also use some form of backwards state to check for that as well).
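A small sketch of the clash being discussed, and of the kind of bookkeeping it would take to detect it. `Ident`, `render` and `clashes` are hypothetical stand-ins written for this example, not graphviz types or functions, and the rendering rule (an integral double prints without a fractional part) is an assumption made to reproduce the problem described above.

```haskell
import Data.List (foldl')

-- Hypothetical identifier type with two numeric constructors.
data Ident = IntId Int | DblId Double
  deriving (Eq, Show)

-- Assumed rendering rule: integral doubles print like integers.
render :: Ident -> String
render (IntId n) = show n
render (DblId d)
  | d == fromIntegral r = show r
  | otherwise           = show d
  where
    r = round d :: Integer

-- Walk the identifiers in order, remembering every rendered text seen so
-- far, and collect those whose text was already used by an earlier one.
clashes :: [Ident] -> [Ident]
clashes = reverse . snd . foldl' step ([], [])
  where
    step (seen, bad) i
      | render i `elem` seen = (seen, i : bad)
      | otherwise            = (render i : seen, bad)

main :: IO ()
main = print (clashes [IntId 1, DblId 2.5, DblId 1.0])
```

Note that this only looks backwards: an automatically generated ID can still collide with a user-supplied one that appears later, which is the limitation pointed out above.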
There could be two APIs: a "flexible" one with user-supplied IDs, and an "automatic" API. The "automatic" one is implemented on top of the "flexible" one.
The way I see it, graphviz is usually used for converting existing Haskell values into Dot code and then processing with dot, neato, etc.
My preference - and hence overall design with graphviz - is that you would generate the graph first, and _then_ convert it to a Dot representation en masse.
If the Haskell representation of the graph doesn't already have unique IDs for the nodes, then such an "automatic" layer would be useful as an intermediate step in the conversion. So it seems it won't be useless even in your standard scenarios.
***
You name flexibility for the user as an advantage of the existing approach. As for some advantages of the other approach (with using Haskell ids for the nodes): the compiler could catch more errors.
For example, if I make a typo in an identifier when introducing an edge, then the Haskell compiler would report this as an unknown identifier.
But you can always use variables rather than hard-coding the Strings in... I don't *recommend* hard-coding Strings in, I just did so in that sample usage just so you could compare it to the sample Dot code and notice how similar it was.
Also the compiler would catch name clashes, if you accidentally give the same id to two different nodes.
A potential disadvantage is then an increased verbosity: first, create the nodes, then use them for the edges. Meaning three actions instead of your single one:
"a0" --> "a1"
Still, even in the "automatic ids" approach, this can be written compactly in a single line in the spirit of:
bindM2 (-->) (node [textLabel "a0"]) (node [textLabel "a1"])
without explicitly giving Haskell ids to the two nodes.
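Since `bindM2` is not in base, here is one way it could be defined, together with a toy builder to run it against. Everything in this sketch is an assumption made for illustration: `Gen`, `node` and `-->` below are hypothetical miniature versions that allocate ids automatically, not the graphviz functions of the same names, and the emitted lines are only Dot-like text.

```haskell
import Control.Monad.Trans.State (State, state, modify, execState)

-- Hypothetical builder state: next free id and the lines emitted so far.
type Gen = State (Int, [String])

-- Create a node with an automatically chosen id and return that id.
node :: String -> Gen Int
node label = state $ \(n, out) ->
  (n, (n + 1, out ++ ["n" ++ show n ++ " [label=" ++ show label ++ "]"]))

-- Emit an edge between two existing node ids.
(-->) :: Int -> Int -> Gen ()
a --> b = modify $ \(n, out) -> (n, out ++ ["n" ++ show a ++ " -> n" ++ show b])

-- Run two actions in order, then feed both results to a monadic function.
bindM2 :: Monad m => (a -> b -> m c) -> m a -> m b -> m c
bindM2 f ma mb = do
  a <- ma
  b <- mb
  f a b

main :: IO ()
main = mapM_ putStrLn . snd $
  execState (bindM2 (-->) (node "a0") (node "a1")) (0, [])
```

The same combinator can also be written as `join (liftM2 f ma mb)` using Control.Monad.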
Perhaps, this is not important stuff, because--as you write--one is supposed to use Haskell representations of graphs and then convert them with graphviz... (I might simply not want to learn another language for representing graphs apart from dot, that's why I'd like to use the monadic API: because it closely follows the known dot format.)
My last line of code already looks similar to code constructing a Haskell representation of a graph.
I'm just writing down my comments concerning the API, not that I'm confident that I know a definite way to make it better.
Well, after writing this post and thinking it all over while writing, I tend to come to a conclusion resonating with your opinion stating that the monadic API turned out not as useful as you used to think:
it seems that while imposing the monadic style onto the programmer, it doesn't give the advantages a monad could give (like generating unique ids automatically and catching errors with undefined or clashing ids). Without this stateful feature, much else can be done purely with dedicated graph structures.
What do you think about these comments?
Pretty much. I think I had an actual use-case when I first wrote the Monadic interface (some kind of tutorial from memory), but after I finished it I realised it would be much simpler using the alternative types.
If you have a data structure that already represents a graph, then graphElemsToDot will let you convert that into the representation of a Dot graph: http://hackage.haskell.org/package/graphviz-2999.17.0.2/docs/Data-GraphViz.h...
The only real reason I can come up with for using a Monadic interface is when you want to embed a (relatively) static Dot graph into some Haskell code and try and get some safety from the type-checker for attribute values. In that case, some relatively simple mapM_, etc. expressions might come in handy. But unless you have something rather simple in mind, I don't think this is all that common.
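The "build the graph first, then convert it en masse" workflow can be illustrated without any library at all. The sketch below is deliberately not written against graphviz: `Adjacency` and `toDotText` are hypothetical names, node ids are plain Ints so that no quoting is needed, and with graphviz itself the conversion step would instead go through graphElemsToDot as mentioned above.

```haskell
-- Hypothetical existing Haskell value: each node id with its successors.
type Adjacency = [(Int, [Int])]

-- Convert the whole structure to Dot text in a single pass:
-- first one statement per node, then one statement per edge.
toDotText :: Adjacency -> String
toDotText adj = unlines $
     ["digraph G {"]
  ++ [ "  " ++ show n ++ ";" | (n, _) <- adj ]
  ++ [ "  " ++ show a ++ " -> " ++ show b ++ ";" | (a, bs) <- adj, b <- bs ]
  ++ ["}"]

main :: IO ()
main = putStr (toDotText [(0, [1, 2]), (1, [2]), (2, [])])
```

Nothing here is monadic: the graph is an ordinary value, and the Dot text is a pure function of it.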
As for dotgen: my wishes could be satisfied simply with the dotgen package, but--as you wrote--it is not safe w.r.t. quoting/escaping user-supplied values.
For simple values it should be OK.
--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com
Participants (2): Ivan Lazar Miljenovic, Ivan Zakharyaschev