Re: Representing cyclic data structures efficiently in Haskell

From reading your message, it seems you want a graph library of sorts.
The important bit is 'of sorts', unfortunately! Most graph libraries I've seen to date (although I must admit that they have been written in C and C++, including one I perpetrated personally) don't really work that well for circuits because, for anything other than trivially simple components, the connections between nodes need to be labelled. A component representing something simple like (say) an 'and' gate would be easy enough to represent as a node of an ordinary directed graph. In such a case, an incoming arrow would always be an input of the gate, and an outgoing arrow would always be the output. I need to be able to represent more complex components, up to and including the complexity of things like CPUs, where you might have lots and lots of inputs and outputs with substantially different meanings. In the case of a CPU, it would never be acceptable to (for example) swap a chip select output for an address or data line. This makes me think that an off-the-shelf graph library won't quite be what I'm after, so I will almost certainly need to write the code. What I'm more interested in at the moment is a view on the best kinds of approach that might be appropriate. I'd considered something like embedding the type in the IO monad, with links between components implemented as IORefs, but I'd really prefer something cleaner (pure functional). If the code ends up horribly complicated in order to get decent performance, I might as well fold and use C++ instead. I'm currently wondering about going for an adjacency list approach (as used by most graph libraries), with the tuples in the adjacency list extended to include input/output labels, resulting in a 4-ary tuple rather than the usual 2-ary. But, I don't want to be stuck with representing this as a list -- I really need log N lookups or performance will stink horribly as circuits get big. Maybe a pair of finite maps, allowing the edges to be traversed in either direction?
If FGL seems like overkill to you, I have my own little mini graph library which basically looks like:
- An ADT, Node: newtype Node = Node Int deriving ... - The 'Graph n e' data structure, containing: - a FiniteMap from n -> Node - an (growable?) array of type IOArray (Node,Node) e
the array represents the edges in the top half (i do some index fiddling to get this efficient) and the finitemap represents the node labellings. It's a pretty stupid interface and works a lot better when the graphs are fixed size (that way you don't need to grow the array). But it works.
The finite map from n to Node seems like a good idea. Does representing the adjacency list as a flat array cause any problems? Sarah -- ---------------------------------------------- / __ + / Sarah Thompson **** / / (_ _ _ _ |_ / * / / __)(_|| (_|| ) / sarah@telergy.com * / / + / http://findatlantis.com/ / ---------------------------------------------- -- ---------------------------------------------- / __ + / Sarah Thompson **** / / (_ _ _ _ |_ / * / / __)(_|| (_|| ) / sarah@telergy.com * / / + / http://findatlantis.com/ / ----------------------------------------------

Lazy evaluation means that you can embed nodes directly in a cyclic structure, even though that would appear to create an indefinite regression. In order to detect cycles, one might add a node identifier. Here's a simple example: [[ -- spike-cyclicstructure.hs data Node = Node { nodeid :: String, adjacent :: [Node] } instance Eq Node where n1 == n2 = nodeid n1 == nodeid n2 instance Show Node where show = nodeid findPaths :: Node -> Node -> [[Node]] findPaths fromNode toNode = map reverse $ findPaths1 [] fromNode toNode findPaths1 :: [Node] -> Node -> Node -> [[Node]] findPaths1 beenThere fromNode toNode | fromNode == toNode = [fromNode:beenThere] | otherwise = concat [ findPaths1 (fromNode:beenThere) nextNode toNode | nextNode <- adjacent fromNode , not ( nextNode `elem` beenThere ) ] n1 = Node "n1" [n2,n3,n4] n2 = Node "n2" [n1,n3,n4] n3 = Node "n3" [n1,n2] n4 = Node "n4" [n1,n2] test1 = findPaths n1 n2 -- [[n1,n2],[n1,n3,n2],[n1,n4,n2]] test2 = findPaths n1 n3 -- [[n1,n2,n3],[n1,n3],[n1,n4,n2,n3]] ]] One (crude) way to think about this is that a mention of a value in haskell behaves in some ways like a pointer to that value in (say) C. If you don't want to assign unique identifiers by hand, you could use a state transformer monad to generate them for you. #g -- At 16:37 07/07/03 +0100, Sarah Thompson wrote:
From reading your message, it seems you want a graph library of sorts. The important bit is 'of sorts', unfortunately! Most graph libraries I've seen to date (although I must admit that they have been written in C and C++, including one I perpetrated personally) don't really work that well for circuits because, for anything other than trivially simple components, the connections between nodes need to be labelled.
A component representing something simple like (say) an 'and' gate would be easy enough to represent as a node of an ordinary directed graph. In such a case, an incoming arrow would always be an input of the gate, and an outgoing arrow would always be the output. I need to be able to represent more complex components, up to and including the complexity of things like CPUs, where you might have lots and lots of inputs and outputs with substantially different meanings. In the case of a CPU, it would never be acceptable to (for example) swap a chip select output for an address or data line.
This makes me think that an off-the-shelf graph library won't quite be what I'm after, so I will almost certainly need to write the code. What I'm more interested in at the moment is a view on the best kinds of approach that might be appropriate.
I'd considered something like embedding the type in the IO monad, with links between components implemented as IORefs, but I'd really prefer something cleaner (pure functional). If the code ends up horribly complicated in order to get decent performance, I might as well fold and use C++ instead.
I'm currently wondering about going for an adjacency list approach (as used by most graph libraries), with the tuples in the adjacency list extended to include input/output labels, resulting in a 4-ary tuple rather than the usual 2-ary. But, I don't want to be stuck with representing this as a list -- I really need log N lookups or performance will stink horribly as circuits get big. Maybe a pair of finite maps, allowing the edges to be traversed in either direction?
If FGL seems like overkill to you, I have my own little mini graph library which basically looks like:
- An ADT, Node: newtype Node = Node Int deriving ... - The 'Graph n e' data structure, containing: - a FiniteMap from n -> Node - an (growable?) array of type IOArray (Node,Node) e
the array represents the edges in the top half (i do some index fiddling to get this efficient) and the finitemap represents the node labellings. It's a pretty stupid interface and works a lot better when the graphs are fixed size (that way you don't need to grow the array). But it works.
The finite map from n to Node seems like a good idea. Does representing the adjacency list as a flat array cause any problems?
Sarah
-- ---------------------------------------------- / __ + / Sarah Thompson **** / / (_ _ _ _ |_ / * / / __)(_|| (_|| ) / sarah@telergy.com * / / + / http://findatlantis.com/ / ----------------------------------------------
-- ---------------------------------------------- / __ + / Sarah Thompson **** / / (_ _ _ _ |_ / * / / __)(_|| (_|| ) / sarah@telergy.com * / / + / http://findatlantis.com/ / ----------------------------------------------
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-------------------
Graham Klyne

G'day all. On Mon, Jul 07, 2003 at 04:37:46PM +0100, Sarah Thompson wrote:
I'd considered something like embedding the type in the IO monad, with links between components implemented as IORefs, but I'd really prefer something cleaner (pure functional). If the code ends up horribly complicated in order to get decent performance, I might as well fold and use C++ instead.
At the risk of stating the obvious, IORefs _are_ pure functional. :-)
I'm currently wondering about going for an adjacency list approach (as used by most graph libraries), with the tuples in the adjacency list extended to include input/output labels, resulting in a 4-ary tuple rather than the usual 2-ary. But, I don't want to be stuck with representing this as a list -- I really need log N lookups or performance will stink horribly as circuits get big. Maybe a pair of finite maps, allowing the edges to be traversed in either direction?
You could do that. Or you could use just one FiniteMap and reverse the iterator before you use it. (Remember that reverse is an amortised O(1) operation, assuming you need to traverse the whole list.) Or you could take a copy of the FiniteMap source code and add your own reverse iterator to complement the forward iterator which is already there. You could even submit a patch. :-) Or you could use a different data structure, say, one with O(n) iteration from either end, O(log n) search and O(1) insertion onto either end. There are several of these floating around. This one is quite good: http://citeseer.nj.nec.com/25942.html Cheers, Andrew Bromage

Sarah Thompson
work that well for circuits because, for anything other than trivially simple components, the connections between nodes need to be labelled.
it's been a while since i used it, but i thought erwig's graph library had labels on edges. it's a really sweet piece of work (disclaimer: i used the ml version). andrew -- http://www.acooke.org
participants (4)
-
andrew cooke
-
Andrew J Bromage
-
Graham Klyne
-
Sarah Thompson