Transformations of cyclic graphs [Was: Efficiency of list indices in definitions]

[Moved to Haskell-Cafe] Hello! Cycles sure make it difficult to transform graphs in a pure non-strict language. Cycles in a source graph require us to devise a way to mark traversed nodes -- however we cannot mutate nodes and cannot even compare nodes with a generic ('derived') equality operator. Cycles in a destination graph require us to keep track of the already built nodes so we can complete a cycle. An obvious solution is to use a state monad and IORefs. There is also a monad-less solution, which is less obvious: we can't add a node to the dictionary of constructed nodes until we have built the node, then means we can't use the updated dictionary when building descendants -- which need the dictionary when linking back. The problem can be overcome however with a credit card transform (a.k.a. "buy now, pay later" transform). To avoid hitting the bottom, we just have to "pay" by the "due date". The will use as an example the NFA->DFA transformation considered earlier. Perhaps we should first attempt a few obvious optimizations of a NFA/DFA recognizer. It seems that performance matters. finAuAcceptStringQ startStates string = or [faStateAcceptStringQ s string | s <- startStates] where faStateAcceptStringQ (FaState _ acceptQ _) [] = acceptQ faStateAcceptStringQ (FaState _ _ trans) (a:as) | null followerStates = False | otherwise = or [ faStateAcceptStringQ s as | s <- followerStates ] where followerStates = if a > (length trans) - 1 then [] else trans!!a Computing (length trans) when determining the followerStates is inefficient: even if 'a' is zero, we still have to traverse the whole list 'trans'. If our alphabet is binary, perhaps its better to specialize the FaState correspondingly and write: -- Automata over a binary alphabet data (Ord l,Show l) => FaState l = FaState {label :: l, acceptQ :: Bool, trans0:: [FaState l], trans1:: [FaState l]} -- Then a finite automaton is merely a list of starting FaStates: type FinAu l = [FaState l] --So, for example, an automaton equivalent to the regular expression --0*(0(0+1)*)* could be defined as: dom18 = [one] where one = FaState 1 True [one,two] [] two = FaState 2 True [two,one] [one,two] The acceptance function can be written as finAuAcceptStringQ start_states str = foldr (\l seed -> acceptP l str || seed) False start_states where acceptP (FaState _ acceptQ _ _) [] = acceptQ acceptP (FaState _ _ t0 t1) (s:rest) = finAuAcceptStringQ (if s then t1 else t0) rest We have simplified the original expression, and manually deforested it. test1= finAuAcceptStringQ dom18 $ map (>0) [0,1,0,1] test2= finAuAcceptStringQ dom18 $ map (>0) [1,1,0,1] test3= finAuAcceptStringQ dom18 [True] test4= finAuAcceptStringQ dom18 [False] It would be great to be able to compare and print the nodes: instance (Ord l,Show l) => Eq (FaState l) where (FaState l1 _ _ _) == (FaState l2 _ _ _) = l1 == l2 but printing a node is already a slight problem. We need to keep track of the already printed nodes to avoid looping. -- a data class for an occurrence check class OCC occ where empty:: occ a seenp:: (Eq a) => a -> occ a -> Bool put:: a -> occ a -> occ a -- Currently, it's just a list. -- In the future, we can pull in something fancier from the Edison instance OCC [] where empty = [] seenp = elem put = (:) -- Depth-first, pre-order traversal of the graph, keeping track of -- already printed nodes instance (Ord l,Show l) => Show (FaState l) where show state = "{@" ++ showstates [state] (empty::[FaState l]) "@}" where -- showstates worklist seen_states suffix showstates [] states_seen suffix = suffix showstates (st:rest) states_seen suffix | st `seenp` states_seen = showstates rest states_seen suffix showstates (st@(FaState l accept t0 t1):rest) states_seen suffix = showstate st $ showstates (t0++t1++rest) (st `put` states_seen) suffix showstate (FaState l accept t0 t1) suffix = "{State " ++ (show l) ++ " " ++ (show accept) ++ " " ++ (show $ map label t0) ++ " " ++ (show $ map label t1) ++ "}" ++ suffix Now, "print dom18" prints as [{@{State 1 True [1,2] []}{State 2 True [2,1] [1,2]}@}] For the NFA->DFA conversion, we need to keep track of already built states. -- A dictionary of states class StateDict sd where emptyd :: sd (l,FaState l) locate :: (Eq l) => l -> sd (l,FaState l) -> Maybe (FaState l) putd :: (l,FaState l) -> sd (l,FaState l) -> sd (l,FaState l) -- if performance matters, we can use a fancier dictionary from the Edison instance StateDict [] where emptyd = [] locate = lookup putd = (:) -- [nfa_state] -> dictionary_of_seen_states -> ([dfa_state],updated_dictionary) -- [dfa_state] is a singleton list determinize_cc states converted_states = -- first, check the cache case dfa_label `locate` converted_states of Nothing -> build_state Just dfa_state -> ([dfa_state],converted_states) where -- [NFA_labels] -> DFA_labels det_labels = sort . nub . (map label) dfa_label = det_labels states -- find out NFA-followers for [nfa_state] upon ingestion of 0 and 1 (t0_followers,t1_followers) = foldr (\st (f0,f1) -> (trans0 st ++ f0, trans1 st ++ f1)) ([],[]) states acceptQ' = or (map acceptQ states) -- really build the dfa state and return ([dfa_state],updated_cache) build_state = let -- node, dfa_state is computed _below_ converted_states1 = (dfa_label,dfa_state) `putd` converted_states (t0', converted_states2) = (determinize_cc t0_followers converted_states1) (t1', converted_states3) = (determinize_cc t1_followers converted_states2) dfa_state = (FaState dfa_label acceptQ' t0' t1') in ([dfa_state],converted_states3) finAuDeterminize states = fst $ determinize_cc states [] At the heart of the credit card transform is the phrase converted_states1 = (dfa_label,dfa_state) `putd` converted_states we added to the dictionary of the computed states the state that we haven't built yet. Because (,) is non-strict in its arguments and `locate` is non-strict in its result, we can get away with a mere promise to "pay". We can print the DFA for dom18 to see what we've got: finAuDeterminize dom18 -- shows [{@{State [1] True [[1,2]] [[]] } {State [1,2] True [[1,2]] [[1,2]]} {State [] False [[]] [[]] }@}] which is a DFA (which happens to be minimal) recognizing (0+1)* - 1(0+1)* An example from the original message: dom19 = [one,two] where one = FaState 1 True [two] [] two = FaState 2 True [one] [one] finAuDeterminize dom19 -- shows [{@{State [1,2] True [[1,2]] [[1]] } {State [1] True [[2]] [[]] } {State [2] True [[1]] [[1]] } {State [] False [[]] [[]] }@}] which recognizes (0+1)* - (0+1)*11(0+1)*
participants (1)
-
oleg@pobox.com