
I've come across a slight variant of memoization and I'm curious whether there is a better way to think about it (I have yet to reach Haskell Enlightenment), or whether it's been thought about before. I'm writing a program that does DFA minimization, as described here: https://www.tutorialspoint.com/automata_theory/dfa_minimization.htm.

For me, the simplest way to understand the algorithm is that we form an equivalence relation, where two states are different if either one is final and the other isn't, or if there are transitions from them to states that are different:

    different a b = isFinal a && not (isFinal b)
                 || isFinal b && not (isFinal a)
                 || exists chars (\c -> different (step a c) (step b c))

This definition diverges; it never decides that two states are the same. The standard algorithm builds a table, starting with the assumption that all states are the same, and iteratively applies this definition until it converges.

I'm thinking of that algorithm as a memoized version of the implementation of different above. But it's not quite that:

    different = memoFix2 different' where
      different' different'' a b = isFinal a && not (isFinal b)
                                || isFinal b && not (isFinal a)
                                || exists chars (\c -> different'' (step a c) (step b c))

for the same reason; it diverges. But with memoization I can detect the diversion by putting a marker in the memo table. If the computation makes a recursive call on an input that is started but not computed, then (in this case) I want to return False. In general, I think what I want is

    memoFixWithDefault :: (a -> b) -> ((a -> b) -> (a -> b)) -> (a -> b)

where (memoFixWithDefault default f) is like (memoFix f), except that if (memoFix f a) would loop, then (memoFixWithDefault default f a) = default a.

I suspect there's a nice denotational way of describing this, but my denotational semantics is rusty (maybe it's the least fixed point of f that is greater than g, or something, but that's not it). I suspect that there needs to be a nice relationship between f and g for memoFixWithDefault f g to be sensible. I also suspect that there's a cute 3-line implementation like all the other implementations of memoize out there, but I don't see it. I also suspect that depth-first search has a cute implementation using this combinator.

So I was wondering if other people had thoughts or references before I went off and thought about it.

Thanks!

-M
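The table-filling algorithm described above ("start with the assumption that all states are the same, and iterate until convergence") can be sketched directly as Kleene iteration on a finite table: represent the candidate relation as a Map of Bools, start from the everywhere-False table, and reapply the body of different until the table stops changing. The four-state DFA below (states, chars, isFinal, step) is invented for illustration; it is not part of the original message.

```haskell
import qualified Data.Map.Strict as M

-- Assumed toy DFA: states 0..3 over "ab"; state 1 is the only final state.
states :: [Int]
states = [0, 1, 2, 3]

chars :: [Char]
chars = "ab"

isFinal :: Int -> Bool
isFinal s = s == 1

step :: Int -> Char -> Int
step 0 'a' = 1; step 0 'b' = 0
step 1 'a' = 1; step 1 'b' = 0
step 2 'a' = 2; step 2 'b' = 2
step 3 'a' = 1; step 3 'b' = 3
step _ _   = error "no transition"

-- One application of the body of different, with the recursive call
-- replaced by a lookup in the current table.
refine :: M.Map (Int, Int) Bool -> M.Map (Int, Int) Bool
refine t = M.mapWithKey upd t
  where
    upd (a, b) _ =
         isFinal a /= isFinal b
      || any (\c -> t M.! (step a c, step b c)) chars

-- Iterate from the all-False table ("all states are the same") until the
-- table stops changing.
distinguishable :: M.Map (Int, Int) Bool
distinguishable = go (M.fromList [((a, b), False) | a <- states, b <- states])
  where
    go t = let t' = refine t in if t' == t then t else go t'
```

In this toy automaton, states 0 and 3 accept the same language, so `distinguishable M.! (0, 3)` comes out False, while `distinguishable M.! (0, 2)` comes out True. The naive recursive `different` would loop forever on the pair (0, 3).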

Hello Michael,

As indicated by its name, memoFix defines a fixed point of its argument. In particular it defines the least defined fixed point, which is unique, and can formally be obtained by iteration starting from bottom. You want a function which is defined everywhere, so it is maximal for the ordering of definedness. In general there is no unique greatest fixed point for that ordering (e.g., id makes everything a fixed point), which explains why there is no standard way to build one.

In your case, different' also doesn't have a unique fixed point if your automaton is not minimal. There are at least the one you're looking for, and the function corresponding to the total relation (State × State):

    const (const True)

Since different' is defined with disjunctions and no negations, it is monotonic with the ordering (False < True) lifted to total functions. Thus iterating from the bottom value (const (const False)) gets you the least fixed point... Interesting.

By the way,
Where (memoFixWithDefault default f) is like (memoFix f) except that if (memoFix f a) would loop, then (memoFixWithDefault default f a) = default a.
That will not produce the correct result. For instance, assume there are two non-final states A and B which point to themselves with the same letter C (A -C-> A, B -C-> B) but are actually not equivalent, so different A B should be True. Yet different (step A C) (step B C) = different A B, so memoFix may still loop.

Short-circuiting when a loop is detected works for computing a *least defined* fixed point (i.e., Haskell's fix, or memoFix) because definedness is "conjunctive" (expanding the parallel between bottom and False): all values which are needed must be defined, so encountering bottom kills the whole computation. On the contrary, the function different is "disjunctive": it short-circuits on True. When you encounter a loop, the result (which you are in the middle of computing) could still be either True or False. So you could *temporarily* return False from such a recursive call in order to continue searching, but you must not mark the value as having been computed, because you're still doing that.

Regards,
Li-yao
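Here is a sketch of what such a combinator might look like, incorporating the caveat above: an "in flight" set detects re-entrant calls, which return the default, and any result whose computation was cut short is treated as provisional and not cached. The type differs from the one in the original message, since the open-recursive body must run in a state monad to see the table. This illustrates the mechanism only; as explained above, cutting loops with a constant can give wrong answers in general. For "disjunctive" searches where a loop contributes nothing new along its own path — graph reachability, used below with an invented graph — it does give the right top-level answer.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import Control.Monad.State

-- Memo table: finished results, inputs currently being computed, and a
-- flag recording whether the default was consulted below the current frame.
data Tbl a b = Tbl
  { memo     :: M.Map a b
  , inFlight :: S.Set a
  , usedDflt :: Bool
  }

memoFixWithDefault
  :: Ord a
  => (a -> b)                                              -- default, used on re-entry
  -> ((a -> State (Tbl a b) b) -> a -> State (Tbl a b) b)  -- open-recursive body
  -> a -> b
memoFixWithDefault dflt f a0 = evalState (go a0) (Tbl M.empty S.empty False)
  where
    go a = do
      t <- get
      case M.lookup a (memo t) of
        Just b -> pure b
        Nothing
          | a `S.member` inFlight t ->
              -- Loop detected: answer with the default and remember that
              -- everything above this frame is now provisional.
              dflt a <$ modify (\s -> s { usedDflt = True })
          | otherwise -> do
              saved <- gets usedDflt
              modify (\s -> s { inFlight = S.insert a (inFlight s)
                              , usedDflt = False })
              b <- f go a
              provisional <- gets usedDflt
              -- Per the caveat: only cache results that did not depend on
              -- the default somewhere below.
              modify (\s -> s { memo     = if provisional then memo s
                                           else M.insert a b (memo s)
                              , inFlight = S.delete a (inFlight s)
                              , usedDflt = saved || provisional })
              pure b

-- Usage sketch: reachability in a cyclic graph (graph invented here).
succs :: Int -> [Int]
succs 1 = [2]
succs 2 = [1, 3]
succs _ = []

reaches :: Int -> Int -> Bool
reaches target = memoFixWithDefault (const False) $ \rec n ->
  if n == target then pure True else or <$> mapM rec (succs n)
```

The cycle 1 -> 2 -> 1 would make a naive memoFix loop; here the re-entrant call is cut with False, and since that answer is provisional it is never cached as final.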

Hello Michael,

I've been looking for an elegant, functional implementation of DFA minimization and I have not found it. If you discover a way to do it, please share. However, I have seen some papers which might be relevant to the problem (although not necessarily good for writing a Haskell implementation).

The CoCaml paper https://www.cs.cornell.edu/~kozen/papers/cocaml.pdf discusses how to write recursive functions on infinite objects, which would ordinarily not terminate, but in their language are instead treated as a set of recursive equations and solved by some extralinguistic constraint solver. This is similar in spirit to your desire to "detect diversion" when you loop around the memo table.

The Datafun paper https://people.mpi-sws.org/~neelk/datafun.pdf recently reminded me that Datalog-style programming is all about fixpoints. I wonder if DFA minimization can be done in a Datalog-like language.

I have no idea if these are actually relevant, and would be interested to hear if they are.

Edward

"But with memoization I can detect the diversion by putting a marker in the
memo table."
I'm not sure I entirely understand your problem, but it seems when you make
a recursive call to a node you've already visited, you want to return
false. Like you said, you have to "mark" this in some way. One way do to
this is to just add a parameter to the recursive call which notes the nodes
you've just visited. There's a number of ways to do this, in the ST Monad
an mutable bit array would work nicely but if you're not in ST a map which
you check and add elements to should do the job fine. Just make this check
the first part of an "and" statement, if it's already there you short
circuit and return false.
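A sketch of that suggestion in pure code, threading a Data.Set of state pairs through the recursion instead of using ST (the toy DFA below — states 0..3, chars, isFinal, step — is invented for illustration). The visited set is per-path, so revisiting a pair means the current path has looped and can be cut with False; this happens to be sound for the DFA problem, because a distinguishing word whose pair sequence repeats can always be shortened by pumping the loop out. Without sharing across branches it can do exponential work, which is what the memo table was meant to avoid.

```haskell
import qualified Data.Set as S

-- Assumed toy DFA: states 0..3 over "ab"; state 1 is the only final state.
chars :: [Char]
chars = "ab"

isFinal :: Int -> Bool
isFinal s = s == 1

step :: Int -> Char -> Int
step 0 'a' = 1; step 0 'b' = 0
step 1 'a' = 1; step 1 'b' = 0
step 2 'a' = 2; step 2 'b' = 2
step 3 'a' = 1; step 3 'b' = 3
step _ _   = error "no transition"

-- The extra parameter is the set of pairs on the current search path.
differentFrom :: S.Set (Int, Int) -> Int -> Int -> Bool
differentFrom seen a b
  | (a, b) `S.member` seen = False   -- looped: this path cannot distinguish them
  | otherwise =
         isFinal a /= isFinal b
      || any (\c -> differentFrom (S.insert (a, b) seen) (step a c) (step b c))
             chars

different :: Int -> Int -> Bool
different = differentFrom S.empty
```

Here `different 0 2` is True (reading 'a' separates them), while `different 0 3` is False: states 0 and 3 accept the same language, and the naive definition without the visited set would recurse forever on that pair.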
On Tue, Dec 13, 2016 at 5:26 PM, Edward Z. Yang wrote:

Hi Michael,

This is an interesting question, and I think you might indeed be able to "hack" into the recursive calls if you are willing to be explicit with your fixed-point calculation. Note that you'll have to *do* the work of keeping track of "visited" nodes, so to speak; there's no free lunch. But the construction of the table in a monadic form can be incorporated using what's known as "monadic memoization mixins." See this paper by Brown and Cook for details:

    https://www.cs.utexas.edu/users/wcook/Drafts/2009/sblp09-memo-mixins.pdf

The idea is to detour through a memo-routine before each recursive call to do the necessary bookkeeping. The underlying monad keeps track of the data you are memoizing.

However, before diving into memo-mixins, you might want to just implement the algorithm keeping track of a "visited" table in monadic style, just to appreciate what the mixins will buy you. For fun, I coded your minimize example in idiomatic Haskell here:

    https://gist.github.com/LeventErkok/615afe9a0e24249daedab3c623241275

Enjoy,

-Levent.
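To give a feel for the pattern, here is a minimal sketch of such a mixin (not the Brown and Cook code itself, and using fib rather than the DFA problem as a neutral example): the memoizer wraps an open-recursive body and detours through a Map held in the State monad before each recursive call.

```haskell
import qualified Data.Map as M
import Control.Monad.State

-- A tiny memo mixin: wrap an open-recursive monadic function so that every
-- recursive call first consults (and afterwards updates) the memo table.
memoize :: Ord a
        => ((a -> State (M.Map a b) b) -> a -> State (M.Map a b) b)
        -> a -> State (M.Map a b) b
memoize f = go
  where
    go a = do
      seen <- gets (M.lookup a)
      case seen of
        Just b  -> pure b              -- already computed: reuse it
        Nothing -> do
          b <- f go a                  -- tie the knot through the memoizer
          modify (M.insert a b)
          pure b

-- An open-recursive body: recursion goes through the supplied parameter.
fibOpen :: (Integer -> State (M.Map Integer Integer) Integer)
        -> Integer -> State (M.Map Integer Integer) Integer
fibOpen rec n
  | n < 2     = pure n
  | otherwise = (+) <$> rec (n - 1) <*> rec (n - 2)

fib :: Integer -> Integer
fib n = evalState (memoize fibOpen n) M.empty
```

Without the mixin, `fix fibOpen`-style recursion would take exponential time; with it, each argument is computed once. The "marker in the memo table" idea from the original message amounts to extending this table with an in-progress state.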

On Sat, Dec 10, 2016 at 8:50 AM, Michael George
different a b = isFinal a && not (isFinal b) || isFinal b && not (isFinal a) || exists chars (\c -> different (step a c) (step b c))
This definition diverges; it never decides that two states are the same. The standard algorithm builds a table, starting with the assumption that all states are the same, and iteratively applying this definition until it converges.
In the (skeleton of an) algorithm you linked, it is up to the implementor to ensure convergence; it is not guaranteed. This implementor requirement is hidden in the third step: "Repeat this step until we cannot mark anymore states". Note that "cannot" is a modal operator, which implies the existence of some sort of oracle that can tell us whether something is possible in any potential future. Such an oracle is typically implemented by:

    (a) carefully ordering things so that you achieve exhaustion by design;
    (b) having a work queue with things ordered carefully enough that an
        empty queue logically entails that no further changes can be made;
    (c) keeping track of visited sets, to explicitly detect cycles;
    (d) building the result in a step-indexed way to detect when convergence
        has been achieved;

or similar. Whether it terminates or not has nothing to do with memoization (memoization just makes the performance reasonable). Termination has to do with whether you've implemented things with a properly well-founded induction. For (a) or (b) that's induction on the size of the work queue, for (c) that's induction on the complement of the visited set, and for (d) that's induction on the step indices.

As an example of this last one, consider the following algorithm. Construct a descending chain of relations E0 >= E1 >= E2 >= ... as follows:

    E0     = { (s,t) | s <- states, t <- states, s /= t && isFinal s == isFinal t }

    E{i+1} = flip filter E{i} $ \(s,t) ->
               flip all chars $ \a ->
                 step s a == step t a || (step s a, step t a) `elem` E{i}

Stop once E{n} == E{n+1} for some n. Then E{n} `union` { (s,s) | s <- states } is the desired equivalence relation on states: it relates exactly the states that accept the same language. Note that this algorithm *does* terminate as written: because E0 is finite, and thanks to the properties of filter, either E{i+1} < E{i}, so the induction is permitted, or else E{i+1} == E{i}, so we stop. This particular algorithm is the dual to Myhill–Nerode, constructing the equivalence relation rather than the difference relation.

Fundamentally the insight is the same. Of course, the runtime of the algorithm I provided can be halved by making the elements of the relation unordered pairs rather than ordered ones. Operationally we can represent each E{i} as a symmetric 2D bitmap, a.k.a. a triangular 2D bitmap. Since the bits of E{i} are independent of one another, we can allocate and fill the triangular 2D bitmap in one go. Since each E{i+1} depends only on the elements of E{i}, we can drop the previous table once we've constructed the new one. (Or recycle it, to reduce allocation.)

@ezyang: does this algorithm meet your requirements for an elegant functional algorithm? If not, why not? (If so, it should be noted that I'm not the one who came up with it. I can give references if desired.)

--
Live well,
~wren
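The descending chain above transcribes almost directly into Haskell with Data.Set. Here is a sketch over an invented four-state DFA (states, chars, isFinal, step are assumptions for the example, not from the thread), keeping the ordered-pair representation rather than the halved triangular bitmap:

```haskell
import qualified Data.Set as S

-- Assumed toy DFA: states 0..3 over "ab"; state 1 is the only final state.
states :: [Int]
states = [0, 1, 2, 3]

chars :: [Char]
chars = "ab"

isFinal :: Int -> Bool
isFinal s = s == 1

step :: Int -> Char -> Int
step 0 'a' = 1; step 0 'b' = 0
step 1 'a' = 1; step 1 'b' = 0
step 2 'a' = 2; step 2 'b' = 2
step 3 'a' = 1; step 3 'b' = 3
step _ _   = error "no transition"

-- E0: distinct pairs that agree on finality.
e0 :: S.Set (Int, Int)
e0 = S.fromList
  [ (s, t) | s <- states, t <- states, s /= t, isFinal s == isFinal t ]

-- E{i+1} from E{i}: keep a pair only if every letter leads to an equal
-- state or a still-related pair.
refine :: S.Set (Int, Int) -> S.Set (Int, Int)
refine e = S.filter keep e
  where
    keep (s, t) = all (\a -> step s a == step t a
                          || (step s a, step t a) `S.member` e)
                      chars

-- Iterate until E{n+1} == E{n}; the descending chain on a finite E0
-- guarantees termination.
equivalentPairs :: S.Set (Int, Int)
equivalentPairs = go e0
  where
    go e = let e' = refine e in if e' == e then e else go e'
```

On this automaton the chain converges to {(0,3), (3,0)}: states 0 and 3 are the only mergeable pair, and adding the diagonal { (s,s) } gives the full equivalence relation.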
participants (6):
- Clinton Mead
- Edward Z. Yang
- Levent Erkok
- Li-yao Xia
- Michael George
- wren romano