
On Sat, Dec 10, 2016 at 8:50 AM, Michael George
different a b = isFinal a && not (isFinal b) || isFinal b && not (isFinal a) || exists chars (\c -> different (step a c) (step b c))
This definition diverges; it never decides that two states are the same. The standard algorithm builds a table, starting with the assumption that all states are the same, and iteratively applying this definition until it converges.
In the (skeleton of an) algorithm you linked, it is up to the implementor to ensure convergence, it is not guaranteed. This implementor requirement is hidden in the third step: "Repeat this step until we cannot mark anymore states". Note that "cannot" is a modal operator which implies the existence of some sort of oracle that can tell us about whether something is possible in any potential future. Such an oracle is typically implemented by: (a) carefully ordering things so that you achieve exhaustion by design; (b) having a work queue with things ordered carefully enough that an empty queue logically entails no further changes can be made; (c) keeping track of visited sets, to explicitly detect cycles; (d) building the result in a step-indexed way to detect when convergence has been achieved; or similar. Whether it terminates or not has nothing to do with memoization (memoization just makes the performance reasonable). Termination has to do with whether you've implemented things with a properly well-founded induction. For (a) or (b) that's induction on the size of the work queue, for (c) that's induction on the complement of the visited set, for (d) that's induction on the step indices. As an example of this last one, consider the algorithm: Construct a descending chain of relations E0 >= E1 >= E2 >= ... as follows: E0 = { (s,t) | s <- states, t <- states, s /= t && isFinal s == isFinal t } E{i+1} = flip filter E{i} $ \(s,t) -> flip all chars $ \a -> step s a == step t a || (step s a, step t a) `elem` E{i} Stop once E{n} == E{n+1} for some n. E{n} `union` {(s,s) | s <- states} is the finest equivalence relation on states. Note that this algorithm *does* terminate as written: because E0 is finite, and thanks to properties of filter either E{i+1} < E{i} so the induction is permitted, or else E{i+1} == E{i} so we stop. This particular algorithm is the dual to Myhill–Nerode, constructing the equivalence relation rather than constructing the difference relation. Fundamentally the insight is the same. Of course, the runtime of the algorithm I provided can be halved by making the elements of the relation unordered pairs rather than ordered ones. Operationally we can represent each E{i} as a symmetric 2D bitmap, aka a triangular 2D bitmap. Since each bit of E{i} is independent from one another, we can allocate and fill the triangular 2D bitmap in one go. Since each E{i+1} only depends on the elements of E{i} we can drop the previous table once we've constructed the new one. (Or recycle it, to reduce allocation.) @ezyang: does this algorithm meet your requirements for an elegant functional algorithm? If not, why not? (If so, it should be noted that I'm not the one to come up with it. I can give references if desired.) -- Live well, ~wren