
I guess that deciding whether two functions are equal in most cases is algorithmically impossible. However maybe there exists quite a large domain of decidable cases? If so, how can I employ that in Haskell? It is a common situation when one has two implementations of the same function, one being straightforward but slow, and the other being fast but complex. It would be nice to be able to check if these two versions are equal to catch bugs in the more complex implementation.

2011/4/9 Grigory Sarnitskiy
I guess that deciding whether two functions are equal in most cases is algorithmically impossible. However maybe there exists quite a large domain of decidable cases? If so, how can I employ that in Haskell?
It is a common situation when one has two implementations of the same function, one being straightforward but slow, and the other being fast but complex. It would be nice to be able to check if these two versions are equal to catch bugs in the more complex implementation.
Hi, Instead a trying to decide if they are equal, I would simply go through a well-known route: Pick the simple implementation and test toroughly, to see if it is 'correct' w.r.t. some specification. This involves things like unit tests and QuickCheck. Then apply those those tests to the second implementation once you're satified with the results of the first one. This transforms your problem of deciding if two functions are equal into trusting enough your two functions given some tests. Cheers, Thu

Excerpts from Grigory Sarnitskiy's message of Sat Apr 09 13:26:28 -0400 2011:
I guess that deciding whether two functions are equal in most cases is algorithmically impossible. However maybe there exists quite a large domain of decidable cases? If so, how can I employ that in Haskell?
In the case of functions where the domain and range are finite, function equality is decidable but not usually feasible. If your function is a combinatorial circuit, you can apply technology like SAT solvers to hopefully decide equality in faster than exponential time (this is what Cryptol does; you may find it interesting.) Cheers, Edward

On Sat, Apr 9, 2011 at 11:26 AM, Grigory Sarnitskiy
I guess that deciding whether two functions are equal in most cases is algorithmically impossible. However maybe there exists quite a large domain of decidable cases? If so, how can I employ that in Haskell?
It is a common situation when one has two implementations of the same function, one being straightforward but slow, and the other being fast but complex. It would be nice to be able to check if these two versions are equal to catch bugs in the more complex implementation.
Every function with a "searchable" domain and a decidable codomain has decidable equality. The classic example of a big searchable domain is the cantor space: import Data.Searchable type Nat = Int -- works with Integer too type Cantor = Nat -> Bool bit :: Set Bool bit = doubleton True cantor :: Set Cantor cantor = fmap (!!) . sequence . repeat $ bit decEq :: (Eq a) => (Cantor -> a) -> (Cantor -> a) -> Bool decEq f g = forEvery (\c -> f c == g c) So for example: ghci> decEq ($ 4) ($ 4) True ghci> decEq ($ 4) ($ 100) False ghci> decEq (\c -> c 4 && c 42) (\c -> not (not (c 4) || not (c 42))) True Searchable is based on work by Martin Escardo, very cool stuff. Data.Searchable comes from the infinite-search package. Luke

On 4/9/11 1:26 PM, Grigory Sarnitskiy wrote:
I guess that deciding whether two functions are equal in most cases is algorithmically impossible. However maybe there exists quite a large domain of decidable cases? If so, how can I employ that in Haskell?
It is a common situation when one has two implementations of the same function, one being straightforward but slow, and the other being fast but complex. It would be nice to be able to check if these two versions are equal to catch bugs in the more complex implementation.
This common situation is often actually one of the harder ones to prove, I say coming from proving a few of them in Coq. The thing is that a lot of the common optimizations (e.g., TCO) completely wreck the inductive structure of the function which, in turn, makes it difficult to say interesting things about them.[1] The easy path forward for this situation is to demonstrate the correctness of the slow/obvious implementation and then use a combination of lazy SmallCheck, QuickCheck, and HUnit in order to show that the fast implementation produces equal outputs for all small inputs, randomly chosen inputs, and select manually chosen inputs. Or, if you happen to be working with a nice well-behaved type, then you can use circuit SAT, SMT, and other domain solvers, or Martin Escardo's excellent work on compact spaces. [1] Though if something as simple as TCO is so hard to prove equivalences with, maybe that says something about our current crop of proof assistants and theorem provers. -- Live well, ~wren

On 10/04/2011 04:22, wren ng thornton wrote:
The thing is that a lot of the common optimizations (e.g., TCO) completely wreck the inductive structure of the function which, in turn, makes it difficult to say interesting things about them.[1]
Could you point me to some Haskell references concerning this point. Thanks, Pat This message has been scanned for content and viruses by the DIT Information Services E-Mail Scanning Service, and is believed to be clean. http://www.dit.ie

It is a common situation when one has two implementations of the same function, one being straightforward but slow, and the other being fast but complex. It would be nice to be able to check if these two versions are equal to catch bugs in the more complex implementation.
This common situation is often actually one of the harder ones to prove, I say coming from proving a few of them in Coq. The thing is that a lot of the common optimizations (e.g., TCO) completely wreck the inductive structure of the function which, in turn, makes it difficult to say interesting things about them.[1]
The traditional approach is to derive the efficient version from the simple, obviously correct version, by a series of small code transformations. The steps would include meaning-preserving equivalences as well as refinements (where implementation decisions come in to narrow down the set of equivalent code). Advantages: codes are equivalent by construction (modulo implementation decisions), and the relationship is documented (so you can replay it in case requirements changes make you want to revisit some implementation decisions). Even with modern provers/assistants, this should be easier than trying to relate two separately developed pieces of code, though I can't speak from experience on this last point. But there have been derivations of tail-recursive code from the general form (don't have any reference at hand right now). Claus
participants (7)
-
Claus Reinke
-
Edward Z. Yang
-
Grigory Sarnitskiy
-
Luke Palmer
-
Patrick Browne
-
Vo Minh Thu
-
wren ng thornton