Verifying Haskell programs vs. generating verified Haskell programs

Hello Haskell fellows,

Recently there has been huge progress in generating real programs from specifications written in interactive theorem provers like Isabelle or Coq; in particular, a verified C compiler has been generated from a specification in Coq [0]. Furthermore, it is often stated that it is easier to reason about functional languages. So, in my opinion, there are two approaches to building verified software (in the case of compilers: verifying that the semantics of a high-level language maps correctly to the semantics of a low-level language):

(0) Programming in any language and then verifying the program with an external theorem prover.
(1) Specifying the program in a proof language and then generating program code from the specification (as in the CompCert project).

I think both ideas have their (dis)advantages. The question is: which concept will be used more frequently? In particular, how hard would it be to build something like CompCert the other way around (in this case, writing a C compiler in SML and then proving its correctness)?

Best regards,
Uli Kastlunger

[0] http://compcert.inria.fr/

On Tue, Jun 21, 2011 at 3:31 AM, Uli Kastlunger wrote:
(0) Programming in any language and then verifying the program with an external theorem prover.
(1) Specifying the program in a proof language and then generating program code from the specification (as in the CompCert project).
There's a second (haha) approach, which I use basically every day: use the type language fragment of a strongly typed programming language to express a specification, and then rely on "free" functions/theorems and the Curry-Howard isomorphism to guarantee correctness of the implementation relative to that specification.
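To make the type-as-specification idea concrete, here is a minimal Haskell sketch (a made-up example, not code from any of the projects mentioned): with GHC's DataKinds and GADTs, the type of append over length-indexed vectors is itself the specification "the output length is the sum of the input lengths", and the type checker rejects any implementation that violates it.

{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

-- Natural numbers, promoted to the kind level by DataKinds.
data Nat = Z | S Nat

-- Vectors indexed by their length.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Type-level addition.
type family Add (m :: Nat) (n :: Nat) :: Nat where
  Add 'Z     n = n
  Add ('S m) n = 'S (Add m n)

-- The type is the specification: the result length is m + n.
-- Any definition that changes the length is rejected at compile time.
append :: Vec m a -> Vec n a -> Vec (Add m n) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)

Of course the type does not pin down everything (it cannot tell appending from, say, interleaving); that is where the free theorems come in.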

On Tue, Jun 21, 2011 at 3:31 AM, Uli Kastlunger wrote:
In particular, how hard would it be to build something like CompCert the other way around (in this case, writing a C compiler in SML and then proving its correctness)?
Proving the correctness of a C compiler is quite a challenge regardless of how you write the compiler. The reason is that you first need a well-defined version of C, so that you have a specification for the C compiler that can be used in a formal-methods approach. Assuming you've done that, SML would lend itself just fine to being your implementation language, and a tool like Isabelle/HOL could serve as your external theorem prover.

If your compiler is fully verified, that's great news, but you could still write buggy programs with it. Likewise, you could have proven your program correct with respect to the formal semantics of the language and your program specification, but then used a buggy compiler. So it seems to me that the two approaches you outline don't just have different advantages; they are solving different, though closely related, problems.

I'm no expert on CompCert, but as I understand it, their approach is to make only semantics-preserving changes to the program at each step of the translation from C source to binary. I'm not sure what they used as their specification of C, and it seems the correctness of their approach would still depend on such a spec.

I hope that helps,
Jason
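As a toy illustration of what such a correctness statement looks like (this is a made-up example, and has nothing to do with CompCert's actual development), take a tiny expression language, its evaluator as the source semantics, and a little stack machine as the target. Here the correctness property is merely tested with QuickCheck; in a proof assistant it would be proved once and for all by induction on the expression.

import Test.QuickCheck

-- Source language and its semantics.
data Expr = Lit Int | Add Expr Expr
  deriving Show

eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b

-- Target language: a stack machine.
data Instr = Push Int | AddOp

compile :: Expr -> [Instr]
compile (Lit n)   = [Push n]
compile (Add a b) = compile a ++ compile b ++ [AddOp]

exec :: [Instr] -> [Int] -> [Int]
exec []            s           = s
exec (Push n : is) s           = exec is (n : s)
exec (AddOp  : is) (y : x : s) = exec is (x + y : s)
exec (AddOp  : _)  _           = error "stack underflow"

instance Arbitrary Expr where
  arbitrary = sized go
    where
      go 0 = Lit <$> arbitrary
      go n = oneof [ Lit <$> arbitrary
                   , Add <$> go (n `div` 2) <*> go (n `div` 2) ]

-- The compiler-correctness statement: running the compiled code on an
-- empty stack leaves exactly the value of the source expression.
prop_compile_correct :: Expr -> Bool
prop_compile_correct e = exec (compile e) [] == [eval e]

Scaling that property up from a three-constructor toy to all of C is, of course, exactly where the well-defined C semantics becomes the hard part.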

2011/6/21 Jason Dagit wrote:
Proving the correctness of a C compiler is quite a challenge regardless of how you write the compiler. The reason is that you first need a well-defined version of C, so that you have a specification for the C compiler that can be used in a formal-methods approach. Assuming you've done that, SML would lend itself just fine to being your implementation language, and a tool like Isabelle/HOL could serve as your external theorem prover.
First, a small correction regarding CompCert: I got this wrong; it generates OCaml code, not SML code.

You are right that, to build a verified compiler, you need a specified version of C and also a specified version of your target assembly language. But the question remains whether it is more practical to write your compiler in a language like Haskell and then verify it, or to do it the other way around and use a specification language from which verified code can be generated.

br, uli

Jason Dagit wrote:
I'm no expert on CompCert, but as I understand it, their approach is to make only semantics-preserving changes to the program at each step of the translation from C source to binary. I'm not sure what they used as their specification of C, and it seems the correctness of their approach would still depend on such a spec.
The CompCert C compiler's verified pathway starts at C-Light, passes through a number of intermediate languages, and ends at PowerPC/ARM assembly code. The existence of a reliable assembler for PowerPC or ARM is assumed, as is the correctness of the CIL tool that transforms C code into C-Light. Blazy and Leroy gave an operational semantics of C-Light in [1]. Naturally, you can then ask how reliable a model of a "stripped-down C" C-Light really is. I believe Regehr's failure to find any bugs in the verified "middle-end" of the CompCert compiler goes some way toward clearing that up (he did find bugs in the unverified front-end, though) [2]!

Note that, in regard to CompCert, "semantics preserving" actually means preservation of a program's extensional semantics. There's no guarantee that the CompCert C compiler will not transform an efficient piece of C code into some computational atrocity, albeit one with the same extensional "meaning", at the assembly level. We consider this problem in CerCo [3], a project related to CompCert which uses essentially the same intermediate languages (including starting at C-Light), but which also proves that each transformation step preserves, or improves, the concrete complexity of the code under transformation.
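A toy Haskell illustration of that distinction (my own example, not CerCo's code): the two functions below are extensionally equal, so a transformation replacing one by the other counts as "semantics preserving" in the extensional sense, even though it changes the complexity class.

-- Exponential time: recomputes both subproblems at every step.
fibSlow :: Integer -> Integer
fibSlow 0 = 0
fibSlow 1 = 1
fibSlow n = fibSlow (n - 1) + fibSlow (n - 2)

-- Linear time: threads the two previous results through accumulators.
fibFast :: Integer -> Integer
fibFast n = go n 0 1
  where
    go 0 a _ = a
    go k a b = go (k - 1) b (a + b)

-- fibSlow n == fibFast n for every n >= 0, so an extensional correctness
-- theorem cannot distinguish them; a cost-preservation theorem can.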
There's a second (haha) approach, which I use basically every day: use the type language fragment of a strongly typed programming language to express a specification, and then rely on "free" functions/theorems and the Curry-Howard isomorphism to guarantee correctness of the implementation relative to that specification.
How does this count as a distinct approach to the problem? It's essentially what happens when you verify a program in Coq.

Further, there's much too sharp a distinction in the OP's mind between constructing a verified program in a proof assistant and verifying an existing program. Every large-scale verification effort starts out with a prototype written in a high-level language, as far as I can tell. seL4's verification started with a Haskell prototype, IIRC, and CerCo's compiler started as an OCaml prototype before we began translating it into Matita's internal language [4]. CompCert, AFAIK, did exactly the same thing we do, using an OCaml prototype. It is *much* cheaper to make large-scale design changes in a prototype written in a high-level programming language than it is in the internal language of a proof assistant.

[1] http://gallium.inria.fr/~xleroy/publi/Clight.pdf
[2] http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf
[3] http://cerco.cs.unibo.it
[4] http://matita.cs.unibo.it

On Wed, Jun 22, 2011 at 8:11 AM, Dominic Mulligan <dominic.p.mulligan@googlemail.com> wrote:
How does this count as a distinct approach to the problem? It's essentially what happens when you verify a program in Coq.
Further, there's much too sharp a distinction in the OP's mind between constructing a verified program in a proof assistant and verifying an existing program.
Yes, I agree about your further point. And if we agree there is little-to-no distinction between using an external tool and an internal sub-language, my point becomes even weaker. But I do think we can agree there is some difference between a total language (i.e., a proof assistant), a partial language with strong typing (like Haskell), and a memory-poking-and-peeking magma (like C).

My point was that we don't necessarily have to go for a total language to get logical proof. We can instead rely on derivable/free functions for most of the verification, and on paper-and-pencil proof, proof by inspection, etc. for the rest.
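A small sketch of the division of labour I have in mind, using a hypothetical rotate function: its fully polymorphic type hands us the free theorem map g . rotate == rotate . map g with no proof effort at all, and only the residual, non-parametric part of the specification is left for testing or pencil-and-paper proof.

import Test.QuickCheck

-- The polymorphic type alone forces every element of the output to come
-- from the input; rotate cannot invent or inspect values of type a.
rotate :: [a] -> [a]
rotate []       = []
rotate (x : xs) = xs ++ [x]

-- The free theorem, checked here at one instantiation; parametricity
-- guarantees it for every g.
prop_free_theorem :: [Int] -> Bool
prop_free_theorem xs = map show (rotate xs) == rotate (map show xs)

-- The residual, non-free part of the specification: the head really
-- moves to the back.
prop_rotate_spec :: [Int] -> Property
prop_rotate_spec xs = not (null xs) ==> rotate xs == tail xs ++ [head xs]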
participants (4):
- Alexander Solla
- Dominic Mulligan
- Jason Dagit
- Uli Kastlunger