Formal verification of containers

older
Preventing double-free error with...

Joachim Breitner

1 Feb 2018 1 Feb '18

4:17 a.m.

Hi containers maintainers, you might have already read it between the lines, but some people here at Penn are currently using hs-to-coq to translate the containers library to Coq and verify it. We just reached a milestone by verifying a portion of Data.IntSet large enough to instantiate Coq’s FSetInterface: https://github.com/antalsz/hs-to-coq/blob/master/examples/containers/theorie... There is work on verifying Data.Set as well: https://github.com/antalsz/hs-to-coq/blob/master/examples/containers/theorie... Isn’t that exciting? It raises the question whether there is value in integrating these proofs in some way into the development process of containers. One could imagine adding a job to travis that translates container’s master into Coq and checks if the proofs go through. On the one hand, that’s of course great, because it’d make containers one of the most thoroughly verified pieces of Haskell in use out there. On the other hand, we’d have to choose the least bad of a bunch of bad options when it comes to future contributions (I am listing unfeasible options as well): - we essentially freeze the code - we’d lose the verification again quickly, once someone makes a non- trivial change to the code - we’d have to restrict contributions to those who are able to update the proofs - we’d still welcome contributions that break the proofs, but let them sit in pull requests until someone updates the proofs on the PR and then merge - (maybe a bit far fetched, but who knows) we get funding to pay someone to keep the proofs in sync, without slowing down other contributions I honestly don’t know which of these are the most viable, or if none of them make any sense and we should not hook up the verification with container’s development. What do you think? Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

Attachments:

signature.asc (application/pgp-signature — 833 bytes)

Show replies by date

David Feuer

1 Feb 1 Feb

5:02 a.m.

I don't think we should limit containers development like that. I do imagine it would be valuable to have a separate package, perhaps called verified-containers, that tries to stay a few weeks or months behind containers. On Jan 31, 2018 11:18 PM, "Joachim Breitner" wrote:

...

Hi containers maintainers,

you might have already read it between the lines, but some people here at Penn are currently using hs-to-coq to translate the containers library to Coq and verify it. We just reached a milestone by verifying a portion of Data.IntSet large enough to instantiate Coq’s FSetInterface: https://github.com/antalsz/hs-to-coq/blob/master/examples/ containers/theories/IntSetProofs.v

There is work on verifying Data.Set as well: https://github.com/antalsz/hs-to-coq/blob/master/examples/ containers/theories/SetProofs.v

Isn’t that exciting?

It raises the question whether there is value in integrating these proofs in some way into the development process of containers. One could imagine adding a job to travis that translates container’s master into Coq and checks if the proofs go through.

On the one hand, that’s of course great, because it’d make containers one of the most thoroughly verified pieces of Haskell in use out there.

On the other hand, we’d have to choose the least bad of a bunch of bad options when it comes to future contributions (I am listing unfeasible options as well):

- we essentially freeze the code - we’d lose the verification again quickly, once someone makes a non- trivial change to the code - we’d have to restrict contributions to those who are able to update the proofs - we’d still welcome contributions that break the proofs, but let them

sit in pull requests until someone updates the proofs on the PR and then merge - (maybe a bit far fetched, but who knows) we get funding to pay someone to keep the proofs in sync, without slowing down other contributions

I honestly don’t know which of these are the most viable, or if none of them make any sense and we should not hook up the verification with container’s development.

What do you think?

Cheers, Joachim

-- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Henning Thielemann

10:42 a.m.

On Thu, 1 Feb 2018, David Feuer wrote:

...

I don't think we should limit containers development like that. I do imagine it would be valuable to have a separate package, perhaps called verified-containers, that tries to stay a few weeks or months behind containers.

I am happy to change all my dependencies from 'containers' to 'containers-verified' (for easier association in a lexicographically ordered list).

Joachim Breitner

2:15 p.m.

Hi Am Donnerstag, den 01.02.2018, 11:42 +0100 schrieb Henning Thielemann:

...

On Thu, 1 Feb 2018, David Feuer wrote:

...
I don't think we should limit containers development like that. I do imagine it would be valuable to have a separate package, perhaps called verified-containers, that tries to stay a few weeks or months behind containers.

I am happy to change all my dependencies from 'containers' to 'containers-verified' (for easier association in a lexicographically ordered list).

that is an interesting idea, and a nicely generalizable pattern, David. I guess there are a view variants: Should containers-verified export only the verified subset of the API, or all of it? The former is more honest, the latter easier to switch to for users. Should it a) simply depend on a particular version of containers, and re-export the functions, or b) be a real code copy, or c) be a code copy with the exception of the data types, so that you can interoperate with the existing containers library? b (maybe even with hiding the ….Internals module) prevents users from creating non-well-formed trees, but it would prevent you from using containers-verified in a library that exports the types as part of its API. Maybe one could add “toVerified” and “fromVerified” functions that convert between the two, with run-time checks? Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

Henning Thielemann

2:35 p.m.

On Thu, 1 Feb 2018, Joachim Breitner wrote:

...

I guess there are a view variants:

Should containers-verified export only the verified subset of the API, or all of it? The former is more honest, the latter easier to switch to for users.

... then the question: Should containers-verified export Data.Map or Data.Map.Verified? If it exports Data.Map but not all other modules of 'containers', then there would be no way to access all modules of 'containers'. So maybe containers-verified should export Data.Map.Verified. But then you can as well add Data.Map.Verified to 'containers'.

Joachim Breitner

4:09 p.m.

Hi, Am Donnerstag, den 01.02.2018, 15:35 +0100 schrieb Henning Thielemann:

...

On Thu, 1 Feb 2018, Joachim Breitner wrote:

...
I guess there are a view variants:

Should containers-verified export only the verified subset of the API, or all of it? The former is more honest, the latter easier to switch to for users.

... then the question: Should containers-verified export Data.Map or Data.Map.Verified? If it exports Data.Map but not all other modules of 'containers', then there would be no way to access all modules of 'containers'. So maybe containers-verified should export Data.Map.Verified.

Valid point. I guess we should just verify all of it :-) There is the option of using package-qualified imports! https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts... The question would be whether we put a little burden on those users who need to mix-and-match, or on those who would be happy with the portion of the API provided by containers-verified. Not renaming also makes it easier to merge changes from containers, but not by a lot.

...

But then you can as well add Data.Map.Verified to 'containers'.

Not really, the main reason to split it off is to not impose the burden of the verification infrastructure on container’s maintainers and/or contributors. Cheers, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de https://www.joachim-breitner.de/

David Feuer

4:58 p.m.

containers-verified should only export the verified part, for sure. I think it should mostly re-export, but if it needs to use a different version of a function (perhaps temporarily) then it can do so. For example, if containers makes a function faster at the expense of some complication, then containers-verified may want to stick with the old version until the new one can be verified (or someone can come up with a simpler fast one). On Feb 1, 2018 9:15 AM, "Joachim Breitner" wrote:

...

Hi

Am Donnerstag, den 01.02.2018, 11:42 +0100 schrieb Henning Thielemann:

...
On Thu, 1 Feb 2018, David Feuer wrote:

...
I don't think we should limit containers development like that. I do imagine it would be valuable to have a separate package, perhaps called verified-containers, that tries to stay a few weeks or months behind containers.

I am happy to change all my dependencies from 'containers' to 'containers-verified' (for easier association in a lexicographically ordered list).

that is an interesting idea, and a nicely generalizable pattern, David.

I guess there are a view variants:

Should containers-verified export only the verified subset of the API, or all of it? The former is more honest, the latter easier to switch to for users.

Should it a) simply depend on a particular version of containers, and re-export the functions, or b) be a real code copy, or c) be a code copy with the exception of the data types, so that you can interoperate with the existing containers library?

b (maybe even with hiding the ….Internals module) prevents users from creating non-well-formed trees, but it would prevent you from using containers-verified in a library that exports the types as part of its API. Maybe one could add “toVerified” and “fromVerified” functions that convert between the two, with run-time checks?

Cheers, Joachim

-- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Tom Murphy

2 Feb 2 Feb

2:50 a.m.

(My 2 cents, not as a containers developer) First of all, bravo! This is great news. On Wed, Jan 31, 2018 at 11:17 PM, Joachim Breitner

...

wrote:

...

- we’d still welcome contributions that break the proofs, but let them sit in pull requests until someone updates the proofs on the PR and then merge

If we imagine, as David does, that updating proofs will take a couple of weeks (or months), I'd imagine an advantage of this option is that: - The percent of containers users who only use features older than weeks or months from bleeding edge: 95+% - The percent of end-users who will realistically switch their code from containers to containers-verified: certainly less than 20% If we can seamlessly switch the vast majority of users to this new verified code, and be able to say that a ubiquitous component of Haskell development is formally verified, I'd say that's a huge accomplishment. An alternative proposal would be to have a "containers-unverified" *upstream* instead of a "containers-verified" downstream. This would also neatly sidestep the problem of if "containers-verified" should export unverified code: it's just "containers" as usual, just with some extra verification. Tom

David Feuer

3:03 a.m.

It sounds nice, but keeping the verified version up to date requires a steady supply of very highly skilled labor. I personally don't think it reasonable to even consider making containers itself the verified version unless the verifiers are able to 1. Stay fairly up to date for a good while (say five or six years, after the current researchers have gotten their degrees/professorships), AND 2. At the end of that period, explain their plan for continuing the effort. On Feb 1, 2018 9:50 PM, "Tom Murphy" wrote:

...

(My 2 cents, not as a containers developer)

First of all, bravo! This is great news.

On Wed, Jan 31, 2018 at 11:17 PM, Joachim Breitner < mail@joachim-breitner.de> wrote:

...
- we’d still welcome contributions that break the proofs, but let them sit in pull requests until someone updates the proofs on the PR and then merge

If we imagine, as David does, that updating proofs will take a couple of weeks (or months), I'd imagine an advantage of this option is that: - The percent of containers users who only use features older than weeks or months from bleeding edge: 95+% - The percent of end-users who will realistically switch their code from containers to containers-verified: certainly less than 20%

If we can seamlessly switch the vast majority of users to this new verified code, and be able to say that a ubiquitous component of Haskell development is formally verified, I'd say that's a huge accomplishment.

An alternative proposal would be to have a "containers-unverified" *upstream* instead of a "containers-verified" downstream. This would also neatly sidestep the problem of if "containers-verified" should export unverified code: it's just "containers" as usual, just with some extra verification.

Tom

_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

Gershom B

3:17 a.m.

In the initial message, Joachim suggested “One could imagine adding a job to travis that translates container’s master into Coq and checks if the proofs go through.” It would seem to me that an ideal situation would be if there were “proof-tests” similar to doctests. Essentially, portions of code that have corresponding Coq proofs about them have an annotation that indicates the proofs, and indicates to the automated job that they are to be tested against those proofs. When changes are made that do not have corresponding updates to the proofs, then the corresponding annotations would need to be commented-out (or perhaps, replaced with some other annotation that explicitly indicates a skipped or missing proof). This would let the proper containers core be verified, with mechanically-checked annotations indicating where the verification hasd and has not been performaed. Some sort of re-exported subset of only the verified stuff could then be automatically extracted from this setup for users that wanted to restrict themselves to only this subset. —g

David Feuer

3:25 a.m.

Oh, I'm perfectly fine with *checking* everything. But to give the users who want verification a consistent API, the verified package will probably have to include old or modified versions of some functions or even modules. On Feb 1, 2018 10:17 PM, "Gershom B" wrote:

...

In the initial message, Joachim suggested “One could imagine adding a job to travis that translates container’s master into Coq and checks if the proofs go through.”

It would seem to me that an ideal situation would be if there were “proof-tests” similar to doctests. Essentially, portions of code that have corresponding Coq proofs about them have an annotation that indicates the proofs, and indicates to the automated job that they are to be tested against those proofs. When changes are made that do not have corresponding updates to the proofs, then the corresponding annotations would need to be commented-out (or perhaps, replaced with some other annotation that explicitly indicates a skipped or missing proof).

This would let the proper containers core be verified, with mechanically-checked annotations indicating where the verification hasd and has not been performaed. Some sort of re-exported subset of only the verified stuff could then be automatically extracted from this setup for users that wanted to restrict themselves to only this subset.

—g

Joachim Breitner

4:01 a.m.

Hi, Am Donnerstag, den 01.02.2018, 22:25 -0500 schrieb David Feuer:

...

Oh, I'm perfectly fine with *checking* everything. But to give the users who want verification a consistent API, the verified package will probably have to include old or modified versions of some functions or even modules.

at the moment, we don’t have to modify containers. To be precise, we don’t have to modify it’s source. The translation to Coq does a few modification, such as highestBitMask, which uses raw pointers, which we cannot translate. I don’t expect that this will change. But it is true that if containers changes their code in a way that breaks the proofs (which is not every change – we work on what GHC sees after parsing and renaming, so a fair amount of changes might not even affect the proofs), then containers- verified would have to stick with the old version. A viable approach might be a containers-verified package that depends on a precise version of containers and re-exports the verified subset of containers. It might lag a few versions behind, but for something stable like containers, that might still be useful for people who want bragging rights about using verified code. Practically, the verification has not uncovered any bugs yet (unless one counts the incomplete implementation of `valid` for IntSet, but that only affects the test suite), so users will not gain too much… Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

Joachim Breitner

8 Feb 8 Feb

11:21 p.m.

Hi, Am Donnerstag, den 01.02.2018, 23:01 -0500 schrieb Joachim Breitner:

...

A viable approach might be a containers-verified package that depends on a precise version of containers and re-exports the verified subset of containers. It might lag a few versions behind, but for something stable like containers, that might still be useful for people who want bragging rights about using verified code.

here is how this could look like: https://hackage.haskell.org/package/containers-verified-0.5.11.0/candidate https://github.com/nomeata/containers-verified I hope that we can extend the covered API somemore in the next weeks :-) Cheers, Joachim -- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

David Feuer

11:50 p.m.

Interesting! I look forward to seeing more. One possible twist would be to wrap the re-exported types in newtype wrappers to hide the unverified instances, perhaps in a separate Wrapped module. But that might be more trouble than it's worth. When you get to it, there are some hard-coded size limits in Data.Map.alterF for word sizes below 61 bits that I would *love* to see checked formally; I trust those less than pretty much anything else in the package, and testing them is hard. On Feb 8, 2018 6:21 PM, "Joachim Breitner" wrote:

...

Hi,

Am Donnerstag, den 01.02.2018, 23:01 -0500 schrieb Joachim Breitner:

...
A viable approach might be a containers-verified package that depends on a precise version of containers and re-exports the verified subset of containers. It might lag a few versions behind, but for something stable like containers, that might still be useful for people who want bragging rights about using verified code.

here is how this could look like: https://hackage.haskell.org/package/containers-verified-0.5.11.0/candidate https://github.com/nomeata/containers-verified

I hope that we can extend the covered API somemore in the next weeks :-)

Cheers, Joachim

-- Joachim Breitner mail@joachim-breitner.de http://www.joachim-breitner.de/

_______________________________________________ Libraries mailing list Libraries@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

2701

Age (days ago)

2708

Last active (days ago)

List overview

Download

13 comments

5 participants

participants (5)

David Feuer
Gershom B
Henning Thielemann
Joachim Breitner
Tom Murphy

Formal verification of containers

tags

participants (5)