Enforcing data structures invariant in the type system

Hi all. I’m very interested in what can be done with Haskell’s type system regarding the enforcing of algorithms and data structures invariants. For example, I remember I saw a paper about an implementation of a red-black tree in Haskell where the types guaranteed the invariant about the alternation of rad and black nodes. I would like to learn those techniques, in particular the use of language features that enable this kind of things and then how to do them in practice. In the past tried hacking something with GADTs and Data Kinds but I've always got stuck with GHC errors that I could not understand, so I think I’m missing something. Where could I start from? Are there simple walk-through examples and/or fundational material? For example is it possible to write a sorting function that guarantees in the types that the output is sorted? I don’t need this specific example but you got the point. Thank you very much, Nicola

Hello, First of all, if you don't mind me not answering your question directly, I'd strongly suggest learning Agda or Idris before you start using dependent types in Haskell. (for example try the first two tutorials from here: http://wiki.portal.chalmers.se/agda/pmwiki.php?n=Main.Othertutorials) These are true dependently typed languages, and while GHC's extensions can get Haskell remarkably close, the result is still somewhat clunky. IMHO learning the basics in a cleaner environment is going to be much easier. Secondly, have a look at Liquid Haskell: http://ucsd-progsys.github.io/liquidhaskell-tutorial/01-intro.html Refinement types are a restricted kind of dependent types that can be solved automagically by SMT solvers. Thirdly, as for your actual question, have a look at this series of tutorials: https://www.fpcomplete.com/user/konn/prove-your-haskell-for-great-safety/dep... Best regards, Marcin Mrotek

Il giorno 30/mar/2015, alle ore 19:14, Marcin Mrotek
ha scritto: Hello,
Hi!
First of all, if you don't mind me not answering your question directly, I'd strongly suggest learning Agda or Idris before you start using dependent types in Haskell. (for example try the first two tutorials from here: http://wiki.portal.chalmers.se/agda/pmwiki.php?n=Main.Othertutorials) These are true dependently typed languages, and while GHC's extensions can get Haskell remarkably close, the result is still somewhat clunky. IMHO learning the basics in a cleaner environment is going to be much easier.
I appreciate this suggestion! Agda has been on the todo list for a while. Which do you suggest to start from, choosing between agda and idris? Are there any major differences?
Secondly, have a look at Liquid Haskell: http://ucsd-progsys.github.io/liquidhaskell-tutorial/01-intro.html Refinement types are a restricted kind of dependent types that can be solved automagically by SMT solvers.
I’ve read a paper about liquid haskell some time ago and I find it amazing. What I’d like to ask now is: is liquid haskell strictly more expressive than the haskell type system or is “only” a more convenient way of writing the constraints in a way that can be efficiently solved?
Thirdly, as for your actual question, have a look at this series of tutorials: https://www.fpcomplete.com/user/konn/prove-your-haskell-for-great-safety/dep...
Thank you! Those are exactly the droids I was looking for.
Best regards, Marcin Mrotek
Greetings, Nicola

Hello,
I appreciate this suggestion! Agda has been on the todo list for a while. Which do you suggest to start from, choosing between agda and idris? Are there any major differences?
I admit I only have experience in using Agda, I've name-checked Idris after only reading the docs (the last time I tried to install the compiler, cabal didn't let mi do it because of package conflits; now I tried again and it's doing just fine). Idris is newer and, apparently, more oriented towards practical programming than theorem proving, that's all I know.
I’ve read a paper about liquid haskell some time ago and I find it amazing. What I’d like to ask now is: is liquid haskell strictly more expressive than the haskell type system or is “only” a more convenient way of writing the constraints in a way that can be efficiently solved?
It depends on what you mean by "Haskell's type system". It's more capable than Haskell's type system without extensions, and less than full-on dependent typing (delibrately so, because dependent type equivalence is undecidable). I don't know where exactly does Haskell-with-GHC-extensions type system lie. I think it's the same as, let's say, Agda's, only requiring more kludges when moving back and forth between type- and value-level, but I'm not sure. I'm just a "self"-taught programmer, not a computer scientist ;) Best regards, Marcin Mrotek

I thought I'd weigh in here, as my dissertation is all about how to encoding invariants into Haskell's type system better. Here are some reactions to this thread: - In my opinion, Haskell is very capable of expressing invariants in its type system today -- you don't need to go to Idris or Agda to get there. In certain cases, this encoding can be quite natural and unobtrusive, and sometimes easier to work with than in full-on dependently typed languages. The problem is that it's very hard for a beginner in this space to tell which invariants are easy to encode and which are hard. The red-black tree example is a great one for an "easy-to-encode" invariant. The "Hasochism" paper [1] shows another. On the other hand, my two (successful) attempts at "proving" a sorting algorithm correct are very painful (insertion sort [2] and merge sort [3]). I'm sure one component in the difference in level of elegance is that Conor and Sam are better dependently-typed programmers than I am! [1]: https://personal.cis.strath.ac.uk/conor.mcbride/pub/hasochism.pdf [2]: https://github.com/goldfirere/singletons/blob/master/tests/compile-and-dump/... [3]: https://github.com/goldfirere/nyc-hug-oct2014/blob/master/OrdList.hs - The `singletons` package [4], can be helpful in doing dependently typed programming in Haskell, and is cited in references previously mentioned in this thread. But I wouldn't rely on it too much -- once you need lots of singletons (more than, say, for numbers and Bools), your code will get very bogged down. It's either time to come up with a different way of encoding for your invariant, give up on compile-time verification, or switch to Idris. [4]: http://hackage.haskell.org/package/singletons - The error messages are verbose. Haskell really needs a more interactive mode for dependently typed programming! I'll answer one oft-answered question here: A "skolem" is just a *bound* type variable. This is different from other type variables GHC uses during type inference, which are free, *unification* type variables. The point of a unification variable is as a placeholder until GHC figures out the right type. A skolem variable is one that is actually bound in the source code. Here's a tiny example:
data Exists where MkE :: forall a. a -> Exists
foo :: Exists -> Bool foo (MkE x) = not x
This reports an error about a skolem(*) variable a not being Bool. Of course, we can't know, in the body of `foo`, that the type of `x` is `Bool` -- `x` has type `a`, for some unknown `a`. In this case `a` is a skolem.
(*): Sadly, the error message (in 7.8.3) just says "rigid" without "skolem". But the choice of words in the error message is a little capricious, and other similar cases can say "skolem".
- Liquid Haskell goes a very different way than do all of the approaches above. It uses a tool, called an SMT solver, outside of GHC to verify annotations in code. Currently, Liquid Haskell can verify conditions that other approaches can only dream of, including in Agda and Idris. (In particular, LH is very good around integers, something that is otherwise a challenge to reason about in types.) But, LH has two key limitations:
1) It reasons only about refinement types, which are restrictions on existing types. Examples are "odd integers", "lists of Bools with either 2 or 3 elements", "integers equal to 42", and "an integer `y` that is bigger than that other integer `x`". Often, this is exactly what you want. But sometimes it's not.
2) Its power is limited by the solving power of the attached SMT solver. SMT solvers can reason about a lot, but -- of course -- they can't do everything. If the SMT solver can't figure out your code, it's very hard to know what to do to make things work again. However, this rarely comes up in practice.
I, personally, look forward to a future (years away) where we can have full dependent types that also use an SMT solver. This will make the easy things easy, but the hard things possible.
Richard
On Mar 30, 2015, at 3:01 PM, Marcin Mrotek
Hello,
I appreciate this suggestion! Agda has been on the todo list for a while. Which do you suggest to start from, choosing between agda and idris? Are there any major differences?
I admit I only have experience in using Agda, I've name-checked Idris after only reading the docs (the last time I tried to install the compiler, cabal didn't let mi do it because of package conflits; now I tried again and it's doing just fine). Idris is newer and, apparently, more oriented towards practical programming than theorem proving, that's all I know.
I’ve read a paper about liquid haskell some time ago and I find it amazing. What I’d like to ask now is: is liquid haskell strictly more expressive than the haskell type system or is “only” a more convenient way of writing the constraints in a way that can be efficiently solved?
It depends on what you mean by "Haskell's type system". It's more capable than Haskell's type system without extensions, and less than full-on dependent typing (delibrately so, because dependent type equivalence is undecidable). I don't know where exactly does Haskell-with-GHC-extensions type system lie. I think it's the same as, let's say, Agda's, only requiring more kludges when moving back and forth between type- and value-level, but I'm not sure. I'm just a "self"-taught programmer, not a computer scientist ;)
Best regards, Marcin Mrotek _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Il giorno 31/mar/2015, alle ore 15:41, Richard Eisenberg
ha scritto: I thought I'd weigh in here, as my dissertation is all about how to encoding invariants into Haskell's type system better. Here are some reactions to this thread:
Hi! is you dissertation available somewhere (or is still work in progress)?
- In my opinion, Haskell is very capable of expressing invariants in its type system today -- you don't need to go to Idris or Agda to get there. In certain cases, this encoding can be quite natural and unobtrusive, and sometimes easier to work with than in full-on dependently typed languages. The problem is that it's very hard for a beginner in this space to tell which invariants are easy to encode and which are hard. The red-black tree example is a great one for an "easy-to-encode" invariant. The "Hasochism" paper [1] shows another. On the other hand, my two (successful) attempts at "proving" a sorting algorithm correct are very painful (insertion sort [2] and merge sort [3]). I'm sure one component in the difference in level of elegance is that Conor and Sam are better dependently-typed programmers than I am!
[1]: https://personal.cis.strath.ac.uk/conor.mcbride/pub/hasochism.pdf [2]: https://github.com/goldfirere/singletons/blob/master/tests/compile-and-dump/... [3]: https://github.com/goldfirere/nyc-hug-oct2014/blob/master/OrdList.hs
Very useful material, thanks!
- The `singletons` package [4], can be helpful in doing dependently typed programming in Haskell, and is cited in references previously mentioned in this thread. But I wouldn't rely on it too much -- once you need lots of singletons (more than, say, for numbers and Bools), your code will get very bogged down. It's either time to come up with a different way of encoding for your invariant, give up on compile-time verification, or switch to Idris.
I have still to grasp where singletons are required and when they aren’t. The paper you linked above helps somehow.
- The error messages are verbose. Haskell really needs a more interactive mode for dependently typed programming! I'll answer one oft-answered question here: A "skolem" is just a *bound* type variable. This is different from other type variables GHC uses during type inference, which are free, *unification* type variables. The point of a unification variable is as a placeholder until GHC figures out the right type. A skolem variable is one that is actually bound in the source code. Here's a tiny example:
data Exists where MkE :: forall a. a -> Exists
foo :: Exists -> Bool foo (MkE x) = not x
This reports an error about a skolem(*) variable a not being Bool. Of course, we can't know, in the body of `foo`, that the type of `x` is `Bool` -- `x` has type `a`, for some unknown `a`. In this case `a` is a skolem.
(*): Sadly, the error message (in 7.8.3) just says "rigid" without "skolem". But the choice of words in the error message is a little capricious, and other similar cases can say "skolem”.
Thank you for the explaination!
- Liquid Haskell goes a very different way than do all of the approaches above. It uses a tool, called an SMT solver, outside of GHC to verify annotations in code. Currently, Liquid Haskell can verify conditions that other approaches can only dream of, including in Agda and Idris. (In particular, LH is very good around integers, something that is otherwise a challenge to reason about in types.) But, LH has two key limitations: 1) It reasons only about refinement types, which are restrictions on existing types. Examples are "odd integers", "lists of Bools with either 2 or 3 elements", "integers equal to 42", and "an integer `y` that is bigger than that other integer `x`". Often, this is exactly what you want. But sometimes it's not. 2) Its power is limited by the solving power of the attached SMT solver. SMT solvers can reason about a lot, but -- of course -- they can't do everything. If the SMT solver can't figure out your code, it's very hard to know what to do to make things work again. However, this rarely comes up in practice.
I, personally, look forward to a future (years away) where we can have full dependent types that also use an SMT solver. This will make the easy things easy, but the hard things possible.
That would be an interesting future without doubts!
Richard
Thank you very much, Nicola

On Mar 31, 2015, at 12:46 PM, Nicola Gigante
Hi! is you dissertation available somewhere (or is still work in progress)?
It's work not-yet-in-progress. Though (I think) I have a clear idea of where it's going, I have yet to write a proposal. The proposal is due in about a month and a half, and I expect the work to be done by summer 2016. If things go well, expect more announcements in this space when there are things to announce. :) Richard

You're in the right ballpark that's for sure - and your experience with
hard-to-decode error messages is not a new one either! Haskell is somewhat
"experiemental" in this domain, as its only now starting to learn the
features necessary to do what you want to do.
I think some of the best work to learn the latest techniques is described
here: https://github.com/jstolarek/dep-typed-wbl-heaps-hs
-- ocharles
On Mon, Mar 30, 2015 at 6:00 PM, Nicola Gigante
Hi all.
I’m very interested in what can be done with Haskell’s type system regarding the enforcing of algorithms and data structures invariants.
For example, I remember I saw a paper about an implementation of a red-black tree in Haskell where the types guaranteed the invariant about the alternation of rad and black nodes.
I would like to learn those techniques, in particular the use of language features that enable this kind of things and then how to do them in practice. In the past tried hacking something with GADTs and Data Kinds but I've always got stuck with GHC errors that I could not understand, so I think I’m missing something.
Where could I start from? Are there simple walk-through examples and/or fundational material? For example is it possible to write a sorting function that guarantees in the types that the output is sorted? I don’t need this specific example but you got the point.
Thank you very much, Nicola
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Il giorno 30/mar/2015, alle ore 19:17, Oliver Charles
ha scritto: You're in the right ballpark that's for sure - and your experience with hard-to-decode error messages is not a new one either! Haskell is somewhat "experiemental" in this domain, as its only now starting to learn the features necessary to do what you want to do.
Having some compilers background I understand very well the difficulties in making sensible error messages, especially when dealing with such advanced features. The point was more that I don’t know the basics on how those features work (and what is involved in the job of the type checker) so I can’t even parse what GHC try to tell me (skolem what?).
I think some of the best work to learn the latest techniques is described here: https://github.com/jstolarek/dep-typed-wbl-heaps-hs https://github.com/jstolarek/dep-typed-wbl-heaps-hs
Thank you that looks very useful!
-- ocharles
Greetings, Nicola

Nicola Gigante
Il giorno 30/mar/2015, alle ore 19:17, Oliver Charles
ha scritto: You're in the right ballpark that's for sure - and your experience with hard-to-decode error messages is not a new one either! Haskell is somewhat "experiemental" in this domain, as its only now starting to learn the features necessary to do what you want to do.
Having some compilers background I understand very well the difficulties in making sensible error messages, especially when dealing with such advanced features. The point was more that I don’t know the basics on how those features work (and what is involved in the job of the type checker) so I can’t even parse what GHC try to tell me (skolem what?).
I think some of the best work to learn the latest techniques is described here: https://github.com/jstolarek/dep-typed-wbl-heaps-hs https://github.com/jstolarek/dep-typed-wbl-heaps-hs
Thank you that looks very useful!
-- ocharles
Greetings, Nicola _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
Stephanie Stephanie Weirich gave an invited talk about maintaining color and height invariants for Red-Black trees at ICFP this year [1]. Connor McBride had a talk and paper about also maintinging order invariants [2]. You might want to have a look at them. As for getting used to the error messages. I think the only way to learn is to get your hands dirty yourself. I can highly reccommend also just doing the red-black trees yourself. Getting insertions to work is not too difficult. Deletions are a bit harder. I also did so some time ago [3] (before seeing the ICFP talk even; I found it very cool that I had pretty much the same solution :)). Regards, [1] https://www.youtube.com/watch?v=rhWMhTjQzsU&index=22&list=PL4UWOFngo5DVBqifWX6ZlXJ7idxWaOP69 [2] https://www.youtube.com/watch?v=pNBPCnZEdSs [3] http://fstaals.net/RedBlackTree.html -- - Frank
participants (5)
-
Frank Staals
-
Marcin Mrotek
-
Nicola Gigante
-
Oliver Charles
-
Richard Eisenberg