Type checker for haskell-src-exts (was: Typechecking Using GHC API)

On Thu, Dec 15, 2011 at 11:07, Niklas Broberg wrote:
Envisioned: The function you ask for can definitely be written for haskell-src-exts, which I know you are currently using. I just need to complete my type checker for haskell-src-exts first. Which is not a small task, but one that has been started.
That's good to know! I presume it's something like Haskell98 to start with? I'd be even more impressed (and possibly also concerned for your health) if you were going to tackle all of the extensions!
I've been trying to find a student to write a haskell-src-exts type checker for me. It should use a particular kind of mechanism though, using constraints similar to [1]. Then, I want to adapt that to do transformations. What approach are you using? Maybe I can somehow steal your work... ;)
Regards, Sean
[1] http://www.staff.science.uu.nl/~heere112/phdthesis/

What exactly are the hopes for such a type checker? I can understand
it being interesting as a research project, but as a realistic tool
there are two huge issues:
1. It's going to take a LOT of time to reach feature parity with
GHC's type checker.
2. Assuming that can be done, how is it going to be maintained and
kept up to date with GHC?
If it is going to be used as a development tool, both of these are
major requirements. I haven't looked into the issues, but I'd expect
it would be more realistic (although definitely not trivial) to
translate from GHC's internal AST into an annotated haskell-src-exts
AST.
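To make the "annotated AST" idea concrete, here is a minimal sketch of a syntax tree parameterised over its annotation, where type information produced elsewhere is attached by source span. Everything in it (`Exp`, `Ty`, `Span`, `TypeTable`, `annotate`) is a hypothetical toy, not the actual data types of GHC or haskell-src-exts, and a real translation would have to deal with far more syntax than this.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.Maybe (fromMaybe)

-- Toy expression tree; every node carries an annotation of type 'ann'.
-- (Hypothetical: far smaller than any real Haskell AST.)
data Exp ann
  = Var ann String
  | Lit ann Int
  | App ann (Exp ann) (Exp ann)
  deriving (Show, Functor)

-- Toy types; TUnknown marks nodes for which no type was supplied.
data Ty = TInt | TFun Ty Ty | TUnknown
  deriving (Show, Eq)

-- A source span, here just a (start column, end column) pair.
newtype Span = Span (Int, Int)
  deriving (Show, Eq)

-- Assumption: some other type checker has produced types keyed by span.
type TypeTable = [(Span, Ty)]

-- Attach a type to every node by looking up its span in the table.
annotate :: TypeTable -> Exp Span -> Exp (Span, Ty)
annotate table = fmap (\sp -> (sp, fromMaybe TUnknown (lookup sp table)))

-- Read the annotation stored at the root of a tree.
annotation :: Exp ann -> ann
annotation (Var a _)   = a
annotation (Lit a _)   = a
annotation (App a _ _) = a
```

Because the tree is a `Functor` in its annotation, the re-annotation is a single `fmap`; the hard part in practice would be producing the span-keyed table and matching the two syntax trees up in the first place.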
On 15 December 2011 16:33, Sean Leather wrote:
On Thu, Dec 15, 2011 at 11:07, Niklas Broberg wrote:
Envisioned: The function you ask for can definitely be written for haskell-src-exts, which I know you are currently using. I just need to complete my type checker for haskell-src-exts first. Which is not a small task, but one that has been started.
That's good to know! I presume it's something like Haskell98 to start with? I'd be even more impressed (and possibly also concerned for your health) if you were going to tackle all of the extensions!
I've been trying to find a student to write a haskell-src-exts type checker for me. It should use a particular kind of mechanism though, using constraints similar to [1]. Then, I want to adapt that to do transformations. What approach are you using? Maybe I can somehow steal your work... ;)
Regards, Sean
[1] http://www.staff.science.uu.nl/~heere112/phdthesis/
-- Push the envelope. Watch it bend.

On Thu, Dec 15, 2011 at 6:24 PM, Thomas Schilling wrote:
What exactly are the hopes for such a type checker? I can understand it being interesting as a research project, but as a realistic tool there are two huge issues:
1. It's going to take a LOT of time to reach feature parity with GHC's type checker.
2. Assuming that can be done, how is it going to be maintained and kept up to date with GHC?
If it is going to be used as a development tool, both of these are major requirements. I haven't looked into the issues, but I'd expect it would be more realistic (although definitely not trivial) to translate from GHC's internal AST into an annotated haskell-src-exts AST.
With all due respect, the sentiments you give voice to here are a large part of what drives me to do this project in the first place. Haskell is not GHC, and I think that the very dominant position of GHC many times leads to ill thought-through extensions. Since GHC has no competitor, often extensions (meaning behavior as denoted and delimited by some -X flag) are based on what is most convenient from the implementation point of view, rather than what would give the most consistent, logical and modular user experience (not to mention third-party-tool-implementor-trying-to-support-GHC-extensions experience).
As such, I'm not primarily doing this project to get a development tool out, even if that certainly is a very neat thing. I'm just as much doing it to provide a Haskell (front-end) implementation that can serve as a better reference than GHC, one that very explicitly does *not* choose the convenient route to what constitutes an extension, and instead attempts to apply rigid consistency and modularity between extensions. Also, just like for haskell-src-exts I hope to build the type checker from the roots with the user interface as a core design goal, not as a tacked-on afterthought.
Mind you, in no way do I intend any major criticism towards GHC or its implementors. GHC is a truly amazing piece of software, indeed it's probably my personal favorite piece of software of all time. That does not mean it comes free of quirks and shady corners though, and it is my hope that by doing what I do I can help shine a light on them.
Answering your specific issues:
1) Yes, it's a lot of work. Probably not half as much as you'd think though, see my previous mail about walking in the footsteps of giants. But beyond that, I think it's work that truly needs to be done, for the greater good of Haskell.
2) Well, I think I've done a reasonably good job keeping haskell-src-exts up to date so far, even if the last year has been pretty bad in that regard (writing a PhD thesis will do that to you). I'll keep doing it for the type checker as well. But I am but one man, so if anyone else feels like joining the project then they are more than welcome to.
Sort-of-3) Yes, both implementation and maintenance are likely going to be far more costly than the alternative to do a translation via the GHC API. I'm not interested in that alternative though.
Cheers, /Niklas

On 16 December 2011 17:44, Niklas Broberg wrote:
With all due respect, the sentiments you give voice to here are a large part of what drives me to do this project in the first place. Haskell is not GHC, and I think that the very dominant position of GHC many times leads to ill thought-through extensions. Since GHC has no competitor, often extensions (meaning behavior as denoted and delimited by some -X flag) are based on what is most convenient from the implementation point of view, rather than what would give the most consistent, logical and modular user experience (not to mention third-party-tool-implementor-trying-to-support-GHC-extensions experience).
I agree. Various record proposals have been rejected because of the "not easily implementable in GHC" constraint. Of course, ease of implementation (and maintenance) is a valid argument, but it has an unusual weight if GHC is (in practice) the only implementation. Other extensions seem to just get added on (what feels like) a whim (e.g., RecordPuns).
As such, I'm not primarily doing this project to get a development tool out, even if that certainly is a very neat thing. I'm just as much doing it to provide a Haskell (front-end) implementation that can serve as a better reference than GHC, one that very explicitly does *not* choose the convenient route to what constitutes an extension, and instead attempts to apply rigid consistency and modularity between extensions. Also, just like for haskell-src-exts I hope to build the type checker from the roots with the user interface as a core design goal, not as a tacked-on afterthought.
Mind you, in no way do I intend any major criticism towards GHC or its implementors. GHC is a truly amazing piece of software, indeed it's probably my personal favorite piece of software of all time. That does not mean it comes free of quirks and shady corners though, and it is my hope that by doing what I do I can help shine a light on them.
Weeeell... I've gotten a little bit of a different perspective on this since working at a company with very high code quality standards (at least for new code). There is practically no observable code review happening. I'm sure Dimitrios and Simon PJ review most of each other's code every now and then, but overall there is very little code review happening (and no formal, recorded code review whatsoever). Cleaning up the GHC code base is a huge task -- it uses lots of dirty tricks (global variables, hash tables, unique generation is non-deterministic, ...) which often complicate efforts tremendously (I tried). The lack of unit tests doesn't help (just rewriting code so that it can be tested would help quite a bit). Don't get me wrong, I certainly appreciate the work the GHC team is doing, but GHC strikes a fine balance between industrial needs and research needs. I'm just wondering whether the balance is always right.
Answering your specific issues:
1) Yes, it's a lot of work. Probably not half as much as you'd think though, see my previous mail about walking in the footsteps of giants. But beyond that, I think it's work that truly needs to be done, for the greater good of Haskell.
Right, OutsideIn(X) (the journal paper description, not the ICFP'09 version) seems like the right way to go. I wasn't aware of the other paper (the original work on bidirectional type inference seemed very unpredictable in terms of when type annotations are needed, so I'm looking forward to how this new paper handles things).
2) Well, I think I've done a reasonably good job keeping haskell-src-exts up to date so far, even if the last year has been pretty bad in that regard (writing a PhD thesis will do that to you). I'll keep doing it for the type checker as well. But I am but one man, so if anyone else feels like joining the project then they are more than welcome to.
Sort-of-3) Yes, both implementation and maintenance are likely going to be far more costly than the alternative to do a translation via the GHC API. I'm not interested in that alternative though.
Fair enough. As I am interested in building reliable (and maintainable) development tools my priorities are obviously different. For that purpose, using two different implementations can lead to very confusing issues for the user (that's why I was asking about bug compatibility). Apart from the bus factor, there is also the bitrotting issue due to GHC's high velocity. For example, even though HaRe does build again it doesn't support many commonly used GHC extensions and it is difficult to add them into the existing code base (which isn't pretty).
Anyway, good luck with your endeavours.
/ Thomas
-- Push the envelope. Watch it bend.

On Dec 17, 2011, at 9:58 AM, Thomas Schilling wrote:
Weeeell... I've gotten a little bit of a different perspective on this since working at a company with very high code quality standards (at least for new code). There is practically no observable code review happening. I'm sure Dimitrios and Simon PJ review most of each other's code every now and then, but overall there is very little code review happening (and no formal, recorded code review whatsoever). Cleaning up the GHC code base is a huge task -- it uses lots of dirty tricks (global variables, hash tables, unique generation is non-deterministic, ...) which often complicate efforts tremendously (I tried). The lack of unit tests doesn't help (just rewriting code so that it can be tested would help quite a bit).
So in other words, would it be helpful if we recruited GHC janitors? That is, similar to how we have the Trac which gives people bug reports to pick out and work on, would it make sense to have a Trac or some other process which gives people chunks of code to clean up and/or make easier to test?
(I am of course inspired in suggesting this by the Linux kernel janitors, though it doesn't look like the project has survived, and maybe that portends ill for trying to do the same for GHC...)
Cheers, Greg

On 17 December 2011 05:39, Gregory Crosswhite wrote:
On Dec 17, 2011, at 9:58 AM, Thomas Schilling wrote:
Weeeell... I've gotten a little bit of a different perspective on this since working at a company with very high code quality standards (at least for new code). There is practically no observable code review happening. I'm sure Dimitrios and Simon PJ review most of each other's code every now and then, but overall there is very little code review happening (and no formal, recorded code review whatsoever). Cleaning up the GHC code base is a huge task -- it uses lots of dirty tricks (global variables, hash tables, unique generation is non-deterministic, ...) which often complicate efforts tremendously (I tried). The lack of unit tests doesn't help (just rewriting code so that it can be tested would help quite a bit).
So in other words, would it be helpful if we recruited GHC janitors? That is, similar to how we have the Trac which gives people bug reports to pick out and work on, would it make sense to have a Trac or some other process which gives people chunks of code to clean up and/or make easier to test?
(I am of course inspired in suggesting this by the Linux kernel janitors, though it doesn't look like the project has survived, and maybe that portends ill for trying to do the same for GHC...)
I'm not sure that would work too well. GHC is a bit daunting to start with (it gets better after a while) and just cleaning up after other people is little fun. I would be more interested in setting up a process that enables a clean code base (and gradually cleans up existing shortcomings). Of course, I'd prefer to do rather than talk, so I'm not pushing this at this time.
At the moment, I think we should:
1. Find a plan to get rid of the existing bigger design issues, namely: the use of global variables for static flags (may require extensive refactorings throughout the codebase), and the use of nondeterministic uniques for symbols (may cost performance).
2. Build up a unit test suite (may include QuickCheck properties). The idea is that if our code needs to be tested from within Haskell (and not just the command line) then that encourages a design that can be used better as a library. It may also catch some bugs earlier and make it easier to change some things. (Note: the high-level design of GHC is indeed very modular, but the implementation isn't so much.)
3. Set up a code review system. Every commit should have to go through code review -- even by core contributors. Even experienced developers don't produce perfect code all the time. Currently, we have some post-commit review. A possible code review system for Git is Gerrit.
Of course, the GHC developers would need to get on board with this. As I said, I currently don't have the time to pursue this any further, but I'm planning to apply this to my projects as much as possible.
/ Thomas
-- Push the envelope. Watch it bend.
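As a rough illustration of why global flags get in the way of testing from within Haskell, the sketch below contrasts a process-wide flag store with an explicit argument. It is not GHC code: `Flags`, `optLevel`, `warnAll`, the threshold values and both functions are hypothetical names invented for this example, and only the `base` library is used.

```haskell
import Data.IORef (IORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical compiler settings, invented for this sketch.
data Flags = Flags
  { optLevel :: Int
  , warnAll  :: Bool
  }

-- Style 1: a global variable. Every reader silently depends on whatever
-- the process happened to store here, so results cannot be reproduced
-- from the arguments alone.
{-# NOINLINE globalFlags #-}
globalFlags :: IORef Flags
globalFlags = unsafePerformIO (newIORef (Flags 0 False))

inlineThresholdGlobal :: IO Int
inlineThresholdGlobal = do
  flags <- readIORef globalFlags
  pure (if optLevel flags >= 2 then 60 else 20)

-- Style 2: the flags are an ordinary argument. The function is pure and
-- can be called from a test with any settings.
inlineThreshold :: Flags -> Int
inlineThreshold flags = if optLevel flags >= 2 then 60 else 20

-- A property in the shape QuickCheck expects: raising the optimisation
-- level never lowers the threshold.
prop_thresholdMonotone :: Int -> Int -> Bool
prop_thresholdMonotone a b =
  inlineThreshold (mk (min a b)) <= inlineThreshold (mk (max a b))
  where
    mk n = Flags n False
```

Assuming the QuickCheck package is installed, `quickCheck prop_thresholdMonotone` would exercise the property with random inputs; the global variant offers no comparable entry point without first mutating shared state.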

On Thu, Dec 15, 2011 at 5:33 PM, Sean Leather wrote:
On Thu, Dec 15, 2011 at 11:07, Niklas Broberg wrote:
Envisioned: The function you ask for can definitely be written for haskell-src-exts, which I know you are currently using. I just need to complete my type checker for haskell-src-exts first. Which is not a small task, but one that has been started.
That's good to know! I presume it's something like Haskell98 to start with? I'd be even more impressed (and possibly also concerned for your health) if you were going to tackle all of the extensions!
Actually, no. Starting with Haskell98 would just lead to great pains later on, since many of the extensions require very invasive changes compared to a H98 checker. My starting point has been to identify the core structural and algorithmic requirements for a type checker that would indeed tackle "all" of the extensions. My current checker (which I wouldn't even call half-complete) is based on a merge of the algorithms discussed in [1] and [2]. Between the two, they give, on the one hand, the bidirectional inference needed to handle arbitrary-rank types, and, on the other hand, the power of local assumptions needed to handle GADTs and type families.
I thank you for your concerns for my health. However, I assure you they are quite misplaced. Typing Haskell-with-extensions in Haskell is not only a far smaller beast than what it's made out to be, when walking in the well-documented footsteps of giants. It is also lots of fun. :-)
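For readers unfamiliar with the two features mentioned above, here is a small sketch of the kind of program each one has to accept. It says nothing about how the checker under discussion is implemented; the names `applyToBoth`, `Expr` and `eval` are made up for illustration, and the extensions used are the standard GHC `RankNTypes` and `GADTs`.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

-- Arbitrary-rank type: the argument 'f' must itself be polymorphic,
-- because it is used at two different element types. Such a type is not
-- inferred; the checker relies on the signature and pushes it inward,
-- which is where bidirectional inference comes in.
applyToBoth :: (forall a. [a] -> Int) -> ([Bool], String) -> (Int, Int)
applyToBoth f (xs, ys) = (f xs, f ys)

-- A GADT: each constructor fixes the type index of the expression.
data Expr t where
  IntLit  :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  If      :: Expr Bool -> Expr t -> Expr t -> Expr t

-- Matching on a constructor brings a local assumption into scope
-- (for example t ~ Int in the IntLit branch), which is what allows
-- each branch to return a value of a different concrete type.
eval :: Expr t -> t
eval (IntLit n)  = n
eval (BoolLit b) = b
eval (Add x y)   = eval x + eval y
eval (If c t e)  = if eval c then eval t else eval e
```

In GHCi, `applyToBoth length ([True, False], "abc")` evaluates to `(2,3)`, and `eval (If (BoolLit True) (Add (IntLit 1) (IntLit 2)) (IntLit 0))` evaluates to `3`.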
I've been trying to find a student to write a haskell-src-exts type checker for me. It should use a particular kind of mechanism though, using constraints similar to [1]. Then, I want to adapt that to do transformations. What approach are you using? Maybe I can somehow steal your work... ;)
As you note from my starting references, I'm using the same approach and base algorithms as GHC does. I looked at some alternatives briefly, including the work you reference, but I discarded them all since none of them had support for everything I wanted to cover. In particular, it is not at all clear how these systems would merge with bi-directional inference or local assumptions. Tackling *that* problem would be a large and very interesting research topic I'm sure, but not one I have time to dig into at the current time. Indeed, the feasibility of my project (and thus the sanctity of my health) relies very heavily on the "footsteps-of-giants" factor...
So as not to give anyone false hopes, I should point out that I currently have next to no time at all to devote to this, and the project has been dormant since August. I'll return to it for sure, but at the moment cannot tell when.
Cheers, /Niklas
[1] S. Peyton Jones, D. Vytiniotis, S. Weirich, and M. Shields. Practical type inference for arbitrary-rank types.
[2] D. Vytiniotis, S. Peyton Jones, T. Schrijvers, M. Sulzmann. OutsideIn(X) – Modular type inference with local assumptions.
Participants (4): Gregory Crosswhite, Niklas Broberg, Sean Leather, Thomas Schilling