I'm sorry you've been having such a torrid time, Sam.

We (the GHC developers) are acutely aware of the difficulties surrounding the GHC API.  It was discussed quite a bit at the Haskell Implementers Meeting at last year's ICFP.

The difficulty is that there simply is no defined GHC API.  GHC defines many thousands of functions spread across over 700 modules.  Of course, we change them all the time!  And GHC API users simply reach out and use the most convenient function for them -- but of course we don't know which functions they are using, and so we may change them without warning.  And they may be entirely inappropriate "seeing too much of the internals" functions anyway.

What we need is
Even if we had all that, I'd expect the API to shift quite rapidly.  For example, the new error-message infrastructure was developed specifically in response to client needs (esp HLS) but those changes necessarily affect all clients, yourself included.  At least, with a defined API and process, you might get more warning and migration advice.

But to achieve that we need a group of people with the expertise and commitment to create and curate that API.  It is possible that the Haskell Foundation will help with this, but meanwhile I'm just sharing the need again in case anyone says "oh I'd love to help with that".  This is how open source works: we work together towards common goals.  But it does need volunteers, and everyone is busy.

Everyone: if you'd like to contribute to a GHC API design and curation process, please do say.   Even if you don't feel able to lead it, saying "I'd be happy to help" is a big thing.

I know this isn't helping Sam much in the short term -- apologies for that.

Simon

On Fri, 19 May 2023 at 03:15, Vaibhav Sagar <vaibhavsagar@gmail.com> wrote:
Hi Sam,

I have some experience with GHC API version migrations from maintaining IHaskell (https://github.com/IHaskell/IHaskell/). During my maintainership I've upgraded to every release of GHC from 8.2 to 9.6 (9 unless I'm counting incorrectly). In my experience, doing an incremental version upgrade takes me a relaxed weekend or less each time (for a project of IHaskell's size), but I can see it being incredibly frustrating if you put your code down for enough time that you have to jump multiple versions. I have a workflow that I'm pretty happy with, which I outline in https://vaibhavsagar.com/blog/2021/05/02/updating-ihaskell-newer-ghc/ (mostly so I don't forget how I did things 6 months later).

I'm personally happy with the pace of GHC development and refactors and I see the breaking changes as the necessary cost of improving the codebase and fixing (sometimes long-standing) issues.

I think you are asking a lot of the compiler developers here. I don't think that merely consuming an API brings with it an expectation that the API developers should proactively fix your code when they make breaking changes (unless that is clearly outlined at the beginning, which I don't think it ever has been for GHC), and I think this would introduce a lot of friction into the development process which I personally would not want. I think a "community build" would be a great idea, but only for a small handful of projects that have outsized importance to the Haskell ecosystem (e.g. Pandoc, ShellCheck, HLS etc.) which excludes my project (and IMHO yours).

I'm sure there are people who are qualified to help you keep your project up-to-date without impacting GHC development or including your project in the codebase.

Thanks,
Vaibhav

On Fri, May 19, 2023 at 7:04 AM Sam Halliday <sam.halliday@gmail.com> wrote:
Hi all,

A few years ago I wrote a tool (under a pseudonym) that uses the ghc
api. I have not been working in Haskell and I've found that it's
bitrotted since the 8.10.x series that I was most recently using.

The impact of refactors to the ghc codebase has hit me really badly,
and with the help of Sylvain, I've been able to get it to the point
where it compiles with 9.0.2 and 9.2.7. I'm currently working my way
through 9.4.5 support but this is becoming extremely time consuming and
draining, so I'm asking for help from anybody who has been involved in
these refactors if they could help further. There's only so much git
pickaxe can help with when function names, types and parameters are
moved around in a way that is hard to keep track of, especially with the
changes impacting the way packages, units, errors, and compiler
invocation all work.

The code is at https://gitlab.com/tseenshe/hsinspect (and also on
hackage) and I'm doing all my work-in-progress under the ghc9 branch.


As a follow up question, I have a few ideas for how this level of
disruption to my codebase could be avoided and I was hoping to get some
thoughts on it:

1. some programming language communities have a "community build" that
   is periodically built with snapshots of the compiler. This allows
   unexpected regressions to be caught early in the dev cycle and would
   allow the author of refactor changes to send a courtesy patch to keep
   the broken code running if the change is intended to be kept in the
   compiler. I'd like to propose hsinspect for such a community build.

2. propose that my code is merged into the ghc api, so that my code
   becomes trivial to maintain from that moment on because I've handed
   the responsibility on, very explicitly, to whoever wishes to make
   breaking changes in the compiler. I'd like to propose adding all my
   modules (minus the compiler plugin and machine readable format stuff)
   to the compiler code tree.

Are either of these options realistic?

I don't think I would be interested in using Haskell anymore if my
tooling stopped working: I've invested so much time and energy into it
at this point that to start again or have to set my tools aside would be
too much of a disincentive. I really love the Haskell language, so I
hope that this is not the case.


For those interested further in my tool and/or helping...

As a quick architectural overview (motivations and goals are described
well enough in the README to not need repeating), my tool is very
simple:

- the plugin is a ghc compiler plugin; all it does is dump out the flags
  that the batch compiler was invoked with. I understand that HLS has
  its own solution these days that involves extracting this information
  from the build tool, but I still prefer getting it from the compiler
  because it's simple, build tool agnostic, and guaranteed to be
  correct. I spent many years going the build tool route for Scala
  tooling and I think that was a mistake.

- the runtime binary, which is user-invoked (rarely) during their dev
  cycle, and has several features. This binary is able to parse the user's
  Haskell file's pragmas, on top of the flags that the compiler was
  invoked with, to extract the exact dflags to use for any interactive
  compiler usage. The specific features are then:

  - find all the packages that are depended upon by the home unit. Go
    and find all the symbols contained therein, including their type
    signatures. Dump it all out to a machine readable format.

  - parse just the imports section of the current file to extract the
    full list of imports in use. Then go and look up all the symbol names
    that it implies and their package name, and dump it all out to a
    machine readable format.

  - parse the file, find all the data types that the user has defined,
    and output them into a simpler AST that is designed for code
    navigation and boilerplate generation tools; comments are not
    preserved. An example tool that can use this is at
    https://gitlab.com/tseenshe/boilerplate (and on hackage).
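
To give a rough idea of the second feature above (listing the imports of a
file), here is a minimal sketch in plain Haskell using only base. It is not
how hsinspect does it: the real tool goes through GHC's own parser, whereas
this is a naive line-based scan. The names importedModules and sample are
hypothetical, and the scan assumes each import sits on one line; it does not
handle block comments or {-# SOURCE #-} imports.

```haskell
import Data.List (isPrefixOf)

-- | Naive, line-based scan for imported module names.
-- Assumption (for illustration only): every import sits on a single line
-- whose first word is @import@. A real tool would use GHC's parser.
importedModules :: String -> [String]
importedModules src =
  [ takeWhile (/= '(') name            -- cope with @import Data.List(sort)@
  | line <- lines src
  , "import" : rest <- [words line]    -- keep only lines starting with import
  , name : _ <- [dropWhile isModifier rest]
  ]
  where
    -- Skip @qualified@, @safe@, and package-qualified imports ("pkg").
    isModifier tok = tok `elem` ["qualified", "safe"] || "\"" `isPrefixOf` tok

main :: IO ()
main = mapM_ putStrLn (importedModules sample)
  where
    -- Hypothetical input file contents.
    sample = unlines
      [ "module Demo where"
      , "import qualified Data.Map as M"
      , "import Data.List (sort)"
      ]
```

Resolving each module name to the symbols it brings into scope, and to the
package that provides it, is the part that genuinely needs the compiler.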

On top of these machine readable files I have Emacs tooling that can
provide me with all the semantic code editor support that I need, and
leaves the door open for some things I haven't implemented yet (such as
Hoogle-like search within the project and dependencies). Code for the
Emacs stuff is at https://gitlab.com/tseenshe/haskell-tng.el under the
haskell-tng-hsinspect.el file.

The approach has some limitations, for example I cannot support
RecordDotSyntax because completing on a dot would mean knowing the type
under the dot.

I also wrote an LSP so that my VSCode colleagues could use it too.
Everybody that used it with me was free to use HIE (so called at the
time) if they wished. Those of us who used hsinspect preferred it
because it's basically zero overhead and is therefore really kind on
CPU, RAM and battery life.

--
Best regards,
Sam
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs