
Hi all,
I was looking at removing the `BlockId` type synonym in favor of Hoopl's `Label` (there was already a TODO and it is a bit confusing). But once I started making the changes, I realized that in a bunch of places this makes the code *less* readable, mostly because of `CLabel` (it sounds similar but is something quite different, and having to rename local variables from `label` to `clabel` is not great).
I started to look at alternatives and noticed that in general the interface between GHC and Hoopl is quite noisy and confusing:
- Hoopl has `Label`, which is GHC's `BlockId` but different from GHC's `CLabel`
- Hoopl has `Unique`, which is different from GHC's `Unique`
- Hoopl has `Unique{Map,Set}`, which are different from GHC's `Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of Hoopl's and add the GHC ones).
- Working in `cmm/` requires constant switching between GHC code and Hoopl (`CmmNode`/`CmmGraph`/`CmmBlock` and the dataflow code are in GHC, the actual implementations of `Block`/`Graph` are defined in Hoopl, etc.)
GHC is actually using only a small subset of Hoopl (e.g., the fixpoint computation is copied/specialized: `cmm/Hoopl/Dataflow`). So I was wondering: maybe it's worth simply dropping the dependency on Hoopl (and copying the code that is actually necessary into GHC)?
I've done an experiment in [1] (to see how much we'd actually need to copy) and I really like the result:
- We can remove one external dependency and git submodule at the cost of only 5 new modules in `cmm/Hoopl` (net gain of only 4 modules: we add 5 new but can remove `cmm/Hoopl`, which is no longer needed)
- We should be able to fix all of the above issues and make the code easier to understand (less code, everything in one repo, fewer concepts).
- It's going to be easier to change things since we don't need to worry about changing the public interface of Hoopl (it's a standalone package on Hackage and other people already depend on the current behavior).
What do you think? Does anyone think we shouldn't do this?
Thanks,
Michal
[1] Branch: https://github.com/michalt/ghc/tree/hoopl/no-hoopl
Diff: https://github.com/ghc/ghc/compare/master...michalt:hoopl/no-hoopl
For now I just copied the code/updated imports and didn't do any cleanups, but I'd be happy to do them in subsequent PRs.

On 2017-05-27 at 19:58:11 +0200, Michal Terepeta wrote: [...]
> I've done an experiment in [1] (to see how much we'd need to actually copy) and I really like the result:
> - We can remove one external dependency and git submodule at the cost of only 5 new modules in `cmm/Hoopl` (net gain of only 4 modules: we add 5 new but can remove `cmm/Hoopl`, which is no longer needed)
> - We should be able to fix all of the above issues and make the code easier to understand (less code, everything in one repo, fewer concepts).
> - It's going to be easier to change things since we don't need to worry about changing the public interface of Hoopl (it's a standalone package on Hackage and other people already depend on the current behavior).
> What do you think? Does anyone think we shouldn't do this?
It appears to me that in this case, the benefits in gained flexibility outweigh the cost of independent development and the potential loss of synergies. So I'm +1 on this.

Michal Terepeta wrote:
> Hi all,
> ...
> What do you think? Does anyone think we shouldn't do this?
I think this seems quite reasonable. Given that hoopl will need changes to be truly useful to GHC, it makes sense to take the parts we need and iterate independently on the rest.
Cheers,
- Ben

Michal Terepeta wrote:
> What do you think? Does anyone think we shouldn't do this?
Makes sense. I'm +1 on this.
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Cool, thanks for the quick replies! I've sent out https://phabricator.haskell.org/D3616
Cheers,
Michal

Is there really a compelling case for forking Hoopl? I was talking to Kavon last week about doing exactly the opposite: using Hoopl more wholeheartedly!
Before going ahead with this, let’s remember the downsides:
· If we fork Hoopl, improvements in one place will not be seen in the other. GHC originally used its own containers library but now uses ‘containers’, most of which is irrelevant to GHC, just to pick up the work that has been done to make ‘containers’ fast. Similarly, GHC has a clone of ‘pretty’, but someone is working (I think) to make GHC use ‘pretty’.
· It’s not clear to me why GHC has a clone of parts of Hoopl. Would it not be better just to make Hoopl faster?
If anything I’d like to use Hoopl more in Cmm optimisation passes in GHC, so we may want to use more of Hoopl’s facilities.
The main reason you suggest for forking is that there are some awkward name clashes. Surely we could resolve these? E.g. we could change CLabel in GHC, or agree with Hoopl maintainers that BlockId would be more helpful than Label.
You mention that Hoopl uses Unique set/map. Why not use ‘containers’ for that? (Like GHC!)
Let’s discuss this a bit more before executing.
I’m also interested to know:
· who is actively working on Hoopl (Michael, Sophie, …)?
· how are you using it (within GHC, or somewhere else)?
It’d be good to review and update https://ghc.haskell.org/trac/ghc/wiki/Hoopl/Cleanup. Are there any other improvements planned?
Simon

On Sun, May 28, 2017 at 11:30 PM Simon Peyton Jones wrote: [...]
Hi Simon,
Thanks for chiming in! Let me try to clarify the current situation and the motivation for my changes.
1) Initial fork of Hoopl
Note that what I’m actually advocating is to *finish* forking Hoopl. The fork really started in ~2012 when the “new Cmm backend” was being finished. IIRC the main reason was the unacceptable performance and it seems that even Simon Marlow had trouble making it run fast enough:
https://plus.google.com/107890464054636586545/posts/dBbewpRfw6R
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/HooplPerformance
The end result is pretty sad: GHC has its own forked/specialized `Hoopl.Dataflow` module and is using Hoopl only for the definitions of `Block`/`Graph` and maps/sets (if you look at my commit, it’s pretty clear what I’m copying). In particular it’s not using *any* of the dataflow analysis or rewriting capabilities of the Hoopl package.
2) Reasons to finish forking
The reasons I listed in my previous email already assumed that we have the forked `Hoopl.Dataflow` module in GHC. But if we want to discuss the reasons for forking in general, then apart from the performance (as noted above), there’s the issue of Hoopl’s interface. IMHO the node-oriented approach taken by Hoopl is not flexible enough and also makes it harder to optimize. That’s why I’ve already changed GHC’s `Hoopl.Dataflow` module to operate “block-at-a-time” (https://github.com/ghc/ghc/commit/679ccd1c8860f1ef4b589c9593b74d04c97ae836).
Some concrete examples:
- For proc-point analysis it was necessary to introduce a hack to GHC’s `Dataflow` module to expose a separate analysis function that *ignores* the middle nodes (since for proc-points they’re irrelevant). My change to go “block-at-a-time” allowed us to remove that hack.
- I’m trying to fix the non-linearity of `CmmLayoutStack` (https://phabricator.haskell.org/D3586) and again the block-oriented interface is useful: I want to do different rewrites based on which block is being considered (whether it’s a proc-point or not). This is not easily possible if I don’t know which block I’m in (which is the case for the node-oriented interface).
I also don’t think that name clashes and the tension between Hoopl’s interface and GHC are easy to solve. Hoopl is a public, stand-alone package, so we can’t just change things without considering compatibility. For instance, we can’t use GHC’s `Unique` in Hoopl. But should we switch all of GHC to use Hoopl’s? Also, having closely related concepts spread around GHC and Hoopl is not helping when trying to understand what’s happening. Finally, any changes to both GHC & Hoopl have much higher overhead than just changing GHC.
In general, it really seems to me that Hoopl was simply released too early, with not enough real-world usage and testing. When you say that we should “just fix Hoopl”, it sounds to me that we’d really need to rewrite it from scratch. And it’s much easier to do that if we can just experiment within GHC without worrying about breaking other existing Hoopl users. Only once we’re happy with the result should we consider separating it into a stand-alone package.
3) Difference between pretty/containers and Hoopl
I also think that the situation with pretty/containers is quite different from Hoopl’s. They are much more general-purpose libraries, *far* more widely used and with more contributors. Take containers: the package is still very actively developed and constantly improved, whereas Hoopl hasn’t really seen much activity in the last 5 years. So the benefit-cost ratio is much better: yes, there is some cost in having containers as a dependency, but the benefits from the regular stream of improvements easily outweigh it. I don’t think that’s the case for Hoopl.
Does this help understand my motivation? Let me know if anything is still unclear!
Thanks,
Michal
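For readers who have not seen either interface, the node-oriented versus “block-at-a-time” distinction can be sketched roughly as below. This is only a toy illustration under stated assumptions: `Node`, `Block`, `Fact`, `liftNode` and `procPointAware` are hypothetical names invented here, and none of this is the actual API of Hoopl or of GHC’s `Hoopl.Dataflow`.

```haskell
import Data.List (foldl')

-- Hypothetical toy IR; not GHC's CmmNode or Hoopl's Block.
newtype Label = Label Int deriving (Eq, Show)

data Node = Assign String Int | Call String
  deriving Show

data Block = Block { blockLabel :: Label, blockNodes :: [Node] }

-- The dataflow "fact" in this toy is just the number of calls seen so far.
type Fact = Int

-- Node-oriented: the client writes one step per node and never sees
-- which block the node belongs to.
type NodeTransfer = Node -> Fact -> Fact

-- Block-oriented: the client receives the whole block, label included.
type BlockTransfer = Block -> Fact -> Fact

countCalls :: NodeTransfer
countCalls (Call _)     n = n + 1
countCalls (Assign _ _) n = n

-- Any node-level function lifts to a block-level one by folding
-- forward over the block's nodes.
liftNode :: NodeTransfer -> BlockTransfer
liftNode f blk fact = foldl' (flip f) fact (blockNodes blk)

-- A block-level function can also branch on the block's identity, e.g.
-- skip the body of blocks listed as (hypothetical) proc-points.
procPointAware :: [Label] -> BlockTransfer
procPointAware procPoints blk fact
  | blockLabel blk `elem` procPoints = fact
  | otherwise                        = liftNode countCalls blk fact

main :: IO ()
main = do
  let blk = Block (Label 1) [Assign "x" 1, Call "f", Call "g"]
  print (liftNode countCalls blk 0)       -- prints 2
  print (procPointAware [Label 1] blk 0)  -- prints 0
  print (procPointAware [] blk 0)         -- prints 2
```

The point of the sketch is only that `liftNode` recovers the node-oriented behaviour from a block-oriented signature, while the reverse direction has no way to learn the enclosing block's label.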

Michael
Sorry to be slow.
> Note that what I’m actually advocating is to *finish* forking Hoopl. The fork really started in ~2012 when the “new Cmm backend” was being finished.
Yes, I know. But what I’m suggesting is to revisit the reasons for that fork, and re-join if possible. E.g. if Hoopl is too slow, can’t we make it faster? Why is GHC’s version faster?
> apart from the performance (as noted above), there’s the issue of Hoopl’s interface. IMHO the node-oriented approach taken by Hoopl is both not flexible enough and it makes it harder to optimize it. That’s why I’ve already changed GHC’s `Hoopl.Dataflow` module to operate “block-at-a-time”
Well that sounds like an argument to re-engineer Hoopl’s API, rather than an argument to fork it. If it’s a better API, can’t we make it better for everyone? I don’t yet understand what the “block-oriented” API is, or how it differs, but let’s have the conversation.
> When you say that we should “just fix Hoopl”, it sounds to me that we’d really need to rewrite it from scratch. And it’s much easier to do that if we can just experiment within GHC without worrying about breaking other existing Hoopl users
Fine. But then let’s call it hoopl2, make it a separate package (perhaps with GHC as its only client for now), and declare that it’s intended to supersede hoopl.
But do we even need to do that much? After all, a major version bump on a package is allowed to introduce breaking changes to the API. Anyone who wants the old API can use the old package.
I wonder if you could start a wiki page somewhere (eg on the GHC wiki) listing all the changes you’d like to make in a “rewrite from scratch” story? That would help to “ground” the conversation.
Thanks
Simon

On Wed, Jun 7, 2017 at 7:05 PM Simon Peyton Jones wrote:
> Michael
> Sorry to be slow.
> > Note that what I’m actually advocating is to *finish* forking Hoopl. The fork really started in ~2012 when the “new Cmm backend” was being finished.
> Yes, I know. But what I’m suggesting is to revisit the reasons for that fork, and re-join if possible. Eg if Hoopl is too slow, can’t we make it faster? Why is GHC’s version faster?
> > apart from the performance (as noted above), there’s the issue of Hoopl’s interface. IMHO the node-oriented approach taken by Hoopl is both not flexible enough and it makes it harder to optimize it. That’s why I’ve already changed GHC’s `Hoopl.Dataflow` module to operate “block-at-a-time”
> Well that sounds like an argument to re-engineer Hoopl’s API, rather than an argument to fork it. If it’s a better API, can’t we make it better for everyone? I don’t yet understand what the “block-oriented” API is, or how it differs, but let’s have the conversation.
Sure, but re-engineering the API of a publicly used package has significant cost for everyone involved:
- GHC: we might need to wait longer for any improvements and spend more time discussing various options (and compromises: what makes sense for GHC might not make sense for other people)
- Hoopl users: will need to migrate to the new APIs, potentially multiple times
- Hoopl maintainers: might need to maintain more than one branch of Hoopl for a while
And note that just bumping a version number might not be enough. IIRC Stackage only allows one version of each package and since Hoopl is a boot package for GHC, the new version will move to Stackage along with GHC. So any users of Hoopl that want to use the old package will not be able to use that version of Stackage.
> > When you say that we should “just fix Hoopl”, it sounds to me that we’d really need to rewrite it from scratch. And it’s much easier to do that if we can just experiment within GHC without worrying about breaking other existing Hoopl users
> Fine. But then let’s call it hoopl2, make it a separate package (perhaps with GHC as its only client for now), and declare that it’s intended to supersede hoopl.
Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
I've pointed out multiple reasons why I think it has a significant cost. But I don't really see any major benefits. Looking at the commit history of Hoopl, there hasn't been much development on it since 2012 when Simon M was trying to get the new GHC backend working (since then, it's mostly maintenance patches to keep up with changes in `base`, etc).
Extracting a core part of any project to a shared library has some real costs, so there should be equally real benefits that outweigh that cost. (If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?)
I also do think this is quite different from a dependency on, say, `binary`, `containers` or `pretty`, where the API of the library is smaller (at least conceptually), much better understood and established.
Cheers,
Michal

Michal Terepeta wrote:
> Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
> I've pointed out multiple reasons why I think it has a significant cost. But I don't really see any major benefits. Looking at the commit history of Hoopl there hasn't been much development on it since 2012 when Simon M was trying to get the new GHC backend working (since then, it's mostly maintenance patches to keep up with changes in `base`, etc). Extracting a core part of any project to a shared library has some real costs, so there should be equally real benefits that outweigh that cost. (If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?)
One way forward here would be to ask those who would be affected by an API rework whether they would be open to change. I don't believe there are too many hoopl users at the moment, but I recall that previous efforts to change the library's interface were met with some resistance.
However, even if we found that hoopl's current user-base is agreeable to change, we would still need to account for the fact that advancing GHC in lockstep with an out-of-tree hoopl will take more effort than advancing it under Michal's merge proposal. Admittedly, with submodules this additional effort isn't too large, but it's still more than having hoopl and GHC under one tree.
Cheers,
- Ben

> Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
One reason only: because it makes Hoopl usable by compilers other than GHC. And, dually, efforts by others to improve Hoopl will benefit GHC.
> If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
A re-usable library should be
a) a significant chunk of code,
b) that can plausibly be re-purposed by others
c) and that has an explicable API
I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold. But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds. It’s designed to be re-usable. Whether it is actually re-used is another matter, of course. But if it’s part of GHC, it can’t be.
> Stackage only allows one version of each package
I didn’t know that, but I can see it makes sense. That makes a strong case for re-doing it as a new package hoopl2, if the API needs to change substantially (something we have yet to discuss).
> I've pointed out multiple reasons why I think it has a significant cost.
Can you just summarise them again briefly for me? If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.
Thanks!
Simon
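As a rough illustration of what “agnostic about the node types” means in the message above, the sketch below parameterises blocks and graphs over an arbitrary node type `n`. It is a toy under stated assumptions, not Hoopl's real representation (which additionally tracks open/closed block shapes in the types); `Block`, `Graph`, `reachable`, `CmmLike` and `StackOp` are names made up for this example.

```haskell
-- Hypothetical toy types; Hoopl's real Block/Graph are more elaborate.
newtype Label = Label Int deriving (Eq, Show)

-- A basic block that is polymorphic in its node type n.
data Block n = Block
  { entry :: Label    -- label at which the block is entered
  , body  :: [n]      -- the nodes; their type is left to the client
  , succs :: [Label]  -- labels of the successor blocks
  }

type Graph n = [Block n]

-- Labels reachable from a start label, in depth-first discovery order.
-- The function never inspects a node, so it works for every n.
reachable :: Graph n -> Label -> [Label]
reachable g start = go [start] []
  where
    go [] seen = reverse seen
    go (l : ls) seen
      | l `elem` seen = go ls seen
      | otherwise     = go (successorsOf l ++ ls) (l : seen)
    successorsOf l = concat [succs b | b <- g, entry b == l]

-- Two unrelated, hypothetical node types sharing the same graph code.
data CmmLike = Store Int | Load Int
data StackOp = Push Int | Pop

main :: IO ()
main = do
  let g1 = [ Block (Label 0) [Store 1] [Label 1]
           , Block (Label 1) [Load 1]  []
           , Block (Label 2) [Load 2]  [Label 0]
           ] :: Graph CmmLike
      g2 = [Block (Label 0) [Push 1, Pop] [Label 0]] :: Graph StackOp
  print (reachable g1 (Label 0))  -- prints [Label 0,Label 1]
  print (reachable g2 (Label 0))  -- prints [Label 0]
```

The same `reachable` serves both graphs because it is parametric in `n`; the cost side of the discussion is that such generality constrains the IR to fit the library's notion of blocks and labels.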

Lemme toss in my 2 cents as an outsider who likes to dabble in programming languages and compilers:
I would *love* to be able to just drop (parts of) GHC's optimisation into my toy compilers. Optimisation is complicated, lots of work, and not really the part I care about when toying with languages. I wasn't really aware of Hoopl before this thread, so now that I am, I'm kinda saddened by the idea of this reusable infrastructure being tossed out.
I don't really have any vested interest/opinion on how to deal with the current Hoopl situation, so if it's decided to write a Hoopl2.0 instead, without backwards compatibility, I would still consider that a win.
Cheers,
Merijn
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Hi Simon,
On 2017-06-09 at 09:50:51 +0200, Simon Peyton Jones via ghc-devs wrote: [...]
> > Stackage only allows one version of each package
> I didn’t know that, but I can see it makes sense. That makes a strong case for re-doing it as a new package hoopl2
The limitations of Stackage's design shouldn't drive or limit library design. Cabal has been moving to finally allow us to have multiple versions, and even multiple configurations/instances of the same version, of a package registered in the package db at the same time. Subjecting ourselves to Stackage's limitations *now*, after all the work done to that effect (and more in that direction is being considered to push the boundaries even further), seems quite backward to me.
If we push the idea to its conclusion, that we should publish a new package rather than release a new major version of a package to work around Stackage, you'd see a proliferation of number-suffixed packages on Hackage. Moreover, packages which can easily support multiple major versions of a package would have to use conditional logic boilerplate in their .cabal files (which again would be incompatible with Stackage's inherent limitations, as it allows only *one configuration* of a given package version).
We should build upon the facilities we already have in place, and major versions are here to encode the epoch/generation of an API. Moreover, as a big advantage over classic SemVer, we also have this 2-component major version, which gives us more flexibility for versioning while developing two or more epochs of an API in parallel. So hoopl-1.* and hoopl-2.* could keep evolving independently, each branch being able to perform major version increments in their respective version namespace.
Cheers,
HVR
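The "2-component major version" refers to the Haskell Package Versioning Policy, where in a version `A.B.C` the pair `A.B` is the major version. A minimal sketch of how that leaves room for two parallel epochs follows; the helper names (`pvpMajor`, `epoch`, `isBreakingBump`, `sameEpoch`) and the version numbers are assumptions made for this illustration, not part of any library or actual hoopl releases.

```haskell
import Data.Version (Version, makeVersion, versionBranch)

-- Under the PVP, the first two components (A.B) form the major version.
pvpMajor :: Version -> [Int]
pvpMajor = take 2 . versionBranch

-- "Epoch" in the sense used above: the first component alone.
-- An empty version is treated as epoch 0 (an arbitrary choice here).
epoch :: Version -> Int
epoch v = case versionBranch v of
  (a : _) -> a
  []      -> 0

-- A change in A.B signals that the API may have broken.
isBreakingBump :: Version -> Version -> Bool
isBreakingBump old new = pvpMajor old /= pvpMajor new

-- Two releases belong to the same API generation if A agrees.
sameEpoch :: Version -> Version -> Bool
sameEpoch a b = epoch a == epoch b

main :: IO ()
main = do
  -- Hypothetical version numbers, chosen only for the example.
  let v13 = makeVersion [1, 3, 0]
      v14 = makeVersion [1, 4, 0]
      v20 = makeVersion [2, 0, 1]
  print (pvpMajor v13)            -- prints [1,3]
  print (isBreakingBump v13 v14)  -- prints True
  print (sameEpoch v13 v14)       -- prints True
  print (sameEpoch v14 v20)       -- prints False
```

So a 1.x line can still make breaking releases (1.3 to 1.4) without colliding with the 2.x line, which is the flexibility the message describes.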

But equally, stackage is a major part of the haskell ecosystem.
As such, implications and paths forward need to be considered.
Alan

On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones
wrote: Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
One reason only: because it makes Hoopl usable by compilers other than GHC. And, dually, efforts by others to improve Hoopl will benefit GHC.
If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
A re-usable library should be a) a significant chunk of code, b) that can plausibly be re-purposed by others c) and that has an explicable API
I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold. But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds. It’s designed to be re-usable. Whether it is actually re-used is another matter, of course. But if it’s part of GHC, it can’t be.
I agree with your characterization of a re-usable library and that Core optimizer would not be a good fit. But I do think that Hoopl also has some problems with b) and c) (although smaller):
- Using an optimizer-as-a-library is not really common (I'm not aware of any compilers doing this, LLVM is to some degree close but it exposes the whole language as the interface so it's closer to the idea of extracting the whole Cmm backend). So I don't think the API for such a project is well understood.
- The API is pretty wide and does put serious constraints on the IR (after all it defines blocks and graphs), making reusability potentially more tricky.
So I think I understand your argument and we just disagree on whether this is worth the effort of having a separate package.
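To make the trade-off above concrete, here is a much simplified sketch in the spirit of a node-polymorphic block type. It is not Hoopl's actual API: `Block`, `Node`, `blockSize` and `example` are hypothetical names, and real blocks and graphs are considerably richer. It shows both sides of the argument: the block type never mentions a concrete IR, yet any client IR has to index its nodes by entry/exit shape in order to fit in.

```haskell
{-# LANGUAGE GADTs #-}

-- Shape indices: a node is Open or Closed on entry and on exit.
data O
data C

-- | A simplified, hypothetical block: one closed/open entry node, a run of
--   open/open middle nodes, and one open/closed exit node. The node type
--   @n@ is a parameter, so the "library" side knows nothing about the IR.
data Block n where
  Block :: n C O -> [n O O] -> n O C -> Block n

-- | A toy client IR (made up for this sketch). The shape indices are the
--   constraint the library imposes on the client.
data Node e x where
  Label  :: Int -> Node C O
  Assign :: String -> Int -> Node O O
  Goto   :: Int -> Node O C

-- | Count the middle nodes of a block, for any node type.
blockSize :: Block n -> Int
blockSize (Block _ mids _) = length mids

-- | A block of the toy IR: a label, two assignments, and a jump.
example :: Block Node
example = Block (Label 1) [Assign "x" 1, Assign "y" 2] (Goto 2)
```

Here `blockSize` works for every node type, which is the re-usability being argued for, while `Node` has to adopt the `C`/`O` indexing, which is the kind of constraint on the IR being objected to.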
[...]
I've pointed out multiple reasons why I think it has a significant cost.
Can you just summarise them again briefly for me? If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.
Having even Hoopl2 as a separate package would still entail additional work:
- Hoopl2 would still need to duplicate some concepts (e.g., `Unique`, etc., since it needs to be standalone).
- Understanding the code (esp. for newcomers) would be harder: the Cmm backend would be split between GHC and Hoopl2, with the latter necessarily being far more general/polymorphic than needed by GHC.
- Getting the right performance in the presence of all this additional generality/polymorphism will likely require a fair amount of additional work.
- If Hoopl2 is used by other compilers, then we need to be more careful about changing anything in incompatible ways; this will require more discussions & release coordination.
Considering that Hoopl was never actually picked up by other compilers, I'm not convinced that this cost is justified. But I understand that other people might have a different opinion.
So how about a compromise:
- decouple GHC from the current Hoopl (i.e., go ahead with my diff),
- keep everything Hoopl-related only in `compiler/cmm/Hoopl` with the long-term intention of creating a separate package,
- experiment with and improve the code,
- once (if?) we're happy with the results, discuss what/how to extract to a separate package.
That gives us the freedom to try things out and see what works well (I simply don't have ready solutions for anything; being able to experiment is IMHO quite important). And once we reach the right performance/representation/abstraction/API we can work on extracting that.
What do you think?
Cheers,
Michal
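One concrete source of the performance cost raised in this message: in GHC, class-polymorphic code takes a dictionary argument at runtime unless it is specialized or inlined at the use site. The sketch below is only an illustration of that mechanism, not code from GHC or Hoopl; `fixpoint` is a hypothetical, deliberately tiny stand-in for a generic dataflow iteration.

```haskell
-- | Iterate a step function until the value stops changing.
--   The 'Eq' constraint is passed as a dictionary unless GHC specializes
--   or inlines the function for a concrete type.
fixpoint :: Eq a => (a -> a) -> a -> a
fixpoint step x
  | x' == x   = x
  | otherwise = fixpoint step x'
  where
    x' = step x

-- Ask GHC to also compile a monomorphic copy for 'Int', so callers at that
-- type avoid the dictionary. A general-purpose library has to anticipate
-- such types (or mark functions INLINABLE), which is part of the extra work.
{-# SPECIALIZE fixpoint :: (Int -> Int) -> Int -> Int #-}
```

For instance, `fixpoint (\n -> min 10 (n + 1)) 0` climbs from 0 and stops at 10. Code that lives inside GHC can simply be written against the concrete fact and node types instead.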

Hello, fellow workers!
So, I'll pop in here with my thoughts.
I'm writing an independent intermediate language library for functional
languages, and I looked at using Hoopl. I would use it, but there are
several reasons why I'm not currently doing so:
1) Combining facts from different domains through fancy lattice algorithms.
This is fairly straightforward to add to Hoopl with minimal extra API
change.
2) I wanted to write my data facts as a type-level list, `freer-effects`
style, in order to be more explicit in my types about dependencies between
analyses. This would require significantly altering the API.
3) Its own custom graph code. This is the biggest reason why I decided not
to. Some problems:
* It seems impossible to change the topology of the graph in a rewriting
step.
* I wanted to use term hypergraphs/hyperjungles due to some pretty nifty
properties
* The intermediate language I'm implementing, a derivative of Graph
Reduction Intermediate Notation, aka GRIN from UHC, is, as its name
implies, intrinsically graph-based. Thus, graph manipulation has to be
pretty easy to do.
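As a rough illustration of point 1 above, the simplest way to combine facts from two analyses is a componentwise product lattice. This is only a sketch under stated assumptions: the `Lattice` class, `Reached` and `Const` are hypothetical names rather than Hoopl's API, and the fancier combinations alluded to (where one analysis refines the other) would need more than this.

```haskell
-- | A join-semilattice with a least element (hypothetical class).
class Lattice a where
  bottom :: a
  lub    :: a -> a -> a   -- least upper bound, used to merge facts

-- | "May this point be reached?": False is bottom, join is logical or.
newtype Reached = Reached Bool
  deriving (Eq, Show)

instance Lattice Reached where
  bottom = Reached False
  lub (Reached a) (Reached b) = Reached (a || b)

-- | A flat constant-propagation fact for a single variable.
data Const = NoInfo | Known Int | Varies
  deriving (Eq, Show)

instance Lattice Const where
  bottom = NoInfo
  lub NoInfo c = c
  lub c NoInfo = c
  lub (Known a) (Known b) | a == b = Known a
  lub _ _ = Varies

-- | Facts from two domains combine componentwise.
instance (Lattice a, Lattice b) => Lattice (a, b) where
  bottom = (bottom, bottom)
  lub (a1, b1) (a2, b2) = (lub a1 a2, lub b1 b2)
```

Merging `(Reached True, Known 1)` with `(Reached False, Known 2)` gives `(Reached True, Varies)`: each component is joined independently, which is why this needs little API change but also cannot express dependencies between the analyses.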
So instead, I've decided to optimise another hypergraph library
(`graph-rewriting` - I'm going to be rewriting it to use an inductive
representation a la FGL) and implement a generic, Hoopl-esque analysis
library on top of that. (Or more accurately, that is my plan for the next
six months - I've been sidetracked getting parsing to work nice with an
effect-based stack!)
So, if Hoopl2 does become a thing, I'd be very keen on working on it, but
if I were to actually use it myself, it'd probably require a complete
rewrite. Fortunately, it's a pretty small library; and for GHC, its current
usage is a pretty straightforward use case which shouldn't be affected too
much. That being said, if GHC were to better use Hoopl (e.g. moving some of
the optimisations on Core to be Hoopl-based passes) then it would be a
different story.
So I guess I'm volunteering to do the rewrite for a potential Hoopl2 if
it's wanted, as I'm about to do pretty much that anyway.
Cheers,
Sophie
On Fri, 9 Jun 2017 at 22:31 Michal Terepeta wrote:
[...]

Interesting!
Maybe there are a couple of different alternatives:
A. A rewrite of Hoopl, with all the same basic ideas and data structures, but with a better API (I’m not sure exactly in what way, but Michal has some idea, as does Sophie), and a more efficient implementation.
B. A more radical change to use hypergraphs, type-level lists etc. This sounds interesting, but it’s a more substantial change, and before using it for GHC we’d need to discuss the new proposed API in some detail.
There’s no reason we couldn’t do (A) and (B) in parallel.
Michal is suggesting doing (A) in GHC’s tree, but with a clearly-declared intent to bring it out as a separate library. (I’d advocate making it a separate library in GHC’s tree; we already have a number of those.)
That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.
Does that sound plausible? Do we know of any other Hoopl users?
Simon
From: Sophie Taylor [mailto:sophie@traumapony.org]
Sent: 11 June 2017 14:09
To: Michal Terepeta
[...]

I don't see why not, other than possible duplication of effort when it
comes to some of the basic algorithms.
Speaking of which, what policies are there on bringing in new dependencies
to GHC, both compile-time and run-time (e.g. possible SMT solver support)?
On Mon, 12 Jun 2017 at 17:07 Simon Peyton Jones wrote:
[...]
From: Sophie Taylor [mailto:sophie@traumapony.org]
Sent: 11 June 2017 14:09
To: Michal Terepeta; Simon Peyton Jones <simonpj@microsoft.com>; ghc-devs
Cc: Kavon Farvardin
Subject: Re: Removing Hoopl dependency?
[...]

Sophie Taylor
I don't see why not, other than possible duplication of effort when it comes to some of the basic algorithms.
Speaking of which, what policies are there on bringing in new dependencies to GHC, both compile-time and run-time (e.g. possible SMT solver support)?
We are generally fairly conservative with adding new dependencies of either type. There are a variety of reasons for this.
In the case of runtime dependencies the associated costs are fairly clear: they would either a) make it harder for users to use GHC (in the case of mandatory dependencies), or b) make it harder to follow the behavior of the compiler (in the case of optional dependencies discovered at runtime).
There are also costs in the case of compile-time dependencies, although they may not be as easy to see. First, in order to maintain a reproducible revision history GHC includes all dependent libraries as submodules and ships them with source distributions. These submodules carry a small but non-negligible cost to developers due to idiosyncrasies in how they are handled by both git and Phabricator.
Moreover, we need to periodically bump these submodules, which inevitably brings integration issues which require coordination with upstream to fix. Also, there is a significant synchronization overhead associated with getting upstream maintainers to release new library versions prior to a GHC release. While this generally only affects the release manager, for that person it is indeed a significant cost and does tend to slow down the release cycle.
Finally, dependencies of the `ghc` library affect users of tooling which links to it (e.g. ghc-mod). Specifically, since we can only link against a single version of a given package at a time, such tooling packages are forced to link against whatever version `ghc` depends upon. This means that users won't get bugfixes, and it can constrain install plans, sometimes to the point where no plan is possible.
Cheers,
- Ben

Speaking of which, what policies are there on bringing in new dependencies to GHC, both compile-time and run-time (e.g. possible SMT solver support)?
We don’t have a formal policy, but we are generally reluctant to take on new dependencies. For SMT solvers, Iavor is using one via a typechecker plugin.
Simon
From: Sophie Taylor [mailto:sophie@traumapony.org]
Sent: 12 June 2017 09:50
To: Simon Peyton Jones
[...]

Ben, Simon,
Thanks, that's good to know!
On Tue, 13 Jun 2017 at 07:48 Simon Peyton Jones
Speaking of which, what policies are there on bringing in new dependencies to GHC, both compile-time and run-time (e.g. possible SMT solver support)?
We don’t have a formal policy, but we are generally reluctant to take on new dependencies. For SMT solvers, Iavor is using one via a typechecker plugin.
Simon
*From:* Sophie Taylor [mailto:sophie@traumapony.org] *Sent:* 12 June 2017 09:50 *To:* Simon Peyton Jones
; Michal Terepeta < michal.terepeta@gmail.com>; ghc-devs *Cc:* Kavon Farvardin
*Subject:* Re: Removing Hoopl dependency? I don't see why not, other than possible duplication of effort when it comes to some of the basic algorithms.
Speaking of which, what policies are there on bringing in new dependencies to GHC, both compile-time and run-time (e.g. possible SMT solver support)?
On Mon, 12 Jun 2017 at 17:07 Simon Peyton Jones
wrote: Interesting!
Maybe there are a couple of different alternatives:
A. A rewrite of Hoopl, with all the same basic ideas and data structures, but with a better API (I’m not sure exactly in what way, but Michael has some idea, as does Sophie), and a more efficient implementation.
B. A more radical change to use hypergraphs, type-level lists etc. This sounds interesting, but it’s a more substantial change and before using it for GHC we’d need to discuss the new proposed API in some detail
There’s no reason we couldn’t do (A) and (B) in parallel.
Michael is suggesting doing (A) in GHC’s tree, but with a clearly-declared intent to bring it out as a separate library. (I’d advocate *making* it a separate library in GHC’s tree; we already have a number of those.
That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.
Does that sound plausible? Do we know of any other Hoopl users?
Simon
*From:* Sophie Taylor [mailto:sophie@traumapony.org] *Sent:* 11 June 2017 14:09 *To:* Michal Terepeta
; Simon Peyton Jones < simonpj@microsoft.com>; ghc-devs *Cc:* Kavon Farvardin
*Subject:* Re: Removing Hoopl dependency? Hello, fellow workers!
So, I'll pop in here with my thoughts.
I'm writing an independent intermediate language library for functional languages, and I looked at using Hoopl. I would use it, but there are several reasons why I'm not currently doing so:
1) Combining facts from different domains through fancy lattice algorithms. This is fairly straightforward to add to Hoopl with minimal extra API change.
2) I wanted to write my data facts as a type-level list, `freer-effects` style, in order to be more explicit in my types about dependencies between analyses. This would require significantly altering the API.
3) Its own custom graph code. This is the biggest reason why I decided not to. Some problems:
* It seems impossible to change the topology of the graph in a rewriting step.
* I wanted to use term hypergraphs/hyperjungles due to some pretty nifty properties
* The intermediate language I'm implementing, a derivative of Graph Reduction Intermediate Notation, aka GRIN from UHC, is, as its name implies, intrinsically graph-based. Thus, graph manipulation has to be pretty easy to do.
So instead, I've decided to optimise another hypergraph library (`graph-rewriting` - I'm going to be rewriting it to use an inductive representation a la FGL) and implement a generic, Hoopl-esque analysis library on top of that. (Or more accurately, that is my plan for the next six months - I've been sidetracked getting parsing to work nice with an effect-based stack!)
So, if Hoopl2 does become a thing, I'd be very keen on working on it, but if I were to actually use it myself, it'd probably require a complete rewrite. Fortunately, it's a pretty small library; and for GHC, its current usage is a pretty straightforward usecase which shouldn't be affected too much. That being said, if GHC were to better use Hoopl (e.g. moving some of the optimisations on Core to be Hoopl-based passes) then it would be a different story.
So I guess I'm volunteering to do the rewrite for a potential Hoopl2 if it's wanted, as I'm about to do pretty much that anyway.
Cheers,
Sophie
On Fri, 9 Jun 2017 at 22:31 Michal Terepeta wrote:
On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones wrote:
Maybe this is the core of our disagreement - why is it a good idea to have Hoopl as a separate package in the first place?
One reason only: because it makes Hoopl usable by compilers other than GHC. And, dually, efforts by others to improve Hoopl will benefit GHC.
If I proposed extracting parts of Core optimizer to a separate package, wouldn't you expect some really good reasons for doing this?
A re-usable library should be
a) a significant chunk of code,
b) that can plausibly be re-purposed by others
c) and that has an explicable API
I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are unlikely to hold. But we carefully designed Hoopl from the ground up so that it was agnostic about the node types, and so can be re-used for control flow graphs of many kinds. It’s designed to be re-usable. Whether it is actually re-used is another matter, of course. But if it’s part of GHC, it can’t be.
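As a rough illustration of what "agnostic about the node types" means, here is a minimal sketch of a block type that is generic in the node type and only tracks whether each node is open or closed on entry and exit. It is a simplification for this thread, not Hoopl's real definitions: the `Block` constructors, `blockSize`, and the `ToyNode` IR are all hypothetical.

```haskell
{-# LANGUAGE GADTs #-}

-- | Shape tags: a node or block is Open or Closed on entry and on exit.
data O
data C

-- | A basic block over any node type @n@ that is indexed by entry/exit shape.
-- The block knows nothing about what the nodes actually are.
data Block n e x where
  First  :: n C O -> Block n C O                      -- entry node
  Middle :: n O O -> Block n O O                      -- straight-line node
  Last   :: n O C -> Block n O C                      -- control transfer
  Cat    :: Block n e O -> Block n O x -> Block n e x -- sequencing

-- | Count the nodes in a block; works for every client node type.
blockSize :: Block n e x -> Int
blockSize (First _)  = 1
blockSize (Middle _) = 1
blockSize (Last _)   = 1
blockSize (Cat a b)  = blockSize a + blockSize b

-- | A toy client IR (hypothetical): the client only supplies its nodes.
data ToyNode e x where
  Entry  :: Int -> ToyNode C O           -- block label
  Assign :: String -> Int -> ToyNode O O -- variable := constant
  Goto   :: Int -> ToyNode O C           -- jump to a label

-- | A complete (closed/closed) block built from the toy nodes.
example :: Block ToyNode C C
example = First (Entry 0) `Cat` Middle (Assign "x" 1) `Cat` Last (Goto 1)
```

The type indices rule out ill-formed blocks, such as a jump followed by an assignment, for any client IR. The flip side, raised later in the thread, is that the library rather than the client ends up defining what blocks and graphs look like.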
I agree with your characterization of a re-usable library and that
Core optimizer would not be a good fit. But I do think that Hoopl also
has some problems with b) and c) (although smaller):
- Using an optimizer-as-a-library is not really common (I'm not aware
of any compilers doing this, LLVM is to some degree close but it
exposes the whole language as the interface so it's closer to the
idea of extracting the whole Cmm backend). So I don't think the API
for such a project is well understood.
- The API is pretty wide and does put serious constraints on the IR
(after all it defines blocks and graphs), making reusability
potentially more tricky.
So I think I understand your argument and we just disagree on whether
this is worth the effort of having a separate package.
[...]
I've pointed out multiple reasons why I think it has a significant cost.
Can you just summarise them again briefly for me? If we are free to choose nomenclature and API for hoopl2, I’m not yet seeing why making it a separate package is harder than not doing so. E.g. template-haskell is a separate package.
Having even Hoopl2 as a separate package would still entail
additional work:
- Hoopl2 would still need to duplicate some concepts (eg, `Unique`,
etc. since it needs to be standalone)
- Understanding code (esp. by newcomers) would be harder: the Cmm
backend would be split between GHC and Hoopl2, with the latter
necessarily being far more general/polymorphic than needed by GHC.
- Getting the right performance in the presence of all this additional
generality/polymorphism will likely require a fair amount of
additional work.
- If Hoopl2 is used by other compilers, then we need to be more
careful changing anything in incompatible ways, this will require
more discussions & release coordination.
Considering that Hoopl was never actually picked up by other
compilers, I'm not convinced that this cost is justified. But I
understand that other people might have a different opinion.
So how about a compromise:
- decouple GHC from the current Hoopl (ie, go ahead with my diff),
- keep everything Hoopl related only in `compiler/cmm/Hoopl` with the
long-term intention of creating a separate package,
- experiment with and improve the code,
- once (if?) we're happy with the results, discuss what/how to
extract to a separate package.
That gives us the freedom to try things out and see what works well
(I simply don't have ready solutions for anything, being able to
experiment is IMHO quite important). And once we reach the right
performance/representation/abstraction/API we can work on extracting
that.
What do you think?
Cheers,
Michal

Simon Peyton Jones via ghc-devs writes:
That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.
Does that sound plausible? Do we know of any other Hoopl users?
CCing Ning, who is currently maintaining hoopl and I believe has some projects using it.
Ning, you may want to have a look through this thread if you haven't already seen it. You can find the previous messages in the list archive [1].
Cheers,
- Ben
[1] May messages: https://mail.haskell.org/pipermail/ghc-devs/2017-May/014255.html
June messages: https://mail.haskell.org/pipermail/ghc-devs/2017-June/014293.html

On Mon, Jun 12, 2017 at 8:05 PM Ben Gamari wrote:
Simon Peyton Jones via ghc-devs writes:
Snip
That would leave Sophie free to do (B) free of the constraints of GHC depending on it; but we could always use it later.
Does that sound plausible? Do we know of any other Hoopl users?
CCing Ning, who is currently maintaining hoopl and I believe has some projects using it.
Ning, you may want to have a look through this thread if you haven't already seen it. You can find the previous messages in the list archive [1].
Cheers,
- Ben
Based on [1] there are four public packages:
- ethereum-analyzer,
- linearscan-hoopl,
- llvm-analysis,
- text-show-instances
But there might be more that are not open-source/uploaded to hackage/stackage.
Cheers,
Michal
[1] https://www.stackage.org/lts-8.18/package/hoopl-3.10.2.1
participants (8)
- Alan & Kim Zimmerman
- Ben Gamari
- Erik de Castro Lopo
- Herbert Valerio Riedel
- Merijn Verstraaten
- Michal Terepeta
- Simon Peyton Jones
- Sophie Taylor