Re: Static data and RULES

17 Feb 2017

      I've never used such rules myself, but when I asked Duncan Coutts about 
whether and how such rules were used in the wild, he said
...
Well I've certainly tried to use that in the past.
A previous version of the cbor lib which used a different
representation did a lot of matching on constructors to re-arrange input to
an interpreter, until I discovered that GHC
actually uses constructor wrappers and that matching on constructors was
thus not reliable.
He described such rules as "a totally legit thing to want to do". If a 
datatype represents an AST, then rewriting its terms can optimize the 
constructed programs. Of course, it's ultimately up to you. I have no dog in 
this race myself; my concern was for other people's code that could break as a 
result. Certainly such code is already fragile when strict constructors are 
involved, but if people have cleverly figured out that lazy constructors are 
more reliable in that regard, they could be using it. I don't know.

David

On Friday, February 17, 2017 8:06:17 AM EST Simon Peyton Jones via ghc-devs 
wrote:
...
{-# RULES   "L" LCon1 0 = LCon2
Oh I missed this entirely. You want to write a rule FOR a data
constructor????   I thought you just meant one that matches on a data
constructor.
 That is you want (L 0) to rewrite, all by itself, to LCon2? 
That had never occurred to me as a possibility.  Bizarre. Let’s not do
that.
·         GHC does not (knowingly) support it today
·         It is a deeply weird thing to do
·         If you want to do it, write you own “smart constructor” mkLCon1,
that inlines when you say
...
mkLCon1 x = LCon1 x
{-# INILNE [0] mkLCon1 #-}
{-# RULES “L” mkLCon1 x = LCon2 #-}
Problem solved.
Simon
From: David Feuer [mailto:david.feuer@gmail.com]
Sent: 17 February 2017 00:30
To: Simon Peyton Jones 
Cc: ghc-devs ; Reid Barton ; Ben
Gamari 
 Subject: RE: Static data and RULES
Let me give an example. Suppose we have
data L = LCon1 Int | LCon2
data S = SCon !Int
{-# RULES
"L" LCon1 0 = LCon2
"S" forall x . f (SCon x) = g x
 #-}
The immediate problem today is with "S". The SCon wrapper could very well
inline before the rule has a chance to fire. We'd like to be able to phase
that inline to give it a chance.
...
The "L" rule becomes problematic when we try to identify static data the
simplifier shouldn't have to try to optimize. If it identifies LCon 0 as
static, the "L" rule will never fire.
...
On Feb 16, 2017 7:08 PM, "David Feuer"
mailto:david.feuer@gmail.com> wrote:
 Semantically,
the proposed scheme is very nearly equivalent to breaking *every* data
constructor into a worker and a wrapper, and allowing INLINE and NOINLINE
pragmas on the wrappers. That would allow terms built only from constructor
workers and literals to be identified as they're constructed in any stage
and left alone by the simplifier. It would also allow people using RULES
that match on constructors to make those work reliably, by making sure the
bindings they match on don't inline away or get marked static too early. Of
course, we don't actually need to add more worker/wrapper pairs to do this;
we can fake that. 
On Feb 16, 2017 6:53 PM, "Simon Peyton Jones"
mailto:simonpj@microsoft.com> wrote:
 I’m sorry I
still don’t understand the problem.  Can you give an example?  It all works
fine today; what will change in the proposed new scheme.  Indeed what IS
the proposed new scheme? 
I’m lost
Simon
From: David Feuer
[mailto:david.feuer@gmail.commailto:david.feuer@gmail.com]
 Sent: 16
February 2017 23:51
To: Simon Peyton Jones
mailto:simonpj@microsoft.com>
 Cc: ghc-devs
mailto:ghc-devs@haskell.org>; Reid Barton
mailto:rwbarton@gmail.com>; Ben Gamari
mailto:bgamari@gmail.com> Subject: RE: Static data and
RULES
Sorry; guess I should have given more background on that. This goes back to
the performance problems Ben encountered in Typeable. The goal is to avoid
trying to optimize something over and over that's never ever going to
change. If we know that a term is made only of static data, we can skip it
altogether in simplification. Suppose we have
...
foo = Just (Right [1])
Then no amount of optimization will ever be useful. But what about RULES? If
the outermost pattern in a rule matches on a data constructor, then it's
not static anymore! We may be replacing it with something else. So we need
a finer mechanism. We *also* need a finer mechanism for strict constructors
in general. We need to avoid inlining those too early if they're mentioned
in any position in RULES. Trying to make this work right automagically
looks a bit tricky in the face of orphan rules and such.
...
On Feb 16, 2017 6:35 PM, "Simon Peyton Jones"
mailto:simonpj@microsoft.com> wrote:
 I don’t
understand any of this.
However, RULES are allowed to match on data constructors and it would be
nice to let that keep happening.
...
Why won’t it keep happening?  What is the problem you are trying to solve? 
Why does the fast-path make it harder?
...
Maybe open a ticket?
Simon
From: ghc-devs
[mailto:ghc-devs-bounces@haskell.orgmailto:ghc-devs-bounces@haskell.org]
On Behalf Of David Feuer
 Sent: 16 February 2017 22:13
To: Ben Gamari mailto:bgamari@gmail.com>; Reid Barton
mailto:rwbarton@gmail.com>
 Cc: ghc-devs
mailto:ghc-devs@haskell.org>
Subject: Static data and RULES
Ben Gamari and Reid Barton are interested in making it cheaper for static
data to pass through simplification. The basic idea is that if a term is
already made entirely of data constructors and literals, then there's
nothing left to optimize.
...
However, RULES are allowed to match on data constructors and it would be
nice to let that keep happening. But on the other hand, RULES are
apparently (according to Duncan Coutts) already broken for strict data
constructors, because they have workers and wrappers.
...
My thought: let's allow phased INLINE and NOINLINE pragmas for data
constructors. The default would be INLINE. The ~ phase choice would not be
available: once inline, always inline.
...
Semantics
~~~~~~~~~~
For all constructors:
If a constructor is allowed by pragmas to inline in a certain phase, then in
that phase terms built from it can be considered static. Once static,
always static.
...
If a constructor is not allowed to inline in a certain phase, terms built
from it will be considered non-static.
...
After demand analysis and worker/wrapper, all constructors are considered
inline.
...
For strict constructors:
A strict constructor wrapper prohibited from inlining in a certain phase
simply will not.
...
Strict constructor wrappers will all be allowed to inline after demand
analysis and worker/wrapper. This matches the way we now handle wrappers
actually created in that phase.
...
Syntax:
For GADT syntax, this is easy:
data Foo ... where
  {-# INLINE [1] Bar #-}
  Bar :: ...
For traditional syntax, I think it's probably best to pull the pragmas to
the top:
...
{-# NOINLINE Quux #-}
data Baz ... = Quux ... | ...