OK, I backed it out for all but the compound cases. The performance test passes once more, and I can round-trip compound RdrNames.

On Fri, Dec 12, 2014 at 11:30 PM, Alan & Kim Zimmerman <alan.zimm@gmail.com> wrote:
On reflection, I can try to make it work with annotations just for those fairly rare cases where there are parens/backquotes, and use the location span otherwise.
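
As a rough illustration of that split, here is a minimal sketch. It is a toy model, not GHC code: `NameForm`, `classify` and `needsAnnotations` are hypothetical names, and the source text under the span is passed in as a plain `String`. The idea is only that a name whose span covers exactly the bare name needs nothing extra, while a name wrapped in parens or backquotes does.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Toy model, not GHC code: all names here are hypothetical.
data NameForm
  = Plain                -- the span covers exactly the name; no annotations needed
  | Decorated Char Char  -- the name is wrapped in an opening and closing delimiter
  deriving (Eq, Show)

-- | Classify the source text under a name's span.
-- 'name' is the bare identifier or operator, 'src' the text the span covers.
classify :: String -> String -> Maybe NameForm
classify name src
  | src == name = Just Plain
  | (o : rest) <- src
  , Just c <- lookup o [('(', ')'), ('`', '`')]
  , not (null rest)
  , last rest == c
  , trim (init rest) == name = Just (Decorated o c)
  | otherwise = Nothing
  where
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Only decorated names pay for the extra annotations.
needsAnnotations :: String -> String -> Bool
needsAnnotations name src =
  case classify name src of
    Just (Decorated _ _) -> True
    _                    -> False
```

With this, `needsAnnotations "baz" "baz"` is `False`, while the spaced-out paren and backquote forms from the examples in this thread are classified as `Decorated`.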

On Fri, Dec 12, 2014 at 11:20 PM, Alan & Kim Zimmerman <alan.zimm@gmail.com> wrote:
The problem is round-tripping cases like this, which are valid:

    (     ///     ) :: Int -> Int -> Int
    a /// b = 3

    baz :: Int -> Int -> Int
    a `     baz     ` b = 4

There can be arbitrary spaces between the surrounding parens and the operator name, and between the backquotes and the identifier in the infix version.

In each case we simply get a RdrName, which in turn is wrapped in HsVar or whatever.

The D538 productions are of the form

    var     :: { Located RdrName }
            : varid                 { $1 }
            | '(' varsym ')'        {% ams (sLL $1 $> (unLoc $2))
                                           [mo $1,mj AnnVal $2,mc $3] }

and
 
    tyvarop :: { Located RdrName }
    tyvarop : '`' tyvarid '`'       {% ams (sLL $1 $> (unLoc $2))
                                           [mj AnnBackquote $1,mj AnnVal $2
                                           ,mj AnnBackquote $3] }

So the location tracks the entire span, but we need annotations for the three individual parts.
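
To show why the outer span alone is not enough, here is a minimal sketch under simplifying assumptions: everything sits on one line, spans are half-open column ranges, and the types `Span`, `Part` and `Ann` are hypothetical stand-ins rather than the D538 types. Given the outer span plus one annotation per part, the original text can be rebuilt exactly, including the arbitrary spacing.

```haskell
import Data.List (sortOn)

-- Toy model, not GHC code: all names here are hypothetical.
-- A span is a half-open column range [start, end) on a single line.
type Span = (Int, Int)

-- The three pieces of a decorated name.
data Part = OpenDelim | Value | CloseDelim
  deriving (Eq, Show)

-- One annotation: which part, where it sits, and its text.
type Ann = (Part, Span, String)

-- | Rebuild the exact source text under the outer span by placing each
-- annotated part at its recorded column and padding the gaps with spaces.
render :: Span -> [Ann] -> String
render (start, end) anns = go start (sortOn spanStart anns)
  where
    spanStart (_, (s, _), _) = s
    go col [] = replicate (end - col) ' '
    go col ((_, (s, _), txt) : rest) =
      replicate (s - col) ' ' ++ txt ++ go (s + length txt) rest

-- The operator from the type signature example: "(     ///     )".
example :: [Ann]
example =
  [ (OpenDelim,  (0, 1),   "(")
  , (Value,      (6, 9),   "///")
  , (CloseDelim, (14, 15), ")")
  ]
```

Here `render (0, 15) example` gives back the spaced-out operator as written, whereas the outer span `(0, 15)` and the name `///` on their own do not say where inside those 15 columns the operator sits.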

Note: I did not check how close to the limit the performance was prior to this change; it may have been the last 1% that took it over.

Alan


On Fri, Dec 12, 2014 at 11:03 PM, Simon Peyton Jones <simonpj@microsoft.com> wrote:

I am now adding an `AnnVal` to every RdrName, to be able to separate it out from any decoration, such as surrounding backticks or parens.

That seems like overkill to me.  (a `op` b) is an HsOpApp, and must of course have backticks unless op is an operator like (a + b), in which case it doesn’t.

The corner case is something like ((`op`) a b), which will parse as (HsApp (HsApp (HsVar op) (HsVar a)) (HsVar b)).  But it would be silly for us to get bent out of shape because of such a vanishingly rare corner case.  Instead, if you really want to reflect it faithfully, add a new constructor for “parens around backticks”.

Let’s only take these overheads when there is real reason to do so.

Simon

From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Alan & Kim Zimmerman
Sent: 12 December 2014 14:22
To: ghc-devs@haskell.org
Subject: D538 and compiler performance spec

For API annotations I am working on the details of RdrNames, which come in a bewildering variety of syntactic forms.

My latest change causes perf/compiler to fail, with

bytes allocated value is too high:
    Expected    parsing001(normal) bytes allocated: 587079016 +/-5%
    Lower bound parsing001(normal) bytes allocated: 557725065
    Upper bound parsing001(normal) bytes allocated: 616432967
    Actual      parsing001(normal) bytes allocated: 704940512
    Deviation   parsing001(normal) bytes allocated:      20.1 %
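
For reference, the numbers in that report can be reproduced with simple integer arithmetic. This is only a sketch of the calculation, assuming round-to-nearest; the real testsuite driver may compute its bounds differently, and `divRound`, `bounds` and `deviationTenths` are names made up for this example.

```haskell
-- Toy reconstruction of the arithmetic in the failure report; the real
-- testsuite driver may compute these values differently.
expected, actual :: Integer
expected = 587079016
actual   = 704940512

-- | Divide, rounding to nearest (for non-negative numerators).
divRound :: Integer -> Integer -> Integer
divRound n d = (2 * n + d) `div` (2 * d)

-- | Lower and upper bounds for an allowed deviation in whole percent.
bounds :: Integer -> Integer -> (Integer, Integer)
bounds pct e =
  ( divRound (e * (100 - pct)) 100
  , divRound (e * (100 + pct)) 100
  )

-- | Deviation from the expected value, in tenths of a percent.
deviationTenths :: Integer -> Integer -> Integer
deviationTenths e a = divRound ((a - e) * 1000) e
```

`bounds 5 expected` matches the lower and upper bounds printed above, and `deviationTenths expected actual` is 201, i.e. the reported 20.1 %, roughly four times the allowed 5 %.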

I am now adding an `AnnVal` to every RdrName, to be able to separate it out from any decoration, such as surrounding backticks or parens.

Is this a problem? The alternative would be to add a SourceText field to RdrName.

Alan