Methinks a lot of the former performance considerations in Unique are out-dated (as per earlier discussion; direct use of unboxed ints etc.).


An upside of using an ADT for the types of uniques is that we don't actually need to reserve 8 bits for a Char (which is committing to neither the actual number of classes, nor the "nature" of real Chars in Haskell). Instead, we can make a bitmask dependent on the number of classes that we actually use and stick the tag on the least-significant side of the Unique, as opposed to the most-significant (as we do now).


We want to keep things working on 32-bits, but maybe a future of parallel builds is only for 64-bits. In this case, I would suggest that the 64-bit-case looks like this:


<thread_id_bits:8> <unique_id_bits:56-X> <tag_bits:X>


whereas the 32-bit case simply has


<unique_id_bits:32-X> <tag_bits:X>


Where X is dependent on the size of the UniqueClass-sum-type (to be introduced). This would be CPP-magic'd using ​WORD_SIZE_IN_BITS.


Ph.







From: Simon Peyton Jones <simonpj@microsoft.com>
Sent: 20 August 2014 13:01
To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Subject: RE: Unique as special boxing type & hidden constructors
 

Sounds like a good idea to me.   Would need to think about making sure that it all still worked, somehow, on 32 bit.

 

S

 

From: p.k.f.holzenspies@utwente.nl [mailto:p.k.f.holzenspies@utwente.nl]
Sent: 20 August 2014 11:31
To: Simon Peyton Jones; ghc-devs@haskell.org
Subject: RE: Unique as special boxing type & hidden constructors

 

Dear Simon, et al,

 

I seem to recall that the Unique(Supply) was an issue in parallelising GHC itself. There's a comment in the code (signed JSM) that there aren't any 64-bit bugs, if we have at least 32-bits for Ints and Chars fit in 8 characters. Then, there's bitmasks like 0x00FFFFFF to separate the "Int-part" from the "Char-part".

 

I was wondering; if we move Uniques to 64 bits, but use the top 16 (instead of the current 8) for *both* the tag (currently a Char, soon an sum-type) and the threadId of the supplying thread of a Unique, would that help?

 

Regards,

Philip

 

 

 

 


From: Simon Peyton Jones <simonpj@microsoft.com>
Sent: 18 August 2014 23:29
To: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Subject: RE: Unique as special boxing type & hidden constructors

 

1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed?

 

I think so, unless others yell to the contrary.

 

2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different "classes" of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like:

 

> data UniqueClass

>   = UniqDesugarer

>   | UniqAbsCFlattener

>   | UniqSimplStg

>   | UniqNativeCodeGen

>   ...

 

OK by me

 

3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and:

> instance Outputable Unique where

>   ppr = pprUnique

 

Please don’t change this.  If you want to change how pretty-printing of uniques works, and want to find all the call sites of pprUnique, it’s FAR easier to grep for pprUnique than to search for  all calls of ppr, and work out which are at type Unique! 

 

(In my view) it’s usually much better not to use type classes unless you actually need overloading.

 

Simon

 

From: p.k.f.holzenspies@utwente.nl [mailto:p.k.f.holzenspies@utwente.nl]
Sent: 18 August 2014 14:50
To: Simon Peyton Jones; ghc-devs@haskell.org
Subject: RE: Unique as special boxing type & hidden constructors

 

Dear Simon, et al,

 

Looking at Unique, there are a few more design choices that may be outdated, and since I'm polishing things now, anyway, I figured I could update it on more fronts.

 

1) There is a #ifdef define(__GLASGOW_HASKELL__), which confused me somewhat. Similar things occur elsewhere in the code. Isn't the assumption that GHC is being used? Is this old portability stuff that may be removed?

 

2) Uniques are produced from a Char and an Int. The function to build Uniques (mkUnique) is not exported, according to the comments, so as to see all characters used. Access to these different "classes" of Uniques is given through specialised mkXXXUnique functions. Does anyone have a problem with something like:

 

> data UniqueClass

>   = UniqDesugarer

>   | UniqAbsCFlattener

>   | UniqSimplStg

>   | UniqNativeCodeGen

>   ...

 

and a public (i.e. exported) function:

 

> mkUnique :: UniqueClass -> Int -> Unique

 

? The benefit of this would be to have more (to my taste) self-documenting code and a greater chance that documentation is updated (the list of "unique supply characters" in the comments is currently outdated).

 

3) Is there a reason for having functions implementing class-methods to be exported? In the case of Unique, there is pprUnique and:

 

> instance Outputable Unique where

>   ppr = pprUnique

 

Here pprUnique is exported and it is used in quite a few places where it's argument is unambiguously a Unique (so it's not to force the type) *and* "ppr" is used for all kinds of other types. I'm assuming this is an old choice making things marginally faster, but I would say cleaning up the API / namespace would now outweigh this margin.

I will also be adding Haddock-comments, so when this is done, a review would be most welcome (I'll also be doing some similar transformations to other long-since-untouched-code).

 

Regards,

Philip

 

 

 

 

 

 

 

 


Van: Simon Peyton Jones <simonpj@microsoft.com>
Verzonden: maandag 18 augustus 2014 00:11
Aan: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Onderwerp: RE: Unique as special boxing type & hidden constructors

 

Re (1) I think this is a historical.  A newtype wrapping an Int should be fine.  I’d be ok with that change.

 

Re (2), I think your question is: why does module Unique export the data type Unique abstractly, rather than exporting both the data type and its constructor.  No deep reason here, but it guarantees that you can only *make* a unique from an Int by calling ‘mkUniqueGrimily’, which signals clearly that something fishy is going on.  And rightly so!

 

Simon

 

From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of p.k.f.holzenspies@utwente.nl
Sent: 15 August 2014 11:53
To: ghc-devs@haskell.org
Subject: Unique as special boxing type & hidden constructors

 

Dear all,

 

I'm working with Alan to instantiate everything for Data.Data, so that we can do better SYB-traversals (which should also help newcomers significantly to get into the GHC code base). Alan's looking at the AST types, I'm looking at the basic types in the compiler.

 

Right now, I'm looking at Unique and two questions come up:

 

> data Unique = MkUnique FastInt

 

1) As someone already commented: Is there a specific reason (other than history) that this isn't simply a newtype around an Int? If we're boxing anyway, we may as well use the default Int boxing and newtype-coerce to the specific purpose of Unique, no?

 

2) As a general question for GHC hacking style; what is the reason for hiding the constructors in the first place?

 

I understand about abstraction and there are reasons for hiding, but there's a "public GHC API" and then there are all these modules that people can import at their own peril. Nothing is guaranteed about their consistency from version to version of GHC. I don't really see the point about hiding constructors (getting in the way of automatically deriving things) and then giving extra functions like (in the case of Unique):

 

> getKeyFastInt (MkUnique x) = x

> mkUniqueGrimily x = MkUnique (iUnbox x)

 

I would propose to just make Unique a newtype for an Int and making the constructor visible.

Regards,

Philip