[GHC] #12415: Fancy BinIface encoding for tuples is broken for constraint tuples

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: #12357 Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- We have a special way of encoding tuple-related names in the interface file symbol table. See `Note [Symbol table representation of names]`. However, it was broken for constraint tuples by ffc21506894c7887d3620423aaf86bc6113a1071. The problem is very similar to the one which prevented us from encoding tuple type representations in #12357 (see ticket:12357#comment:31). By unwiring constraint tuples it becomes harder to identify them for special handling during serialization. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by bgamari: @@ -5,4 +5,22 @@ - ffc21506894c7887d3620423aaf86bc6113a1071. The problem is very similar to - the one which prevented us from encoding tuple type representations in - #12357 (see ticket:12357#comment:31). By unwiring constraint tuples it - becomes harder to identify them for special handling during serialization. + ffc21506894c7887d3620423aaf86bc6113a1071. Namely, `putName` now panics + when given a constraint tuple, + {{{#!hs + putTupleName_ :: BinHandle -> TyCon -> TupleSort -> Word32 -> IO () + putTupleName_ bh tc tup_sort thing_tag + = -- ASSERT(arity < 2^(30 :: Int)) + put_ bh (0x80000000 .|. (sort_tag `shiftL` 28) .|. (thing_tag `shiftL` + 26) .|. arity) + where + (sort_tag, arity) = case tup_sort of + BoxedTuple -> (0, fromIntegral (tyConArity tc)) + UnboxedTuple -> (1, fromIntegral (tyConArity tc `div` 2)) + -- See Note [Unboxed tuple RuntimeRep vars] in TyCon + ConstraintTuple -> pprPanic "putTupleName:ConstraintTuple" (ppr tc) + }}} + While this currently doesn't break anything, this is only because the + clever encoding isn't used for constraint tuples. + + I believe the problem is very similar to the one which prevented us from + encoding tuple type representations in #12357 (see + ticket:12357#comment:31). By unwiring constraint tuples it becomes harder + to identify them for special handling during serialization. New description: We have a special way of encoding tuple-related names in the interface file symbol table. See `Note [Symbol table representation of names]`. However, it was broken for constraint tuples by ffc21506894c7887d3620423aaf86bc6113a1071. Namely, `putName` now panics when given a constraint tuple, {{{#!hs putTupleName_ :: BinHandle -> TyCon -> TupleSort -> Word32 -> IO () putTupleName_ bh tc tup_sort thing_tag = -- ASSERT(arity < 2^(30 :: Int)) put_ bh (0x80000000 .|. (sort_tag `shiftL` 28) .|. (thing_tag `shiftL` 26) .|. arity) where (sort_tag, arity) = case tup_sort of BoxedTuple -> (0, fromIntegral (tyConArity tc)) UnboxedTuple -> (1, fromIntegral (tyConArity tc `div` 2)) -- See Note [Unboxed tuple RuntimeRep vars] in TyCon ConstraintTuple -> pprPanic "putTupleName:ConstraintTuple" (ppr tc) }}} While this currently doesn't break anything, this is only because the clever encoding isn't used for constraint tuples. I believe the problem is very similar to the one which prevented us from encoding tuple type representations in #12357 (see ticket:12357#comment:31). By unwiring constraint tuples it becomes harder to identify them for special handling during serialization. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by bgamari: @@ -26,1 +26,2 @@ - to identify them for special handling during serialization. + to identify them for special handling during serialization, when we only + have the `Name` to look at. New description: We have a special way of encoding tuple-related names in the interface file symbol table. See `Note [Symbol table representation of names]`. However, it was broken for constraint tuples by ffc21506894c7887d3620423aaf86bc6113a1071. Namely, `putName` now panics when given a constraint tuple, {{{#!hs putTupleName_ :: BinHandle -> TyCon -> TupleSort -> Word32 -> IO () putTupleName_ bh tc tup_sort thing_tag = -- ASSERT(arity < 2^(30 :: Int)) put_ bh (0x80000000 .|. (sort_tag `shiftL` 28) .|. (thing_tag `shiftL` 26) .|. arity) where (sort_tag, arity) = case tup_sort of BoxedTuple -> (0, fromIntegral (tyConArity tc)) UnboxedTuple -> (1, fromIntegral (tyConArity tc `div` 2)) -- See Note [Unboxed tuple RuntimeRep vars] in TyCon ConstraintTuple -> pprPanic "putTupleName:ConstraintTuple" (ppr tc) }}} While this currently doesn't break anything, this is only because the clever encoding isn't used for constraint tuples. I believe the problem is very similar to the one which prevented us from encoding tuple type representations in #12357 (see ticket:12357#comment:31). By unwiring constraint tuples it becomes harder to identify them for special handling during serialization, when we only have the `Name` to look at. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): One of the roadblocks standing in the way of reinstating this encoding is that the constraint tuple's `Name`s are actually quite different from the other tuples (and even boxed/unboxed sums). There are five `Name`s associated with a type (or class) that we need to worry about in these cases, a. the `Name` of the `TyCon` b. the `Name` of its `DataCon` c. the `Name` of its `DataCon` worker d. the `Name` of its type representation e. the `Name` of its promoted data constructor type representation In the case of boxed and unboxed tuples a, b, and c are wired-in and d and e are known-key. In the case of constraint tuples, however, a is known-key and the rest are unknown to the compiler, requiring a lookup in the `GHC.Classes` interface file. This means that it is quite difficult to spot most of the `Name`s that for boxed and unboxed tuples we encode specially. Simon, do you think this is worth fixing? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Earlier today it occurred to me that `putName` already does a finite map lookup which made me think twice about the approach I took in #12357, {{{#!hs knownKeyNamesMap :: UniqFM Name knownKeyNamesMap = listToUFM_Directly [(nameUnique n, n) | n <- knownKeyNames] putName :: BinDictionary -> BinSymbolTable -> BinHandle -> Name -> IO () putName _dict BinSymbolTable{...} bh name | name `elemUFM` knownKeyNamesMap , let (c, u) = unpkUnique (nameUnique name) -- INVARIANT: (ord c) fits in 8 bits = ... }}} `knownKeyNamesMap` is currently used for two things, * A membership check is made when encoding a name in `putName` so we know that the `Name` can be encoded as just its unique. * A lookup is done in `getSymtabName` to recover the `Name` from the unique during deserialization The fact that we already do a lookup here puts an option for resolving this issue (as well as the tuple type representation issue of #12357) back on the table which I had previously ruled out in ticket:12357#comment:29. The idea is that we simply keep a lookup data structure containing `Name`s of things that need special treatment during interface file serialization. Indeed this is precisely what `knownKeyNamesMap` is. One option here would be to add the `Name`s of constraint tuples to this map and change it's type a bit, {{{#!hs data KnownKeyThing = -- | Tuple things get a fancy encoding of their own. -- While tuple type and data constructors are wired-in and therefore -- easy to spot, type reps are merely known key so we need to identify -- them by a look-up in 'knownKeyThingsMap'. TupleTypeRep !TupleSort !Arity -- | Constraint tuples. Only the 'TyCon's of these are known key. | CTupleTyCon !Arity -- | Boxed and unboxed sums (these have a similar encoding to tuples) | SumTyCon !Boxity !Arity | SumDataCon !Boxity !Arity !ConTagZ -- | Something which we know the key of; these things -- we encode in the interface file as just their 'Unique' | KnownKeyName Name knownKeyThingsMap :: NameEnv KnownKeyThing knownKeyThingsMap = mkNameEnv $ known_key_things ++ tuple_typerep_things ++ ctuple_things where known_key_things = [ (name, KnownKeyName name) | names <- knownKeyNames ] ctuple_things = [ (cTupleTyConName arity, CTupleTyCon arity) | arity <- [2..mAX_TUPLE_SIZE] ] tuple_typerep_things = [ (rep_name, TupleTypeRep tup_sort arity) | tup_sort <- [BoxedTuple, UnboxedTuple] , arity <- [2..mAX_TUPLE_SIZE] , let Just rep_name = tyConRepName_maybe $ tupleTyCon boxity arity ] }}} We'd then just modify the logic in `putName` to do the appropriate thing with the result of the lookup from this map. The deserialization side of things should be similarly straightforward (decoding the encoded `KnownKeyThing` and then doing what is necessary to turn it into a `Name`). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Simon and I have been doing a lot of thinking about how best to fix this: here is the current story. It looks like we can get rid of the entirely tuple serialization since tuples (and indeed all other wired-in things) are known-key and are therefore already serialized efficiently with the existing known-key logic in `BinIface`. I've gone ahead and removed this codepath in Phab:D2467. However, Phab:D2467 still leaves some room for improvement as we have an extremely large number of tuple and sum names in `knownKeyNames` and hence the name cache unnecessarily. These can in principle be removed (and indeed this was what lead me to this ticket to begin with). However, as this is largely an orthogonal matter I'll be tracking it in # -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #12357 | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * status: new => closed * resolution: => fixed * milestone: => 8.2.1 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#12415: Fancy BinIface encoding for tuples is broken for constraint tuples -------------------------------------+------------------------------------- Reporter: bgamari | Owner: Type: bug | Status: closed Priority: normal | Milestone: 8.2.1 Component: Compiler | Version: 8.0.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: #12357 | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Hooray! Well done Ben. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12415#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC