Broken Data.Data instances

Dear GHC-ers, Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB's "everything" being broken by these instances, not so much. Would a patch "fixing" these instances be acceptable? Regards, Philip

GHC's data structures are often mutually recursive. e.g. * The TyCon for Maybe contains the DataCon for Just * The DataCon For just contains Just's type * Just's type contains the TyCon for Maybe So any attempt to recursively walk over all these structures, as you would a tree, will fail. Also there's a lot of sharing. For example, every occurrence of 'map' is a Var, and inside that Var is map's type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map. Maybe that's it; I'm not certain since I did not write the Data instances for any of GHC's types Simon From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of p.k.f.holzenspies@utwente.nl Sent: 24 July 2014 16:42 To: ghc-devs@haskell.org Subject: Broken Data.Data instances Dear GHC-ers, Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB's "everything" being broken by these instances, not so much. Would a patch "fixing" these instances be acceptable? Regards, Philip

Dear Simon, et al, These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful). So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable? Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested? Regards, Philip
Simon Peyton Jones mailto:simonpj@microsoft.com 24 Jul 2014 18:22
GHC's data structures are often mutually recursive. e.g.
·The TyCon for Maybe contains the DataCon for Just
·The DataCon For just contains Just's type
·Just's type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there's a lot of sharing. For example, every occurrence of 'map' is a Var, and inside that Var is map's type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that's it; I'm not certain since I did not write the Data instances for any of GHC's types
Simon
*From:*ghc-devs [mailto:ghc-devs-bounces@haskell.org] *On Behalf Of *p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances
Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB's "everything" being broken by these instances, not so much.
Would a patch "fixing" these instances be acceptable?
Regards,
Philip

So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable? That's fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented. Simon From: "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] Sent: 24 July 2014 18:42 To: Simon Peyton Jones Cc: ghc-devs@haskell.org Subject: Re: Broken Data.Data instances Dear Simon, et al, These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful). So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable? Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested? Regards, Philip [cid:image001.jpg@01CFA78B.7D356DE0] Simon Peyton Jonesmailto:simonpj@microsoft.com 24 Jul 2014 18:22 GHC's data structures are often mutually recursive. e.g. · The TyCon for Maybe contains the DataCon for Just · The DataCon For just contains Just's type · Just's type contains the TyCon for Maybe So any attempt to recursively walk over all these structures, as you would a tree, will fail. Also there's a lot of sharing. For example, every occurrence of 'map' is a Var, and inside that Var is map's type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map. Maybe that's it; I'm not certain since I did not write the Data instances for any of GHC's types Simon From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of p.k.f.holzenspies@utwente.nlmailto:p.k.f.holzenspies@utwente.nl Sent: 24 July 2014 16:42 To: ghc-devs@haskell.orgmailto:ghc-devs@haskell.org Subject: Broken Data.Data instances Dear GHC-ers, Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB's "everything" being broken by these instances, not so much. Would a patch "fixing" these instances be acceptable? Regards, Philip

While we are talking about fixing traversals, how about getting rid of the
phase specific panic initialisers for placeHolderType, placeHolderKind and
friends?
In order to safely traverse with SYB, the following needs to be inserted
into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/...
)
-- Check the Typeable items
checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool
checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ`
fixity `SYB.extQ` nameSet) x
where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) ::
GHC.NameSet -> Bool
postTcType = const (stage < SYB.TypeChecker ) ::
GHC.PostTcType -> Bool
fixity = const (stage < SYB.Renamer ) ::
GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit
'undefined values.
Perhaps use an initialiser that can have its panic turned off when called
via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards,
Philip
________________________________
Van: Alan & Kim Zimmerman [alan.zimm@gmail.com]
Verzonden: vrijdag 25 juli 2014 13:44
Aan: Simon Peyton Jones
CC: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org
Onderwerp: Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

My intention would be to simply change the placeholders into something that would not blow up during a normal traversal, preferably something that still gives the required behaviour when invoked normally by GHC, to indicate a bug that needs fixing, but that can somehow be turned off at other tiimes. I am open to suggestions as to a mechanism that can achieve this, I thought of some kind of setting via Dynamic Flags. Alan On Sun, Jul 27, 2014 at 4:17 PM,
Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
------------------------------ *Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

To my knowledge there is no trac ticket to make the AST safe. Is this
correct? Can I make one?
Alan
On Sun, Jul 27, 2014 at 4:28 PM, Alan & Kim Zimmerman
My intention would be to simply change the placeholders into something that would not blow up during a normal traversal, preferably something that still gives the required behaviour when invoked normally by GHC, to indicate a bug that needs fixing, but that can somehow be turned off at other tiimes.
I am open to suggestions as to a mechanism that can achieve this, I thought of some kind of setting via Dynamic Flags.
Alan
On Sun, Jul 27, 2014 at 4:17 PM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
------------------------------ *Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl]
*Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Philip, Alan, If you need a hand, I'm happy to pitch in guidance. I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock. Simon, It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try. Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways. -Edward On Sun, Jul 27, 2014 at 10:17 AM,
Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
------------------------------ *Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Philip
How would you like to take this forward? From my side I would appreciate
all guidance/help to get it resolved, it is a huge hindrance for HaRe.
Alan
On Sun, Jul 27, 2014 at 7:27 PM, Edward Kmett
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
------------------------------ *Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl]
*Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

What if there is a good reason for a missing/broken Data.Data instance? I'm specifically thinking of GADTs. There are few currently, but I, for one, have toyed with the idea of adding more. My recollection is that Data.Data doesn't work with GADTs. As a concrete, existent example, see CoAxiom.BranchList, which allows for type-level reification of singleton lists as distinct from other, not-necessarily-singleton lists.
I would very much like to support API usage that would benefit from working Data.Data instances, but I also want to be sure we're not eliminating other possible futures without due discussion.
Richard
On Jul 27, 2014, at 2:04 PM, "Alan & Kim Zimmerman"
Philip
How would you like to take this forward? From my side I would appreciate all guidance/help to get it resolved, it is a huge hindrance for HaRe.
Alan
On Sun, Jul 27, 2014 at 7:27 PM, Edward Kmett
wrote: Philip, Alan, If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan, In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
Van: Alan & Kim Zimmerman [alan.zimm@gmail.com] Verzonden: vrijdag 25 juli 2014 13:44 Aan: Simon Peyton Jones CC: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org Onderwerp: Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman
wrote: While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends? In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/...)
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones
wrote: So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable? That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
From: "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] Sent: 24 July 2014 18:42 To: Simon Peyton Jones Cc: ghc-devs@haskell.org Subject: Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
Simon Peyton Jones
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of p.k.f.holzenspies@utwente.nl Sent: 24 July 2014 16:42 To: ghc-devs@haskell.org Subject: Broken Data.Data instances
Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Im mostly looking at the Data.Data stuff as "nice to have" at this point.
Well, it really is need to have for some users, but they can typically get
by by writing a few hundred lines of boilerplate when its not there.
If you need to break something internally and it costs us a Data instance
for something? Have at it.
If we can still hack around the changes with Data then great.
Otherwise the Data machinery has always been for expert users who already
deal with a great deal of breakage anyways, so thrashing on that API seems
fine to me. Not desirable, but not unexpected.
-Edward
On Sun, Jul 27, 2014 at 9:49 PM, Richard Eisenberg
What if there is a good reason for a missing/broken Data.Data instance? I'm specifically thinking of GADTs. There are few currently, but I, for one, have toyed with the idea of adding more. My recollection is that Data.Data doesn't work with GADTs. As a concrete, existent example, see CoAxiom.BranchList, which allows for type-level reification of singleton lists as distinct from other, not-necessarily-singleton lists.
I would very much like to support API usage that would benefit from working Data.Data instances, but I also want to be sure we're not eliminating other possible futures without due discussion.
Richard
On Jul 27, 2014, at 2:04 PM, "Alan & Kim Zimmerman"
wrote: Philip
How would you like to take this forward? From my side I would appreciate all guidance/help to get it resolved, it is a huge hindrance for HaRe.
Alan
On Sun, Jul 27, 2014 at 7:27 PM, Edward Kmett
wrote: Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
------------------------------ *Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl]
*Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there. There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is | HsMultiIf PostTcType [LGRHS id (LHsExpr id)] After type checking we know what type the thing has, but before we have no clue. We could get around this by saying type PostTcType = Maybe TcType but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed. It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky. However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this: | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)] and define PostTcType as a closed type family thus type family PostTcType a where PostTcType Id = TcType PostTcType other = () That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact. Simon From: Edward Kmett [mailto:ekmett@gmail.com] Sent: 27 July 2014 18:27 To: p.k.f.holzenspies@utwente.nl Cc: alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs Subject: Re: Broken Data.Data instances Philip, Alan, If you need a hand, I'm happy to pitch in guidance. I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock. Simon, It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try. Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways. -Edward On Sun, Jul 27, 2014 at 10:17 AM,
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

What about creating a specific type with a single constructor for the "not
relevant to this phase" type to be used instead of () above? That would
also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote:
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe.
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
Sent: maandag 28 juli 2014 11:14
To: Simon Peyton Jones
Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
Subject: Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
Sent: maandag 28 juli 2014 11:14
To: Simon Peyton Jones
Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
Subject: Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Philip I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better. On a related note, is there any way to constrain the 'a' in type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp to have an instance of Data? I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original). Alan On Mon, Jul 28, 2014 at 12:30 PM,
Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 11:14 *To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
wrote: What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Dear Alan, I would think you would want to constrain the result, i.e. type family (Data (PostTcType a)) => PostTcType a where … The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’. Your point about SYB-recognition of WrongPhase is, of course, a good one ;) Regards, Philip From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] Sent: maandag 28 juli 2014 14:10 To: Holzenspies, P.K.F. (EWI) Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org Subject: Re: Broken Data.Data instances Philip I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better. On a related note, is there any way to constrain the 'a' in type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp to have an instance of Data? I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original). Alan On Mon, Jul 28, 2014 at 12:30 PM,
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

I already tried that, the syntax does not seem to allow it. I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family Alan On Mon, Jul 28, 2014 at 4:55 PM,
Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
wrote: What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman
I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Status update
I have worked through a proof of concept update to the GHC AST whereby the
type is provided as a parameter to each data type. This was basically a
mechanical process of changing type signatures, and required very little
actual code changes, being only to initialise the placeholder types.
The enabling types are
type PostTcType = Type -- Used for slots in the abstract syntax
-- where we want to keep slot for a type
-- to be added by the type checker...but
-- [before typechecking it's just bogus]
type PreTcType = () -- used before typechecking
class PlaceHolderType a where
placeHolderType :: a
instance PlaceHolderType PostTcType where
placeHolderType = panic "Evaluated the place holder for a PostTcType"
instance PlaceHolderType PreTcType where
placeHolderType = ()
These are used to replace all instances of PostTcType in the hsSyn types.
The change was applied against HEAD as of last friday, and can be found here
https://github.com/alanz/ghc/tree/wip/landmine-param
https://github.com/alanz/haddock/tree/wip/landmine-param
They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I
have not tried to validate that yet, have no reason to expect failure.
Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic traversal safe AST
Regards
Alan
On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman
FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman
wrote:
I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

And I dipped my toes into the phabricator water, and uploaded a diff to
https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added
parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman
Status update
I have worked through a proof of concept update to the GHC AST whereby the type is provided as a parameter to each data type. This was basically a mechanical process of changing type signatures, and required very little actual code changes, being only to initialise the placeholder types.
The enabling types are
type PostTcType = Type -- Used for slots in the abstract syntax -- where we want to keep slot for a type -- to be added by the type checker...but -- [before typechecking it's just bogus] type PreTcType = () -- used before typechecking
class PlaceHolderType a where placeHolderType :: a
instance PlaceHolderType PostTcType where
placeHolderType = panic "Evaluated the place holder for a PostTcType"
instance PlaceHolderType PreTcType where placeHolderType = ()
These are used to replace all instances of PostTcType in the hsSyn types.
The change was applied against HEAD as of last friday, and can be found here
https://github.com/alanz/ghc/tree/wip/landmine-param https://github.com/alanz/haddock/tree/wip/landmine-param
They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I have not tried to validate that yet, have no reason to expect failure.
Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic traversal safe AST
Regards Alan
On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman
wrote:
FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl]
*Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Dear Alan,
I’ve had a look at the diffs on Phabricator. They’re looking good. I have a few comments / questions:
1) As you said, the renamer and typechecker are heavily interwoven, but when you *know* that you’re between renamer and typechecker (i.e. when things have ‘Name’s, but not ‘Id’s), isn’t it better to choose the PreTcType as argument? (Basically, look for any occurrence of “Name PostTcType” and replace with Pre.)
2) I saw your point about being able to distinguish PreTcType from () in SYB-traversals, but you have now defined PreTcType as a synonym for (). With an eye on the maximum line-width of 80 characters and these things being explicit everywhere as a type parameter (as opposed to a type family over the exposed id-parameter), how much added value is there still in having the names PreTcType and PostTcType? Would “()” and “Type” not be as clear? I ask, because when I started looking at GHC, I was overwhelmed with all the names for things in there, most of which then turn out to be different names for the same thing. The main reason to call the thing PostTcType in the first place was to give some kind of warning that there would be nothing there before TC.
3) The variable name “ptt” is a bit misleading to me. I would use “ty”.
4) In the cases of the types that have recently been parameterized in what they contain, is there a reason to have the ty-argument *after* the content-argument? E.g. why is it “LGRHS RdrName (LHsExpr RdrName PreTcType) PreTcType” instead of “LGRHS RdrName PreTcType (LHsExpr RdrName PreTcType)”? This may very well be a tiny stylistic thing, but it’s worth thinking about.
5) I much prefer deleting code over commenting it out. I understand the urge, but if you don’t remove these lines before your final commit, they will become noise in the long term. Versioning systems preserve the code for you. (Example: Convert.void)
Regards,
Philip
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
Sent: woensdag 13 augustus 2014 8:50
To: Holzenspies, P.K.F. (EWI)
Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances
And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Hi Philip Thanks for the feedback. Firstly, I see this as a draft change as a proof of concept, and as such I deliberately tried to keep things "obvious" until it had been fully worked through. It helped in managing my own confusion to limit the changes to be things that either HAD to change (PostTcType), or the introduction of new things that did not previously exist (ptt, PreTcType). Naming them the way I did I was able to make sure that I did not end up making cascading changes to currently good code when I was in a sticky point. This definitely helped in the renamer code. It also makes it clearer to current reviewers that this is in fact a straightforward change. If there is a consensus that this is something worth doing, then I agree on your proposed changes and will work them through. On the void thing I only realised afterwards what was happening, I am now not sure whether it is better to keep the new placeHolderType values or restore void as a synonym for it. It must definitely go it it is not used though. Alan On Wed, Aug 13, 2014 at 12:58 PM,
Dear Alan,
I’ve had a look at the diffs on Phabricator. They’re looking good. I have a few comments / questions:
1) As you said, the renamer and typechecker are heavily interwoven, but when you **know** that you’re between renamer and typechecker (i.e. when things have ‘Name’s, but not ‘Id’s), isn’t it better to choose the PreTcType as argument? (Basically, look for any occurrence of “Name PostTcType” and replace with Pre.)
2) I saw your point about being able to distinguish PreTcType from () in SYB-traversals, but you have now defined PreTcType as a synonym for (). With an eye on the maximum line-width of 80 characters and these things being explicit everywhere as a type parameter (as opposed to a type family over the exposed id-parameter), how much added value is there still in having the names PreTcType and PostTcType? Would “()” and “Type” not be as clear? I ask, because when I started looking at GHC, I was overwhelmed with all the names for things in there, most of which then turn out to be different names for the same thing. The main reason to call the thing PostTcType in the first place was to give some kind of warning that there would be nothing there before TC.
3) The variable name “ptt” is a bit misleading to me. I would use “ty”.
4) In the cases of the types that have recently been parameterized in what they contain, is there a reason to have the ty-argument **after** the content-argument? E.g. why is it “LGRHS RdrName (LHsExpr RdrName PreTcType) PreTcType” instead of “LGRHS RdrName PreTcType (LHsExpr RdrName PreTcType)”? This may very well be a tiny stylistic thing, but it’s worth thinking about.
5) I much prefer deleting code over commenting it out. I understand the urge, but if you don’t remove these lines before your final commit, they will become noise in the long term. Versioning systems preserve the code for you. (Example: Convert.void)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* woensdag 13 augustus 2014 8:50
*To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
Status update
I have worked through a proof of concept update to the GHC AST whereby the type is provided as a parameter to each data type. This was basically a mechanical process of changing type signatures, and required very little actual code changes, being only to initialise the placeholder types.
The enabling types are
type PostTcType = Type -- Used for slots in the abstract syntax -- where we want to keep slot for a type -- to be added by the type checker...but -- [before typechecking it's just bogus]
type PreTcType = () -- used before typechecking
class PlaceHolderType a where placeHolderType :: a
instance PlaceHolderType PostTcType where
placeHolderType = panic "Evaluated the place holder for a PostTcType"
instance PlaceHolderType PreTcType where placeHolderType = ()
These are used to replace all instances of PostTcType in the hsSyn types.
The change was applied against HEAD as of last friday, and can be found here
https://github.com/alanz/ghc/tree/wip/landmine-param https://github.com/alanz/haddock/tree/wip/landmine-param
They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I have not tried to validate that yet, have no reason to expect failure.
Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic traversal safe AST
Regards
Alan
On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman
wrote: FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman
wrote: I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
wrote: What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Eek. Glancing at this I see that every single data type has an extra type parameter. To me this feels like a sledgehammer to crack a nut. What is wrong with the type-function approach?
Simon
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
Sent: 13 August 2014 07:50
To: Philip K.F. Hölzenspies
Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances
And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Simon,
I've been encouraging the type family approach. See https://phabricator.haskell.org/D157
Thanks,
Richard
On Aug 15, 2014, at 11:17 AM, Simon Peyton Jones
Eek. Glancing at this I see that every single data type has an extra type parameter. To me this feels like a sledgehammer to crack a nut. What is wrong with the type-function approach?
Simon
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] Sent: 13 August 2014 07:50 To: Philip K.F. Hölzenspies Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org Subject: Re: Broken Data.Data instances
And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman
wrote: Status update
I have worked through a proof of concept update to the GHC AST whereby the type is provided as a parameter to each data type. This was basically a mechanical process of changing type signatures, and required very little actual code changes, being only to initialise the placeholder types.
The enabling types are
type PostTcType = Type -- Used for slots in the abstract syntax -- where we want to keep slot for a type -- to be added by the type checker...but -- [before typechecking it's just bogus]
type PreTcType = () -- used before typechecking
class PlaceHolderType a where placeHolderType :: a
instance PlaceHolderType PostTcType where
placeHolderType = panic "Evaluated the place holder for a PostTcType"
instance PlaceHolderType PreTcType where placeHolderType = ()
These are used to replace all instances of PostTcType in the hsSyn types.
The change was applied against HEAD as of last friday, and can be found here
https://github.com/alanz/ghc/tree/wip/landmine-param https://github.com/alanz/haddock/tree/wip/landmine-param
They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I have not tried to validate that yet, have no reason to expect failure.
Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic traversal safe AST
Regards
Alan
On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman
wrote: FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman
wrote: I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to herehttp://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards, Philip
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] Sent: maandag 28 juli 2014 14:10 To: Holzenspies, P.K.F. (EWI) Cc: Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go: I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards, Philip
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] Sent: maandag 28 juli 2014 11:14 To: Simon Peyton Jones Cc: Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
Subject: Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
wrote: What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on. Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc. If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is | HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying type PostTcType = Maybe TcType but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this: | HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
From: Edward Kmett [mailto:ekmett@gmail.com] Sent: 27 July 2014 18:27 To: p.k.f.holzenspies@utwente.nl Cc: alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
Subject: Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip
Van: Alan & Kim Zimmerman [alan.zimm@gmail.com] Verzonden: vrijdag 25 juli 2014 13:44 Aan: Simon Peyton Jones CC: Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org Onderwerp: Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman
wrote: While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/...)
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones
wrote: So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
From: "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] Sent: 24 July 2014 18:42 To: Simon Peyton Jones Cc: ghc-devs@haskell.org Subject: Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
Simon Peyton Jones 24 Jul 2014 18:22 GHC’s data structures are often mutually recursive. e.g. · The TyCon for Maybe contains the DataCon for Just · The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Ofp.k.f.holzenspies@utwente.nl Sent: 24 July 2014 16:42 To: ghc-devs@haskell.org Subject: Broken Data.Data instances
Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards, Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Did you look at https://phabricator.haskell.org/D157? It superseded
https://phabricator.haskell.org/D153
On Fri, Aug 15, 2014 at 5:17 PM, Simon Peyton Jones
Eek. Glancing at this I see that every single data type has an extra type parameter. To me this feels like a sledgehammer to crack a nut. What is wrong with the type-function approach?
Simon
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* 13 August 2014 07:50 *To:* Philip K.F. Hölzenspies
*Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
And I dipped my toes into the phabricator water, and uploaded a diff to https://phabricator.haskell.org/D153
I left the lines long for now, so that it is clear that I simply added parameters to existing type signatures.
On Tue, Aug 12, 2014 at 10:51 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
Status update
I have worked through a proof of concept update to the GHC AST whereby the type is provided as a parameter to each data type. This was basically a mechanical process of changing type signatures, and required very little actual code changes, being only to initialise the placeholder types.
The enabling types are
type PostTcType = Type -- Used for slots in the abstract syntax -- where we want to keep slot for a type -- to be added by the type checker...but -- [before typechecking it's just bogus]
type PreTcType = () -- used before typechecking
class PlaceHolderType a where placeHolderType :: a
instance PlaceHolderType PostTcType where
placeHolderType = panic "Evaluated the place holder for a PostTcType"
instance PlaceHolderType PreTcType where placeHolderType = ()
These are used to replace all instances of PostTcType in the hsSyn types.
The change was applied against HEAD as of last friday, and can be found here
https://github.com/alanz/ghc/tree/wip/landmine-param https://github.com/alanz/haddock/tree/wip/landmine-param
They pass 'sh validate' with GHC 7.6.3, and compile against GHC 7.8.3. I have not tried to validate that yet, have no reason to expect failure.
Can I please get some feedback as to whether this is a worthwhile change?
It is the first step to getting a generic traversal safe AST
Regards
Alan
On Mon, Jul 28, 2014 at 5:45 PM, Alan & Kim Zimmerman
wrote: FYI I edited the paste at http://lpaste.net/108262 to show the problem
On Mon, Jul 28, 2014 at 5:41 PM, Alan & Kim Zimmerman
wrote: I already tried that, the syntax does not seem to allow it.
I suspect some higher form of sorcery will be required, as alluded to here http://stackoverflow.com/questions/14133121/can-i-constrain-a-type-family
Alan
On Mon, Jul 28, 2014 at 4:55 PM,
wrote: Dear Alan,
I would think you would want to constrain the result, i.e.
type family (Data (PostTcType a)) => PostTcType a where …
The Data-instance of ‘a’ doesn’t give you much if you have a ‘PostTcType a’.
Your point about SYB-recognition of WrongPhase is, of course, a good one ;)
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com] *Sent:* maandag 28 juli 2014 14:10 *To:* Holzenspies, P.K.F. (EWI) *Cc:* Simon Peyton Jones; Edward Kmett; ghc-devs@haskell.org
*Subject:* Re: Broken Data.Data instances
Philip
I think the main reason for the WrongPhase thing is to have something that explicitly has a Data and Typeable instance, to allow generic (SYB) traversal. If we can get by without this so much the better.
On a related note, is there any way to constrain the 'a' in
type family PostTcType a where PostTcType Id = TcType PostTcType other = WrongPhaseTyp
to have an instance of Data?
I am experimenting with traversals over my earlier paste, and got stuck here (which is the reason the Show instances were commentet out in the original).
Alan
On Mon, Jul 28, 2014 at 12:30 PM,
wrote: Sorry about that… I’m having it out with my terminal server and the server seems to be winning. Here’s another go:
I always read the () as “there’s nothing meaningful to stick in here, but I have to stick in something” so I don’t necessarily want the WrongPhase-thing. There is very old commentary stating it would be lovely if someone could expose the PostTcType as a parameter of the AST-types, but that there are so many types and constructors, that it’s a boring chore to do. Actually, I was hoping haRe would come up to speed to be able to do this. That being said, I think Simon’s idea to turn PostTcType into a type-family is a better way altogether; it also documents intent, i.e. () may not say so much, but PostTcType RdrName says quite a lot.
Simon commented that a lot of the internal structures aren’t trees, but cyclic graphs, e.g. the TyCon for Maybe references the DataCons for Just and Nothing, which again refer to the TyCon for Maybe. I was wondering whether it would be possible to make stateful lenses for this. Of course, for specific cases, we could do this, but I wonder if it is also possible to have lenses remember the things they visited and not visit them twice. Any ideas on this, Edward?
Regards,
Philip
*From:* Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
*Sent:* maandag 28 juli 2014 11:14
*To:* Simon Peyton Jones *Cc:* Edward Kmett; Holzenspies, P.K.F. (EWI); ghc-devs
*Subject:* Re: Broken Data.Data instances
I have made a conceptual example of this here http://lpaste.net/108262
Alan
On Mon, Jul 28, 2014 at 9:50 AM, Alan & Kim Zimmerman
wrote: What about creating a specific type with a single constructor for the "not relevant to this phase" type to be used instead of () above? That would also clearly document what was going on.
Alan
On Mon, Jul 28, 2014 at 9:14 AM, Simon Peyton Jones
wrote: I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
If the “hand grenades” are the PostTcTypes, etc, then I can explain why they are there.
There simply is no sensible type you can put before the type checker runs. For example one of the constructors in HsExpr is
| HsMultiIf PostTcType [LGRHS id (LHsExpr id)]
After type checking we know what type the thing has, but before we have no clue.
We could get around this by saying
type PostTcType = Maybe TcType
but that would mean that every post-typechecking consumer would need a redundant pattern-match on a Just that would always succeed.
It’s nothing deeper than that. Adding Maybes everywhere would be possible, just clunky.
However we now have type functions, and HsExpr is parameterised by an ‘id’ parameter, which changes from RdrName (after parsing) to Name (after renaming) to Id (after typechecking). So we could do this:
| HsMultiIf (PostTcType id) [LGRHS id (LHsExpr id)]
and define PostTcType as a closed type family thus
type family PostTcType a where
PostTcType Id = TcType
PostTcType other = ()
That would be better than filling it with bottoms. But it might not help with generic programming, because there’d be a component whose type wasn’t fixed. I have no idea how generics and type functions interact.
Simon
*From:* Edward Kmett [mailto:ekmett@gmail.com] *Sent:* 27 July 2014 18:27 *To:* p.k.f.holzenspies@utwente.nl *Cc:* alan.zimm@gmail.com; Simon Peyton Jones; ghc-devs
*Subject:* Re: Broken Data.Data instances
Philip, Alan,
If you need a hand, I'm happy to pitch in guidance.
I've had to mangle a bunch of hand-written Data instances and push out patches to a dozen packages that used to be built this way before I convinced the authors to switch to safer versions of Data. Using virtual smart constructors like we do now in containers and Text where needed can be used to preserve internal invariants, etc.
This works far better for users of the API than just randomly throwing them a live hand grenade. As I recall, these little grenades in generic programming over the GHC API have been a constant source of pain for libraries like haddock.
Simon,
It seems to me that regarding circular data structures, nothing prevents you from walking a circular data structure with Data.Data. You can generate a new one productively that looks just like the old with the contents swapped out, it is indistinguishable to an observer if the fixed point is lost, and a clever observer can use observable sharing to get it back, supposing that they are allowed to try.
Alternately, we could use the 'virtual constructor' trick there to break the cycle and reintroduce it, but I'm less enthusiastic about that idea, even if it is simpler in many ways.
-Edward
On Sun, Jul 27, 2014 at 10:17 AM,
wrote: Alan,
In that case, let's have a short feedback-loop between the two of us. It seems many of these files (Name.lhs, for example) are really stable through the repo-history. It would be nice to have one bigger refactoring all in one go (some of the code could use a polish, a lot of code seems removable).
Regards, Philip ------------------------------
*Van:* Alan & Kim Zimmerman [alan.zimm@gmail.com] *Verzonden:* vrijdag 25 juli 2014 13:44 *Aan:* Simon Peyton Jones *CC:* Holzenspies, P.K.F. (EWI); ghc-devs@haskell.org *Onderwerp:* Re: Broken Data.Data instances
By the way, I would be happy to attempt this task, if the concept is viable.
On Thu, Jul 24, 2014 at 11:23 PM, Alan & Kim Zimmerman < alan.zimm@gmail.com> wrote:
While we are talking about fixing traversals, how about getting rid of the phase specific panic initialisers for placeHolderType, placeHolderKind and friends?
In order to safely traverse with SYB, the following needs to be inserted into all the SYB schemes (see
https://github.com/alanz/HaRe/blob/master/src/Language/Haskell/Refact/Utils/... )
-- Check the Typeable items checkItemStage1 :: (Typeable a) => SYB.Stage -> a -> Bool checkItemStage1 stage x = (const False `SYB.extQ` postTcType `SYB.extQ` fixity `SYB.extQ` nameSet) x where nameSet = const (stage `elem` [SYB.Parser,SYB.TypeChecker]) :: GHC.NameSet -> Bool postTcType = const (stage < SYB.TypeChecker ) :: GHC.PostTcType -> Bool fixity = const (stage < SYB.Renamer ) :: GHC.Fixity -> Bool
And in addition HsCmdTop and ParStmtBlock are initialised with explicit 'undefined values.
Perhaps use an initialiser that can have its panic turned off when called via the GHC API?
Regards
Alan
On Thu, Jul 24, 2014 at 11:06 PM, Simon Peyton Jones < simonpj@microsoft.com> wrote:
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
That’s fine with me provided (a) the default behaviour is not immediate divergence (which it might well be), and (b) the pitfalls are documented.
Simon
*From:* "Philip K.F. Hölzenspies" [mailto:p.k.f.holzenspies@utwente.nl] *Sent:* 24 July 2014 18:42 *To:* Simon Peyton Jones *Cc:* ghc-devs@haskell.org *Subject:* Re: Broken Data.Data instances
Dear Simon, et al,
These are very good points to make for people writing such traversals and queries. I would be more than happy to write a page on the pitfalls etc. on the wiki, but in my experience so far, exploring the innards of GHC is tremendously helped by trying small things out and showing (bits of) the intermediate structures. For me, personally, this has always been hindered by the absence of good instances of Data and/or Show (not having to bring DynFlags and not just visualising with the pretty printer are very helpful).
So... does anyone object to me changing these "broken" instances with the ones given by DeriveDataTypeable?
Also, many of these internal data structures could be provided with useful lenses to improve such traversals further. Anyone ever go at that? Would be people be interested?
Regards, Philip
*Simon Peyton Jones*
24 Jul 2014 18:22
GHC’s data structures are often mutually recursive. e.g.
· The TyCon for Maybe contains the DataCon for Just
· The DataCon For just contains Just’s type
· Just’s type contains the TyCon for Maybe
So any attempt to recursively walk over all these structures, as you would a tree, will fail.
Also there’s a lot of sharing. For example, every occurrence of ‘map’ is a Var, and inside that Var is map’s type, its strictness, its rewrite RULE, etc etc. In walking over a term you may not want to walk over all that stuff at every occurrence of map.
Maybe that’s it; I’m not certain since I did not write the Data instances for any of GHC’s types
Simon
*From:* ghc-devs [mailto:ghc-devs-bounces@haskell.org
] *On Behalf Of * p.k.f.holzenspies@utwente.nl *Sent:* 24 July 2014 16:42 *To:* ghc-devs@haskell.org *Subject:* Broken Data.Data instances Dear GHC-ers,
Is there a reason for explicitly broken Data.Data instances? Case in point:
instance Data Var where
-- don't traverse?
toConstr _ = abstractConstr "Var"
gunfold _ _ = error "gunfold"
dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b
collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type]
allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much.
Would a patch “fixing” these instances be acceptable?
Regards,
Philip
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Ah, I see. Is there some way for D153 to be retired, then, to avoid inattentive people looking at it? (I’m wading through a week’s worth of email backlog.)
I’ll look at D157
S
From: Alan & Kim Zimmerman [mailto:alan.zimm@gmail.com]
Sent: 15 August 2014 16:36
To: Simon Peyton Jones
Cc: Philip K.F. Hölzenspies; Edward Kmett; ghc-devs@haskell.org
Subject: Re: Broken Data.Data instances
Did you look at https://phabricator.haskell.org/D157? It superseded https://phabricator.haskell.org/D153
On Fri, Aug 15, 2014 at 5:17 PM, Simon Peyton Jones
instance Data Var where -- don't traverse? toConstr _ = abstractConstr "Var" gunfold _ _ = error "gunfold" dataTypeOf _ = mkNoRepType "Var"
I understand (vaguely) arguments about abstract data types, but this also excludes convenient queries that can, e.g. extract all types from a CoreExpr. I had hoped to do stuff like this:
collect :: (Typeable b, Data a, MonadPlus m) => a -> m b collect = everything mplus $ mkQ mzero return
allTypes :: CoreExpr -> [Type] allTypes = collect
Especially when still exploring (parts of) the GHC API, being able to extract things in this fashion is very helpful. SYB’s “everything” being broken by these instances, not so much. Would a patch “fixing” these instances be acceptable? Regards, Philip _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.orgmailto:ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
participants (6)
-
"Philip K.F. Hölzenspies"
-
Alan & Kim Zimmerman
-
Edward Kmett
-
p.k.f.holzenspies@utwente.nl
-
Richard Eisenberg
-
Simon Peyton Jones