
(I've previously sent this mail to haskell-cafe, but I guess this list is more appropriate) Hi all, for my Master's thesis, I'm looking into functional hardware descriptions, and in particular to translate haskell code into VHDL that can be programmed into an FPGA. For this, I'm using the GHC API to load the haskell source and give me (simplified) core representation. I then walk over the core tree and translate it to VHDL. On the whole the GHC API has been very useful to me. It took me some time to get used to all the (deeply) nested (algebraic) types (I didn't have any previous haskell experience when I started a month or so ago), but things are working out by now. However, there are two issues bothering me still. The first is that the Core types (in particular CoreExpr) are not instances of Show. They are instances of Outputable, which allows them to be pretty printed. However, this pretty printing is good to view the structure of the expression that the CoreExpr represents, but doesn't show the structure of the CoreExpr itself. For example, tuple construction is printed simply as (a, b), while the actual core expression is a nested application of two types, and a and b to the GHC.Tuple.(,) function (or datacon?). Also, the exact constructors used are not quite clear, which made it harder to work with the Core language for me. Instead of looking at the structure of the CoreExpr and write the appropriate patterns, I had to resort to a lot of trial and error. I tried deriving show for CoreExpr myself, but that required me to derive Show for a dozen other types and resulted in type errors beyond my understanding. Is there any compelling reason that CoreExprs are not instances of Show? My second question concerns typle construction. Since tuple types are not primitive types, but dependent types defined in various places (GHC.Tuple and Base IIRC), code working with tuples is not fundamentally different from code working with other (user defined) dependent types and thus not trivial to recognize. I've found that there are some predicate functions that can tell me if a type is a tuple type, but I've had no such luck for the actual tuple construction. In particular, when an expression creates a tuple of two Ints (a, b), this is represented in Core as the application of the element types and the element values to the (,) function (or data constructor? Not sure about this, in Core it's just a Var and the isDataConName [IIRC] predicate crashes when run on the Var). For now, I've manually matched the Var's name to GHC.Tuple.(,) and removed all type arguments to get at the actual values in the tuple. This is of course completely non-portable, to tuples with more than two elements, or unboxed tuples, etc. Is there some predicate function that can do this for me in a more portable way? Gr. Matthijs

SPJ might be able to give you some better answers, but this should help get you started. On Fri, Jan 30, 2009 at 08:03:47PM +0100, Matthijs Kooijman wrote:
However, there are two issues bothering me still. The first is that the Core types (in particular CoreExpr) are not instances of Show. They are instances of Outputable, which allows them to be pretty printed.
Making all GHC's datatypes Showable would probably be a nightmare. Not only because of all the instances you'd have to add, but because many of the datatypes have cycles in them (or example, DataCons point back to their TyCon). Even if you got all the "derriving Show"s in there you'd be stuck writing manual instances everywhere cycles appear. In addition, the sheer amount of data GHC carries around in its various data types would make the output pretty incomprehensable. For example, every occurance of an identifier carries around unfolding info, strictness info, arity, specializations, etc. You're probably better off just trying to wrap your head around the pretty printed output and using that. When you need to pull some more details out of a datatype just sticking random pprPanics in is a great debugging aid.
about this, in Core it's just a Var and the isDataConName [IIRC] predicate crashes when run on the Var).
You might be looking for isDataConId :: Id -> Bool, or isDataConId_maybe :: Id -> Maybe DataCon.
For now, I've manually matched the Var's name to GHC.Tuple.(,) and removed all type arguments to get at the actual values in the tuple. This is of course completely non-portable, to tuples with more than two elements, or unboxed tuples, etc.
You might find exprIsConApp_maybe :: CoreExpr -> Maybe (DataCon,[CoreExpr]) handy (along with isTupleCon :: DataCon -> Bool on the DataCon). -Brian

Hi Brian, thanks for the swift response!
Making all GHC's datatypes Showable would probably be a nightmare. Not I guessed as much. Thanks for confirming my suspicions :-)
pretty printed output and using that. When you need to pull some more details out of a datatype just sticking random pprPanics in is a great debugging aid. How would I go about this, then? You mean I can then use ppr on random places without needing guid get the output up properly? I think I've simply used "error" for that to the same effect, which does indeed gets the job done (albeit slowly).
You might be looking for isDataConId :: Id -> Bool, or isDataConId_maybe :: Id -> Maybe DataCon. I didn't find this one yet, thanks.
You might find exprIsConApp_maybe :: CoreExpr -> Maybe (DataCon,[CoreExpr]) handy (along with isTupleCon :: DataCon -> Bool on the DataCon). That looks exactly like what I need, thanks. I've used isTupleCon before (working from a Type), but didn't find anything to may a CoreExpr to a DataCon.
I'll let you know how it goes. Gr. Matthijs

You may find it useful to use External Core (supported in GHC 6.10), which allows GHC to dump its Core representation into a text file with a well-defined format: http://www.haskell.org/ghc/docs/latest/html/users_guide/ext-core.html There is a (relatively) stand-alone library for handling Core programs, under utils/ext-core in the GHC distribution. If you don't find it useful, and have any suggestions as to how it could be improved, I would be interested to know; please CC both me and this list if so. Cheers, Tim -- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt "Einstein argued that there must be simplified explanations of nature, because God is not capricious or arbitrary. No such faith comforts the software engineer." -- Fred Brooks

Core types (in particular CoreExpr) are not instances of Show. They are instances of Outputable, which allows them to be pretty printed. However, this pretty printing is good to view the structure of the expression that the CoreExpr represents, but doesn't show the structure of the CoreExpr itself.
Not sure how practical this would be, but to get an idea of the internal structure: if you can derive the SYB classes for the Core types (which would generally be useful if you do lots of traversals), you can easily define a generic show operation, which would give you a lot of control over what kind of output you'd like. There is an example here: http://hackage.haskell.org/trac/ghc/wiki/GhcApiAstTraversals look for the showData code - straighforward base plus lots of exceptions to avoid some ideosyncrasies of GHC's internal representations (in particular, don't try to traverse parts of the structure that haven't been filled yet) example output here http://hackage.haskell.org/packages/archive/ghc-syb/0.1.1/doc/html/GHC-SYB-U... hmm, I need to update that wiki page and the syb-utils package page - the code has moved from there, to the syb and ghc-syb packages, with proper maintainers. In particular, check the use of standalone deriving to get the SYB class instances for GHC API types in package ghc-syb: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/ghc-syb Iirc, the syb instances for Core were not included, if that hasn't changed yet, it would be a useful addition (for an old first stab, see the link at the bottom of that wiki page, CoreSynTraverse). Claus

Matthijs | However, there are two issues bothering me still. The first is that the | Core types (in particular CoreExpr) are not instances of Show. They are | instances of Outputable, which allows them to be pretty printed. | However, this pretty printing is good to view the structure of the | expression that the CoreExpr represents, but doesn't show the structure | of the CoreExpr itself. For example, tuple construction is printed | simply as (a, b), while the actual core expression is a nested | application of two types, and a and b to the GHC.Tuple.(,) function | (or datacon?). Also, the exact constructors used are not quite clear, There's absolutely no reason why CoreExpr CoreBind Type should not be an instance of Show. It'd take you 10 mins to make it so, with the aid of 'standalone deriving' (described in the GHC user manual). There *is* a reason why TyCon and Class are not: a TyCon enumerates its DataCons whose type mentions the TyCon In short, the data structures are, by design, cyclic. Printing one of these would take a long time. But I bet you could get a long way with the three above, plus just printing the *name* of a TyCon or Class or Id. Something like: instance Show TyCon where show tc = showSDoc (ppr tc) | My second question concerns typle construction. Since tuple types are | not primitive types, but dependent types defined in various places | (GHC.Tuple and Base IIRC), code working with tuples is not fundamentally | different from code working with other (user defined) dependent types | and thus not trivial to recognize. I've found that there are some | predicate functions that can tell me if a type is a tuple type, but I've | had no such luck for the actual tuple construction. Well a tuple looks like (tdc ty1 ... tyn arg1 .. argn), where tdc is the data constructor for the tuple. So what you need is: isTupleDataConId :: Id -> Bool It's easy to write isTupleDataConId id | Just data_con <- isDataConId_maybe id = isTupleTyCon (dataConTyCon data_con) I hope this is helpful Simon

Hi Matthijs, This is a shameless plug for EMGM, a library for generic programming that we've been working on at Utrecht. | However, there are two issues bothering me still. The first is that the
| Core types (in particular CoreExpr) are not instances of Show. They are | instances of Outputable, which allows them to be pretty printed. | However, this pretty printing is good to view the structure of the | expression that the CoreExpr represents, but doesn't show the structure | of the CoreExpr itself. For example, tuple construction is printed | simply as (a, b), while the actual core expression is a nested | application of two types, and a and b to the GHC.Tuple.(,) function | (or datacon?). Also, the exact constructors used are not quite clear,
There's absolutely no reason why CoreExpr CoreBind Type should not be an instance of Show. It'd take you 10 mins to make it so, with the aid of 'standalone deriving' (described in the GHC user manual).
There *is* a reason why TyCon and Class are not: a TyCon enumerates its DataCons whose type mentions the TyCon
In short, the data structures are, by design, cyclic. Printing one of these would take a long time.
But I bet you could get a long way with the three above, plus just printing the *name* of a TyCon or Class or Id. Something like: instance Show TyCon where show tc = showSDoc (ppr tc)
Suppose you want to print a type with the exception of one constructor, because it is mutually recursive with another or just prints out lots of useless information. There are at least two ways to do it, one with EMGM, and one with standalone deriving. I show both below.
{-# LANGUAGE TemplateHaskell #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE OverlappingInstances #-} {-# LANGUAGE UndecidableInstances #-} {-# LANGUAGE StandaloneDeriving #-}
module Example where
import qualified Generics.EMGM as G
-----------------------------------------
data A = A0 Int | A1 B data B = B0 Char | B1 A
$(G.derive ''A) $(G.derive ''B)
instance G.Rep G.Show B where rep = G.Show (\_ _ -> f) where f (B0 c) = showString "(B0 " . showChar c . showString ")" f (B1 _) = showString "(B1 <some A>)"
valAB = A1 (B1 (A0 37)) showAB = G.show valAB
-----------------------------------------
data C = C0 Int | C1 D data D = D0 Char | D1 C
deriving instance Show C instance Show D where showsPrec _ = f where f (D0 c) = showString "(D0 " . showChar c . showString ")" f (D1 _) = showString "(D1 <some C>)"
valCD = C1 (D1 (C0 37)) showCD = show valCD
-----------------------------------------
The first one uses EMGM's Template Haskell-based derive function to generate the type representation. I then give an overriding instance for the generic Show function (where G.Show is a newtype used for all show/shows/showsPrec generic functions). So, the constructor B1 will not print out the value of its A-type argument. The second uses standalone deriving and a handwritten instance for D that does the same thing as the first solution did for B. What's the difference? Well, between these instances of G.Show and Show, there's not much. However, the EMGM approach gives you access to a lot of other generic functions including Read, Crush, Collect, etc. See the documentation for all of them [1]. One function you may be able to take advantage of is 'collect', perhaps to collect the B values in valAB. *Example> G.show (G.collect valAB :: [B]) "[(B1 <some A>)]" Moral of the story: you can do it either way, but EMGM gives you a lot extra. Apologies for the self-promotion, but we're looking for people who might want to use EMGM. ;) If you have feedback, let us know! [2] Regards, Sean [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/emgm [2] http://www.cs.uu.nl/wiki/GenericProgramming/EMGM
participants (6)
-
Brian Alliet
-
Claus Reinke
-
Matthijs Kooijman
-
Sean Leather
-
Simon Peyton-Jones
-
Tim Chevalier