
Arunkumar S Jadhav wrote:

> Now, in between all this, the contents of the stack that were pointing to two graphs (i.e. x+y and x-y) are being replicated on the stack, and then one of the copies (of both the graphs) is being zapped.
Yes, it is curious. I think the main reason is to swap the order of the values, so that the application of f is correct, i.e. f (x+y) (x-y) rather than f (x-y) (x+y). But I am also puzzled why the original copies of (x+y) and (x-y) remain on the stack, and why those stack entries are zapped.
> Also, what are the uses of ZAP nodes apart from black hole detection? Do ZAP nodes help in garbage collection too?
In theory the GC could recover all the space in a zapped heap node apart from the first pointer (which will eventually be overwritten with an indirection to the final result). However, the nhc98 collector does not currently do this, so I believe at the moment the ZAP bit is only used for black hole detection.
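To illustrate the black hole detection mentioned above, here is a minimal sketch in C. It is not nhc98's actual code: the node layout and every name in it (sketch_node, sketch_eval, and so on) are hypothetical, and the "zapped" field merely stands in for the ZAP bit. The idea is that a node is marked when its evaluation begins, so that a demand for the same node before it has a result can be reported as a black hole.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap node; NOT nhc98's real layout. */
typedef struct sketch_node {
    int zapped;     /* stands in for the ZAP bit: evaluation has started */
    int evaluated;  /* the node already holds its final value */
    int value;
    int (*code)(struct sketch_node *self, int *out);  /* computes the value */
} sketch_node;

enum { SKETCH_OK = 0, SKETCH_BLACK_HOLE = 1 };

/* Evaluate a node, reporting a black hole if it is re-entered
   while its own evaluation is still in progress. */
static int sketch_eval(sketch_node *n, int *out)
{
    assert(n != NULL && out != NULL);
    if (n->evaluated) { *out = n->value; return SKETCH_OK; }
    if (n->zapped) return SKETCH_BLACK_HOLE;
    n->zapped = 1;                      /* zap before running the code */
    int status = n->code(n, &n->value);
    if (status == SKETCH_OK) {
        n->evaluated = 1;               /* stands in for the update with the result */
        *out = n->value;
    }
    return status;
}

/* Models  x = 42  */
static int sketch_const42(sketch_node *self, int *out)
{
    (void)self;
    *out = 42;
    return SKETCH_OK;
}

/* Models  x = x + 1 , which demands its own value */
static int sketch_self_plus_one(sketch_node *self, int *out)
{
    int v;
    int status = sketch_eval(self, &v);
    if (status != SKETCH_OK) return status;
    *out = v + 1;
    return SKETCH_OK;
}
```

In this sketch, evaluating a node built from sketch_const42 yields 42, while a node built from sketch_self_plus_one is caught as a black hole instead of looping forever.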
> Q2) As Malcolm explained in detail, this is the purpose of the CONSTR macro:
>
>     CONSTR(c,s,ws)  Construct a tag (i.e. a header for a data node) where there is a mixture of pointers and basic values amongst the data items
It seems I was almost right in this description, but mixed up the pointers/non-pointers.

    s  = size = total number of data items in the node
    ws = number of data items which are pointers to other nodes
    The number of non-pointers is therefore (s - ws).

should read:

    s  = size = total number of data items in the node
    ws = number of basic data items (non-pointers)
    The number of pointers is therefore (s - ws).
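The corrected arithmetic can be sketched in C as follows. This is only an illustration: the bit layout and all the names (sketch_tag, sketch_constr, and the accessors) are invented for the example and are not the real CONSTR macro or nhc98's actual header encoding. It just shows a tag carrying c, s and ws, with the pointer count derived as (s - ws).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only (NOT nhc98's encoding):
     bits  0-7   constructor number c
     bits  8-19  s  = total number of data items
     bits 20-31  ws = number of non-pointer (basic) data items */
typedef uint32_t sketch_tag;

/* Build a tag from constructor number, size, and non-pointer count. */
static sketch_tag sketch_constr(unsigned c, unsigned s, unsigned ws)
{
    assert(c < 256 && s < 4096 && ws <= s);
    return (sketch_tag)(c | (s << 8) | (ws << 20));
}

static unsigned sketch_con(sketch_tag t)     { return t & 0xFFu; }
static unsigned sketch_size(sketch_tag t)    { return (t >> 8) & 0xFFFu; }
static unsigned sketch_nonptrs(sketch_tag t) { return (t >> 20) & 0xFFFu; }

/* The number of pointers is whatever is left over: (s - ws). */
static unsigned sketch_ptrs(sketch_tag t)
{
    return sketch_size(t) - sketch_nonptrs(t);
}
```

For example, a node with three data items of which one is a basic value has two pointer fields under this scheme, and a node with ws = 0 consists entirely of pointers.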
> I compiled various examples, but so far I haven't seen a single example where CONSTR is used for a mixture of pointers and basic values. It has always been for basic values.
In fact, every example has only /pointers/, with no basic data values. This is because basic data values in a polymorphic lazy language are nearly always represented as a heap pointer to the value ("boxed"), which is stored separately. The only case in which the basic value can be "in-lined" in a data structure is when it is explicitly "unboxed" by the programmer (or implicitly "unboxed" by an optimising compiler).

In the GHC compiler, for instance, unboxed values are marked in the source code with a # symbol, like this example on the GHC mailing list today:

    forn :: a -> Int# -> IO ()
    forn a n | n >=# 10000# = return ()
             | otherwise    = fory a 0# >> forn a (n +# 1#)

You can see that not only are the literal numeric values unboxed, but their type is different, and operations on unboxed values are also marked with a #, because their code must be different from the standard boxed versions.

nhc98 has some rudimentary support for unboxed values, which is why the CONSTR macro allows one to specify how many fields of the data structure are unboxed. However, I believe this compiler support was never completed by the original author, because the parser does not accept the # marks. There is one hand-written file in the runtime system that actually uses unboxed values - src/runtime/Builtin/cPack.c - but I don't think the functions defined there are imported into nhc98's libraries, so it is essentially dead code at the moment.

Regards,
    Malcolm