What constitutes an 'evaluated node'?

Hello. I am writing my own Yhi-Bytecode backend. I wish to minimize the number of indirection checks, and the bytecode.xml is not clear on the subject. Some of the bytecodes, eg INT_SWITCH, require the node they are applied to, to be 'evaluated'. EVAL takes a node and returns an equivalent evaluated node. However in the current implementation (Yhi) EVAL has the side-effect of making any other reference to the original node become evaluated. For instance, this is "legal": not: PUSH_ARG_0 EVAL POP_1 PUSH_ZAP_ARG_0 JUMP_FALSE L2 POP_1 PUSH_CON_0 RETURN L2: POP_1 PUSH_CON_1 RETURN { Prelude.False, Prelude.True } Specifically, I have in mind an implementation where indirections are NEVER automatically followed, and only EVAL and RETURN_EVAL (and for efficiency perhaps APPLY) check for them, returning the targets as the evaluated nodes. Thus in the above example, the value pushed by PUSH_ZAP_ARG_0 would be an indirection node, not False in any case, so our 'not' would mis-behave as 'const False'. Does any yhc generated code rely on this (mis?) feature? Does anyone (note that I do NOT say 'else') think this is a misfeature? Stefan

Hi Stefan,
Some of the bytecodes, eg INT_SWITCH, require the node they are applied to, to be 'evaluated'. EVAL takes a node and returns an equivalent evaluated node. However in the current implementation (Yhi) EVAL has the side-effect of making any other reference to the original node become evaluated.
This is a definite 'feature', in fact it's entirely integral to the correct handling of lazy evaluation. Remember that in Haskell no computation is actually performed unless it is needed in order to produce an output. Thus for example const x y = y f = const (...really expensive...) 3 because const doesn't need the value of x in order to give a result the 'really expensive' computation isn't needed and is thus never performed. Similarly Haskell avoids doing the same computation twice f = g x x x x where x = ... some really expensive computation ... g a b c d = a + b + c + d Here although x is passed to g 4 times (and g will only evaluate x when it needs to), the really expensive computation will only be done once. Now the yhc byte-code for g would be something along the lines of g(a,b,c,d): PUSH_ARG d EVAL PUSH_ARG c EVAL ADDI PUSH_ARG b EVAL ADDI PUSH_ARG a EVAL ADDI RETURN Now if EVAL didn't update all the other references to '... some really expensive computation ...' then g would end up doing the expensive computation 4 times. However because when EVAL first evaluates the expensive computation it replaces it with an indirection to it's result then it only does the expensive computation once. Doing something 4 times when you could do it once is bad enough. But infact it can be a lot more than this, with certain examples not having memorization can turn a linear algorithm into an exponential one! For a more thorough explanation of implementation lazy functional programming languages I suggest reading 'Implementing Functional Programming Languages: A tutorial' by Simon Peyton Jones and David Lester, the full text is available online http://research.microsoft.com/%7Esimonpj/Papers/pj%2Dlester%2Dbook/ There is also a more in-depth version available called 'The implementation of functional programming languages', which is also available online: http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/index.htm Hope that helps Tom

On Thu, Mar 01, 2007 at 10:49:14AM +0000, Tom Shackell wrote:
Hi Stefan,
Some of the bytecodes, eg INT_SWITCH, require the node they are applied to, to be 'evaluated'. EVAL takes a node and returns an equivalent evaluated node. However in the current implementation (Yhi) EVAL has the side-effect of making any other reference to the original node become evaluated.
This is a definite 'feature', in fact it's entirely integral to the correct handling of lazy evaluation. Remember that in Haskell no computation is actually performed unless it is needed in order to produce an output. Thus for example
const x y = y
f = const (...really expensive...) 3
because const doesn't need the value of x in order to give a result the 'really expensive' computation isn't needed and is thus never performed.
Similarly Haskell avoids doing the same computation twice
f = g x x x x where x = ... some really expensive computation ...
g a b c d = a + b + c + d
Here although x is passed to g 4 times (and g will only evaluate x when it needs to), the really expensive computation will only be done once.
I'm sorry, I seem to have horribly mangled my question. in the present implementation: ptr1 -----> Some Big ptr2 -----> Full PAP after EVAL: /-----------\ ptr1 -----/ An Ind- \--V ptr2 -----> irection ---> Value after POP_1: An Ind- ptr1 -----> irection ---> Value INT_SWITCH examines Value. In the implementation I'm trying for, INT_SWITCH would examine the indirection, BUT in my implentation EVAL *does* chase indirections, so: g(ptr1,ptr2) = ptr1 `seq` ptr2 ptr1 -----> Some Big ptr2 -----> Full PAP after EVAL: /-----------\ ptr1 -----/ An Ind- \--V ptr2 -----> irection ---> Value after POP_1: An Ind- ptr1 -----> irection ---> Value after EVAL: (chases the ind): An Ind- ptr1 ---\ irection /-> Value \------------/ which is returned. (yes I know about RETURN_EVAL)
Now if EVAL didn't update all the other references to '... some really expensive computation ...' then g would end up doing the expensive computation 4 times.
However because when EVAL first evaluates the expensive computation it replaces it with an indirection to it's result then it only does the expensive computation once.
Doing something 4 times when you could do it once is bad enough. But infact it can be a lot more than this, with certain examples not having memorization can turn a linear algorithm into an exponential one! Known, known, known. For a more thorough explanation of implementation lazy functional programming languages I suggest reading 'Implementing Functional Programming Languages: A tutorial' by Simon Peyton Jones and David Lester, the full text is available online
http://research.microsoft.com/%7Esimonpj/Papers/pj%2Dlester%2Dbook/
There is also a more in-depth version available called 'The implementation of functional programming languages', which is also available online:
http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/index.htm Am doing so since about a week ago :)
Hope this time it makes more sense Stefan

Stefan O'Rear wrote:
I'm sorry, I seem to have horribly mangled my question. in the present implementation:
ptr1 -----> Some Big ptr2 -----> Full PAP
after EVAL: /-----------\ ptr1 -----/ An Ind- \--V ptr2 -----> irection ---> Value
after POP_1: An Ind- ptr1 -----> irection ---> Value
INT_SWITCH examines Value.
In the implementation I'm trying for, INT_SWITCH would examine the indirection, BUT in my implentation EVAL *does* chase indirections, so:
g(ptr1,ptr2) = ptr1 `seq` ptr2
ptr1 -----> Some Big ptr2 -----> Full PAP
after EVAL: /-----------\ ptr1 -----/ An Ind- \--V ptr2 -----> irection ---> Value
after POP_1: An Ind- ptr1 -----> irection ---> Value
after EVAL: (chases the ind): An Ind- ptr1 ---\ irection /-> Value \------------/ which is returned. (yes I know about RETURN_EVAL)
I'm affraid I'm still slightly unclear on your question. In the current implementation: g(ptr1,ptr2) = ptr1 `seq` ptr2 Is compiled into something approximating g(ptr1,ptr2): PUSH_ZAP_ARG ptr1 EVAL POP_1 PUSH_ZAP_ARG ptr2 RETURN_EVAL Executing this we would get something like g(ptr1,ptr2): ptr1 ---> Some Big Thunk ptr2 ----------^ PUSH_ZAP_ARG ptr1 ptr1 ---------> ZAP ptr2 ---------> Some Big thunk EVAL // eval doesn't remove indirections from the result ptr2 ---> Ind ------> Value POP_1 ptr2 ---> Ind ------> Value PUSH_ZAP_ARG ptr2 ptr2 ---> ZAP RETURN_EVAL // return_eval must eventually finish with 'return', // and 'return' doesn't remove indirections so the result // of g will have an indirection in it. It would certainly be possible to make RETURN remove indirections from the result - and it might constitute a small optimization.
Known, known, known.
Appologies, your question seemed to be asking "why do we need indirections?", though it's clear now that's not what you mean :-)
Am doing so since about a week ago :)
Ah excellent :-) Tom
participants (3)
-
Stefan O'Rear
-
Thomas Shackell
-
Tom Shackell