
Hello YHC hackers,

I've been looking pretty closely at YHC and I have a couple of questions:

1) I'm interested in the possibility of doing Java-style verification of bytecode, so I've been studying the bytecode docs. Looking at the bytecodes, I notice that all the jump bytecodes seem to indicate that jumps are always forward jumps. Upon reflection it makes perfect sense, but I want to make sure that is a conscious design decision. It certainly makes the kinds of analysis I'm interested in easier...

2) I notice that concurrency is a big unclaimed todo item. Are there any concrete plans in this direction? I'm sort of interested in this, but before I think about it too much, I was wondering what has and has not been thought about/done so far.

Thanks!
Rob Dockins

Hi Robert
1) I'm interested in the possibility of doing Java-style verification of bytecode, so I've been studying the bytecode docs. Looking at the bytecodes, I notice that all the jump bytecodes seem to indicate that jumps are always forward jumps. Upon reflection it makes perfect sense, but I want to make sure that is a conscious design decision. It certainly makes the kinds of analysis I'm interested in easier...
Yes, it is a conscious decision, since only forward jumps are required.
From what I know (and I may be wrong here), adding backward jumps would have little impact on the execution of the bytecode, but just isn't needed.
A byte code verifier would be cool :)
2) I notice that concurrency is a big unclaimed todo item. Are there any concrete plans in this direction? I'm sort of interested in this, but before I think about it too much, I was wondering what has and has not been thought about/done so far.
Tom has a working implementation of concurrency on his machine, and is just doing a bit of testing before committing. It isn't OS/machine-level concurrency (it just switches threads every n instructions), but it has been implemented including MVars.

Thanks
Neil
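The "switch threads every n instructions" scheme Neil describes can be illustrated with a toy simulation. This is not an excerpt from the Yhc runtime (which is written in C); it is a minimal Python sketch in which each process is a generator, each `yield` stands for one bytecode instruction, and the hypothetical `TIME_SLICE` constant plays the role of "n".

```python
from collections import deque

TIME_SLICE = 3  # hypothetical: reschedule after this many instructions

def run(processes):
    """Round-robin between processes, giving each TIME_SLICE 'instructions'."""
    trace = []                         # which process executed each step
    ready = deque(enumerate(processes))
    while ready:
        pid, proc = ready.popleft()
        for _ in range(TIME_SLICE):    # run up to TIME_SLICE instructions
            try:
                next(proc)
                trace.append(pid)
            except StopIteration:      # process finished: drop it
                break
        else:
            ready.append((pid, proc))  # slice used up: back of the queue
    return trace

def process(n_instructions):
    """A toy process: each yield models executing one instruction."""
    for _ in range(n_instructions):
        yield
```

For example, `run([process(4), process(2)])` interleaves the two processes: the first runs a full slice, the second runs to completion, then the first finishes.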

Hi Rob,

Just to add further comment to what Neil said:

1) Indeed, there is no need for backward jumps, so they were deliberately not included. (This makes several of the compiler analyses easier too.) Compared to verification of Java, verification of YHC should be quite easy, for the simple reason that YHC does not include support for unboxed integers, so all variables on the stack must always be pointers to heap nodes (or stack frames).

2) I plan to document my implementation of concurrency on the wiki soon, since it changes some of the docs. Since the effect of concurrency on non-concurrent programs is fairly small, support for concurrency will become part of the standard build soon :-)

Cheers :-)

Tom

Robert Dockins wrote:
Hello YHC hackers,
I've been looking pretty closely at YHC and I have a couple of questions:
1) I'm interested in the possibility of doing Java-style verification of bytecode, so I've been studying the bytecode docs. Looking at the bytecodes, I notice that all the jump bytecodes seem to indicate that jumps are always forward jumps. Upon reflection it makes perfect sense, but I want to make sure that is a conscious design decision. It certainly makes the kinds of analysis I'm interested in easier...
2) I notice that concurrency is a big unclaimed todo item. Are there any concrete plans in this direction? I'm sort of interested in this, but before I think about it too much, I was wondering what has and has not been thought about/done so far.
Thanks!
Rob Dockins

_______________________________________________
Yhc mailing list
Yhc@haskell.org
http://haskell.org/mailman/listinfo/yhc

On Tuesday 28 February 2006 07:36 am, you wrote:
Hi Rob,
Just to add further comment to what Neil said:
1) Indeed, there is no need for backward jumps, so they were deliberately not included. (This makes several of the compiler analyses easier too.)
That's more or less what I expected. The absence of explicit loops makes things considerably simpler.
Compared to verification of Java, verification of YHC should be quite easy, for the simple reason that YHC does not include support for unboxed integers, so all variables on the stack must always be pointers to heap nodes (or stack frames).
That's an interesting point. I'll think about this some more.
2) I plan to document my implementation of concurrency on the wiki soon, since it changes some of the docs. Since the effect of concurrency on non-concurrent programs is fairly small, support for concurrency will become part of the standard build soon :-)
That is very good to hear. It seems to me that the two biggest issues are:

- supporting multiple execution stacks
- blocking I/O and FFI calls

How did you decide to handle these problems (mostly for my curiosity)?

Thanks,
Rob Dockins

That is very good to hear. It seems to me that the two biggest issues are:
- supporting multiple execution stacks
- blocking I/O and FFI calls
How did you decide to handle these problems (mostly for my curiosity)?
You're quite right, those were two of the biggest issues :-)

The first thing to note is that concurrency is implemented in Yhc by the interpreter running some instructions for one process, then some more instructions for the next process, and so on. This is rather than using one OS thread per Haskell process, because all processes can access the global heap, so if using OS threads it would be necessary to either lock every heap access (incredibly slow) or use some of the truly nasty tricks that concurrent GHC uses.

The first problem, as you note, is that originally we only had one stack but now every process needs its own stack. After considering possible options I decided the easiest way was:

- Process stacks become normal nodes in the heap, and are garbage collected like other heap nodes. This preserves the property that we can't run out of heap but not stack (and vice versa).

- A process is initially allocated a small stack space in the heap, and if that runs out we simply allocate a new stack space (twice as large as before) and let the old stack be garbage collected.

- I considered using stacks spread out over multiple 'pages', but in practice I found that a) it's really hard to implement correctly, especially when you consider that garbage collection can shift all your addresses around (don't forget frames on the stack reference the previous frame, possibly in a different stack page!), and b) most stacks don't grow very large, so there's little penalty to just throwing an old stack away if it gets too small.

- It would also be easy to recycle old stacks, i.e. perhaps one process's discarded stack would be the right size for another process. This would be especially true if stacks came in standard sizes (64, 128, 256, etc.). I haven't implemented this yet.

Now going back, the approach of running some instructions from one thread then some more from another is fine providing every instruction runs quickly. They all do, with one exception: PRIMITIVE.
This is because PRIMITIVE is used for IO actions/FFI calls. The solution? Well, the basic idea is "just spawn an OS thread to do the IO action while the main thread continues running Haskell". In detail, when PRIMITIVE is called for an IO action/FFI call:

- An OS thread is taken from a pool of threads.
- The main thread tells the OS thread to run the IO action/FFI call.
- The main thread then reschedules so another Haskell process can run.
- Eventually the OS thread completes its action and is put back in the pool. It adds the process to the list of processes that have completed their IO actions and are waiting to be rescheduled.
- When the main thread next reschedules it checks the list of completed IO actions and makes the corresponding blocked processes ready to run again.

There is a simple optimisation: if there is only one ready process in the program then it doesn't bother using a separate thread, it just does the IO action directly.

Hope that answers your questions :-)

Cheers

Tom Shackell
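The worker-pool hand-off Tom describes can be sketched in miniature. This is a hedged Python simulation, not Yhc's C implementation: the names (`worker`, `jobs`, `completed`) and the pool size are invented for illustration, a plain `lambda` stands in for the blocking IO action/FFI call, and Python's `queue.Queue` stands in for the runtime's completion list.

```python
import threading
import queue

completed = queue.Queue()   # processes whose IO action has finished

def worker(jobs):
    """Pool thread: run blocking actions handed over by the main loop."""
    while True:
        pid, action = jobs.get()
        if action is None:            # poison pill: shut the worker down
            break
        result = action()             # the blocking IO action / FFI call
        completed.put((pid, result))  # report back for the next reschedule

jobs = queue.Queue()
pool = [threading.Thread(target=worker, args=(jobs,), daemon=True)
        for _ in range(2)]            # hypothetical pool size
for t in pool:
    t.start()

# Main-loop side: hand process 7's blocking action to the pool and carry
# on running other Haskell processes (elided in this toy)...
jobs.put((7, lambda: 6 * 7))

# ...then, at the next reschedule, collect any finished IO actions:
pid, result = completed.get()
```

The real runtime would of course not block on `completed.get()`; it would poll the completion list at each reschedule, as described above.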

On Mar 3, 2006, at 5:29 AM, Thomas Shackell wrote:
That is very good to hear. It seems to me that the two biggest issues are:
- supporting multiple execution stacks
- blocking I/O and FFI calls
How did you decide to handle these problems (mostly for my curiosity)?
You're quite right, those were two of the biggest issues :-)
The first thing to note is that concurrency is implemented in Yhc by the interpreter running some instructions for one process, then some more instructions for the next process, and so on.
This is rather than using one OS thread per Haskell process, because all processes can access the global heap, so if using OS threads it would be necessary to either lock every heap access (incredibly slow) or use some of the truly nasty tricks that concurrent GHC uses.
The first problem, as you note, is that originally we only had one stack but now every process needs its own stack. After considering possible options I decided the easiest way was:
- Process stacks become normal nodes in the heap, and are garbage collected like other heap nodes. This preserves the property that we can't run out of heap but not stack (and vice versa).
[snip] How do you detect when a process runs out of stack? Do you calculate the max stack usage from the bytecode and check before you push a new stack frame?
Now going back, the approach of running some instructions from one thread then some more from another is fine providing every instruction runs quickly. They all do, with one exception, PRIMITIVE. This is because PRIMITIVE is used for IO actions/FFI calls.
The solution? Well the basic idea is "just spawn an OS thread to do the IO action while the main thread continues running Haskell".
[snip] How do you deal with OS threads portably? I assume you use pthreads for *nix. Does Windows know pthreads, or did you create some thin wrapper over OS-specific threading?
Hope that answers your questions :-)
Yes, thank you!

Rob Dockins

Speak softly and drive a Sherman tank.
Laugh hard; it's a long way to the bank.
          -- TMBG

How do you detect when a process runs out of stack? Do you calculate the max stack usage from the bytecode and check before you push a new stack frame?
That's pretty much exactly right. The maximum stack usage for each function is statically calculated by the compiler anyway; see FunObj at: http://haskell.org/haskellwiki/Yhc/RTS/hbc

The runtime doesn't quite check every time it pushes a new stack frame (it pushes stack frames for lots of reasons, for example because it's about to GC), so instead it ensures that at all times it has enough stack space for one stack frame (they're not very big anyway). When the runtime performs an EVAL it checks that there is enough stack space for two frames plus the called function's maximum stack usage (one for the frame it's going to create and another as the "spare").
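Putting Tom's two answers together (heap-allocated stacks that double when exhausted, and the check done at EVAL), a toy model looks like this. This is a Python sketch under invented assumptions, not runtime code: `FRAME_SIZE`, the initial size, and the word counts are all hypothetical, and in the real runtime the old stack is a heap node left for the garbage collector rather than a Python object.

```python
FRAME_SIZE = 4   # hypothetical words per stack frame

class Process:
    def __init__(self, initial=16):
        self.stack_size = initial   # in words; a heap node in the real runtime
        self.stack_used = 0
        self.growths = 0            # how many times we replaced the stack

    def ensure_for_eval(self, callee_max_stack):
        """Check done at EVAL: room for two frames (one to create, one
        'spare') plus the callee's statically-known maximum stack usage."""
        need = self.stack_used + 2 * FRAME_SIZE + callee_max_stack
        while self.stack_size < need:
            self.stack_size *= 2    # allocate a doubled stack; the old one
            self.growths += 1       # becomes garbage and is collected

p = Process(initial=16)
p.stack_used = 10
p.ensure_for_eval(callee_max_stack=30)  # needs 10 + 8 + 30 = 48 words
```

Since 48 exceeds 16, the stack doubles twice (16 → 32 → 64) before the EVAL proceeds.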
How do you deal with OS threads portably? I assume you use pthreads for *nix. Does windows know pthreads or did you create some thin wrapper over OS specific threading?
A wrapper over OS-specific threading: the *nix implementation does indeed use pthreads, and Neil's going to write the Windows one since I know nothing about it ;-)

Cheers :-)

Tom

Thomas Shackell wrote:
The first thing to note is that concurrency is implemented in Yhc by the interpretter running some instructions for one process, then some more instructions for the next process and so on.
This is rather than using one OS thread per Haskell process because all processes can access the global heap, so if using OS threads it would be necessary to either lock every heap access (incredibly slow) or use some of the truely nasty tricks that concurrent GHC uses.
I resent that :-) There are other choices too: one OS thread per Haskell thread works fine if there's a Giant Lock around the whole runtime (this is the approach that OCaml uses, for instance). Your approach is clearly the right one in the context of YHC, though.

I think it's fantastic that you guys have got concurrency working, BTW. Can you really context switch between two arbitrary instructions? Is the stack in a GC'able state at all times?
The first problem, as you note, is that originally we only had one stack but now every process needs its own stack. After considering possible options I decided the easiest way was: - process stacks become normal nodes in the heap, and are garbage collected like other heap nodes. This preserves the property that we can't run out of heap but not stack (and vice versa).
- a process is initially allocated a small stack space in the heap and if that runs out we simply allocate a new stack space (twice as large as before) and let the old stack be garbage collected.
This is exactly what GHC does. I found we had to be very careful with pointers to the old version of the stack, which occur from time to time. eg. ThreadIds are just pointers to the TSO (== stack), they have to be indirected to the new stack when accessed.
Now going back, the approach of running some instructions from one thread then some more from another is fine providing every instruction runs quickly. They all do, with one exception, PRIMITIVE. This is because PRIMITIVE is used for IO actions/FFI calls.
The solution? Well the basic idea is "just spawn an OS thread to do the IO action while the main thread continues running Haskell".
In detail, when PRIMITIVE is called for an IO action/FFI call:
- an OS thread is taken from a pool of threads
- the main thread tells the OS thread to run the IO action/FFI call
- the main thread then reschedules so another haskell process can run
- eventually the OS thread completes its action and is put back in the pool. It adds the process to the list of processes that have completed their IO actions and are waiting to be rescheduled.
- when the main thread next reschedules it checks the list of completed IO actions and makes the corresponding blocked processes ready to run again.
What happens if the FFI call invokes a callback (a foreign export, or foreign import wrapper)? Can callbacks be invoked by multiple OS threads? Are you planning to implement bound threads?

Cheers,
Simon

Simon Marlow
I think it's fantastic that you guys have got concurrency working, BTW.
Me too. Well done chaps.
Can you really context switch between two arbitrary instructions? Is the stack in a GC'able state at all times?
AFAIK, the context switch can only happen at "safe" moments, namely at a NEEDHEAP (or NEEDSTACK?) instruction. Ordinary heap allocation is unchecked for overflow, so if there were arbitrary context switches, a thread could check there is enough space available with a NEEDHEAP, be switched out, then resume when the heap has been exhausted by another thread: crash bang. Thus, the atomic unit of thread work must be exactly the sequence of instructions bounded by NEEDHEAP checks.
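Malcolm's point, that NEEDHEAP is the only safe switch point because allocation between checks is unchecked, can be made concrete with a toy interpreter loop. This is a Python simulation with invented instruction tuples and a hypothetical `HEAP_LIMIT`, not the C runtime: each thread is a list of `('NEEDHEAP', n)` and `('ALLOC', n)` instructions, and a context switch is only ever taken at a NEEDHEAP.

```python
from collections import deque

HEAP_LIMIT = 100  # hypothetical heap size, in words

def run(threads):
    """Each thread is a list of ('NEEDHEAP', n) or ('ALLOC', n).
    ALLOC never checks for space; only NEEDHEAP does, so NEEDHEAP is the
    only point at which a context switch is allowed to happen."""
    heap_used = 0
    switches = []                        # order in which threads hit safe points
    ready = deque((tid, iter(prog)) for tid, prog in enumerate(threads))
    while ready:
        tid, prog = ready.popleft()
        for op, n in prog:
            if op == 'NEEDHEAP':
                if heap_used + n > HEAP_LIMIT:
                    heap_used = 0        # stand-in for running the GC
                switches.append(tid)
                ready.append((tid, prog))  # reschedule, but only here
                break
            else:                        # 'ALLOC': unchecked, always succeeds
                heap_used += n
        # if the for-loop ends without a break, the thread finished
    return switches

demo = run([
    [('NEEDHEAP', 10), ('ALLOC', 10), ('NEEDHEAP', 5), ('ALLOC', 5)],
    [('NEEDHEAP', 20), ('ALLOC', 20)],
])
```

Each run of ALLOCs between NEEDHEAPs executes atomically, exactly as Malcolm describes: moving the `switches.append`/`ready.append` into the ALLOC branch would let another thread exhaust the heap between a thread's check and its allocation.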
What happens if the FFI call invokes a callback? (a foreign export, or foreign import wrapper)? Can callbacks be invoked by multiple OS threads? Are you planning to implement bound threads?
At the moment, I'm pretty sure Yhc does not implement foreign import wrapper. (That one is on my todo list for nhc98 as well.) But the point stands even with foreign export: does the exported Haskell function get run in the separate OS thread (causing heap-locking issues)? Or will it be added back into the internal thread pool?

Regards,
Malcolm

AFAIK, the context switch can only happen at "safe" moments, namely at a NEEDHEAP (or NEEDSTACK?) instruction. Ordinary heap allocation is unchecked for overflow, so if there were arbitrary context switches, a thread could check there is enough space available with a NEEDHEAP, be switched out, then resume when the heap has been exhausted by another thread: crash bang. Thus, the atomic unit of thread work must be exactly the sequence of instructions bounded by NEEDHEAP checks.
Indeed, this is exactly right. The runtime only bothers to check whether to reschedule just before executing a NEEDHEAP.

NEEDSTACK was removed; it seemed easier to have the FInfo store the maximum stack usage and do the stack check on EVAL.
At the moment, I'm pretty sure yhc does not implement foreign import wrapper. (That one is on my todo list for nhc98 as well.) But the point stands, even with foreign export. Does the exported Haskell function get run in the separate OS thread (causes heap-locking issues)? Or will it be added back into the internal thread pool?
Again, exactly right, it doesn't do callbacks yet. Given the choice, to avoid heap locks my inclination would be to get the OS thread to ask the main thread "Please run this Haskell code and wake me up when you're done".

Cheers :-)

Tom
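Tom's suggested approach, which was not implemented in Yhc at the time, can be sketched as follows. This is a hedged Python illustration with invented names: the OS (FFI) thread posts its callback to a queue and sleeps, and the main interpreter thread runs the callback at a reschedule point, so the heap is only ever touched by the main thread and needs no locking.

```python
import threading
import queue

requests = queue.Queue()   # callbacks the main interpreter thread must run

def call_haskell_from_os_thread(fn, arg):
    """Called on an OS (FFI) thread: ask the main thread to run fn(arg),
    then sleep until the main thread has done so."""
    done = threading.Event()
    box = {}
    requests.put((fn, arg, box, done))
    done.wait()                  # "wake me up when you're done"
    return box['result']

def main_thread_service_one():
    """Main thread, at a reschedule point: run one pending callback.
    Only the main thread touches the heap, so no locking is needed."""
    fn, arg, box, done = requests.get()
    box['result'] = fn(arg)      # run the 'Haskell' callback here
    done.set()                   # wake the OS thread that asked

results = []
os_thread = threading.Thread(
    target=lambda: results.append(
        call_haskell_from_os_thread(lambda x: x + 1, 41)))
os_thread.start()
main_thread_service_one()        # main thread services the request
os_thread.join()
```

A real implementation would service `requests` inside the normal reschedule loop rather than with a single blocking call, but the hand-off shape is the same.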
participants (6)
-
Malcolm Wallace
-
Neil Mitchell
-
Robert Dockins
-
Simon Marlow
-
Thomas Shackell
-
Tom Shackell