
... The UltraSPARC T1/T2 architecture supports very fast thread synchronisation (by taking advantage of the fact that all threads share the same L2 cache). ...

Ah, scratch that second part then - though this is perhaps less of an issue when you have 4MB of L2 cache, vs the 256k cache for the machine in the paper.

Ben.

On 25/07/2008, at 10:38 AM, Ben Lippmeier wrote:
On 25/07/2008, at 8:55 AM, Duncan Coutts wrote:
Right. GHC on SPARC has also always disabled the register window when running Haskell code (at least for registerised builds) and only uses it when using the C stack and calling C functions.
I'm not sure whether register windows and continuation based back-ends are ever going to be very good matches - I don't remember the last time I saw a 'ret' instruction in the generated code :). If there's a killer application for register windows in GHC it'd be something tricky.
I'd be more interested in the 8 x hardware threads per core: [1] suggests that (single threaded) GHC code spends over half its time stalled due to L2 data cache misses. 64 threads per machine is a good incentive for trying out a few `par` calls.
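
For anyone who hasn't tried `par` before, here is a minimal sketch of what such a call looks like. It assumes the `par` and `pseq` exported from `GHC.Conc` in base (the usual home is `Control.Parallel` in the parallel package); `fib` and `parSum` are hypothetical names used only for illustration, and the naive Fibonacci is just a stand-in for real CPU-bound work.

```haskell
import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a CPU-bound stand-in workload.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark x so another hardware thread may evaluate it, evaluate y on the
-- current thread, then combine the two results.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 25 26)  -- prints 196418
```

To actually use more than one hardware thread this would need to be built with `ghc -threaded` and run with something like `+RTS -N8`. Whether it is any faster depends on the sparks being big enough to pay for their own overhead.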
Ben.