Sylvan <sebastian.sylvan@gmail.com> wrote:
> GHC doesn't have per-thread allocation, so it's probably a bit tricky to get
> that working. Plus, for parallelism it's not clear that a piece of data is
> necessarily "owned" by one thread, since it could be produced by one spark and
> consumed by another, and those two independent sparks may not necessarily
> run on the same thread, which means that any *other* data accessed by the
> first thread could thrash the cache. So really you'd need per-spark
> allocation areas, which would probably make sparks very heavyweight.
> In other words, I think there's plenty of research that needs to be done
> w.r.t. scheduling things in time and space so as to avoid false sharing. You
> could, of course, always chunk your work manually, and make sure that each
> "chunk" works on a big block that won't share cache lines with anything else
> (e.g. by padding the data structures).
> Also, while GHC does a fair bit of mutation on its own internal data (thunks
> etc.), most of the "user data" is read-only, which should help. I.e. once a
> cache line has been filled up, there won't be any synchronisation needed on
> that data again.
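The manual chunking idea described above can be sketched roughly as follows. This is only an illustration, not code from the thread: `chunksOf`, `parChunkSum`, and the chunk size are made-up names, and each chunk is reduced to a summary value so one spark does a substantial amount of work instead of one spark per element.

```haskell
import GHC.Conc (par, pseq)  -- also re-exported by Control.Parallel (parallel package)
import Data.List (foldl')

-- Split a list into chunks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Sum a list in parallel, one spark per chunk of n elements.
-- Each spark forces a whole chunk's sum, so the work per spark is
-- large relative to the scheduling overhead.
parChunkSum :: Int -> [Int] -> Int
parChunkSum n xs = go (map (foldl' (+) 0) (chunksOf n xs))
  where
    go []       = 0
    go (c : cs) = let rest = go cs
                  in c `par` (rest `pseq` (c + rest))

main :: IO ()
main = print (parChunkSum 100 [1 .. 10000])
```

With the `parallel` package installed, `parListChunk` from `Control.Parallel.Strategies` expresses the same idea without the hand-rolled helpers.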
>
> On Wed, Aug 5, 2009 at 8:04 PM, Thomas Witzel <witzel.thomas@gmail.com>
> wrote:
>>
>> I'll try that. I'd like to stick with it. As for the memory, although
>> it's probably quite a bit of work, it should be doable to have code
>> generated where the threads have their own, non-overlapping, memory
>> pages, so that the CPUs don't go into a cache-thrashing death-match.
>> I'll spend some more time with Haskell and then go from there.
>>
>> On Wed, Aug 5, 2009 at 3:01 PM, Sebastian Sylvan
>> <sebastian.sylvan@gmail.com> wrote:
>> >
>> >
>> > On Wed, Aug 5, 2009 at 6:59 PM, Thomas Witzel <witzel.thomas@gmail.com>
>> > wrote:
>> >>
>> >> 2. I started with the very simple nfib example given in the manual for
>> >> Control.Parallel (Section 7.18). On my systems using multiple cores
>> >> makes the code actually slower than just using a single core. While
>> >> the manual cautions that this could be the case for certain
>> >> algorithms, I'm wondering whether this is the desired behaviour for
>> >> this example.
>> >>
>> >> I'm using ghc 6.10.4 right now.
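For reference, the nfib spark example from that section of the GHC manual looks roughly like this (`par` and `pseq` are imported from GHC.Conc in base here for self-containedness; the documented `Control.Parallel` module in the `parallel` package re-exports them):

```haskell
import GHC.Conc (par, pseq)

-- nfib n counts the number of calls made, including itself:
-- each call contributes 1 to the result.
nfib :: Int -> Int
nfib n
  | n <= 1    = 1
  | otherwise = n1 `par` (n2 `pseq` (n1 + n2 + 1))
  where
    n1 = nfib (n - 1)  -- sparked for possible parallel evaluation
    n2 = nfib (n - 2)  -- evaluated by the current thread
```

Note that every recursive call creates a spark, and near the leaves each spark does almost no work, so at this granularity scheduling overhead can easily exceed the useful work, which is consistent with the slowdown described above on GHC 6.10.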
>> >
>> > IIRC the development version of GHC has some major work to optimize
>> > concurrency, so it may be worth trying that. In particular I believe it
>> > executes sparks in batches, to reduce the overhead (which hopefully
>> > fixes
>> > your issue).
>> >
>> > --
>> > Sebastian Sylvan
>> >
>
>
>
> --
> Sebastian Sylvan
>