Re: help wrt semantics / primops for pure prefetches

30 Nov 2014


indeed, it's a *tool* that cuts both ways, and it needs to pay for itself with
benchmarks!
part of why i'm pushing for the has_side_effects=True attribute in
https://phabricator.haskell.org/D350
is that it'll make it more predictable exactly when the instruction fires,
and using prefetch effectively requires a pretty precise understanding of
application-level memory pressure!
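
To make the "when does it fire" point concrete, here is a minimal sketch using the state-token-threaded prefetch signatures that GHC.Exts exposes in later GHC releases (7.10 onwards, as far as I know). The names MBA, newMBA and prefetchThenRead are hypothetical helpers made up for illustration, and the 8-byte word size is an assumption about a 64-bit platform. Because the prefetch consumes and returns a State# token, it is sequenced between the operations on either side of it.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- | Hypothetical wrapper around a mutable array of machine words.
data MBA = MBA (MutableByteArray# RealWorld)

-- | Allocate room for n words and zero-fill them.
--   Assumes 8-byte words (64-bit platform).
newMBA :: Int -> IO MBA
newMBA (I# n) = IO $ \s0 ->
  case newByteArray# (n *# 8#) s0 of
    (# s1, mba #) -> case setByteArray# mba 0# (n *# 8#) 0# s1 of
      s2 -> (# s2, MBA mba #)

-- | Issue a prefetch hint for slot j, then read slot i.
--   The prefetch takes a byte offset; the read takes a word index.
--   Threading the State# token pins the hint before the read.
prefetchThenRead :: MBA -> Int -> Int -> IO Int
prefetchThenRead (MBA mba) (I# j) (I# i) = IO $ \s0 ->
  case prefetchMutableByteArray3# mba (j *# 8#) s0 of
    s1 -> case readIntArray# mba i s1 of
      (# s2, v #) -> (# s2, I# v #)
```

In GHCi, `newMBA 4 >>= \m -> prefetchThenRead m 3 0` returns 0, since the array is zero-filled; the prefetch itself is only a hint and has no observable result. Whether it helps is exactly the benchmarking question above.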

@greg, i'd love more code review if you don't mind :)

On Sat, Nov 29, 2014 at 9:15 PM, Gregory Collins <greg@gregorycollins.net>
wrote:
...
On Thu, Nov 27, 2014 at 1:36 AM, Simon Marlow <marlowsd@gmail.com> wrote:
...
I haven't been watching this, but I have one question: does prefetching
actually *work*?  Do you have benchmarks (or better still, actual
library/application code) that show some improvement?  I admit to being
slightly sceptical - when I've tried using prefetching in the GC it has
always been a struggle to get something that shows an improvement, and even
when I get things tuned on one machine it typically makes things slower on
a different processor.  And that's in the GC; doing it at the Haskell level
should be even harder.

I've gotten some speedups (for some operations) from speculative prefetch
for hash tables -- e.g. for cuckoo hash where you know the key can be in
one of two buckets with p=0.5. Prefetching one while you search the other
lets you squeeze out some instruction-level parallelism, at the expense of
cache thrashing.
The cache thrashing issue means that whether prefetching works for you
depends a lot on your inputs: it can help if your program can handle some
additional cache pressure, and it might hurt you otherwise.
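
A rough sketch of the two-bucket lookup described above, not the actual hash table code: hint the second candidate bucket, then search the first while that load is (hopefully) in flight. Everything here is an assumption for illustration: the Table layout, 4 slots per bucket, 0 as the empty-slot marker (so keys must be non-zero), 8-byte words, the toy hash functions, and an insert that never evicts, so it is not a full cuckoo insert.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- | Bucket count plus a flat array of key slots (hypothetical layout).
data Table = Table Int (MutableByteArray# RealWorld)

slotsPerBucket :: Int
slotsPerBucket = 4

-- | Allocate a zero-filled table; assumes 8-byte words.
newTable :: Int -> IO Table
newTable nb = case nb * slotsPerBucket * 8 of
  I# bytes -> IO $ \s0 ->
    case newByteArray# bytes s0 of
      (# s1, mba #) -> case setByteArray# mba 0# bytes 0# s1 of
        s2 -> (# s2, Table nb mba #)

-- | Toy hash functions; a real table would use something stronger.
hash1, hash2 :: Int -> Int -> Int
hash1 nb k = k `mod` nb
hash2 nb k = (k * 31 + 7) `mod` nb

readSlot :: Table -> Int -> IO Int
readSlot (Table _ mba) (I# i) = IO $ \s ->
  case readIntArray# mba i s of (# s', v #) -> (# s', I# v #)

writeSlot :: Table -> Int -> Int -> IO ()
writeSlot (Table _ mba) (I# i) (I# v) = IO $ \s ->
  case writeIntArray# mba i v s of s' -> (# s', () #)

-- | Hint that bucket b is about to be read (byte offset into the array).
prefetchBucket :: Table -> Int -> IO ()
prefetchBucket (Table _ mba) b = case b * slotsPerBucket * 8 of
  I# off -> IO $ \s ->
    case prefetchMutableByteArray3# mba off s of s' -> (# s', () #)

-- | Linear scan of one bucket for key k.
searchBucket :: Table -> Int -> Int -> IO Bool
searchBucket t b k = go 0
  where
    base = b * slotsPerBucket
    go i
      | i >= slotsPerBucket = pure False
      | otherwise = do
          v <- readSlot t (base + i)
          if v == k then pure True else go (i + 1)

-- | Lookup: prefetch the second bucket, search the first, then fall back.
member :: Table -> Int -> IO Bool
member t@(Table nb _) k = do
  let b1 = hash1 nb k
      b2 = hash2 nb k
  prefetchBucket t b2           -- speculative: wasted if k is in b1
  found <- searchBucket t b1
  if found then pure True else searchBucket t b2

-- | Simplified insert: first empty slot in either bucket, no eviction.
insert :: Table -> Int -> IO Bool
insert t@(Table nb _) k = do
  ok <- tryBucket (hash1 nb k)
  if ok then pure True else tryBucket (hash2 nb k)
  where
    tryBucket b = go 0
      where
        base = b * slotsPerBucket
        go i
          | i >= slotsPerBucket = pure False
          | otherwise = do
              v <- readSlot t (base + i)
              if v == 0
                then writeSlot t (base + i) k >> pure True
                else go (i + 1)
```

The prefetch in `member` is the speculative part: when the key turns out to be in the first bucket, the second bucket's cache line was pulled in for nothing, which is the extra cache pressure mentioned above.
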
G
--
Gregory Collins <greg@gregorycollins.net>

Carter Schonwald