
27 Nov
2009
2:02 p.m.
One idea for reducing memory latency is to surround a complex dual-, quad-, or higher-core CPU with four or more simpler CPUs. The simpler CPUs have a simpler instruction set and do less context switching. Any operation (e.g. folding, filtering) that reduces the size of a larger data structure, and that can be expressed in the simpler instruction set, is shunted off to the simpler CPUs. The complex CPUs then receive less data, and so should get better caching performance, lower latency and better throughput. -- Regards, Casey
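
To make the data flow concrete, here is a rough software analogy of the idea, not a hardware design. Worker threads stand in for the simpler CPUs and the calling thread stands in for the complex CPU. The names `reduce_chunk` and `offload_reduce`, the even-number filter, and the default of four workers are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_chunk(chunk):
    """Stand-in for work a 'simple' CPU could do: filter, then fold.

    Keeps only the even values and folds them into a single sum, so a
    whole chunk shrinks to one number before it is handed back.
    """
    return sum(x for x in chunk if x % 2 == 0)


def offload_reduce(data, n_simple=4):
    """Split data across n_simple workers and combine their small results.

    Returns (partials, total). The caller, playing the 'complex' CPU,
    only ever touches the list of partial results, not the raw data.
    """
    # Ceiling division so every element lands in some chunk.
    size = max(1, -(-len(data) // n_simple))
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_simple) as pool:
        partials = list(pool.map(reduce_chunk, chunks))
    return partials, sum(partials)


partials, total = offload_reduce(list(range(1000)))
print(len(partials), total)
```

Here 1000 input values are reduced to four partial sums before the "complex" side sees anything, which is the reduction in data volume the idea relies on. Whether that translates into lower latency on real hardware depends on the cost of moving data to and from the simpler CPUs, which this sketch does not model.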