Workload trend suggests that we want to enable cache to cache transfer to minimize latency. So we need to use direct communication scheme. However, due to the distributed nature of the direct communication, it is hard to enforce order, and traditionally we can only enforce a total order (the strongest assumption) using an ordered interconnect (e.g., bus). For example, destination-set predictor uses the direct communication scheme but has to use an ordered network.That limits the performance. The technology trend suggests that we want to use unordered interconnects. Continue reading
Direct communication, i.e., broadcast, achieves low latency while requiring high bandwidth due to broadcast. This is what snoopy implementation does.
Indirect communication, i.e., centralized, requires low bandwidth while incurring high latency. This is what directory based implementation does. Continue reading