Nilanjan Goswami, Zhongqi LI, Ajit Verma, Ramkumar Shankar, and, Tao Li, Integrating Nanophotonics in GPU Microarchitecture
Parallel Architectures and Compilation Techniques Conference, 2012-09-22
Abstract : As high-performance computing device, the GPU has exposed bandwidth and latency bottlenecks in on-chip interconnect and off-chip memory access. To eliminate such bottlenecks, we employ silicon nanophotonics and 3D stacking technologies in GPU microarchitecture. This provides higher communication bandwidth and lower latency signaling mechanisms at reduced power. Furthermore, to insulate the performance of the GPU compute cores from the interconnect bottlenecks we propose a novel interconnect aware thread scheduling scheme to alleviate the traffic congestion. We evaluate a 3D stacked GPU with 2048 SIMD cores having photonic interconnect. The photonic multiple-write-single-read crossbar network with 32B channel bandwidth on average, achieves 96% power reduction. We anticipate that for emerging workloads and microarchitectures the implications of the proposed ideas are far reaching in terms of power and performance.