Design of a high-performance buffered crossbar switch fabric using network on chip

Abstract

High-performance routers constitute the basic building blocks of the Internet. The vast majority of today's high-performance routers are designed using a crossbar switch fabric as the interconnect topology. The buffered crossbar (CICQ) switching architecture is known to be the best crossbar-based architecture for router design. However, CICQs require expensive on-chip buffers whose cost grows quadratically with the router port count. Additionally, they use long wires to connect router inputs to outputs, resulting in non-negligible delays. In this thesis, we propose a novel design for the CICQ switch architecture: we design the whole buffered crossbar fabric as a Network on Chip (NoC). We propose two architectural variants. The first, named the Unidirectional NoC (UDN), builds the crossbar core as a NoC with input and output ports placed on two opposite sides of the fabric chip. The second, named the Multidirectional NoC (MDN), improves on the UDN to better match the chip pin layout by placing the inputs and outputs around the perimeter (all four sides) of the NoC-based crossbar fabric. Both proposed architectures have been analyzed with appropriate routing algorithms under both unicast and multicast traffic conditions. Our results show that the proposed NoC-based crossbar switching design outperforms the conventional CICQ architecture. Our designs offer several advantages over the traditional CICQ design: 1) Speedup, because short wires allow reliable high-speed signalling and each on-chip router needs only simple local arbitration. 2) Load balancing, because paths from different input-output port pairs share the same router buffers. 3) Path diversity, because traffic from an input port can follow different paths to its destination output port. 4) Simpler switch design, because a simple input memory structure such as first-in first-out (FIFO) input queueing suffices, whereas the traditional design requires virtual output queueing (VOQ).
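
The quadratic buffer cost mentioned above follows from a conventional buffered crossbar holding one crosspoint buffer per input-output pair. The short sketch below only illustrates that growth; the function names and the `cells_per_crosspoint` parameter are hypothetical and are not taken from the thesis.

```python
def crosspoint_buffer_count(ports: int) -> int:
    """Crosspoint buffers in an N x N buffered crossbar: one per input-output pair."""
    if ports < 1:
        raise ValueError("ports must be a positive integer")
    return ports * ports


def total_buffer_cells(ports: int, cells_per_crosspoint: int = 1) -> int:
    """Total on-chip cell storage, assuming every crosspoint buffer has the same depth.

    cells_per_crosspoint is an illustrative assumption, not a value from the thesis.
    """
    return crosspoint_buffer_count(ports) * cells_per_crosspoint


if __name__ == "__main__":
    # Doubling the port count quadruples the number of crosspoint buffers.
    for n in (8, 16, 32, 64):
        print(f"{n:3d} ports: {crosspoint_buffer_count(n):5d} crosspoint buffers")
```

Under this counting, doubling the port count quadruples the buffer count, which is the scaling problem the abstract attributes to conventional CICQ fabrics.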