RapidIO low-power, low-latency server and storage networks for the cloud
Mohammad Akhter, Integrated Device Technology - June 18, 2012
Recently, blade servers, micro-servers, and storage systems have gained significant attention as alternatives to traditional top-of-rack (ToR) systems in cloud data centers. The server and storage systems in a typical cloud infrastructure require easier scaling, efficient virtualization, and high-performance multi-core processors with low-latency interconnects to deliver a reliable, low-power system solution while guaranteeing a low-latency, secure user experience.
Blade architectures, with in-chassis switch cards and fewer uplink cables, offer lower cost and improved reliability compared with ToR-based systems. Micro-server and storage systems, in turn, promise a smaller footprint and lower power consumption than blade solutions. Depending on the architecture and the availability of native I/O on the processor, the latency of micro-server and storage systems can be reduced significantly while improving reliability, scalability, and secure virtualization.
Server and storage architectures typically use interconnects such as Ethernet, PCIe, and InfiniBand. Although Ethernet (with TCP/IP) is used predominantly for networking, it can also carry mixed traffic (storage, networking, and compute) via Converged Ethernet. However, latency and power consumption for mixed traffic in such systems are quite high because of compute-intensive protocol stacks. QoS also suffers significantly because of frame-based congestion-management schemes and Ethernet's inability to interrupt a large frame once transmission has begun.
The PCIe and InfiniBand protocols, on the other hand, are used primarily for compute and storage traffic. Lacking native messaging support and relying on a single-root hierarchy, PCIe is limited in scalability. InfiniBand-based server and storage systems, meanwhile, lack native processor support and suffer from higher system cost and end-to-end latency.
Example Cloud Use Case
Among many others, blade and micro-server architectures can benefit applications such as time-constrained big-data analysis in the cloud. Such applications need to support mixed traffic flows for networking, storage, and computation. To ensure adequate QoS, they must also complete the overall processing of large data sets and real-time streaming data within a given time period. The Apache Hadoop framework with HBase is typically used to handle big-data analysis.
The Hadoop-HBase framework has three components: MapReduce (compute), HBase (storage), and the interconnect fabric. In this framework, the data and tasks associated with the MapReduce engine are loaded and executed in parallel across server clusters over the network, which requires a scalable, efficient interconnect whose demands grow with the data size. The Map and Reduce functions perform the computation, while HBase handles fast, random access to the large distributed storage nodes over the network. For a large number of mapping engines to work in parallel, the large data set is first partitioned into a number of smaller sets. The mapping engines then sort the data into an intermediate format (each value paired with a key) and transfer it to the storage closest to the compute nodes.
After data loading and mapping, the output data from the various storage nodes is shuffled to reducer servers over the network based on the data's key. The master servers assign data to reducers based on the locality of the intermediate data. Once reduce processing completes, the data is combined to produce the final results, which are written back to one or more output files on the storage servers over the network.
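The partition-map-shuffle-reduce flow described above can be sketched in Python. This is a single-process illustration using a word-count workload and function names of my own choosing; in Hadoop, each phase runs on separate cluster nodes, and the shuffle step is where intermediate data crosses the interconnect:

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit intermediate (key, value) pairs -- here, word counts."""
    for word in record.split():
        yield word, 1

def shuffle(intermediate):
    """Shuffle: group intermediate pairs by key.
    In a real cluster this is the network-heavy step."""
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a final result."""
    return key, sum(values)

def map_reduce(records, partitions=2):
    # Partition the data set so mappers could run in parallel.
    chunks = [records[i::partitions] for i in range(partitions)]
    intermediate = []
    for chunk in chunks:              # in Hadoop, each chunk maps on its own node
        for record in chunk:
            intermediate.extend(map_phase(record))
    grouped = shuffle(intermediate)   # intermediate data moves over the fabric here
    return dict(reduce_phase(k, v) for k, v in grouped.items())
```

For example, `map_reduce(["a b a", "b c"])` yields `{"a": 2, "b": 2, "c": 1}`. The shuffle step dominates network traffic, which is why the article's interconnect latency and scalability arguments center on this phase.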
In this framework, completing the processing (computation and storage access) within a given time period means the software infrastructure benefits from server and storage architectures with an efficient interconnect. For example, the MapReduce model can exploit a low-latency infrastructure to load and execute tasks in parallel across large numbers of scalable compute and storage nodes. With hardware-based fault tolerance, error recovery, and low-latency synchronization, such a framework can also reduce the software overhead associated with system management and reliability (in both MapReduce and HBase). In summary, the processing steps in such a framework demand a data-center interconnect with low latency, scalability across many compute and storage nodes, and hardware-based fault tolerance and error recovery.
The features in the RapidIO protocol discussed in the following section support these and many other requirements relevant to server and storage applications in the cloud.
The high-performance RapidIO protocol was introduced in 2002 as a packet-based open data-communication standard. Since then, millions of RapidIO-based devices from a large number of OEMs and silicon suppliers have shipped around the world, meeting interconnect requirements in 3G/4G wireless base stations, video servers, military communications, embedded processing, and high-performance computing. Some of these applications benefit from RapidIO's low-latency, scalable fabric architecture, while others take advantage of its deterministic guarantee of delivery, fault tolerance, and reliability features to connect large numbers of multi-core CPUs.
The RapidIO protocol and its packet formats are specified in a three-layer architectural hierarchy: logical, transport, and physical. The protocol supports short-, medium-, and long-reach links on a board or across boards, over both fiber and cable.
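To illustrate the layering idea, the following Python sketch builds a toy packet with a transport-layer header (destination and source device IDs), a logical-layer format type, and a trailing checksum standing in for the physical layer's link CRC. The field widths, byte layout, and checksum here are illustrative assumptions only, not the actual RapidIO wire format:

```python
import struct

def pack_packet(prio, ftype, dest_id, src_id, payload):
    """Assemble a simplified RapidIO-like packet.
    prio/ftype come from the logical layer, dest_id/src_id from the
    transport layer; the trailing 16-bit checksum is a stand-in for
    the physical layer's link CRC. Layout is illustrative only."""
    header = struct.pack(">BBHH", prio, ftype, dest_id, src_id)
    body = header + payload
    crc = sum(body) & 0xFFFF          # placeholder, not the real CRC-16
    return body + struct.pack(">H", crc)

def unpack_packet(frame):
    """Verify the link-level check, then split the headers and payload."""
    body, (crc,) = frame[:-2], struct.unpack(">H", frame[-2:])
    if crc != sum(body) & 0xFFFF:
        raise ValueError("link-layer check failed")
    prio, ftype, dest_id, src_id = struct.unpack(">BBHH", body[:6])
    return prio, ftype, dest_id, src_id, body[6:]
```

The point of the layered split is that each layer can evolve or be implemented independently: the transport addressing and link CRC are handled in hardware without the endpoint software ever touching them, which is what keeps RapidIO's protocol overhead off the CPU.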
In terms of management and control, the RapidIO protocol supports comprehensive system bring-up, interoperability, multicasting, error management, and superior fault-recovery mechanisms. With a simplified layered architecture, the protocol stack can be implemented in hardware, keeping overall software overhead, system cost, and power consumption low. The accompanying table summarizes the key RapidIO features applicable to blade and micro-server architectures.