EDN Access

 

July 3, 1997


ATM switch chips switch on Net reliability

STEPHEN KEMPAINEN, TECHNICAL EDITOR

Transferring Internet, intranet, and extranet traffic increasingly depends on reliable ATM switches, because a single fault in the backbone can spiral into huge losses. To build reliable ATM switches, redundancy is key: Chips and system architectures must collaborate on switches that can survive multiple simultaneous faults.

Most businesses are now using the Internet, intranets, and extranets--almost as much as their telephones--to send and receive information. As a result, users expect the Net to be just as reliable as the phone system. Achieving that reliability is a challenge to the asynchronous-transfer-mode (ATM) switches deep within the Net: The switches must lose as little data as possible while working 24 hours a day, 365 days a year--at maximum throughput. A common approach to reaching reliability goals is to add redundancy to a design. Core switch chip sets are now available on which to build a reliable, high-performance ATM switch.

The first job of any switch is to quickly transfer blocks of data to their destinations. Traditionally, switches used a best-effort approach to transfer data blocks whether the blocks comprised ATM cells or Internet-protocol packets. "Best effort" means that the switch discards cells or packets from a block to alleviate congestion. Then, upon missing the block, the destination notifies the data source to resend that block. However, at ATM's 622-Mbps speed, using this approach would overwhelm the network resources with retransmissions of entire data blocks, in turn causing more congestion that would balloon into more lost cells.

Therefore, minimizing cell loss is essential for ensuring that the switch can reliably keep the traffic moving. Toward that end, some switch chips use nonblocking architectures in which all ports can concurrently receive from and transmit to different ports. However, congestion still occurs when multiple ingress ports want to use the same egress port. Consequently, even nonblocking switches need integrated traffic control to avoid congestion.

Reliability also means that the switch must offer high availability, measured in downtime per year. The telephone-company benchmark for availability is 99.999%, or "five-nines," availability: 3 minutes of downtime per year over rolling--continuously overlapping--six-month intervals. However, a new benchmark for ATM switches aims to achieve 99.9999%, or "six-nines," availability: no more than 30 sec of downtime per year.
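The downtime that an availability percentage allows is simple arithmetic. The following Python sketch converts a percentage into a downtime budget for a year and for a six-month window. It assumes a 365-day year, and the function name is illustrative only. The results are nominal values; the benchmark figures quoted above are rounded targets and need not match them exactly.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def downtime_seconds(availability_percent, period_seconds=SECONDS_PER_YEAR):
    """Return the downtime allowed in a period at a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_seconds


if __name__ == "__main__":
    for label, pct in (("five-nines", 99.999), ("six-nines", 99.9999)):
        per_year = downtime_seconds(pct)
        per_half = downtime_seconds(pct, SECONDS_PER_YEAR / 2)
        print(f"{label}: {per_year:.1f} s/year, {per_half:.1f} s per six months")
```

On these assumptions, five-nines works out to about 315 sec (5.3 minutes) per year, or about 2.6 minutes per six-month window, and six-nines to about 31.5 sec per year.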

No matter how good the chips are, though, the software is critical to reliability. Meeting the six-nines level means that the software must also provide fault tolerance, maintenance, and fault recovery. The software and hardware must collaborate to achieve an architecture in which even multiple simultaneous faults cannot bring down the entire switch. Using standards-based software, such as the Telecommunications Management Network, the Simple Network Management Protocol, and the Remote Monitoring modules, minimizes reliability problems. Such standards help by applying a rigorous methodology to writing and licensing software.

Redundancy, redundancy

The first requirement for reliable design is redundant components. How much redundancy you need depends on the number of simultaneous faults that you want your switch to survive. Duplicating power supplies, controllers, switch fabrics, management processors, and line cards lets your switch endure multiple faults. Figure 1 shows redundant links between eight line cards and two 8×8 (eight input ports and eight output ports) switch cores. Duplicating the links between redundant components lets a switch tolerate even more faults.

Taking redundancy a step further means running component pairs in parallel, or lock step: Each component of the pair carries a duplicate of the traffic, so that if one fails, the other is already carrying the traffic. For example, two line cards operating in lock step would both be carrying the same link traffic. Other components that should run in lock step include the monitoring, maintenance, and fault-recovery systems. The more redundant components a switch has, the more likely it is to survive simultaneous faults.
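A rough way to see why duplication pays off is to compare a chain of single components with a chain of duplicated ones. The Python sketch below is a textbook reliability calculation, not a model of any particular switch. It assumes that failures are independent, which real systems only approximate, and the 0.999 unit availability is a made-up figure.

```python
def parallel_availability(unit_availability, copies):
    """Availability of redundant units when any one of them suffices.

    Assumes independent failures (a simplifying assumption).
    """
    return 1.0 - (1.0 - unit_availability) ** copies


def series_availability(availabilities):
    """Availability of a chain in which every stage must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


if __name__ == "__main__":
    unit = 0.999  # hypothetical availability of one supply, fabric, or card
    single_chain = series_availability([unit] * 3)
    duplicated_chain = series_availability([parallel_availability(unit, 2)] * 3)
    print(f"three single stages:     {single_chain:.6f}")
    print(f"three duplicated stages: {duplicated_chain:.6f}")
```

With these numbers, three single stages in series fall below three-nines, whereas three duplicated stages stay above five-nines. Shared failure causes, such as a common clock or power feed, erode that gain, which is why the links and clocks need duplicating, too.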

Core switch chips can support redundancy with proper partitioning and features (Table 1). For example, Lucent Technologies' Atlanta chip set directly supports redundancy with chip partitioning and clock distribution (Figure 2). The ATM buffer manager (ABM) supports simultaneous links to dual switch fabrics by duplicating the interface to the ATM switch element (ASX). In addition, the system-timing and -clocking schemes allow for independent line-card clocks and arbitrary clock skews across a backplane. This approach eases duplicating the clock distribution by eliminating clock trees and minimizing skew effects.

Another, often-overlooked, component of reliable switches is a redundant bus for test and maintenance (TM). In normal operation, the data bus carries TM information. But, if the data bus should, for example, lose termination power, the TM bus gives alternate access to component modules. When choosing a TM bus, you should place higher priority on availability than on high performance. TM buses include the IEEE-1394, or Firewire, bus and the IEEE-1149.5-1995 IEEE Standard for Module Test and Maintenance Bus.

Along with a redundant bus, failure compartmentalization--stopping faults in one component from affecting others--is key to a reliable design. For example, a clock-distribution scheme can limit the effect of a faulty clock source. Redundancy and fault compartmentalization can be expensive, but a dead switch is even more expensive in lost revenue and reputation.

No surprises

There is more to a reliable design than just redundancy. You cannot let network events take your switch by surprise. Therefore, you must monitor and maintain operations under normal conditions, so that abnormal conditions produce alarm signals before catastrophic failures. For example, running scheduled diagnostic routines on alternate redundant components maintains their readiness. To let you observe operations and perform maintenance, the ITU standard I.610, B-ISDN Operations and Maintenance Principles and Functions, specifies the ATM operations, administrative, and maintenance (OAM) layer. It functions at both the network and the internal-switch levels. The standard reflects ATM's telephony origins and thus incorporates the robustness that has prevailed there for decades.

The OAM specification covers both hardware and software components. It also works internally and externally for connections within and between switches. OAM provides the tools to monitor and probe the system for warning signs of trouble; it also provides remote-diagnostics and -configuration tools (see box, "OAM mechanisms in ATM").

OAM depends on accurate statistics gathering. However, the 622-Mbps traffic on the switch ports overwhelms traditional statistics-gathering techniques. To gather statistics, the hardware cannot just sample traffic flow. It must continuously check each virtual connection (VC) to catch problems before they mushroom. Therefore, chips such as Lucent's ATM-layer user-network-interface (UNI) manager (ALM) support statistics gathering. ALM maintains a variety of optional, per-connection, 31-bit statistics counters in external memory. They are large enough to track lots of ATM cells until the management processor can collect the statistics and clear the counters. Also, to aid in statistics reporting, the ALM captures available-bit-rate (ABR) resource- and performance-management OAM cells at ingress. It then routes them to any VC for egress or to the processor port for endpoint processing. The management processor can then insert OAM and resource-management cells back through the processor port.
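To get a feel for how much headroom a per-connection counter gives the management processor, you can estimate the worst-case time before the counter wraps. The Python sketch below assumes the nominal 622.08-Mbps line rate with framing overhead ignored, 53-byte cells, and every cell on the port belonging to one connection. Those assumptions make the estimate pessimistic; the names are illustrative.

```python
LINE_RATE_BPS = 622.08e6  # assumption: nominal line rate, framing overhead ignored
CELL_BITS = 53 * 8        # one ATM cell is 53 bytes


def seconds_to_overflow(counter_bits, line_rate_bps=LINE_RATE_BPS):
    """Worst case: every cell on the port increments the same counter."""
    cells_per_second = line_rate_bps / CELL_BITS
    return (2 ** counter_bits) / cells_per_second


if __name__ == "__main__":
    for bits in (16, 31):
        print(f"{bits}-bit counter wraps after {seconds_to_overflow(bits):.3f} s")
```

Under these assumptions, a 31-bit counter lasts roughly 24 minutes between collections, whereas a 16-bit counter would wrap in less than 50 msec.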

The most efficient way to handle an OAM cell is to avoid interrupting the management processor. The hardware can perform all real-time OAM chores and let the software perform diagnostics and processing. For example, the RCMP-800 ingress port controller from PMC-Sierra automatically receives and generates alarm indications and remote-defect-indication cells. It also generates backward-performance-reporting cells. When OAM cells, such as those for activating and deactivating VC monitoring, require processing, the RCMP-800 routes them to the processor port. In addition, the RCMP-800 identifies loopback cells for direct extraction and return to the loopback source by the egress.

Nonstop repair strategy

For full reliability, a switch must not only detect faults but also recover from and repair them. Recovery and repair start with rerouting traffic around component faults to maintain switch availability. Next, the switch should recover as many statistics as possible; statistics gathered before the fault occurred help find its cause. OAM and a TM bus are useful in collecting data to troubleshoot the problem. Once the diagnostic processor identifies the problem, the maintenance processor should plan a repair incorporating hot-swapping--keeping the switch operating while replacing the problem component. Hot-swapping also avoids the delays of powering down and of reset and initialization after power-up. Some loss of throughput may occur while you perform hot-swap maintenance, but that situation is better than having the switch go out of service.

Hot-swapping without losing cells is a big challenge for recovery software. Software must not only alert the system to complete outstanding transactions and save important information before shutting down the faulty component, but also manage the switchover to redundant components. It then must synchronize reconfiguration of the replacement component into the system.

Both hardware components and software modules need hot-swapping or "hot-upgrading." Software upgrades for bug fixes and feature enhancements should cause minimal disturbance to the switch. All software modules need the ability to deactivate for upgrade by replacement modules on a scheduled or unscheduled basis. This feature facilitates upgrading software and hot-swapping hardware components.

To perform hot-swapping, hardware can have no daisy-chained signals, such as clocks, between components. Thus, if a component fails, it does not bring down other components. The other hardware requirement is that no electrical transients cause data loss or component failure. (Inserting the capacitive load of a plug-in board into an active bus can cause such transients.) The recovery software is responsible for avoiding data loss. However, if your design subjects a board-edge chip's I/O pins to these transients, you should check with the chip manufacturer to ensure that the I/O buffers include transient protection.

In addition to nonstop maintenance and repair, a reliable switch needs congestion control to provide high throughput and avoid losing cells. Traditionally, data-networking switches discarded cells when traffic backed up at popular egress ports and storage buffers overflowed. Managing congestion by dropping cells worked in these cases because the end destination would miss sequentially numbered cells and initiate a retransmission. Retransmissions are no longer acceptable because of the network speeds and time-critical traffic the network carries, such as video and audio. Now, to achieve lossless congestion control, your design must use ATM flow-control techniques that force data sources to shape the traffic. "Traffic shaping" means that the source doesn't send cells into the network until the source has reasonable assurance they will get to the destination on time without being dropped.

Traffic shaping begins at the cell source and continues through the network at each intermediary switch until the destination. A traffic scheduler at each switch shapes traffic by determining when to launch cells from an egress port. The scheduler considers priorities, quality of service (QoS), and network feedback while determining which VC's cells to launch downstream and when. The VC setup protocol establishes the priority level and QoS parameters for the VC. Each scheduler in the connection records the parameters.

Traffic schedulers, such as that in MMC's ATMS2000 switch engine, are critical to congestion control. The switch's ports share one scheduler, which schedules traffic on a per-VC basis. This approach saves the expense of having a scheduler at each port. On the other hand, Lucent Technologies' Atlanta chip set includes scheduling logic on the ALM and ABM at each port card. However, the Atlanta chip set lacks per-VC queuing, instead using programmable, weighted, round-robin algorithms to schedule four priority queues at each port.
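Weighted round-robin scheduling is easy to picture in code. The Python sketch below is a generic, textbook version, not Lucent's implementation; the four queues, the weights, and the function name are all illustrative assumptions. In each round, the scheduler launches as many cells from each queue as that queue's weight allows and skips empty queues.

```python
from collections import deque


def weighted_round_robin(queues, weights, max_cells):
    """Launch up to max_cells, serving weights[i] cells from queue i per round."""
    if min(weights) < 1:
        raise ValueError("weights must be positive integers")
    launched = []
    while len(launched) < max_cells and any(queues):
        for queue, weight in zip(queues, weights):
            for _ in range(weight):
                if not queue or len(launched) >= max_cells:
                    break
                launched.append(queue.popleft())
    return launched


if __name__ == "__main__":
    queues = [
        deque(["a1", "a2", "a3"]),  # highest-weight queue
        deque(["b1", "b2"]),
        deque(["c1"]),
        deque(["d1", "d2"]),
    ]
    print(weighted_round_robin(queues, weights=[2, 1, 1, 1], max_cells=10))
```

Because every queue with a positive weight gets a turn in each round, a busy high-weight queue cannot starve the others; it only receives a larger share of the launch slots.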

The ATM-cell stream flows in a VC that could pass through many switches before reaching the destination. Ideally, each switch would maintain the shaping parameters on a per-VC basis. Any chips you select for core switching should be able to manage enough VC descriptors for the converging VCs at the targeted application, be it deep in the core or at the edge of the network. For instance, the Lucent Atlanta ALM supports as many as 64,000 VCs at ingress and at egress for each port--enough to handle many converging VCs at the core of a large network.

Traffic shaping relies on sophisticated queuing and buffer management to get a VC's cells to the output ports in time to meet launch schedules. One VC must not block another. This requirement is especially important for switches at the core of the Net, where thousands of connections with many QoS requirements and priorities converge. The switch chips should not allow head-of-line blocking, in which cells become stuck in FIFO buffers behind cells with lower priority. The MMC Switch Engine and XStream chip set solves this blocking problem by eliminating the use of FIFO buffers (Figure 3). The port interface stores the cell payload in a common data memory and sends the cell header to the switch-controller chips. The controllers use pointer management instead of FIFO buffers to implement nonblocking queues.
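One common way to build queues from pointers rather than FIFO buffers is to keep all cells in one shared memory and to thread each logical queue through it as a linked list of slot numbers. The Python sketch below illustrates that general idea only; it is not MMC's design, and the class and method names are invented for the example.

```python
class SharedCellMemory:
    """Fixed pool of cell slots shared by any number of logical queues."""

    def __init__(self, slots):
        # Assumes slots >= 1. Free slots form a linked list through next_slot.
        self.payload = [None] * slots
        self.next_slot = list(range(1, slots)) + [None]
        self.free_head = 0
        self.heads = {}  # queue id -> slot holding the oldest cell
        self.tails = {}  # queue id -> slot holding the newest cell

    def enqueue(self, queue_id, cell):
        """Store a cell; return False if the shared memory is full."""
        slot = self.free_head
        if slot is None:
            return False
        self.free_head = self.next_slot[slot]
        self.payload[slot] = cell
        self.next_slot[slot] = None
        if queue_id in self.tails:
            self.next_slot[self.tails[queue_id]] = slot
        else:
            self.heads[queue_id] = slot
        self.tails[queue_id] = slot
        return True

    def dequeue(self, queue_id):
        """Remove and return the oldest cell of one queue, or None if empty."""
        slot = self.heads.get(queue_id)
        if slot is None:
            return None
        cell = self.payload[slot]
        following = self.next_slot[slot]
        if following is None:
            del self.heads[queue_id]
            del self.tails[queue_id]
        else:
            self.heads[queue_id] = following
        # Return the slot to the free list.
        self.payload[slot] = None
        self.next_slot[slot] = self.free_head
        self.free_head = slot
        return cell
```

Because each queue has its own head pointer, a scheduler can take the next cell from any queue regardless of the order in which cells arrived in memory, so a cell never waits behind another queue's traffic.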

Another way to control traffic through a switch is to implement feedback paths that prohibit buffers from overflowing. The Lucent Atlanta chip set supports a nonblocking, back-pressure scheme that carries buffer-full messages among the ATM crossbar element, the ATM switch element, and the ABM to stop buffer overflow. It provides for lossless storage throughout the switch. The Atlanta ABM handles full-duplex traffic at 622 Mbps and queues as many as 32,000 ATM cells in each direction. The MMC chip set, on the other hand, uses one controller and no FIFO buffers for all traffic through the switch; therefore, it requires no separate feedback connections.
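The effect of back pressure is easiest to see in a toy simulation. In the Python sketch below, a downstream buffer that is full makes the upstream stage hold its cells instead of sending them, so the switch drops nothing internally. The single source, the buffer size, and the drain rate are illustrative assumptions and do not model the Atlanta chip set.

```python
from collections import deque


def run_with_backpressure(arrivals, buffer_size, drain_every):
    """Simulate one upstream stage feeding a bounded downstream buffer.

    The downstream stage drains one cell every drain_every ticks. While its
    buffer is full, the upstream stage holds cells rather than dropping them.
    Returns the delivered cells and the number of ticks spent holding.
    """
    upstream = deque(arrivals)
    downstream = deque()
    delivered, held_ticks, tick = [], 0, 0
    while upstream or downstream:
        tick += 1
        buffer_full = len(downstream) >= buffer_size
        if upstream:
            if buffer_full:
                held_ticks += 1  # back pressure: wait, do not drop
            else:
                downstream.append(upstream.popleft())
        if tick % drain_every == 0 and downstream:
            delivered.append(downstream.popleft())
    return delivered, held_ticks


if __name__ == "__main__":
    cells, held = run_with_backpressure(list(range(6)), buffer_size=2, drain_every=2)
    print(cells, held)
```

Every cell arrives in order; the cost of the congestion shows up as waiting time upstream rather than as lost cells.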

Continuous traffic control also requires per-stream traffic policing. To implement this feature, some chip sets, such as MMC's XStream, include usage-parameter control, which checks the VCs to ensure that they conform to the contract for QoS that the VC negotiated with the switch at call setup. During the negotiation, the switch assesses its resources and reserves those that VCs require. If a VC transfers more traffic than the resources allocated to it can handle, then that VC violates its contract and interferes with the resources allocated to another VC.
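The usual way to express such a conformance check in ATM is the generic cell rate algorithm (GCRA) from the ATM traffic-management standards. The article does not say which algorithm any particular chip uses, so treat the Python sketch below as an illustration of the virtual-scheduling form of the GCRA and nothing more. The increment is the contracted spacing between cells, and the limit is the tolerance for cells that arrive early; the sample values are made up.

```python
def gcra_police(arrival_times, increment, limit):
    """Virtual-scheduling form of the generic cell rate algorithm.

    increment: nominal spacing between cells (1 / contracted rate).
    limit: how much earlier than scheduled a cell may arrive and still conform.
    Returns a list of booleans, True where the cell conforms.
    """
    theoretical_arrival = None
    verdicts = []
    for t in arrival_times:
        if theoretical_arrival is None:
            theoretical_arrival = t  # the first cell always conforms
        if t < theoretical_arrival - limit:
            verdicts.append(False)  # too early; the schedule is left unchanged
        else:
            theoretical_arrival = max(t, theoretical_arrival) + increment
            verdicts.append(True)
    return verdicts


if __name__ == "__main__":
    print(gcra_police([0, 10, 18, 25, 40], increment=10, limit=2))
```

In this example, the cell at time 18 arrives early but within the tolerance, whereas the cell at time 25 arrives more than two time units ahead of its schedule and is flagged. What a switch does with a flagged cell, such as tagging or discarding it, is a separate policy decision.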

The best way to control traffic through the switch is to implement per-VC queuing. In the MMC chip set, per-VC queuing uses three switch-interface chips plus memory to hold the queues and monitor the fill levels. The Queuing Controller and Scheduler process the VC QoS parameters to schedule outputs. Each VC has its own queue and associated priority level. The queues are logical rather than physical and can therefore be dynamically sized. This approach allows only the active VCs to actually use memory resources. The queues also have programmable thresholds to maximize memory usage.





  • Redundant components are essential to ensuring minimal cell loss and high availability.

  • Hardware support of OAM cells reduces software complexity in fault and performance management of ATM switches.

  • Hot-swap capability without losing cells is a big challenge to recovery software.

  • Per-virtual-connection queuing guarantees no head-of-line blocking.

  • Per-virtual-connection policing is necessary for maintaining quality-of-service contracts.

OAM mechanisms in ATM

The operations, administrative, and maintenance (OAM) protocols monitor the health and welfare of a switch's asynchronous-transfer-mode (ATM) layer. The OAM protocols work for not only interswitch, but also intraswitch probing. After identifying and locating faults, OAM reports them to routing processors that detour traffic. However, the protocol does use bandwidth and add complexity. The benefits of ATM-layer troubleshooting must outweigh the cost of adding OAM.

OAM is based on fault-management, performance-management, activation and deactivation, and system-management cells, which carry information between and within ATM equipment (Table A). F4 cells monitor virtual paths (VPs); F5 cells monitor virtual connections (VCs). Fault-management cells identify, locate, and report to the processor all losses of connections. Connection defects can arise from physical-layer defects, loss of cell delineation, and loss of continuity. Defects can occur at the ATM-connection level without having any problems at the physical layer. For instance, corrupted data could enter the VP- or VC-routing tables and cause loss of either type of connection.
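A receiver tells the two OAM flows apart from fields in the cell header. As generally specified for ATM, F4 cells travel on reserved VCI values within the VP (3 for segment flows, 4 for end-to-end flows), and F5 cells travel in the VC itself, marked by the payload-type field (binary 100 for segment, 101 for end-to-end). The Python sketch below decodes a 5-byte UNI header on that basis; the function names are illustrative, and the helper leaves the GFC and HEC fields at zero for brevity.

```python
def classify_oam(header):
    """Classify a 5-byte UNI cell header as an F4 or F5 OAM flow, or neither."""
    if len(header) != 5:
        raise ValueError("an ATM cell header is 5 bytes")
    vpi = ((header[0] & 0x0F) << 4) | (header[1] >> 4)
    vci = ((header[1] & 0x0F) << 12) | (header[2] << 4) | (header[3] >> 4)
    pti = (header[3] >> 1) & 0x07
    if vci == 3:
        kind = "F4 segment"
    elif vci == 4:
        kind = "F4 end-to-end"
    elif pti == 0b100:
        kind = "F5 segment"
    elif pti == 0b101:
        kind = "F5 end-to-end"
    else:
        kind = "not an OAM flow cell"
    return vpi, vci, kind


def build_header(vpi, vci, pti, clp=0):
    """Pack UNI header fields; GFC and HEC are left as zero in this sketch."""
    return bytes([
        (vpi >> 4) & 0x0F,
        ((vpi & 0x0F) << 4) | ((vci >> 12) & 0x0F),
        (vci >> 4) & 0xFF,
        ((vci & 0x0F) << 4) | ((pti & 0x07) << 1) | (clp & 0x01),
        0x00,
    ])


if __name__ == "__main__":
    print(classify_oam(build_header(vpi=5, vci=4, pti=0)))
    print(classify_oam(build_header(vpi=5, vci=100, pti=0b101)))
```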

The OAM protocols monitor the ATM network end to end and segment by segment. End-to-end performance and fault management let end users see what kind of service they are getting from the overall network. Segment-by-segment monitoring lets administrative domains monitor their own equipment and performance. In addition, both types of monitoring are useful in billing and accounting.

To demonstrate OAM's benefits, imagine that a company contracts for a VP, or "permanent virtual connection" (PVC), from a carrier to connect the home office to a branch office on the other side of the country. The company then uses multiple VCs in that path to carry voice, video, and data streams between the offices. The carrier uses F4 OAM cells to check its own network and switch performance. The F4 cells indicate the usage and performance that the carrier is providing to the company at the same time they are monitoring the PVC for faults. The company uses F5 cells to monitor the VCs that the contracted VP is carrying. The company knows how it is using the bandwidth it has contracted and when there are problems. Both participants benefit when OAM protocols offer network performance, fault, and maintenance management at multiple levels. The company can choose to monitor its VCs with fine granularity, such as counting every cell. The carrier can also choose its own granularity independently of the one its customer chooses, because the carrier passes only the F5 cells through its network and processes only the F4 cells.

In a segment carrying a VP connection, two switches use OAM cells to check connections (Figure A and Table A). In a VP connection, the switches use F4 OAM cells. If Switch A's OAM controller sees no transmitted user cells or reported defects on the VP in a specified time, T0, the controller generates and transmits continuity-check (CC) cells downstream. If Switch B does not receive user or CC cells in elapsed time T1, where T1 accounts for physical delay between the equipment, the Switch B OAM controller determines that there is a loss of connection. The Switch B controller then generates alarm-indication-signal (AIS) cells to send upstream and downstream to report the lost connection. When Switch A receives the AIS cell, it then sends the remote-defect-indication cell to upstream switches. Once the segment is aware of the fault, loopback cells locate the fault for further diagnosis and recovery.
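The two timers in this exchange can be sketched as a pair of small functions: one decides when the upstream switch inserts CC cells, and the other decides when the downstream switch declares a loss of connection. The Python sketch below is a simplified model with times in seconds. The function names and the sample values of T0 and T1 are illustrative assumptions, and the model ignores reported defects and propagation delay.

```python
def continuity_check_source(user_cell_times, t0, end_time):
    """Return the times at which the source inserts CC cells.

    A CC cell goes out whenever t0 has elapsed with no cell sent.
    """
    cc_times = []
    last_sent = 0.0
    pending = sorted(user_cell_times)
    i = 0
    while True:
        next_cc = last_sent + t0
        if i < len(pending) and pending[i] <= next_cc:
            last_sent = pending[i]  # user traffic restarts the timer
            i += 1
        elif next_cc <= end_time:
            cc_times.append(next_cc)
            last_sent = next_cc
        else:
            break
    return cc_times


def first_loss_of_continuity(received_times, t1, end_time):
    """Return the time the sink declares loss of continuity, or None."""
    last_seen = 0.0
    for t in sorted(received_times):
        if t - last_seen > t1:
            return last_seen + t1
        last_seen = t
    if end_time - last_seen > t1:
        return last_seen + t1
    return None


if __name__ == "__main__":
    user_cells = [0.5, 1.0]
    cc_cells = continuity_check_source(user_cells, t0=1.0, end_time=5.0)
    print("CC cells sent at:", cc_cells)
    healthy = first_loss_of_continuity(user_cells + cc_cells, t1=3.5, end_time=5.0)
    broken = first_loss_of_continuity(user_cells, t1=3.5, end_time=5.0)
    print("healthy link:", healthy, " broken link:", broken)
```

When the link is healthy, the CC cells keep the sink's timer from expiring even though user traffic has stopped. When nothing arrives after the last user cell, the sink's timer expires, and that is the point at which the sink would begin generating AIS cells.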

The performance-management cells exchange cell-accounting and quality-of-service (QoS) tracking information among ATM devices. OAM cells account for statistics on a per-VC and -VP basis. The cell-accounting information includes cells received, cells transmitted, and peak cell rates. The QoS information includes cell-loss and errored-cell rates, cell-misinsertion rates, cell delay, and cell-delay variations, or jitter.

The performance-management cells also perform bit-error checks on payloads, delineating and checking data cells. The source analyzes the block and calculates an error-detection code. The destination then checks on the payloads of cell blocks received and reports any detected errors. OAM protocols also provide remote monitoring and control using activation and deactivation cells, which carry information for setting up or discarding parameters in OAM controllers. In other words, the cells describe OAM monitoring goals and granularity. In addition, the system-management cells provide the system implementor with tools to build protocols for managing configuration, accounting, and security.
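The article says only that the source calculates an error-detection code over a block of cells. One simple code of this kind is a 16-bit bit-interleaved parity (BIP-16), which, to my understanding, is the form that I.610 uses; treat that as an assumption. The Python sketch below computes an even-parity BIP-16 over the payloads of a block of cells. The source would place the result in a performance-management cell, and the destination would recompute it over the cells it received and compare the two.

```python
def bip16(payloads):
    """Even-parity BIP-16 over a block of cell payloads.

    Each bit of the result is the XOR of the same bit position in every
    16-bit word of the block.
    """
    parity = 0
    for payload in payloads:
        if len(payload) % 2:
            raise ValueError("payload must be a whole number of 16-bit words")
        for i in range(0, len(payload), 2):
            parity ^= (payload[i] << 8) | payload[i + 1]
    return parity


if __name__ == "__main__":
    block = [bytes(range(48)), bytes(48)]  # two 48-byte payloads
    sent_code = bip16(block)
    # Flip one bit in the first payload to imitate a transmission error.
    damaged = [bytes([block[0][0] ^ 0x01]) + block[0][1:], block[1]]
    print("error detected:", bip16(damaged) != sent_code)
```

A parity code of this kind catches any single-bit error in the block but can miss an even number of errors in the same bit position, so it serves to estimate error rates rather than to guarantee detection.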

Table A--OAM-cell descriptions
Classification | Name | Function
Fault management (FM) | Alarm-indication signal (AIS) | Reports defects to neighboring nodes
Fault management (FM) | Remote-defect indication (RDI) | Reports defects to farther-upstream nodes
Fault management (FM) | Continuity check | Checks virtual connections for continuity
Fault management (FM) | Loopback | Locates faults and monitors connections
Performance management (PM) | Forward monitoring | Measures downstream cell loss, errors, and delays on a per-VC basis
Performance management (PM) | Backward reporting (BR) | Reports forward-monitoring results to source
OAM activation and deactivation | PM forward monitoring | Sets up and dissolves per-VC monitoring functions
OAM activation and deactivation | PM backward reporting | Sets up and dissolves per-VC monitoring functions
OAM activation and deactivation | FM continuity check | Sets up and dissolves continuity-check tasks
System management | Not specified | Used by equipment and system manufacturers

Stephen Kempainen, Technical Editor

You can reach Stephen Kempainen at 1-415-643-1760, fax 1-415-643-9513, e-mail ednkempainen@worldnet.att.net.




Copyright © 1997 EDN Magazine, EDN Access. EDN is a registered trademark of Reed Properties Inc, used under license. EDN is published by Cahners Publishing Company, a unit of Reed Elsevier Inc.

Table 1--Products for 622-Mbps ATM switch cores and port interfaces
Vendor | Product | Function | BIST(1) | QoS classes(2) | Availability | Package | Price (quantity)
IGT; Gaithersburg, MD; 1-301-990-9890; www.igt.com | WAC-487-A quad routing table; WAC-488-A quad switching element | Scales from 5 to 40 Gbps throughput; per-VC input buffering, per-VC switch feedback; self-routing switch fabric; 64 service classes with per-VC queues; per-service-class virtual output buffering for congestion firewall between outputs | JTAG external test | ABR, CBR, VBR, UBR | November | 503- and 596-pin BGAs | $264, $316 (10,000)
Integrated Device Technology; Santa Clara, CA; 1-408-754-4624; www.idt.com | IDT77V500 switch controller; IDT77V400 switch memory | The core for concentrators and expanders; ports configurable for any combination with total 1.24 Gbps (1- to 622-Mbps in ports and 4- to 155-Mbps out ports); on-chip memory for 8192 52- or 56-byte cells; V400 can be used separately | None | ABR, CBR, VBR, UBR | October | 100- and 208-pin PQFPs | $98 (10,000)
Lucent Technologies; Allentown, PA; 1-800-372-2447; www.lucent.com/micro | Atlanta ATM layer UNI manager (ALM) | ALM and ABM used together for port cards; provides multiplexing for lower-rate user-network-interface and network-to-network-interface ports; handles as many as 64,000 ingress and egress VCs; 622-Mbps aggregate bandwidth; statistics collecting | JTAG external test | ABR, CBR, VBR, UBR | Now | 240-pin SQFP | $700/set (1000)
Lucent Technologies | Atlanta ATM buffer manager (ABM) | Cell buffering; queue management and scheduling; queues as many as 64,000 cells/port; used with only the ALM to build a 30-port, nonblocking, low-end switch; supports redundant connections to switch fabric | JTAG external test | ABR, CBR, VBR, UBR | Now | 352-pin BGA | $700/set (1000)
Lucent Technologies | Atlanta ATM switching element (ASX) | 8×8 full-duplex ports using shared-memory architecture; 5-Gbps throughput; 512-cell, on-chip buffer | JTAG external test | NA | Now | 388-pin BGA | $700/set (1000)
Lucent Technologies | ATM crossbar element (ACE) | Used with ASX for constructing three-stage switch fabric; switch fabrics as fast as 25 Gbps with 40 622-Mbps ports | JTAG external test | NA | Now | 352-pin BGA | $700/set (1000)
MMC Networks; Sunnyvale, CA; 1-408-731-1600; www.mmcnet.com | ATMS2000 ATM Switch Engine (four chips) | Nonblocking, 5-Gbps switch functions; scalable to 20 Gbps; dynamic allocation of shared-memory buffering; as many as 64,000 cells total buffering; four priority queues per port | JTAG external test | ABR, CBR, VBR, UBR | Now | Two 208-pin and two 240-pin PQFPs | $2000 (1000)
MMC Networks | ATMS2101C XChecker | Monitors and polices cell traffic and gathers statistics for as many as 64,000 VCs; only one device per switch | JTAG external test | NA | Now | 208-pin PQFP | $180 (1000)
MMC Networks | XStream five-chip set | Per-flow queuing technology for both frames and ATM, isolates traffic flow on per-VC basis, implements three scheduling algorithms | JTAG external test | ABR, CBR, VBR, UBR | Now | Three 208-pin PQFPs, 256-pin BGA | $2000 (1000)
PMC-Sierra; Burnaby, BC, Canada; 1-604-688-7300; www.pmc-sierra.com | PM7332 RCMP-800 | Ingress port with address translation, cell-rate policing, counting, and OAM for 65,536 VCs; 16-bit interface to switch fabric; Utopia Level 2 interface | JTAG external test | ABR, CBR, VBR, UBR | Now | 240-pin PQFP | $199 (5000)
Vitesse; Camarillo, CA; 1-805-338-3700 | VSC850 16×32 crosspoint switch | GaAs switch for connecting any of 16 data inputs to one or a combination of 32 data outputs; differential ECL serial data ports as fast as 1.25 Gbps | None | NA | Now | 208-pin PQFP | $143 (1000)
(1) BIST can be an internal test for testing gates inside the device or an external test for pc-board connections.
(2) ABR=available bit rate, CBR=constant bit rate, VBR=variable bit rate, and UBR=unspecified bit rate.