Multiprocessing unravels PCI- and CompactPCI-design problems
New bridge chips and multiprocessing, intelligent I/O boards solve many of the real-time limitations in PCI and CompactPCI designs.
Christoph Adam, Force Computers -- EDN, January 7, 1999
Computer architects developed the high-speed PCI bus for PCs. In recent years, however, the PCI bus has also found its way into the embedded market in various forms. The most popular representative is PCI’s electrical and logical equivalent, CompactPCI, with its rugged, tried-and-true Eurocard format. Designers now use several techniques to implement real-time multiprocessing with PCI-to-PCI bridges (PPBs).

Designers must differentiate between symmetrical and asymmetrical multiprocessing. Asymmetrical multiprocessing links several PCI-based boards via the PCI/CompactPCI bus. Each board is "intelligent," having its own processor and operating system. A basic configuration implements the user interface and the network part of the system on the host CPU, with I/O modules providing additional real-time processing ( Figure 1 ). Symmetrical multiprocessing, on the other hand, combines several processors on one common processor bus under the control of one operating system that distributes independent threads to the multiple processors. Symmetrical multiprocessing occurs only on the processor bus and runs independently of the PCI bus or any other I/O bus. This article covers only asymmetrical multiprocessing.

A multiprocessing PCI/CompactPCI system must have a central controller that generates clock signals for the PCI bus, manages configuration, provides arbitration, and handles interrupts.

The synchronous PCI bus requires the system controller to generate a central clock signal for the system. The processor provides this clock signal to the local PCI bus via the processor-to-PCI, or "north," bridge ( Figure 2 ). In a CompactPCI system, the CPU forwards the local PCI clock signal to the bus, which distributes the clock signal to the other boards in the system.

The developers of PCI and CompactPCI based both buses on hardware and software specifications that define autoconfiguration cycles to enable plug-and-play capabilities.
During power-up autoconfiguration, the central system controller uses special bus transfers to detect which PCI devices connect to its local PCI bus and which connect to other PCI buses behind a PPB. The central system controller first configures and assigns address ranges for its local PCI devices and then repeats the process for the PPB and for any PCI devices found on PCI backplanes behind the PPB.

Because the PCI bus is a multimaster bus, several masters can compete for possession of the bus via a request/grant mechanism. A central arbiter, usually located in the processor-to-PCI bridge, allocates possession of the PCI bus. For example, a SCSI controller collects data, requests possession of the bus, and transfers the data to main memory by DMA. This DMA transfer requires no processor intervention. The operating system running on the system-controller board manages the resources of boards that have no separate processor on their local PCI bus, sometimes called "dumb I/O." Thus, a multimaster system is not necessarily also a multiprocessing system.

The PCI bus has four interrupt lines that several devices can use. The central interrupt controller is directly on the local PCI bus of the system-controller board. This arrangement is adequate for a PC system, but interrupt processing in a CompactPCI system is more complicated. A standard CompactPCI system has eight slots; if more than four interrupt-capable boards are installed, some interrupt sources must share the four available interrupt lines.

PPBs bridge the gap

Designers connect PCI buses in various ways. The first and most obvious method is to use a direct and parallel connection between the local PCI bus and the PCI backplane. The obvious disadvantage of this method is the restriction on the number of loads that a PCI module can drive. This simple hardware restriction limits the number of loads in any PCI system to 10.
From the viewpoint of the PCI bus, a PCI device on the motherboard counts as one load, and each additional slot counts as two more loads. For a standard PC, this situation means a maximum of four expansion slots.

To avoid the load restrictions of the direct-connection method, the PCI Special Interest Group (PCISIG) developed the PPB architecture specification in 1994. This specification defines a "transparent," or "standard," PPB ( Figure 3 ). After normal identification during the autoconfiguration cycle, the PPB becomes transparent to the host processor. From the multiprocessing point of view, the transparent bridge behaves the same way as the direct connection of two PCI buses in the first method. This type of PPB neither contains resources, such as DMA or mailboxes, that would require a separate device driver nor converts addresses from one PCI bus to another.

Consider a system with a transparent PPB on both the system controller and an intelligent I/O board. Standard CompactPCI-slot architecture normally places the system-controller/host board in Slot 1, with I/O boards beginning in Slot 2 ( Figure 4 ). Each board’s PPB is the interface between the board’s local PCI bus and the backplane PCI bus. The transparent PPBs solve the load-restriction problem but leave several multiprocessing problems.

Transparent PPBs have no plug-and-play capability because autoconfiguration cycles from the two processors collide. After power-up, each processor configures its local PCI I/O subsystem (PCI1 and PCI2 in Figure 4 ), configures its transparent PPB, and then attempts to identify and set the registers and address ranges of the attached devices behind the PPBs on the PCI backplane. As a result, the host board cannot identify an intelligent I/O board and vice versa. Processors on the system board and on intelligent I/O boards using transparent bridges also each generate their own synchronous PCI-clock signal and try to drive the PCI backplane simultaneously.
In addition, a transparent bridge has no device driver that could manage communications (such as mailboxes) between the two processors. Thus, you cannot achieve multiprocessing using transparent bridges for intelligent I/O boards in a PCI or CompactPCI system, at least not without considerable additional effort.

Embedded bridges to the rescue

A new type of PPB, an embedded, or "nontransparent," bridge, solves the multiple-processor problems. The developers of these bridges designed them for intelligent I/O boards and thus for asymmetrical multiprocessing. Using transparent PPBs on the host board and nontransparent PPBs on intelligent I/O boards provides a hardware approach to asymmetrical multiprocessing in PCI and CompactPCI systems.

Embedded bridges enable plug-and-play operation without collision of the configuration cycles. The embedded bridge effectively separates the areas that the local and host processors scan and configure. After power-up, the intelligent I/O processor scans, or autoconfigures, its own I/O subsystem on the primary, or local, side of the embedded bridge. It then configures the embedded bridge, installs address windows, defines address translations, and maps the address registers for its local PCI devices into the embedded bridge’s primary (local) and secondary (PCI-backplane-bus) sides. The local processor’s configuration scan ends at the embedded bridge and does not continue past the bridge to find devices on the PCI-backplane bus ( Figure 5 ).

During the same power-up sequence, the processor on the host board configures its local PCI I/O subsystem (PCI1 in the figure) and the transparent PPB and then scans for PCI devices on the PCI-backplane bus. When the host processor finds the embedded bridge on the intelligent I/O module, it sees only the secondary side of the bridge, which looks like a normal PCI target device, and configures it accordingly. The host processor cannot see the underlying PCI bus of the intelligent I/O module (PCI2 in the figure).
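
The separation of scan areas described above can be illustrated with a small model. This is only a sketch under stated assumptions: the class names, the `scan` function, and the device names ("scsi", "adc", and so on) are hypothetical and do not correspond to any real bridge chip or configuration API. The one rule it encodes comes from the text: a scan passes through a transparent PPB but treats an embedded bridge as an ordinary target and stops there.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Device:
    """An ordinary PCI device, such as a SCSI or LAN controller."""
    name: str


@dataclass
class EmbeddedBridge:
    """Nontransparent PPB: looks like a plain target from either side."""
    name: str


@dataclass
class Bus:
    """One PCI bus segment and the nodes attached to it."""
    name: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class TransparentBridge:
    """Standard PPB: a configuration scan continues onto the bus behind it."""
    name: str
    behind: Bus


Node = Union[Device, EmbeddedBridge, TransparentBridge]


def scan(bus: Bus) -> List[str]:
    """Return the names a processor finds when it autoconfigures `bus`."""
    found: List[str] = []
    for node in bus.children:
        found.append(node.name)
        if isinstance(node, TransparentBridge):
            # Only a transparent bridge lets the scan reach the next bus.
            found.extend(scan(node.behind))
        # An EmbeddedBridge is recorded as a target; the scan stops there.
    return found


# Hypothetical topology: the embedded bridge sits on both PCI2 and the backplane.
io_bridge = EmbeddedBridge("io-embedded-ppb")
pci2 = Bus("PCI2", [Device("adc"), io_bridge])
backplane = Bus("backplane", [Device("dumb-serial-io"), io_bridge])
pci1 = Bus("PCI1", [Device("scsi"), Device("lan"),
                    TransparentBridge("host-ppb", backplane)])

host_view = scan(pci1)   # what the host processor configures
io_view = scan(pci2)     # what the intelligent I/O processor configures
```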
Embedded bridges also solve the problem of addressing conflicts by address conversion. For example, the processor of the I/O board can "hide" resources it requires for itself from the host processor. In addition, address conversion lets you allocate a device on the local PCI bus of the intelligent I/O board to a PCI address on the backplane PCI bus without address conflicts.

Embedded bridges also handle the clock signal differently from transparent bridges. The host processor drives the clock signal via its transparent PPB to the PCI backplane bus and to the attached PCI devices. The embedded bridge on the intelligent I/O board receives the host clock signal on its secondary (PCI-backplane-bus) side but does not forward the host clock signal to its primary side (PCI2). The primary side of the nontransparent bridge runs with the clock signal from its own local processor. Each clock signal is still synchronous for its own PCI bus, but the two are asynchronous with respect to each other. The nontransparent PPB also includes mailboxes and I2O FIFO functions for communication between the host processor and the intelligent I/O processor.

Interrupt problems

Theoretically, the most significant real-time restriction for multiprocessing is the PCI interrupt structure. The PCI bus has four interrupt lines that connect to a central interrupt controller. The PCI-to-ISA, or "south," bridge normally houses the central interrupt controller. Using Figure 3 with the system slot and local PCI bus as an example, local SCSI, LAN, and PCMCIA devices can generate interrupts. Each of these local devices has an interrupt line on the interrupt controller. For example, if the Ethernet controller (LAN) generates an interrupt request, the interrupt controller detects the request and signals it to the processor. The processor then generates an interrupt-acknowledge (IACK) transfer to the interrupt controller, which supplies the start address for the required interrupt-service routine.
A PCI system with as many as four interrupt-capable devices can therefore quickly respond to an external event in a predictable time.

Interrupt processing becomes more complicated in a standard CompactPCI system because interrupts must be shared when more than four boards are inserted or if a board contains a multifunction device that can drive more than one interrupt line. The recommended routing for the interrupt lines on the backplane and their connection to the interrupt lines of the individual slots means that INTA of the system slot connects to INTB of Slot 2, INTC of Slot 3, INTD of Slot 4, INTA of Slot 5, and so forth ( Figure 6 ).

To see how this interrupt processing works, assume that slots 3 and 7 contain interrupt-capable "single-function" boards. The central interrupt controller detects an interrupt request from Slot 7 and then drives the individual INT line to the processor. The processor recognizes, by reading the interrupt vector from the interrupt controller, that the interrupt request came from INTC of the backplane bus. However, because two boards share this line, the processor does not know whether the interrupt request originated at Board 3 or Board 7. To determine which board originated the interrupt request, the processor must read the registers at the PCI interfaces of the two boards.

This method of interrupt processing delays identification of the originator of the interrupt request and poses a theoretical risk: The controller may not detect an interrupt request from Board 7 if valid interrupt requests from Board 3 arrive rapidly. Moreover, under the PCI transaction-ordering rules, the write data of another transfer that may still be in the FIFO buffer of the system-controller board’s PPB must be forwarded before the transparent PPB can attempt a read access to the CompactPCI bus. This "flushing" consumes more time.
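
The shared-line example above (boards in slots 3 and 7 both arriving on INTC) can be sketched in a few lines. This is a simplified model, not backplane documentation: it assumes single-function boards only and a plain four-step rotation of the line seen at the system slot, and the function names and the `pending` dictionary, which stands in for the per-board status registers the processor must read, are hypothetical.

```python
from typing import Dict, List

INT_LINES = ["INTA", "INTB", "INTC", "INTD"]


def backplane_line(slot: int) -> str:
    """Line on which a single-function board in `slot` reaches the system slot.

    Assumes the rotation used in the example: Slot 1 -> INTA, Slot 2 -> INTB,
    Slot 3 -> INTC, Slot 4 -> INTD, Slot 5 -> INTA, and so on.
    """
    if slot < 1:
        raise ValueError("slots are numbered from 1")
    return INT_LINES[(slot - 1) % len(INT_LINES)]


def slots_on_line(line: str, populated: List[int]) -> List[int]:
    """All populated slots whose boards share the given interrupt line."""
    return [slot for slot in populated if backplane_line(slot) == line]


def identify_sources(line: str, pending: Dict[int, bool]) -> List[int]:
    """Poll every board on a shared line and return those with a request.

    `pending` maps slot number to the state of that board's interrupt-status
    register; each lookup stands for one read access across the bus.
    """
    return [slot for slot in slots_on_line(line, sorted(pending))
            if pending[slot]]


# Boards in slots 3 and 7; only the board in Slot 7 has raised a request.
pending = {3: False, 7: True}
sharing = slots_on_line("INTC", sorted(pending))   # [3, 7]
sources = identify_sources("INTC", pending)        # [7]
```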
Real-time interrupts

If you plan to use the PCI bus for real-time applications, you should consider whether the interrupt structure is adequate, taking into account both current system requirements and possible future upgrades. The basic principle behind intelligent I/O boards is to reduce the load on the host processor by handling interrupts locally and to route only necessary transfers via the CompactPCI bus. The intelligent I/O processor immediately handles critical interrupts, making the somewhat longer CompactPCI hardware-interrupt latency irrelevant. Intelligent I/O boards can remove real-time requirements from the bus and leave CompactPCI as a "management" bus.

Revision 2.2 of the PCI local-bus specification includes message-signaled interrupts (MSIs) for real-time-critical applications that require many interrupt requests. MSIs do not use the four interrupt lines on the CompactPCI bus to indicate a service request to the processor. They instead use a PCI memory-write access over the CompactPCI bus to the PPB, which then can initiate a local interrupt on the system-controller board. The MSI revision is an as-yet-unapproved draft.

Multiprocessing software

The software for these designs, like the hardware, must support asymmetrical multiprocessing systems. The processors must be able to exchange information. One approach is to use a shared memory that any processor can write data to for communication with other processors. However, a mechanism must also exist for notifying other processors that data is available.

If the system controller wishes to notify an intelligent I/O board that it has stored data for the board in shared memory, the system-controller board initiates a register access to a special address of the embedded PPB of the intelligent I/O board ( Figure 7 ). This mailbox access triggers an interrupt on the I/O board’s local PCI bus.
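
The host-to-I/O-board notification described above can be sketched as follows. This is a hedged illustration only: the `Mailbox` and `IOBoard` classes, the `host_send` function, and the descriptor layout (offset in the upper 16 bits, length in the lower 16 bits) are assumptions made for the example and are not taken from any real embedded-bridge register map. A callback stands in for the interrupt that the mailbox write raises on the I/O board's local PCI bus.

```python
from typing import Callable, List, Optional


class Mailbox:
    """Hypothetical mailbox register inside a nontransparent PPB."""

    def __init__(self, on_write: Callable[[], None]) -> None:
        self._value: Optional[int] = None
        self._on_write = on_write

    def write(self, value: int) -> None:
        """Host-side register access; raises the local interrupt."""
        self._value = value
        self._on_write()

    def read(self) -> int:
        if self._value is None:
            raise RuntimeError("mailbox is empty")
        return self._value


class IOBoard:
    """Intelligent I/O board with its own processor and interrupt handler."""

    def __init__(self, shared: bytearray) -> None:
        self.shared = shared
        self.received: List[bytes] = []
        self.mailbox = Mailbox(on_write=self.handle_interrupt)

    def handle_interrupt(self) -> None:
        """Local interrupt handler: fetch the message from shared memory."""
        descriptor = self.mailbox.read()
        offset, length = descriptor >> 16, descriptor & 0xFFFF
        self.received.append(bytes(self.shared[offset:offset + length]))


def host_send(shared: bytearray, board: IOBoard,
              offset: int, payload: bytes) -> None:
    """Store the data in shared memory first, then ring the mailbox."""
    if offset + len(payload) > len(shared) or len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in shared memory")
    shared[offset:offset + len(payload)] = payload
    board.mailbox.write((offset << 16) | len(payload))


shared_memory = bytearray(64)
io_board = IOBoard(shared_memory)
host_send(shared_memory, io_board, 8, b"start")
```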
The local interrupt controller of the I/O board evaluates the interrupt, and the intelligent I/O processor then reads the data from shared memory. In the opposite direction, the intelligent I/O processor accesses a register of the nontransparent PPB, which notifies the system-controller board via a CompactPCI bus interrupt. The transparent PPB directly forwards this interrupt request to the interrupt controller.

Some applications require an additional communication facility between the intelligent I/O boards. This communication is similar to the mechanism an intelligent I/O board uses to notify the system controller. However, for this purpose, the intelligent I/O boards must know the location not only of shared memory but also of the corresponding (mailbox) registers of the I/O boards with which they communicate. I2O is an alternative communications technique between two intelligent I/O boards.

This article describes the hardware requirements that you must fulfill for asymmetrical multiprocessing to function on a CompactPCI system. Using embedded bridges addresses the key multiprocessing requirements: clocking, plug-and-play configuration management, arbitration, and interrupt handling. Embedded bridges also offer additional features, such as mailboxes, I2O support, and hidden-resource capabilities. Embedded bridge chips and asymmetrical multiprocessing I/O boards promise to solve some of the real-time limitations of PCI and CompactPCI designs. Handling the real-time activities on dedicated intelligent I/O boards eliminates the headaches of systemwide interrupt processing and restricts the PCI or CompactPCI bus to controlling system traffic.
Author's biography

Christoph Adam works in the Technical Marketing Department of Force Computers (Neubiberg, Munich, Germany), where he has worked for six years. In his current position, he interfaces with OEM customers and defines VMEbus and CompactPCI products. Previously, he designed high-speed VMEbus interfaces and PCI controllers. Adam, who has a degree in electrical engineering and information technology, enjoys traveling and outdoor sports. He was a semiprofessional soccer player in the second German soccer league.


















