FROM EDN EUROPE: ARM targets automotive and industrial dominance
Back before the PC owned the personal computer market, British company Acorn sold the BBC Microcomputer, so called because the national broadcaster ran educational programmes tailored for this machine. Engineers loved the BBC Micro because its highly efficient BBC-Basic interpreter freely mixed high-level code and assembly language and the system offered direct access to I/O. As a result, the machine found roles in laboratories and industrial applications that its designers could hardly have foreseen. Spurred on by success while recognising the need for a more powerful processor than an 8-bit 6502, Acorn embarked on designing its own silicon, eventually adopting the RISC approach that researchers David Patterson at Berkeley and Bell Laboratories' David Ditzel proposed back in 1980 (Reference 1). In 1985, the ARM1 became the first commercially available RISC chip.
Ultimately, Acorn spun off its chip-design department into the fabless intellectual-property concern Advanced RISC Machines—now better known as ARM. Since its inception in 1991, this company's dominance in mobile products has come to rival the PC's desktop supremacy. According to a recent story in Electronics Weekly, ARM's chief executive officer Warren East said with respect to the company's market share in 3G phones: "If it's not 100 percent, it's very close to 100 percent" (Reference 2). East went on to note that despite the telecoms success, ARM's non-wireless business has grown faster than its wireless business: "Non-wireless shipments are half a billion a year and non-wireless is growing at 44%, whereas overall we're growing 30%." So what's the secret of ARM's success, and how useful are its architectures beyond the confines of mobile telephony—within automotive and industrial applications, for instance?
The key advantage that RISC promises for power-sensitive applications is low power consumption with minimal performance impact. ARM has long claimed leadership in the MIPS-per-Watt race with, for instance, its new Cortex-A8 core running at 600 to 800 MHz for 300 to 400 mW of power consumption. By comparison, a popular DSP such as Analog Devices' Blackfin (itself a highly efficient machine) consumes around 550 mW at 600 MHz in normal operation. RISC cores also require less silicon area than traditional microcontrollers, making it possible to include more peripherals. These features allow silicon vendors to offer stand-alone ARM-based devices that span basic 32-bit microcontrollers to complex application-specific standard products. First though, it's worth reviewing the ARM landscape to identify the core variants on offer and their relative feature sets. The company categorises its products as application cores, embedded cores, and secure cores. Application cores are those that run commercially available operating systems (OS) including Linux, Palm OS, Symbian OS, and Windows CE. These processors are intellectual-property designs for third-party integration and generally have an optional floating-point unit to accelerate 3D graphics. The latest offerings comprise the ARM11 MPCore multiprocessor that can include up to four processor instances, as well as the new Cortex-A8 processor. The range also counts the ARM7 and ARM9 variants that are popular in the embedded space, with all devices in this group including a memory-management unit (MMU) to enable OS operation.
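The power figures quoted above for the Cortex-A8 and the Blackfin can be put on a common footing with a little arithmetic. The sketch below is only a rough comparison: it uses clock rate per milliwatt (not MIPS per watt, which the article does not give), and it assumes that the lower Cortex-A8 clock pairs with the lower power figure and the higher clock with the higher one. The function name is illustrative.

```python
# Clock rate delivered per milliwatt, using only figures quoted in the article.
# Assumption: 600 MHz pairs with 300 mW and 800 MHz pairs with 400 mW.


def mhz_per_mw(clock_mhz: float, power_mw: float) -> float:
    """Return the clock frequency delivered per milliwatt consumed."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return clock_mhz / power_mw


cortex_a8_low = mhz_per_mw(600, 300)   # lower end of the quoted range
cortex_a8_high = mhz_per_mw(800, 400)  # upper end of the quoted range
blackfin = mhz_per_mw(600, 550)        # Blackfin in normal operation

if __name__ == "__main__":
    print(f"Cortex-A8: {cortex_a8_low:.2f} to {cortex_a8_high:.2f} MHz/mW")
    print(f"Blackfin:  {blackfin:.2f} MHz/mW")
```

Clock rate is only a proxy for useful work, so the ratio says nothing about how much each architecture achieves per cycle.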
With a memory-protection unit (MPU) and an optional cryptographic coprocessor, the secure core group includes a host of security features to suit applications such as smartcards. But it's the embedded core range that's of greatest interest to most designers, where the ARM7 and the higher-performance ARM9 rule. Various bus interfaces are available, including the company's AHB (advanced high-performance bus), its open-source AMBA (advanced microcontroller bus architecture), and AMBA's latest derivative, AXI (advanced-extensible-interface). Cores may also support DSP instructions and Jazelle, ARM's Java acceleration technology. All variants support the Thumb instruction set, a 16-bit subset that compresses the commonest 32-bit ARM instructions to save on memory cost, expanding them at runtime with minimal performance penalty.
In deliverable terms, there's a further product subdivision into macrocells that comprise a physical layout tailored to a specific semiconductor process, and synthesisable IP (intellectual property) that takes the form of a high-level-language definition suitable for use with a cell library. Importantly from an off-the-shelf silicon user's perspective, the synthesis option allows chipmakers to implement many variations of the full feature set, with the result that not all processors include all of the same facilities. For example, chipmakers may exclude the embedded-trace macrocell (ETM) that furnishes hardware address and data comparators, address-range decoders and comparators, counters, sequencers, and external I/O ports.
ARM7TDMI Rules Embedded
Of the more than 1 billion ARM cores shipped last year, the ARM7TDMI remains the most popular. Understanding this core's major features is the key to the range, as later products extensively build on this foundation. For instance, the ARM720 includes a coprocessor to control its cache and MMU, while ARM9 adds another two stages to the ARM7's three-stage pipeline to simplify the pipeline logic and allow higher clock speeds. This later architecture also splits ARM7's unified data bus into separate data and instruction-memory buses to increase memory bandwidth. Crucially, the higher performance architectures are upwardly code-compatible with ARM7, furnishing a scaleable upgrade path that extends to multiple cores. See sidebar "ARM7TDMI Primer" for an overview of the chip's major characteristics.
One feature that sets the ARM7 apart from conventional microcontrollers is its support for coprocessors via a hardware interface and an instruction set extension mechanism. There's support for 16 logical coprocessors, each of which can access up to 16 private registers of arbitrary size. Each coprocessor uses the same load/store architecture as the ARM core, with instructions to move data around its own registers and to exchange data with the ARM's registers and external memory. The coprocessor interface relies on a bus-watch/handshake technique where the coprocessor copies the instruction stream into its own pipeline and synchronously replicates the host's actions. The coprocessor can always start execution provided that it's able to revert to its original state if the handshake doesn't complete. The handshake involves three signals: *cpi (coprocessor instruction), cpa (coprocessor absent), and cpb (coprocessor busy). When the host encounters a coprocessor instruction, it issues the *cpi signal. If no coprocessor is present, the cpa signal remains active and the core enters an undefined instruction trap sequence. If the coprocessor accepts the instruction but is busy, it negates cpa but leaves cpb active. The core then runs a wait sequence during which it may respond to interrupts before retrying. When all three signals are low, the coprocessor is ready to execute its instruction.
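The handshake described above reduces to a small decision table, sketched here as a behavioural model rather than a description of the actual hardware. It assumes that *cpi is active-low (modelled as ncpi, with 0 meaning "coprocessor instruction issued") and that cpa and cpb are active-high. The enum and function names are illustrative.

```python
from enum import Enum
from typing import List


class CoreAction(Enum):
    IDLE = "no coprocessor instruction issued"
    UNDEFINED_TRAP = "take the undefined-instruction trap"
    BUSY_WAIT = "wait (interrupts may be serviced), then retry"
    EXECUTE = "coprocessor executes the instruction"


def handshake(ncpi: int, cpa: int, cpb: int) -> CoreAction:
    """Map the three handshake levels (0 = low, 1 = high) to the core's action."""
    if ncpi == 1:
        return CoreAction.IDLE            # host has not issued a coprocessor instruction
    if cpa == 1:
        return CoreAction.UNDEFINED_TRAP  # cpa still active: no coprocessor responded
    if cpb == 1:
        return CoreAction.BUSY_WAIT       # instruction accepted, but coprocessor is busy
    return CoreAction.EXECUTE             # all three signals low


def run_instruction(busy_cycles: int) -> List[CoreAction]:
    """Trace the core's actions while a present coprocessor is busy for busy_cycles."""
    trace = []
    for cycle in range(busy_cycles + 1):
        cpb = 1 if cycle < busy_cycles else 0
        trace.append(handshake(ncpi=0, cpa=0, cpb=cpb))
    return trace


if __name__ == "__main__":
    for action in run_instruction(2):
        print(action.value)
```

Checking cpa before cpb mirrors the order in the text: an absent coprocessor leads straight to the trap, whatever the busy line shows.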
This system-on-chip interface stimulates a wide range of stand-alone microcontrollers that especially suit automotive and industrial use. For example, the new mixed-signal ARM7TDMI-powered microcontrollers from Analog Devices ideally suit sensor applications. The eight-member ADµC 70xx family shares 40 MIPS throughput, 62 kbytes of Flash, and 8 kbytes of RAM (Figure 1). Major peripherals comprise a multiple-input 12-bit ADC that runs at 1 MHz and up to four 12-bit output DACs. The smallest package measures just 6 mm² and accommodates 13 general-purpose I/O pins. Larger packages furnish up to 40 GPIO pins and a 16-bit resolution three-phase PWM module that suits medium-complexity drives such as ac-induction motors, while future derivatives may include dedicated multiply/accumulate hardware to tackle sensorless motor control. Some chip variants also offer an external memory interface, but among the common features are a precision 20 ppm/°C voltage reference, a temperature sensor accurate to ±3°C, and a voltage comparator. Serial communications include a UART, SPI port, and two I2C ports. There's also a 16-element programmable-logic-array (PLA) that's useful for mopping up glue-logic functions to implement, for example, a programmable threshold-level interrupt using the comparator and a DAC channel.
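The threshold-level interrupt mentioned above can be pictured as three stages: a DAC channel sets the threshold, the comparator tests the input against it, and a PLA element gates the result into an interrupt request. The following is a simplified behavioural sketch, not device code. The 2.5 V full-scale reference, the code-to-voltage mapping, and the use of a single AND function in the PLA element are all assumptions, and every name is hypothetical.

```python
DAC_BITS = 12                       # from the article: 12-bit output DACs
FULL_SCALE = (1 << DAC_BITS) - 1    # highest DAC code, 4095
V_REF = 2.5                         # ASSUMPTION: full-scale reference in volts


def dac_code_for(threshold_v: float, v_ref: float = V_REF) -> int:
    """Return the nearest DAC code for the requested threshold voltage."""
    if not 0.0 <= threshold_v <= v_ref:
        raise ValueError("threshold outside DAC range")
    return int(threshold_v / v_ref * FULL_SCALE + 0.5)


def dac_output(code: int, v_ref: float = V_REF) -> float:
    """Return the voltage produced by a DAC code (assumed linear mapping)."""
    return code / FULL_SCALE * v_ref


def comparator(v_in: float, v_threshold: float) -> bool:
    """Comparator output is high when the input exceeds the threshold."""
    return v_in > v_threshold


def threshold_interrupt(v_in: float, code: int, enable: bool = True) -> bool:
    """Model one PLA element as an AND gate: interrupt = comparator AND enable."""
    return comparator(v_in, dac_output(code)) and enable


if __name__ == "__main__":
    code = dac_code_for(1.0)
    print(code, threshold_interrupt(1.2, code), threshold_interrupt(0.8, code))
```

Changing the threshold then means writing a new DAC code, with no change to the logic that raises the interrupt.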
Donal Killackey, product marketing manager for precision analogue microcontrollers, says that Analog Devices' choice of the ARM core to partner its existing range of 8052-based devices was natural: "We already had an ARM license for the telecoms products and were working on an instrumentation project for fibre transceivers that needed big compute power in a tiny form factor. Coincidentally, our 8052-based customers were asking for options with more processing power, with many expressing a preference for an ARM7 family." As a result, Killackey says, the company adopted a two-prong approach, offering both custom and generic parts to automotive and industrial OEMs. He foresees new chip variants becoming available with more Flash and interfaces such as CAN (controller-area-network) and LIN (local-interconnect-network). Available now is the $249 QuickStart Plus development kit that comes complete with an evaluation board, code-limited versions of IAR's Embedded Workbench and Keil's µVision3 environment, the PLA programming tool, a Windows serial downloader, a power supply and documentation. The kit includes Analog Devices' own nonintrusive JTAG debugger, which, Killackey notes, is RDI (remote-debugger-interface)-compliant for compatibility with a wide variety of debug environments.
Renowned for its AVR microcontrollers and its 8051 range, Atmel is another enthusiastic ARM proponent. Peter Bishop, Atmel's communications manager at its Rousset, France facility, says that the company took this route for its mass-market 32-bit products because "ARM is a rock-solid standard in 32 bits." He notes that the ARM7 especially suits application as a connections engine, providing the bridge between different physical layers and protocols. Atmel offers two subfamilies, with its XC variant offering encryption blocks that include AES (advanced-encryption-standard) and triple DES (data-encryption-standard) cryptographic engines. Bishop notes that encryption is no longer limited to areas such as financial transactions: "Secure Web access is now a prerequisite for industrial Ethernet ports. You don't want some hacker wreaking havoc with industrial processes," he warns. Atmel's entry-level ARM7 device is the SAM7S series, which offers 32 to 256 kbytes of Flash together with 8 to 64 kbytes of SRAM; peripherals include a USB 2.0 device. Bishop says that the 7SE variant will also have an external bus interface to augment on-chip memory. Other derivatives include the 7A that features CAN interfaces, the 7L that targets ultra-low-power applications such as environmental sensors, and the new 7X that boasts CAN, Ethernet, and USB interfaces on its 50-MIPS silicon.
For example, the AT91SAM7X128 carries 128 kbytes of Flash and 32 kbytes of SRAM. Its 100-lead package is also home to two UARTs and two SPI ports, an 8-channel 10-bit 384k-samples/sec ADC, a PWM-capable counter/timer block, and an 18-channel peripheral DMA controller. The DMA controller holds the processor overhead of a 10-Mbps stream to only about 4%, whereas a conventional design exhausts its bandwidth at this point. The similar 7X256 houses 256 kbytes of Flash and 64 kbytes of SRAM, while the 7XC versions add encryption engines. Bishop notes that these designs feature Atmel's multilevel vectored-interrupt-controller that circumvents the native core's two-level interrupt restriction: "The ARM7 is just fine as an execution engine, but its lack of a prioritised interrupt controller often requires software implementations that choke performance." Atmel incorporates a bit-set/clear model for all of its peripherals, which facilitates atomic I/O operations within a single non-interruptible instruction cycle, dispensing with the need to mask and re-enable interrupts around a bit-manipulation routine.
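The benefit of the bit-set/clear model is easiest to see next to the read-modify-write sequence it replaces. The toy model below is a sketch under stated assumptions, not Atmel's register map: the class, method, and function names are hypothetical, and the "interrupt" is simulated by hand between the read and the write.

```python
class OutputPort:
    """Toy port with separate set and clear registers (names are hypothetical)."""

    def __init__(self, width: int = 32) -> None:
        self.mask = (1 << width) - 1
        self.state = 0

    def write_set(self, bits: int) -> None:
        """A 1 written here drives that pin high; a 0 leaves the pin unchanged."""
        self.state |= bits & self.mask

    def write_clear(self, bits: int) -> None:
        """A 1 written here drives that pin low; a 0 leaves the pin unchanged."""
        self.state &= ~bits & self.mask


def read_modify_write_with_interrupt(state: int, main_bit: int, isr_bit: int) -> int:
    """Show how an interrupt between the read and the write loses the ISR's update."""
    snapshot = state              # main code reads the port
    state |= isr_bit              # interrupt fires and sets its own bit
    state = snapshot | main_bit   # main code writes back its stale copy
    return state


if __name__ == "__main__":
    # Read-modify-write: the interrupt's bit (0b10) is silently overwritten.
    print(bin(read_modify_write_with_interrupt(0, 0b01, 0b10)))

    # Set/clear registers: each write is one store, so both bits survive.
    port = OutputPort()
    port.write_set(0b01)   # main code
    port.write_set(0b10)   # interrupt handler, at any point
    print(bin(port.state))
```

Because each update is a single store that touches only the bits named in the written value, there is no window in which an interrupt can invalidate a cached copy of the port.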
Bishop points to Atmel's long-standing relationship with vendors such as IAR for development support. At the top end, the Integrity RTOS from Green Hills tackles critical applications, but developers can get started with products such as the AT91SAM7X-EK evaluation board that's available for $249 (also budget $129 for the AT91SAM-ICE debugger).
Atmel's latest ARM-based product is the AT91SAM9261, which embeds the 200-MIPS ARM926 core. Designed for very low power consumption in battery-powered wireless devices, the device includes an LCD controller that suits black-and-white and colour displays of up to 2048×2048 pixels. A development board showcases these features (Figure 2). To further minimise system-power consumption and also reduce the bill-of-materials, it's possible to configure the chip's 160 kbytes of SRAM as a frame buffer. The chip is fast enough to run security algorithms in software, while its hardware includes optimisations for real-time use, such as Atmel's tightly-coupled memory architecture that permits users to directly connect external SRAM to the CPU with no latency concerns, as well as a prioritised interrupt controller and DMA subsystem. The chip also carries a USB host controller within a 217-pin ball-grid-array package for less than $10 in production volumes.
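Whether a given display fits the 160 kbytes of on-chip SRAM is a matter of simple arithmetic. The sketch below assumes that the whole SRAM is available as a frame buffer, that 1 kbyte is 1,024 bytes, and that pixels are packed without padding. The display formats in the usage example are illustrative choices, not figures from the article.

```python
SRAM_BYTES = 160 * 1024  # 160 kbytes of on-chip SRAM (assuming 1 kbyte = 1024 bytes)


def frame_buffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Return the bytes needed for one frame, assuming tightly packed pixels."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to a whole byte


def fits_on_chip(width: int, height: int, bits_per_pixel: int,
                 sram_bytes: int = SRAM_BYTES) -> bool:
    """Return True if a single frame fits entirely in the given SRAM."""
    return frame_buffer_bytes(width, height, bits_per_pixel) <= sram_bytes


if __name__ == "__main__":
    # Illustrative display formats (assumptions, not taken from the article).
    for w, h, bpp in [(320, 240, 16), (640, 480, 8), (640, 480, 1)]:
        size = frame_buffer_bytes(w, h, bpp)
        print(f"{w}x{h} at {bpp} bpp: {size} bytes, fits={fits_on_chip(w, h, bpp)}")
```

On these assumptions a 320×240 display at 16 bits per pixel needs 153,600 bytes and fits, while a 2048×2048 display needs external memory even at one bit per pixel.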
Mike Olivarez, principal staff scientist at Freescale's wireless and mobile systems group, says that from its initial wireless communications background, Freescale is positioning its ARM core-based i.MX application processors ever more toward automotive, industrial, and medical applications: "Anything that has batteries, buttons, or displays is a perfect fit for the i.MX family." He notes that Freescale is an ARM architectural licensee and lead partner, which enables the company to make changes to the ARM architecture while maintaining complete software compatibility. To provide the compute horsepower that multimedia applications need, Freescale's i.MX family embodies ARM920, 926, and 1136 cores. Olivarez points to devices such as the new i.MX31 as enabling a range of ultra-portable computing devices: "Having a hard-disk controller on-chip as well as the video interfaces opens up non-traditional applications, such as embedded security monitoring." Key features couple the MMU that most OSs demand with multimedia-specific hardware including a CCIR-656 video interface, MPEG4 hardware video acceleration, a 2- and 3-D graphics engine, and a versatile LCD controller. There's also an array of serial interfaces including wireless support with a SDIO (secure digital input/output) interface to support trusted-content strategies. An internal crossbar switch that overcomes bus-bandwidth limitations and a vectored-interrupt-controller to handle real-time events help build a fully parallel processing system, Olivarez says.
Fred Stotz, of the developer relations team within Freescale's wireless and mobile systems group, views the company's success in building third-party communities as enabling a range of toolchain support that customers find irresistible. To help proliferate i.MX processors, Stotz worked with partners to develop entry-level tools such as the i.MXL and i.MX21 Litekits that retail for a recommended $499. These kits comprise a 6.35×3.8-cm standalone board developed by Cogent Computer Systems that carries an i.MXL or i.MX21 processor, 64 Mbytes of SDRAM, 8 Mbytes of Flash, and a USB port. You also get Cogent's expansion board that adds peripherals including a touchscreen LCD, Ethernet controller, audio CODEC, and USB host and device ports. Running on a Windows or Linux host via a TFTP (trivial-file-transfer-protocol) USB link, software support comes from the GNU-X toolchain and GX-Linux ports by Microcross that include Ethernet, serial, and video-interface drivers as well as the normal utilities and libraries. The debugger is Visual GDB, a variant of the GNU debugger. Stotz adds that Linux board support packages are available from Freescale's website for these kits as well as for the ADS (application-development system) versions that feature the ubiquitous Metrowerks CodeWarrior IDE and retail for $1,888.
Ata Khan, director of product innovation for microcontrollers at Philips Semiconductors, reflects that the company's involvement with ARM started about 10 years ago for cellphone applications. Today, the company employs ARM cores within its own products, and has publicly offered its ARM7-based LPC2000 family for about five years. The LPC2000 series now comprises some twenty variants that offer from 32 to 256 kbytes of Flash as well as external memory options; 8 to 64 kbytes of SRAM; various timers that include watchdog and capture/compare facilities; a dedicated PWM unit; serial interfaces including CAN, I2C, and SPI; a multichannel 10-bit ADC; and from 32 to 112 GPIO lines. Packages range from 28-pin PLCC to 144-pin LQFP. Growing application areas span low-end DSP and control to protocol converters and PMBus ports.
Announced last September, the latest family members are the LPC2101/2/3. With pricing set at just $1.47/10,000 for the 8-kbyte Flash and 2-kbyte SRAM-equipped LPC2101, this ARM7TDMI machine sets a new low price point. The LPC2102 packs 16 kbytes of Flash and 4 kbytes of SRAM for $1.85/10,000, while the LPC2103 provides 32 kbytes of Flash and 8 kbytes of SRAM for $2.20/10,000. The 7-mm² outline houses four timers with capture/compare for PWM support, a real-time clock and a watchdog timer, two 16C550-style UARTs, two I2C and two SPI ports, and an 8-channel 10-bit ADC (Figure 3). All of these resources reside on the advanced peripheral bus and communicate with the core via a bridge onto ARM's advanced high-performance bus. Notice too the vectored interrupt controller that augments the core's limited interrupt-handling hardware.
Khan points to the 32 fast 5V-tolerant GPIO ports that sit on the core's local bus: "As it's a RISC machine, there's no intrinsic read-modify-write capability, so we added atomic bit manipulation and moved the GPIO to the CPU bus. These measures ensure that bit manipulations complete in a single instruction cycle, so they are deterministic and non-interruptible—and capable of toggling at speeds up to 17.5 MHz." Khan also notes that the chip's 128-bit wide Flash provides some 280 Mbytes/sec of memory bandwidth that allows the core to run at its full 70 MHz, assisting deterministic behaviour when running from on-chip memory. This Flash architecture employs a two-transistor construction that's a trade-off between size and robustness and also facilitates building EEPROM. The Flash uses prefetch and branch buffers to virtually equal SRAM performance, and includes an 8-bit error-correction-code field that detects and corrects all single-bit errors, a feature that automotive applications increasingly demand. Four power-management modes progressively shut the chip down to its 5-µA sleep mode.
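The Flash bandwidth figure quoted above can be cross-checked against the core clock. The sketch below assumes the core fetches one 32-bit ARM instruction per cycle and that every Flash access delivers a full 128-bit row. The resulting Flash access rate is an inference from the quoted numbers, not a figure stated by Philips, and the function names are illustrative.

```python
FLASH_WIDTH_BITS = 128      # from the article: 128-bit wide Flash
CORE_CLOCK_MHZ = 70         # from the article: full core speed
ARM_INSTRUCTION_BITS = 32   # ARM-state instructions are 32 bits wide


def fetch_demand_mbytes_per_s(clock_mhz: float, instruction_bits: int) -> float:
    """Bandwidth needed to supply one instruction per core clock cycle."""
    return clock_mhz * instruction_bits / 8


def flash_access_rate_mhz(bandwidth_mbytes_per_s: float, width_bits: int) -> float:
    """Flash accesses per microsecond needed to sustain the given bandwidth."""
    return bandwidth_mbytes_per_s / (width_bits / 8)


demand = fetch_demand_mbytes_per_s(CORE_CLOCK_MHZ, ARM_INSTRUCTION_BITS)
instructions_per_row = FLASH_WIDTH_BITS // ARM_INSTRUCTION_BITS
access_rate = flash_access_rate_mhz(demand, FLASH_WIDTH_BITS)

if __name__ == "__main__":
    print(f"Core fetch demand: {demand:.0f} Mbytes/sec")
    print(f"Instructions per Flash row: {instructions_per_row}")
    print(f"Required Flash access rate: {access_rate} MHz")
```

On these assumptions the demand works out to the 280 Mbytes/sec that Khan quotes, and the wide row means the Flash array needs to complete only one access for every four core cycles.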
Khan stresses the importance of the chip's debug abilities and his company's toolchain partners. Entry-level development kits are available from suppliers such as IAR and Keil starting at around $99, with tools working up to the $10,000 level for highly optimising compilers that support the very large code sets that, for example, a multi-user Linux development project demands. Because the embedded-trace macrocell (ETM) takes 10 pins for a full-speed implementation, Khan suggests starting off with a high-end LPC part that makes the macrocell available to the outside world, then migrating to a part that optimally fits the end-user application. For hands-on development guidance, LPC designers can freely download a book written by Trevor Martin, an engineer at development specialist Hitex, whose website offers a variety of ARM development kits (Reference 3).
Having just announced USB peripherals for the LPC series, Khan advises that Ethernet will shortly be available. He also sees Philips moving on to ARM9 architectures and shrinking the process geometry to 90 nm within the next year: "Although 90 nm decreases part cost, the negative is a big rise in on-chip leakage currents. But if you constrain speed, you can also constrain leakage." For this reason, Khan expects the company to use different cell libraries to provide alternative power-versus-speed choices to complement its multilevel power-domain hardware: "To suit applications such as wireless sensor nodes, we'd really like to get sleep-mode power consumption down below 1µA."
With a partnership that stretches back more than 12 years, Sharp Microelectronics was ARM's third licensee and enjoys extensive experience in combining ARM cores with custom graphics engines. Gunter Wagschal, product-marketing manager for Sharp's BlueStreak products, points to a range that now spans ARM7TDMI, ARM720, and ARM9: "As an LCD manufacturer, we benefit from a scaleable range of computation engines to handle graphical interfaces from black-and-white displays in white goods to complex man-machine interfaces in industrial automation." For minimum cost, the ARM7TDMI derivatives employ a 16-bit memory interface and exclude the embedded-trace macrocell (ETM) that Wagschal notes is only necessary during early development: "Companies such as Lauterbach offer very good trace tools that work out much less expensive over long production runs than including the ETM," he says. The LH754xx family combines graphical controllers with VGA, XGA, and colour or black-and-white interfaces with 32-kbyte SRAM, an 8-channel 10-bit ADC, three 16-bit timers with capture/compare facilities, a universal serial interface, and a four-channel DMA controller. The chips run at up to 90 MHz from a 3.3-V supply, with derivatives furnishing up to 76 5V-tolerant I/O pins and a CAN controller within a 144-pin outline. There's also a range of complementary ARM720T products that include MMUs and SDRAM interfaces to suit OS applications.
Sharp's latest additions are the 250-MHz LH7A400 and its 266-MHz cousin, the LH7A404. Both devices employ the ARM9 core coupled with a colour XGA graphics engine to suit high-end graphical-user-interface applications, with 16 kbytes of both cache and instruction memory complementing 80 kbytes of SRAM. Other common features include an MMU, SDRAM interface, serial ports including SPI and USB capabilities, an infrared interface port, an interface that supports smart cards and memory cards, and the JTAG port. Wagschal observes that such applications almost invariably run a complex OS such as Linux to serve the application's needs. He stresses Sharp's commitment to customer support, which is crucial to win and retain business: "We've done a huge amount of work internally and with third parties to, for example, port Linux onto our high-end products. We're now running the latest Linux kernel, version 2.6.12, but probably most importantly for the majority of our customers, we provide full support for our peripherals in our board support packages." Usefully, Sharp's driver support takes the form of C-source code rather than object code, and it's freely available under the BlueStreak software library website, where engineers can access a range of resources including Metrowerks' open-source Linux for the LH7A400. Sharp offers several reference designs for media players, with third-party development-board support coming from Logic Product Development.
Stephanie Ordan, product line manager for the STR730 series at STMicroelectronics, says that her company first became involved with ARM some five years ago with the intention of pursuing automotive applications: "From a platform viewpoint, the ARM architecture is upwardly compatible from ARM7 through the ARM9 and 11, which is a huge advantage for developers looking for a migration path within a familiar environment. We will be moving to ARM9 during 2006." ST has recently taken advantage of its CAN-bus-focused ARM silicon from its STR710/720 series to tackle the industrial market. Released only this October, the company promotes its new STR730 range as the first ARM7 devices to tackle industry's needs. Key features include a 5V power supply, –40 to +105°C operation, an enhanced interrupt controller, and full-speed execution from Flash to ensure deterministic response from the 36-MHz, 32-MIPS ARM7TDMI core.
The four-member STR730 range currently offers 64 to 256 kbytes of Flash with 16 kbytes of SRAM in packages that range from 100-pin TQFP to a 144-pin TQFP. A 144-ball BGA version shrinks the outline to just 10 mm². Peripherals can include as many as twelve serial communications interfaces (three CAN 2.0B channels, two I2C ports, three SSI ports, and four UARTs) as well as a 16-level, 64-vector nested interrupt controller; ten 16-bit timers with capture/compare; six 16-bit PWM modules; three general-purpose 16-bit timers with 8-bit prescalers; sixteen channels of 3 µsec ADC; four four-channel DMA controllers; a watchdog timer and a real-time clock; and up to 112 I/O lines. The chip also implements a five-level power-minimisation strategy via two embedded voltage regulators. Toolchain support includes starter kits from IAR and Raisonance, as well as ST's own STR730-EVAL/WS evaluation board. Crucially, developers can freely access source-code device drivers for all the STR730's peripherals from the company's website (Figure 4). There's even a free uClinux port that's designed for the STR710-EVAL board. The STR730 is sampling now, with prices that span $4.53/10,000 for the CAN-less 100-pin, 64-kbyte Flash STR736FV0 to $8.99/10,000 for the 256-kbyte STR730FZ2 that carries three CAN ports and 112 GPIOs within its 144-lead TQFP.
Speaking for Texas Instruments, Matthias Poppel, marketing manager for TI's advanced embedded control division, reflects that it doesn't make a lot of sense for TI Automotive to keep on developing its own cores when very capable industry-standard designs are readily available: "TI is ARM's biggest single customer, and we have a very tightly coupled relationship." Acknowledging TI's leadership in DSPs, Poppel points to platforms such as TI's OMAP (open-multimedia-applications platform) that combines ARM cores including the 9 and 11 variants with DSPs from the TMS320C family. Principally intended for automotive use, the ARM7TDMI-based TMS470 series runs at the sub-100-MHz level rather than the headline-grabbing GHz rates of its DSP brethren. Poppel explains: "Automotive designers don't want cores running at very high speeds as this not only adds to the RF noise field, possibly disturbing audio and video signals, but could also potentially impact safety-critical subsystems."
Performance is nonetheless a big issue, so TI has developed efficient peripherals that can operate for the most part autonomously. Examples include its high-end timer module that includes intelligence to free the CPU from interrupts, and an ADC module with internal buffers that minimise data movement. For development support, Poppel references TI's Code Composer suite for the TMS470, which comprises a 470-specific compiler and linker but maintains the same graphical user interface as other versions of this popular environment. The company is also developing a peripheral configuration tool that will allow users to set up hardware without having to understand the registers that underpin each block, which, as anyone who has tried will know, can be a hugely time-consuming operation. For the future, Poppel confirms that TI will be developing silicon based on ARM's new Cortex architecture, probably sampling as soon as 2007.
ARM Seals New Deals
As we were closing for press, Actel announced that it now supplies a version of the ARM7TDMI-S (the -S signifies the synthesisable version) to suit its ProASIC3 Flash-based FPGAs. Dennis Kish, Actel's vice-president of marketing, said: "Up until now there hasn't been a synthesisable ARM solution in the FPGA space…people wanting to put a processor in an FPGA didn't have many choices, and especially not a popular choice like the ARM7, which is arguably the most used and well-known 32-bit core." While noting that FPGA design starts outnumber those of ASICs by about 40:1, Actel reckons that only around 15% of the annual 80,000 FPGA design starts embed a microcontroller.
The deal with ARM is significant because customers don't have to negotiate with the intellectual-property supplier, which has always zealously guarded its IP; the subtext is that ARM trusts Actel's Flash devices to protect their contents. Further protection comes from the deliverable's firm core format rather than Verilog or VHDL code, an approach that also guarantees the core's performance. It appears to users as a black box during the synthesis phase, when it's possible to tweak subsystems including the bus interface, peripherals, and I/O. Actel's new Windows-based CoreConsole tool permits users to set the system specification, with hardware development via its Libero integrated development environment and an Actel-specific version of ARM's RealView software tools. According to Kish, the core consumes about 6,000 ProASIC tiles or roughly 250,000 system gates. Actel will provide the core free to customers buying ARM-ready M7 ProASIC3 devices and expects to offer the core for other families. Its M7A3P250, M7A3PE600, and M7A3P1000