Feature
Dress your application for success
Choosing the right processor and logic implementation will drive your available trade-off and differentiation options for current and future generations of your application.
By Robert Cravotta, Technical Editor -- EDN, 11/8/2001

Dressing for success is about proclaiming your competence and differentiating skills that meet your customers' needs with the least expense and effort and the right amount of attention to detail. Choosing the right implementation for your application is no different; that is, unless you have a generous benefactor who is looking for no return on an investment. Wearing a tailor-made suit to an interview for a mechanic job may raise some doubts about your understanding of the job requirements and indicate your poor attention to detail. The most powerful or most expensive approach is not always appropriate; your implementation needs to support all of your application's requirements.
To succeed, an application for an embedded system must include the processor that executes the software, the hardware logic that executes those functions that software does not perform, and the peripherals that interface with the outside world. The processor is the foundation of your application, and it affects nearly every aspect of your design. Selecting a processor and logic implementation is one of the most important design decisions you'll make. This decision drives the available trade-off options for your current effort and future generations of your application. When you dress for success, people notice those things, such as ties and other accessories, that differentiate you from others. Many opportunities exist to differentiate your application, but your implementation decision constrains your choices.
What does "processor and logic implementation" mean in this context? To narrow the scope, the term refers here to how a software-programmable processor integrates with the other system components of your application. It abstracts away, and does not strongly distinguish among, the use of a DSP, a microcontroller, a DSP-microcontroller pair, a DSP-controller hybrid, a multiprocessor, or general-purpose processors in your design. Processor implementation determines whether your processor is a discrete device or an integrated block of logic residing with some or most of your system logic. Your processor can be a standard off-the-rack device, an alterable device, or even a custom-tailored device. As with clothing styles, your appropriate options and choices may change over time as the market and your competitors shift the criteria for what makes a differentiated product and what makes a commodity feature.
So many styles
Many vendors offer off-the-shelf, standard processors (references 1 to 4). Standard devices are readily available and cost-effective because their development costs are amortized over many projects. Supporting various clock rates, amounts of on-chip memory, and integrated peripherals, a family of processors can meet the requirements of a range of small to large projects while maintaining a company's investment in development tools and software libraries. Processor families can include socket-compatible, interchangeable packages, allowing a project to scale resources by substituting one chip for another without changing the board-level design.
Standard devices allow a development team to quickly start implementing its design with less up-front-investment commitment than alternative methods. The lack of a significant investment in a device mitigates some of the project risks if evolving requirements call for a new or modified hardware function. Designers understand the performance characteristics of standard devices, allowing a project to quickly focus on software development, board-level integration, and verification. Board-level integration, debugging, and component replacement are easier than their chip-level equivalents because the logic blocks and interfaces are more direct and accessible. Because many projects include standard devices, a larger pool exists of engineers who have experience with those devices. The stability and wide use of standard devices encourage the availability of strong software-development tools, libraries for common functions, and a third-party support infrastructure that can allow a project team to more quickly bring its product to market.
Sometimes, your project pushes the limit for meeting processor-performance, power-dissipation, space-constraint, or price criteria, so that board-level integration of fixed and discrete devices is insufficient. As device densities have continued to increase, so has the opportunity to integrate your discrete components from a board-level design into a single silicon SOC (system-on-chip) design, reducing the number of components and size of your finished product (see sidebar "Changing your mind"). A finished system having fewer components costs less to assemble and is more reliable. The system power consumption is usually lower than that of the combined total of the discrete components because the internal transistors are driving the loads inside the chip rather than the tracks on the pc board. These smaller loads allow faster operation rates on the internal signals than those between chips.
ASSPs (application-specific standard products) integrate specialized IP (intellectual property) with general-purpose features in a standard package, targeting applications that have sufficient volume and are mature enough that some of the required algorithm and feature implementations are no longer differentiating factors. These devices offer many of the benefits of integrated devices without the leadtimes, the commitment of up-front development and fabrication costs, or the NRE (nonrecurring-engineering) charges associated with respinning an ASIC. The communications, digital-cellular, and modem segments are big drivers of ASSPs. Consumer products, including DVD players, digital still cameras, and satellite box equipment, benefit from ASSPs. Life cycles for products such as digital cameras can be as short as six months, meaning time to market matters more than ever. ASSPs can lower system costs and allow a design team to focus on its value-added strengths; however, these devices also make it harder to differentiate your product, because your competitors have access to the same devices and cost benefits.
Configurable implementations generally integrate a soft- or hard-processor block with a programmable-logic component on the same standard device that supports the configuration and reprogramming of hardware logic. Atmel and Triscend integrate processor cores with programmable logic for coprocessing and peripheral support. Proceler embeds a soft-processor core in a PLD and implements optimized program loops in neighboring programmable logic that the company's C compiler generates. Xilinx and Altera offer PLDs with embedded-processor cores. Altera's Excalibur devices are making ARM, and soon MIPS, cores in a configurable implementation available to those applications that cannot justify implementing an ASIC.
These devices deliver the parallel-execution speed of dedicated hardware but offer a great deal of functional flexibility. Integrating peripherals with embedded-processor cores and programmable logic enables designers to implement entire SOC devices with higher system-level performance than discrete components and shorter design cycles than ASICs. In general, configurable implementations do not support integrated analog peripherals, which drives applications that need them to explore ASICs. Including support for analog peripherals would complicate the migration of programmable logic to advanced fabrication processes and would result in more expensive and slower devices. Cypress Microsystems' programmable-SOC devices currently support runtime-configurable analog peripherals for 8-bit devices, for which applications are more sensitive to peripheral selection than they are to higher process densities and processing performance.
In configurable implementations, the logic design remains flexible until the vendor ships the product, allowing rapid iteration during design. You can ship a product that meets the minimum requirements and add features after deployment. This approach is ideal for prototyping designs and releasing products destined for an ASIC implementation due to high-volume cost considerations. Reprogramming the programmable logic makes fixing bugs and administering upgrades in hardware analogous to those tasks in software. Redesigning the programmable logic, downloading the new logic into the system, and restarting it can support evolving protocols with the same device. This approach is configurable computing; reconfigurable computing goes one step further.
Reconfigurable implementations involve manipulating the programmable logic during runtime, allowing larger hardware designs with fewer overall gates. Chameleon Systems and Quicksilver Technologies market reconfigurable implementations. Siroyan offers a clustered array of ALUs as a reconfigurable-coprocessor core for algorithmically intense applications. A theoretical reconfigurable application could be a smart cellular phone that supports multiple protocols, so that automatic reconfiguration of the hardware would support phone service when a user passes from a geographic region using one protocol into a region using another. This scenario is appropriate when a design requires hardware to implement algorithms. Because you do not always have to instantiate all of the hardware logic, you can reduce the cost of supporting additional features to the cost of the memory you need to store the logic design.
However, runtime reconfiguration incurs significant latency. It takes careful scheduling to configure the hardware on exactly the right timetable to keep the software from stalling. Inertia might be the worst problem facing reconfigurable computing. Mature development and automation tools come only after designers adopt a technology. A researcher at the University of California, Los Angeles in the late 1960s proposed runtime reconfiguration. However, designers have been slow to adopt this approach because it has only recently reached viability outside academic research. One reason for the slow adoption is that gate densities have kept increasing while the approach tries to find an application niche, so it must compete with the whole spectrum of programmable and hardwired options.
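The scheduling problem described above can be sketched with simple arithmetic. The following Python fragment is only an illustration under stated assumptions: the function names and the millisecond figures are hypothetical, and a real device would also have to account for bus contention and for how long the previous configuration must stay active.

```python
def stall_ms(request_ms, load_latency_ms, needed_ms):
    """Return how long software waits for a hardware configuration.

    request_ms: time at which the configuration load starts
    load_latency_ms: time the load takes (hypothetical figure)
    needed_ms: time at which software first needs the reconfigured logic
    """
    ready_ms = request_ms + load_latency_ms
    # If the logic is ready early, there is no stall.
    return max(0, ready_ms - needed_ms)


def latest_request_ms(load_latency_ms, needed_ms):
    """Return the latest load start time that still avoids a stall."""
    return needed_ms - load_latency_ms


# Hypothetical example: a 20-ms load for logic that software needs at t = 100 ms.
print(latest_request_ms(20, 100))  # latest safe start time
print(stall_ms(90, 20, 100))       # a load started at t = 90 ms finishes late
```

In this sketch, starting the load too early is not penalized, although in practice it may evict logic that is still in use, which is why the timetable has to be "exactly right" rather than simply early.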
Custom-tailored devices differ from configurable devices in that they are hardwired ASICs that meet your application's requirements better than a generic, configurable platform. These devices can enjoy lower unit cost in high volume; can better resist reverse engineering and, hence, copying; and offer the greatest flexibility for trade-offs among performance, cost, packaging, and power consumption. However, these advantages come at a price. The use of ASICs requires a higher level of financial and technical commitment at a relatively early stage in the development of your project. You also lose some control over costs and leadtimes as you send the design for manufacture, but the savings that you accrue when a project reaches production can more than make up for the earlier investment. An area of concern for ASIC users is the risk they incur should the delivered chips not work or should delivery be seriously delayed. Using design software and simulators matched with the manufacturing process and working closely with your chip manufacturer go a long way toward mitigating these risks.
Using an ASIC is a decision to make a product more competitive than it would otherwise be or, more significantly, to make a product that would otherwise be impractical a commercial reality. This decision is most relevant in applications in which performance, size, power consumption, cost, or the ability to integrate analog functions on-chip is critical to a product's success and in which no other method meets the specification. Using an ASIC to reduce the cost of production parts demands careful partitioning of the design between the custom chip and the standard parts. Too much integration can limit your flexibility to respond to changes in the market; increase your costs over using standard parts, such as RAMs; or, worse, cut the viable production life cycle of your device and require you to incur additional NRE charges to spin a new device. The right mix of integration can result in your chip's serving as the platform for a variety of product offerings, amortizing your up-front development and fabrication costs across multiple generations or products.
Outside academic research, designers rarely create processor architectures from scratch to support an application. Tools are critical to the success of increasingly complex applications and represent a serious investment of resources and time to develop. A common solution is to license proprietary IP building blocks and an appropriate processor core from suppliers that have invested the resources to qualify the core and develop the tools to support your application development. Cores that designers developed under an open-source model are also available (see sidebar "Open source and compatible cores"). The availability and use of IP cores can significantly reduce the time it takes to get your application to market and allow you to focus your resources on your value-added strengths.
Licensable processor cores cover DSPs and microprocessors in hard and synthesizable formats. Hard cores target one manufacturing process and lock in the processor instruction set and features. Some standard devices use ARM- and MIPS-processor cores. You can configure and extend synthesizable cores for whatever manufacturing process you choose, but the extra flexibility requires more effort from your design team to implement and verify the system. DSP cores, such as those available from 3DSP, BOPS, and LSI Logic, provide dedicated DSP functions and support pairing with a host-processor core for application control. Arc and Tensilica include extensions to add DSP functions to their cores.
You need not limit processor configuration to including or excluding peripherals, memory, caches, execution units, interfaces, or even multiple processors. You can also include extensions to the instruction set. Eliminating unneeded components or even instructions can reduce real-estate requirements and power consumption of your device. If you get too aggressive in removing logic blocks, you can throw off your circuit timing. Removing the logic for those instructions you do not need may reduce your power consumption but may also severely limit your ability to use legacy software or purchased algorithms that might use those instructions in later generations of your product. Creating extended instructions that execute multiple tasks in parallel can provide the needed performance boost; reduce your code size, requiring less memory and allowing a smaller device; or even allow you to run your processor at lower clock speeds to avoid parasitic effects and reduce your power consumption. Carefully document your design and assumptions for these special instructions, because they can fail and require reengineering as you increase the clock speed. With flexibility comes greater opportunity to optimize your system for those first-order constraints but at a cost of greater responsibility and forethought about how your customizations will affect your downstream support efforts.
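To see how an extended instruction can translate into a lower clock speed, consider a rough cycle budget. The Python sketch below is illustrative only: the 64-tap filter, the 48-kHz sample rate, and the cycle counts per tap are hypothetical numbers chosen for the example, not figures for any particular core.

```python
def required_clock_hz(cycles_per_sample, samples_per_second):
    """Return the minimum clock rate needed to finish each sample's work in time."""
    return cycles_per_sample * samples_per_second


TAPS = 64             # hypothetical filter length
SAMPLE_RATE = 48_000  # hypothetical samples per second

# Assumption: the baseline core spends 4 cycles per tap
# (two loads, a multiply, and an add).
baseline_hz = required_clock_hz(TAPS * 4, SAMPLE_RATE)

# Assumption: an extended instruction performs the same work in 1 cycle per tap.
extended_hz = required_clock_hz(TAPS * 1, SAMPLE_RATE)

print(baseline_hz / 1e6, "MHz without the extension")
print(extended_hz / 1e6, "MHz with the extension")
```

Under these assumptions the extension cuts the required clock rate by a factor of four for this one loop. The same headroom could instead be spent on additional features at the original clock rate.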
What look fits you?
Which implementation strategy should you choose? Your decision relies on a number of interdependent and often conflicting requirements specific to your application and how you differentiate it. Among the factors you should consider are time to market, system performance, system cost, power dissipation, size constraints, peripheral support, and obsolescence.
Time to market is a two-edged sword. If you bring a product to market before your customers are ready for it, whether because the support infrastructure is insufficient or because cultural inertia resists the revolutionary change your product represents, you may be laying the groundwork for your competitor to reap the rewards. Missing the market window or delivering your product late because of a long development cycle can hurt the product's profitability over its life. A late delivery can have a greater impact on profits than development or product-cost overruns.
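One simplified way to reason about the cost of a late delivery is a triangular market-window model. This model is an assumption introduced here for illustration, not something this article derives: demand rises linearly to a peak at time W and falls linearly to zero at time 2W, and a product that enters D time units late climbs at the same rate from its entry point until W and then declines to zero at 2W. The function name and the week figures below are hypothetical.

```python
def revenue_loss_fraction(delay, half_window):
    """Return the fraction of lifetime revenue lost to a late market entry.

    Assumes the triangular market model described above, with unit slope
    on the rising edge. delay and half_window use the same time unit.
    """
    if not 0 <= delay <= half_window:
        raise ValueError("delay must lie between 0 and half_window")
    # On-time revenue: triangle with base 2W and height W.
    on_time = half_window ** 2
    # Delayed revenue: triangle with base (2W - D) and height (W - D).
    delayed = 0.5 * (2 * half_window - delay) * (half_window - delay)
    return (on_time - delayed) / on_time


# Hypothetical example: a one-year market window (W = 26 weeks), 4 weeks late.
print(round(revenue_loss_fraction(4, 26), 3))
```

Algebraically the loss simplifies to D(3W - D) / (2W^2), so under this model even a modest delay costs a disproportionate share of lifetime revenue.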
A critical strategy for vendors introducing instruction-set architectures is to support them in familiar development-tool suites, which eases use and reduces the amount of material to learn. A strong tool chain can improve your time to market by enabling a large pool of experienced and skilled designers to start your project, supporting a strong third-party development infrastructure, facilitating an automated design flow, and aiding in porting your legacy code. Reuse is a powerful method of reducing design time, and a growing market exists for ready-to-use software algorithms, peripheral drivers, operating systems, and licensable IP blocks for circuit design. If time to market is your primary driver, an approach using standard devices reduces your risk with well-defined performance and interface characteristics, the ability for short iteration cycles through reprogramming or component replacement, board- versus chip-level system integration, the availability of legacy or purchased software modules, and a fast production ramp-up. The long leadtimes associated with developing ASICs make PLDs a viable approach for prototyping and early release for many applications that you cannot complete using standard devices.
Applications requiring high system performance either execute many tasks sequentially and quickly or perform the same work more slowly but in parallel. Superscalar and pipelined architectures derive their performance by maintaining high clock rates and partially executing multiple instructions at each clock cycle. High clock rates mean more logic transitions, possibly higher power consumption, and the need to compensate for parasitic effects. PLD and ASIC implementations can also deliver higher performance by creating specialized parallel data flows and instruction execution, implementing multiple processor blocks, and targeting advanced fabrication processes. The appropriateness of each method depends on the data and processing characteristics. For example, using a Reed-Solomon algorithm for forward-error correction lends itself to parallelism, whereas more recursive algorithms, such as recursive least squares, benefit from higher clock speeds for processing the sequential instructions.
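Amdahl's law gives a quick feel for why the data and processing characteristics matter. The Python sketch below is a minimal illustration: the parallel fractions are hypothetical and are not measurements of Reed-Solomon or recursive-least-squares implementations.

```python
def amdahl_speedup(parallel_fraction, units):
    """Return the overall speedup predicted by Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0.0 to 1.0)
    units: number of parallel execution units (1 or more)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


# Hypothetical workloads running on eight parallel hardware blocks.
print(round(amdahl_speedup(0.9, 8), 2))  # mostly parallel work
print(round(amdahl_speedup(0.1, 8), 2))  # mostly sequential work
```

When most of the work is sequential, extra parallel hardware buys very little, and a higher clock rate is the more effective lever. When most of the work is parallel, added hardware blocks pay off even at modest clock rates.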
Another consideration for high-performance systems is slower off-chip interfaces and I/O-bandwidth constraints. Processors can employ a range of special buses, DMA engines, and multilevel caches to minimize those constraints. If a software implementation can adequately support your system performance, consider staying with software to retain flexibility. On the other hand, using hardware blocks, such as a floating-point unit, is a trade-off that can make headroom for your software.
System cost is more than totaling the bill-of-materials and assembly charges; it is a function that also includes design charges, NRE charges that can make design iterations costly, production-support costs, and—if applicable—royalty charges. Using standard devices has lower up-front development costs and avoids the NRE charges, but a product with large volumes can justify an ASIC implementation. An ASIC may also make sense for lower volume applications that have high margins and cannot deliver using a standard PLD. Custom devices can minimize or eliminate the amount of unused logic in an implementation that is common when using standard devices, but a system-cost analysis should also include the minimum order size; inventory-holding charges; and scrap charges, especially for custom devices, long leadtimes, single sources, upswings in demand, and end-of-life buys.
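The volume argument above can be made concrete with a simple break-even calculation. The following Python sketch is a deliberately reduced model, not a costing method from this article: the function name and all dollar figures are hypothetical, and it ignores minimum order sizes, inventory, scrap, and royalty charges.

```python
import math


def break_even_volume(nre_cost, asic_unit_cost, standard_unit_cost):
    """Return the smallest volume at which an ASIC's total cost is no higher
    than the standard-device total cost, or None if it never catches up.

    Illustrative model only: total cost = fixed NRE + volume * unit cost.
    """
    savings_per_unit = standard_unit_cost - asic_unit_cost
    if savings_per_unit <= 0:
        return None  # no per-unit saving is available to recover the NRE
    return math.ceil(nre_cost / savings_per_unit)


# Hypothetical figures: $500,000 NRE, $4 per ASIC, $9 for the standard-device solution.
print(break_even_volume(500_000, 4.0, 9.0))
```

Below the break-even volume the standard-device solution is cheaper in this model. A fuller analysis would add the other terms listed above, which tend to push the break-even point higher for custom devices.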
Power dissipation is important for portable applications that rely on battery power and for high-density applications, such as central-office networking equipment, for which power budgets and thermal management are critical. Besides lowering clock rates to minimize power dissipation, standard devices with power management can selectively disable logic blocks and integrated peripherals on the chip. Custom designs derived from soft or configurable cores can implement additional power-saving modes and eliminate unneeded blocks to reduce chip-area requirements and lower system power dissipation. PLDs are relatively power-hungry devices that burn extra power at transients because of buffering at interconnects; however, using a PLD can lower system power dissipation by consolidating and eliminating enough devices from a discrete design.
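The effect of lowering the clock rate and disabling blocks follows from the first-order relation for CMOS dynamic power, P = a * C * V^2 * f, where a is the switching activity, C the switched capacitance, V the supply voltage, and f the clock frequency. The Python sketch below applies that relation per block. The block names, capacitances, and activity factors are hypothetical, and the model ignores static (leakage) power.

```python
def dynamic_power_watts(blocks, supply_volts, clock_hz):
    """Sum activity * C * V^2 * f over the blocks that are enabled."""
    total = 0.0
    for block in blocks:
        if block["enabled"]:
            total += (block["activity"] * block["cap_farads"]
                      * supply_volts ** 2 * clock_hz)
    return total


# Hypothetical on-chip blocks; every value here is illustrative only.
blocks = [
    {"name": "cpu_core", "activity": 0.20, "cap_farads": 1.0e-9, "enabled": True},
    {"name": "uart",     "activity": 0.05, "cap_farads": 0.2e-9, "enabled": True},
    {"name": "adc_ctrl", "activity": 0.10, "cap_farads": 0.3e-9, "enabled": True},
]

full_speed = dynamic_power_watts(blocks, 3.3, 50e6)

# Disable two peripherals and halve the clock rate.
blocks[1]["enabled"] = False
blocks[2]["enabled"] = False
low_power = dynamic_power_watts(blocks, 3.3, 25e6)

print(full_speed, low_power)
```

Because the supply voltage enters as a square, a design that can also lower its voltage at the reduced clock rate saves more than this frequency-only example shows.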
Smaller devices enable embedded control and signal processing in a growing number of size-constrained applications, such as flash-memory cards, in which device area drives cost. Placing distributed controllers closer to sensors and control points eliminates unneeded wiring and can reduce assembly and repair charges. Using the most advanced fabrication process is one way to minimize a device's size. Minimizing memory requirements with a processor that has an instruction-set architecture that supports code compression is another way to reduce a device's size, cost, and power dissipation.
Integrated peripherals contribute to a smaller system design and lower cost. Selecting an 8-bit processor depends significantly on the integrated peripherals. PLDs enable a custom mix of digital peripherals for your application, but you should consider an ASIC if your application needs integrated analog peripherals.
It is a dream to forever generate revenue from a single design implementation. The truth of the matter is that competition prevents this scenario from happening. Your implementation will become obsolete, and you need to consider how your decisions today will affect your move to the next generation of your product. Despite your efforts to make your product flexible and extensible, obsolescence generally results from the need to change features, include new features, or drop your costs to keep up with downward pricing pressures. Choosing a family of devices that range in package size and have a road map that stays ahead of your needs can save time and money as the design grows. The value of package-migration capability within a family lies in getting larger resources in the same package footprint.
Unless you need the highest performance, a faster processor compensates for a less-than-perfect conversion efficiency of your legacy portable and compilable code. Recompiling is an opportunity to reoptimize your legacy code. Reusing software that you have integrated into a production system can be safer, faster, and more cost-effective than building it from scratch or rewriting and certifying it. Using portable code expands the pool of experienced designers who can support your next-generation design, because there is no guarantee that you will have the same team for the updated design. If you design custom hardware logic, consider the scalability issues and the effect they can have on your legacy software. Consider that if you use many ports in your register file, it can complicate implementing a superscalar multipipeline approach.
A more drastic form of obsolescence occurs when your product's once-unique feature becomes a commodity, and everyone has it. Protecting your proprietary IP investment and product life span becomes a serious consideration in this situation. It is relatively easy to copy a traditional board-level design containing standard components, unless you remove identification from key components. By comparison, the techniques for extracting and reproducing functions from a custom integrated device are not usually economically worthwhile.
Choose your wardrobe
We make the best decisions we can with the information available. However, the market does not always support our assumptions for long. As the market changes the viability of your choices, remember that it may eventually make sense to migrate your implementation or change your differentiation focus. Cost alone is not usually a sustainable differentiator, and changing from one processor and logic implementation to another is less a "greener-pasture" issue than a sign that your own pasture is brown.
Avoid too narrow a point solution and develop your implementation to be a platform or framework to leverage between generations and products. Your legacy investment is substantial (or will be) and can represent a competitive resource if you strategically use it. The quality of your tools, design processes, development support infrastructure, and what they can produce are in many ways as important as the price and performance of your current implementation as they affect downstream considerations.
You would not choose your shoes without some idea of where you are going, what you will be doing, and what other clothes you are going to wear. Likewise, choosing the right processor and logic implementation requires you to apply a system perspective that considers not just the hardware and software, but where the implementation fits in the system, what it is going to do, and how your product is likely to evolve. Integrating functions is not limited to electronic components. Sometimes, it makes sense to eliminate a moving mechanical part in your system, trading it for higher complexity in your electronics. Your system perspective should go beyond the product and should include the organizational stakeholders, such as engineering, procurement, manufacturing, quality assurance, marketing, sales, and technical support. Your implementation decision can affect all these groups, and they in turn can affect the success of your product.
| For more information... | ||
| When you contact any of the following manufacturers directly, please let them know you read about their products in EDN. | ||
| Altera 1-408-544-7000 www.altera.com | Analog Devices 1-781-329-4700 www.analog.com | Arc 1-408-361-7800 www.arccores.com |
| ARM 1-408-579-2200 www.arm.com | Atmel 1-408-441-0311 www.atmel.com | BOPS 1-888-890-2677 www.bops.com |
| Chameleon Systems 1-408-240-3300 www.chameleonsystems.com | Cypress Microsystems 1-425-939-1000 www.cypressmicro.com | DSP Architectures 1-360-573-4084 www.dsparchitectures.com |
| Gnu www.gnu.org | Lexra 1-408-573-1890 www.lexra.com | Linux www.linux.org |
| LSI Logic 1-866-574-5741 www.lsilogic.com | Microchip 1-480-792-7200 www.microchip.com | MIPS Technologies 1-650-567-5000 www.mips.com |
| Motorola 1-512-895-2000 www.motorola.com/semiconductors | OpenCores www.opencores.org | picoTurbo 1-408-586-8801 www.picoturbo.com |
| PMC-Sierra 1-408-565-0300 www.pmcsierra.com | Proceler 1-510-540-1740 www.proceler.com | Quicksilver Technologies 1-408-574-3350 www.qstech.com |
| Siroyan +44 0 118-949-7028 www.siroyan.com | STMicroelectronics 1-781-861-2650 www.st.com | Tensilica 1-408-986-8000 www.tensilica.com |
| Texas Instruments 1-800-477-8924, ext 4500 www.ti.com | 3DSP 1-949-435-0600 www.3dsp.com | Triscend 1-650-968-8668 www.triscend.com |
| Xilinx 1-408-559-7778 www.xilinx.com | ||
| Author Information |
You can reach Technical Editor Robert Cravotta at 1-661-296-5096, fax 1-661-296-1087, e-mail rcravotta@cahners.com. |