
Table 2 EDN µP/µC Directory: 16-Bit Chips
The register-based H8 includes three lines: the H8/300, H8/300L (8-bit µCs with 16-bit instruction words and 16-bit ALU), and the H8/300H, a 32-bit processor. The H8 has eight general-purpose registers supplemented by a PC, PSW, and address-paging registers. These registers are not part of a register-banking or third addressing-space scheme.
The 8-bit H8/300 and 300L chips treat registers as either 8 or 16 bits, referencing registers as a set of eight 16-bit registers or 16 8-bit registers. The 300H registers are accessible as 8, 16, or 32 bits. On the 300H, the 8- or 16-bit-wide external data path is dynamically resizable.
The 300, 300L, and 300H have a fixed instruction word (with a supplemental word for additional data) and a RISC-like load/store architecture.
All CPUs have a single, unified address space: the 300 version addresses up to 64 kbytes; the 300H addresses up to 16 Mbytes. The 300L can only access on-chip memory. The address space includes a 128-byte register file to access on-chip peripherals as memory-mapped I/O.
Power management: The CPU has a sleep mode and hardware and software standby modes. In sleep, CPU operation halts, register and RAM contents remain unchanged, and peripherals continue to function. In standby, CPU and peripheral operations halt, and register and RAM contents remain unchanged.
Special instructions: The 300, 300L, and 300H are code compatible and all share a common instruction base, mnemonics, and basic addressing philosophy. Bit-manipulation instructions include set, clear, test, and various logic operations. Math functions include add, subtract, increment, decrement, decimal adjust, multiply, divide, and extend sign. The H8 devices also perform block moves.
The register-based architecture of the 8086 has 14 16-bit registers, organized into four general-purpose, four pointer, four segment, and two special registers. The CPU addresses each general-purpose register as a 16-bit register or two 8-bit registers. The segment registers point to code, stack, and two local data segments.
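The dual 8-/16-bit view of a general-purpose register can be sketched in a few lines of Python. This is only an illustrative model, not vendor code: the function names are hypothetical, and the example simply treats a 16-bit register as a high byte and a low byte.

```python
def split_register(value16):
    """Return the (high, low) bytes of a 16-bit register value."""
    value16 &= 0xFFFF
    return (value16 >> 8) & 0xFF, value16 & 0xFF


def join_register(high, low):
    """Combine two 8-bit halves into one 16-bit register value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


# Splitting and rejoining a value leaves it unchanged.
high, low = split_register(0x12AB)
whole = join_register(high, low)
```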
The core architecture breaks down into two units: the processor execution unit and the bus-interface unit, which communicates asynchronously with the outside world via an 8- or 16-bit multiplexed system bus. A 6-byte instruction prefetch queue holds pending instructions that the bus-interface unit has fetched for the execution unit.
The 8086/186 can run concurrently with a math coprocessor (8087/187) that adds floating-point and transcendental operations. The 8087/187 has eight 80-bit registers that the CPU can access either as a stack or as discrete registers. The 8087/187 shares the system bus and uses three chip-select pins for handshaking with the CPU. To reduce costs, AMD removed the coprocessor interface from the 80C186/188 and 186/188EM devices.
All memory addressing is base relative, which is a help for embedded code, because code is easily relocatable (you change the address base to relocate). Address segmentation lets the CPU address up to 1 Mbyte of memory. A 16-bit offset (supporting a 64-kbyte segment) is added to the segment base address (segment register shifts four bits left) to attain a 20-bit address. The CPU bus supports multiprocessing. The local-bus controller deploys a HOLD/HLDA protocol that enables another bus master, typically DMA, to take over the common system bus.
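The segment-plus-offset calculation described above can be sketched as follows. The function name is hypothetical. The mask models the 20-bit address bus, so a sum that exceeds 1 Mbyte wraps around to the bottom of memory.

```python
def physical_address(segment, offset):
    """Form a 20-bit address from a 16-bit segment and a 16-bit offset."""
    base = (segment & 0xFFFF) << 4            # segment register shifted four bits left
    return (base + (offset & 0xFFFF)) & 0xFFFFF   # keep only 20 address bits


# Relocating code only changes the segment base; offsets stay the same.
before = physical_address(0x1000, 0x0020)
after = physical_address(0x2000, 0x0020)
```

Because every offset is relative to the segment base, moving a block of code means loading a different segment value and leaving the offsets inside the code untouched.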
Power management: The 186 has two power-saving modes: idle and powerdown (Intel versions only). Idle shuts off the CPU clock, leaving all integrated peripherals active. Powerdown disables the clock input. In addition, you can program a divider that reduces the internal processor frequency (by a factor of up to 256) and slows all internal logic.
Special instructions: Math instructions include signed and unsigned multiply and divide, add, subtract, BCD, and decimal adjust. The 80x86 supports register exchange and a repeat prefix for repeating string operations (execute until zero or equal). WAIT examines the TEST pin and suspends instruction execution if the pin is HIGH.
Second sources: For the 8086: Fujitsu (San Jose, CA), Matra (Santa Clara, CA), Siemens (Cupertino, CA), and Oki (Sunnyvale, CA). For the 80186: AMD (Austin, TX) and Siemens (Cupertino, CA). (Chips and Technologies (San Jose, CA), NEC (Mountain View, CA), Sharp (Mahwah, NJ), and Vadem (San Jose, CA) make code-compatible µPs and µCs.)
Instructions can have one, two, or three operands. Some instructions are more than one word. Register windowing helps minimize instruction size by letting eight bits address a register in a movable window.
The address space of the MCS-96 works with both 8- and 16-bit external data buses. The external bus multiplexes data and address lines, so a buffer must hold the address stable during data transfers. However, the 8xC196NP has a demultiplexed external bus. An on-chip memory controller lets the MCS-96 use a range of memory types and speeds. External memory wait states are programmable.
The CPU can use autoprogramming to program the internal EPROM via an 8-bit external data interface. All MCS-96 chips (except the Mx) have a full duplex serial port, which the 196Kx uses to program the µC.
The MCS-96 has an event processor array that contains two 16-bit timers and 10 capture/compare modules. An event interrupt generates edges, starts A/D conversions, and resets timers. A high-speed I/O structure has up to four input and six output timer/counter-driven lines. A peripheral-transaction server is a microcoded hardware-interrupt handler for responding to data transfers, starting an ADC, etc.
Power management: The MCS-96 has two power-saving modes: idle and powerdown. Idle shuts off the CPU clock, leaving all integrated peripherals active. Powerdown disables the clock input.
Special instructions: Math instructions include add, subtract, multiply, divide, and multiply and accumulate (MAC). Special instructions include a block move of data, indirect-autoincrement addressing, and a table-indexed jump, which lets you jump via a table value.
Second source: IBM Microelectronics (Fishkill, NY).
The 16-Mbyte address space divides into 256 64-kbyte banks. The high-order bits of a 24-bit address reference the bank; this field is supplied by an 8-bit program- or data-bank register. Bank 0 holds the special-function registers, internal RAM, and internal ROM. In single-chip mode, executing from on-chip ROM and RAM, the CPU has only one 64-kbyte bank. For debugging, the chip can run in µP mode, in which it executes from off-chip program memory.
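As a rough illustration of the banked layout described above, the following Python sketch concatenates an 8-bit bank number with a 16-bit offset to form a 24-bit address. The function name is hypothetical, and the sketch ignores which of the two bank registers supplies the bank number.

```python
def banked_address(bank, offset):
    """Concatenate an 8-bit bank number with a 16-bit in-bank offset."""
    return ((bank & 0xFF) << 16) | (offset & 0xFFFF)


# 256 banks of 64 kbytes each cover the full 16-Mbyte space.
top_of_memory = banked_address(0xFF, 0xFFFF)
```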
The 37700 has a 256-byte "direct page" for time-critical routines. This page can lie in the first 64-kbyte memory bank or straddle the boundary between the first and second banks. The 16-bit direct-page register points to the base (lower) address of the direct page. Accessing the direct page using the direct-page register is faster and takes only 2 bytes.
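A minimal sketch of direct-page addressing, assuming the effective address is simply the 16-bit direct-page base plus an 8-bit operand and that the sum is not wrapped at 64 kbytes (which is what would let a page spill into the second bank). The function name is hypothetical.

```python
def direct_page_address(direct_page_base, offset8):
    """Add an 8-bit operand to the 16-bit direct-page base address.

    Assumption: no wraparound at 64 kbytes, so a base near the top
    of the first bank produces addresses in the second bank.
    """
    return (direct_page_base & 0xFFFF) + (offset8 & 0xFF)


inside_bank0 = direct_page_address(0x0200, 0x10)
across_banks = direct_page_address(0xFF80, 0xFF)
```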
The external-memory bus can be multiplexed or demultiplexed. For a 16-bit address, the bus is not multiplexed; it uses 16-bit addresses and 8-bit data. The CPU can access 16-bit data from odd or even bytes, but performance degrades when using an odd byte.
Power management: The 37700 has two power-saving modes: wait and stop. During wait, oscillation continues, the internal clock stops, and the integrated peripherals are active. In stop, oscillation stops and peripherals are disabled.
Special instructions: The 37700's bit-manipulation instructions include bit set, clear, and test for certain flag bits. Math instructions include unsigned multiply and divide, add, subtract, and decimal adjust. The 37700 performs register A and B exchange and a forced execution breakpoint.
The stack can switch between a user stack and an interrupt stack. A frame pointer indicates the end of a stack region that a specific subroutine uses. This register is used in conjunction with enter and exit instructions necessary to set up a particular stack frame.
The external bus is 16 bits wide and nonmultiplexed. The CPU can access 16-bit data from odd or even bytes. A 32-bit data access to external memory is automatically divided into two 16-bit accesses. The CPU or DMAC can be a bus master; arbitration is accomplished without adding extra bus cycles. On-chip peripherals are accessed as memory-mapped I/O.
Power management: The M16 has two power-saving modes: sleep and stop. In sleep mode, the internal clock deactivates, peripherals remain active, the internal CPU state is maintained, and the built-in DRAM controller continues to perform refresh cycles; recovery is via NMI or reset. Stop mode stops the internal clock and source oscillation; recovery requires a reset.
Special instructions: Bit-manipulation instructions include bit set, clear, invert, search bit, extract, insert, test, and compare bit field (signed and unsigned). The M16 performs math instructions such as add, subtract, signed and unsigned multiply and divide, and negate. Special queue-related instructions manipulate a queue consisting of double-linked linear lists.
An integrated MAC unit consists of a 16-bit multiplicand register, a 16-bit multiplier register, a 36-bit accumulator, and two 8-bit address mask registers. It performs a MAC cycle in 720 nsec. The MAC unit uses a simplified form of modulo addressing to implement finite-impulse-response (FIR) filters and circular buffers.
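The idea of pairing a multiply-accumulate loop with mask-based modulo addressing can be sketched in Python as follows. This is an illustrative model, not the chip's actual instruction behavior: the function name and argument layout are hypothetical, the buffer length is assumed to be a power of two so that a bit mask can wrap the index, and the 36-bit accumulator width is not modeled.

```python
def fir_output(samples, coeffs, newest, mask):
    """Compute one FIR output from a circular sample buffer.

    samples: circular buffer whose length is a power of two
    coeffs:  filter taps; coeffs[0] applies to the newest sample
    newest:  index of the most recent sample in the buffer
    mask:    len(samples) - 1, wraps the index like an address mask
    """
    acc = 0
    for k, coeff in enumerate(coeffs):
        acc += coeff * samples[(newest - k) & mask]   # one MAC step per tap
    return acc


buffer = [1, 2, 3, 4]                                 # newest sample is buffer[1]
moving_sum = fir_output(buffer, [1, 1, 1, 1], newest=1, mask=3)
```

Masking the index replaces an explicit compare-and-reset on every step, which is why this style of buffer suits a tight MAC loop.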
The 68HC16 has a modular architecture built on the internal intermodule bus (IMB), which simplifies the addition of on-chip peripherals. Bus protocols are based on the 68020 bus. The IMB contains circuitry to support exception processing, address-space partitioning, multiple interrupt levels, and vectored interrupts. The 68HC16 has a system-integration module that supports an external memory interface with 20 address lines, 16 data lines, and up to 12 programmable chip selects. The module includes watchdog and periodic timers and a PLL that boosts a 32-kHz or 4-MHz crystal to a 17-MHz internal clock. On-chip peripherals are memory mapped and accessed through dedicated peripheral registers.
Program and data can share a common address space or use two separate spaces. Each space divides into 16 64-kbyte banks. The 68HC16's addressing space expands to 1 Mbyte--or 2 Mbytes for separate code and data spaces for larger applications. Instructions align on even boundaries and use big-endian addressing. The CPU accesses words on word or byte boundaries.
Power management: WAIT reduces current by stopping CPU execution while leaving the clock running. LPSTOP stops the clock.
Special instructions: The 68HC16 performs bit manipulation via instructions such as bit set, clear, and test. It also supports a variety of math instructions such as add, subtract, BCD, DAA, and signed and unsigned multiply and divide. A background operating mode uses special debugging instructions.
NEC's K0, K2, and K3 µCs have a 16-bit PC and stack pointer; the K4 has a 20-bit PC and a 24-bit SP. The program status words (PSWs) in the K0/K2 and K3/K4 µCs are 8 and 16 bits, respectively. K0 and K2 chips operate around four banks of eight 8-bit registers; K3 and K4 have a base of eight banks of 16 8-bit registers. These registers can be paired to function as 16-bit registers. Additionally, the K4 can combine any four of the 16-bit registers (actually four pairs of 8-bit registers) with 8-bit extension registers--and use the combination registers for 24-bit address specification.
The register banks are in on-chip RAM along with directly accessible RAM. The CPU addresses the registers symbolically as the current register bank or as memory. On-chip peripherals are memory-mapped and are accessed either by main memory addressing or by special-function-register addressing. All families separate RAM into fast RAM inside the execution unit and separate data RAM. The fast RAM includes 128-byte register and 128-byte data RAM. For context switching with the K3/K4 families, you can specify an alternate register bank as part of the interrupt vector itself. However, with the K0/K2 chips, the context switch is accomplished via a bank-select instruction executed prior to branching to the ISR referencing the new bank.
K3/K4 µCs contain a 3-byte instruction-prefetch queue. The bus-control unit can fetch an instruction byte from memory during cycles in which the execution unit is not using the memory bus.
Power management: The series has two power-saving modes: halt and stop. In halt mode, the CPU stops executing while all peripherals continue to operate. In stop mode, only the subsystem clock (if used) and interrupts operate. In addition, the K4 has an idle mode, in which the oscillation circuitry continues to operate but the entire system stops. All 78K devices have a programmable clock divider to conserve power during less performance-demanding operations.
Special instructions: Bit-manipulation instructions are bit set, clear, complement, test, and various logic operations. Math instructions include add, subtract, multiply, divide, and decimal adjust. K0/K2 CPUs handle 16-bit operations by pairing adjacent registers in banks. Their 16-bit arithmetic operations include ADDW, SUBW, INCW, DECW, and SHR/LW. K3/K4 µCs can perform a 16x16 multiply and a 32x16 divide, as well as MAC and multiply-and-subtract instructions. Hardware implements a 32-word branch-destination address table that adds a level of indirection to branches and subroutine calls. This option is useful when making frequent calls to specific subroutines, because the special call instruction (CALLT) is a 1-byte instruction versus a standard 3-byte call to a direct address.
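The code-size trade-off behind the 1-byte table call can be worked through with a short Python sketch. It assumes that each table entry occupies one word (2 bytes) and that the routine would otherwise be reached with the standard 3-byte call. The function name is hypothetical.

```python
DIRECT_CALL_BYTES = 3    # standard call to a direct address
TABLE_CALL_BYTES = 1     # CALLT instruction
TABLE_ENTRY_BYTES = 2    # assumed: one word per table entry


def table_call_savings(call_sites):
    """Bytes saved by routing every call to one routine through the table."""
    direct_cost = DIRECT_CALL_BYTES * call_sites
    table_cost = TABLE_CALL_BYTES * call_sites + TABLE_ENTRY_BYTES
    return direct_cost - table_cost


savings_for_ten_calls = table_call_savings(10)
```

Under these assumptions a routine called from a single site breaks even, and each additional call site saves 2 bytes.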
A 24-bit PC provides access to up to 16 Mbytes of linear unsegmented code space. A 7-byte prefetch queue holds pending instructions. Two segment registers provide the upper 8 address bits used for accessing up to 16 Mbytes of data memory. Memory-mapped special-function registers (SFRs) control and monitor on-chip peripherals. Processor stacks include one for supervisor code and one for applications; the stacks, up to 64 kbytes in size, can reside in on- or off-chip memory.
The external data bus is configurable for 8- or 16-bit accesses and includes a programmable-wait-state generator. Depending on the amount of code space required for your application, some of the upper address lines can be configured as I/O ports.
Power management: Software controls idle and power-down modes: Idle shuts down processor functions but leaves most of the on-chip peripherals and external interrupts functioning; power-down mode shuts down everything--including the on-chip oscillator.
Special instructions: The 80C51XA performs extensive bit manipulation via instructions such as jump on bit set or clear, set, clear, move, AND, and OR. Math instructions include add, subtract, 16x16 multiply and 32x16 divide (signed and unsigned), and 32-bit shifts. The XA also has instructions to normalize and sign extend operands for floating-point support, move data blocks, jump double indirect, breakpoint and trap, and reset.
The main core of the CPU consists of a four-stage pipeline (fetch, decode, execute, writeback), a one-cycle barrel shifter, and a fast multiply/divide function unit. Pipeline stages clock in 100-nsec cycles, so most instructions appear to execute in a single cycle. Instruction latency is four cycles, or 400 nsec.
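The difference between latency and throughput in this pipeline can be checked with a small calculation. The sketch below assumes an ideal four-stage pipeline with 100-nsec stages and no stalls. The function name is hypothetical.

```python
STAGES = 4          # fetch, decode, execute, writeback
CYCLE_NSEC = 100    # time per pipeline stage


def pipeline_time_nsec(instructions):
    """Total time for a run of instructions in an ideal, stall-free pipeline."""
    if instructions == 0:
        return 0
    # The first instruction needs all four stages; each later one adds a cycle.
    return (STAGES + instructions - 1) * CYCLE_NSEC


single = pipeline_time_nsec(1)
hundred = pipeline_time_nsec(100)
```

A single instruction takes the full 400-nsec latency, but 100 back-to-back instructions take 10,300 nsec, or about 103 nsec each, which is why most instructions appear to execute in a single cycle.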
The CPU uses code segmentation and data paging to address up to 256 kbytes (the 166) or 16 Mbytes (165/167) of the unified instruction-data memory space. The external-memory bus controller has four programmable modes, chip selects, and a wait-state generator. You can partition physical memory into multiple segments and five address ranges (166 only has two), each having a different type of memory with or without wait states. A hold/acknowledge mechanism on the external bus can be programmed so external devices take control for critical data transfers. A system stack of up to 512 bytes stores temporary data.
Instructions are 2 or 4 bytes long. The µCs can handle a 4-byte instruction fetch from on-chip ROM in one 100-nsec stage. A single fetch gets an entire instruction. However, because the 16-bit external bus permits only a single-word access, off-chip program accesses suffer at least a one-cycle stall for a 4-byte instruction.
The 166/165/167 µCs cache branch-target instructions and use them to supply the next iteration of a branch, allowing execution without pipeline stalls. First-pass loop branches pay a single-cycle penalty. Nonaligned, double-word, branch-target instructions also pay a one-cycle penalty.
Power management: The 166/165/167 µCs have two power-saving modes: idle and powerdown. Idle shuts off the CPU clock, leaving all integrated peripherals active. Powerdown disables the clock input. Idle mode can be terminated by any reset or interrupt request; powerdown can only be terminated by a hardware reset.
Special instructions: Bit-manipulation instructions include bit set, clear, move, and various logical operations. Math instructions are add, subtract, 16x16 multiply and divide, and 32x16 divide. The µCs can perform up to 15 shifts or rotates in one instruction cycle. Every jump has 16 different conditions.
Second source: SGS-Thomson (Phoenix, AZ).
The TLCS-900 is backward compatible with the TLCS-90 but offers a substantial performance increase by using a three-stage pipeline in combination with a 4-byte prefetch queue. The 32-bit maximum mode accommodates large-scale arithmetic and addressing (16 Mbytes) with a basic 16-bit CPU. Additionally, the hardware maintains two 32-bit stack pointers, one for system code and the other for application code. The upper byte of the 16-bit program status word can be accessed in system mode; the lower byte can be accessed in either system or normal modes.
I/O processing is enhanced by configuring peripheral interrupts to bypass CPU interrupts and, instead, be handled by an I/O controller or a special peripheral µDMA processor. Using the I/O controller avoids the overhead of bank switching or interrupt processing. Peripheral events trigger I/O controller processing and "DMA" the data to or from memory and internal peripherals. The I/O controller handles up to four µDMA channels. The CPU can execute from external memory and can dynamically shift bus sizes between 8 and 16 bits while running.
Power management: The µCs have idle and stop power-saving modes. Idle shuts off the CPU clock, leaving all integrated peripherals active. Stop disables the clock input. Any reset or interrupt request can terminate idle mode; only a hardware reset (NMI and INT0) can terminate stop mode.
Special instructions: Bit-manipulation instructions include bit set, clear, change, test, search forward and reverse, and various logical operations. Math instructions include add, subtract, decimal adjust, signed and unsigned 8x8 and 16x16 multiply, signed and unsigned 16x8 divide, and shift one bit 1 to 16 times. The TLCS-900 also has a MAC instruction and modulo increment/decrement instructions used for circular buffer pointers. It can also perform block moves and pattern searches in memory.