Introducing the world’s first, 28 nm semiconductor for space, part 3

December 11, 2015

My previous two articles introducing Altera's plan to offer the world's first, 28 nm, COTS FPGA for space applications have generated a lot of interest within our industry, and I have received many emails from readers of Out-of-this-World Design.

This month I discuss radiation hardness and SEU mitigation. Ultra-deep-submicron semiconductors such as the 5SGSMD5H3F35I4 COTS FPGA for space intrinsically offer higher levels of total-dose immunity because thinner oxides between the gate and the conducting channel trap less positive charge. Additionally, the use of trench isolation prevents charge forming within the insulating field oxide between neighbouring transistors, as shown below:

Figure 1: The use of shallow trench isolation to prevent leakage between neighbouring FETs.

Ultra-deep-submicron semiconductors use lower power-supply rails, which have become smaller than latch-up holding voltages. To provide further protection from SEL, the 5SGSMD5H3F35I4 has been fabricated using an epitaxial layer whose resistivity is significantly higher than that of the underlying substrate. The parallel combination of the epitaxial and bulk resistivities reduces the gain of the parasitic thyristor, so significantly more current would be needed to trigger the SCR into latch-up.

The 5SGSMD5H3F35I4 requires approximately 213 Mbits of configuration data to control its functionality, specify the routing between the logic resources and define the user-memory storage. The integrity of the configuration SRAM cells is paramount; in practice, however, only a small percentage of bits are actually used because routing utilization is low. The 5SGSMD5H3F35I4 provides on-chip CRC error-detection logic to protect the configuration memory cells from SEUs without impacting the fitting or performance of the part.

When the Quartus II software generates the configuration bitstream, a CRC value is computed for each frame. As the data is loaded into the FPGA, the calculated CRC number is also stored in the error-detection cyclic redundancy check (EDCRC) logic. At the same time, CRC circuitry within the device re-computes the value for the data frame and compares it with the original figure. If the numbers do not match, a configuration error is flagged as shown below. If the Quartus II software detects an error in the configuration bitstream, the data is reconstructed from the error-correcting code calculated for that frame and the corrected frame is re-written into the FPGA's configuration memory.

Figure 2: Illustration of configuration-memory EDCRC during device configuration.
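
The generate-then-recheck idea described above can be illustrated with a few lines of Python. This is only a minimal sketch under my own assumptions: the 16-byte frame size, the function names and the use of `zlib.crc32` are chosen for illustration and do not represent the device's actual frame format or CRC polynomial.

```python
import zlib

FRAME_BYTES = 16  # hypothetical frame size, for illustration only


def split_frames(bitstream: bytes, frame_bytes: int = FRAME_BYTES) -> list[bytes]:
    """Split a bitstream into fixed-size frames."""
    return [bitstream[i:i + frame_bytes]
            for i in range(0, len(bitstream), frame_bytes)]


def generate_crcs(frames: list[bytes]) -> list[int]:
    """Design-tool side: compute one CRC value per frame."""
    return [zlib.crc32(frame) for frame in frames]


def check_frames(frames: list[bytes], stored_crcs: list[int]) -> list[int]:
    """Device side: re-compute each CRC and return the indices that mismatch."""
    return [i for i, (frame, crc) in enumerate(zip(frames, stored_crcs))
            if zlib.crc32(frame) != crc]


# A clean load reports no errors; one inverted bit in frame 2 is flagged.
frames = split_frames(bytes(range(64)))
crcs = generate_crcs(frames)
corrupted = list(frames)
corrupted[2] = bytes([corrupted[2][0] ^ 0x01]) + corrupted[2][1:]
print(check_frames(frames, crcs), check_frames(corrupted, crcs))
```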

During normal FPGA operation, the EDCRC logic can be enabled to detect soft errors in the configuration memory cells without the need to re-configure the device. Each frame contains a CRC value calculated during device programming, and the checking circuitry continuously re-computes a 32-bit CRC value and compares it with the stored number. If there is a mismatch, the resulting signature is stored in a syndrome register indicating the error type and its location. Within a frame, the EDCRC logic can identify all single, double, triple, quadruple and quintuple-bit errors. When a single or double-adjacent error is detected, the location and type of error are reported.

The EDCRC exploits redundancy and the 5SGSMD5H3F35I4 offers space users the ability to correct soft errors during normal operation. Internal scrubbing can automatically restore single or double-adjacent errors in each data frame without the need to re-configure the device. Furthermore, the user can control the time taken to compute the CRC value based on the frequency of the error-detection clock.
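
To show why a CRC mismatch can reveal where an upset occurred, and not just that one occurred, here is a small Python sketch of syndrome-based scrubbing. It relies on the fact that the XOR of the stored and re-computed CRC depends only on the error pattern for a fixed frame length. Everything here is my assumption for illustration: `zlib.crc32`, the 16-byte frame and the function names are not the device's EDCRC implementation, only single-bit errors are repaired, and the stored CRC is assumed to be intact.

```python
import zlib

FRAME_BYTES = 16  # hypothetical frame size, for illustration only


def flip_bit(frame: bytes, bit: int) -> bytes:
    """Return a copy of the frame with one bit inverted."""
    out = bytearray(frame)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def build_syndrome_table(frame_bytes: int = FRAME_BYTES) -> dict[int, int]:
    """Map each single-bit-error syndrome to the bit position that causes it."""
    zero = bytes(frame_bytes)
    base = zlib.crc32(zero)
    return {zlib.crc32(flip_bit(zero, bit)) ^ base: bit
            for bit in range(frame_bytes * 8)}


def scrub(frame: bytes, stored_crc: int, table: dict[int, int]):
    """Return (repaired_frame, corrected_bit); corrected_bit is None if clean."""
    syndrome = zlib.crc32(frame) ^ stored_crc
    if syndrome == 0:
        return frame, None
    bit = table.get(syndrome)
    if bit is None:
        # Not a single-bit signature: this sketch cannot repair it.
        raise ValueError("uncorrectable error, re-configuration required")
    return flip_bit(frame, bit), bit


table = build_syndrome_table()
golden = bytes(range(FRAME_BYTES))
stored = zlib.crc32(golden)
repaired, where = scrub(flip_bit(golden, 37), stored, table)
print(repaired == golden, where)
```

A real scrubber would also have to consider upsets in the stored check value itself; that case is deliberately left out of this sketch.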

The majority of the configuration memory cells are not essential to normal operation, and an SEU strike may not have any impact on the intended functionality. If an SEU inverts an unused bit, the effect will be harmless. However, if critical routing or logic is affected, e.g. a lookup table, in the worst case this could result in complete system failure, causing a SEFI and necessitating device re-configuration.

Given that not all of the configuration memory is essential to normal operation, space users can partition designs and assign one of up to 256 sensitivity classifications to each region, and hence a corresponding SEU recovery response. Using Altera's Advanced SEU Detection IP core, users can implement either an autonomous, on-chip sensitivity processor or an external one to assess the criticality of an SEU. If a soft error is detected, its designated importance is checked by searching a sensitivity map stored in memory. A change to an unused bit within the configuration memory cells can be ignored, leading to a reduction in the overall FIT rate. The CRC engine has been implemented as fixed-layout, hard IP rather than through the standard, configurable place-and-route flow.

Figure 3: On-chip, hard-fitted IP, CRC logic and sensitivity processor.
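
The lookup performed by a sensitivity processor can be pictured with the following Python sketch. The region boundaries, the three example classifications and every name in the code are hypothetical, invented purely to illustrate the idea of mapping an error location to a recovery response; they are not the format of a real sensitivity map or the interface of the Advanced SEU Detection IP core.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Response(Enum):
    IGNORE = "ignore"            # unused bits: no action needed
    SCRUB = "scrub"              # repair in the background
    RECONFIGURE = "reconfigure"  # critical logic: reload the device


@dataclass(frozen=True)
class Region:
    first_frame: int
    last_frame: int
    sensitivity: int  # one of up to 256 classifications (0..255)


# Hypothetical sensitivity map and response policy.
SENSITIVITY_MAP = [Region(0, 99, 0), Region(100, 199, 1), Region(200, 299, 2)]
RESPONSES = {0: Response.IGNORE, 1: Response.SCRUB, 2: Response.RECONFIGURE}


def classify(frame: int) -> Optional[int]:
    """Return the sensitivity class of the region containing this frame."""
    for region in SENSITIVITY_MAP:
        if region.first_frame <= frame <= region.last_frame:
            return region.sensitivity
    return None


def respond(frame: int) -> Response:
    """Choose a recovery response; unmapped locations are treated as critical."""
    return RESPONSES.get(classify(frame), Response.RECONFIGURE)


for frame in (42, 150, 250, 999):
    print(frame, respond(frame).value)
```

Treating unmapped locations as critical is a conservative design choice in this sketch: an error is only ignored when the map positively says it is safe to do so.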

As SEUs can randomly strike any bit within the configuration memory cells, testing is required to ensure a suitable recovery response. The Quartus II development software includes a Fault-Injection Debugger, which allows space users to deliberately invert bits and understand the impact of such changes. Faults can be injected randomly, or specific bits can be targeted, to characterize the SEFI behaviour of the FPGA and to plan and validate a system-recovery strategy. Single, adjacent-double and multi-bit errors can be injected. The Fault-Injection Debugger complements radiation testing and allows space users to limit the amount of accelerated-beam time required at a cyclotron. In-house SEFI characterization allows space users to calculate soft-error rates and scale FIT rates to meet their mission's reliability requirements.
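
The arithmetic behind scaling a FIT rate from a fault-injection campaign can be sketched as follows. This is a toy model under stated assumptions: the device under test is replaced by a hypothetical set of critical bit addresses, the raw FIT figure is a made-up number, and the function names are mine. It is not how the Fault-Injection Debugger itself works; it only shows how the observed failure fraction derates a raw upset rate (one FIT being one failure per 10^9 device-hours).

```python
import random


def run_campaign(total_bits: int, critical_bits: set[int],
                 injections: int, seed: int = 0) -> float:
    """Inject random single-bit faults and return the fraction that cause failure.

    A fault 'causes failure' here if it lands on a bit in critical_bits,
    which stands in for observing the real design misbehave.
    """
    rng = random.Random(seed)
    failures = sum(1 for _ in range(injections)
                   if rng.randrange(total_bits) in critical_bits)
    return failures / injections


def derated_fit(raw_fit: float, failure_fraction: float) -> float:
    """Scale a raw configuration-upset FIT rate by the observed failure fraction."""
    return raw_fit * failure_fraction


# Hypothetical numbers: 10% of bits are critical, raw rate of 1000 FIT.
fraction = run_campaign(total_bits=1000, critical_bits=set(range(100)),
                        injections=10_000)
print(f"failure fraction {fraction:.3f}, "
      f"derated rate {derated_fit(1000.0, fraction):.1f} FIT")
```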

The 5SGSMD5H3F35I4 offers space users the ability to protect the embedded user memory against SEUs. ECC circuitry generates an ECC code at the data input to the RAM and checks the value output by the SRAM. If an SEU affects any of the stored bits, the ECC automatically corrects the error when it is read from the memory as shown below.

The ECC-enabled memory reports the occurrence of single, adjacent-double and adjacent-triple bit flips, and will correct single and double inversions. Adjacent triple-bit corruptions are detected and identified using a status bit, but not corrected. The ECC can also account for bit flips in the check code itself, and the 5SGSMD5H3F35I4 supports both SECDED and DECTED.

Figure 4: User memory SEU mitigation.
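
For readers unfamiliar with SECDED, the following Python sketch shows the principle on a 4-bit word using an extended Hamming (8,4) code. This is a textbook construction that I have chosen for illustration; the article does not describe the actual code, word width or check-bit layout used by the device's hard ECC, and the stronger DECTED mode is not shown.

```python
def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    code = [0] * 8
    # Data bits go in positions 3, 5, 6, 7 (least significant bit first).
    for i, pos in enumerate((3, 5, 6, 7)):
        code[pos] = (nibble >> i) & 1
    # Hamming parity bits at positions 1, 2 and 4.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 makes the overall parity of the codeword even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code: list[int]):
    """Return (data, status); data is None for an uncorrectable double error."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions holding a 1
    overall_odd = sum(code) % 2 == 1
    fixed = list(code)
    if syndrome == 0 and not overall_odd:
        status = "ok"
    elif overall_odd:
        fixed[syndrome] ^= 1         # single error; syndrome 0 means position 0
        status = "corrected"
    else:
        return None, "double-error detected"
    data = fixed[3] | (fixed[5] << 1) | (fixed[6] << 2) | (fixed[7] << 3)
    return data, status


word = encode(0b1011)
one_flip = list(word); one_flip[6] ^= 1
two_flips = list(one_flip); two_flips[1] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

Note that the check bits are protected in exactly the same way as the data bits: an upset in any one of the eight stored positions is corrected on read.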

The ECC checks and corrects memory at system speeds and has been implemented as fixed-layout, hard IP rather than through the standard, configurable place-and-route flow. The user memory has been developed using hardened, eight-transistor SRAM cells with bits interleaved to avoid multi-bit upsets, i.e. logical bits in a single data word are physically separate.
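
The benefit of bit interleaving can be seen in a short Python sketch. The four-way interleave factor and the column-to-word mapping below are hypothetical, picked only to illustrate the idea; the article does not give the physical layout of the device's memory.

```python
INTERLEAVE = 4  # hypothetical number of words sharing one physical row


def physical_to_logical(column: int, interleave: int = INTERLEAVE) -> tuple[int, int]:
    """Map a physical column to (word index, bit index within that word)."""
    return column % interleave, column // interleave


def words_hit(first_column: int, width: int,
              interleave: int = INTERLEAVE) -> dict[int, list[int]]:
    """Return, per word, the logical bits upset by a strike on adjacent columns."""
    hits: dict[int, list[int]] = {}
    for column in range(first_column, first_column + width):
        word, bit = physical_to_logical(column, interleave)
        hits.setdefault(word, []).append(bit)
    return hits


# A strike spanning three adjacent cells upsets one bit in each of three words.
print(words_hit(first_column=5, width=3))
```

As long as a strike is no wider than the interleave factor, each logical word sees at most one inverted bit, which is within the reach of single-error correction.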

The next article in this series will demonstrate the SEU fault injection capability and the Sensitivity Maps offered by Quartus II which can be used to characterize the SEFI behaviour for your specific application to achieve the lowest, soft-error rates. To support future space users, Altera is actively developing a high-integrity technology to assist SEU mitigation.

Until next month, I wish all readers of Out-of-this-World Design a very happy Christmas!
