Fault-tolerant processing in aerospace applications is the subject of intense research. The authors correctly identify such environments as posing great challenges to the correct operation of digital systems: radiation effects can either cause circuit malfunction, via single event upsets (SEUs) and multiple event upsets (MEUs), or, even worse, destroy the digital application-specific integrated circuit (ASIC).
The researchers have designed and implemented a low-complexity system-on-a-chip (SoC) processor architecture, which consists of an 8051-compatible microcontroller core along with a number of peripherals, including a controller area network (CAN) controller, a universal asynchronous receiver/transmitter (UART), and an in-circuit emulator (ICE) module for software development. This architecture, known as SPECIES, uses fault-tolerance techniques at multiple levels: triple modular redundancy (TMR), applied to the processor's internal (flop-based) state, and error detection and correction codes (EDACs), based on Hamming coding, applied to the internal memory of the SoC.
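For readers unfamiliar with TMR, the idea can be sketched in a few lines. This is my own illustration, not the authors' implementation: the function names `tmr_vote` and `tmr_disagreement` and the 8-bit register value are assumptions chosen for the example. Each flop-based state element is kept in three copies, and a bitwise majority vote masks an upset in any one copy.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of a register.

    Hypothetical helper for illustration; each output bit takes the
    value held by at least two of the three copies.
    """
    return (a & b) | (a & c) | (b & c)


def tmr_disagreement(a: int, b: int, c: int) -> int:
    """Mask of bit positions where the three copies are not unanimous."""
    return (a ^ b) | (a ^ c)


# Example: one copy of an 8-bit register suffers a single-bit upset.
golden = 0xB2
upset = golden ^ 0x08          # bit 3 flipped in one copy (assumed SEU)
voted = tmr_vote(golden, upset, golden)
flagged = tmr_disagreement(golden, upset, golden)
```

The vote masks the fault only while at most one copy is wrong at any given bit position; two copies upset at the same bit would outvote the correct one.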
The built-in self-test (BIST) infrastructure of the SoC is used to test not only the internal static random access memory (SRAM), but also the banked external memory; it implements the Nair-Abraham-Thatte test, which can detect a large number of possible faults at moderate hardware cost. An interesting aspect of the internal EDACs is their ability to correct stored data containing a single error, or to flag stored data containing multiple errors, thus giving an indication of the processor's go/no-go status. Finally, the authors have prototyped the architecture in a field-programmable gate array (FPGA), and are using the system for code development.
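The correct-one, detect-many behavior described above can be illustrated with an extended Hamming code. The sketch below is an assumption on my part, not the scheme documented in the paper: it protects an 8-bit word with four Hamming check bits plus one overall parity bit (13 bits in total), and the names `encode`, `decode`, and the status strings are hypothetical. A status of "uncorrectable" would correspond to a no-go indication.

```python
# Bit positions 1..12 hold the Hamming(12,8) code; bit 0 holds overall parity.
DATA_POS = [3, 5, 6, 7, 9, 10, 11, 12]   # non-power-of-two positions
PARITY_POS = [1, 2, 4, 8]                # power-of-two positions


def encode(byte: int) -> int:
    """Encode an 8-bit value into a 13-bit SEC-DED codeword."""
    word = 0
    for i, pos in enumerate(DATA_POS):
        if (byte >> i) & 1:
            word |= 1 << pos
    for p in PARITY_POS:
        parity = 0
        for pos in range(1, 13):
            if pos & p and (word >> pos) & 1:
                parity ^= 1
        if parity:
            word |= 1 << p               # make each parity group even
    if bin(word).count("1") % 2:
        word |= 1                        # make the whole word even
    return word


def decode(word: int):
    """Return (byte, status); byte is None when the word is uncorrectable."""
    syndrome = 0
    for pos in range(1, 13):
        if (word >> pos) & 1:
            syndrome ^= pos              # XOR of set positions locates one error
    odd = bin(word & 0x1FFF).count("1") % 2
    if syndrome == 0 and not odd:
        status = "ok"
    elif odd and syndrome <= 12:
        word ^= 1 << syndrome            # syndrome 0 means the parity bit itself
        status = "corrected"
    else:
        return None, "uncorrectable"     # even parity with nonzero syndrome, etc.
    byte = 0
    for i, pos in enumerate(DATA_POS):
        if (word >> pos) & 1:
            byte |= 1 << i
    return byte, status


codeword = encode(0xA5)
single = decode(codeword ^ (1 << 6))     # one flipped bit
double = decode(codeword ^ 0b110)        # two flipped bits
```

Any single flipped bit is repaired transparently, whereas two flipped bits leave the overall parity even with a nonzero syndrome, so the word is flagged rather than silently miscorrected.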
Overall, this brief and extremely relevant contribution discusses a fault-tolerant SoC architecture for mission-critical systems (such as on-board computers on space vehicles). The paper does contribute to our understanding of digital techniques for harsh environments; however, I would have liked to see some performance data, perhaps from fault-injection experiments that simulate SEUs.