Design of Self-Fault-Tolerant Systems based on Self-Reconfigurable FPGAs (Field Programmable Gate Arrays)
In the last few years, Application Specific Integrated Circuits (ASICs), with their lengthy design and development times, escalating development costs and low flexibility, have seen their use restricted to the high-volume manufacturing of chips that require leading-edge densities and processing speeds.
Meanwhile, the exponential growth in density and performance of configurable logic devices, such as SRAM-based Field Programmable Gate Arrays (FPGAs), and the addition of new features, have greatly expanded the areas where they can advantageously replace ASICs. FPGAs have lower development costs, faster time-to-market, and unparalleled flexibility. Recently, two new features were added: dynamic reconfiguration, which allows dynamic resource allocation strategies to be implemented in real time, enabling multiple independent functions from different applications to share the same logic resources in both the spatial and temporal domains; and self-reconfiguration, which allows the FPGA to adapt itself, dynamically and almost instantaneously, to new functional requirements.
These new FPGAs, while usable in a wide range of applications, such as reconfigurable hardware platforms, also create new testing challenges. The nanometre technologies used in their manufacture increase their vulnerability to soft errors caused by environmental radiation, and make them more prone to defects arising from small manufacturing imperfections not detected during production tests, giving rise to transient or permanent changes in the configured functionality. Their expanding use, even in critical systems, therefore requires the design of fault-tolerant circuits able to ensure high reliability and availability. This goal involves the online concurrent detection of permanent and transient faults and their masking, to avoid their propagation, while triggering a test procedure to determine their origin, either functional or structural, and to ensure the repair of their cause(s), avoiding cumulative effects that may lead to a general system failure.
It is thus imperative to study the specific fault-inducing mechanisms of these devices, to correlate them with a set of faults and, fully exploiting the features and performance of the new FPGAs, to develop innovative test methodologies tailored to their unique architecture and to the new kinds of applications they make possible. These methods must guarantee both fault tolerance and repair when complex functions are configured, and must also avoid previously detected structural errors when the same hardware resources are reused to implement the new incoming functions required by each new application.
The incorporation of self-reconfiguration capabilities in recent FPGAs, combined with the use of a controller core, enables the development of self-contained, fault-tolerant reconfigurable systems in which automatic rerouting and floorplanning are performed by this controller, so that fault detection, test and repair procedures can be carried out transparently and autonomously.