Lattice Blog

Enhancing FPGA Reliability in Space Applications

Posted 03/25/2024 by Guest blog: Adam Taylor, CEng FIET Embedded Systems Consultant

The introduction of the Lattice Nexus™ and Lattice Avant™ platforms marked a significant advance for aerospace applications. These devices offer developers a combination of low power dissipation and robust performance against Single Event Effects (SEEs), which are common in the harsh space environment.

Understanding the Challenge of SEEs

Space applications demand unparalleled reliability, as even a minor error can lead to catastrophic failures. The Certus-NX, CertusPro-NX, and Avant FPGAs utilize architectural enhancements, sophisticated design techniques, built-in scrubbers and purpose-built tools to mitigate such risks, but the onus is on developers to ensure the Register-Transfer Level (RTL) design within these devices can withstand SEEs. This includes protecting critical components such as Finite State Machines (FSMs), Block RAMs (BRAMs), and registers from corruption.

Key Areas for SEE Mitigation

Developers must focus on several critical areas within their designs to ensure resilience against SEEs:

  1. Finite State Machines – Ensure the FSMs within a design do not lock up or operate incorrectly.
  2. BRAM – Ensure the information stored within BRAM is not corrupted.
  3. Registers – Ensure data remains uncorrupted as it moves between registers.

Implementing Mitigation Strategies

To protect these structures, developers can employ several effective techniques:

  1. Safe State Machine encoding – Encoding FSM states with a Hamming distance of 3 allows a single-bit upset in the state register to be detected and corrected, so the FSM does not lock up or enter an unintended state.
  2. Error Detection and Correction (EDAC) – Incorporating Hamming-based EDAC codes within BRAMs allows single-bit errors to be corrected. With an additional overall parity bit (a SECDED code, Hamming distance 4), double-bit errors can also be detected.
  3. Triple Modular Redundancy (TMR) – Triplicating registers and majority voting on their outputs masks an upset in any one copy. Different applications may benefit from distributed TMR, block TMR, or full TMR implementations, depending on the specific needs for physical isolation or comprehensive protection.
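
To make the majority-voting idea concrete, here is a minimal behavioral sketch in Python. It is not RTL and not what Synplify generates; the function names and the example value are illustrative assumptions only. It models three copies of an 8-bit register and a bitwise 2-of-3 voter.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value held by at least two copies."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(value: int, bit: int) -> int:
    """Model a single event upset by inverting one bit of a register copy."""
    return value ^ (1 << bit)


# Hypothetical register contents, chosen only for illustration.
golden = 0b1011_0010

# Three redundant copies of the same register.
copies = [golden, golden, golden]

# Upset bit 4 in the second copy only.
copies[1] = flip_bit(copies[1], 4)

# The voter output still matches the original value.
voted = majority_vote(*copies)
```

The voter masks an upset in any single copy. If the same bit is upset in two copies before the error is cleared, the vote follows the two corrupted copies, which is one reason physical separation of the TMR channels matters.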

As designs grow in complexity, manually implementing these mitigation strategies becomes impractical. This is where leveraging advanced synthesis tools like Synopsys Synplify®’s High Reliability (HiRel) features becomes invaluable. Lattice recently announced its collaboration with Synopsys to integrate the Synopsys Synplify® FPGA synthesis tool with the latest release of Lattice Radiant® design software. This collaboration creates an advanced design automation flow solution that enables designers to more easily develop Lattice FPGA-based applications with the robust functional safety protections, high availability, and dependable operation required for the Industrial, Automotive, and Avionics markets.

Leveraging Synopsys Synplify for Enhanced Reliability

Synopsys Synplify offers a simple way to implement these SEE mitigation strategies. By leveraging logic design constraints, developers can apply different TMR configurations or protect state machines and BRAMs with error-correcting codes. This approach not only saves significant design time, but also reduces the risk of errors that could compromise system reliability.

This fine control over the TMR implementation enables developers to select the most appropriate TMR structure. For some applications, block or distributed TMR is useful when floor planning will be used during place and route to physically isolate the TMR channels from one another. Similarly, if every register needs to be triplicated and voted on, the TMR option can be used to implement global TMR.

Figure 1. An example of physical isolation, triplication and the voter logic.

Just as constraints can be used to implement the TMR scheme, implementation options can also be used to control how state machines within the design are protected. Using Synplify, designers have several options for implementing state machine protection. These range from preserving and decoding unreachable states, which supports designers implementing their own FSM protection scheme, to single-bit error correction and double-bit error detection using Hamming encoding.

Figure 2. Reference output for SpaceWire hamming3
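
As a rough illustration of why Hamming distance 3 state encoding helps, the following Python sketch models a four-state FSM whose state codes differ from one another in at least three bit positions. The state names and the 5-bit codes are hypothetical and are not the encoding Synplify produces. With this spacing, a state register that has suffered one bit flip is still closer to its original code than to any other.

```python
from itertools import combinations

# Hypothetical 5-bit state codes with pairwise Hamming distance >= 3.
STATE_CODES = {
    "IDLE": 0b00000,
    "START": 0b00111,
    "RUN": 0b11001,
    "DONE": 0b11110,
}


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def min_distance(codes) -> int:
    """Smallest pairwise Hamming distance in a set of codes."""
    return min(hamming_distance(a, b) for a, b in combinations(codes, 2))


def recover_state(register_value: int):
    """Return the state whose code is within one bit of the register value.

    Because the codes are at least 3 bits apart, at most one state can match.
    Returns None when no code is within one bit (an uncorrectable value).
    """
    for name, code in STATE_CODES.items():
        if hamming_distance(register_value, code) <= 1:
            return name
    return None


# A single upset in the RUN code (bit 2 flipped) still resolves to RUN.
upset_value = STATE_CODES["RUN"] ^ (1 << 2)
recovered = recover_state(upset_value)
```

Note that distance 3 only guarantees correction of a single flipped bit. Two simultaneous flips can land within one bit of a different state's code and be miscorrected.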

The final mitigation approach that can be implemented using synthesis is to set the ram-style constraint to implement error-correcting codes on block RAMs. This enables the Error Correction Code (ECC) logic, which is inherent in the BRAM structure within the FPGA, to protect the BRAM contents.
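
For readers unfamiliar with how ECC protects a stored word, here is a small Python model of a single-error-correcting, double-error-detecting (SECDED) code: a Hamming(7,4) code plus one overall parity bit. It is a textbook sketch on a 4-bit word, not a description of the ECC logic inside the Lattice BRAM, and the function names are assumptions made for illustration.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1, index = bit position)."""
    code = [0] * 8
    # Data bits sit at positions 3, 5, 6, 7; parity bits at 1, 2, 4; overall parity at 0.
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the parity of the whole word even
    return code


def decode(stored: list):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double_error'."""
    code = list(stored)
    # The syndrome is the XOR of the positions of all set bits (0 for a valid codeword).
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2
    if overall == 1:
        # Odd overall parity: exactly one bit is wrong and the syndrome is its position
        # (syndrome 0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity but a non-zero syndrome: two bits are wrong.
        return None, "double_error"
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[6] ^= 1                       # single upset in the stored word
value, status = decode(word)       # value is restored, status is 'corrected'
```

The extra overall parity bit is what separates "one bit wrong, fix it" from "two bits wrong, flag it"; without it, a double error would be miscorrected.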

Of course, when any mitigation structure is implemented within an FPGA, we need to be sure it functions as required. Synopsys Synplify combined with the Synopsys Identify debugger enables engineers to inject faults into the design using the instrumentor and confirm that the fault mitigation behaves as intended.
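
The idea behind a fault injection campaign can be sketched in software. The Python model below is unrelated to the Identify instrumentor flow and uses hypothetical names throughout. It exhaustively injects bit flips into a triplicated 8-bit register and counts how many fault combinations the voter masks and how many escape.

```python
from itertools import combinations

WIDTH = 8  # assumed register width for this model


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority voter."""
    return (a & b) | (a & c) | (b & c)


def run_campaign(golden: int, n_faults: int):
    """Inject every combination of n_faults distinct bit flips across three copies.

    Returns (masked, escaped): how many combinations left the voted output
    equal to the golden value, and how many corrupted it.
    """
    sites = [(copy, bit) for copy in range(3) for bit in range(WIDTH)]
    masked = escaped = 0
    for faults in combinations(sites, n_faults):
        copies = [golden, golden, golden]
        for copy, bit in faults:
            copies[copy] ^= 1 << bit
        if vote(*copies) == golden:
            masked += 1
        else:
            escaped += 1
    return masked, escaped


single = run_campaign(0xA5, 1)   # every single fault is masked
double = run_campaign(0xA5, 2)   # faults on the same bit of two copies escape
```

In this model all 24 single faults are masked, while 24 of the 276 double-fault combinations escape: exactly those that hit the same bit position in two different copies.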

Practical Application: Case Studies

To illustrate the effectiveness of these strategies, several reference designs were run by an independent third party through Synplify.

  1. SpaceWire Robot Arm Application
  2. Image Histogram IP Module
  3. Quad SPI to AXIS Interface

Figure 3. List of reference designs with and without TMR

Running these cases through Synplify with the distributed and block TMR options enabled shows the number of registers increasing, as would be expected for the selected TMR option. The increase is larger for distributed TMR because registers internal to the block are also triplicated and voted upon.

As we can also see, and as would be expected, the more logic and voting structures are implemented, the more the overall FMAX decreases, though only slightly. This indicates that the added reliability does not significantly compromise the system's performance.

In addition to the TMR solution, these designs were also rerun through Synplify with state machine protection enabled. This and any other mitigation schemes are reported by Synplify in the HiRel section of the log. This enables developers to demonstrate at reviews how the tool was configured when the netlist was generated, which can be very important for design implementation reports.

Figure 4. Design implementation report out

Conclusion

The integration of FPGAs like Avant, Certus-NX and CertusPro-NX into space applications represents a leap forward in combining low power with high reliability. By leveraging tools such as Synopsys Synplify, developers can efficiently implement Single Event Effect mitigation strategies, ensuring their designs meet the rigorous demands of space environments.

To learn more about how Lattice can help you accelerate your space application development with advanced design automation flow, reach out to speak with the team at Lattice.
