SOCFIT: QUESTIONS AND ANSWERS
1 What is SOCFIT?
SOCFIT is a reliability-focused design characterization platform that quickly and accurately predicts the failure rate (FIT) and various derating factors of ASICs and SoCs.
2 Why should I use SOCFIT?
SOCFIT provides architects and designers with the overall reliability data (FIT/event rate) of the circuit and of its internal features or blocks. This analysis helps them understand their design and make it more reliable by identifying its weak points.
SOCFIT makes it possible to grade the reliability of a new circuit design before manufacturing. The tool generates detailed sensitivity data for every cell, block, and hierarchical module of the circuit, together with overall FIT data. If the computed FIT values are within the specification, the circuit requires no modification. If the sensitivity of the design is higher than allowed, a redesign of the circuit might be required. However, optimizing the FIT values of the circuit costs silicon area and impacts performance. The hierarchical sensitivity data provided by SOCFIT gives the designer the information needed to make an educated choice of the most critical areas of the circuit to protect. Additionally, SOCFIT can help the designer implement optimization techniques: the tool can replace cell instances with functionally equivalent cells from the standard library that have a better FIT rate. This function trades overhead cost against the FIT optimization budget.
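The trade-off between overhead cost and FIT improvement can be pictured as a budgeted selection problem. The following is only a sketch under stated assumptions, not the method SOCFIT uses: the `Swap` record, the `pick_swaps` function, the greedy gain-per-area heuristic, and all cell names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Swap:
    cell: str         # instance name (hypothetical)
    fit_gain: float   # FIT reduction obtained by swapping in the hardened cell
    area_cost: float  # extra area of the hardened equivalent, must be > 0


def pick_swaps(candidates: List[Swap], area_budget: float) -> List[Swap]:
    """Greedy heuristic: take the best FIT gain per unit of added area first."""
    chosen = []
    remaining = area_budget
    ranked = sorted(candidates, key=lambda s: s.fit_gain / s.area_cost, reverse=True)
    for swap in ranked:
        if swap.area_cost <= remaining:
            chosen.append(swap)
            remaining -= swap.area_cost
    return chosen


# Illustrative values only, not measured data
candidates = [
    Swap("ff_a", fit_gain=10.0, area_cost=2.0),
    Swap("ff_b", fit_gain=6.0, area_cost=1.0),
    Swap("ff_c", fit_gain=3.0, area_cost=3.0),
]
selected = pick_swaps(candidates, area_budget=3.0)
print([s.cell for s in selected], sum(s.fit_gain for s in selected))
```

A greedy pass is not guaranteed to be optimal, but it shows why per-cell sensitivity data is needed before deciding where to spend the area budget.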
3 What are the formats of data used by SOCFIT? Is there any choice?
SOCFIT requires a circuit description, either RTL or a gate-level netlist, and a Standard Delay Format (SDF) file for the timing information.
4 What are the connections between SOCFIT and fast prototyping tools? Please explain with an example.
Fast prototyping EDA platforms include all the tools (synthesis, place&route, extraction) required to transform the original high-level description of the circuit (RTL, SystemC, etc.) into a low-level representation (layout, gate-level netlist, etc.) that is close to the final implementation. Because the occurrence and propagation of soft errors are closely related to the actual physical structure of the circuit, SOCFIT requires a detailed understanding of the physical circuit. At the same time, working at too deep a level of detail lengthens the reliability analysis. The most appropriate representation of the circuit is therefore a gate-level netlist complemented with timing information. All prototyping platforms generate such a representation.
Thus, a typical reliability analysis session consists of:
- importing the original circuit sources in the prototyping platform
- running a synthesis in order to map the circuit to the technological process and optimize its performance
- generating the files containing structural information about the circuit (typically gate-level netlists)
- running additional operations such as place&route in order to bring the circuit closer to the final implementation
- extracting timing information about the circuit (typically an SDF, Standard Delay Format, file). The timing information could also be extracted after synthesis, but the accuracy will be lower
- importing the structural & timing files in SOCFIT
- running the Soft Error analysis in SOCFIT
- back-annotating the prototyping platform representation of the circuit (the circuit database) with SE-related information such as per-cell FIT, hierarchical FIT breakdown, etc
- using prototyping platform tools for result representation, such as sensitivity maps of the layout, hierarchical representation, etc
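The hierarchical FIT breakdown mentioned in the back-annotation step can be pictured as a roll-up of per-cell FIT values along the instance hierarchy. The following is a minimal sketch, not SOCFIT's implementation or file format: the function name `hierarchical_fit`, the `/`-separated instance paths, and the FIT values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict


def hierarchical_fit(cell_fit: Dict[str, float], sep: str = "/") -> Dict[str, float]:
    """Sum per-cell FIT into every enclosing module of each cell's instance path."""
    totals = defaultdict(float)
    for path, fit in cell_fit.items():
        parts = path.split(sep)
        # Every proper prefix of the path is a module that contains the cell
        for depth in range(1, len(parts)):
            totals[sep.join(parts[:depth])] += fit
    return dict(totals)


# Illustrative per-cell FIT values keyed by hypothetical instance paths
cell_fit = {
    "top/cpu/alu/U1": 0.002,
    "top/cpu/alu/U2": 0.003,
    "top/cpu/regfile/U7": 0.010,
    "top/uart/U3": 0.001,
}
breakdown = hierarchical_fit(cell_fit)
for module in sorted(breakdown):
    print(f"{module}: {breakdown[module]:.3f}")
```

Such a table is the kind of data a prototyping platform could use to color a layout sensitivity map or to annotate a hierarchy browser.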
5 How does SOCFIT include ECC in the FIT calculation?
If the memory block is not protected, then the radiation-induced effects may appear as SBUs (Single Bit Upsets) or MCUs (Multiple-Cell Upsets). In this case, the contribution of the memory block to the global sensitivity of the circuit is calculated as the native FIT (per megabit) multiplied by the size (in megabits) of the memory.
If the memory is ECC-protected, most standard techniques are able to detect and correct SBUs, which are the predominant effects. In this case, the SBU contribution of the protected memory drops to zero. MCUs can also be corrected by these techniques, provided that the multiple upsets induced by a particle do not affect the same word. However, the probability of this effect depends on the internal organization of the memory and on the bit cell structure, and is very difficult to compute beforehand. By default, SOCFIT considers that ECC is able to correct all errors. However, the designer can enter the percentage of multiple-bit upsets that fall in the same word (which should be very low). This value can be either estimated by the designer or measured in a test.
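The memory contribution described above can be written as a short calculation. This is a sketch under assumptions: the function name and parameters are hypothetical, and treating the residual FIT of an ECC-protected memory as the native FIT scaled by the fraction of upsets that hit the same word is one plausible reading of the text, not a documented SOCFIT formula.

```python
def memory_fit(native_fit_per_mbit: float,
               size_mbit: float,
               ecc: bool = False,
               same_word_mcu_fraction: float = 0.0) -> float:
    """FIT contribution of one memory block.

    Unprotected: native FIT per megabit multiplied by the size in megabits.
    ECC-protected: only upsets hitting the same word escape correction
    (assumed here to be a fixed fraction of all events; the default of 0.0
    mirrors the default that ECC corrects all errors).
    """
    if not 0.0 <= same_word_mcu_fraction <= 1.0:
        raise ValueError("same_word_mcu_fraction must be between 0 and 1")
    raw = native_fit_per_mbit * size_mbit
    if not ecc:
        return raw
    return raw * same_word_mcu_fraction


# Illustrative numbers only: a 4 Mbit memory at 1000 FIT/Mbit
print(memory_fit(1000.0, 4.0))                                         # unprotected
print(memory_fit(1000.0, 4.0, ecc=True))                               # ECC, default
print(memory_fit(1000.0, 4.0, ecc=True, same_word_mcu_fraction=0.001)) # ECC, residual
```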
6 How is the alpha particles environment taken into account?
Unlike cosmic rays, alpha particles are generated by the packaging, and their flux depends on the geometry and materials of the package. SOCFIT simulates the die sensitivity to alpha particles, neutrons, or other particles. One of its inputs is therefore the description of the radiative environment of the die, so the user has to specify the alpha flux at the level of the sensitive area. Defining this spatial flux distribution is not a trivial task. First, the designer has to measure the alpha emission of every component of the package. The second step is to build the equivalent flux by adding the contributions of all the alpha emitters that compose the package. The next step is to compute how the LET (linear energy transfer) and range of the alpha particles are modified as they pass through the different layers of materials.
IROC can help in the steps described above, but this is not automated in SOCFIT. Once this alpha flux distribution has been defined, the SOCFIT tool can calculate the FIT rate.
7 What is the maximum size of device SOCFIT can work with?
SOCFIT is not limited by a maximum circuit size. Circuits of tens of thousands of cells have been successfully analyzed using SOCFIT in only a few minutes, so the analysis duration for larger circuits is expected to remain reasonable.
8 Which algorithms are used to calculate FIT rate?
Static (probabilistic) and dynamic (fault injection and simulation) algorithms have been implemented and validated extensively.
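To illustrate the difference between the two families of algorithms, the sketch below estimates the probability that a bit flip on an internal node reaches the output of a toy circuit, first by random fault injection and then by exhaustive enumeration standing in for a static computation. The three-input netlist, the function names, and the uniform input distribution are assumptions made for illustration; they do not describe SOCFIT's actual algorithms.

```python
import itertools
import random


def circuit(a: int, b: int, c: int, flip: str = "") -> int:
    """Toy netlist: n1 = a AND b, n2 = n1 OR c, output = n2."""
    n1 = a & b
    if flip == "n1":
        n1 ^= 1  # inject a bit flip on node n1
    n2 = n1 | c
    if flip == "n2":
        n2 ^= 1  # inject a bit flip on node n2
    return n2


def injected_probability(node: str, trials: int = 10000, seed: int = 0) -> float:
    """Dynamic approach: random input vectors, one injected fault per run."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b, c = (rng.randint(0, 1) for _ in range(3))
        if circuit(a, b, c) != circuit(a, b, c, flip=node):
            hits += 1
    return hits / trials


def exact_probability(node: str) -> float:
    """Exact value over all input vectors, assuming they are equally likely."""
    vectors = list(itertools.product((0, 1), repeat=3))
    hits = sum(circuit(*v) != circuit(*v, flip=node) for v in vectors)
    return hits / len(vectors)


for node in ("n1", "n2"):
    print(node, exact_probability(node), injected_probability(node))
```

A flip on `n1` is masked whenever `c` is 1, so only half of the input vectors let it through, while a flip on `n2` always reaches the output.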
9 How is derating taken into account?
Two derating factors are used to compute the propagation probability of the faults. The first one, the logic derating factor, is related to the propagation of faulty values through the logic network of the circuit, up to the inputs of the memory (sequential) cells. The second one is the timing derating factor, which takes into account various time-related aspects such as the arrival time of the event with respect to the clock and the intrinsic circuit delays.
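As a rough illustration of how two such factors could enter a FIT estimate, the sketch below scales a raw FIT value by both of them. Combining the factors as a simple product, as well as the function and parameter names, is an assumption made for this example; the text above does not state the exact formula SOCFIT applies.

```python
def derated_fit(raw_fit: float, logic_derating: float, timing_derating: float) -> float:
    """Scale a raw FIT value by two derating factors, each between 0 and 1.

    Assumption: the two factors act as independent probabilities and
    are therefore multiplied together.
    """
    for name, value in (("logic_derating", logic_derating),
                        ("timing_derating", timing_derating)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return raw_fit * logic_derating * timing_derating


# Illustrative numbers only: half of the faults propagate logically,
# and half of those arrive in a window where they are latched
print(derated_fit(100.0, 0.5, 0.5))
```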
10 What is SOCFIT accuracy?
As for ASIC analysis, the accuracy of a FIT rate is a complex question. The FIT rate depends on the derating factors, the internal frequency, the input vectors, and the amount of resources involved in a computation when the events occur.