Improving the quality of embedded software

MathWorks Australia
By Jay Abraham, Polyspace product manager at MathWorks
Friday, 16 September, 2011

Embedded software intended for aerospace and military applications must operate with the utmost concern for safety. To achieve reliability goals, software development teams strive for these applications to meet rigorous verification processes and the goal of zero defects.

Edsger Dijkstra observed that testing can only show the presence of errors, not the absence. If testing cannot prove the absence of run-time errors, how can embedded software development teams be certain that their software is free of these errors?

Code verification based on mathematical proof is a solution to consider.

Recent advances in the practical application of scalable and high-performance mathematical techniques for the verification of software will help achieve the quest of proving the absence of run-time errors in software.

The complexity of embedded software in high-integrity systems continues to increase. In military applications, the avionics software for the F-22 Raptor consists of 1.7 million lines of code and avionics software for the F-35 joint strike fighter is expected to have 5.7 million lines of code.

For commercial airliners, the Boeing 787 aircraft flight control system will have about 6.5 million lines of code.

The increased software content and complexity of these aircraft compounds the risk of failure and complicates the process of achieving high confidence in software reliability.

Studying past embedded device failures can be instructive in understanding coding-related issues. For example, the failure of a test flight of an expendable rocket was traced to a code defect.

In this situation the launch vehicle self-destructed less than a minute after lift-off because the vehicle encountered high aerodynamic loads as a result of an angle of attack that exceeded the prescribed safety limit.

Post-incident investigation revealed the root cause of failure - an overflow resulting in a run-time error in the embedded software. The overflow, found in a pair of redundant inertial reference systems that determined the rocketâ€™s attitude and position, occurred when a 64-bit floating-point number was converted to a 16-bit signed integer, which moved the rocket nozzles to an extreme position. The presence of a redundant system did not help because the backup system implemented the same behaviour.

Run-time errors like this represent a specific class of software errors known as latent faults. The faults exist in code, but unless specific tests are run under particular conditions, the faults will not be detected during system testing.

Therefore, the code can appear to function normally but gives unexpected system failures in the field. A few examples of run-time errors are:

Non-initialised data;
Out-of-bounds array access;
Null pointer de-reference;
Overflow and underflow;
Incorrect computation;
Concurrent access to shared data;
Illegal type conversions.

Traditionally, software verification at the source-code level has involved code reviews, static analysis and dynamic testing. Each approach has its shortcoming. Code reviews, which rely solely on the expertise of the reviewers, may be tedious for large code bases.

Classic static analysis techniques, which rely mainly on a pattern-matching approach to detect unsafe code patterns, cannot prove the absence of run-time errors.

With the increasing complexity of embedded software, dynamic testing of all operating conditions may be impossible to achieve, which reinforces Edsger Dijkstraâ€™s position that testing can show only the presence of errors, not their absence.

A new verification method known as abstract interpretation builds on static analysis methods by using formal mathematical proofs, which can identify certain run-time errors or prove their absence. Abstract interpretation is applied to the source code without executing it.

Abstract interpretation as a proof-based verification method can be illustrated by multiplying three large integers:

-4586 x 34985 x 2389 = ?

Although computing the answer to the problem by hand can be time consuming, we can quickly apply the rules of multiplication to determine that the sign of the computation will be negative. Determining the sign of this computation is an application of abstract interpretation.

The technique enables us to know precisely some properties of the final result, such as the sign, without having to multiply the integers fully. We also know from applying the rules of multiplication that the resulting sign will never be positive for this computation.

In a similar manner, abstract interpretation can be applied to the semantics of software to prove certain properties of the software. Abstract interpretation bridges the gap between conventional static analysis techniques and dynamic testing by verifying certain dynamic properties of source code without executing the program itself.

Abstract interpretation investigates all possible behaviours of a program - that is, all possible combinations of values - in a single pass to determine how and under what conditions the program may exhibit certain classes of run-time failures.

The results from abstract interpretation are considered sound because we can mathematically prove that the technique will predict the correct outcome as it relates to the operation under consideration.

Abstract interpretation can be used as a static analysis tool to detect and mathematically prove the absence of certain run-time errors in source code, such as overflow, divide by zero, and out-of-bounds array access.

The verification is performed without requiring program execution, code instrumentation or test cases. Polyspace code verification products from MathWorks rely on this type of static analysis.

The input to a Polyspace product is C, C++ or Ada source code. The Polyspace product first examines the source code to determine where potential run-time errors could occur. Then it generates a report that uses colour-coding to indicate the status of each element in the code, as shown in Figure 1 and Table 1.

Figure 1: Polyspace colour-coding that indicates the status of each element in the code.

Polyspace results that are all tagged in green indicate that the code is free of certain run-time errors. In cases where run-time errors have been detected and colour-coded in red, grey or orange, software developers and testers can use information generated from the verification process to correct identified run-time errors.

Static analysis augmented with abstract interpretation can improve the quality and reliability of embedded software in high-integrity systems.

This approach helps engineers attain the quest of proving the absence of certain run-time errors in software. A code verification solution that includes abstract interpretation can be instrumental in attaining a good-quality process. It is a robust verification process that helps achieve high integrity in embedded devices.

Improving the quality of embedded software

Research breakthrough to advance OLED technology

What's next for the semiconductor industry?

Electron–phonon interactions boost 2D semiconductor activity

Content from other channels on our network