CS5218: Assignment 3
Comparison of the CLANG and INFER Analyzers
Deadline: 23:59 Sunday 6th April 2025
In this assignment, you will compare two well-established static analysis tools, Clang and Infer, on a set of programs of different sizes. The Clang and Infer static analyzers use flow-sensitive and partially path-sensitive analyses. The set of benchmark programs contains defect types related to dynamic memory allocation, error handling, multi-threading, etc.
You are to organize yourselves into three-person teams (with the exception of one team of size 4). Please indicate your team structure for this project in the document: Google Spreadsheet by 23:59 Monday 24th March 2025. We need this information by the given deadline because we are going to send a different set of programs to each team individually.
1 Academic Programs
We consider two academic programs P1 and P2. The programs are written with a macro TEAM_ID, whose value is to be instantiated by each team with its identification number, so each team will consider a slightly different version. We wish to determine whether the error point (indicated) is reachable. The expression Ψ(x) represents some predicate on the variable x; this is the error condition. An analyzer reports an alarm if it cannot determine that the error point is unreachable; otherwise, it reports bug-free.
P1
#define TEAM_ID ...   // (substitute your team ID here)
#define N ...         // (N is statically fixed)
main() {
    int x = TEAM_ID;
    for (int i = 1; i <= N; i++) x += 1;
    if (x >= 0 && error(x))
        x = 1 / 0;    // error point
}
int error(int x) { Ψ(x) }
P2
#define TEAM_ID ...   // (substitute your team ID here)
#define N ...
int a[] = {1, 3, 5, 7, 9};
main() {
    int x = TEAM_ID, choice;
    for (int i = 0; i < N; i++) {
        scanf("%d", &choice);
        if (choice) x += a[i];
    }
    if (error(x)) x = 1 / 0;   // error point
}
int error(int x) { Ψ(x) }
Program P1 has a single straight-line execution sequence. Thus, whether an alarm is raised depends on whether the analyzer tracks the value of x to the error point accurately enough to determine that error(x) is definitely false (in which case no alarm is raised). Program P2, on the other hand, has many possible execution paths (in fact 2^N of them) that can reach the error point. Thus, to determine that there is no error, the analyzer needs to track up to 2^N different values of x.
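To make the 2^N figure concrete, the following is a minimal, self-contained sketch (not part of the assignment programs) that brute-forces the values x can have at P2's error point; the TEAM_ID and N values are assumed purely for illustration. The error point of P2 is reachable exactly when Ψ(x) holds for at least one of these subset sums.

/* Hypothetical sketch: enumerate the values x can take at P2's error point.
 * TEAM_ID = 7 and N = 5 are assumed here for illustration only. */
#include <stdio.h>

#define TEAM_ID 7
#define N 5

int a[] = {1, 3, 5, 7, 9};

int main(void) {
    /* Each of the 2^N bitmasks encodes one sequence of "choice" inputs. */
    for (int mask = 0; mask < (1 << N); mask++) {
        int x = TEAM_ID;
        for (int i = 0; i < N; i++)
            if (mask & (1 << i)) x += a[i];
        printf("%d\n", x);   /* one reachable value of x per execution path */
    }
    return 0;
}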
Now consider the possible answers (alarm or bug-free) an analyzer can report, and what these answers can mean (true or false):
FP: alarm but in fact the error point is unreachable. This is a false positive. (This occurrence is frequent in practice and is the biggest challenge to static analysis for bug finding).
TP: alarm AND in fact the error point is reachable. This is a true positive. (This occurrence is probably due to “pure dumb luck”!)
FN: bug-free but in fact the error point is reachable. This is a false negative. (This occurrence is probably due to the [criminal?] decision to reduce the frequency of false positives by remaining silent.)
TN: The analyzer reports bug-free AND in fact the error point is unreachable. This is a true negative.
(Whether or not this is pure dumb luck is debatable, but this occurrence is the happiest outcome.)
The four outcomes above can be summarized as a “confusion matrix”.
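For reference, the four outcomes can be laid out as follows (rows give the analyzer's answer, columns the ground truth):

                              error point reachable    error point unreachable
  analyzer reports alarm      TP (true positive)       FP (false positive)
  analyzer reports bug-free   FN (false negative)      TN (true negative)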
The following are example conditions for Ψ(x), together with the results the Infer analyzer returns for them.
• P1
  x < TEAM_ID - 99 returns BUG-FREE; result is a True Negative
  x != TEAM_ID + N returns ALARM; False Positive
  x == TEAM_ID + N + 1 returns ALARM; False Positive
  x == TEAM_ID + N returns ALARM; True Positive
• P2
  x >= TEAM_ID returns ALARM; result is a True Positive
  x == TEAM_ID + 23 returns BUG-FREE; True Negative
  x == TEAM_ID + 22 returns BUG-FREE; False Negative
  x <= TEAM_ID returns ALARM; True Positive
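As a concrete illustration of how the macros and Ψ(x) are meant to be filled in, here is a minimal sketch of P1 instantiated with assumed values TEAM_ID = 7 and N = 10, with Ψ(x) taken to be the first example condition above (x < TEAM_ID - 99); error() is moved before main() so the file compiles without a prototype, and it can be passed to Clang or Infer as-is. Your own values and conditions will of course differ.

/* Hypothetical instantiation of P1 (TEAM_ID and N values are assumed). */
#define TEAM_ID 7
#define N 10

/* Psi(x) instantiated with the first example condition. */
int error(int x) { return x < TEAM_ID - 99; }

int main() {
    int x = TEAM_ID;
    for (int i = 1; i <= N; i++) x += 1;
    if (x >= 0 && error(x))
        x = 1 / 0;   /* error point */
    return 0;
}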
Task 1:
An error condition e1(x) is a predicate on a variable x, and it is more general than another error condition e2(x) if e2(x) → e1(x). Your task is to revise the (eight) error conditions above with new conditions so that your conditions (a) return the same answers (i.e., the same Infer outcome and the same result), and (b) are as general as possible. As a trivial example, you could revise the “98” in the first error condition to “100”.
2 Benchmarks
In this part of the assignment, you will use Clang and Infer to perform static code analysis on a set of C programs from the ITC-Benchmarks (static analysis benchmarks from Toyota ITC). The set contains programs categorized into:
• With Defects: C programs with intentionally injected errors (defects).
• Without Defects: C programs where the defects from the “With Defects” set have been resolved.
We will email a different set of 20 C programs to each team separately. The set contains both versions of each C program: a) with defects, and b) without defects.
Example 1: Consider the program below.
1  int f(int y) {
2      int x, z;
3      if (y)
4          x = 1;        // True Negative (no error because x is defined)
5      printf("%d\n", y);
6      if (y)
7          return x;     // False Positive (a false alarm is generated since x is initialized)
8      return z;         // True Positive (z is not initialized and a true alarm is generated)
9  }
In this program (Example 1), there is a bug in only one statement, namely the UNINITIALIZED VALUE of z in statement-8. However, when we ran the program through the Infer analyzer, it reported alarms for two statements; please check Figure 1. The first reported alarm, for statement-10, is a FALSE POSITIVE: the path on which x is uninitialized at that return is infeasible, so this bug can never be triggered. The second alarm, for statement-12, is a true bug and is correctly identified by the analyzer as a TRUE POSITIVE. Statement-4 has no bug and Infer raises no alarm for it, so it is an example of a TRUE NEGATIVE.
Figure 1: Terminal output of Infer for the sample C program
The fourth case, FALSE NEGATIVE, is difficult to show with a small example program, but you can refer to this article to gain some more understanding of FALSE NEGATIVES.
Task 2:
Your task is to analyze the provided set of programs, report whether the tools have found any errors, and then reason about the scalability of the tools. For each program, evaluate the tools’ performance based on the “confusion matrix” described above.
The ITC-Benchmarks categorize defects into the following nine categories. For each program, classify the defects detected by Clang and Infer into the appropriate categories based on the following:
1. Static memory defects: Issues related to the incorrect use of static memory.
2. Dynamic memory defects: Issues like memory leaks or invalid memory accesses.
3. Stack-related defects: Problems with the stack, such as stack overflows or improper function calls.
4. Numerical defects: Issues in mathematical operations such as division by zero or overflow.
5. Resource management defects: Defects related to improper resource allocation, such as file handle leaks.
6. Pointer-related defects: Issues such as dereferencing null or uninitialized pointers.
7. Concurrency defects: Issues related to multithreading, such as race conditions or deadlocks.
8. Inappropriate code: Code that is inefficient or logically incorrect but doesn’t directly lead to defects (e.g., redundant operations).
9. Misc defects: Any other types of defects that don’t fit into the above categories.
You can also consult the supplementary official specification to learn more about the test bench and the defect sub-types.
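To illustrate the “With Defects” / “Without Defects” pairing, here is a minimal hypothetical pair in the spirit of category 2 (dynamic memory defects); it is not taken from the distributed benchmark set, and the function names are invented for this sketch.

#include <stdlib.h>

/* With defect: the buffer leaks on the early-return path. */
int sum_with_defect(int n) {
    int *buf = malloc(16 * sizeof(int));
    if (buf == NULL) return -1;
    if (n < 0) return 0;          /* defect: buf is never freed here */
    buf[0] = n;
    free(buf);
    return 0;
}

/* Without defect: every path releases the buffer before returning. */
int sum_without_defect(int n) {
    int *buf = malloc(16 * sizeof(int));
    if (buf == NULL) return -1;
    if (n < 0) { free(buf); return 0; }
    buf[0] = n;
    free(buf);
    return 0;
}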
Task 3:
Finally, for each tool, you should create two programs: one that showcases the tool’s strengths and another that highlights its limitations. The objective is to gain a clear understanding of the pros and cons of each tool through concrete examples involving 4 to 5 different types of bugs. You can compare the tools based on various factors, including the types of bugs they are able to detect, their performance (such as the time taken to analyze the program), and their memory usage during the analysis.
3 Suggested Command-line Arguments for Each Tool
Some useful command-line arguments for each of the tools are presented here.
3.1 Clang
Clang is free and open-source. Please download it from here: http://clang-analyzer.llvm.org/. Important options are as follows:
1. Run a program with default command options.
Command Line: ~/Desktop/$ clang --analyze dead_store.c
For more detail, please run clang --help.
3.2 Infer
Install Infer from here (https://github.com/facebook/infer/blob/main/INSTALL.md)
1. Run a program with default command options.
Command Line: ~/Desktop/Infer/infer-linux64-v0.17.0$ bin/infer run -o dead_store -- clang -c dead_store.c
2. Run a program to find a specific type of bug. For example, run the div_zero.c program to find errors where the denominator of a division operation is 0, in which case the result is undefined.
Command Line: ~/Desktop/Infer/infer-linux64-v0.17.0$ bin/infer run --enable-issue-type DIVIDE_BY_ZERO -o div_zero -- clang -c div_zero.c
3. Run a program to find a mix of errors, with suitable command options. For example, run the mixtype.c program with command options such that DEAD_STORE and UNINITIALIZED_VALUE errors are disabled and DIVIDE_BY_ZERO errors are enabled for reporting.
Command Line: ~/Desktop/Infer/infer-linux64-v0.17.0$ bin/infer run --enable-issue-type DIVIDE_BY_ZERO --disable-issue-type DEAD_STORE --disable-issue-type UNINITIALIZED_VALUE -o mixtype -- clang -c mixtype.c
First, try to run both tools (Clang and Infer) with their ‘default’ settings, i.e., clang --analyze and infer run. If this yields unsatisfactory results, try enabling the non-default checkers and re-run the tests; this can generate better results (i.e., detect more true positives).
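As a starting point (the exact flags and checker names vary across Clang versions, so verify them against your installation), the Clang Static Analyzer’s available checkers can be listed and a non-default checker enabled roughly as follows; the checker named here is only an example.
Command Line: ~/Desktop/$ clang -cc1 -analyzer-checker-help
Command Line: ~/Desktop/$ clang --analyze -Xclang -analyzer-checker=alpha.security.ArrayBoundV2 dead_store.c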
4 Submission
Submit a report (up to 10 pages, in PDF) on your findings, in which you should emphasize your evaluation of the systems (project.pdf). It should describe your experiment setup, your results, and, importantly, some discussion of which strategy you found more useful. Include the statistics for the tested programs. This report is to be submitted by 23:59 Sunday 6th April 2025.
Make sure your submission contains information about your name, matric number, and email. Your zip files should have the format Surname-Matric-asg3.zip. Submit all files above to the appropriate Canvas Submission Link.