rizsotto / scan-build

Clang's scan-build re-implementation in python

License: Other

C 11.59% Python 59.41% CMake 0.40% CSS 0.51% JavaScript 7.14% Batchfile 0.40% Shell 20.08% QMake 0.12% C++ 0.36%

clang static-analyzer compilation-database build-system

scan-build's Introduction

scan-build

A package that wraps a build so that all calls to gcc/clang are intercepted and logged into a compilation database and/or piped to the Clang static analyzer. It includes the intercept-build tool, which logs the build, and the scan-build tool, which logs the build and runs the Clang static analyzer on it.

How to get

It's available from the Python Package Index:

$ pip install scan-build

Portability

It should work on UNIX-like operating systems.

It has been tested on FreeBSD, GNU/Linux, OS X and Windows.

Prerequisites

clang compiler, to compile the sources and have the static analyzer.
python interpreter (version 3.6, 3.7, 3.8, 3.9).

How to use

To run the Clang static analyzer against a project:

$ scan-build <your build command>

To generate a compilation database file:

$ intercept-build <your build command>

To run the Clang static analyzer against a project that already has a compilation database:

$ analyze-build

Use --help to learn more about each command.

Limitations

Generally speaking, the intercept-build and analyze-build tools together do the same job as scan-build does. So, you can expect the same output from this line as from a plain scan-build run:

$ intercept-build <your build command> && analyze-build

The major difference is how and when the analyzer is run. The scan-build tool has three distinct models for running the analyzer:

1. Use compiler wrappers to perform the actions. The compiler wrappers run both the real compiler and the analyzer. This is the default behaviour, and it can be enforced with the --override-compiler flag.
2. Use a special library to intercept compiler calls during the build process. The analyzer runs against each module after the build has finished. Use the --intercept-first flag to get this model.
3. Use compiler wrappers to intercept compiler calls during the build process. The analyzer runs against each module after the build has finished. Use the --intercept-first and --override-compiler flags together to get this model.
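
The relation between the two flags and the three models can be summarised in a few lines of Python. This is only an illustrative sketch: the `Mode` enum and the `select_mode` function are hypothetical names and are not taken from the package.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical names for the three models described above.
    WRAPPERS_ANALYZE_INLINE = 1  # wrappers run the compiler and the analyzer
    PRELOAD_THEN_ANALYZE = 2     # preloaded library records calls, analyze later
    WRAPPERS_THEN_ANALYZE = 3    # wrappers record calls, analyze later


def select_mode(intercept_first, override_compiler):
    """Map the two command line flags to one of the three models."""
    if not intercept_first:
        # Default behaviour; --override-compiler only makes it explicit.
        return Mode.WRAPPERS_ANALYZE_INLINE
    if override_compiler:
        return Mode.WRAPPERS_THEN_ANALYZE
    return Mode.PRELOAD_THEN_ANALYZE
```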

Models 1 and 3 use compiler wrappers, which work only if the build process respects the CC and CXX environment variables. (Some build processes allow these variables to be overridden only as command line parameters. In that case you need to pass the compiler wrappers manually, e.g. intercept-build --override-compiler make CC=intercept-cc CXX=intercept-c++ all, where the original build command would have been just make all.)

Model 1 runs the analyzer right after the real compilation. So, if the build process removes intermediate modules (generated sources), the analyzer output is still kept.

Models 2 and 3 generate the compilation database first, and filter out those modules which no longer exist. So, they are suitable for incremental analysis during development.

Model 2 is available only on FreeBSD, Linux and OS X, where library preload is available from the dynamic loader. Security extensions/modes on different operating systems might disable library preload. In that case the build behaves normally, but the resulting compilation database will be empty. (Notable examples of enabled security modes are SIP on OS X El Capitan and SELinux on Fedora, RHEL and CentOS.) The program checks for the SIP security mode, and falls back to model 3.

The intercept-build command uses only models 2 and 3 to generate the compilation database. analyze-build only runs the analyzer against the captured compiler calls.

Known problems

Because it uses the LD_PRELOAD or DYLD_INSERT_LIBRARIES environment variable, it does not append to the variable, but overrides it. So builds which use these variables themselves might not work. (I don't know of any build tool which does that, but please let me know if you do.)
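
For illustration, a version that keeps existing entries instead of overriding them could look like the sketch below. It is not the project's code: `preload_environment` is a hypothetical helper and the library path used with it is made up. It relies on both variables accepting a colon-separated list of libraries.

```python
import os


def preload_environment(library, environ=None, variable="LD_PRELOAD"):
    """Return a copy of the environment with `library` put first in `variable`.

    Entries already present in the variable are kept, not overridden.
    """
    env = dict(os.environ if environ is None else environ)
    entries = [item for item in env.get(variable, "").split(":") if item]
    if library not in entries:
        entries.insert(0, library)
    env[variable] = ":".join(entries)
    return env
```

On OS X the same helper would be called with variable="DYLD_INSERT_LIBRARIES".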

Problem reports

If you find a bug in this documentation or elsewhere in the program, or would like to propose an improvement, please use the project's issue tracker. Please describe the bug and where you found it. If you have a suggestion for how to fix it, include that as well. Patches are also welcome.

License

The project is licensed under the University of Illinois/NCSA Open Source License. See LICENSE.TXT for details.

scan-build's Issues

expose fewer symbols from `libear`

Maybe merge the source files into a single module and hide the internal symbols. (Do it only when tests are around.)

'attribute ignored' message from analyzer

hi All,

i would like to know whether it is possible to get the same warning when running the analyzer as the one produced by a simple compilation.

when i give this input test.c

typedef int __attribute__((visibility("default"))) bar;

and run Clang against it, i get this warning...

$ clang -c test.c 
test.c:1:28: warning: 'visibility' attribute ignored [-Wignored-attributes]
typedef int __attribute__((visibility("default"))) bar;
                          ^
1 warning generated.

but when i run the analyzer, i got nothing like that

$ clang -### --analyze -x c test.c

then i execute the last line of this output

$ "clang" "-cc1" ....

and it does not generate the warning. my question would be: is it possible to get that warning by passing some extra flags? or it will never appear during the analysis? thanks for any help.

intercept-build does not work on git without --override-compiler

Hello,

I was trying out intercept-build on git 2.9.2 to experiment with the tool.

If I use bear, I can do the following to get a compilation database:

echo DEVELOPER=1 >> config.mak
make configure
bear make -j9

If I try to substitute bear with intercept-build, that results in a crash:

$ intercept-build make -j9          
GIT_VERSION = 2.9.2
    * new build flags
# command exited with status 245

If I use intercept-build with the --override-compiler option, then it's working fine:

intercept-build --override-compiler make -j9

Maybe it's not a bug but the crash surprised me.
I have to say I'm not sure what --override-compiler exactly does,
but since it is in the "advanced options" category I thought it should not be needed.

What do you think,
is it the expected behavior?

Thanks.

decompose scan_build into execution pipeline

is it possible to write scan_build as a shell pipe like this?

intercept <build command> | plan <opts> | executor <opts> | document <opts>

the intercept part generates a "compilation database" from a "build command". the plan reads a "compilation database" and creates an "execution plan". the executor might return a result directory, which is read by a document generator.

each of those phases could have different options and might share some common ones.

catch up with scan-build (210971)

diff --git a/tools/scan-build/scan-build b/tools/scan-build/scan-build
index f46f093..862bd3a 100755
--- a/tools/scan-build/scan-build
+++ b/tools/scan-build/scan-build
@@ -368,6 +368,7 @@ sub ScanFile {

   my $BugType        = "";
   my $BugFile        = "";
+  my $BugFunction    = "";
   my $BugCategory    = "";
   my $BugDescription = "";
   my $BugPathLength  = 1;
@@ -395,8 +396,13 @@ sub ScanFile {
     elsif (/<!-- BUGDESC (.*) -->$/) {
       $BugDescription = $1;
     }
+    elsif (/<!-- FUNCTIONNAME (.*) -->$/) {
+      $BugFunction = $1;
+    }
+
   }

+
   close(IN);

   if (!defined $BugCategory) {
@@ -409,7 +415,7 @@ sub ScanFile {
     return;
   }

-  push @$Index,[ $FName, $BugCategory, $BugType, $BugFile, $BugLine,
+  push @$Index,[ $FName, $BugCategory, $BugType, $BugFile, $BugFunction, $BugLine,
                  $BugPathLength ];
 }

@@ -701,6 +707,7 @@ print OUT <<ENDTEXT;
   <td>Bug Group</td>
   <td class="sorttable_sorted">Bug Type<span id="sorttable_sortfwdind">&nbsp;&#x25BE;</span></td>
   <td>File</td>
+  <td>Function/Method</td>
   <td class="Q">Line</td>
   <td class="Q">Path Length</td>
   <td class="sorttable_nosort"></td>
@@ -758,13 +765,17 @@ ENDTEXT
       }
       print OUT "</td>";

+      print OUT "<td class=\"DESC\">";
+      print OUT $row->[4];
+      print OUT "</td>";
+
       # Print out the quantities.
-      for my $j ( 4 .. 5 ) {
+      for my $j ( 5 .. 6 ) {
         print OUT "<td class=\"Q\">$row->[$j]</td>";
       }

       # Print the rest of the columns.
-      for (my $j = 6; $j <= $#{$row}; ++$j) {
+      for (my $j = 7; $j <= $#{$row}; ++$j) {
         print OUT "<td>$row->[$j]</td>"
       }

multiple -arch flags

hi all,

i'm in the middle of rewriting scan-build in python, and reading the original perl code... where 'ccc-analyzer' (at lines 490 and 716) prepares for multiple '-arch' flags.

i'm trying to understand this. what is the use case where users call Clang with multiple -arch flags at compile time? the analyzer can't handle that, which is why line 716 has a for loop there. so, if it makes no sense for the analyzer, does it for the compiler? or the preprocessor? or the linker?
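
one way to picture the for loop mentioned above is to turn a single command with several -arch flags into one command per architecture. the sketch below is only an illustration in python; `split_by_arch` is a hypothetical name and not the perl code from ccc-analyzer.

```python
def split_by_arch(arguments):
    """Return one argument list per distinct -arch value.

    A command without any -arch flag is returned unchanged as a single item.
    """
    archs, rest = [], []
    iterator = iter(arguments)
    for argument in iterator:
        if argument == "-arch":
            # The architecture name is the argument following the flag.
            arch = next(iterator, None)
            if arch is not None and arch not in archs:
                archs.append(arch)
        else:
            rest.append(argument)
    if not archs:
        return [rest]
    return [rest + ["-arch", arch] for arch in archs]
```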

add pep8 check to travis ci

something like this http://cramer.io/2012/05/03/using-travis-ci/

iterate through `-arch` when there are multiple

detect architecture and choose default compiler based on that

review binary decoding

currently the child process stdout/stderr is read as binary, and decoded as 'ascii'.
how about using this piece of code to determine the encoding?

opts['encoding'] = getattr(sys.stdout, 'encoding', None)
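
a slightly fuller sketch of the same idea follows. `run_and_decode` is a hypothetical helper; the fallback to UTF-8 and the 'replace' error handler are assumptions, not something the project has decided on.

```python
import subprocess
import sys


def run_and_decode(command):
    """Run `command` and return its combined output as a list of text lines."""
    # sys.stdout.encoding can be None (for example when stdout is replaced),
    # so fall back to UTF-8 in that case.
    encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    output = subprocess.check_output(command, stderr=subprocess.STDOUT)
    # 'replace' keeps undecodable bytes from raising UnicodeDecodeError.
    return output.decode(encoding, errors="replace").splitlines()
```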

find a way to strip common prefixes

maybe this should go into bear, which would pass it separately. or scan the compilation database for it.
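
scanning for the prefix could be done with os.path.commonpath. this is a minimal sketch assuming absolute POSIX paths; `strip_common_prefix` is a hypothetical name.

```python
import os


def strip_common_prefix(paths):
    """Return the common directory of `paths` and the paths relative to it."""
    # Use the directories, so a single file still yields a relative name.
    prefix = os.path.commonpath([os.path.dirname(path) for path in paths])
    return prefix, [os.path.relpath(path, prefix) for path in paths]
```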

interposition could write compilation database

continue to work on fixed build

when the build breaks, 'bear' records only those files which were compiled up until the break. but developers usually fix the break and just continue the build (rather than start a clean build) to see whether it goes well or not. in this case the new files would overwrite the previous files.

before writing a new compilation database, it would read the old one and merge it with the new set of files.

this would solve the use case described above.
this would deal with edit-compile-run cycles in normal development:
- newly added files would be handled normally
- deleted files could be detected by a file existence check before writing
this would not detect compiler flag changes, and would cause double entries in the output.
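
a minimal sketch of such a merge is below. it assumes entries are dictionaries with the "directory" and "file" keys of the JSON compilation database format. note that, unlike the plan above, it keys entries on (directory, file), so a later entry replaces an earlier one rather than producing double entries; that is a design choice of the sketch. `merge_entries` is a hypothetical name.

```python
import os


def merge_entries(old_entries, new_entries, exists=os.path.exists):
    """Merge two lists of compilation database entries.

    New entries win over old ones for the same (directory, file) pair, and
    entries whose source file no longer exists are dropped.
    """
    merged = {}
    for entry in list(old_entries) + list(new_entries):
        key = (entry["directory"], entry["file"])
        merged[key] = entry  # a later entry replaces an earlier one
    return [
        entry for entry in merged.values()
        if exists(os.path.join(entry["directory"], entry["file"]))
    ]
```

the `exists` parameter is there only so the file existence check can be replaced in tests.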

plist+html output

shall it generate output file name?
shall it generate index.html when plist?

update methods from bear 2.1

the new version of bear has a lot of simplification in the interception and command line parsing.

the `else` branch of the for loop does not do what is needed

in files_loop there is an extra message at the end of the run:

'skip analysis, source file not found'

this should not be there

fix filelist linker flag

https://gcc.gnu.org/onlinedocs/gcc/Darwin-Options.html
http://www.manpages.info/macosx/ld.1.html

catch up with scan-build (210970)

diff --git a/tools/scan-build/scan-build b/tools/scan-build/scan-build
index 7502a42..f46f093 100755
--- a/tools/scan-build/scan-build
+++ b/tools/scan-build/scan-build
@@ -498,6 +498,8 @@ my $baseDir;
 sub FileWanted {
     my $baseDirRegEx = quotemeta $baseDir;
     my $file = $File::Find::name;
+
+    # The name of the file is generated by clang binary (HTMLDiagnostics.cpp)
     if ($file =~ /report-.*\.html$/) {
        my $relative_file = $file;
        $relative_file =~ s/$baseDirRegEx//g;
@@ -1175,6 +1177,13 @@ ADVANCED OPTIONS:
  -analyzer-config <options>

    Provide options to pass through to the analyzer's -analyzer-config flag.
+   Several options are separated with comma: 'key1=val1,key2=val2'
+
+   Available options:
+     * stable-report-filename=true or false (default)
+       Switch the page naming to:
+       report-<filename>-<function/method name>-<id>.html
+       instead of report-XXXXXX.html

 CONTROLLING CHECKERS:

interposition calls analyzer when compilation fails

saw it during testing

check precompiled header handling

original ccc-analyzer:

ignores the call when the output is not defined
copies the source file as the output file, and truncates the 'gch' from the end of the output name

OSError: [Errno 2] No such file or directory

analyze-cc: Problem occured during analyzis.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 96, in run
    return arch_check(opts)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 61, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 274, in arch_check
    return continuation(opts)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 61, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 246, in language_check
    return continuation(opts)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 61, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 216, in set_file_path_relative
    return continuation(opts)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 61, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 205, in filter_debug_flags
    return continuation(opts)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 61, in wrapper
    return function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/runner.py", line 174, in run_analyzer
    cwd)
  File "/usr/local/lib/python2.7/dist-packages/libscanbuild/clang.py", line 43, in get_arguments
    output = subprocess.check_output(cmd, cwd=cwd, stderr=subprocess.STDOUT)
  File "/usr/lib/python2.7/subprocess.py", line 566, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

follow up on $IncludeParserRejects flag

i have a question related to ccc-analyzer. it is about the Analyze method (line 272). it checks the $IncludeParserRejects variable, which is initialized to zero once and has the comment "Set this to 1 if we want to include 'parser rejects' files." my question is: who will set that to 1? shall it be an environment variable which turns this on? can i leave it out of the rewritten version?

libear configure does not use cflags

the method and symbol checking functions shall use the previously passed cflags, otherwise they lose defines and other important flags.

generate man page from code

the argparse options and the active checkers are currently on the man page. these are dynamic parts. it would be better to have some process that streams those into the man page rather than updating two separate documents.

differences captured by SATestBuild

one of the differences was the file names in the .plist files. the reason was that the new implementation was using absolute paths for source files, while the old implementation was using relative ones.

generated report is not copyable

links are using absolute paths instead of relative ones

do not run analyzer when command was `configure` or `autogen`

the old implementation checks the first word of the build command; if it matches configure or autogen, it sets the target directory to empty. (i guess the intent here is that we do not want any reports from these commands.) so, later the wrappers still call the compiler and the analyzer, appending the randomly generated file name to the empty target directory. with no parent directory, the analyzer's behavior guarantees that the report won't be written.
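
the check on the first word could be sketched like this in python. `is_configure_step` is a hypothetical name, and matching only the base name of the first word is an assumption of the sketch.

```python
import os
import re


def is_configure_step(build_command):
    """Tell whether the first word of the build command looks like a
    configure or autogen script."""
    if not build_command:
        return False
    first_word = os.path.basename(build_command[0])
    return re.search(r"configure|autogen", first_word) is not None
```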

result code for plist

When only .plist reports are requested ('-plist') and the status code shall be set based on the analyzer results ('-status-bugs'), what should the result be? Shall it grep the plist files?
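
Instead of grepping, the plist files could be parsed with the standard plistlib module. The sketch below assumes that each report file has a top-level 'diagnostics' list with one item per bug; `count_bugs` and `exit_status` are hypothetical names.

```python
import plistlib


def count_bugs(plist_paths):
    """Sum the number of diagnostics found in the given plist files."""
    total = 0
    for path in plist_paths:
        with open(path, "rb") as handle:
            content = plistlib.load(handle)
        # Assumption: bugs are listed under the top-level 'diagnostics' key.
        total += len(content.get("diagnostics", []))
    return total


def exit_status(plist_paths, status_bugs):
    """Return 1 when bugs were found and the status was requested, else 0."""
    return 1 if status_bugs and count_bugs(plist_paths) > 0 else 0
```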

compiler calls with `-###` flags shall be ignored from compilation database

saw it while testing
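
a possible filter is sketched below, assuming each entry carries the command as a list under the "arguments" key of the JSON compilation database format. `filter_dry_runs` is a hypothetical name.

```python
def filter_dry_runs(entries):
    """Drop the entries whose command line contains the -### flag.

    With -### the compiler only prints the commands it would run, so such
    a call does not compile anything.
    """
    return [entry for entry in entries if "-###" not in entry["arguments"]]
```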

create a version where scan-build uses the ccc-analyzer scripts

catch up with scan-build (213235)

diff --git a/tools/scan-build/scan-build b/tools/scan-build/scan-build
index e66b185..585efd5 100755
--- a/tools/scan-build/scan-build
+++ b/tools/scan-build/scan-build
@@ -388,6 +388,10 @@ sub ScanFile {
     }
     elsif (/<!-- BUGFILE (.*) -->$/) {
       $BugFile = abs_path($1);
+      if (!defined $BugFile) {
+         # The file no longer exists: use the original path.
+         $BugFile = $1;
+      }
       UpdatePrefix($BugFile);
     }
     elsif (/<!-- BUGPATHLENGTH (.*) -->$/) {

`--enable-checker` shall take comma separated checkers as a list

--enable-checker osx,cplusplus,core currently creates a list with a single element, instead of a list with 3 elements.
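
One way to get this with argparse is a custom action that splits the value on commas. This is a sketch only: the `AppendCommaSeparated` class and the `enable_checker` destination name are assumptions, not the project's actual option handling.

```python
import argparse


class AppendCommaSeparated(argparse.Action):
    """Split the value on commas and extend the destination list."""

    def __call__(self, parser, namespace, values, option_string=None):
        current = list(getattr(namespace, self.dest, None) or [])
        current.extend(item for item in values.split(",") if item)
        setattr(namespace, self.dest, current)


parser = argparse.ArgumentParser()
parser.add_argument("--enable-checker", dest="enable_checker",
                    metavar="<checker>", action=AppendCommaSeparated)

# The flag can be repeated and each value can hold several checkers.
args = parser.parse_args(["--enable-checker", "osx,cplusplus,core"])
```

With this action args.enable_checker holds three separate elements.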

generate output directory name

the original script does a lot of magic with output dir generation: a version number is generated, etc...

review string formatting

evaluate which one is faster:

the string.format() method
the printf-like string formatting
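
the two can be compared with the standard timeit module. this is a rough sketch: the format strings and the iteration count are arbitrary choices, and the measured numbers depend on the interpreter and the machine.

```python
import timeit

name, line = "report", 42

# Time the same formatting job done both ways.
format_time = timeit.timeit(
    lambda: "{0}:{1}".format(name, line), number=100000)
percent_time = timeit.timeit(
    lambda: "%s:%d" % (name, line), number=100000)

print("str.format(): {0:.4f}s".format(format_time))
print("printf-style: {0:.4f}s".format(percent_time))
```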

duplicate `-o` when running static analyzer

found it during testing

--override-compiler shall be added to command line options

SATestBuild.py script is using that.

uniqueness check on `-isysroot` switch

following message was sent to the clang developers' list...

and i have another question, about -isysroot flag uniqueness checking (ccc-analyzer: line 520). the current behavior inserts only the first usage of this flag. (although it does not check the uniqueness of --sysroot.) i'm wondering: shall this wrapper change or correct incorrect invocations?

i did run a test against gcc 4.9 on linux. (which shows that actually the last -isysroot flag wins.) here it comes:

$ gcc -c functional_test/divide_zero.cpp
$ gcc -c functional_test/divide_zero.cpp -isysroot /
$ gcc -c functional_test/divide_zero.cpp -isysroot /tmp -isysroot /
$ gcc -c functional_test/divide_zero.cpp -isysroot / -isysroot /tmp
In file included from functional_test/divide_zero.cpp:1:0:
/usr/include/c++/4.9.0/cassert:43:20: fatal error: assert.h: No such file or directory
 #include <assert.h>
                    ^
compilation terminated.
$

another test against clang 3.4 on linux (which shows that it ignores -isysroot on this platform) here it comes:

$ clang -c functional_test/divide_zero.cpp -isysroot /tmp -isysroot /                        
$ clang -c functional_test/divide_zero.cpp -isysroot / -isysroot /tmp                        
$ clang -c functional_test/divide_zero.cpp -isysroot /tmp 
$ clang -c functional_test/divide_zero.cpp                                         
$

Fixed unescaped quotes

rizsotto/Bear#89

make analyzer running separated from command generation

the driver module currently takes a compilation command and generates an analysis command (parse, filter, etc..), then it executes it. would it be better to divide these two steps?

then 'beye' would generate a compilation database (with help from 'bear'). then from the compilation database it would generate a list of commands (cmd + cwd) to run the analyzer. then those commands can be run to generate report files. then from those report files it would generate the final report.

these intermediate steps would give more insight into what's going on (easier to understand) and would be easier to test. and maybe reusing some parts would also be easier.

check -filelist

rizsotto/Bear#79

Running SATestAdd.py on open source benchmarks

I’ve written up some instructions below on how to run the $CLANG_SOURCES_DIR/utils/analyzer/SATestAdd.py script. We have a buildbot that uses this script to make sure we catch regressions on openssl and postgresql, as well as internal projects.

Instructions:

Create a directory to hold the benchmarks:
```
$ mkdir clang-analyzer-tests-open-source
```
Download a benchmark (e.g., openssl from https://www.openssl.org/source/openssl-1.0.0s.tar.gz)
Untar it in the clang-analyzer-tests-open-source directory.
Add a run_static_analyzer.cmd file to the untarred project directory to tell SATestAdd.py how to build the project. For openssl, the contents of this file should be:
```
./config
make clean
make -j1
```
Add a cleanup_run_static_analyzer.sh file to the untarred project directory to tell SATestBuild.py how to clean up after building the project. For openssl, this is:
```
make clean
exit 0
```

Make sure you have both scan-build and the clang you want to analyze with in your path:

$ export PATH=$LLVM_BUILD_DIR/Release+Asserts/bin:$CLANG_SOURCES_DIR/tools/scan-build/:$PATH

Run SATestAdd.py to add the benchmark to the projectMap (make sure you are in the directory you created in step 1):
```
$ python $CLANG_SOURCES_DIR/utils/analyzer/SATestAdd.py openssl-1.0.0s 1
```

Here the ‘1’ indicates that the project should be built with scan-build (a 0 or 2 instead of a 1 would indicate that it is a single-file benchmark — but these aren’t important for scan-build).
This will run the analyzer on the project to create a reference set of results and create a projectMap.csv file.

You should see something like:

--- Building project openssl-1.0.0s
  Build directory: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s.
Log file: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults/Logs/run_static_analyzer.log
Output directory: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults
  Executing: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/cleanup_run_static_analyzer.sh
  Executing: scan-build --use-analyzer /usr/bin/clang -plist-html -o /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults -enable-checker alpha.unix.SimpleStream,alpha.security.taint,cplusplus.NewDeleteLeaks,core,cplusplus,deadcode,security,unix,osx --keep-empty --override-compiler  ./config
  Executing: scan-build --use-analyzer /usr/bin/clang -plist-html -o /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults -enable-checker alpha.unix.SimpleStream,alpha.security.taint,cplusplus.NewDeleteLeaks,core,cplusplus,deadcode,security,unix,osx --keep-empty --override-compiler  make clean -j6
  Executing: scan-build --use-analyzer /usr/bin/clang -plist-html -o /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults -enable-checker alpha.unix.SimpleStream,alpha.security.taint,cplusplus.NewDeleteLeaks,core,cplusplus,deadcode,security,unix,osx --keep-empty --override-compiler  make -j1
  Executing: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/cleanup_run_static_analyzer.sh
Build complete (time: 326.35). See the log for more details: /Volumes/Data/SATests/clang-analyzer-tests-open-source/openssl-1.0.0s/RefScanBuildResults/Logs/run_static_analyzer.log
Number of bug reports (non-empty plist files) produced: 87
Completed tests for project openssl-1.0.0s (time: 326.71).
Warning: Creating the Project Map file!!
The project map is updated:  /Volumes/Data/SATests/clang-analyzer-tests-open-source/projectMap.csv

This will create a RefScanBuildResults directory in the project directory with reference analysis results. (Note: you will get an error if you try to add the same project twice.)

It is important to us that scan-build-py works with these scripts (this is an indicator that other build-bot-style uses of scan-build out in the wild can be replaced with scan-build-py) and that it reports the same issues on benchmarks like openssl as the old scan-build does (including issues with multi-file paths reported in the .plist output).