skyhover / deckard

Code clone detection; clone-related bug detection; semantic clone analysis

License: Other

Shell 4.71% Python 8.96% Java 1.31% C 42.03% C++ 23.28% Objective-C 0.26% GAP 2.55% Makefile 2.09% Lex 4.24% Yacc 9.98% ANTLR 0.59%

deckard's People

Contributors

codequal, jianglx, lxjiang, npalix, pieman72, skyhover, wagnerst

deckard's Issues

Build fails

Hi, on a current macOS system the build fails because malloc.h is not where the build expects it to be. I worked around this locally with a symbolic link, but it would be better to fix it in the build script.

Bug Report: cluster: Possible errors occurred with LSH.

Hi,
I ran Deckard to detect clones on a dataset of 47k source files. However, after a day of execution it failed with an error. The contents of the relevant log files follow.

cluster_vdb_50_4_g9_2.50998_30_100000

Clustering 'vectors/vdb_50_4_g9_2.50998_30_100000' 6.513064 ...
/home/local/SAIL/amir/tasks/RQ2/RQ2.2/Deckard/src/lsh/bin/enumBuckets -R 6.513064 -M 7600000000 -b 2 -A -f vectors/vdb_50_4_g9_2.50998_30_100000 -c -p vectors/vdb_50_4_g9_2.50998_30_100000.param > clusters/cluster_vdb_50_4_g9_2.50998_30_100000
Warning: output all clones. Takes more time...
Warning: will compute parameters
Error: the structure supports at most 2097151 points (3238525 were specified).

real 2m58.162s
user 2m50.464s
sys 0m7.492s
cluster: Possible errors occurred with LSH. Check log: times/cluster_vdb_50_4_g9_2.50998_30_100000

paramsetting_50_4_0.79_30

paramsetting: 50 4 0.79 ...Looking for optimal parameters by Clustering 'vectors/vdb_50_4_g9_2.50998_30_100000' 6.513064 ...
/home/local/SAIL/amir/tasks/RQ2/RQ2.2/Deckard/src/lsh/bin/enumBuckets -R 6.513064 -M 7600000000 -b 2 -A -f vectors/vdb_50_4_g9_2.50998_30_100000 -c -p vectors/vdb_50_4_g9_2.50998_30_100000.param > clusters/cluster_vdb_50_4_g9_2.50998_30_100000
cluster: Possible errors occurred with LSH. Check log: times/cluster_vdb_50_4_g9_2.50998_30_100000
Error: paramsetting failure...exit.

grouping_50_4_2.50998_30

grouping: vectors/vdb_50_4 with distance=2.50998...Total 7602630 vectors read in; 11282415 vectors dispatched into 57 ranges (actual groups may be many fewer).

real 410m12.610s
user 6m43.592s
sys 26m6.544s
Done grouping 50 4 2.50998. See groups in vectors/vdb_50_4_g[0-9]_2.50998_30

Note that I have sufficient memory for the run, so I added two more conditions to the memory limit setting in both the vecquery and vertical-param-batch files. I raised the memory limit because my vector file is larger than 2 GB and enough memory is available. The conditions now look like this:

# dumb (not flexible) memory limit setting
mem=`wc "$vdb" | awk '{printf("%.0f", $3/1024/1024+0.5)}'`
if [ $mem -lt 2 ]; then
	mem=10000000
elif [ $mem -lt 5 ]; then
	mem=20000000
elif [ $mem -lt 10 ]; then
	mem=30000000
elif [ $mem -lt 20 ]; then
	mem=60000000
elif [ $mem -lt 50 ]; then
	mem=150000000
elif [ $mem -lt 100 ]; then
	mem=300000000
elif [ $mem -lt 200 ]; then
	mem=600000000
elif [ $mem -lt 500 ]; then
	mem=900000000
elif [ $mem -lt 1024 ]; then
	mem=1900000000
elif [ $mem -lt 2048 ]; then
	mem=3800000000
elif [ $mem -lt 4096 ]; then  # this condition is added by me
	mem=7600000000
elif [ $mem -lt 8192 ]; then  # this condition is added by me
	mem=15200000000
else
	echo "Error: Size of $vdb > 8G. I don't want to do it before you think of any optimization." | tee -a "$TIME_DIR/cluster_${vfile}"
	exit 1;
fi

The Deckard parameters are set to the following values:

  • MIN_TOKENS='50'
  • STRIDE='4'
  • SIMILARITY='0.79'
  • MAX_PROCS = 40

I have attached the log files. Please help me mitigate this problem; I need your tool for my experiments.
deckard log.zip
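
The clustering log above reports a hard capacity limit in the LSH structure (2097151 points, which is 2^21 - 1) and the number of points actually supplied. As a rough aid for triaging such logs, the sketch below pulls both numbers out of the error line and reports how far the group exceeds the limit. The function name lsh_capacity_report is hypothetical and not part of Deckard, and the sketch says nothing about whether splitting a vector group is an acceptable remedy.

```shell
# Hypothetical helper (not part of Deckard): parse the enumBuckets capacity
# error and report how many chunks of at most <limit> points would be needed.
lsh_capacity_report() {
    # $1: one log line, e.g. the "Error: the structure supports at most ..." line
    _limit=$(printf '%s\n' "$1" | sed -n 's/.*supports at most \([0-9][0-9]*\) points.*/\1/p')
    _points=$(printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\) were specified).*/\1/p')
    if [ -z "$_limit" ] || [ -z "$_points" ]; then
        echo "no capacity error found"
        return 1
    fi
    # Ceiling division: smallest number of chunks with at most _limit points each.
    echo "limit=$_limit points=$_points min_chunks=$(( (_points + _limit - 1) / _limit ))"
}
```

It could be fed the matching line from a log under times/, for example the output of grep 'supports at most' on the cluster log named in the error message.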

Build on Linux fails

Hi,
I am trying to compile Deckard on a Linux system, but the build stops because it tries to find "dot2d".
Is this a third-party library I should add? If so, where should it be placed?

Here is a part of the log:

Everything cool above here:

-c -O3 -DREAL_FLOAT enumBuckets.cpp
g++ -o ../bin/enumBuckets -O3 enumBuckets.o BucketHashing.o Geometry.o LocalitySensitiveHashing.o Random.o Util.o GlobalVars.o SelfTuning.o NearNeighbors.o -lm
g++ -c -O3 -DREAL_FLOAT exploreBuckets.cpp
g++ -o ../bin/exploreBuckets -O3 exploreBuckets.o BucketHashing.o Geometry.o LocalitySensitiveHashing.o Random.o Util.o GlobalVars.o SelfTuning.o NearNeighbors.o -lm
make[1]: Leaving directory `/home/janO/Deckard/src/lsh/sources'
./build.sh: line 98: cd: ../lib: No such file or directory
./build.sh: line 111: cd: ../dot2d/grammars/output: No such file or directory
./build.sh: line 123: cd: ../dot2d: No such file or directory

The last three messages were originally in German; I translated them to English.

Cheers and Thanks

build fails

Hi, I'm getting:

a - token-counter.o
a - sq-tree.o
a - node-vec-gen.o
a - vector-output.o
a - vector-merger.o
a - tree-accessor.o
a - token-tree-map.o
a - clone-context-php.o
rm -f vectorsort dispatchvectors computeranges *~ *.o
gcc -O3  -O3  vectorsort.c  -lm -o vectorsort
gcc -O3  -O3  dispatchvectors.c  -lm -o dispatchvectors
gcc -O3  -O3  computeranges.c  -lm -o computeranges
rm -f *.o cvecgen jvecgen cbugfilters jbugfilters out2html phpvecgen phpbugfilters out2xml cParseTreeMain jParseTreeMain phpParseTreeMain
g++  -o ptreeC.o -O3 -I../include -I../vgen/treeTra -c -DCLANG ptree.cc
make: *** No rule to make target '../ptgen/gcc/gccptgen.a', needed by 'cvecgen'.  Stop.
Error: main make failed. Exit.
./build.sh  7.49s user 0.35s system 85% cpu 9.207 total

This happens when I simply run build.sh in src/main.

Crash on "return A?B:C"

When I use the command: "cvecgen -i ../../src/dircolors.c -o tmp.vec --start-line-number 508 --end-line-number 508"
The output is "cvecgen: tree-accessor.C:81: static TreeVector* TreeAccessor::get_node_vector(Tree*): Assertion `attr_itr!=t->attributes.end()' failed."

Please refer to the attachment for the file dircolors.c.

dircolors.c.zip

build fails

The build fails on both macOS (Mojave 10.14.5) and Linux (Ubuntu 18.04.2 LTS).
I ran "sh build.sh" in src/main/.

Error Message

Mac OS

rm -f *.pyc
make -C simple clean
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make -C gcc clean
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make -C java clean
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make -C php5 clean
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make -C sol clean
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make -C gcc
./mainc.py c.y
Traceback (most recent call last):
  File "./mainc.py", line 43, in <module>
    import YaccParser,YaccLexer
  File "../YaccParser.py", line 77
    except antlr.RecognitionException, ex:
                                     ^
SyntaxError: invalid syntax
make[1]: *** [pt_c.y] Error 1
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.

Linux

rm -f *.pyc
make -C simple clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/sol'
make -C gcc
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/gcc'
./mainc.py c.y
bison -d pt_c.y -o pt_c.tab.cc
make[1]: bison: Command not found
Makefile:59: recipe for target 'pt_c.tab.cc' failed
make[1]: *** [pt_c.tab.cc] Error 127
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/gcc'
Makefile:35: recipe for target 'TARGET' failed
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.

Please help me.
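
The two logs above appear to fail for different reasons. On macOS, YaccParser.py is rejected at the Python 2-only "except ..., ex" syntax, which is what happens when a Python 3 interpreter runs it. On Linux, bison is simply not installed. A small pre-flight check before running build.sh can surface both kinds of problem early. The sketch below is an illustration only: missing_tools is a hypothetical helper, and the tool list in the usage note is taken from the commands visible in the build logs, not from an official requirements list.

```shell
# Hypothetical pre-flight helper (not part of Deckard): print the names of
# the given tools that cannot be found on PATH, separated by spaces.
missing_tools() {
    _missing=""
    for _tool in "$@"; do
        if ! command -v "$_tool" >/dev/null 2>&1; then
            _missing="$_missing $_tool"
        fi
    done
    # Strip the leading space left by the concatenation above.
    printf '%s\n' "${_missing# }"
}
```

For example, missing_tools make g++ bison flex ar prints nothing when all five are available. Which Python interpreter the ./main*.py scripts pick up is a separate question that this check does not answer.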

how to use a slice ?

Hi,
I am trying to detect clones from a slice, how can I use Deckard to detect clones from a slice?

Thanks!

Building errors

Hi,
I want to build Deckard, but the build stops with "Error: ptgen make failed. Exit." and "Error: ptgen make failed. Deckard build fails."
I have tried the solutions from other issues, such as installing the newest versions of the packages and editing src/ptgen/gcc/mainc.py to use python2.
I also changed my OS to Ubuntu 12.
I still get the errors below.
Can anyone help me? Thanks a lot!

syu@ubuntu:~/workspaces/Deckard/src/main$ sudo ./build.sh
rm -f *.pyc
make -C simple clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
make -C gcc
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
./mainc.py c.y
bison -d pt_c.y -o pt_c.tab.cc
pt_c.y: conflicts: 11 shift/reduce
flex -olex.yy.cc c.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
g++ -O3 -I../../include -c -o pt_c.tab.o pt_c.tab.cc
pt_c.tab.cc: In function ‘int yyparse()’:
pt_c.tab.cc:13685:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_c.tab.cc:13827:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv gccptgen.a lex.yy.o pt_c.tab.o head.o
a - lex.yy.o
a - pt_c.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
make -C java
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/java'
./mainj.py j.y
bison -d pt_j.y -o pt_j.tab.cc
pt_j.y: conflicts: 24 shift/reduce, 259 reduce/reduce
flex -olex.yy.cc j.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
g++ -O3 -I../../include -c -o pt_j.tab.o pt_j.tab.cc
pt_j.tab.cc: In function ‘int yyparse()’:
pt_j.tab.cc:17408:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_j.tab.cc:17550:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv javaptgen.a lex.yy.o pt_j.tab.o head.o
a - lex.yy.o
a - pt_j.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/java'
make -C php5
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
./mainphp.py zend_language_parser.y
sed -i -e "s/'\"'/'\\\\\"'/" head.cc
bison -d pt_zend_language_parser.y -o pt_zend_language_parser.tab.cc
flex -i -olex.yy.cc zend_language_scanner.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
zend_language_scanner.l: In function ‘int yylex(YYSTYPE*)’:
zend_language_scanner.l:906:67: warning: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Wformat]
zend_language_scanner.l:906:67: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat]
lex.yy.cc:4873:57: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘int yy_get_next_buffer()’:
lex.yy.cc:4894:61: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4962:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4975:3: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4975:3: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5005:68: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yyunput(int, char*)’:
lex.yy.cc:5102:54: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_create_buffer(FILE*, int)’:
lex.yy.cc:5261:65: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5270:65: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yyensure_buffer_stack()’:
lex.yy.cc:5427:71: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5447:71: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_scan_buffer(char*, yy_size_t)’:
lex.yy.cc:5473:63: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_scan_bytes(const char*, int)’:
lex.yy.cc:5522:62: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5531:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yy_push_state(int)’:
lex.yy.cc:5557:68: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yy_pop_state()’:
lex.yy.cc:5568:53: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o pt_zend_language_parser.tab.o pt_zend_language_parser.tab.cc
pt_zend_language_parser.tab.cc: In function ‘int yyparse()’:
pt_zend_language_parser.tab.cc:11522:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_zend_language_parser.tab.cc:11664:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv phpptgen.a lex.yy.o pt_zend_language_parser.tab.o head.o
a - lex.yy.o
a - pt_zend_language_parser.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
make -C sol
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
./mainsol.py solidity.y
bison -d pt_solidity.y -o pt_solidity.tab.cc -v -g
pt_solidity.y:255.1-11: invalid directive: `%precedence'
pt_solidity.y:254.8-10: %type redeclaration for UFIXED
pt_solidity.y:231.62-67: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for FIXED
pt_solidity.y:231.56-60: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BYTE
pt_solidity.y:231.51-54: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BYTES
pt_solidity.y:231.45-49: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for UINT
pt_solidity.y:231.40-43: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for INT
pt_solidity.y:231.36-38: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for VAR
pt_solidity.y:231.32-34: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for STRING
pt_solidity.y:231.25-30: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BOOL
pt_solidity.y:231.20-23: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for ADDRESS
pt_solidity.y:231.12-18: previous declaration
pt_solidity.y:270.1-11: invalid directive: `%precedence'
pt_solidity.y:269.8-10: %type redeclaration for DELETE
pt_solidity.y:233.39-44: previous declaration
pt_solidity.y:269.8-10: %type redeclaration for AFTER
pt_solidity.y:233.33-37: previous declaration
pt_solidity.y:273.1-11: invalid directive: `%precedence'
make[1]: *** [pt_solidity.tab.cc] Error 1
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.

Make error

Error starts at line...

"make[1]: execvp: ./mainc.py: Permission denied"

and then ends at...

"make: *** No rule to make target `../ptgen/gcc/gccptgen.a', needed by `cvecgen'. Stop."
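
An "execvp: ./mainc.py: Permission denied" from make most often means the script has lost its executable bit (for example after unpacking an archive), although a noexec mount produces the same message. Assuming the first cause, the sketch below restores the bit on the parser generator scripts. The function name is hypothetical, and the pattern main*.py is inferred from the script names that appear in the build logs in other issues (mainc.py, mainj.py, mainphp.py, mainsol.py).

```shell
# Hypothetical helper (not part of Deckard): give the owner execute
# permission on every main*.py script below the given directory.
make_generators_executable() {
    # $1: the ptgen source directory, e.g. src/ptgen
    find "$1" -type f -name 'main*.py' -exec chmod u+x {} +
}
```

After running it on src/ptgen, re-running build.sh should show whether the missing gccptgen.a target was only a follow-on error of the failed generator step.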

Problem in running Deckard for C project

Hi, and thanks for the tool!
I set up a config file to test my C project, following the sample in scripts/clonedetect, but I get this error after running ./deckard.sh:

==== Configuration checking...Error: missing file ~/Deckard-rel2.0solidity/scripts/clonedetect/src/main/cvecgen. Check your config

Any suggestions?
Thanks in advance.
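
The path in the error message ends in scripts/clonedetect/src/main/cvecgen, which suggests (this is an inference, not a confirmed diagnosis) that DECKARD_DIR in the config points at scripts/clonedetect rather than at the root of the Deckard checkout, or that cvecgen was never built. The sketch below checks a candidate DECKARD_DIR against the layout used by the sample config, where the C generator is expected at $DECKARD_DIR/src/main/cvecgen. The function name check_vgen is hypothetical.

```shell
# Hypothetical helper (not part of Deckard): verify that a vector generator
# exists and is executable under a candidate DECKARD_DIR.
check_vgen() {
    # $1: candidate DECKARD_DIR; $2: generator name, e.g. cvecgen or jvecgen
    _vgen="$1/src/main/$2"
    if [ -x "$_vgen" ]; then
        echo "ok: $_vgen"
    else
        echo "missing: $_vgen"
        return 1
    fi
}
```

A "missing" result for the repository root would point to an incomplete build rather than a config problem.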

typefile and nodefiles

I noticed that the Deckard 2 config parameters for TYPE_FILE, RELEVANT_NODEFILE, LEAF_NODEFILE and PARENT_NODEFILE of the sample config point to the nonexistent directory Deckard/testdata.
I assume that they are pretty important, as the detection outputs a lot of garbage if they are not changed.

What is supposed to be in these files? I assume they describe the node types for the ASTs, but I can't figure out how to specify them.

I'm using Java and want to run Deckard on BigCloneEval. The clones should have method level granularity.
It is especially important that I can configure Deckard to prune irrelevant NODE types early, as I want to run a performance analysis and comparison, and it doesn't feel fair to run Deckard on a lot more ASTs than necessary.

Command line options for filter IDs not implemented

When I run the bugfiltering command, the results showed "Command line options for filters IDs not implemented" and "Cannot open file : src/AbstractAsyncTableRendering.java". The command I used is "scripts/bugdetect/bugfiltering samples/clusters/post_cluster_vdb_50_0_allg_0.95_30 java > bug_result". Do you have any idea about how to solve the problem? Thanks.

post_cluster file is 0 bytes

I ran Deckard on the code of about 30 Java projects. The resulting cluster_vdb_50_0_allg_0.95_30 file is not empty, but the corresponding post_cluster_vdb_50_0_allg_0.95_30 file is. Why does this happen? Is it because the cluster file contains too many suspicious clones, so that post-processing excludes all of them and leaves the post_cluster file empty?
Screenshot from 2020-03-15 16-06-52

Why does Deckard act differently from one run to another?

Hi,
Thank you for your great tool.
I am currently using Deckard for my research. However, when I run it multiple times with the same set of hyperparameters on the same dataset, I get different results. This affects the reproducibility of my research. Is there any way to set a random seed?

Kind regards.

Clone detection failure?need help

I followed the steps in README.md.
After installing Deckard, I wanted to test the clone detection.
I created a "config" file in /home/xx/projects/Deckard with the same content as the "config" in /sample.
The configuration file is as follows:


FILE_PATTERN='*.java' # used in the 'find' command below
#where are the source files?
SRC_DIR="src"

#####The following are for Deckard2's support for dot only####

PDG_DIR="ddgs" # used by Deckard2 for 'find $SRC_DIR -ipath "*/$PDG_DIR/$FILE_PATTERN"'
AST_DIR="asts" # each pdg should have an ast with the same name in a different folder
#where are node definition files? used by Deckard2
TYPE_FILE='/home/ly/projects/Deckard/testdata/deckard3/AstNodeTypeNamesIDs.txt'
RELEVANT_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstRelevantNodes.txt'
LEAF_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstLeafNodes.txt'
PARENT_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstParentNodes.txt'
#####The above are for Deckard2 only #####

#where is Deckard?
DECKARD_DIR="/home/ly/projects/Deckard"
#clone parameters; refer to paper.
MIN_TOKENS='30 50' # can be a sequence of integers
STRIDE='2 0' # can be a sequence of integers
SIMILARITY='1.0 0.95' # can be a sequence of values <= 1
#DISTANCE='0 0.70711 1.58114 2.236'

###########################################################
#Where to store result files?

#where to output generated vectors?
VECTOR_DIR="vectors"
#where to output detected clone clusters?
CLUSTER_DIR="clusters"
#where to output timing/debugging info?
TIME_DIR="times"

##########################################################
#where are several programs we need?

#where is the vector generator?
VGEN_EXEC="$DECKARD_DIR/src"
case $FILE_PATTERN in
*.dot )
VGEN_EXEC="$VGEN_EXEC/dot2d/dotvgen" ;; # for Deckard2 dot only
*.java )
VGEN_EXEC="$VGEN_EXEC/main/jvecgen" ;;
*.php )
VGEN_EXEC="$VGEN_EXEC/main/phpvecgen" ;;
*.c | *.h )
VGEN_EXEC="$VGEN_EXEC/main/cvecgen" ;;

* )
echo "Error: invalid FILE_PATTERN: $FILE_PATTERN"
VGEN_EXEC="$VGEN_EXEC/invalidvecgen" ;;
esac
#how to divide the vectors into groups?
GROUPING_EXEC="$DECKARD_DIR/src/vgen/vgrouping/runvectorsort"
#where is the lsh for vector clustering/querying?
CLUSTER_EXEC="$DECKARD_DIR/src/lsh/bin/enumBuckets"
QUERY_EXEC="$DECKARD_DIR/src/lsh/bin/queryBuckets"
#how to post process clone groups?
POSTPRO_EXEC="$DECKARD_DIR/scripts/clonedetect/post_process_groupfile"
#how to transform source code html? Used by Deckard1 only
SRC2HTM_EXEC=source-highlight
SRC2HTM_OPTS=--line-number-ref

MAX_PROCS=8

GROUPING_S='30' # should be a single value
#GROUPING_D
#GROUPING_C

export DECKARD_DIR
export FILE_PATTERN
export SRC_DIR
export PDG_DIR
export AST_DIR

export TYPE_FILE
export RELEVANT_NODEFILE
export LEAF_NODEFILE
export PARENT_NODEFILE

export VECTOR_DIR
export TIME_DIR
export CLUSTER_DIR

export VGEN_EXEC
export GROUPING_EXEC
export CLUSTER_EXEC
export POSTPRO_EXEC
export SRC2HTM_EXEC
export SRC2HTM_OPTS

export MIN_TOKENS
export STRIDE
#export DISTANCE
export SIMILARITY
export GROUPING_S
export GROUPING_D
export GROUPING_C
export MAX_PROCS


But when I follow the next step and run it, I get an error:


ly@ubuntu:~/projects/Deckard$ sh /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh
DECKARD--A Tree-Based Code Clone Detection Toolkit.
/home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: 4: /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: [[: not found

  • Version Unknown. Missing README.
    Copyright (c) 2007-2018. University of California / Singapore Management University
    Distributed under the three-clause BSD license.

==== Configuration checking.../home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: 81: /home/ly/projects/Deckard/scripts/clonedetect/configure: [[: not found
Error: no config file in current directory


I don't know how to fix it.
Can someone give me some advice? Thanks.
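
The "[[: not found" messages are what a plain POSIX shell such as dash (the default /bin/sh on Ubuntu) prints when a script uses the bash-only [[ ... ]] test, and the run above starts the script with "sh". The later "no config file" error may then simply be a consequence of those failed tests. As a rough illustration, the sketch below chooses an interpreter by looking for [[ in the script text. The function name pick_shell is hypothetical, and the text search is a crude heuristic, not a real parser.

```shell
# Hypothetical helper (not part of Deckard): suggest "bash" for scripts that
# contain the bash-only [[ construct, and "sh" otherwise.
pick_shell() {
    # $1: path to a shell script
    if grep -q '\[\[' "$1"; then
        echo bash
    else
        echo sh
    fi
}
```

In practice it is usually enough to run the script from the directory that contains the config file with "bash /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh" instead of "sh ...".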

On Bugfiltering

Line 52 of scripts/bugfiltering reads:
filterpath = os.environ.get("DECKARD_DIR")

When run from bash, the script crashes, stating that it cannot find the Deckard path.
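
In Python, os.environ.get returns None when the variable is not present in the process environment, so the quoted line only works if DECKARD_DIR has been exported by the calling shell. A guard like the one below can make that requirement explicit before bugfiltering is started. The function name require_env is hypothetical, and whether a missing export is the actual cause of this crash is an assumption.

```shell
# Hypothetical helper (not part of Deckard): fail early when a variable is
# not exported (or is empty) in the environment passed to child processes.
require_env() {
    # $1: variable name, e.g. DECKARD_DIR
    if [ -z "$(printenv "$1")" ]; then
        echo "Error: $1 is not exported" >&2
        return 1
    fi
}
```

A typical use would be to export DECKARD_DIR with the path of the Deckard checkout, call require_env DECKARD_DIR, and only then run scripts/bugdetect/bugfiltering.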

Error: problem in vec generator step. Stop and check logs in times/

I receive this error message when running on sample code in /Deckard/samples/src
DECKARD--A Tree-Based Code Clone Detection Toolkit.

  • Version 2.0 + support for Solidity syntax
    Copyright (c) 2007-2018. University of California / Singapore Management University
    Distributed under the three-clause BSD license.

==== Configuration checking...Done.

==== Start clone detection ====

Vector generation.../home/shijing/ra/codeReuse/Deckard/src/main/jvecgen *.java

vgen: 30 2 ...Done. Log: times/vgen_30_2
...deleting intermediate vector files...Done

vgen: 30 0 ...Done. Log: times/vgen_30_0
...deleting intermediate vector files...Done

vgen: 50 2 ...Done. Log: times/vgen_50_2
...deleting intermediate vector files...Done

vgen: 50 0 ...Done. Log: times/vgen_50_0
...deleting intermediate vector files...Done

Error: problem in vec generator step. Stop and check logs in times/

Has anyone encountered a similar situation?

Vec generator failure

I ran into a problem when running clone detection on my C code. The output is:
"Error: problem in vec generator step. Stop and check logs in times/"
Could you tell me what the problem might be? Thanks a lot.
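
Both this report and the previous one stop at the same message, which points to the per-run logs in times/ (named like times/vgen_30_2 in the output quoted in the previous issue). A quick way to narrow things down is to list which of those logs mention an error at all. The sketch below does that. The function name is hypothetical, and the search word "error" is a guess at what the generator writes, since the log contents are not shown in either report.

```shell
# Hypothetical helper (not part of Deckard): list vgen_* logs in a directory
# that contain the word "error" (case-insensitive).
failed_vgen_logs() {
    # $1: the TIME_DIR of the run, e.g. times
    grep -il 'error' "$1"/vgen_* 2>/dev/null
}
```

Posting the contents of any file it lists would give maintainers something concrete to work with.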

Upgrade to Python 3

Most of the Yacc parser (and maybe other portions) was written in Python 2. Since Python 2 reached end of life in 2020, we should update the codebase to use Python 3.

What parameters are fine? Need help

Recently I tried Deckard to find bugs in cloned code. It does not work well for the following Java file. I set
MIN_TOKENS='15' STRIDE='2' SIMILARITY='0.8'.
According to the FSE '07 paper, it should be able to find the bug.
The buggy lines are:
cmp = lhsType.compareTo(lhsType);
if (cmp != 0)
return cmp;
They look similar to code several lines earlier:
cmp = lhsName.compareTo(rhsName);
if (cmp != 0)
return cmp;
Can anyone help me?

public class VersionInsensitiveBugComparator implements WarningComparator {

private ClassNameRewriter classNameRewriter = IdentityClassNameRewriter.instance();

private boolean exactBugPatternMatch = true;

private boolean comparePriorities = false;
public VersionInsensitiveBugComparator() {
}

public void setClassNameRewriter(ClassNameRewriter classNameRewriter) {
    this.classNameRewriter = classNameRewriter; 
}
public void setComparePriorities(boolean b) {
    comparePriorities = b;
}

/**
 * Wrapper for BugAnnotation iterators, which filters out
 * annotations we don't care about.
 */
private class FilteringAnnotationIterator implements Iterator<BugAnnotation> {
    private Iterator<BugAnnotation> iter;
    private BugAnnotation next;

    public FilteringAnnotationIterator(Iterator<BugAnnotation> iter) {
        this.iter = iter;
        this.next = null;
    }

    public boolean hasNext() {
        findNext();
        return next != null;
    }

    public BugAnnotation next() {
        findNext();
        if (next == null)
            throw new NoSuchElementException();
        BugAnnotation result = next;
        next = null;
        return result;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    private void findNext() {
        while (next == null) {
            if (!iter.hasNext())
                break;
            BugAnnotation candidate = iter.next();
            if (!isBoring(candidate)) {
                next = candidate;
                break;
            }
        }
    }

}

private boolean isBoring(BugAnnotation annotation) {
    return !annotation.isSignificant();
}

private static int compareNullElements(Object a, Object b) {
    if (a != null)
        return 1;
    else if (b != null)
        return -1;
    else
        return 0;
}

private static String getCode(String pattern) {
    int sep = pattern.indexOf('_');
    if (sep < 0)
        return "";
    return pattern.substring(0, sep);
}

public int compare(BugInstance lhs, BugInstance rhs) {
    // Attributes of BugInstance.
    // Compare abbreviation 
    // Compare class and method annotations (ignoring line numbers).
    // Compare field annotations.

    int cmp;

    BugPattern lhsPattern = lhs.getBugPattern();
    BugPattern rhsPattern = rhs.getBugPattern();

    if (lhsPattern == null || rhsPattern == null) {
        // One of the patterns is missing.
        // However, we can still accurately match by abbrev (usually) by comparing
        // the part of the type before the first '_' character.
        // This is almost always equivalent to the abbrev.

        String lhsCode = getCode(lhs.getType());
        String rhsCode = getCode(rhs.getType());

        if ((cmp = lhsCode.compareTo(rhsCode)) != 0) {
            return cmp;
        }
    } else {
        // Compare by abbrev instead of type. The specific bug type can change
        // (e.g., "definitely null" to "null on simple path").  Also, we often
        // change bug pattern types from one version of FindBugs to the next.
        //
        // Source line and field name are still matched precisely, so this shouldn't
        // cause loss of precision.
        if ((cmp = lhsPattern.getAbbrev().compareTo(rhsPattern.getAbbrev())) != 0)
            return cmp;
        if (isExactBugPatternMatch() && (cmp = lhsPattern.getType().compareTo(rhsPattern.getType())) != 0)
            return cmp;
    }




    if (comparePriorities) {
        cmp = lhs.getPriority() - rhs.getPriority();
        if (cmp != 0) return cmp;
    }


    Iterator<BugAnnotation> lhsIter = new FilteringAnnotationIterator(lhs.annotationIterator());
    Iterator<BugAnnotation> rhsIter = new FilteringAnnotationIterator(rhs.annotationIterator());

    while (lhsIter.hasNext() && rhsIter.hasNext()) {
        BugAnnotation lhsAnnotation = lhsIter.next();
        BugAnnotation rhsAnnotation = rhsIter.next();

        // Different annotation types obviously cannot be equal,
        // so just compare by class name.
        if (lhsAnnotation.getClass() != rhsAnnotation.getClass())
            return lhsAnnotation.getClass().getName().compareTo(rhsAnnotation.getClass().getName());

        if (lhsAnnotation.getClass() == ClassAnnotation.class) {
            // ClassAnnotations should have their class names rewritten to
            // handle moved and renamed classes.

            String lhsClassName = classNameRewriter.rewriteClassName(
                    ((ClassAnnotation)lhsAnnotation).getClassName());
            String rhsClassName = classNameRewriter.rewriteClassName(
                    ((ClassAnnotation)rhsAnnotation).getClassName());

            cmp = lhsClassName.compareTo(rhsClassName);
            if (cmp != 0)
                return cmp;
        } else if(lhsAnnotation.getClass() == MethodAnnotation.class ) {
            // Rewrite class names in MethodAnnotations
            MethodAnnotation lhsMethod = ClassNameRewriterUtil.convertMethodAnnotation(
                    classNameRewriter, (MethodAnnotation) lhsAnnotation);
            MethodAnnotation rhsMethod = ClassNameRewriterUtil.convertMethodAnnotation(
                    classNameRewriter, (MethodAnnotation) rhsAnnotation);

            cmp = lhsMethod.compareTo(rhsMethod);
            if (cmp != 0)
                return cmp;

        } else if(lhsAnnotation.getClass() == FieldAnnotation.class) {
            // Rewrite class names in FieldAnnotations
            FieldAnnotation lhsField = ClassNameRewriterUtil.convertFieldAnnotation(
                    classNameRewriter, (FieldAnnotation) lhsAnnotation);
            FieldAnnotation rhsField = ClassNameRewriterUtil.convertFieldAnnotation(
                    classNameRewriter, (FieldAnnotation) rhsAnnotation);

            cmp = lhsField.compareTo(rhsField);
            if (cmp != 0)
                return cmp;
        } else if(lhsAnnotation.getClass() == StringAnnotation.class) {
            // StringAnnotations carry no class names; compare the values directly.
            String lhsString = ((StringAnnotation)lhsAnnotation).getValue();
            String rhsString = ((StringAnnotation)rhsAnnotation).getValue();
            cmp = lhsString.compareTo(rhsString);
            if (cmp != 0)
                return cmp;
        } else if(lhsAnnotation.getClass() == LocalVariableAnnotation.class) {
            // Compare local variable names; two unknown names ("?") are treated as equal.
            String lhsName = ((LocalVariableAnnotation)lhsAnnotation).getName();
            String rhsName = ((LocalVariableAnnotation)rhsAnnotation).getName();
            if (lhsName.equals("?") && rhsName.equals("?"))
                continue;
            cmp = lhsName.compareTo(rhsName);
            if (cmp != 0)
                return cmp;
        } else if(lhsAnnotation.getClass() == TypeAnnotation.class) {
            // Rewrite class names in the type descriptors before comparing.
            String lhsType = ((TypeAnnotation)lhsAnnotation).getTypeDescriptor();
            String rhsType = ((TypeAnnotation)rhsAnnotation).getTypeDescriptor();
            lhsType = ClassNameRewriterUtil.rewriteSignature(classNameRewriter, lhsType);
            rhsType = ClassNameRewriterUtil.rewriteSignature(classNameRewriter, rhsType);
            cmp = lhsType.compareTo(rhsType);
            if (cmp != 0)
                return cmp;
        } else if(lhsAnnotation.getClass() == IntAnnotation.class) {
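            // IntAnnotations carry no class names; compare the values directly.
            int lhsValue = ((IntAnnotation)lhsAnnotation).getValue();
            int rhsValue = ((IntAnnotation)rhsAnnotation).getValue();
            // Integer.compare avoids the overflow that plain subtraction can cause.
            cmp = Integer.compare(lhsValue, rhsValue);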
            if (cmp != 0)
                return cmp;
        } else if (isBoring(lhsAnnotation)) {
            throw new IllegalStateException("Impossible");
        } else
            throw new IllegalStateException("Unknown annotation type: " + lhsAnnotation.getClass().getName());
    }

    if (rhsIter.hasNext())
        return -1;
    else if (lhsIter.hasNext())
        return 1;
    else
        return 0;
}

/**
 * @param exactBugPatternMatch The exactBugPatternMatch to set.
 */
public void setExactBugPatternMatch(boolean exactBugPatternMatch) {
    this.exactBugPatternMatch = exactBugPatternMatch;
}

/**
 * @return Returns the exactBugPatternMatch.
 */
public boolean isExactBugPatternMatch() {
    return exactBugPatternMatch;
}

}

Build fails

The build fails. It seems to be related to the Solidity parser.

/mainsol.py solidity.y
bison -d pt_solidity.y -o pt_solidity.tab.cc -v -g
pt_solidity.y:213.9-15: syntax error, unexpected identifier, expecting string
make[1]: *** [pt_solidity.tab.cc] Error 1
make: *** [TARGET] Error 2

Building error

Hi, I got this error when running build.sh:
rm -f *.pyc
make -C simple clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/sol'
make -C gcc
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
./mainc.py c.y
Traceback (most recent call last):
  File "/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc/./mainc.py", line 43, in <module>
    import YaccParser,YaccLexer
  File "/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc/../YaccParser.py", line 8
    False = 0
    ^^^^^
SyntaxError: cannot assign to False
make[1]: *** [Makefile:62: pt_c.y] Error 1
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
make: *** [Makefile:35: TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.
It seems that YaccParser.py assigns to False, which Python rejects.
Is my environment wrong, or did something else go wrong?
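
The traceback is consistent with the script being run by a Python 3 interpreter: `False = 0` was legal in Python 2, where `False` was an ordinary name, but it is a syntax error in Python 3, where `False` is a keyword. A minimal sketch to check how a given interpreter treats that line (the helper name `accepts_false_assignment` is hypothetical and not part of Deckard):

```python
import sys


def accepts_false_assignment():
    """Return True if this interpreter can compile `False = 0`."""
    try:
        # compile() only parses the statement; it is never executed.
        compile("False = 0", "<check>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("Python", sys.version_info[0], "accepts `False = 0`:",
          accepts_false_assignment())
```

If the assignment is rejected, the two obvious directions are to run the ptgen scripts with a Python 2 interpreter or to port them to Python 3; which of these the project intends is not stated here.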

Clone detection on sample fails(?)

After I put my directory into the config in the sample directory, I can run the clone detection but I get the following output:

= Vector clustering w/ MIN_TOKENS=30, STRIDE=2, SIMILARITY=0.95 ...

grouping: vectors/vdb_30_2 with distance=5,477226...Done grouping 30 2 5,477226. See >groups in vectors/vdb_30_2_g[0-9]_5,477226_30
paramsetting: 30 2 0.95 ...Error: paramsetting failure: no vector group found: 30 2 0.95
Error: problem in vec clustering step. Stop and check logs in times/

So I'm not sure I can trust what is output in clusters/post_cluster...

What is wrong?

Thanks,
Stefan
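
One detail that stands out in the output above is the decimal comma in `distance=5,477226`, while the similarity is written as `0.95`. A possible explanation, offered as an assumption rather than a confirmed diagnosis, is that a locale with a comma as the decimal separator changes how the distance is formatted, so later steps that expect a dot no longer find the vector group files. The sketch below only illustrates why such a value is hard to consume downstream; `parse_distance` is a hypothetical helper, not part of Deckard.

```python
def parse_distance(text):
    """Parse a distance string; return None if it is not dot-decimal."""
    try:
        return float(text)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_distance("5.477226"))  # dot-decimal parses
    print(parse_distance("5,477226"))  # comma-decimal does not
```

If the locale is the cause, rerunning the script with `LC_ALL=C` set in the environment is a common way to test that; whether it resolves this particular failure has not been verified here.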
