skyhover / deckard
Code clone detection; clone-related bug detection; semantic clone analysis
License: Other
I recently tried Deckard to find bugs in cloned code, but it does not work well on the following Java file. I set
MIN_TOKENS='15' STRIDE='2' SIMILARITY='0.8'.
According to the FSE '07 paper, it is supposed to find this bug.
The buggy lines are:
cmp = lhsType.compareTo(lhsType);
if (cmp != 0)
return cmp;
They look similar to these lines a little earlier in the file:
cmp = lhsName.compareTo(rhsName);
if (cmp != 0)
return cmp;
Can anyone help me?
public class VersionInsensitiveBugComparator implements WarningComparator {
private ClassNameRewriter classNameRewriter = IdentityClassNameRewriter.instance();
private boolean exactBugPatternMatch = true;
private boolean comparePriorities = false;
public VersionInsensitiveBugComparator() {
}
public void setClassNameRewriter(ClassNameRewriter classNameRewriter) {
this.classNameRewriter = classNameRewriter;
}
public void setComparePriorities(boolean b) {
comparePriorities = b;
}
/**
* Wrapper for BugAnnotation iterators, which filters out
* annotations we don't care about.
*/
private class FilteringAnnotationIterator implements Iterator<BugAnnotation> {
private Iterator<BugAnnotation> iter;
private BugAnnotation next;
public FilteringAnnotationIterator(Iterator<BugAnnotation> iter) {
this.iter = iter;
this.next = null;
}
public boolean hasNext() {
findNext();
return next != null;
}
public BugAnnotation next() {
findNext();
if (next == null)
throw new NoSuchElementException();
BugAnnotation result = next;
next = null;
return result;
}
public void remove() {
throw new UnsupportedOperationException();
}
private void findNext() {
while (next == null) {
if (!iter.hasNext())
break;
BugAnnotation candidate = iter.next();
if (!isBoring(candidate)) {
next = candidate;
break;
}
}
}
}
private boolean isBoring(BugAnnotation annotation) {
return !annotation.isSignificant();
}
private static int compareNullElements(Object a, Object b) {
if (a != null)
return 1;
else if (b != null)
return -1;
else
return 0;
}
private static String getCode(String pattern) {
int sep = pattern.indexOf('_');
if (sep < 0)
return "";
return pattern.substring(0, sep);
}
public int compare(BugInstance lhs, BugInstance rhs) {
// Attributes of BugInstance.
// Compare abbreviation
// Compare class and method annotations (ignoring line numbers).
// Compare field annotations.
int cmp;
BugPattern lhsPattern = lhs.getBugPattern();
BugPattern rhsPattern = rhs.getBugPattern();
if (lhsPattern == null || rhsPattern == null) {
// One of the patterns is missing.
// However, we can still accurately match by abbrev (usually) by comparing
// the part of the type before the first '_' character.
// This is almost always equivalent to the abbrev.
String lhsCode = getCode(lhs.getType());
String rhsCode = getCode(rhs.getType());
if ((cmp = lhsCode.compareTo(rhsCode)) != 0) {
return cmp;
}
} else {
// Compare by abbrev instead of type. The specific bug type can change
// (e.g., "definitely null" to "null on simple path"). Also, we often
// change bug pattern types from one version of FindBugs to the next.
//
// Source line and field name are still matched precisely, so this shouldn't
// cause loss of precision.
if ((cmp = lhsPattern.getAbbrev().compareTo(rhsPattern.getAbbrev())) != 0)
return cmp;
if (isExactBugPatternMatch() && (cmp = lhsPattern.getType().compareTo(rhsPattern.getType())) != 0)
return cmp;
}
if (comparePriorities) {
cmp = lhs.getPriority() - rhs.getPriority();
if (cmp != 0) return cmp;
}
Iterator<BugAnnotation> lhsIter = new FilteringAnnotationIterator(lhs.annotationIterator());
Iterator<BugAnnotation> rhsIter = new FilteringAnnotationIterator(rhs.annotationIterator());
while (lhsIter.hasNext() && rhsIter.hasNext()) {
BugAnnotation lhsAnnotation = lhsIter.next();
BugAnnotation rhsAnnotation = rhsIter.next();
// Different annotation types obviously cannot be equal,
// so just compare by class name.
if (lhsAnnotation.getClass() != rhsAnnotation.getClass())
return lhsAnnotation.getClass().getName().compareTo(rhsAnnotation.getClass().getName());
if (lhsAnnotation.getClass() == ClassAnnotation.class) {
// ClassAnnotations should have their class names rewritten to
// handle moved and renamed classes.
String lhsClassName = classNameRewriter.rewriteClassName(
((ClassAnnotation)lhsAnnotation).getClassName());
String rhsClassName = classNameRewriter.rewriteClassName(
((ClassAnnotation)rhsAnnotation).getClassName());
cmp = lhsClassName.compareTo(rhsClassName);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == MethodAnnotation.class ) {
// Rewrite class names in MethodAnnotations
MethodAnnotation lhsMethod = ClassNameRewriterUtil.convertMethodAnnotation(
classNameRewriter, (MethodAnnotation) lhsAnnotation);
MethodAnnotation rhsMethod = ClassNameRewriterUtil.convertMethodAnnotation(
classNameRewriter, (MethodAnnotation) rhsAnnotation);
cmp = lhsMethod.compareTo(rhsMethod);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == FieldAnnotation.class) {
// Rewrite class names in FieldAnnotations
FieldAnnotation lhsField = ClassNameRewriterUtil.convertFieldAnnotation(
classNameRewriter, (FieldAnnotation) lhsAnnotation);
FieldAnnotation rhsField = ClassNameRewriterUtil.convertFieldAnnotation(
classNameRewriter, (FieldAnnotation) rhsAnnotation);
cmp = lhsField.compareTo(rhsField);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == StringAnnotation.class) {
// Rewrite class names in FieldAnnotations
String lhsString = ((StringAnnotation)lhsAnnotation).getValue();
String rhsString = ((StringAnnotation)rhsAnnotation).getValue();
cmp = lhsString.compareTo(rhsString);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == LocalVariableAnnotation.class) {
// Rewrite class names in FieldAnnotations
String lhsName = ((LocalVariableAnnotation)lhsAnnotation).getName();
String rhsName = ((LocalVariableAnnotation)rhsAnnotation).getName();
if (lhsName.equals("?") && rhsName.equals("?"))
continue;
cmp = lhsName.compareTo(rhsName);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == TypeAnnotation.class) {
// Rewrite class names in FieldAnnotations
String lhsType = ((TypeAnnotation)lhsAnnotation).getTypeDescriptor();
String rhsType = ((TypeAnnotation)rhsAnnotation).getTypeDescriptor();
lhsType = ClassNameRewriterUtil.rewriteSignature(classNameRewriter, lhsType);
rhsType = ClassNameRewriterUtil.rewriteSignature(classNameRewriter, rhsType);
cmp = lhsType.compareTo(lhsType);
if (cmp != 0)
return cmp;
} else if(lhsAnnotation.getClass() == IntAnnotation.class) {
// Rewrite class names in FieldAnnotations
int lhsValue = ((IntAnnotation)lhsAnnotation).getValue();
int rhsValue = ((IntAnnotation)rhsAnnotation).getValue();
cmp = lhsValue - rhsValue;
if (cmp != 0)
return cmp;
} else if (isBoring(lhsAnnotation)) {
throw new IllegalStateException("Impossible");
} else
throw new IllegalStateException("Unknown annotation type: " + lhsAnnotation.getClass().getName());
}
if (rhsIter.hasNext())
return -1;
else if (lhsIter.hasNext())
return 1;
else
return 0;
}
/**
* @param exactBugPatternMatch The exactBugPatternMatch to set.
*/
public void setExactBugPatternMatch(boolean exactBugPatternMatch) {
this.exactBugPatternMatch = exactBugPatternMatch;
}
/**
* @return Returns the exactBugPatternMatch.
*/
public boolean isExactBugPatternMatch() {
return exactBugPatternMatch;
}
}
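The self-comparison in the buggy line can also be caught with a plain text search, independent of clone detection. The sketch below is only a sanity check and not how Deckard works. The name find_self_compare is made up for illustration, and the pattern relies on backreferences in grep -E, which GNU and BSD grep support as an extension.

```bash
#!/usr/bin/env bash
# Flag calls of the form x.compareTo(x), where the receiver and the argument
# are the same identifier. Reads Java source on stdin and prints matches as
# "line:text". find_self_compare is a hypothetical helper, not part of Deckard.
find_self_compare() {
  grep -nE '(^|[^A-Za-z0-9_])([A-Za-z_][A-Za-z0-9_]*)\.compareTo\(\2\)' || true
}

# Example input: the two fragments quoted in the report above.
printf '%s\n' \
  'cmp = lhsName.compareTo(rhsName);' \
  'cmp = lhsType.compareTo(lhsType);' | find_self_compare
```

A check like this only finds the exact x.compareTo(x) shape; it does not replace the clone-based inconsistency detection the report asks about.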
Only PHP 5 is supported at the moment.
When I run the bugfiltering command, the output shows "Command line options for filters IDs not implemented" and "Cannot open file : src/AbstractAsyncTableRendering.java". The command I used is "scripts/bugdetect/bugfiltering samples/clusters/post_cluster_vdb_50_0_allg_0.95_30 java > bug_result". Do you have any idea how to solve this problem? Thanks.
Hi, on a current macOS system the build fails because malloc.h is not where the build expects it to be. I worked around it with a symbolic link, but that is only a workaround. It would be better to fix this in the build script.
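One way a build script could avoid the symbolic link is to choose the header location by platform. A minimal sketch, assuming the only difference is that macOS ships the header as malloc/malloc.h. The function name malloc_header is hypothetical and nothing here is taken from Deckard's build scripts.

```bash
# Map the kernel name reported by "uname -s" to the malloc header path.
# malloc_header is a made-up helper for illustration.
malloc_header() {
  case "$1" in
    Darwin) echo "malloc/malloc.h" ;;
    *)      echo "malloc.h" ;;
  esac
}

echo "malloc header for $(uname -s): $(malloc_header "$(uname -s)")"
```

How the sources would consume this value (for example through a preprocessor define) depends on the build and is left open here.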
Hi
I am trying to compile Deckard on a Linux system, but the build stops because it tries to find "dot2d".
Is this a third-party library that I should add? If so, where should it be placed?
Here is a part of the log:
Everything cool above here:
In braces I translated the error from German to English.
Cheers and Thanks
Hi and thanks for the tool!
I set up a config file to test my C project, following the sample in scripts/clonedetect, but I get this error after running ./deckard.sh:
==== Configuration checking...Error: missing file ~/Deckard-rel2.0solidity/scripts/clonedetect/src/main/cvecgen. Check your config
Any suggestions?
Thanks in advance
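The literal ~ in the reported path suggests one possible cause: a tilde inside quotes is not expanded by the shell, so a quoted DECKARD_DIR that starts with ~ stays a literal string. Whether this is what happened here is an assumption. The path also ends in scripts/clonedetect/src/main, which hints that DECKARD_DIR may point at the scripts directory instead of the Deckard root. The sketch only demonstrates the quoting behaviour; the directory name Deckard is a placeholder.

```bash
# A tilde inside quotes is kept literally, while $HOME is expanded
# inside double quotes.
quoted="~/Deckard"
expanded="$HOME/Deckard"

echo "quoted:   $quoted"
echo "expanded: $expanded"
```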
I receive this error message when running Deckard on the sample code in /Deckard/samples/src:
DECKARD--A Tree-Based Code Clone Detection Toolkit.
==== Configuration checking...Done.
==== Start clone detection ====
Vector generation.../home/shijing/ra/codeReuse/Deckard/src/main/jvecgen *.java
vgen: 30 2 ...Done. Log: times/vgen_30_2
...deleting intermediate vector files...Done
vgen: 30 0 ...Done. Log: times/vgen_30_0
...deleting intermediate vector files...Done
vgen: 50 2 ...Done. Log: times/vgen_50_2
...deleting intermediate vector files...Done
vgen: 50 0 ...Done. Log: times/vgen_50_0
...deleting intermediate vector files...Done
Error: problem in vec generator step. Stop and check logs in times/
Has anyone encountered a similar situation?
Hi,
Thank you for your great tool.
I am currently using Deckard for my research. However, when I run it multiple times with the same set of hyperparameters on the same dataset, I get different results. This affects the reproducibility of my research. Is there any way to set a random seed?
Kind regards.
Line 52 of scripts/bugfiltering:
filterpath = os.environ.get("DECKARD_DIR")
When run from bash, the script crashes, stating that it cannot find the Deckard path.
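os.environ.get returns None when the variable is not present in the environment of the Python process, so a shell variable that is assigned but not exported is invisible to the script. A minimal sketch, assuming that this is the cause; /path/to/Deckard is a placeholder.

```bash
# Assigned but not exported: child processes do not see the variable.
unset DECKARD_DIR
DECKARD_DIR="/path/to/Deckard"
seen_before=$(sh -c 'echo "${DECKARD_DIR:-unset}"')

# After export, child processes inherit the value.
export DECKARD_DIR
seen_after=$(sh -c 'echo "${DECKARD_DIR:-unset}"')

echo "before export: $seen_before"
echo "after export:  $seen_after"
```

Exporting DECKARD_DIR in the same shell session before calling bugfiltering would be the corresponding step in practice.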
The build fails. It seems to be related to the Solidity parser.
./mainsol.py solidity.y
bison -d pt_solidity.y -o pt_solidity.tab.cc -v -g
pt_solidity.y:213.9-15: syntax error, unexpected identifier, expecting string
make[1]: *** [pt_solidity.tab.cc] Error 1
make: *** [TARGET] Error 2
I cannot build Deckard on macOS (Mojave 10.14.5) or on Linux (Ubuntu 18.04.2 LTS).
I ran sh build.sh in src/main/ and got:
rm -f *.pyc
make -C simple clean
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make -C gcc clean
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make -C java clean
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make -C php5 clean
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make -C sol clean
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make -C gcc
./mainc.py c.y
Traceback (most recent call last):
File "./mainc.py", line 43, in <module>
import YaccParser,YaccLexer
File "../YaccParser.py", line 77
except antlr.RecognitionException, ex:
^
SyntaxError: invalid syntax
make[1]: *** [pt_c.y] Error 1
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.
rm -f *.pyc
make -C simple clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/sol'
make -C gcc
make[1]: Entering directory '/home/imseongbin/Deckard/src/ptgen/gcc'
./mainc.py c.y
bison -d pt_c.y -o pt_c.tab.cc
make[1]: bison: Command not found
Makefile:59: recipe for target 'pt_c.tab.cc' failed
make[1]: *** [pt_c.tab.cc] Error 127
make[1]: Leaving directory '/home/imseongbin/Deckard/src/ptgen/gcc'
Makefile:35: recipe for target 'TARGET' failed
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.
Please help me.
I followed the steps in README.md.
After installing Deckard, I wanted to test the clone detection.
I created a "config" file in /home/xx/projects/Deckard with the same content as the "config" in /sample.
The configuration file is as follows:
FILE_PATTERN='*.java' # used in the 'find' command below
#where are the source files?
SRC_DIR="src"
PDG_DIR="ddgs" # used by Deckard2 for 'find $SRC_DIR -ipath "*/$PDG_DIR/$FILE_PATTERN"'
AST_DIR="asts" # each pdg should have an ast with the same name in a different folder
#where are node definition files? used by Deckard2
TYPE_FILE='/home/ly/projects/Deckard/testdata/deckard3/AstNodeTypeNamesIDs.txt'
RELEVANT_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstRelevantNodes.txt'
LEAF_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstLeafNodes.txt'
PARENT_NODEFILE='/home/ly/projects/Deckard/testdata/deckard3/AstParentNodes.txt'
#####The above are for Deckard2 only #####
#where is Deckard?
DECKARD_DIR="/home/ly/projects/Deckard"
#clone parameters; refer to paper.
MIN_TOKENS='30 50' # can be a sequence of integers
STRIDE='2 0' # can be a sequence of integers
SIMILARITY='1.0 0.95' # can be a sequence of values <= 1
#DISTANCE='0 0.70711 1.58114 2.236'
###########################################################
#Where to store result files?
#where to output generated vectors?
VECTOR_DIR="vectors"
#where to output detected clone clusters?
CLUSTER_DIR="clusters"
#where to output timing/debugging info?
TIME_DIR="times"
##########################################################
#where are several programs we need?
#where is the vector generator?
VGEN_EXEC="$DECKARD_DIR/src"
case $FILE_PATTERN in
*.dot )
VGEN_EXEC="$VGEN_EXEC/dot2d/dotvgen" ;; # for Deckard2 dot only
*.java )
VGEN_EXEC="$VGEN_EXEC/main/jvecgen" ;;
*.php )
VGEN_EXEC="$VGEN_EXEC/main/phpvecgen" ;;
*.c | *.h )
VGEN_EXEC="$VGEN_EXEC/main/cvecgen" ;;
MAX_PROCS=8
GROUPING_S='30' # should be a single value
#GROUPING_D
#GROUPING_C
export DECKARD_DIR
export FILE_PATTERN
export SRC_DIR
export PDG_DIR
export AST_DIR
export TYPE_FILE
export RELEVANT_NODEFILE
export LEAF_NODEFILE
export PARENT_NODEFILE
export VECTOR_DIR
export TIME_DIR
export CLUSTER_DIR
export VGEN_EXEC
export GROUPING_EXEC
export CLUSTER_EXEC
export POSTPRO_EXEC
export SRC2HTM_EXEC
export SRC2HTM_OPTS
export MIN_TOKENS
export STRIDE
#export DISTANCE
export SIMILARITY
export GROUPING_S
export GROUPING_D
export GROUPING_C
export MAX_PROCS
But when I run the next step, I get an error:
ly@ubuntu:~/projects/Deckard$ sh /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh
DECKARD--A Tree-Based Code Clone Detection Toolkit.
/home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: 4: /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: [[: not found
==== Configuration checking.../home/ly/projects/Deckard/scripts/clonedetect/deckard.sh: 81: /home/ly/projects/Deckard/scripts/clonedetect/configure: [[: not found
Error: no config file in current directory
I don't know how to fix it.
Can someone give me some advice? Thanks.
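The "[[: not found" messages are what a POSIX sh such as dash prints when it meets the bash-only [[ test, and on Ubuntu sh is usually dash. The "no config file" error is probably a consequence of that failed test, so invoking the script with bash instead of sh is the likely fix. The sketch below only demonstrates that [[ works under bash; the final comment shows the invocation with the path from the report.

```bash
# [[ ... ]] is a bash construct; under bash this prints "ok".
bash -c '[[ "config" == "config" ]] && echo ok'

# Suggested invocation for the case above (bash instead of sh):
#   bash /home/ly/projects/Deckard/scripts/clonedetect/deckard.sh
```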
Most of the Yacc parser (and maybe other portions) was written in Python 2. Since Python 2 reached end of life in 2020, we should update the codebase to use Python 3.
Hi, I got this error when running build.sh:
rm -f *.pyc
make -C simple clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/sol'
make -C gcc
make[1]: Entering directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
./mainc.py c.y
Traceback (most recent call last):
File "/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc/./mainc.py", line 43, in <module>
import YaccParser,YaccLexer
File "/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc/../YaccParser.py", line 8
False = 0
^^^^^
SyntaxError: cannot assign to False
make[1]: *** [Makefile:62: pt_c.y] Error 1
make[1]: Leaving directory '/home/ayf/Deckard-rel2.0solidity/src/ptgen/gcc'
make: *** [Makefile:35: TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.
It seems that YaccParser.py assigns to False, which Python (at least Python 3) does not accept.
Is my environment wrong, or did something else go wrong?
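The earlier traceback (except antlr.RecognitionException, ex:) and this one (False = 0) both show code that is valid Python 2 and invalid Python 3, so the generator scripts are presumably being run by a Python 3 interpreter. Another report on this page mentions editing mainc.py to use python2; a sketch of that edit follows. It works on a throwaway file so that it is safe to run anywhere. The assumptions are that python2 is installed and on the PATH and that the script's first line is a shebang; use_python2 is a name made up here.

```bash
# Rewrite the first line (the shebang) of a script so that it requests python2.
# use_python2 is a hypothetical helper, not part of Deckard.
use_python2() {
  local tmp
  tmp=$(mktemp)
  { echo '#!/usr/bin/env python2'; tail -n +2 "$1"; } > "$tmp"
  cat "$tmp" > "$1"   # overwrite in place, keeping the file's permissions
  rm -f "$tmp"
}

# Demonstration on a temporary file standing in for mainc.py.
demo=$(mktemp)
printf '%s\n' '#!/usr/bin/python' 'import YaccParser,YaccLexer' > "$demo"
use_python2 "$demo"
head -n 1 "$demo"
```

The same change would have to be applied to each generator script that make invokes directly (mainc.py, mainj.py, mainphp.py and mainsol.py appear in the logs on this page).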
Hi.
I want to build Deckard but get the errors "Error: ptgen make failed. Exit." and "Error: ptgen make failed. Deckard build fails."
I have tried the solutions from other issues, such as installing the newest versions of the packages and editing the file /src/ptgen/gcc/mainc.py to use python2.
I also changed my OS to Ubuntu 12.
But I still get the errors below.
Can anyone help me? Thanks a lot!
syu@ubuntu:~/workspaces/Deckard/src/main$ sudo ./build.sh
rm -f *.pyc
make -C simple clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/simple'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc c_ptgen
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/simple'
make -C gcc clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
rm -f *.o lex.yy.cc pt_c.tab* pt_c.y head.cc gccptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
make -C java clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/java'
rm -f *.o lex.yy.cc pt_j.tab* pt_j.y head.cc javaptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/java'
make -C php5 clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
rm -f *.o lex.yy.cc pt_zend_language_parser.tab* pt_zend_language_parser.y head.cc phpptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
make -C sol clean
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
rm -f *.o lex.yy.cc pt_solidity.* head.cc solidityptgen.a
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
make -C gcc
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
./mainc.py c.y
bison -d pt_c.y -o pt_c.tab.cc
pt_c.y: conflicts: 11 shift/reduce
flex -olex.yy.cc c.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
g++ -O3 -I../../include -c -o pt_c.tab.o pt_c.tab.cc
pt_c.tab.cc: In function ‘int yyparse()’:
pt_c.tab.cc:13685:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_c.tab.cc:13827:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv gccptgen.a lex.yy.o pt_c.tab.o head.o
a - lex.yy.o
a - pt_c.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/gcc'
make -C java
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/java'
./mainj.py j.y
bison -d pt_j.y -o pt_j.tab.cc
pt_j.y: conflicts: 24 shift/reduce, 259 reduce/reduce
flex -olex.yy.cc j.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
g++ -O3 -I../../include -c -o pt_j.tab.o pt_j.tab.cc
pt_j.tab.cc: In function ‘int yyparse()’:
pt_j.tab.cc:17408:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_j.tab.cc:17550:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv javaptgen.a lex.yy.o pt_j.tab.o head.o
a - lex.yy.o
a - pt_j.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/java'
make -C php5
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
./mainphp.py zend_language_parser.y
sed -i -e "s/'\"'/'\\\\\"'/" head.cc
bison -d pt_zend_language_parser.y -o pt_zend_language_parser.tab.cc
flex -i -olex.yy.cc zend_language_scanner.l
g++ -O3 -I../../include -c -o lex.yy.o lex.yy.cc
zend_language_scanner.l: In function ‘int yylex(YYSTYPE*)’:
zend_language_scanner.l:906:67: warning: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Wformat]
zend_language_scanner.l:906:67: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat]
lex.yy.cc:4873:57: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘int yy_get_next_buffer()’:
lex.yy.cc:4894:61: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4962:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4975:3: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:4975:3: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5005:68: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yyunput(int, char*)’:
lex.yy.cc:5102:54: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_create_buffer(FILE*, int)’:
lex.yy.cc:5261:65: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5270:65: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yyensure_buffer_stack()’:
lex.yy.cc:5427:71: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5447:71: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_scan_buffer(char*, yy_size_t)’:
lex.yy.cc:5473:63: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘yy_buffer_state* yy_scan_bytes(const char*, int)’:
lex.yy.cc:5522:62: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc:5531:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yy_push_state(int)’:
lex.yy.cc:5557:68: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
lex.yy.cc: In function ‘void yy_pop_state()’:
lex.yy.cc:5568:53: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o pt_zend_language_parser.tab.o pt_zend_language_parser.tab.cc
pt_zend_language_parser.tab.cc: In function ‘int yyparse()’:
pt_zend_language_parser.tab.cc:11522:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
pt_zend_language_parser.tab.cc:11664:35: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
g++ -O3 -I../../include -c -o head.o head.cc
ar -csrv phpptgen.a lex.yy.o pt_zend_language_parser.tab.o head.o
a - lex.yy.o
a - pt_zend_language_parser.tab.o
a - head.o
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/php5'
make -C sol
make[1]: Entering directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
./mainsol.py solidity.y
bison -d pt_solidity.y -o pt_solidity.tab.cc -v -g
pt_solidity.y:255.1-11: invalid directive: `%precedence'
pt_solidity.y:254.8-10: %type redeclaration for UFIXED
pt_solidity.y:231.62-67: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for FIXED
pt_solidity.y:231.56-60: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BYTE
pt_solidity.y:231.51-54: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BYTES
pt_solidity.y:231.45-49: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for UINT
pt_solidity.y:231.40-43: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for INT
pt_solidity.y:231.36-38: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for VAR
pt_solidity.y:231.32-34: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for STRING
pt_solidity.y:231.25-30: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for BOOL
pt_solidity.y:231.20-23: previous declaration
pt_solidity.y:254.8-10: %type redeclaration for ADDRESS
pt_solidity.y:231.12-18: previous declaration
pt_solidity.y:270.1-11: invalid directive: `%precedence'
pt_solidity.y:269.8-10: %type redeclaration for DELETE
pt_solidity.y:233.39-44: previous declaration
pt_solidity.y:269.8-10: %type redeclaration for AFTER
pt_solidity.y:233.33-37: previous declaration
pt_solidity.y:273.1-11: invalid directive: `%precedence'
make[1]: *** [pt_solidity.tab.cc] Error 1
make[1]: Leaving directory `/home/syu/workspaces/Deckard/src/ptgen/sol'
make: *** [TARGET] Error 2
Error: ptgen make failed. Exit.
Error: ptgen make failed. Deckard build fails.
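The "invalid directive: `%precedence'" messages point at the installed Bison being too old for the Solidity grammar. As far as I know, %precedence needs Bison 3.0 or newer, while older distributions such as Ubuntu 12 ship a 2.x release; treat the exact version threshold as an assumption. A small sketch for checking the major version follows; bison_major is a hypothetical helper that parses the first line of bison --version.

```bash
# Extract the major version from a "bison --version" banner line,
# e.g. "bison (GNU Bison) 2.5" gives 2. bison_major is a made-up helper.
bison_major() {
  echo "$1" | sed -E 's/.* ([0-9]+)\.[0-9]+.*/\1/'
}

if command -v bison >/dev/null 2>&1; then
  banner=$(bison --version | head -n 1)
  echo "found: $banner (major version $(bison_major "$banner"))"
else
  echo "bison is not installed"
fi
```

If the reported major version is below 3, upgrading Bison (or skipping the Solidity parser, if the build allows it) would be the things to try.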
I noticed that the Deckard 2 config parameters for TYPE_FILE, RELEVANT_NODEFILE, LEAF_NODEFILE and PARENT_NODEFILE of the sample config point to the nonexistent directory Deckard/testdata.
I assume that they are pretty important, as the detection outputs a lot of garbage if they are not changed.
What is supposed to be in these files? I assume this is about the node types for the ASTs, but I can't figure out how to specify them.
I'm using Java and want to run Deckard on BigCloneEval. The clones should have method level granularity.
It is especially important that I can configure Deckard to prune irrelevant NODE types early, as I want to run a performance analysis and comparison, and it doesn't feel fair to run Deckard on a lot more ASTs than necessary.
These items need to be installed before 'build.sh' is executed (for example with sudo apt-get install bison flex):
Bison
Flex
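A quick check that both tools are on the PATH before running build.sh can save a failed build. The sketch below assumes nothing about Deckard itself; missing_tools is a hypothetical helper.

```bash
# Print the names of the given tools that are not found on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools bison flex)
if [ -n "$missing" ]; then
  echo "Missing before build.sh:" $missing
else
  echo "bison and flex are both available"
fi
```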
Hi,
I am trying to detect clones in a slice. How can I use Deckard to do this?
Thanks!
Hi,
I ran Deckard to detect clones on a dataset of 47k source files. However, after a day of execution I ran into an error. Below you can find the contents of the different log files.
Clustering 'vectors/vdb_50_4_g9_2.50998_30_100000' 6.513064 ...
/home/local/SAIL/amir/tasks/RQ2/RQ2.2/Deckard/src/lsh/bin/enumBuckets -R 6.513064 -M 7600000000 -b 2 -A -f vectors/vdb_50_4_g9_2.50998_30_100000 -c -p vectors/vdb_50_4_g9_2.50998_30_100000.param > clusters/cluster_vdb_50_4_g9_2.50998_30_100000
Warning: output all clones. Takes more time...
Warning: will compute parameters
Error: the structure supports at most 2097151 points (3238525 were specified).
real 2m58.162s
user 2m50.464s
sys 0m7.492s
cluster: Possible errors occurred with LSH. Check log: times/cluster_vdb_50_4_g9_2.50998_30_100000
paramsetting: 50 4 0.79 ...Looking for optimal parameters by Clustering 'vectors/vdb_50_4_g9_2.50998_30_100000' 6.513064 ...
/home/local/SAIL/amir/tasks/RQ2/RQ2.2/Deckard/src/lsh/bin/enumBuckets -R 6.513064 -M 7600000000 -b 2 -A -f vectors/vdb_50_4_g9_2.50998_30_100000 -c -p vectors/vdb_50_4_g9_2.50998_30_100000.param > clusters/cluster_vdb_50_4_g9_2.50998_30_100000
cluster: Possible errors occurred with LSH. Check log: times/cluster_vdb_50_4_g9_2.50998_30_100000
Error: paramsetting failure...exit.
grouping: vectors/vdb_50_4 with distance=2.50998...Total 7602630 vectors read in; 11282415 vectors dispatched into 57 ranges (actual groups may be many fewer).
real 410m12.610s
user 6m43.592s
sys 26m6.544s
Done grouping 50 4 2.50998. See groups in vectors/vdb_50_4_g[0-9]_2.50998_30
Note that I have sufficient memory for execution, so I added two more conditions to the memory limit setting in both the vecquery and vertical-param-batch files. I increased the memory limit because my vector file is larger than 2G and enough memory is available. The conditions now look like this:
# dumb (not flexible) memory limit setting
mem=`wc "$vdb" | awk '{printf("%.0f", $3/1024/1024+0.5)}'`
if [ $mem -lt 2 ]; then
mem=10000000
elif [ $mem -lt 5 ]; then
mem=20000000
elif [ $mem -lt 10 ]; then
mem=30000000
elif [ $mem -lt 20 ]; then
mem=60000000
elif [ $mem -lt 50 ]; then
mem=150000000
elif [ $mem -lt 100 ]; then
mem=300000000
elif [ $mem -lt 200 ]; then
mem=600000000
elif [ $mem -lt 500 ]; then
mem=900000000
elif [ $mem -lt 1024 ]; then
mem=1900000000
elif [ $mem -lt 2048 ]; then
mem=3800000000
elif [ $mem -lt 4096 ]; then # this condition is added by me
mem=7600000000
elif [ $mem -lt 8192 ]; then # this condition is added by me
mem=15200000000
else
echo "Error: Size of $vdb > 8G. I don't want to do it before you think of any optimization." | tee -a "$TIME_DIR/cluster_${vfile}"
exit 1;
fi
The parameters of Deckard are set to the following values:
I attached the log files. Please help me mitigate this problem; I need your tool for my experiments.
deckard log.zip
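The LSH error is about the number of points, not about memory: 2097151 is 2^21 - 1, and the failing vector group holds 3238525 points, so raising the -M value cannot help. Why the structure is capped at that value is not visible in the logs; that it is a fixed limit in the LSH code is an assumption. The arithmetic, as a sketch:

```bash
limit=$(( (1 << 21) - 1 ))   # 2097151, the limit printed in the log
points=3238525               # points reported for the failing group
excess=$(( points - limit ))

echo "limit=$limit points=$points excess=$excess"
```

Under that assumption the group would have to shrink by at least the printed excess, for example by running on a subset of the files.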
After I put my directory into the config in the sample directory, I can run the clone detection but I get the following output:
= Vector clustering w/ MIN_TOKENS=30, STRIDE=2, SIMILARITY=0.95 ...
grouping: vectors/vdb_30_2 with distance=5,477226...Done grouping 30 2 5,477226. See groups in vectors/vdb_30_2_g[0-9]_5,477226_30
paramsetting: 30 2 0.95 ...Error: paramsetting failure: no vector group found: 30 2 0.95
Error: problem in vec clustering step. Stop and check logs in times/
So I'm not sure I can trust what is output in clusters/post_cluster...
What is wrong?
Thanks,
Stefan
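The comma in distance=5,477226 suggests that the run used a locale with a comma as decimal separator, so the generated group file names may not match what the next step looks for. Assuming that is the cause, forcing the C locale for the run should produce a dot instead. The sketch shows the effect on a formatted number; that Deckard's scripts format the distance with awk is an assumption, and sqrt(30) is used only because it reproduces the digits in the log.

```bash
# Under the C locale, printf-style formatting uses a dot as decimal separator.
distance=$(LC_ALL=C awk 'BEGIN { printf "%f", sqrt(30) }')
echo "distance=$distance"

# Suggested way to run the detection under the same setting
# (the Deckard root is a placeholder):
#   LC_ALL=C /path/to/Deckard/scripts/clonedetect/deckard.sh
```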
The error starts at the line
"make[1]: execvp: ./mainc.py: Permission denied"
and then ends at
"make: *** No rule to make target `../ptgen/gcc/gccptgen.a', needed by `cvecgen'. Stop."
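"Permission denied" from execvp means that make tried to run ./mainc.py directly but the file lacks the execute bit, which can happen after unpacking or copying the sources. The later "No rule to make target" message is then a follow-up failure, since gccptgen.a was never built. Below is a sketch of the fix on a throwaway file standing in for mainc.py; applying the same chmod to the generator scripts under src/ptgen is an assumption about where they live, based on the logs on this page.

```bash
# Create a file without the execute bit, then add it back.
demo=$(mktemp)
chmod a-x "$demo"
[ -x "$demo" ] && before=yes || before=no

chmod +x "$demo"
[ -x "$demo" ] && after=yes || after=no

echo "executable before: $before, after: $after"
```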
.....deleting intermediate vector files....Done
I have run Deckard on the code of about 30 Java projects. The resulting cluster_vdb_50_0_allg_0.95_30 is not empty, but the corresponding post_cluster_vdb_50_0_allg_0.95_30 file is empty. Why does this happen? Is it because there are too many suspicious clones in the cluster file, so that post-processing excludes all of them and leaves the post_cluster file empty?
Hi, I'm getting:
a - token-counter.o
a - sq-tree.o
a - node-vec-gen.o
a - vector-output.o
a - vector-merger.o
a - tree-accessor.o
a - token-tree-map.o
a - clone-context-php.o
rm -f vectorsort dispatchvectors computeranges *~ *.o
gcc -O3 -O3 vectorsort.c -lm -o vectorsort
gcc -O3 -O3 dispatchvectors.c -lm -o dispatchvectors
gcc -O3 -O3 computeranges.c -lm -o computeranges
rm -f *.o cvecgen jvecgen cbugfilters jbugfilters out2html phpvecgen phpbugfilters out2xml cParseTreeMain jParseTreeMain phpParseTreeMain
g++ -o ptreeC.o -O3 -I../include -I../vgen/treeTra -c -DCLANG ptree.cc
make: *** No rule to make target '../ptgen/gcc/gccptgen.a', needed by 'cvecgen'. Stop.
Error: main make failed. Exit.
./build.sh 7.49s user 0.35s system 85% cpu 9.207 total
by just executing the build.sh in src/main
When I use the command: "cvecgen -i ../../src/dircolors.c -o tmp.vec --start-line-number 508 --end-line-number 508"
The output is "cvecgen: tree-accessor.C:81: static TreeVector* TreeAccessor::get_node_vector(Tree*): Assertion `attr_itr!=t->attributes.end()' failed."
Please refer to the attachment for the file dircolors.c.
I have a problem when running clone detection on my C code. The output is:
"Error: problem in vec generator step. Stop and check logs in times/"
Could you tell me what the problem might be? Thanks a lot.
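The message itself points to the logs in times/. Here is a small sketch for showing the tail of each vector-generation log. The vgen_* file names follow the log names printed in an earlier report (for example times/vgen_30_2), and show_vgen_logs is a helper made up here.

```bash
# Print the last few lines of every vgen log in a directory (default: times).
# show_vgen_logs is a hypothetical helper, not part of Deckard.
show_vgen_logs() {
  local dir="${1:-times}" log
  for log in "$dir"/vgen_*; do
    [ -e "$log" ] || continue   # no logs: the glob stays unexpanded
    echo "== $log"
    tail -n 5 "$log"
  done
}

show_vgen_logs times
```

Run it from the directory that contains the config file, since that is where the times/ directory is created in the reports above.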