A Hybrid Binary Static Vulnerability Detection tool. Decompilation is done with retdec.
File tree:
- cfgdetect : the GCN model that detects vulnerabilities
  - main.ipynb : trains and tests the GCN model
  - gendata.py : generates the input data for the model
  - main.py : unused
- w2v : uses gensim to build the word2vec (w2v) model, which represents each token as a 256-dimensional vector
  - model_out : stores the w2v model, which is loaded in gendata.py
  - token_out : stores token.txt, which is used to build the w2v model
  - dot2token.py : loads and tokenizes the .dot files, writing the output to token_out
  - word2vec.py : loads token.txt and builds the w2v model
  - scrap.py : there are more good .dot files than bad ones, so scrap.py is used to even out the two counts
- decompile : uses "retdec" to decompile .c or .cpp files
  - dec_cwe.py : decompiles a .c or .cpp file; outputs [ .dot file | .dc file | .dsm file ] into the "CWExx_good" or "CWExx_bad" folder
  - scrap.py : removes the useless files generated by "retdec"
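
The class-balancing step done by the scrap.py under w2v can be illustrated with a minimal sketch. This is not the actual scrap.py: the function name `balance_file_lists`, the fixed seed, and the choice of random downsampling are all assumptions made for illustration. It simply keeps as many files from the larger class as there are in the smaller one.

```python
import random
from typing import List, Tuple


def balance_file_lists(good: List[str], bad: List[str],
                       seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly downsample the larger list so both lists have equal length.

    Hypothetical helper: the real scrap.py may balance the classes differently.
    """
    rng = random.Random(seed)            # fixed seed keeps the selection reproducible
    target = min(len(good), len(bad))    # size of the smaller class
    kept_good = sorted(rng.sample(good, target))
    kept_bad = sorted(rng.sample(bad, target))
    return kept_good, kept_bad


# Example with made-up file names: 10 good samples, 4 bad samples.
good_files = [f"good_{i}.dot" for i in range(10)]
bad_files = [f"bad_{i}.dot" for i in range(4)]
kept_good, kept_bad = balance_file_lists(good_files, bad_files)
print(len(kept_good), len(kept_bad))  # prints: 4 4
```

Downsampling discards data from the larger class; it is used here only because it is the simplest way to make the two counts match.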