python-security / pyt

A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

License: GNU General Public License v2.0

Python 99.99% Makefile 0.01%
pyt control-flow-graph static-analysis python python3 security static-code-analysis program-analysis fixed-point fixed-point-analysis

pyt's People

Contributors

adrianbn, alecxe, ankitxjoshi, bcaller, bchurchill, chrisgavin, davidoc, jwilk, kevinhock, kvnloo, omergunal, stannum-l, stefanmich, thalmann, the-compiler, tjdev7, vinaygw, wchresta

pyt's Issues

Add CodeClimate

E.g. If we look at https://github.com/trailofbits/protofuzz we can see the test coverage at the top and a link to code climate.

To fix this we more or less copy the codeclimate.yml and relevant parts of the top of the README.

(So the [easy] issues are good for new people who want to start contributing to look at.)

SyntaxError: Non-ASCII character

Due to https://github.com/python-security/pyt/blob/master/pyt/cfg.py#L21 we get

Kevins-MacBook-Pro-2:example kevinhock$ python cfg_example.py 
Traceback (most recent call last):
  File "cfg_example.py", line 5, in <module>
    from cfg import CFG, print_CFG, generate_ast
  File "/Users/kevinhock/kpyt/pyt/pyt/cfg.py", line 21
SyntaxError: Non-ASCII character '\xc2' in file /Users/kevinhock/kpyt/pyt/pyt/cfg.py on line 21, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

which I think is fixed by adding

# -*- coding: utf-8 -*-

to the top of cfg.py

But, I think it'd be better to just get a new CALL_IDENTIFIER, thoughts?

(Same holds for https://github.com/python-security/pyt/blob/master/pyt/base_cfg.py#L16)
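For context on the '\xc2' in the traceback: assuming CALL_IDENTIFIER is the '¤' character that shows up elsewhere in pyt's output, its UTF-8 encoding starts with the byte 0xc2, which Python 2 rejects when no encoding is declared. A minimal Python 3 sketch of that, plus a purely hypothetical ASCII replacement (the '~' below is only an illustration, not something the project has chosen):

```python
# Assumption: CALL_IDENTIFIER is the '¤' (U+00A4) character.
CALL_IDENTIFIER = '\u00a4'

encoded = CALL_IDENTIFIER.encode('utf-8')
print(encoded)  # the first byte is 0xc2, matching the SyntaxError message


def is_ascii(text):
    """Return True if every character in text is plain ASCII."""
    return all(ord(character) < 128 for character in text)


# Hypothetical ASCII-only identifier. Any character that cannot start a
# Python name would avoid clashing with real variable names.
ASCII_CALL_IDENTIFIER = '~'

print(is_ascii(CALL_IDENTIFIER))        # False
print(is_ascii(ASCII_CALL_IDENTIFIER))  # True
```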

Trim the "Reassigned in:" nodes to the ones that are relevant

So if we have the following code:

@app.route('/menu', methods=['POST'])
def menu():
    param = request.form['suggestion']
    command = 'echo ' + param + ' >> ' + 'menu.txt'
    hey = 'echo ' + param + ' >> ' + 'menu.txt'
    yo = 'echo ' + hey + ' >> ' + 'menu.txt'

    subprocess.call(command, shell=True)

    with open('menu.txt','r') as f:
        menu = f.read()

    return render_template('command_injection.html', menu=menu)

We show the vulnerability output as:

1 vulnerability found:
Vulnerability 1:
File: example/vulnerable_code/command_injection.py
 > User input at line 15, trigger word "form[": 
	param = request.form['suggestion']
Reassigned in: 
	File: example/vulnerable_code/command_injection.py
	 > Line 16: command = 'echo ' + param + ' >> ' + 'menu.txt'
	File: example/vulnerable_code/command_injection.py
	 > Line 17: hey = 'echo ' + param + ' >> ' + 'menu.txt'
	File: example/vulnerable_code/command_injection.py
	 > Line 18: yo = 'echo ' + hey + ' >> ' + 'menu.txt'
File: example/vulnerable_code/command_injection.py
 > reaches line 20, trigger word "subprocess.call(": 
	subprocess.call(command,shell=True)

Where we don't really care about Line 17 and 18 in the output, right?

I ran into this while doing #45, once I fix this then I can make the PR fixing both of them.
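A rough sketch of the trimming idea, not pyt's actual implementation: walk the reassignments backwards from the sink and keep only those whose left-hand side can still flow into it. The tuple layout and the function name are assumptions made for this example, and it only handles straight-line code.

```python
def trim_reassignments(assignments, sink_variables):
    """Keep only the assignments that can flow into the sink.

    assignments: list of (lhs, rhs_variables, label) in program order.
    sink_variables: names used as arguments of the sink call.
    """
    relevant = set(sink_variables)
    kept = []
    for lhs, rhs_variables, label in reversed(assignments):
        if lhs in relevant:
            kept.append(label)
            # The assignment defines lhs, so earlier values of lhs no
            # longer matter, but the variables it reads now do.
            relevant.discard(lhs)
            relevant.update(rhs_variables)
    kept.reverse()
    return kept


assignments = [
    ('command', ['param'], 'Line 16: command = ...'),
    ('hey', ['param'], 'Line 17: hey = ...'),
    ('yo', ['hey'], 'Line 18: yo = ...'),
]

# The sink is subprocess.call(command, shell=True), so only Line 16 stays.
print(trim_reassignments(assignments, ['command']))
```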

Search the code base for each repo

Right now we run one overall string search through GitHub search.
Instead, we could search more specifically within each repo.
This could give better results and maybe fewer requests per minute.
But of course it could also mean more requests per minute, as it would take two requests per repo.

Support __init__ files

Just wanted to let people know I'm working on this now 👍 (so nobody shows up w/ a PR adding support for them in a few days.)

Uses of self.nodes[-1] cause false positives

I didn't realize this until I investigated the failure of the test_orelse test on my ugly branch, oh well.

Here are the relevant test case and output.

def does_this_kill_us(diff):
	return subprocess.call(diff, shell=True)

@app.route('/poc', methods=['POST'])
def poc():
	try:
	    value = None
	except ImportError:
	    value = request.args.get('foo')
	else:
	    does_this_kill_us(value)

Output with the -trim flag on:

Vulnerability 1:
File: example/example_inputs/try_orelse.py
 > User input at line 9, trigger word "get(": 
	value = request.args.get('foo')
Reassigned in: 
	File: example/example_inputs/try_orelse.py
	 > Line 11: temp_1_diff = value
	File: example/example_inputs/try_orelse.py
	 > Line 1: diff = temp_1_diff
File: example/example_inputs/try_orelse.py
 > reaches line 2, trigger word "subprocess.call(": 
	ret_does_this_kill_us = subprocess.call(diff,shell=True)

Where does it connect the basic block of the source to does_this_kill_us?
That would be in save_local_scope where self.nodes[-1] here is the last node that was added, which is Label: value = request.args.get('foo')

            previous_node = self.nodes[-1]
            r = RestoreNode(save_name + ' = ' + assignment.left_hand_side,
                            save_name, [assignment.left_hand_side],
                            line_number=line_number, path=self.filenames[-1])
            saved_scope_node = self.append_node(r)

            saved_variables.append(SavedVariable(LHS=save_name,
                                                 RHS=assignment.left_hand_side))
            previous_node.connect(saved_scope_node)

Okay, how should we fix it?
Well, we currently only use the [-1] node when we're adding a function (in master this means a user-defined function; in my branch it means a user-defined function or a blackbox/builtin function). So I guess in, say, handle_or_else in base_cfg.py we either (a) pass in the node we want to connect to, e.g. value = None, or (b) connect the 2 nodes in e.g. handle_or_else. These are both a lot of work though, because we have to do something special everywhere a function can be called.
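To make option (a) concrete, here is a toy model of the try/except/else case. The Node class and build_try_orelse are stand-ins invented for this sketch; they are not pyt's classes, and the real CFG has more nodes than these three.

```python
class Node:
    """Toy CFG node, a stand-in for pyt's node classes."""

    def __init__(self, label):
        self.label = label
        self.outgoing = []

    def connect(self, successor):
        self.outgoing.append(successor)


def build_try_orelse(use_last_appended):
    """Build the three interesting nodes of the try_orelse test case."""
    nodes = []

    def append(label):
        node = Node(label)
        nodes.append(node)
        return node

    try_body = append("value = None")
    handler = append("value = request.args.get('foo')")
    # The else branch only runs after the try body succeeded, so the call
    # should hang off try_body. nodes[-1] is the except handler instead.
    predecessor = nodes[-1] if use_last_appended else try_body
    call = append("does_this_kill_us(value)")
    predecessor.connect(call)
    return try_body, handler, call


try_body, handler, call = build_try_orelse(use_last_appended=True)
print([n.label for n in handler.outgoing])  # the spurious edge from the source

try_body, handler, call = build_try_orelse(use_last_appended=False)
print([n.label for n in handler.outgoing])  # no edge from the source any more
```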

Baseline support

Once #100 is merged this will be do-able.

A baseline is for when you want to diff a current run against a previous run (probably one containing known issues or false positives). As a big part of continuous integration, baseline support is super important.

See https://github.com/openstack/bandit as a tool that implements this.

    parser.add_argument('-b', '--baseline',
                        help='path of a baseline report to compare against '
                             '(only JSON-formatted files are accepted)',
                        type=str,
                        default=None)

There is also the newly open sourced detect-secrets repo from the Yelp security team that implements this.
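A minimal sketch of what the comparison could look like, assuming a JSON report that is a list of objects. The field names file, source_line and sink_line are made up for this example; pyt's real report format may differ.

```python
import json


def vulnerability_key(vulnerability):
    """Identify a vulnerability by where it starts and where it ends."""
    return (vulnerability['file'],
            vulnerability['source_line'],
            vulnerability['sink_line'])


def new_vulnerabilities(current_report, baseline_report):
    """Return the entries of current_report that are not in the baseline."""
    known = {vulnerability_key(v) for v in baseline_report}
    return [v for v in current_report if vulnerability_key(v) not in known]


baseline = json.loads(
    '[{"file": "sqli.py", "source_line": 26, "sink_line": 27}]'
)
current = json.loads(
    '[{"file": "sqli.py", "source_line": 26, "sink_line": 27},'
    ' {"file": "sqli.py", "source_line": 33, "sink_line": 36}]'
)

# Only the finding that is missing from the baseline is reported.
print(new_vulnerabilities(current, baseline))
```

Keying on line numbers is fragile when code moves around, so a real implementation would probably want a more stable key.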

Can't clone repo on MacOS

Hi!

For some weird reason when cloning the repo on a mac (tested with 10.11 and 10.13) the file pyt/trigger_definitions/flask_trigger_words.pyt won't be written.

here's an example:

} /tmp$ git clone https://github.com/python-security/pyt.git
Cloning into 'pyt'...
remote: Counting objects: 5740, done.
remote: Total 5740 (delta 0), reused 0 (delta 0), pack-reused 5740
Receiving objects: 100% (5740/5740), 2.62 MiB | 3.75 MiB/s, done.
Resolving deltas: 100% (3916/3916), done.
Checking connectivity... done.
} /tmp$ cd pyt/
} /tmp/pyt$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	deleted:    pyt/trigger_definitions/flask_trigger_words.pyt

no changes added to commit (use "git add" and/or "git commit -a")

Even copy-pasting the content into a file results in the file not existing. Tried with the default terminal, iTerm2 and IntelliJ's terminal, all the same, so it mustn't be the terminal.

After some trial and error we suspect that the faulty line is subprocess.call( but a hexdump of the file (on a xenial box) doesn't show much...

root@web1:~/pyt/pyt/trigger_definitions# cat flask_trigger_words.pyt | hexdump -C
00000000  73 6f 75 72 63 65 73 3a  0a 67 65 74 28 0a 2e 64  |sources:.get(..d|
00000010  61 74 61 0a 66 6f 72 6d  5b 0a 66 6f 72 6d 28 0a  |ata.form[.form(.|
00000020  4d 61 72 6b 75 70 28 0a  63 6f 6f 6b 69 65 73 5b  |Markup(.cookies[|
00000030  0a 66 69 6c 65 73 5b 0a  53 51 4c 41 6c 63 68 65  |.files[.SQLAlche|
00000040  6d 79 0a 0a 73 69 6e 6b  73 3a 0a 72 65 70 6c 61  |my..sinks:.repla|
00000050  63 65 28 20 2d 3e 20 65  73 63 61 70 65 0a 73 65  |ce( -> escape.se|
00000060  6e 64 5f 66 69 6c 65 28  20 2d 3e 20 27 2e 2e 27  |nd_file( -> '..'|
00000070  2c 20 27 2e 2e 27 20 69  6e 0a 65 78 65 63 75 74  |, '..' in.execut|
00000080  65 28 0a 73 79 73 74 65  6d 28 0a 66 69 6c 74 65  |e(.system(.filte|
00000090  72 28 0a 73 75 62 70 72  6f 63 65 73 73 2e 63 61  |r(.subprocess.ca|
000000a0  6c 6c 28 0a 72 65 6e 64  65 72 5f 74 65 6d 70 6c  |ll(.render_templ|
000000b0  61 74 65 28 0a 73 65 74  5f 63 6f 6f 6b 69 65 28  |ate(.set_cookie(|
000000c0  0a 72 65 64 69 72 65 63  74 28 0a 75 72 6c 5f 66  |.redirect(.url_f|
000000d0  6f 72 28 0a 66 6c 61 73  68 28 0a 6a 73 6f 6e 69  |or(.flash(.jsoni|
000000e0  66 79 28                                          |fy(|
000000e3

As a result the tool can't run on Mac since this file is not available; it fails with

Traceback (most recent call last):
  File ".../bin/pyt", line 11, in <module>
    load_entry_point('pyt==1.0.0a20', 'console_scripts', 'pyt')()
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/__main__.py", line 247, in main
    args.trim_reassigned_in)
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/vulnerabilities.py", line 394, in find_vulnerabilities
    definitions = parse(trigger_word_file)
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions_parser.py", line 48, in parse
    with open(trigger_word_file, 'r') as fd:
FileNotFoundError: [Errno 2] No such file or directory: '.../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions/flask_trigger_words.pyt'

Does that ring a bell?

(Not an issue right now) Handle multiple returns

I'll try to work on this relatively soon, but to think out loud..

In interprocedural_cfg.py, we have

def return_handler(self, node, function_nodes):
    """Handle the return from a function during a function call."""
    call_node = None
    for n in function_nodes:
        if isinstance(n, ConnectToExitNode):
            LHS = CALL_IDENTIFIER + 'call_' + str(self.function_index)
            previous_node = self.nodes[-1]
            if not call_node:
                RHS = 'ret_' + get_call_names_as_string(node.func)
                r = RestoreNode(LHS + ' = ' + RHS, LHS, [RHS],
                                line_number=node.lineno,
                                path=self.filenames[-1])
                call_node = self.append_node(r)
                previous_node.connect(call_node)
        else:
            # make the proper connection
            pass

which, cleaned up, is

def return_handler(self, call_node, function_nodes):
    """Handle the return from a function during a function call.

    Args:
        call_node(ast.Call) : The node that calls the definition.
        function_nodes(list[Node]): List of nodes of the function being called.
    """
    for node in function_nodes:
        # Only Return's and Raise's can be of type ConnectToExitNode
        if isinstance(node, ConnectToExitNode):                
            # Create e.g. ¤call_1 = ret_func_foo RestoreNode
            LHS = CALL_IDENTIFIER + 'call_' + str(self.function_call_index)
            RHS = 'ret_' + get_call_names_as_string(call_node.func)
            return_node = RestoreNode(LHS + ' = ' + RHS,
                                      LHS,
                                      [RHS],
                                      line_number=call_node.lineno,
                                      path=self.filenames[-1])
            self.nodes[-1].connect(return_node)
            self.nodes.append(return_node)
            return 

Firstly, the for loop and the if statement seem to just serve the purpose of "Is there a node of type Return or Raise in the function?" But I think all functions should have at least one return node, right? I'm not sure if I understand the original intention that well e.g. what was going to be in the else?
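On the "at least one return node" question: at the AST level a function with no explicit return or raise has neither node, so the check is not vacuous unless the CFG adds an implicit return somewhere (not verified here). A small sketch using only the standard ast module; the helper name is made up for this example:

```python
import ast


def count_exit_statements(source):
    """Count Return and Raise statements in the first function of source."""
    function = ast.parse(source).body[0]
    return sum(isinstance(node, (ast.Return, ast.Raise))
               for node in ast.walk(function))


NO_RETURN = (
    "def log(message):\n"
    "    print(message)\n"
)

TWO_RETURNS = (
    "def pick(flag, a, b):\n"
    "    if flag:\n"
    "        return a\n"
    "    return b\n"
)

print(count_exit_statements(NO_RETURN))    # 0
print(count_exit_statements(TWO_RETURNS))  # 2
```

The second function is also the kind of input where returning after the first ConnectToExitNode would only account for one of the two exits.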

Secondly, here is an example to illustrate the problem/need to handle multiple returns:

TODO

Framework adaptor only connects first tainted arg to following nodes

In get_func_cfg_with_tainted_args the following taints the args of a framework function, and then links the first arg to the following nodes:

# Taint all the arguments
for arg in args:
    tainted_node = TaintedNode(arg, arg,
                               None, [],
                               line_number=definition_lineno,
                               path=definition.path)
    function_entry_node.connect(tainted_node)
    # 1 and not 0 so that Entry Node remains first in the list
    func_cfg.nodes.insert(1, tainted_node)
first_arg = func_cfg.nodes[len(args)]
first_arg.connect(first_node_after_args)

For a framework function where multiple args are user-controlled, this could miss issues related to second or subsequent args. For example, in Django, URL path elements may be passed to a View as args.

For example /xss1/<param>/ could route to:

def xss1(request, param):
    return render(request, 'templates/xss.html', {'param': param})

The suggested fix is to connect each tainted node to the following node in the for loop:

        ...
        func_cfg.nodes.insert(1, tainted_node)
        tainted_node.connect(first_node_after_args)
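The difference between the two variants can be checked on a toy model. Node and taint_args below are stand-ins written for this sketch, not pyt's TaintedNode or get_func_cfg_with_tainted_args:

```python
class Node:
    """Toy CFG node, a stand-in for pyt's node classes."""

    def __init__(self, label):
        self.label = label
        self.outgoing = []

    def connect(self, successor):
        self.outgoing.append(successor)


def taint_args(args, connect_each):
    """Return {label: successor labels} for a toy framework function CFG."""
    entry = Node('Entry')
    first_node_after_args = Node('body')
    nodes = [entry, first_node_after_args]
    for arg in args:
        tainted_node = Node(arg)
        entry.connect(tainted_node)
        # 1 and not 0 so that the entry node remains first in the list
        nodes.insert(1, tainted_node)
        if connect_each:
            # Suggested fix: every tainted arg flows into the body.
            tainted_node.connect(first_node_after_args)
    if not connect_each:
        # Current behaviour: only the first arg is linked to the body.
        nodes[len(args)].connect(first_node_after_args)
    return {node.label: [s.label for s in node.outgoing] for node in nodes}


print(taint_args(['request', 'param'], connect_each=False)['param'])  # []
print(taint_args(['request', 'param'], connect_each=True)['param'])   # ['body']
```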

No module named 'graphviz' error

When attempting to run the intro example on XSS_reassign, I'm getting

ModuleNotFoundError: No module named 'graphviz'

Could someone help me out? I'm really excited to use pyt and ready to get started!
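The error means the graphviz Python package is not importable in the environment pyt runs in; it is normally installed from PyPI under the name graphviz. A small sketch for checking which modules are missing before running the example (the helper name is made up, and whether pyt declares graphviz as a dependency is not confirmed here):

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]


missing = missing_modules(['graphviz'])
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All required modules are importable.')
```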

Update Readme

use .rst format so it also can be used for pypi packaging

FileNotFoundError: [Errno 2] No such file or directory: '.../trigger_definitions/flask_trigger_words.pyt'

I installed pyt using:

$ python3 setup.py install --user

But apparently that didn't put the *.pyt files in the right place.
All I get is:

$ pyt -f test.py
Traceback (most recent call last):
  File "/home/jwilk/.local/bin/pyt", line 11, in <module>
    load_entry_point('pyt==1.0.0a20', 'console_scripts', 'pyt')()
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/__main__.py", line 219, in main
    vulnerability_log = find_vulnerabilities(cfg_list, analysis)
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/vulnerabilities.py", line 291, in find_vulnerabilities
    definitions = parse(trigger_word_file)
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions_parser.py", line 48, in parse
    with open(trigger_word_file, 'r') as fd:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions/flask_trigger_words.pyt'

Tested with git master (96f6205).
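One likely cause is that the *.pyt files are not declared as package data, so setuptools leaves them out of the egg. A hedged sketch of the declaration, assuming the package is named pyt and the files live in pyt/trigger_definitions/ as the traceback suggests. The fnmatch check only approximates how setuptools expands the pattern.

```python
import fnmatch

# Assumption: package name and directory layout are taken from the
# paths in the traceback above.
PACKAGE_DATA = {'pyt': ['trigger_definitions/*.pyt']}

# In setup.py this mapping would be passed to setuptools, for example:
#     setup(name='pyt', packages=find_packages(), package_data=PACKAGE_DATA)


def is_packaged(relative_path):
    """Check whether a path inside the pyt package matches a data pattern."""
    return any(fnmatch.fnmatch(relative_path, pattern)
               for pattern in PACKAGE_DATA['pyt'])


print(is_packaged('trigger_definitions/flask_trigger_words.pyt'))  # True
```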

Add a Python 2 flag

Skip over the 2to3 shell out if it is set.
You can add a test file with this flag set as well, comment out any test if it fails, and also make the flag value accessible in the cfg files.

(So the [easy] issues are good for new people who want to start contributing to look at.)

sqli.py test fails on non-3.6 versions of python

So I tried to add a tox.ini with all the versions of py3 + pypy3. From the results file I realized that with 3.6 you get:

2 vulnerabilities found:
Vulnerability 1:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 26, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 27, trigger word "execute(": 
	result = db.engine.execute(param)

Vulnerability 2:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 33, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 36, trigger word "filter(": 
	result = session.query(User).filter('username={}'.format(param))

whereas below 3.6 you get

2 vulnerabilities found:
Vulnerability 1:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 33, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 36, trigger word "filter(": 
	result = session.query(User).filter('username={}'.format(param))

Vulnerability 2:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 26, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 27, trigger word "execute(": 
	result = db.engine.execute(param)

Notice the difference? It's just an ordering problem.

This would explain why I had to change the results file to get Travis CI to pass in #23
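One way to make the results file independent of the interpreter version is to sort the vulnerabilities before printing them. This assumes the difference comes from iterating over an unordered container somewhere (not verified; CPython only started preserving dict insertion order in 3.6). The dictionary fields below are invented for the sketch.

```python
def sort_vulnerabilities(vulnerabilities):
    """Order vulnerabilities by file, then source line, then sink line."""
    return sorted(
        vulnerabilities,
        key=lambda v: (v['file'], v['source_line'], v['sink_line']),
    )


execute_vuln = {'file': 'sqli.py', 'source_line': 26, 'sink_line': 27}
filter_vuln = {'file': 'sqli.py', 'source_line': 33, 'sink_line': 36}

order_on_36 = [execute_vuln, filter_vuln]
order_below_36 = [filter_vuln, execute_vuln]

# Both input orders end up identical after sorting.
print(sort_vulnerabilities(order_on_36) == sort_vulnerabilities(order_below_36))
```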

scan_github

One big ugly method with a lot of exception handling

False-negative when function call follows ControlFlowNode

Just moved to SF! Figured I would talk to myself in a GitHub issue to reacquaint myself with what I was doing.

This issue was the only test to fail on #63. Similar to the [-1] issue, it only happens with function calls, because that is the only place in the codebase where we "artificially" create nodes.

The reason why the test passed in master and not on the #63 PR was because in master only user-defined functions were turned into more than 1 node, and in the #63 PR blackbox calls are supported as well. (The test only has blackbox calls.)
(
Before it was something like

  param = request.args.get('param', 'not set')

Now it is something like

  ¤call_1 = ret_request.args.get('param', 'not set')
  param = ¤call_1

)

So if master had tests like below but with say outer() being user-defined and replacing scrypt.outer below, the same problem occurs.

import scrypt

image_name = request.args.get('image_name')
for x in range(0, 10):
  print(x)
# print('if this print statement is here everything works')
foo = scrypt.outer(image_name) # Any call after ControlFlowNode causes the problem
send_file(foo)

import scrypt

image_name = request.args.get('image_name')
if not image_name:
    image_name = 'foo'
# print('if this print statement is here everything works')
foo = scrypt.outer(image_name) # Any call after ControlFlowNode causes the problem
send_file(foo)

I gave 2 examples here because it is not specific to if or for, but any ControlFlowNode followed by a function call.

Where and why does this occur though?

In stmt_star_handler in base_cfg.py we have a call to connect_nodes self.connect_nodes(cfg_statements).
The purpose is to connect one node to the next node. (n->n+1 on and on.)

The problem is that foo = ¤call_2 is connected to both the if statement and last statement of its body, instead of the first node of the call.
In the first example, this is ¤call_1 = ret_request.args.get('param', 'not set') rather than param = ¤call_1.

You might say, what's so bad? You already made a new node type BBnode, just set a first_statement attribute as e.g. ¤call_1 = ret_request.args.get('param', 'not set') and you'd kind of be right.
Except nested function calls make that a pain, because I'll need to follow a chain of "What's your first node?" "What's your first node?".
e.g. if it was ¤call_2 = ret_request.args.get(¤call_1, 'not set') we would need first_statement to be the first_statement of ¤call_1.
I think this is the way forward, but I don't feel good about it.
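The "What's your first node?" chain could look roughly like this. CallNode and its inner_call attribute are hypothetical and only model a call whose argument is itself a call; the real BBnode would need to handle several nested arguments.

```python
class CallNode:
    """Toy stand-in for a call node that may wrap an inner call."""

    def __init__(self, label, inner_call=None):
        self.label = label
        # The call evaluated first as one of this call's arguments, if any.
        self.inner_call = inner_call


def first_statement(node):
    """Follow the chain of nested calls down to the innermost one."""
    while node.inner_call is not None:
        node = node.inner_call
    return node


call_1 = CallNode("¤call_1 = ret_inner()")
call_2 = CallNode("¤call_2 = ret_request.args.get(¤call_1, 'not set')",
                  inner_call=call_1)

# The node that should receive the incoming edges is the innermost call.
print(first_statement(call_2).label)
```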

2 Duplication problems and a false-positive in a portion of django.nV output, among other things.

So I run python -m pyt -a E -f example/django.nV/taskManager/upload_controller.py -trim and out I get:

5 vulnerabilities found:
Vulnerability 1:
File: example/django.nV/taskManager/misc.py
 > User input at line 24, trigger word "Flask function URL parameter": 
	title
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_2 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 2:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 11, trigger word "get(": 
	¤call_3 = ret_request.POST.get('name', False)
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 11: name = ¤call_3
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_6 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 3:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 3, trigger word "Flask function URL parameter": 
	request
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_uploaded_file = request.FILES['file']
	File: example/django.nV/taskManager/misc.py
	 > Line 24: uploaded_file = temp_4_uploaded_file
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_6 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 4:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 11, trigger word "get(": 
	¤call_3 = ret_request.POST.get('name', False)
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
	File: example/django.nV/taskManager/misc.py
	 > Line 41: ret_store_uploaded_file = '/static/taskManager/uploads/%s' % title
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: ¤call_4 = ret_store_uploaded_file
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: upload_path = ¤call_4
File: example/django.nV/taskManager/upload_controller.py
 > reaches line 16, trigger word "execute(": 
	¤call_8 = ret_curs.execute('insert into taskManager_file ('name','path','project_id') values ('%s','%s',%s)' % (name, upload_path, project_id))

Vulnerability 5:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 3, trigger word "Flask function URL parameter": 
	request
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
	File: example/django.nV/taskManager/misc.py
	 > Line 41: ret_store_uploaded_file = '/static/taskManager/uploads/%s' % title
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: ¤call_4 = ret_store_uploaded_file
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: upload_path = ¤call_4
File: example/django.nV/taskManager/upload_controller.py
 > reaches line 16, trigger word "execute(": 
	¤call_8 = ret_curs.execute('insert into taskManager_file ('name','path','project_id') values ('%s','%s',%s)' % (name, upload_path, project_id))

There are many issues with this output.

(a)
Vulnerability #1 should not be in the output, or at least, if you would argue it should be, you'd concede it's a good idea to give an option for vulnerabilities like it to not be in the output. When I say 'vulnerabilities like it' I mean, we ran it on a controller file, upload_controller.py which calls into misc.py, then we reported vulnerabilities as though we ran it on misc.py, resulting in a duplicate (vulnerabilities 1 and 2).

To solve this, maybe we should do something with self.filenames[-1] inside of interprocedural.py or just, at a higher level, grab the file from the -f output and skip any vulnerabilities that don't match it (note the File: example/django.nV/taskManager/misc.py in the output). The latter idea sounds cleaner and smoother.
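A sketch of the second idea, filtering on the file given with -f. It assumes each vulnerability records the file its source was found in under a source_file key, which is a made-up field name for this example:

```python
import os


def only_from_file(vulnerabilities, target_path):
    """Keep vulnerabilities whose source is in the file passed with -f."""
    target = os.path.normpath(target_path)
    return [v for v in vulnerabilities
            if os.path.normpath(v['source_file']) == target]


vulnerabilities = [
    {'id': 1, 'source_file': 'example/django.nV/taskManager/misc.py'},
    {'id': 2, 'source_file': 'example/django.nV/taskManager/upload_controller.py'},
]

kept = only_from_file(
    vulnerabilities,
    'example/django.nV/taskManager/upload_controller.py',
)
print([v['id'] for v in kept])  # [2]
```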

(b) Vulnerability #3 is not unknown, although we know uploaded_file is tainted we don't have any idea if uploaded_file.temporary_file_path() is something that leads to a vulnerability.

To solve this, we somehow add the return value of uploaded_file.temporary_file_path() to blackbox_assignments. The .args list of the sink might include uploaded_file, so we'll need to change this as well when we're visiting BBorBInode arguments.

(c) Vulnerabilities #4 and #5 are the same vulnerability, stemming from the same line.
(d) In the Vulnerability #5 output, it doesn't show the actual request.whatever line that led to the vulnerability.

Perhaps these can be solved with the same code, not sure.

(e) If you run it without -trim, and search through the output you'll see ret_render_to_response('taskManager/upload.html', 'form'form, ¤call_13) (from the original line render_to_response('taskManager/upload.html', {'form': form}, RequestContext(request))), so I take it I don't handle visual_args very well when they're dictionaries. A low-priority issue from where I stand though.

Another thing that I noticed, but I'm not going to implement, is #71

Add custom usage message

So my bad, I missed this bug when I made the change to run it with -m, but the default argparse usage message no longer fits with how the program should be run. (It says to run '__main__.py')

python3.5 -m pyt example/vulnerable_code/sql/sqli.py

usage: __main__.py [-h] (-f FILEPATH | -gr GIT_REPOS) [-pr PROJECT_ROOT] [-d]
                   [-o OUTPUT_FILENAME] [-csv CSV_PATH] [-p | -vp]
                   [-t TRIGGER_WORD_FILE] [-l LOG_LEVEL] [-a ADAPTOR] [-db]
                   [-dl DRAW_LATTICE [DRAW_LATTICE ...]] [-li | -re | -rt]
                   [-intra] [-ppm]
                   {save,github_search} ...
__main__.py: error: invalid choice: 'example/vulnerable_code/sql/sqli.py' (choose from 'save', 'github_search')

Hopefully it's something simple like http://stackoverflow.com/questions/21185526/custom-usage-function-in-argparse
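It may be as simple as setting prog on the parser, which argparse uses in place of the script name when it builds the usage line. A minimal sketch with a single option standing in for the real argument list:

```python
import argparse

# prog replaces the default sys.argv[0]-based name (here '__main__.py').
parser = argparse.ArgumentParser(prog='python -m pyt')
parser.add_argument('-f', '--filepath', help='path of the file to analyse')

usage = parser.format_usage()
print(usage)
```

If the whole usage string needs to change, ArgumentParser also accepts a usage= argument.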
