python-security / pyt

A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

License: GNU General Public License v2.0

Python 99.99% Makefile 0.01%

pyt control-flow-graph static-analysis python python3 security static-code-analysis program-analysis fixed-point fixed-point-analysis

pyt's Issues

Github access token guide in readme

How to use github search and add file for token

Implement github_oath_token as a configuration

see github_search.py

https://docs.python.org/3/library/configparser.html
http://stackoverflow.com/questions/2026876/packaging-python-applications-with-configuration-files
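
A minimal sketch of reading the token with configparser, as the first link suggests. The section name `github`, the key `oauth_token` and the function name are assumptions made for illustration; none of them are taken from github_search.py.

```python
import configparser


def load_github_token(config_text):
    """Return the token from INI-formatted text, or None if it is absent.

    The 'github' section and 'oauth_token' key are hypothetical names
    chosen for this sketch, not names used by pyt.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback covers both a missing section and a missing key
    return parser.get('github', 'oauth_token', fallback=None)


example_config = """
[github]
oauth_token = not-a-real-token
"""
token = load_github_token(example_config)
```

Keeping the token in a separate file like this would also make it easy to leave it out of version control.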

Investigate `self.filenames.pop() # Should really probably move after restore_saved_local_scope!!!`

This is in the user-defined functions code

Add CodeClimate

E.g. If we look at https://github.com/trailofbits/protofuzz we can see the test coverage at the top and a link to code climate.

To fix this we more or less copy the codeclimate.yml and relevant parts of the top of the README.

(So the [easy] issues are good for new people who want to start contributing to look at.)

Add pre-commit hooks, whatever you like

SyntaxError: Non-ASCII character

Due to https://github.com/python-security/pyt/blob/master/pyt/cfg.py#L21 we get

Kevins-MacBook-Pro-2:example kevinhock$ python cfg_example.py 
Traceback (most recent call last):
  File "cfg_example.py", line 5, in <module>
    from cfg import CFG, print_CFG, generate_ast
  File "/Users/kevinhock/kpyt/pyt/pyt/cfg.py", line 21
SyntaxError: Non-ASCII character '\xc2' in file /Users/kevinhock/kpyt/pyt/pyt/cfg.py on line 21, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

which I think is fixed by adding

# -*- coding: utf-8 -*-

to the top of cfg.py

But, I think it'd be better to just get a new CALL_IDENTIFIER, thoughts?

(Same holds for https://github.com/python-security/pyt/blob/master/pyt/base_cfg.py#L16)

Trim the "Reassigned in:" nodes to the ones that are relevant

So if we have the following code:

@app.route('/menu', methods=['POST'])
def menu():
    param = request.form['suggestion']
    command = 'echo ' + param + ' >> ' + 'menu.txt'
    hey = 'echo ' + param + ' >> ' + 'menu.txt'
    yo = 'echo ' + hey + ' >> ' + 'menu.txt'

    subprocess.call(command, shell=True)

    with open('menu.txt','r') as f:
        menu = f.read()

    return render_template('command_injection.html', menu=menu)

We show the vulnerability output as:

1 vulnerability found:
Vulnerability 1:
File: example/vulnerable_code/command_injection.py
 > User input at line 15, trigger word "form[": 
	param = request.form['suggestion']
Reassigned in: 
	File: example/vulnerable_code/command_injection.py
	 > Line 16: command = 'echo ' + param + ' >> ' + 'menu.txt'
	File: example/vulnerable_code/command_injection.py
	 > Line 17: hey = 'echo ' + param + ' >> ' + 'menu.txt'
	File: example/vulnerable_code/command_injection.py
	 > Line 18: yo = 'echo ' + hey + ' >> ' + 'menu.txt'
File: example/vulnerable_code/command_injection.py
 > reaches line 20, trigger word "subprocess.call(": 
	subprocess.call(command,shell=True)

Where we don't really care about Line 17 and 18 in the output, right?

I ran into this while doing #45, once I fix this then I can make the PR fixing both of them.
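
One way the trimming could work, sketched on a simplified model rather than on pyt's real node classes: walk the reassignments backwards from the sink's arguments and keep only those whose left-hand side is still needed. The tuple layout and the function name are assumptions, and the sketch only covers straight-line code.

```python
def trim_reassignments(reassignments, sink_vars):
    """Keep only the reassignments that can flow into the sink.

    reassignments: list of (line, lhs, rhs_vars) tuples in program order.
    sink_vars: names used as arguments at the sink.
    Straight-line code only; branches would need real data-flow results.
    """
    relevant = set(sink_vars)
    kept = []
    for line, lhs, rhs_vars in reversed(reassignments):
        if lhs in relevant:
            kept.append((line, lhs, rhs_vars))
            # lhs is now explained; its inputs become the names to follow
            relevant.discard(lhs)
            relevant.update(rhs_vars)
    kept.reverse()
    return kept


# The three reassignments from the output above; the sink uses 'command'.
chain = [
    (16, 'command', {'param'}),
    (17, 'hey', {'param'}),
    (18, 'yo', {'hey'}),
]
trimmed = trim_reassignments(chain, {'command'})
```

With this input only the line 16 entry survives, which matches the output we would like to see.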

Add framework adaptor option for Django

A good heuristic for identifying a Django controller is that it has request as the first argument. This is easy enough to implement in an hour by looking at https://github.com/python-security/pyt/pull/52/files, if you're interested.

Virtual env setup guide

pypi packaging

Fix test case: LivenessTest

See commit: d2418ef

Add new language features from 3.6 to cfg

Add new language features from 3.5 to cfg

Run.py compatibility of newline between windows and linux

When comparing the saved file with the output of the run, differing newline conventions create false negatives.
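
A possible approach, sketched with hypothetical helper names: normalise both texts to `\n` before comparing, so that a file saved on Windows still matches a run on Linux.

```python
def normalize_newlines(text):
    """Convert Windows (\\r\\n) and old Mac (\\r) line endings to \\n."""
    return text.replace('\r\n', '\n').replace('\r', '\n')


def outputs_match(saved, current):
    """Compare two outputs while ignoring the newline convention."""
    return normalize_newlines(saved) == normalize_newlines(current)
```

Opening both files in text mode with Python's default universal newlines handling would be another option; the explicit version above also works on strings that never touched a file.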

Search the code base for each repo

Right now we search for a string across all of GitHub search.
We could instead search more specifically within each repo.
This could give better results and maybe fewer requests per minute.
But of course it could also mean more requests per minute, as it would take two requests per repo.

Delete repos after scanning

Check if clean up in repo runner works - it seems not.

Support init files

Just wanted to let people know I'm working on this now 👍 (so nobody shows up w/ a PR adding support for them in a few days.)

Support for output in Json format

It will become easier for other applications to parse the information generated by pyt.
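
A rough sketch of what such output could look like, using only the standard json module. The field names (`source`, `sink`, `path`, `line_number`, `trigger_word`) are assumptions made for this example, not an existing pyt format.

```python
import json


def vulnerabilities_to_json(vulnerabilities):
    """Serialise a list of plain dicts describing findings.

    The report layout is hypothetical; sort_keys keeps the output stable
    so that two runs can be compared textually.
    """
    report = {'vulnerabilities': vulnerabilities}
    return json.dumps(report, indent=2, sort_keys=True)


example = [{
    'source': {
        'path': 'example/vulnerable_code/command_injection.py',
        'line_number': 15,
        'trigger_word': 'form[',
    },
    'sink': {
        'path': 'example/vulnerable_code/command_injection.py',
        'line_number': 20,
        'trigger_word': 'subprocess.call(',
    },
}]
report_text = vulnerabilities_to_json(example)
```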

Uses of self.nodes[-1] cause false positives

I didn't realize this until I investigated the failure of the test_orelse test on my ugly branch, oh well.

Here are the relevant test case and output.

def does_this_kill_us(diff):
	return subprocess.call(diff, shell=True)

@app.route('/poc', methods=['POST'])
def poc():
	try:
	    value = None
	except ImportError:
	    value = request.args.get('foo')
	else:
	    does_this_kill_us(value)

output with -trim flag on.

Vulnerability 1:
File: example/example_inputs/try_orelse.py
 > User input at line 9, trigger word "get(": 
	value = request.args.get('foo')
Reassigned in: 
	File: example/example_inputs/try_orelse.py
	 > Line 11: temp_1_diff = value
	File: example/example_inputs/try_orelse.py
	 > Line 1: diff = temp_1_diff
File: example/example_inputs/try_orelse.py
 > reaches line 2, trigger word "subprocess.call(": 
	ret_does_this_kill_us = subprocess.call(diff,shell=True)

Where does it connect the basic block of the source to does_this_kill_us?
That would be in save_local_scope where self.nodes[-1] here is the last node that was added, which is Label: value = request.args.get('foo')

            previous_node = self.nodes[-1]
            r = RestoreNode(save_name + ' = ' + assignment.left_hand_side,
                            save_name, [assignment.left_hand_side],
                            line_number=line_number, path=self.filenames[-1])
            saved_scope_node = self.append_node(r)

            saved_variables.append(SavedVariable(LHS=save_name,
                                                 RHS=assignment.left_hand_side))
            previous_node.connect(saved_scope_node)

Okay, how should we fix it?
Well we currently only use the [-1] node when we're adding a function (in master this means a user-defined function, in my branch it means a user defined function or a blackbox/builtin function) so I guess in say, handle_or_else in base_cfg.py we either (a) pass in what should be the node we want to connect to e.g. value = None or (b) connect the 2 nodes in e.g. handle_or_else. These are both a lot of work though because we have to do something special in everywhere a function can be called.

Add readthedocs

If you look at https://github.com/trailofbits/manticore/blob/master/README.md you can see a nice link at the top to the docs. I'll write the docs once the layout is there, please see
https://www.slideshare.net/mobile/JohnCosta/how-to-readthedocs

(So the [easy] issues are good for new people who want to start contributing to look at.)

Baseline support

Once #100 is merged this will be do-able.

A baseline lets you diff a current run against a previous run (probably one containing known issues or false positives). As a big part of continuous integration, baseline support is super important.

See https://github.com/openstack/bandit as a tool that implements this.

    parser.add_argument('-b', '--baseline',
                        help='path of a baseline report to compare against '
                             '(only JSON-formatted files are accepted)',
                        type=str,
                        default=None)

There is also the newly open sourced detect-secrets repo from the Yelp security team that implements this.
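
To make the idea concrete, here is a small sketch of the comparison step. It assumes both reports are JSON documents with a top-level `vulnerabilities` list of dicts; that layout and the function name are hypothetical and not taken from pyt or bandit.

```python
import json


def new_findings(baseline_text, current_text):
    """Return findings in the current report that the baseline lacks.

    Each finding is compared by its canonical JSON form, so key order
    inside a finding does not matter.
    """
    baseline = json.loads(baseline_text)['vulnerabilities']
    current = json.loads(current_text)['vulnerabilities']
    known = {json.dumps(v, sort_keys=True) for v in baseline}
    return [v for v in current
            if json.dumps(v, sort_keys=True) not in known]
```

A real implementation would probably want to ignore line numbers, since unrelated edits shift them and would make known findings look new.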

Can't clone repo on MacOS

Hi!

For some weird reason when cloning the repo on a mac (tested with 10.11 and 10.13) the file pyt/trigger_definitions/flask_trigger_words.pyt won't be written.

here's an example:

} /tmp$ git clone https://github.com/python-security/pyt.git
Cloning into 'pyt'...
remote: Counting objects: 5740, done.
remote: Total 5740 (delta 0), reused 0 (delta 0), pack-reused 5740
Receiving objects: 100% (5740/5740), 2.62 MiB | 3.75 MiB/s, done.
Resolving deltas: 100% (3916/3916), done.
Checking connectivity... done.
} /tmp$ cd pyt/
} /tmp/pyt$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	deleted:    pyt/trigger_definitions/flask_trigger_words.pyt

no changes added to commit (use "git add" and/or "git commit -a")

Even copy-pasting the content into a file results in the file not existing. We tried the default terminal, iTerm2 and IntelliJ's terminal, all with the same result, so it mustn't be the terminal.

After some trial and error we suspect that the faulty line is subprocess.call( but a hexdump of the file (on a Xenial box) doesn't show much...

root@web1:~/pyt/pyt/trigger_definitions# cat flask_trigger_words.pyt | hexdump -C
00000000  73 6f 75 72 63 65 73 3a  0a 67 65 74 28 0a 2e 64  |sources:.get(..d|
00000010  61 74 61 0a 66 6f 72 6d  5b 0a 66 6f 72 6d 28 0a  |ata.form[.form(.|
00000020  4d 61 72 6b 75 70 28 0a  63 6f 6f 6b 69 65 73 5b  |Markup(.cookies[|
00000030  0a 66 69 6c 65 73 5b 0a  53 51 4c 41 6c 63 68 65  |.files[.SQLAlche|
00000040  6d 79 0a 0a 73 69 6e 6b  73 3a 0a 72 65 70 6c 61  |my..sinks:.repla|
00000050  63 65 28 20 2d 3e 20 65  73 63 61 70 65 0a 73 65  |ce( -> escape.se|
00000060  6e 64 5f 66 69 6c 65 28  20 2d 3e 20 27 2e 2e 27  |nd_file( -> '..'|
00000070  2c 20 27 2e 2e 27 20 69  6e 0a 65 78 65 63 75 74  |, '..' in.execut|
00000080  65 28 0a 73 79 73 74 65  6d 28 0a 66 69 6c 74 65  |e(.system(.filte|
00000090  72 28 0a 73 75 62 70 72  6f 63 65 73 73 2e 63 61  |r(.subprocess.ca|
000000a0  6c 6c 28 0a 72 65 6e 64  65 72 5f 74 65 6d 70 6c  |ll(.render_templ|
000000b0  61 74 65 28 0a 73 65 74  5f 63 6f 6f 6b 69 65 28  |ate(.set_cookie(|
000000c0  0a 72 65 64 69 72 65 63  74 28 0a 75 72 6c 5f 66  |.redirect(.url_f|
000000d0  6f 72 28 0a 66 6c 61 73  68 28 0a 6a 73 6f 6e 69  |or(.flash(.jsoni|
000000e0  66 79 28                                          |fy(|
000000e3

As a result, the tool can't seem to run on a Mac since this file is not available; it fails with

Traceback (most recent call last):
  File ".../bin/pyt", line 11, in <module>
    load_entry_point('pyt==1.0.0a20', 'console_scripts', 'pyt')()
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/__main__.py", line 247, in main
    args.trim_reassigned_in)
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/vulnerabilities.py", line 394, in find_vulnerabilities
    definitions = parse(trigger_word_file)
  File ".../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions_parser.py", line 48, in parse
    with open(trigger_word_file, 'r') as fd:
FileNotFoundError: [Errno 2] No such file or directory: '.../lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions/flask_trigger_words.pyt'

Does that ring a bell?

Dynamic partitioning of date interval in github search

Running pyt with no arguments yields exception

python -m pyt

expected result: help menu
actual result: exception
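
A minimal sketch of the expected behaviour with argparse: check for an empty argument list before parsing and print the help menu. The parser below is a stand-in with a single `-f` option, not pyt's real parser.

```python
import argparse


def build_parser():
    """Build a tiny stand-in parser; pyt's real one has more options."""
    parser = argparse.ArgumentParser(prog='pyt')
    parser.add_argument('-f', '--filepath',
                        help='path of the file to analyse')
    return parser


def main(argv):
    """Return an exit status instead of raising when argv is empty."""
    parser = build_parser()
    if not argv:
        parser.print_help()
        return 1
    args = parser.parse_args(argv)
    return 0 if args.filepath else 1
```

In a real entry point this would be called as `sys.exit(main(sys.argv[1:]))`.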

(Not an issue right now) Handle multiple returns

~~I'll try to work on this relatively soon, but~~ to think out loud..

In interprocedural_cfg.py, we have

def return_handler(self, node, function_nodes):
    """Handle the return from a function during a function call."""
    call_node = None
    for n in function_nodes:
        if isinstance(n, ConnectToExitNode):
            LHS = CALL_IDENTIFIER + 'call_' + str(self.function_index)
            previous_node = self.nodes[-1]
            if not call_node:
                RHS = 'ret_' + get_call_names_as_string(node.func)
                r = RestoreNode(LHS + ' = ' + RHS, LHS, [RHS],
                                line_number=node.lineno,
                                path=self.filenames[-1])
                call_node = self.append_node(r)
                previous_node.connect(call_node)
        else:
            # make the proper connection
            pass

which cleaned is

def return_handler(self, call_node, function_nodes):
    """Handle the return from a function during a function call.

    Args:
        call_node(ast.Call) : The node that calls the definition.
        function_nodes(list[Node]): List of nodes of the function being called.
    """
    for node in function_nodes:
        # Only Return's and Raise's can be of type ConnectToExitNode
        if isinstance(node, ConnectToExitNode):                
            # Create e.g. ¤call_1 = ret_func_foo RestoreNode
            LHS = CALL_IDENTIFIER + 'call_' + str(self.function_call_index)
            RHS = 'ret_' + get_call_names_as_string(call_node.func)
            return_node = RestoreNode(LHS + ' = ' + RHS,
                                      LHS,
                                      [RHS],
                                      line_number=call_node.lineno,
                                      path=self.filenames[-1])
            self.nodes[-1].connect(return_node)
            self.nodes.append(return_node)
            return

Firstly, the for loop and the if statement seem to just serve the purpose of "Is there a node of type Return or Raise in the function?" But I think all functions should have at least one return node, right? I'm not sure if I understand the original intention that well e.g. what was going to be in the else?

Secondly, here is an example to illustrate the problem/need to handle multiple returns:

TODO

Write tests for main.py

As we can see on CodeClimate https://codeclimate.com/github/python-security/pyt/coverage/5935971dbf92ed000102998b test coverage of main is pretty low. I understand why, but adding some tests for it would increase our test coverage percentage, and 75% isn't satisfying.

If you have any trouble with this I can help. I am going to label this issue as Easy so newcomers see it.

Refactor: base_test_case.py

Framework adaptor only connects first tainted arg to following nodes

In get_func_cfg_with_tainted_args the following taints the args of a framework function, and then links the first arg to the following nodes:

pyt/pyt/framework_adaptor.py

Lines 41 to 52 in a762e00

    
    # Taint all the arguments
    for arg in args:
        tainted_node = TaintedNode(arg, arg,
                                   None, [],
                                   line_number=definition_lineno,
                                   path=definition.path)
        function_entry_node.connect(tainted_node)
        # 1 and not 0 so that Entry Node remains first in the list
        func_cfg.nodes.insert(1, tainted_node)

    first_arg = func_cfg.nodes[len(args)]
    first_arg.connect(first_node_after_args)

For a framework function where multiple args are user-controlled, this could miss issues related to second or subsequent args. For example, in Django, URL path elements may be passed to a View as args.

For example /xss1/<param>/ could route to:

def xss1(request, param):
    return render(request, 'templates/xss.html', {'param': param})

The suggested fix is to connect each tainted node to the following node in the for loop:

        ...
        func_cfg.nodes.insert(1, tainted_node)
        tainted_node.connect(first_node_after_args)
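
The effect of the suggested fix can be checked on a toy graph. The `Node` class and `taint_args` function below are simplified stand-ins written for this illustration; they are not pyt's TaintedNode or get_func_cfg_with_tainted_args.

```python
class Node:
    """Toy CFG node with only a label and outgoing edges."""

    def __init__(self, label):
        self.label = label
        self.outgoing = []

    def connect(self, successor):
        self.outgoing.append(successor)


def taint_args(entry_node, args, first_node_after_args):
    """Create one tainted node per argument and connect each of them
    to the first node of the function body, inside the loop."""
    tainted_nodes = []
    for arg in args:
        tainted_node = Node(arg)
        entry_node.connect(tainted_node)
        tainted_node.connect(first_node_after_args)
        tainted_nodes.append(tainted_node)
    return tainted_nodes


entry = Node('Entry xss1')
body = Node("return render(request, 'templates/xss.html', {'param': param})")
tainted = taint_args(entry, ['request', 'param'], body)
```

Here both `request` and `param` have an edge to the body node, so taint from the second argument is no longer lost.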

No module named 'graphviz' error

When attempting to run the intro example on XSS_reassign, I'm getting

ModuleNotFoundError: No module named 'graphviz'

Could someone help me out? I'm really excited to use pyt and ready to get started!

Update Readme

use .rst format so it also can be used for pypi packaging

make a parameter for date for github search

Make argparse date arg
Maybe make menu for github search

Blackbox functions return tainted value (false positives)

So, there are 2 commented out tests that I added during my interprocedural PR here and here.

I was going to fix these via doing something in an else of this if in visit_call, marking node.something as blackbox but I'm wondering what the best place to check something at, maybe in reaching_definitions_base, or adding them to the sanitizers dictionary so they can be marked as sanitized here.

Thoughts?

FileNotFoundError: [Errno 2] No such file or directory: '.../trigger_definitions/flask_trigger_words.pyt'

I installed pyt using:

$ python3 setup.py install --user

But apparently that didn't put the *.pyt files in the right place.
All I get is:

$ pyt -f test.py
Traceback (most recent call last):
  File "/home/jwilk/.local/bin/pyt", line 11, in <module>
    load_entry_point('pyt==1.0.0a20', 'console_scripts', 'pyt')()
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/__main__.py", line 219, in main
    vulnerability_log = find_vulnerabilities(cfg_list, analysis)
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/vulnerabilities.py", line 291, in find_vulnerabilities
    definitions = parse(trigger_word_file)
  File "/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions_parser.py", line 48, in parse
    with open(trigger_word_file, 'r') as fd:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jwilk/.local/lib/python3.5/site-packages/pyt-1.0.0a20-py3.5.egg/pyt/trigger_definitions/flask_trigger_words.pyt'

Tested with git master (96f6205).

Add a Python 2 flag

Skip over the 2to3 shell out if it is set.
You can add a test file with this flag set as well, comment out any test if it fails, and also make the flag value accessible in the cfg files.

(So the [easy] issues are good for new people who want to start contributing to look at.)

sqli.py test fails on non-3.6 versions of python

So I tried to add a tox.ini with all the versions of py3 + pypy3. From the results file I realized that with 3.6 you get:

2 vulnerabilities found:
Vulnerability 1:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 26, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 27, trigger word "execute(": 
	result = db.engine.execute(param)

Vulnerability 2:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 33, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 36, trigger word "filter(": 
	result = session.query(User).filter('username={}'.format(param))

whereas below 3.6 you get

2 vulnerabilities found:
Vulnerability 1:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 33, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 36, trigger word "filter(": 
	result = session.query(User).filter('username={}'.format(param))

Vulnerability 2:
File: example/vulnerable_code/sql/sqli.py
 > User input at line 26, trigger word "get(": 
	param = request.args.get('param', 'not set')
File: example/vulnerable_code/sql/sqli.py
 > reaches line 27, trigger word "execute(": 
	result = db.engine.execute(param)

Notice the difference? It's just an ordering problem.

This would explain why I had to change the results file to get Travis CI to pass in #23
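
Whatever causes the ordering difference, one way to make the results file stable across versions would be to sort the findings on an explicit key before printing. The `Finding` tuple and its fields are assumptions made for this sketch.

```python
from collections import namedtuple

# Hypothetical, simplified view of a reported vulnerability.
Finding = namedtuple('Finding', 'path source_line sink_line')


def sort_findings(findings):
    """Order findings by file, then source line, then sink line."""
    return sorted(findings,
                  key=lambda f: (f.path, f.source_line, f.sink_line))


path = 'example/vulnerable_code/sql/sqli.py'
unordered = [Finding(path, 33, 36), Finding(path, 26, 27)]
ordered = sort_findings(unordered)
```

With this key the line 26 finding always comes first, as in the 3.6 output above.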

scan_github

One big ugly method with a lot of exception handling

Split trigger definition file into 3 separate files

A file for sources, sinks and sanitisers.

False-negative when function call follows ControlFlowNode

Just moved to SF! Figured I would talk to myself in a GitHub issue to reacquaint myself with what I was doing.

This issue was the only test to fail on #63, similar to the [-1] issue it only happens with function calls because it is the only place in the codebase where we "artificially" create nodes.

The reason why the test passed in master and not on the #63 PR was because in master only user-defined functions were turned into more than 1 node, and in the #63 PR blackbox calls are supported as well. (The test only has blackbox calls.)
(
Before it was something like

  param = request.args.get('param', 'not set')

Now it is something like

  ¤call_1 = ret_request.args.get('param', 'not set')
  param = ¤call_1

)

So if master had tests like below but with say outer() being user-defined and replacing scrypt.outer below, the same problem occurs.

import scrypt

image_name = request.args.get('image_name')
for x in range(0, 10):
  print(x)
# print('if this print statement is here everything works')
foo = scrypt.outer(image_name) # Any call after ControlFlowNode causes the problem
send_file(foo)

import scrypt

image_name = request.args.get('image_name')
if not image_name:
    image_name = 'foo'
# print('if this print statement is here everything works')
foo = scrypt.outer(image_name) # Any call after ControlFlowNode causes the problem
send_file(foo)

I gave 2 examples here because it is not specific to if or for, but any ControlFlowNode followed by a function call.

Where and why does this occur though?

In stmt_star_handler in base_cfg.py we have a call to connect_nodes self.connect_nodes(cfg_statements).
The purpose is to connect one node to the next node. (n->n+1 on and on.)

The problem is that foo = ¤call_2 is connected to both the if statement and the last statement of its body, instead of the first node of the call.
In the first example, this is ¤call_1 = ret_request.args.get('param', 'not set') rather than param = ¤call_1.

You might say, what's so bad? You already made a new node type BBnode, just set a first_statement attribute as e.g. ¤call_1 = ret_request.args.get('param', 'not set') and you'd kind of be right.
Except nested function calls make that a pain, because I'll need to follow a chain of "What's your first node?" "What's your first node?".
e.g. if it was `¤call_2 = ret_request.args.get(¤call_1, 'not set')` we would need first_statement to be the first_statement of ¤call_1.
I think this is the way forward, but I don't feel good about it.
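
The "What's your first node?" chain can be sketched as a short loop. `CallNode` and its `inner_calls` attribute are hypothetical names for this illustration, and the labels are abbreviated; the real node classes carry much more state.

```python
class CallNode:
    """Stand-in for the node created for a function call."""

    def __init__(self, label, inner_calls=()):
        self.label = label
        # Calls nested in the arguments, in evaluation order.
        self.inner_calls = list(inner_calls)


def first_statement(node):
    """Follow nested calls until reaching the one evaluated first."""
    while node.inner_calls:
        node = node.inner_calls[0]
    return node


call_1 = CallNode('call_1 = ret_inner(...)')
call_2 = CallNode('call_2 = ret_outer(call_1, ...)', [call_1])
```

The preceding ControlFlowNode would then be connected to `first_statement(call_2)`, which is `call_1` here, rather than to the assignment that follows.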

2 Duplication problems and a false-positive in a portion of django.nV output, among other things.

So I run python -m pyt -a E -f example/django.nV/taskManager/upload_controller.py -trim and out I get:

5 vulnerabilities found:
Vulnerability 1:
File: example/django.nV/taskManager/misc.py
 > User input at line 24, trigger word "Flask function URL parameter": 
	title
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_2 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 2:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 11, trigger word "get(": 
	¤call_3 = ret_request.POST.get('name', False)
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 11: name = ¤call_3
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_6 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 3:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 3, trigger word "Flask function URL parameter": 
	request
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_uploaded_file = request.FILES['file']
	File: example/django.nV/taskManager/misc.py
	 > Line 24: uploaded_file = temp_4_uploaded_file
File: example/django.nV/taskManager/misc.py
 > reaches line 33, trigger word "system(": 
	¤call_6 = ret_os.system('mv ' + uploaded_file.temporary_file_path() + ' ' + '%s/%s' % (upload_dir_path, title))

Vulnerability 4:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 11, trigger word "get(": 
	¤call_3 = ret_request.POST.get('name', False)
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
	File: example/django.nV/taskManager/misc.py
	 > Line 41: ret_store_uploaded_file = '/static/taskManager/uploads/%s' % title
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: ¤call_4 = ret_store_uploaded_file
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: upload_path = ¤call_4
File: example/django.nV/taskManager/upload_controller.py
 > reaches line 16, trigger word "execute(": 
	¤call_8 = ret_curs.execute('insert into taskManager_file ('name','path','project_id') values ('%s','%s',%s)' % (name, upload_path, project_id))

Vulnerability 5:
File: example/django.nV/taskManager/upload_controller.py
 > User input at line 3, trigger word "Flask function URL parameter": 
	request
Reassigned in: 
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: temp_4_title = name
	File: example/django.nV/taskManager/misc.py
	 > Line 24: title = temp_4_title
	File: example/django.nV/taskManager/misc.py
	 > Line 41: ret_store_uploaded_file = '/static/taskManager/uploads/%s' % title
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: ¤call_4 = ret_store_uploaded_file
	File: example/django.nV/taskManager/upload_controller.py
	 > Line 12: upload_path = ¤call_4
File: example/django.nV/taskManager/upload_controller.py
 > reaches line 16, trigger word "execute(": 
	¤call_8 = ret_curs.execute('insert into taskManager_file ('name','path','project_id') values ('%s','%s',%s)' % (name, upload_path, project_id))

There are many issues with this output.

(a)
Vulnerability #1 should not be in the output, or at least, if you would argue it should be, you'd concede it's a good idea to give an option for vulnerabilities like it to not be in the output. When I say 'vulnerabilities like it' I mean, we ran it on a controller file, upload_controller.py which calls into misc.py, then we reported vulnerabilities as though we ran it on misc.py, resulting in a duplicate (vulnerabilities 1 and 2).

To solve this, maybe we should do something with self.filenames[-1] inside of interprocedural.py or just, at a higher level, grab the file from the -f output and skip any vulnerabilities that don't match it (note the File: example/django.nV/taskManager/misc.py in the output). The latter idea sounds cleaner and smoother.
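
The higher-level idea could look roughly like this. The sketch assumes each finding records the file of its source under a `source_path` key, since the source's file is what distinguishes vulnerability 1 from vulnerability 2 in the output above; the dict layout and function name are hypothetical.

```python
import os


def findings_for_file(findings, target_path):
    """Keep findings whose source lies in the file passed with -f."""
    target = os.path.normpath(target_path)
    return [f for f in findings
            if os.path.normpath(f['source_path']) == target]


findings = [
    {'source_path': 'example/django.nV/taskManager/misc.py',
     'sink_line': 33},
    {'source_path': 'example/django.nV/taskManager/upload_controller.py',
     'sink_line': 33},
]
kept = findings_for_file(
    findings, 'example/django.nV/taskManager/upload_controller.py')
```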

(b) Vulnerability #3 is not marked as unknown: although we know uploaded_file is tainted, we have no idea whether uploaded_file.temporary_file_path() is something that leads to a vulnerability.

To solve this, we somehow add the return value of uploaded_file.temporary_file_path() to blackbox_assignments. The .args list of the sink might include uploaded_file, so we'll need to change this as well when we're visiting BBorBInode arguments.

(c) Vulnerabilities #4 and #5 are the same vulnerability, stemming from the same line.
(d) In the Vulnerability #5 output, it doesn't show the actual request.whatever line that led to the vulnerability.

Perhaps these can be solved with the same code, not sure.

(e) If you run it without -trim, and search through the output you'll see ret_render_to_response('taskManager/upload.html', 'form'form, ¤call_13) (from the original line render_to_response('taskManager/upload.html', {'form': form}, RequestContext(request))), so I take it I don't handle visual_args very well when they're dictionaries. A low-priority issue from where I stand though.

Another thing that I noticed, but I'm not going to implement, is #71

python3.5 -m pyt example/vulnerable_code/sql/sqli.py

usage: __main__.py [-h] (-f FILEPATH | -gr GIT_REPOS) [-pr PROJECT_ROOT] [-d]
                   [-o OUTPUT_FILENAME] [-csv CSV_PATH] [-p | -vp]
                   [-t TRIGGER_WORD_FILE] [-l LOG_LEVEL] [-a ADAPTOR] [-db]
                   [-dl DRAW_LATTICE [DRAW_LATTICE ...]] [-li | -re | -rt]
                   [-intra] [-ppm]
                   {save,github_search} ...
__main__.py: error: invalid choice: 'example/vulnerable_code/sql/sqli.py' (choose from 'save', 'github_search')

Hopefully it's something simple like http://stackoverflow.com/questions/21185526/custom-usage-function-in-argparse

Create tests for try_orelse_with_no_variables_to_save.py and try_orelse_with_no_variables_to_save_and_no_args.py

In this commit 23e6412 you can see where this would affect the program. (Search for "# raise")

The test would be just like test_orelse in cfg_test.py.

