wogscpar / szzunleashed

An implementation of the SZZ algorithm, i.e., an approach to identify bug-introducing commits.

License: MIT License

Python 47.11% Shell 0.24% Java 52.36% Dockerfile 0.29%

szz-algorithm szz git defect-prediction software-engineering-research mining-software-repositories

szzunleashed's Issues

docker image name issue

The Docker image name is 'ssz', but the README runs 'szz'.

Wrong pair order in SimpleBugIntroducerFinder

It seems to me that in the following code snippet

SZZUnleashed/szz/src/main/java/heuristics/SimpleBugIntroducerFinder.java

Lines 180 to 207 in 79f369f

    
    for (Map.Entry<String, List<String>> entry : bucketIntroducers.entrySet()) {
      List<String> introducers = entry.getValue();
      List<String> issues = bucketIssues.get(entry.getKey());
      RevisionCombinationGenerator gen = new RevisionCombinationGenerator(introducers, issues, 2);
      gen = gen.iterator();
      while(gen.hasNext()) {
        String[] pair = gen.getNextIndic();
        if (pair[0] == "" && pair[1] == "")
          continue;
        if (isWithinTimeframe(pair[1], pair[0])) {
          bugIntroducers.add(pair);
        } else {
          if (!partialIntroducers.containsKey(entry.getKey())) {
            partialIntroducers.put(entry.getKey(), new ArrayList<>());
          }
          partialIntroducers.get(entry.getKey()).add(pair[0]);
          if (!partialIssues.containsKey(entry.getKey())) {
            partialIssues.put(entry.getKey(), new ArrayList<>());
          }
          partialIssues.get(entry.getKey()).add(pair[1]);
        }
      }
    }

pair[0] is a bug-introducing commit, and pair[1] is a bug-fixing commit as defined in the issue list.

However, in line 193 (as well as in line 224), I think the order of the pair should be (bug-fixing commit, bug-introducing commit), so (pair[1], pair[0]).

Is this correct or am I missing something?

UnicodeEncodeError from fetch.py

The fetch script results in a UnicodeEncodeError, see below. An empty file "res0.json" is created in the issues subdirectory.

Environment: Python 3.6.6 on Win10.

python fetch.py
Total issue matches: 2435
Progress: | = 1000 issues
Traceback (most recent call last):
  File "fetch.py", line 53, in <module>
    fetch()
  File "fetch.py", line 46, in fetch
    f.write(conn.read().decode('utf-8'))
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2193' in position 320915: character maps to <undefined>
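
The traceback ends in cp1252.py, which suggests the output file was opened with the platform default encoding rather than UTF-8. A minimal sketch of one possible workaround follows. It assumes the script writes the decoded response to a text file; the function name save_response and the demo payload are hypothetical and not taken from fetch.py.

```python
import os
import tempfile


def save_response(raw_bytes, path):
    """Decode a UTF-8 response body and write it with an explicit encoding.

    Passing encoding='utf-8' to open() avoids falling back to the platform
    default (cp1252 on many Windows setups), which cannot encode U+2193.
    """
    text = raw_bytes.decode('utf-8')
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)
    return text


if __name__ == '__main__':
    # Hypothetical payload containing the character named in the traceback.
    payload = '{"summary": "arrow \u2193"}'.encode('utf-8')
    out_path = os.path.join(tempfile.mkdtemp(), 'res0.json')
    save_response(payload, out_path)
    with open(out_path, encoding='utf-8') as f:
        print(len(f.read()), 'characters written')
```

Whether this matches the eventual fix in the repository is not confirmed here; it only shows the explicit-encoding idea.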

Add parameters to fetch.py script

Currently, fetch.py only works with the Jira repository of the JENKINS project. It would be useful to let the script work with any Jira project.

I suggest passing two parameters when the script is called from the command line:

The code of the project issues in Jira, e.g., JENKINS
The name of the Jira page of the project, e.g., issues.jenkins-ci.org

Moreover, a third parameter could set the end date of the query, which is currently fixed to "2018-02-20 10:34".

If you agree, I will generate a pull request with the above-mentioned changes.
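
The proposal could be sketched with argparse roughly as follows. The flag names --issue-code and --jira-project match the invocation shown in a later issue on this page; --end-date is a hypothetical name for the suggested third parameter, and the defaults simply reuse the values quoted above.

```python
import argparse


def build_parser():
    """Build a command-line parser for the proposed fetch.py parameters."""
    parser = argparse.ArgumentParser(
        description='Fetch resolved bug issues from a Jira project.')
    parser.add_argument('--issue-code', default='JENKINS',
                        help='Jira project key, e.g. JENKINS')
    parser.add_argument('--jira-project', default='issues.jenkins-ci.org',
                        help='Host (and path) of the Jira instance')
    # Hypothetical option: the name --end-date is an assumption.
    parser.add_argument('--end-date', default='2018-02-20 10:34',
                        help='Only fetch issues created on or before this date')
    return parser


if __name__ == '__main__':
    # Parse an explicit list so the demo does not depend on sys.argv.
    args = build_parser().parse_args(['--issue-code', 'ACCUMULO'])
    print(args.issue_code, args.jira_project, args.end_date)
```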

Unclear Figure 3

I'm looking at Figure 3 in your paper (BTW, nice graphics!) and I don't understand why line 3 in Commit 3 is not blamed to either Commit 2 or Commit 1.

Could you please clarify?

Short description of repo and tags need an update

The short description of the repo should be corrected. Instead of

A complete implementation of the SZZ algorithm as described by Zeller et al's.

I suggest

An implementation of the SZZ algorithm, i.e., an approach to identify bug-introducing commits.

The reference to the work by the SZZ authors is anyway first in the README.md... on top of that "Zeller et al." is wrong, since it's actually "Śliwerski et al."

Also, I suggest adding a few more tags to support findability of this repo from the software engineering research community. I suggest adding: "defect-prediction", "mining-software-repositories", and "software-engineering-research".

I want to know how to run this program

I don't know how to use the file assemble_code_churns.py.

java.lang.ClassCastException when casting JSONArray to JSONObject

The "issues" subfolder is created, but it's empty.

Here is the stack trace:

[main] INFO Main - Checking available processors...
[main] INFO Main - Found 8 processes!
Exception in thread "main" java.lang.ClassCastException: org.json.simple.JSONArray cannot be cast to org.json.simple.JSONObject
        at diff.SimplePartition.splitJSON(SimplePartition.java:52)
        at diff.SimplePartition.splitFile(SimplePartition.java:121)
        at Main.main(Main.java:44)
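
The trace indicates that splitJSON cast the parsed root to a JSONObject while the file's root was a JSONArray. Before running the jar, the shape of an input file can be checked with a few lines of Python. This is only a diagnostic sketch: the helper name top_level_kind is hypothetical, and it does not say which file the tool expects beyond what the trace implies.

```python
import json


def top_level_kind(json_text):
    """Return 'object', 'array', or 'other' for the top-level JSON value."""
    value = json.loads(json_text)
    if isinstance(value, dict):
        return 'object'
    if isinstance(value, list):
        return 'array'
    return 'other'


if __name__ == '__main__':
    # Hypothetical inputs: a keyed object versus a bare list.
    print(top_level_kind('{"abc123": {"hash": "abc123"}}'))
    print(top_level_kind('[{"hash": "abc123"}]'))
```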

Using for a GitHub Project

How can this be used on a public GitHub repository to find bug-fixing commits and the modified code (for any release)?

olt parameter

How do I use the olt parameter that you mentioned in #28?
I want to try it to avoid a MemoryError.
Thanks

FileNotFoundError in git_log_to_array.py

FileNotFoundError exception in git_log_to_array.py, see below.

Cloned jenkins to C:\Code\jenkins and provided absolute path to script. Tried a few variations of separators in file path. Not sure which file the system is looking for.

C:\Code\SZZUnleashed\fetch_jira_bugs>python git_log_to_array.py --repo-path C:/Code/jenkins
Traceback (most recent call last):
  File "git_log_to_array.py", line 43, in <module>
    git_log_to_json(init_hash, path_to_repo)
  File "git_log_to_array.py", line 14, in git_log_to_json
    stdout=subprocess.PIPE).stdout.decode('ascii').split()
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
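
When [WinError 2] is raised from inside subprocess, it often refers to the program being launched rather than to a data file, so one possible cause is that git is not on PATH. That is an assumption about this report, not a confirmed diagnosis. A small check, with the hypothetical helper name require_executable:

```python
import shutil


def require_executable(name):
    """Return the full path of an executable or raise a clear error."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            '{} was not found on PATH; install it or add it to PATH'.format(name))
    return path


if __name__ == '__main__':
    try:
        print('git found at', require_executable('git'))
    except FileNotFoundError as err:
        print(err)
```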

cannot find bug fixing issues in Jira Projects of Apache Software Foundation

The analysis fails for any project in the Jira of the Apache Software Foundation.

As an example, we executed:

git clone https://github.com/apache/accumulo
python3 fetch.py --issue-code ACCUMULO --jira-project issues.apache.org/jira

python3 git_log_to_array.py --repo-path ./accumulo --from-commit 31b54248176f320cc15f432ece29f998a0d3a363

python3 find_bug_fixes.py --gitlog gitlog.json --issue-list ./issues

The last script cannot find any matching bug-fixing commit, even though several commit messages mention the Jira issue ID and clearly state "fixed".
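
To check independently whether commit messages mention issue keys, a simple regular expression can be used. This sketch is not the matching logic of find_bug_fixes.py; the function name find_issue_keys and the sample messages are hypothetical.

```python
import re


def find_issue_keys(message, issue_code):
    """Return Jira-style keys such as ACCUMULO-1234 found in a commit message."""
    pattern = re.compile(
        r'\b' + re.escape(issue_code) + r'-(\d+)\b', re.IGNORECASE)
    return ['{}-{}'.format(issue_code, m.group(1))
            for m in pattern.finditer(message)]


if __name__ == '__main__':
    # Hypothetical commit messages for illustration only.
    print(find_issue_keys('ACCUMULO-1234: fixed NPE in scanner', 'ACCUMULO'))
    print(find_issue_keys('Refactor tests, no ticket', 'ACCUMULO'))
```

If keys are found this way but the script still reports no matches, the mismatch is more likely in how the issue list and the git log are joined than in the commit messages themselves.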

"szz_find_bug_introducers-0.1.jar" stalls for a long time Kafka 2.1.1

First of all, I really appreciate your work on making the SZZ algorithm public. This is truly helpful for researchers and practitioners.

Secondly, I am not using it with Docker, and I am a Windows user.

Questions:
[1] I noticed that annotation.json is created quickly; however, the command line still shows "trying to find potential bug introducing commit" and stalls for a very long time. Based on the documentation, if "annotation.json" has the same information as "fix_and_introducers_pairs.json" but shows details about the bug-introducing file rather than the commit, I do not understand why it stalls for so long to get the commits.

As soon as I run the program, the following happens:
A.

B. Inside each one I already have

C. However, it stalls for a very long time here.

[2] How would I be able to get the file introducing the bug rather than the commit? Can I traverse annotation.json and look for filePath?

Thank you!

ClassCastException in GitParser.java:342

Successfully built szz_find_bug_introducers-0.1.jar with gradle. The res1000.json file appears to be populated with correct data.

Not sure if it is related, and I am also not sure what the purpose of the file results\result0\commits.json is, but it is missing and a FileNotFoundException is thrown.

If this is not a bug, then the instructions need to be updated to explain what needs to be present before running the jar file.

Output and stack trace:

java -jar szz_find_bug_introducers-0.1.jar -i C:\Code\SZZUnleashed\fetch_jira_bugs\issues\res1000.json -r C:\Code\jenkins
[main] INFO Main - Checking available processors...
[main] INFO Main - Found 8 processes!
[Thread-0] INFO parser.GitParserThread - Started process...
Exception in thread "Thread-0" java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
        at parser.GitParser.readBugFixCommits(GitParser.java:342)
        at parser.GitParserThread.run(GitParserThread.java:94)
[Thread-1] INFO parser.GitParserThread - Started process...
Exception in thread "Thread-1" java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Map
        at parser.GitParser.readBugFixCommits(GitParser.java:342)
        at parser.GitParserThread.run(GitParserThread.java:94)
[Thread-2] INFO parser.GitParserThread - Started process...
Exception in thread "Thread-2" java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Map
        at parser.GitParser.readBugFixCommits(GitParser.java:342)
        at parser.GitParserThread.run(GitParserThread.java:94)
[Thread-3] INFO parser.GitParserThread - Started process...
[Thread-4] INFO parser.GitParserThread - Started process...
[Thread-5] INFO parser.GitParserThread - Started process...
[Thread-5] INFO parser.GitParserThread - Found 0 number of commits.
[Thread-5] INFO parser.GitParserThread - Checking each commits diff...
[Thread-5] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-5] INFO parser.GitParserThread - Saving parsed commits to file
Exception in thread "Thread-4" java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.Map
        at parser.GitParser.readBugFixCommits(GitParser.java:342)
        at parser.GitParserThread.run(GitParserThread.java:94)
[Thread-5] INFO parser.GitParserThread - Building line mapping graph.
[Thread-6] INFO parser.GitParserThread - Started process...
[Thread-5] INFO parser.GitParserThread - Saving results to file
[Thread-6] INFO parser.GitParserThread - Found 0 number of commits.
[Thread-6] INFO parser.GitParserThread - Checking each commits diff...
[Thread-5] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-6] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-6] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-5] INFO parser.GitParserThread - Saving found bug introducing commits...
[Thread-7] INFO parser.GitParserThread - Started process...
[Thread-6] INFO parser.GitParserThread - Building line mapping graph.
[Thread-7] INFO parser.GitParserThread - Found 0 number of commits.
[Thread-6] INFO parser.GitParserThread - Saving results to file
[Thread-7] INFO parser.GitParserThread - Checking each commits diff...
[Thread-7] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-6] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-7] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-6] INFO parser.GitParserThread - Saving found bug introducing commits...
[Thread-7] INFO parser.GitParserThread - Building line mapping graph.
[Thread-7] INFO parser.GitParserThread - Saving results to file
[Thread-7] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-7] INFO parser.GitParserThread - Saving found bug introducing commits...
Exception in thread "Thread-3" java.lang.ClassCastException: org.json.simple.JSONArray cannot be cast to java.util.Map
        at parser.GitParser.readBugFixCommits(GitParser.java:342)
        at parser.GitParserThread.run(GitParserThread.java:94)
java.io.FileNotFoundException: results\result0\commits.json (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileReader.<init>(Unknown Source)
        at diff.SimplePartition.mergeFiles(SimplePartition.java:140)
        at Main.main(Main.java:65)
java.io.FileNotFoundException: results\result1\commits.json (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileReader.<init>(Unknown Source)
        at diff.SimplePartition.mergeFiles(SimplePartition.java:140)
        at Main.main(Main.java:65)
java.io.FileNotFoundException: results\result2\commits.json (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileReader.<init>(Unknown Source)
        at diff.SimplePartition.mergeFiles(SimplePartition.java:140)
        at Main.main(Main.java:65)
java.io.FileNotFoundException: results\result3\commits.json (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileReader.<init>(Unknown Source)
        at diff.SimplePartition.mergeFiles(SimplePartition.java:140)
        at Main.main(Main.java:65)
java.io.FileNotFoundException: results\result4\commits.json (The system cannot find the file specified)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileInputStream.<init>(Unknown Source)
        at java.io.FileReader.<init>(Unknown Source)
        at diff.SimplePartition.mergeFiles(SimplePartition.java:140)
        at Main.main(Main.java:65)

Add references to previous "calls for a public SZZ implementation"

I believe we found some explicit calls for a complete open source SZZ implementation in the literature. I don't remember which papers raised this need. Does anyone know? It would make sense to add references to these in the Readme, to state that we "respond to the call by [REF] and [REF]".

Parameters passed when running the jar file

I may be totally wrong here, but it seems to me that Configuration.java supports more parameters than the two specified in Readme.md for the szz_find_bug_introducers-<version_number>.jar file.
Maybe an explanation about them should be added in the Readme.md file.

In particular, I think that it may be important to explain the parameter that sets the number of cores during the execution. Currently, the Readme.md file states:

The algorithm tries to use as many cores as possible during runtime.

However, the option to enable the user to set the number of cores seems to have been implemented in Configuration.java. I think this possibility should be explained in Readme.md since it may be relevant for the users.

What do you think?

I found a "fetch.py" error

I noticed that the contents of your res0.json, res1000.json and res2000.json are exactly the same.

Regardless of the values of start_at and max_results, the same data is returned (no difference).

The response is always:

{"expand":"schema,names","startAt":0,"maxResults":50,"total":2445,"issues":
[{"expand":"operations,versionedRepresentations,editmeta,changelog,renderedFields",
"id":"188567","self":"https://issues.jenkins-ci.org/rest/api/2/issue/188567","key":
"JENKINS-49642","fields":{"issuetype":
{"self":"https://issues.jenkins-ci.org/rest/api/2/issuetype/1","id":"1",
"description":"A problem which impairs or prevents the functions of the product.","iconUrl":

...

You can paste the following string into your browser's address bar:

https://issues.jenkins-ci.org/rest/api/2/search?jql=project%20%3D%20JENKINS%20AND%20issuetype%20%3D%20Bug%20AND%20status%20in%20%28Resolved%2C%20Closed%29%20AND%20resolution%20%3D%20Fixed%20AND%20component%20%3D%20core%20AND%20created%20%3C%3D%20%222018-02-20%2010%3A34%22%20ORDER%20BY%20created%20DESC&start_at=0&max_results=1

https://issues.jenkins-ci.org/rest/api/2/search?jql= \
project = JENKINS AND issuetype = Bug AND status in (Resolved, Closed) \
AND resolution = Fixed AND component = core \
AND created <= "2018-02-20 10:34" \
ORDER BY created DESC&start_at=0&max_results=1

In the SZZ results, I find that there is a line number of 0?

A line number such as:

{"0": 0}

Fetching issues does not work properly

Jenkins or Jira issues are not downloaded correctly. The fetch function in fetch.py downloads only 50 issues at a time, and pagination does not work either, because the first 50 issues are downloaded every time.

I tested the code with several Jira projects and the Jenkins project provided as an example. All tested projects had the same problem.

Apparently, the Jira REST API's syntax has changed, which causes the problem. I fixed the issue by changing start_at to startAt and max_results to maxResults. An example of my fix is below:

    request = 'https://' + jira_project_name + '/rest/api/2/search?'\
        + 'jql={}&startAt={}&maxResults={}'
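
Building on the request template above, pagination could be sketched as follows. The helper names build_search_url and page_offsets are hypothetical, and the page size of 50 is taken from the behaviour described in this issue rather than from the script.

```python
from urllib.parse import quote


def build_search_url(jira_project_name, jql, start_at, max_results):
    """Build a Jira search URL using the startAt/maxResults parameters."""
    request = 'https://' + jira_project_name + '/rest/api/2/search?' \
        + 'jql={}&startAt={}&maxResults={}'
    # quote() percent-encodes spaces, '=' and quotes in the JQL string.
    return request.format(quote(jql), start_at, max_results)


def page_offsets(total, page_size):
    """Return the startAt value for every page needed to cover `total` issues."""
    return list(range(0, total, page_size))


if __name__ == '__main__':
    jql = 'project = JENKINS AND issuetype = Bug'
    for offset in page_offsets(120, 50):
        print(build_search_url('issues.jenkins-ci.org', jql, offset, 50))
```

The total would normally come from the "total" field of the first response, as seen in the JSON excerpt in the earlier issue.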

Help: ran SZZ with -Xmx64g for 5 days without getting any results

I referred to it and ran the SZZ program for 5 days without getting any results.

java -Xmx64g -jar ${JAR_PATH}/szz_find_bug_introducers-0.1.jar -d 1 -i ${ISSUE_LIST_PATH}/hadoop.json -r ${REPOS_PATH}/hadoop/

$ tree hadoop/
hadoop/
├── issues
│   ├── fix_and_introducers_pairs_0.json
│   ├── fix_and_introducers_pairs_1.json
│   ├── fix_and_introducers_pairs_2.json
│   ├── fix_and_introducers_pairs_3.json
│   ├── fix_and_introducers_pairs_4.json
│   ├── fix_and_introducers_pairs_5.json
│   ├── fix_and_introducers_pairs_6.json
│   └── fix_and_introducers_pairs_7.json
└── results
    ├── result0
    │   ├── annotations.json
    │   └── commits.json
    ├── result1
    │   ├── annotations.json
    │   └── commits.json
    ├── result2
    │   └── commits.json
    ├── result3
    │   └── commits.json
    ├── result4
    │   ├── annotations.json
    │   └── commits.json
    ├── result5
    │   ├── annotations.json
    │   └── commits.json
    ├── result6
    │   ├── annotations.json
    │   └── commits.json
    └── result7
        ├── annotations.json
        └── commits.json

10 directories, 22 files
$ free -g
              total        used        free      shared  buff/cache   available
Mem:             32           32           0           0           0          0
Swap:             41           28           13

and nohup.out

[main] INFO Main - Checking available processors...
[main] INFO Main - Found 8 processes!
[Thread-0] INFO parser.GitParserThread - Started process...
[Thread-3] INFO parser.GitParserThread - Started process...
[Thread-4] INFO parser.GitParserThread - Started process...
[Thread-5] INFO parser.GitParserThread - Started process...
[Thread-6] INFO parser.GitParserThread - Started process...
[Thread-7] INFO parser.GitParserThread - Started process...
[Thread-8] INFO parser.GitParserThread - Started process...
[Thread-9] INFO parser.GitParserThread - Started process...
[Thread-8] INFO parser.GitParserThread - Found 1839 number of commits.
[Thread-8] INFO parser.GitParserThread - Checking each commits diff...
[Thread-8] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-9] INFO parser.GitParserThread - Found 1926 number of commits.
[Thread-9] INFO parser.GitParserThread - Checking each commits diff...
[Thread-9] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-3] INFO parser.GitParserThread - Found 1949 number of commits.
[Thread-3] INFO parser.GitParserThread - Checking each commits diff...
[Thread-3] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-6] INFO parser.GitParserThread - Found 1977 number of commits.
[Thread-6] INFO parser.GitParserThread - Checking each commits diff...
[Thread-6] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-4] INFO parser.GitParserThread - Found 2015 number of commits.
[Thread-4] INFO parser.GitParserThread - Checking each commits diff...
[Thread-4] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-7] INFO parser.GitParserThread - Found 1941 number of commits.
[Thread-7] INFO parser.GitParserThread - Checking each commits diff...
[Thread-7] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-5] INFO parser.GitParserThread - Found 2047 number of commits.
[Thread-5] INFO parser.GitParserThread - Checking each commits diff...
[Thread-5] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-0] INFO parser.GitParserThread - Found 1973 number of commits.
[Thread-0] INFO parser.GitParserThread - Checking each commits diff...
[Thread-0] INFO parser.GitParserThread - Parsing difflines for all found commits.
[Thread-3] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-8] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-9] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-0] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-4] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-6] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-7] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-5] INFO parser.GitParserThread - Saving parsed commits to file
[Thread-3] INFO parser.GitParserThread - Building line mapping graph.
[Thread-7] INFO parser.GitParserThread - Building line mapping graph.
[Thread-4] INFO parser.GitParserThread - Building line mapping graph.
[Thread-8] INFO parser.GitParserThread - Building line mapping graph.
[Thread-6] INFO parser.GitParserThread - Building line mapping graph.
[Thread-9] INFO parser.GitParserThread - Building line mapping graph.
[Thread-5] INFO parser.GitParserThread - Building line mapping graph.
[Thread-0] INFO parser.GitParserThread - Building line mapping graph.
[Thread-3] INFO parser.GitParserThread - Saving results to file
[Thread-3] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-6] INFO parser.GitParserThread - Saving results to file
[Thread-6] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-8] INFO parser.GitParserThread - Saving results to file
[Thread-8] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-0] INFO parser.GitParserThread - Saving results to file
[Thread-0] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-9] INFO parser.GitParserThread - Saving results to file
[Thread-9] INFO parser.GitParserThread - Trying to find potential bug introducing commits...
[Thread-7] INFO parser.GitParserThread - Saving results to file
[Thread-7] INFO parser.GitParserThread - Trying to find potential bug introducing commits...

The output result confuses me

In the commits.json file, the line numbers in the diff dict do not seem right: they should be incremented by 1 to give the actual line number.

Executing the command: git blame -l b831acd9854b525d680ca72fd218c848121b9d3f^ -- core/src/test/java/hudson/model/ViewTest.java shows that the deleted code line number is actually 101, not 100. The same applies to the add dict: 1 should be added when writing the result to file.

In addition, the annotation graph result confuses me. For the bug introduction in this file at line 101, git blame shows that the commit hash should be 67827c7eaac821aa22a2f26bd4dbe7d44470b6c9, not 05b46659e451c316fb5f1a5243c49b9a84a50702, which is the result in annotations.json.

In the code at SZZUnleashed/szz/src/main/java/parser/GitParser.java, line 212, should the variable i be index?
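
The difference of one described here is what would be expected if 0-based indices were written where git blame reports 1-based line numbers. That reading is an assumption and has not been checked against the GitParser source. A tiny illustration of the two numbering schemes, with the hypothetical helper number_lines:

```python
def number_lines(lines, one_based=True):
    """Pair each line with its number, counting from 1 (like git blame) or 0."""
    start = 1 if one_based else 0
    return list(enumerate(lines, start=start))


if __name__ == '__main__':
    sample = ['first line', 'second line']
    print(number_lines(sample, one_based=False))  # indices as a program sees them
    print(number_lines(sample))                   # numbers as git blame shows them
```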

When running fetch.py, how do I use another project?

Use GitHub as an issue tracker

A lot of open-source projects rely on GitHub Issues as their issue tracker. Since researchers work a lot with open-source repositories, I think it would be extremely valuable to support retrieving issues from GitHub Issues and using them with SZZUnleashed.

Could this feature be added?

Has the entire commit induced a bug?

I have two questions:
First, in fix_and_bug_introducers.json, is the order [fixing, buggy] or the opposite? I am confused because commits.json, which is supposed to contain the buggy files, has the commit IDs that are at position 0 of the pairs in fix_and_bug_introducers.json.
Second, when SZZUnleashed identifies a commit as buggy, does it label the entire commit as buggy, or just a few files? Apparently commits.json contains the entire commit (as it appears on GitHub), not just the few files that might have caused the bug.
I'd be grateful if someone could answer these.

    for (Map.Entry<String, List<String>> entry : bucketIntroducers.entrySet()) {
      List<String> introducers = entry.getValue();
      List<String> issues = bucketIssues.get(entry.getKey());

      RevisionCombinationGenerator gen = new RevisionCombinationGenerator(introducers, issues, 2);
      gen = gen.iterator();

      while(gen.hasNext()) {
        String[] pair = gen.getNextIndic();
        if (pair[0] == "" && pair[1] == "")
          continue;

        if (isWithinTimeframe(pair[1], pair[0])) {
          bugIntroducers.add(pair);
        } else {

          if (!partialIntroducers.containsKey(entry.getKey())) {
            partialIntroducers.put(entry.getKey(), new ArrayList<>());
          }
          partialIntroducers.get(entry.getKey()).add(pair[0]);

          if (!partialIssues.containsKey(entry.getKey())) {
            partialIssues.put(entry.getKey(), new ArrayList<>());
          }
          partialIssues.get(entry.getKey()).add(pair[1]);
        }
      }
    }