CMPUT663 Project

This repository contains our CMPUT663 project, titled "Which categories of Simple Stupid Bugs (SStuB) are missed by static analyzers?". All scripts and dataset files are organized into dedicated directories. All experiments were run on our local machines, and the results are shared publicly here.

Participants:

Student name CCID
Sakib Hasan sakib2
Nazmus Sakeef sakeef

Task:

We analyzed the performance of the static analyzers SonarQube, SpotBugs, and PMD at detecting simple stupid bugs in 100 Java and 100 Maven projects downloaded from the repository links available in the ManySStuBs4j dataset.

Acknowledgement

In accordance with the UofA Code of Student Behaviour, we acknowledge that

  • We have listed all external resources we consulted for this assignment.

Non-detailed oral discussion with others is permitted as long as any such discussion is summarized and acknowledged by all parties.

Data

Our data files exceed GitHub's file size limit. The dataset files are stored here: CMPUT663 G-Drive, which is our dedicated drive for this project. To reproduce the analysis, download the necessary data files from there.

Directories

The directory structure of the repository is as follows:

  1. BugsAdded_CodeFiles: This directory contains the code files that our manual scanner scans. In these files, we manually injected SStuBs matching the specific patterns the scanner looks for.
  2. NoBugs_CodeFiles: This directory contains the same code files in their original state, before we manually injected the SStuBs. These files can be reused to implement new bug templates or to increase the number of injected bugs.
  3. ImplementedScripts: This directory contains the scripts that we implemented for our analysis. The execution instructions for these scripts are described later.
  4. ScriptsForDataCollection: This directory contains the original scripts from the ManySStuBs4j dataset's official GitHub repository. We used these scripts to download the projects that were used for static analysis through the analyzer tools.
  5. ScannedReports: This directory contains the scan reports in .csv format. There are four files here.

Execution

  1. To run the mining, first download the datasets from the G-Drive.
  2. Next, clone the projects listed in topProjects.csv and topJavaMavenProjects.csv using the scripts clone_top_repos.py and clone_top_maven_repos.py. Any number of projects can be chosen.
  3. For our experiment, we took 100 Java projects and 100 Maven projects. The next step is to scan the projects with the static analyzers of your choice and filter the results for the SStuB patterns you want to analyze. Execution instructions for the static analyzers we used are available here: SonarQube, SpotBugs, and PMD.
  4. Manual Scanning: To manually scan a Java file for the two SStuB templates "Same Function More Args" and "Same Function Less Args", run the following command:

python3 ImplementedScripts/manualScanner.py

The scanner can analyze only one file at a time. It is not yet optimized, so the path to the file to be scanned has to be edited manually in the scanner script.
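To make the two templates concrete, here is a minimal sketch of how a change between two versions of a line could be classified as "Same Function More Args" or "Same Function Less Args". It is not the logic of manualScanner.py: the function names `split_args` and `classify_call_change`, the returned labels, and the regex-based parsing are all assumptions made for illustration. It also ignores commas inside string literals and only looks at the first call on a line.

```python
import re

# Hypothetical pattern: a (possibly qualified) name followed by a parenthesised argument list.
CALL_RE = re.compile(r"([A-Za-z_][\w.]*)\s*\((.*)\)")


def split_args(arg_text):
    """Split an argument string on top-level commas only."""
    args, depth, current = [], 0, ""
    for ch in arg_text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        args.append(current.strip())
    return args


def classify_call_change(buggy_line, fixed_line):
    """Return an illustrative label when the same function is called with a
    different number of arguments in the fix, otherwise None."""
    buggy = CALL_RE.search(buggy_line)
    fixed = CALL_RE.search(fixed_line)
    if not buggy or not fixed or buggy.group(1) != fixed.group(1):
        return None
    n_buggy = len(split_args(buggy.group(2)))
    n_fixed = len(split_args(fixed.group(2)))
    if n_fixed > n_buggy:
        return "SAME_FUNCTION_MORE_ARGS"
    if n_fixed < n_buggy:
        return "SAME_FUNCTION_LESS_ARGS"
    return None
```

For example, `classify_call_change("list.add(value);", "list.add(0, value);")` reports that the fix calls the same function with more arguments. A real scanner would need a proper Java parser to handle string literals, generics, and calls spanning several lines.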

  5. SStuBs Statistics: To run this script, the projects must first be downloaded into a directory, and the file paths in the script must be updated accordingly. The script prints to the terminal the percentage of each of the 16 SStuB templates present in the dataset. If the projects are downloaded from our Google Drive, the percentages and counts of SStuBs will match ours.

To get the statistics, run the following command in the root directory.

python3 ImplementedScripts/dataset_sstub_percentage.py
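The kind of per-template percentage the script reports can be sketched as follows. This is not the code of dataset_sstub_percentage.py. It assumes the dataset is a JSON array of records that each carry the template name under a `bugType` key, and the helper names `load_records` and `sstub_percentages` are hypothetical.

```python
import json
from collections import Counter


def load_records(path):
    """Load a JSON array of SStuB records (the file layout is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def sstub_percentages(records, key="bugType"):
    """Map each template name to (count, percentage of all counted records)."""
    counts = Counter(record[key] for record in records if key in record)
    total = sum(counts.values())
    return {name: (n, 100.0 * n / total) for name, n in counts.most_common()}


def print_report(records):
    """Print one line per template, most frequent first."""
    for name, (n, pct) in sstub_percentages(records).items():
        print(f"{name}: {n} ({pct:.2f}%)")
```

Calling `print_report(load_records(path))` with the path to a downloaded dataset file would print the templates in descending order of frequency. Records without the assumed key are skipped and do not count toward the total.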

Report

Our proposal and final report are saved in the Project Report directory for reference.

Bibliography

The following resources were consulted:

https://pmd.github.io/

https://github.com/SonarSource/sonarqube

https://github.com/spotbugs/spotbugs

https://dl.acm.org/doi/10.1145/3379597.3387491
