bellwether_community's People

Contributors

rahlk, suvodeep90

bellwether_community's Issues

Preprocess data

  • Remove all commits that are not bug-fix related
  • Remove rows that are not involved in a bug fix
  • Create one big CSV for each project
    • Make sure the rows in the file are chronological
    • Keep track of the commit hash (to separate into releases)
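A minimal sketch of the steps above, using only the standard library. The column names (`commit_hash`, `timestamp`, `is_bugfix`, `file`) and the sample rows are assumptions made for illustration; the real data may be laid out differently.

```python
import csv
import io
from datetime import datetime

# Hypothetical column layout; adjust to the real mined data.
FIELDS = ["commit_hash", "timestamp", "is_bugfix", "file"]

def preprocess(rows):
    """Keep bug-fix rows only and return them in chronological order."""
    kept = [r for r in rows if r["is_bugfix"] == "1"]
    kept.sort(key=lambda r: datetime.fromisoformat(r["timestamp"]))
    return kept

def write_project_csv(rows, handle):
    """Write one combined CSV for a project, keeping the commit hash column."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

# Made-up rows, deliberately out of order and including a non-bug-fix commit.
sample = [
    {"commit_hash": "c3", "timestamp": "2019-03-01T10:00:00", "is_bugfix": "1", "file": "a.js"},
    {"commit_hash": "c1", "timestamp": "2019-01-01T10:00:00", "is_bugfix": "1", "file": "b.js"},
    {"commit_hash": "c2", "timestamp": "2019-02-01T10:00:00", "is_bugfix": "0", "file": "a.js"},
]
cleaned = preprocess(sample)
buffer = io.StringIO()
write_project_csv(cleaned, buffer)
```

Because the commit hash stays in every row, the combined file can later be cut into releases by hash.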

Compare bubble vs total

Bubble algorithm:

def bubble(projects):
    # get_attr, birch, get_bellwether and apply_bellwether are helpers defined elsewhere.
    project_attrs = get_attr(projects)
    birch_tree = birch(project_attrs)
    max_level = birch_tree.max_level
    bellwethers = []
    # Walk from the deepest level up to level 1 (range needs an explicit -1 step).
    for level in range(max_level, 0, -1):
        # get_cluster_ids is assumed to be a method taking the level.
        level_cluster_ids = birch_tree.get_cluster_ids(level)
        if level == max_level:
            # Deepest level: pick one bellwether per cluster.
            bellwethers = get_bellwether(level_cluster_ids)
            continue
        # Best (score, bellwether) seen so far for each parent cluster.
        best = {}
        for bellwether in bellwethers:
            parent_cluster = bellwether.parent_cluster_id
            s_project = bellwether.project
            d_projects = birch_tree.cluster[parent_cluster].projects
            score = apply_bellwether(s_project, d_projects)
            if parent_cluster not in best or score > best[parent_cluster][0]:
                best[parent_cluster] = (score, bellwether)
        promoted = [b for _, b in best.values()]
        # Clusters at this level with no promoted bellwether get a fresh one.
        remaining_cluster = [c for c in level_cluster_ids if c not in best]
        bellwethers = promoted + get_bellwether(remaining_cluster)
    return bellwethers
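The promotion step in the pseudocode above (keep the best-scoring bellwether per parent cluster, then find clusters left without one) can be tried in isolation. This is only a toy sketch: `Candidate`, `promote`, `uncovered` and the scores are hypothetical stand-ins, not the project's actual helpers.

```python
from collections import namedtuple

# Hypothetical record: a bellwether project and the cluster one level above it.
Candidate = namedtuple("Candidate", ["project", "parent_cluster_id"])

def promote(candidates, score_fn):
    """Return {parent_cluster_id: (score, project)}, keeping the top scorer per parent."""
    best = {}
    for cand in candidates:
        score = score_fn(cand.project, cand.parent_cluster_id)
        current = best.get(cand.parent_cluster_id)
        if current is None or score > current[0]:
            best[cand.parent_cluster_id] = (score, cand.project)
    return best

def uncovered(level_cluster_ids, best):
    """Clusters at this level that received no promoted bellwether."""
    return [c for c in level_cluster_ids if c not in best]

# Made-up scores, purely for illustration.
toy_scores = {("p1", "A"): 0.6, ("p2", "A"): 0.8, ("p3", "B"): 0.5}
candidates = [Candidate("p1", "A"), Candidate("p2", "A"), Candidate("p3", "B")]
best = promote(candidates, lambda project, cluster: toy_scores[(project, cluster)])
remaining = uncovered(["A", "B", "C"], best)
```

Storing the project next to its score avoids a second lookup when the winners are carried to the next level.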

1385 defect prediction datasets

todo

  • if we apply these sanity checks to GH, would we prune much?
  • results with any tuning?
  • sort learners by tunability: LR, linear SVM, FFT*16 (you'll have to discretize with the goal), NB (m-estimate, Laplace, how to discretize).
    - http://robotics.stanford.edu/users/sahami/papers-dir/disc.pdf
  • start reporting how long it takes to tune
  • add in SMO: that's hard!!!!!!
  • generality experiment: if you do anything, justify it!!!!
  • tune for IFA reduction (just using FFT)

expectations

Expected results:

  1. adequacy of predictors (pd > 66%, pf < 33%)
  2. FSS is useful
  3. Hyperparameter optimization is useful
  4. it all scales
  5. stable conclusion across
  6. stable conclusion locally
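The adequacy check in item 1 can be sketched as follows, taking pd as recall, TP / (TP + FN), and pf as the false alarm rate, FP / (FP + TN), both in percent. The function names and the sample labels are illustrative assumptions.

```python
def pd_pf(actual, predicted):
    """Return (pd, pf) as percentages for binary labels where 1 means defective."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    # Guard against empty classes to avoid division by zero.
    pd_pct = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    pf_pct = 100.0 * fp / (fp + tn) if (fp + tn) else 0.0
    return pd_pct, pf_pct

def is_adequate(actual, predicted, min_pd=66.0, max_pf=33.0):
    """True when pd is above min_pd and pf is below max_pf."""
    pd_pct, pf_pct = pd_pf(actual, predicted)
    return pd_pct > min_pd and pf_pct < max_pf

# Made-up labels: 3 of 4 defects found, 1 of 4 clean files flagged.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
```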

Projects that fail

  • lodash
  • ember-cli
  • webpack
  • expect.js
  • shelljs
  • chai
  • babel
  • karma-mocha
  • node-optimist
  • react
  • TypeScript

Plato filenames are explicit

INCORRECT

                                          Unnamed: 0  E001  E002  E003  E004  ...  W144  W145  W146  W147  W148
0       /analysis/inputs/public/source-code/index.js     0     0     0     0  ...     0     0     0     0     0
1   /analysis/inputs/public/source-code/test/main.js     0     0     0     0  ...     0     0     0     0     0
2  /analysis/inputs/public/source-code/test/submo...     0     0     0     0  ...     0     0     0     0     0
3  /analysis/inputs/public/source-code/test/simpl...     0     0     0     0  ...     0     0     0     0     0

CORRECT

            Unnamed: 0  E001  E002  E003  E004  E005  E006  ...  W142  W143  W144  W145  W146  W147  W148
0            /index.js     0     0     0     0     0     0  ...     0     0     0     0     0     0     0
1        /test/main.js     0     0     0     0     0     0  ...     0     0     0     0     0     0     0
2   /test/submodule.js     0     0     0     0     0     0  ...     0     0     0     0     0     0     0
3  /test/simpleTask.js     0     0     0     0     0     0  ...     0     0     0     0     0     0     0
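One possible way to get from the INCORRECT names to the CORRECT ones is to drop the mount prefix visible in the first table. This is a sketch that assumes the prefix is always `/analysis/inputs/public/source-code`; `relative_name` is a hypothetical helper, not part of Plato.

```python
# Prefix taken from the INCORRECT listing; assumed to be constant across runs.
PREFIX = "/analysis/inputs/public/source-code"

def relative_name(path, prefix=PREFIX):
    """Drop the analysis mount prefix so the name is relative to the repository root."""
    if path.startswith(prefix):
        return path[len(prefix):]
    # Leave names that do not carry the prefix untouched.
    return path

names = [
    "/analysis/inputs/public/source-code/index.js",
    "/analysis/inputs/public/source-code/test/main.js",
    "/test/submodule.js",
]
cleaned_names = [relative_name(n) for n in names]
```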
