pwhat's Issues
Try for stop = 5%, 10%, 15%
Summary:
10
Mean: Win = 8/11 (two projects are more accurate by 5% or more; five projects are more accurate by 2% or more)
STD: Win = 9/11 (four projects are more accurate by 5% or more)
Evals: Win = 3/11 [23.7, 6, 1]
20
Mean: Win = 8/11 (wins in descending order = [46, 11, 8, 5, 1, 1..])
STD: Win = 8/11 (one project is more accurate by 50% or more; the others are better by 2% and 1%)
Evals: Win = 0 (on average, random makes 17 fewer evaluations)
Results
Research Question 1
RQ1: Is progressive sampling useful when compared to the static WHAT?
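To make the RQ1 comparison concrete, here is a minimal sketch of what a progressive sampler could look like. It is not the pWHAT implementation from this repository: the 1-nearest-neighbour learner, the names `progressive_sample`, `start`, `step` and `stop`, and the rule "stop when the validation error improves by fewer than `stop` percentage points" are all assumptions made for illustration.

```python
import random


def nearest_neighbor_predict(train, features):
    """Predict with the value of the closest training row (squared Euclidean distance)."""
    closest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)))
    return closest[1]


def mean_relative_error(train, validation):
    """Mean relative error in percent; validation values must be non-zero."""
    errors = [abs(nearest_neighbor_predict(train, x) - y) / abs(y) for x, y in validation]
    return 100.0 * sum(errors) / len(errors)


def progressive_sample(pool, validation, start=4, step=2, stop=5.0, seed=0):
    """Grow a random training set until the error gain drops below `stop`.

    Returns (validation error in percent, number of training points used).
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    order = pool[:]
    rng.shuffle(order)
    size = min(start, len(order))
    error = mean_relative_error(order[:size], validation)
    while size < len(order):
        next_size = min(size + step, len(order))
        next_error = mean_relative_error(order[:next_size], validation)
        improvement = error - next_error
        size, error = next_size, next_error
        if improvement < stop:
            break
    return error, size


# Toy data: one numeric option, performance = 10 + option value.
toy_pool = [((float(i),), 10.0 + i) for i in range(20)]
print(progressive_sample(toy_pool, toy_pool, stop=5.0, seed=1))
```

A static sampler would fix the training-set size up front. The progressive variant trades extra validation checks for the chance to stop early, which is the trade-off RQ1 asks about.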
Dataset Description
Research Question 3
How does the number of evaluations required by WHAT compare with that of pWHAT?
Ideas?
- Find a better validation set using WHAT
- Add a new stopping criterion based on the ratio of the first two axes of maximum variance
- Check with Guo: Guo_Results.xlsx
- Progressive sampling with randomly picked points --> Works as well as or better than pWHAT, but is not stable.
- Cluster and then split the data into training and testing --> Doesn't help. Suspect a bug in the code.
- Visualization: Visualization.xlsx
- Reimplementing pWHAT
- Exploration Techniques Page 20 https://goo.gl/tFlp4W
- Why does random sampling work well?
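The stopping-criterion idea in the list above (ratio of the first two axes of maximum variance) could be prototyped roughly as follows. This is a sketch under assumptions: it reads "axes of maximum variance" as the top two principal components, estimates their variances by power iteration with deflation, and the `should_stop` direction and `threshold` value are hypothetical, since the notes do not specify them.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix (power iteration)."""
    d = len(matrix)
    vector = [1.0 / math.sqrt(d)] * d  # fixed start; fails if orthogonal to the top axis
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0, vector
        vector = [x / norm for x in product]
        eigenvalue = norm
    return eigenvalue, vector


def variance_axis_ratio(rows):
    """Variance along the first principal axis divided by variance along the second."""
    cov = covariance_matrix(rows)
    first, axis = top_eigenpair(cov)
    d = len(cov)
    # Deflation: remove the first axis so power iteration finds the second one.
    deflated = [[cov[i][j] - first * axis[i] * axis[j] for j in range(d)] for i in range(d)]
    second, _ = top_eigenpair(deflated)
    return math.inf if second == 0.0 else first / second


def should_stop(rows, threshold=10.0):
    """Hypothetical rule: stop once one axis dominates the sampled points."""
    return variance_axis_ratio(rows) >= threshold


points = [(2.0, 0.0), (-2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
print(variance_axis_ratio(points))
```

Whether a large or a small ratio should trigger the stop is an open design choice; the sketch only shows how the ratio itself can be computed from the points sampled so far.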
Summary
results
Sardar et al.
Dataset: Apache Mean: 31.026 Std: 0.017 #Evals: 55
Dataset: BerkeleyC Mean: 8.199 Std: 0.004 #Evals: 219
Dataset: BerkeleyDB Mean: 8.445 Std: 0.001 #Evals: 97
Dataset: BerkeleyDBC Mean: 656.099 Std: 0.304 #Evals: 161
Dataset: BerkeleyDBJ Mean: 52.648 Std: 0.069 #Evals: 57
Dataset: clasp Mean: 57.201 Std: 0.029 #Evals: 167
Dataset: EPL Mean: 5.405 Std: 0.001 #Evals: 104
Dataset: lrzip Mean: 495.48 Std: 0.778 #Evals: 47
Dataset: ajstats Mean: 4.347 Std: 0.0 #Evals: 2013
WHAT
Apache.csv 17.663 11.556 14.4 0.116
BerkeleyC.csv 5.499 1.166 16.0 0.012
BerkeleyDB.csv 1.593 0.297 44.9 0.003
BerkeleyDBC.csv 9.972 2.958 49.4 0.03
BerkeleyDBJ.csv 12.816 14.834 14.4 0.148
clasp.csv 6.551 2.83 30.6 0.028
Dune.csv 15.619 1.723 32.0 0.017
EPL.csv 5.501 1.721 16.0 0.017
Hipacc.csv 16.218 1.108 128.0 0.011
JavaGC.csv 17.661 2.671 512.0 0.027
LinkedList.csv 3.17 0.447 16.0 0.004
lrzip.csv 195.957 214.896 16.0 2.149
PKJab.csv 1.373 0.427 8.0 0.004
SQLite.csv 10.146 6.209 16.0 0.062
Wget.csv 21.352 14.527 15.5 0.145
AJStats.csv 2.243 0.123 128.0 0.001
pWHAT_V1
AJStats.csv 3.599 2.452 118.9 126.327708758
Apache.csv 9.098 2.569 30.1 11.622822
BerkeleyC.csv 5.665 0.63 10.7 1.268857754
BerkeleyDB.csv 2.284 0.368 10 0
BerkeleyDBC.csv 9.679 9.569 51 9.4339811321
clasp.csv 6.353 1.569 31.2 4.5782092569
Dune.csv 26.707 0.465 914.1 1.3
EPL.csv 5.06 1.322 10.5 1.5
Hipacc.csv 10.155 1.082 4387.8 1754.64422605
JavaGC.csv 62.328 0.262 66628.3 3.06757233
LinkedList.csv 3.775 0.613 10 0
lrzip.csv 8.222 5.588 86.2 30.7857109712
PKJab.csv 1.49 0.591 10 0
SQLite.csv 7.599 7.215 15.6 6.3749509802
Wget.csv 13.671 12.968 38.7 26.5633205756
pWHAT_V2
Apache.csv 8.683 1.746 37.2 21.208
BerkeleyC.csv 5.778 1.164 11.0 1.844
BerkeleyDB.csv 2.453 0.406 10.0 0.0
BerkeleyDBC.csv 6.295 4.063 63.0 13.454
BerkeleyDBJ.csv 12.048 13.801 16.3 8.603
clasp.csv 7.257 2.636 35.2 11.531
Dune.csv 27.332 0.568 913.2 0.748
EPL.csv 5.557 2.129 15.9 15.443
Hipacc.csv 9.651 0.068 5363.0 0.0
JavaGC.csv 62.2 0.467 66626.6 2.577
LinkedList.csv 3.626 0.756 10.0 0.0
lrzip.csv 8.942 5.005 87.0 23.891
PKJab.csv 1.245 0.487 10.0 0.0
SQLite.csv 3.426 3.077 30.2 18.405
Wget.csv 17.071 11.706 15.5 8.115
AJStats.csv 4.843 3.67 179.4 110.898
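A win count between two of these listings can be tallied with a few lines of Python. The sketch below copies the first and third numeric columns of the WHAT and pWHAT_V2 listings; treating the first column as a mean error and the third as the number of evaluations (lower is better for both) is an assumption, since the listings carry no column headers. The helper name `count_wins` is hypothetical.

```python
# (first column, third column) per dataset, copied from the WHAT listing.
WHAT = {
    "Apache": (17.663, 14.4), "BerkeleyC": (5.499, 16.0),
    "BerkeleyDB": (1.593, 44.9), "BerkeleyDBC": (9.972, 49.4),
    "BerkeleyDBJ": (12.816, 14.4), "clasp": (6.551, 30.6),
    "Dune": (15.619, 32.0), "EPL": (5.501, 16.0),
    "Hipacc": (16.218, 128.0), "JavaGC": (17.661, 512.0),
    "LinkedList": (3.17, 16.0), "lrzip": (195.957, 16.0),
    "PKJab": (1.373, 8.0), "SQLite": (10.146, 16.0),
    "Wget": (21.352, 15.5), "AJStats": (2.243, 128.0),
}

# Same two columns, copied from the pWHAT_V2 listing.
PWHAT_V2 = {
    "Apache": (8.683, 37.2), "BerkeleyC": (5.778, 11.0),
    "BerkeleyDB": (2.453, 10.0), "BerkeleyDBC": (6.295, 63.0),
    "BerkeleyDBJ": (12.048, 16.3), "clasp": (7.257, 35.2),
    "Dune": (27.332, 913.2), "EPL": (5.557, 15.9),
    "Hipacc": (9.651, 5363.0), "JavaGC": (62.2, 66626.6),
    "LinkedList": (3.626, 10.0), "lrzip": (8.942, 87.0),
    "PKJab": (1.245, 10.0), "SQLite": (3.426, 30.2),
    "Wget": (17.071, 15.5), "AJStats": (4.843, 179.4),
}


def count_wins(candidate, baseline, column):
    """Count datasets where candidate is strictly lower than baseline in `column`.

    Returns (wins, number of datasets present in both listings). Ties are not wins.
    """
    shared = sorted(set(candidate) & set(baseline))
    wins = [name for name in shared if candidate[name][column] < baseline[name][column]]
    return len(wins), len(shared)


print("first column, pWHAT_V2 vs WHAT:", count_wins(PWHAT_V2, WHAT, 0))
print("third column, pWHAT_V2 vs WHAT:", count_wins(PWHAT_V2, WHAT, 1))
```

Only datasets present in both listings are compared, so a listing with a missing row (for example one without BerkeleyDBJ) shrinks the denominator rather than raising an error.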
Try for clustering = spectral
Research Question 2
Is progressive sampling stable?
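One way to put a number on stability is to repeat the sampler with different random seeds and look at the spread of its error and evaluation count. The sketch below assumes a sampler that takes a seed and returns an `(error, evaluations)` pair; `summarize_runs` and `toy_run` are hypothetical names, and `toy_run` is only a random stand-in, not a real sampler.

```python
import random
import statistics


def summarize_runs(run, seeds):
    """Call run(seed) for every seed and summarise the (error, evaluations) pairs."""
    results = [run(seed) for seed in seeds]
    errors = [error for error, _ in results]
    evals = [count for _, count in results]
    return {
        "error_mean": statistics.mean(errors),
        "error_std": statistics.pstdev(errors),
        "evals_mean": statistics.mean(evals),
        "evals_std": statistics.pstdev(evals),
    }


def toy_run(seed):
    """Stand-in for a sampler: returns a random error and evaluation count."""
    rng = random.Random(seed)
    return rng.uniform(5.0, 10.0), rng.randint(10, 40)


print(summarize_runs(toy_run, range(20)))
```

A small standard deviation relative to the mean, on both error and evaluations, would be one reasonable reading of "stable"; the acceptable spread is a judgment call.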