Comments (5)
Hello @LambertAn, thanks for your detailed report. Can you provide some data to reproduce the behaviour? And did you run the integrity check with integrity_score? What score did you get?
from sklearn-porter.
Thanks for getting back to me.
Below is code to build a 3-class extra-trees classifier on random data.
from sklearn_porter import Porter
from sklearn.ensemble import ExtraTreesClassifier
import numpy as np
# Build random dataset
prng = np.random.RandomState(123)
X = prng.rand(50, 10)
y = prng.randint(0, 3, 50)
# Fit model
model = ExtraTreesClassifier(n_estimators=3, max_depth=3, random_state=prng)
model.fit(X, y)
# export:
porter = Porter(model, language='c')
output = porter.export(embed_data=True)
with open('extratree_randomdataset_original.c', 'w') as f_out:
    f_out.write(output)
# accuracy:
integrity = porter.integrity_score(X)
print(integrity)
# Show details for one point
test_point = X[2:3]
for i, estimator in enumerate(model.estimators_):
    print("{}: {} -> {}".format(i, estimator.predict_proba(test_point), estimator.predict(test_point)))
print(model.predict_proba(test_point))
print(model.predict(test_point))
print(test_point)
The integrity score on the training data is 0.86. Let's look at the result for one of the data points (X[2:3] above), where each estimator predicts a different class:
Estimator 0 predicts class 0 with probabilities [0.45 0.20 0.35]
Estimator 1 predicts class 2 with probabilities [0.17 0.08 0.75]
Estimator 2 predicts class 1 with probabilities [0.24 0.52 0.24]
The model predicts class 2 with probabilities [0.29 0.27 0.45].
I attached the above Python code and 2 C files: the original model as generated by sklearn-porter, and a modified version that calculates the probabilities for each estimator as well as their average for the model prediction.
For the above point, the original 'predict' method returns class 0, while the 'predict_proba' method of the modified model returns [0.29 0.27 0.45], i.e. class 2, which matches scikit-learn.
I hope it is enough to reproduce the problem.
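The discrepancy reported above can be checked by hand from the three per-estimator probability vectors quoted in the comment. The sketch below compares two ways of combining them: averaging the probabilities and taking the argmax (which is how scikit-learn forest ensembles predict), and a per-tree majority vote. That the exported C code uses a majority vote with ties going to the lowest class index is an assumption made here for illustration, not something confirmed in this thread; it is simply consistent with the class 0 result reported for this point.

```python
import numpy as np

# Per-estimator class probabilities for the test point, as quoted in the comment above.
probas = np.array([
    [0.45, 0.20, 0.35],  # estimator 0 -> class 0
    [0.17, 0.08, 0.75],  # estimator 1 -> class 2
    [0.24, 0.52, 0.24],  # estimator 2 -> class 1
])

# Soft voting: average the probabilities over estimators, then take the argmax.
mean_proba = probas.mean(axis=0)
soft_vote = int(np.argmax(mean_proba))

# Hard voting (ASSUMED behaviour of the exported code): every estimator casts
# one vote for its own argmax; np.argmax resolves ties to the lowest index.
votes = np.bincount(probas.argmax(axis=1), minlength=probas.shape[1])
hard_vote = int(np.argmax(votes))

print("mean probabilities:", np.round(mean_proba, 2))
print("soft vote:", soft_vote)
print("vote counts:", votes, "hard vote:", hard_vote)
```

With one vote per class, the majority vote is a three-way tie and falls back to class 0, whereas averaging clearly favours class 2. Points like this one, where the trees disagree, are where the two schemes can diverge.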
Hello @LambertAn, we found a small bug and fixed it (release/0.7.0: Merge branch 'master' into release/0.7.0). Can you please reinstall the package and test it again?
pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master
Hi, I finally had some time to test, but unfortunately the problem is not fixed. I ran the Python script above and got exactly the same results as before, including an integrity score of 0.86.
I believe this is the same issue as #52.