Comments (25)
It works like a charm now with v1.0.3; the results I got are:
BEST VALID SCORE FOR forest-cover-type : -0.9637142838702182
FINAL TEST SCORE FOR forest-cover-type : 0.963847749197525
Thanks for the quick replies and quick fixes!
from tabnet.
Thanks @Optimox, this is very helpful. I amended the relevant code and found the following:
print(f"BEST VALID SCORE FOR {dataset_name} : {clf.best_cost}")
BEST VALID SCORE FOR Forest: -0.8830427851320214
print(f"FINAL TEST SCORE FOR {dataset_name} : {test_acc}")
FINAL TEST SCORE FOR Forest: 0.8830042123459947
That looks reasonable, but the authors of the original paper claim 96.99% accuracy, which is still far from my findings. Let me know if you have any feedback on that.
Many thanks again!
@meechos how did you run your experiment? What were the parameters you used?
I launched two experiments:
- launching forest_example as is : valid_accuracy = 0.9445 test_accuracy = 0.9459
- launching forest_example with n_d=n_a=64 (similar to the research paper) : valid_accuracy = 0.9629 test_accuracy = 0.9639
So it's not exactly the same result as written in the paper (0.9699), but it's pretty close. Moreover, we made a fix (there was an issue with shared layers), so the latest version should behave a bit differently and might be better.
FYI: XGBoost from the notebook gave valid_accuracy = 0.9642 and test_accuracy = 0.9650, so TabNet has similar results.
@meechos @carefree0910 the new version has been pushed to PyPI, so you can pip install the latest version and try again; hopefully everything is fixed now!
I noticed a similar result with the Poker Hand dataset: I got 54.04%, far from the 99% reported in the paper.
Thanks @muellerzr, this is interesting. Have you tried any of the examples from this repository, e.g. census_example or regression_example?
@meechos I did some experiments on Forest Cover Type, as the split is given in the paper, results should be reproducible. On one hand, I tried to launch the official TensorFlow code at https://github.com/google-research/google-research/tree/master/tabnet, but I never got the results from the paper (even after more than 120K steps); on the other hand, I had better results with our implementation. The most disturbing part is that vanilla XGBoost gets far better results than the ones written in the original paper. Nevertheless, from what I remember, TabNet had similar results to XGBoost.
@muellerzr On the Poker Hand dataset I remember seeing the same kind of performance you are mentioning. However, this dataset is very unbalanced, so I guess the split matters a lot, but I could not find any hint about the split used in the original paper.
I would be very interested in your results on different datasets; if you run some experiments, please don't hesitate to share them. Thanks!
@Optimox, thanks for a comprehensive reply.
I did some experiments on Forest Cover Type, as the split is given in the paper, results should be reproducible.
Apologies, do you mean you found results similar to what I posted, or similar to the ones in the original paper? Example notebooks of this TabNet implementation with outputs would help; if you would consider sharing them, that would be great.
I have also run several tests with the original TensorFlow codebase. Results were far worse than XGBoost/LightGBM. More interestingly, that TabNet implementation was also worse than a vanilla deep feed-forward neural network by about 10%.
Despite the above, I plan to continue experimenting with your implementation in terms of parameter tuning. Will report back with any findings.
Many thanks!
@meechos I think I found the problem: there is an error in the current notebook. The classes for the dataset are clf.classes_ = [1, 2, 3, 4, 5, 6, 7], which is fine for TabNetClassifier because targets are label-encoded. But in the notebook we compute the test score with test_acc = accuracy_score(y_pred=np.argmax(preds, axis=1), y_true=y_true), which implies that the targets are [0, 1, 2, 3, 4, 5, 6]. So there is a mismatch between predictions and labels; if you just add 1, i.e. test_acc = accuracy_score(y_pred=np.argmax(preds, axis=1)+1, y_true=y_true), your score should be ok.
I'll fix the notebook to make things work as expected!
Thanks for your useful feedback!
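To make the off-by-one concrete, here is a tiny self-contained sketch (dummy data, not the notebook's variables):

```python
# Dummy illustration of the label offset: predictions are 0-indexed
# (argmax over 7 probability columns) while Forest Cover targets are 1..7.
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1, 2, 7, 3])      # labels as stored in the dataset
preds = np.eye(7)[y_true - 1]        # fake "perfect" probability matrix

wrong = accuracy_score(y_pred=np.argmax(preds, axis=1), y_true=y_true)
fixed = accuracy_score(y_pred=np.argmax(preds, axis=1) + 1, y_true=y_true)
print(wrong, fixed)  # 0.0 1.0
```

Even a "perfect" model scores 0 here until the +1 shift realigns the predictions with the 1-indexed labels.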
@Optimox, there must be something we do differently. I am using forest_example.ipynb with parameters:
n_d=32, n_a=32, n_steps=5,
lr=0.02,
gamma=1.5, n_independent=2, n_shared=2,
cat_idxs=cat_idxs,
cat_dims=cat_dims,
cat_emb_dim=1,
lambda_sparse=1e-4, momentum=0.3, clip_value=2.,
optimizer_fn=torch.optim.Adam,
scheduler_params={"gamma": 0.95, "step_size": 20},
scheduler_fn=torch.optim.lr_scheduler.StepLR, saving_path="./", epsilon=1e-15
I have launched two experiments, with versions 1.0.1 and the recent 1.0.2, and got:

pytorch-tabnet version | test acc
---|---
1.0.1 | 0.886
1.0.2 | 0.750
Are you running the exact same notebook? Are you using a GPU for training?
Many thanks, appreciate the commitment!
hey @meechos, the two results I got were with pytorch-tabnet 1.0.1 (I think there is a bug in 1.0.2 that I need to fix; I'm on it), using CPU and launching the notebook without any changes. Did you change the patience? My models were training for more than 900 epochs.
However, when switching to GPU I saw weird behavior: the model early-stopped after around 300 epochs at a score around 0.94 (for the large version, n_d=64). The problem is that GPU results are not reproducible...
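For what it's worth, a common way to reduce GPU nondeterminism in PyTorch is to pin every seed and force deterministic cuDNN kernels; a hedged sketch (this trades speed for determinism and still doesn't cover every CUDA op):

```python
# Sketch: pin all random number generators before training.
import random

import numpy as np
import torch

def seed_everything(seed: int = 0) -> None:
    """Seed Python, NumPy and torch RNGs; make cuDNN deterministic."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op without a GPU
    torch.backends.cudnn.deterministic = True  # slower but repeatable
    torch.backends.cudnn.benchmark = False

seed_everything(42)
```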
Are you using the Docker image from the repo, or are you running on your local machine? Are you using a GPU?
Hello @Optimox, thanks so much for working on 1.0.2. The performance decrease in 1.0.2 was reproduced on my own datasets, so I am sure it's worth debugging.

Are you using the docker image from the repo

I have just pip installed the package; nothing breaks, though. Would there be further configuration? Sorry if I missed that.
I am running experiments on 2 x Tesla GPUs. Definitely not ideal for reproducibility checks, but I think the Forest results suggest something else is out of place.
Thanks for following up on this ticket, cheers
I've seen drops with GPU, but not as big as you describe; you might want to launch the default notebook and simply add device_name='cpu' when calling TabNetClassifier, and see what happens. I don't think the code handles multi-GPU, does it?
Hello @meechos,
I just re-ran forest_example.ipynb on GPU and got 0.93805224058811 for valid and 0.9396401125616378 for test with version 1.0.1.
Thank you for confirming @Hartorn @Optimox. I re-ran an experiment on Google Colab with a GPU and got:
BEST VALID SCORE FOR forest-cover-type : -0.8830427851320214
FINAL TEST SCORE FOR forest-cover-type : 0.883987504625526
I am running a new experiment on CPU and will report back; this is very interesting.
I've encountered a similar performance issue, but the situation seems to be worse. Here are my setup steps:
- pip install pytorch-tabnet, which is v1.0.2
- ONLY downloaded forest_example.ipynb from the develop branch and ran it through

Here are the results for TabNet:

Device used : cuda
Current learning rate: 0.011376001845529194
| 238 | 0.87303 | 0.55215 | 4678.0
Early stopping occured at epoch 238
Training done in 4678.040 seconds.
BEST VALID SCORE FOR forest-cover-type : -0.6234146782240524
FINAL TEST SCORE FOR forest-cover-type : 0.6246912730308167

And the results for XGBoost:

0.017897597087848608
0.017219865235837285

There must be something wrong, but I couldn't figure out what; any help would be appreciated, thanks in advance!
@carefree0910 yep, we have a fix that works; we're just doing some tests before releasing it, so you'll have to wait a few hours for the TabNet results, sorry!
About XGBoost: I think I forgot to change it in the notebook. Simply do accuracy_score(y_pred=np.argmax(preds, axis=1)+1, y_true=y_true) and you'll get the real results.
Thanks for your reply! I'll wait for the release. And thanks again for implementing this interesting work!
Thank you for confirming @Hartorn @Optimox . I re-run an experiment on google colab with a GPU and got:
BEST VALID SCORE FOR forest-cover-type : -0.8830427851320214
FINAL TEST SCORE FOR forest-cover-type : 0.883987504625526
Running a new experiment on CPU and I will report back, this is very interesting.
That is really weird, I'm trying to understand ^^
Also, your validation score is exactly the same as in your first message; it seems weird that the GPU seed wouldn't change anything at all. Are you sure the GPU is recognized? When defining TabNetClassifier, do you see a message saying "Device used : cuda"?
Moreover, in your first message you reported: BEST VALID SCORE FOR EPIGN. What is the EPIGN dataset? Are you sure you are not reading from a wrong file that you renamed forest_cover_type, so that the notebook is not downloading anything but reading the wrong dataset? Sorry for the stupid questions, but how many rows, columns and target modalities does your dataset have?
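A small helper like this (hypothetical; the toy frame and the assumption that the target is the last column are illustrations) would report the numbers asked for above:

```python
# Hypothetical sanity check: report rows, columns and target modalities,
# assuming the target sits in the last column (as in the Forest Cover CSV).
import pandas as pd

def describe_dataset(df: pd.DataFrame):
    """Return (n_rows, n_cols, n_target_classes)."""
    return df.shape[0], df.shape[1], int(df.iloc[:, -1].nunique())

# Toy example; for the real file you would pass the loaded DataFrame.
# Forest Cover Type should come out as (581012, 55, 7).
toy = pd.DataFrame({"feat": [0.1, 0.2, 0.3], "target": [1, 2, 2]})
print(describe_dataset(toy))  # (3, 2, 2)
```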
it seems weird that the GPU seed wouldn't change at all, are you sure the gpu is recognized?
Yes, I also monitored it with nvidia-smi.
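As an extra sanity check (a sketch, assuming PyTorch is the backend, since pytorch-tabnet picks its device through torch):

```python
# Besides nvidia-smi, confirm that PyTorch itself sees the GPU.
import torch

print(torch.cuda.is_available())          # True if a CUDA device is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. the GPU model name
```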
Are you sure you don't read from the wrong file that you renamed forest_cover_type
Exactly my thoughts, as in the first experiments I had to download the data manually due to policy. This is why I ran the notebook on Colab, but interestingly I saw the exact same results as you mention above.
@Optimox this is indeed really weird. Have you tried pip installing the repo into a new environment and fetching the data from scratch?
I can share the Colab notebook if that'd be helpful.
Great! I'll try it out in no time!
Thanks for testing, @carefree0910 !
Have you obtained these results on CPU or GPU?
I'm using a TITAN RTX for testing, and GPU results are reproducible now!
However, I'm not sure whether a different GPU product type matters...
@Optimox @carefree0910 similar results here with 1.0.3 on the forest dataset.
Thanks so much for the quick release; I will report back with results on my own datasets.
FYI, the authors' code for the experimental results in the paper can be found at:
https://openreview.net/forum?id=BylRkAEKDH
Code: https://drive.google.com/file/d/1oLQRgKygAEVRRmqCZTPwno7gyTq22wbb/view?usp=sharing