zihangjiang / tokenlabeling
Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
License: Apache License 2.0
Hi, I am curious about the problem of dimension inconsistency.
(1) The shape of "score_map" generated in generate_label.py is [2, 5, H, W], but score_maps seems to have shape [2, H, W, 5] in "score_maps[-1,0,0,5]=target" (line 97 of TokenLabeling/tlt/data/dataset.py).
(2) The shape of "label_maps_topk" in line 54 of TokenLabeling/tlt/data/mixup.py is [batch_size, 3, H, H, 5], but I cannot find where "score_maps" is transformed into "label_maps_topk", or what information is stored at indices 0, 1, and 2 of the second dimension of "label_maps_topk".
This problem has also been mentioned in Issue #9.
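A minimal sketch of how a top-5 score map could be expanded into a dense per-token label map. It assumes that index 0 of the first dimension holds the scores and index 1 holds the matching class indices; the traceback in another issue on this page passes `_label_topk[1]` as a `.long()` index, which is consistent with that reading, but the layout is an assumption and not confirmed by the authors. `topk_to_dense` is a hypothetical helper written with plain lists so the indexing is explicit.

```python
def topk_to_dense(score_map, num_classes):
    """Expand a [2][K][H][W] top-k map into a dense [H][W][num_classes] map.

    Assumed layout (not confirmed by the repository):
      score_map[0][k][h][w] is the k-th score at position (h, w)
      score_map[1][k][h][w] is the class index that score belongs to
    """
    scores, indices = score_map
    k_top = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    dense = [[[0.0] * num_classes for _ in range(width)] for _ in range(height)]
    for k in range(k_top):
        for h in range(height):
            for w in range(width):
                cls = int(indices[k][h][w])
                dense[h][w][cls] = float(scores[k][h][w])
    return dense


# Toy example: K=2 candidates on a 1x2 grid with 4 classes.
toy = [
    [[[0.7, 0.6]], [[0.2, 0.3]]],  # scores,  shape [K=2][H=1][W=2]
    [[[3, 0]], [[1, 2]]],          # indices, shape [K=2][H=1][W=2]
]
dense = topk_to_dense(toy, num_classes=4)
print(dense[0][0])  # [0.0, 0.2, 0.0, 0.7]
print(dense[0][1])  # [0.6, 0.0, 0.3, 0.0]
```

Under this reading, a scatter along the class dimension does the same thing in one tensor operation, which is why every stored index has to be a valid class id.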
If I only have one GPU, how should I change it?
Greetings,
Thank you for this incredible research.
I would like to know whether Token Labeling can achieve scores higher than those of the annotator model. I believe this was the case with the VOLO D5 model, which achieved a higher score than NFNet, the model used for annotation.
The model parameters couldn't be downloaded.
where is the training code?
Hello,
Thank you for sharing your work. I am currently trying to generate token labels for a custom dataset with the lvvit_s model, but the loss stays close to 7 and the accuracy at 0 (not pre-trained, using 1 GPU in Google Colab). I also tried the pre-trained model with --transfer but got 0 for both loss and accuracy. Which options should I use for a custom dataset?
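One way to read a loss "close to 7": for a 1000-way softmax, a uniform prediction gives a cross-entropy of ln(1000), about 6.91. A value near 7 would therefore be consistent with a classification head still sized for ImageNet that has not learned anything yet. This is only a sanity-check sketch; the class counts below are illustrative assumptions, not values from the poster's setup.

```python
import math


def chance_level_cross_entropy(num_classes):
    """Cross-entropy of a uniform prediction over num_classes classes."""
    return math.log(num_classes)


for n in (2, 10, 1000):
    print(n, round(chance_level_cross_entropy(n), 3))
# 2 0.693
# 10 2.303
# 1000 6.908
```

If the custom dataset has far fewer classes, comparing the observed loss with the chance level for that class count is a quick check that `--num-classes` matches the data.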
Hi, can you re-check the models you provided by the download link? I downloaded the first one but it cannot be unzipped.
Have you tried this setting?
In your released lvvit.py code, MixToken is implemented by cutting and mixing the original grid map with the flipped one, so the labels do not need to change. This is not what the paper describes.
Is this what you actually did during training?
Hi, thanks for your wonderful work. I have a question: can the training techniques mentioned in LV-ViT be used in other downstream tasks such as object detection? In your paper, I see that many of these techniques are used on ImageNet. Thanks!
Hi,
Thanks for sharing your work.
Could you also provide the pre-trained weights for the LV-ViT-T model variant, the one that achieves 79.1% top1-acc. as mentioned in Table 1 of your paper?
All the best,
Marc
If I want to train on 1p,
what batch size should I allocate?
Or is there a formula to compute it, please?
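No formula is given in this thread. A common heuristic, offered here only as a sketch, is to keep the global batch size (per-device batch size times number of devices) fixed and, if that does not fit in memory, to scale the learning rate linearly with the global batch size. The helper names and the reference values below are placeholders, not settings taken from the repository.

```python
def per_device_batch_size(global_batch_size, num_devices):
    """Per-device batch size that keeps the global batch size unchanged."""
    if global_batch_size % num_devices != 0:
        raise ValueError("global batch size must be divisible by num_devices")
    return global_batch_size // num_devices


def linearly_scaled_lr(base_lr, base_global_batch, new_global_batch):
    """Linear scaling heuristic: learning rate proportional to global batch."""
    return base_lr * new_global_batch / base_global_batch


# Placeholder reference setup: global batch 1024 spread over 8 devices.
print(per_device_batch_size(1024, 8))       # 128
print(per_device_batch_size(1024, 1))       # 1024 (may not fit on one device)
print(linearly_scaled_lr(1e-3, 1024, 128))  # 0.000125
```

The last line shows the adjustment if a single device can only hold 128 samples per step; gradient accumulation is another way to reach the original global batch size.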
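Hello authors,
If I do not pass token_label, it looks like the default dataset is used, which seems to be just ordinary training.
So is token_label_path required, or does the code also handle ordinary data? I could not find that in the code.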
How can I generate dense label maps on my own dataset?
tar -xvf lvvit_s-26M-384-84-4.pth.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
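The `tar` message above suggests the file is not a tar archive despite its `.pth.tar` name. Checkpoints with that suffix are commonly written by `torch.save` and are meant to be read with `torch.load` rather than unpacked; whether that holds for this particular download is an assumption. A small standard-library sketch, with a hypothetical helper name `sniff_checkpoint_format`, that checks what kind of container a file actually is:

```python
import os
import pickle
import tarfile
import tempfile
import zipfile


def sniff_checkpoint_format(path):
    """Return a rough guess of the container format of a checkpoint file."""
    if tarfile.is_tarfile(path):
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"      # newer torch.save output is a zip container
    with open(path, "rb") as f:
        head = f.read(1)
    if head == b"\x80":   # opcode that starts a pickle with protocol >= 2
        return "pickle"   # older torch.save output starts with a pickle
    return "unknown"


# Demo with a stand-in file; the name mimics the suffix but is not the real model.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pth.tar")
    with open(demo_path, "wb") as f:
        pickle.dump({"state_dict": {}}, f, protocol=2)
    demo_kind = sniff_checkpoint_format(demo_path)

print(demo_kind)  # pickle
```

If the result is "zip" or "pickle", loading the file directly with `torch.load(path, map_location="cpu")` is the usual next step; "unknown" may point to an incomplete download.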
Hi,
Thanks for the wonderful work.
Could you share with us the password to unzip LV-ViT-S pretrained model ?
Thanks !
Hello,
I'm interested in your token labeling technique,
and I would like to apply it to a CNN-based model, because ViT is very heavy to train.
Could I get your code for token labeling with a CNN?
If not, could you give me some details for implementing it?
Thank you.
I am new to deep learning. When I run the VOLO code with tlt on a single GPU or on multiple GPUs, I get the following error:
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [0,0,0], thread: [25,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
Traceback (most recent call last):
  File "main.py", line 949, in <module>
    main()
  File "main.py", line 664, in main
    optimizers=optimizers)
  File "main.py", line 773, in train_one_epoch
    label_size=args.token_label_size)
  File "/opt/conda/lib/python3.6/site-packages/tlt/data/mixup.py", line 90, in mixup_target
    y1 = get_labelmaps_with_coords(target, num_classes, on_value=on_value, off_value=off_value, device=device, label_size=label_size)
  File "/opt/conda/lib/python3.6/site-packages/tlt/data/mixup.py", line 64, in get_labelmaps_with_coords
    num_classes=num_classes,device=device)
  File "/opt/conda/lib/python3.6/site-packages/tlt/data/mixup.py", line 16, in get_featuremaps
    _label_topk[1][:, :, :].long(),
RuntimeError: CUDA error: device-side assert triggered.
I can't fix this problem right now.
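The "index out of bounds" assertion in a scatter/gather kernel is typically raised when an index is negative or not smaller than the size of the target dimension. One possible cause, offered as a guess rather than a diagnosis, is a class index in the label data that is not smaller than `num_classes`. A plain-Python sketch of the check (the helper name `find_bad_indices` and the toy values are hypothetical):

```python
def find_bad_indices(class_indices, num_classes):
    """Return (position, value) pairs for indices outside [0, num_classes)."""
    return [
        (pos, idx)
        for pos, idx in enumerate(class_indices)
        if not 0 <= idx < num_classes
    ]


# Toy label indices; 1000 and -1 are invalid for a 1000-class problem.
labels = [0, 3, 999, 1000, -1]
bad = find_bad_indices(labels, num_classes=1000)
print(bad)  # [(3, 1000), (4, -1)]
```

Running the failing step on CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually yields a more precise error location than the asynchronous device-side assert.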
Hello,
I was trying to compute the class weight "balanced". I see that there are two arguments:
parser.add_argument('--dense-weight', type=float, default=0.5,
help='Token labeling loss multiplier (default: 0.5)')
parser.add_argument('--cls-weight', type=float, default=1.0,
help='Cls token prediction loss multiplier (default: 1.0)')
How can I add the parameter "weights" into the loss?
What I did so far:
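import numpy as np
import torch
import torch.nn as nn
from sklearn.utils import class_weight
# Recent scikit-learn versions require `classes` and `y` as keyword arguments.
class_weights = class_weight.compute_class_weight('balanced', classes=np.unique(target_values.numpy()), y=target_values.numpy())
class_weights = torch.tensor(class_weights, dtype=torch.float)
train_loss_fn = nn.CrossEntropyLoss(weight=class_weights).cuda()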
Note that this changes the loss function; before, I was using TokenLabelCrossEntropy:
train_loss_fn = TokenLabelCrossEntropy(dense_weight=args.dense_weight,
                                       cls_weight=args.cls_weight,
                                       mixup_active=mixup_active,
                                       ground_truth=args.ground_truth).cuda()
Thank you in advance
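For reference, scikit-learn's 'balanced' mode computes each class weight as n_samples / (n_classes * count_of_that_class). A dependency-free sketch of that formula follows; the helper `balanced_class_weights` and the toy labels are illustrative. How such weights would be wired into `TokenLabelCrossEntropy` is not shown in this thread and is not assumed here.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weights following n_samples / (n_classes * count_per_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * counts[cls]) for cls in sorted(counts)}


# Toy labels: class 0 appears three times, class 1 once.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights[1])            # 2.0
print(round(weights[0], 4))  # 0.6667
```

The rare class gets the larger weight, so its errors count more in a weighted cross-entropy.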
Hi,
When I tried to run the label generation script for the model lvvit_s it returned an error "RuntimeError: Unknown model".
Solution: It worked when I added the line "import tlt.models" in the file generate_label.py.
The shape of 'score_map' is [2, 5, H, W], but I'm curious why the image class label is appended at this coordinate.
TokenLabeling/tlt/data/dataset.py
Line 97 in 5cc1461
I am interested in whether there is any LV-ViT model setup you have tested on CIFAR-10. I would like to know the proper setup of all blocks when training without pretrained weights.
Test: [ 0/1] Time: 11.293 (11.293) Loss: 0.7043 (0.7043) Acc@1: 42.1875 (42.1875) Acc@5: 100.0000 (100.0000)
Test: [ 1/1] Time: 0.108 (5.701) Loss: 0.5847 (0.6689) Acc@1: 89.8148 (56.3187) Acc@5: 100.0000 (100.0000)
free(): invalid pointer
free(): invalid pointer
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py", line 303, in <module>
    main()
  File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launch.py", line 294, in main
    raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/opt/conda/bin/python3.8', '-u', 'main.py', '--local_rank=1', './dataset/c/c', '--model', 'lvvit_s', '-b', '128', '--apex-amp', '--img-size', '224', '--drop-path', '0.1', '--token-label', '--token-label-size', '14', '--dense-weight', '0.0', '--num-classes', '2', '--finetune', './pretrained/lvvit_s-26M-384-84-4.pth.tar']' died with <Signals.SIGABRT: 6>.
root@btq3ajqsfk1cu-0:/puxin_libochao/TokenLabeling# CUDA_VISIBLE_DEVICES=0,1 bash ./distributed_train.sh 2 ./dataset/c/c --model lvvit_s -b 128 --apex-amp --img-size 224 --drop-path 0.1 --token-label --token-label-size 14 --dense-weight 0.0 --num-classes 2 --finetune ./pretrained/lvvit_s-26M-384-84-4.pth.tar
Hi
Thanks so much for the nice work!
I am curious whether you could share some insight into how the label_map is processed.
If I understand correctly, after we load the image and the corresponding label map, we should apply the same crop/flip/resize to both, but in
TokenLabeling/tlt/data/label_transforms_factory.py
Lines 58 to 73 in aa438ef
Should we do
return torchvision_F.resized_crop(
    img, i, j, h, w, self.size, interpolation
), torchvision_F.resized_crop(
    label_map, i / ratio, j / ratio, h / ratio, w / ratio, self.size, interpolation
)
Thanks
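The proposal above amounts to mapping the crop box from image pixels to label-map cells by dividing by the image-to-label-map ratio. A small sketch of that arithmetic, assuming a square image and a square label map (the helper `scale_crop_box` and the sizes are hypothetical; the repository's own transform may handle this differently, for example with fractional coordinates):

```python
def scale_crop_box(top, left, height, width, image_size, label_size):
    """Map a crop box from image pixel coordinates to label-map coordinates."""
    ratio = image_size / label_size
    return (top / ratio, left / ratio, height / ratio, width / ratio)


# A 224x224 image with a 14x14 label map gives a ratio of 16.
box = scale_crop_box(32, 64, 160, 96, image_size=224, label_size=14)
print(box)  # (2.0, 4.0, 10.0, 6.0)
```

In general the scaled values are not integers, so a crop function that expects integer coordinates would need rounding, while an interpolating approach can use the fractional box directly.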
Hi, thanks for your great work. When I run the training script, the following error occurs: AttributeError: 'tuple' object has no attribute 'log_softmax'
with amp_autocast():
    output = model(input)
    loss = loss_fn(output, target)  # error occurs here
The loss function is train_loss_fn = LabelSmoothingCrossEntropy(smoothing=0.0).cuda()
By the way, could you please tell me why we need to specify smoothing=0.0?
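The error message indicates that `output` is a tuple, while the loss calls `log_softmax` on what it expects to be a single tensor. If the model returns several outputs when token labeling is enabled, one workaround is to pass only the classification logits to a plain loss. A sketch, assuming the first tuple element is the class-token prediction (an assumption to verify against the model's `forward`); `first_output` is a hypothetical helper:

```python
def first_output(output):
    """Return the first element if the model returned a tuple, else the output."""
    if isinstance(output, tuple):
        return output[0]
    return output


# Stand-in objects; in real code these would be tensors.
fake_logits = [0.1, 0.9]
print(first_output((fake_logits, "aux tokens", "bbox")))  # [0.1, 0.9]
print(first_output(fake_logits))                          # [0.1, 0.9]
```

With such a helper the call would read `loss = loss_fn(first_output(output), target)`. Using a loss that accepts the full tuple is the alternative when the token-level predictions should contribute to training.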
Hello! I use ILSVRC2012_img_train and ILSVRC2012_img_val, and use the provided label_top5_train_nfnet from Google Drive. I train lv-vit-s with batch_size 64 without apex for one epoch. Thanks for your advice.