ze-yang / context-transformer
Context-Transformer: Tackling Object Confusion for Few-Shot Detection, AAAI 2020
Home Page: https://arxiv.org/abs/2003.07304
License: MIT License
Can you provide a demo to test a single image?
Your work is excellent, and by following your readme.md I can easily reproduce your results in the VOC incremental experiment setting. However, due to my limited CUDA memory (I have only two 2080 Ti GPUs), my result with batch size 32 is 5% lower than the one in your paper. Can you give me some suggestions for improving the result? Thank you very much!
Dear @Ze-Yang,
Thank you for your great work!
I attempted to re-train your model on a single GPU. Your default setup is too large to run on it because I have only one 12 GB GPU. Therefore, I reduced the batch size and learning rate by a factor of 4 and increased the max iteration and step size by a factor of 4. Here are my results when I test my model with 1 shot:
The results seem a few percentage points lower than yours. Could you tell me whether this is acceptable? If not, could you tell me how I can reproduce your results with only one GPU?
Hope to hear from you soon!
Sincerely,
Duynn
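The adjustment described in this question (batch size and learning rate divided by 4, max iteration and step size multiplied by 4) can be written down as a small helper. This is only a sketch of that arithmetic: the function name `scale_schedule` and the numbers passed to it are placeholders chosen for illustration, not the repository's defaults, and scaling the schedule this way does not guarantee the same accuracy as the original multi-GPU setup.

```python
def scale_schedule(batch_size, lr, max_iter, step_sizes, factor):
    """Shrink batch size and learning rate by `factor`, stretch the schedule by `factor`."""
    if batch_size % factor != 0:
        raise ValueError("batch_size must be divisible by factor")
    return {
        "batch_size": batch_size // factor,
        "lr": lr / factor,
        "max_iter": max_iter * factor,
        "step_sizes": [s * factor for s in step_sizes],
    }


# Placeholder values for illustration only; they are NOT the repository's defaults.
scaled = scale_schedule(batch_size=64, lr=4e-3, max_iter=1000,
                        step_sizes=[600, 800], factor=4)
print(scaled)
```

Keeping all four values tied to one `factor` makes it harder to change the batch size while forgetting to stretch the learning-rate steps.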
What does "src_cls_dim" mean? When I train on my custom dataset, I need to modify it.
Hi @Ze-Yang,
I successfully reproduced your results by following your readme, and I now want to train on my customized dataset, which has a format similar to PASCAL. The dataset has 150 base classes and 50 novel classes. Training on the base classes works fine. However, when transferring to the novel classes, the loss is just nan!
I also changed the classes in voc0712.py and some lines in RFB_NET_vgg.py to suit the 150 base classes and 50 novel classes:
So could you help me figure out why this happens, or whether there are lines in other files that I have to change?
Your work is excellent, but I'm confused by the contents of "trainval_1shot.txt" when reproducing "Phase 2, Transfer Setting, To finetune on VOC dataset (1 shot)".
Your paper says that "The few-shot training set consists of N images (per category)", so in the above setting, if I understand correctly, "trainval_1shot.txt" should cover 20 categories of boxes in total, with 1 image per category.
However, as shown below, unless I miscounted, there are only 11 categories of boxes in total, and some categories have more than 1 image, which is inconsistent with the above and confuses me a lot.
I’d be grateful if you could explain the process for creating the "trainval_1shot.txt, trainval_2shot.txt, trainval_3shot.txt, ..." files. Thanks a lot!
trainval_1shot.txt

Image index | Box category and # of boxes
007654 | aero 1
003137 | bus 1
008442 | cat 1
003452 | dinnertable 1, chair 6
004141 | cow 2, person 1
000249 | chair 7, dinnertable 1
005018 | dog 2
006177 | motorbike 1, person 1, cow 2
004424 | person 5
006351 | person 1, pottedplant 2
006803 | train 1, person 2

Total:

Box category | # of images
aero | 1
bus | 1
cat | 1
dinnertable | 2
chair | 2
cow | 2
person | 5
dog | 1
motorbike | 1
pottedplant | 1
train | 1
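The per-category tally can be recomputed directly from the listing above. The following is a minimal sketch that assumes each line holds an image index followed by alternating category names and box counts, exactly as transcribed in this post; it does not read the repository's actual "trainval_1shot.txt", whose on-disk format is not shown here.

```python
from collections import Counter

# The image listing from this post: index, then (category, box count) pairs.
LISTING = """\
007654 aero 1
003137 bus 1
008442 cat 1
003452 dinnertable 1 chair 6
004141 cow 2 person 1
000249 chair 7 dinnertable 1
005018 dog 2
006177 motorbike 1 person 1 cow 2
004424 person 5
006351 person 1 pottedplant 2
006803 train 1 person 2
"""


def images_per_category(listing):
    """Count, for each category, how many images contain at least one box of it."""
    counts = Counter()
    for line in listing.strip().splitlines():
        fields = line.split()
        # fields[0] is the image index; odd positions hold the category names.
        counts.update(set(fields[1::2]))
    return counts


counts = images_per_category(LISTING)
print(len(counts), "categories")
for category, n_images in sorted(counts.items()):
    print(category, n_images)
```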
Hello
Thanks for contributing this paper/project.
I have a question.
When I run VOC2007.sh, is the Main2007.tar link the same as Main2007.zip?
Thanks.
Hello
Thanks for contributing this paper/project.
I have a question.
This paper/project compares its experimental results with papers/projects from 2017 and 2019.
We would like to know how it compares with some SOTA methods developed this year.
Here are some papers published this year:
https://arxiv.org/pdf/2003.04668.pdf
https://arxiv.org/pdf/2003.06957.pdf
Thank you and Regards
Can you provide the Python file for the VOC data split?
Thanks.
Hello!
I want to reproduce the object confusion experiment shown in Figure 5 of your paper, but couldn't find an appropriate way. I tried to use the numbers of True Positives (TP) and False Positives (FP) to represent (classification √, localization √) and (classification √, localization ×) respectively, but I'm not sure whether that makes sense, and my results are far from yours. Could you please briefly share the experiment details for Figure 5, such as which metrics you used? Thanks!
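One way to make the "classification √/×, localization √/×" split concrete is to test each detection against a ground-truth box on two independent criteria: label match and IoU above a threshold. The sketch below is only an illustration of that idea. The 0.5 IoU threshold, the pairing of one detection with one ground-truth box, and the function names are all assumptions; it is not the metric used for Figure 5, which this thread does not describe.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confusion_bucket(det_box, det_label, gt_box, gt_label, iou_thresh=0.5):
    """Return (classification_ok, localization_ok) for one detection/ground-truth pair.

    iou_thresh=0.5 is an assumed value, not one taken from the paper.
    """
    cls_ok = det_label == gt_label
    loc_ok = iou(det_box, gt_box) >= iou_thresh
    return cls_ok, loc_ok


# A well-localized box with the wrong label: classification wrong, localization ok.
print(confusion_bucket((0, 0, 10, 10), "cat", (0, 0, 10, 10), "dog"))
```

Note that counting only TP and FP leaves out the (classification ×, localization √) case, which this two-flag bucketing keeps separate.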
I ran python data/split_coco_dataset_voc_nonvoc.py and got the following error. How can I solve it?
Split 1: 346721 anns; save to ./data/COCO/annotations/split_voc_instances_train2014.json
Split 2: 258186 anns; save to ./data/COCO/annotations/split_nonvoc_instances_train2014.json
processing dataset ./data/COCO/annotations/instances_valminusminival2014.json
Traceback (most recent call last):
  File "data/split_coco_dataset_voc_nonvoc.py", line 87, in <module>
    split_dataset(dataset_prefix + s, voc_inds, split1_prefix + s, split2_prefix + s)
  File "data/split_coco_dataset_voc_nonvoc.py", line 16, in split_dataset
    with open(dataset_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: './data/COCO/annotations/instances_valminusminival2014.json'
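The traceback shows the script failing only when it reaches an annotation file that is not on disk. A small pre-flight check can report every missing file before the split starts. This is a sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the directory and file name are copied from the error message above.

```python
from pathlib import Path


def missing_files(directory, names):
    """Return the names that do not exist as regular files inside `directory`."""
    directory = Path(directory)
    return [name for name in names if not (directory / name).is_file()]


# Directory and file name taken from the error message; extend the list as needed.
required = ["instances_valminusminival2014.json"]
print(missing_files("./data/COCO/annotations", required))
```

An empty list means every file in `required` is present; any name printed still has to be obtained and placed in the annotations directory.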
Your code used to run properly, but after the GPU driver on our lab's server was reinstalled, I got the error "ImportError: cannot import name '_mask' from 'utils.pycocotools'". I tried re-uploading the 'pycocotools' folder, but it didn't work.
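One common cause of this kind of ImportError is that a compiled extension module is absent, or was built for a different Python interpreter than the one now in use. The sketch below checks whether a directory contains a `_mask` extension file with a suffix the current interpreter can load. It assumes `_mask` is a compiled extension living in `utils/pycocotools`, which is inferred from the error message rather than confirmed by the repository; the helper name is hypothetical.

```python
import importlib.machinery
from pathlib import Path


def loadable_extensions(package_dir, module_name="_mask"):
    """List files in `package_dir` that the current interpreter could load as
    the compiled extension `module_name`."""
    package_dir = Path(package_dir)
    candidates = [package_dir / (module_name + suffix)
                  for suffix in importlib.machinery.EXTENSION_SUFFIXES]
    return [path for path in candidates if path.is_file()]


# Assumed location, based on the module path in the error message.
print(loadable_extensions("utils/pycocotools"))
```

An empty result suggests the extension needs to be rebuilt in the current environment; copying only the source folder would not recreate it.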
Hi, thanks for your work.
In 'split_coco_dataset_voc_nonvoc.py', you split 3 annotation files, but I can't find 'instances_valminusminival2014.json' and 'instances_minival2014.json' on the official COCO website. Could you tell me where I can get these files?
Hello, thanks for your nice work. I want to train a model with input size 512, but I got the error "list index out of range" in the script "RFB_Net_vgg.py". Could you help me? Thank you!