Comments (30)
Which base training checkpoint (model_final.pth) are you using? Did you train it yourself or download it from FsDet?
from fsce.
Download it from FsDet: http://dl.yf.io/fs-det/models/voc/split1/base_model/model_final.pth
from fsce.
Because of the very few instances and the early-stop strategy used to prevent overfitting, unstable results are normal.
from fsce.
Sir, thanks for your reply. I will try this config and will inform you when I get the result.
from fsce.
Hi guys, have you been able to reproduce the nAP for 10-shot PASCAL VOC split1?
from fsce.
@bsun0802 I have not tried 10-shot, but I still can't reproduce the mAP for 1-shot and 3-shot of FSCE; I can reproduce the mAP of the improved TFA.
from fsce.
Those shots are unstable and can have large variance across runs.
Please check whether 5-shot and 10-shot can be reproduced; another thread found they could not. If so, we need to look into what's wrong.
Thanks.
from fsce.
Sure, I will try it now
from fsce.
What if I only use 1 GPU? Will this affect the result?
In addition, the config file shows that the backbone seems to be trained.
However, during training, it shows
from fsce.
@bsun0802 , this is the result of split1 10-shot of FSCE, still lower than the paper:
[03/29 16:10:55 fsdet.evaluation.pascal_voc_evaluation]: Evaluating voc_2007_test_all1 using 2007 metric. Note that results do not use the official Matlab API.
[03/29 16:11:13 fsdet.evaluation.pascal_voc_evaluation]: Evaluate per-class mAP50:
| aeroplane | bicycle | boat | bottle | car | cat | chair | diningtable | dog | horse | person | pottedplant | sheep | train | tvmonitor | bird | bus | cow | motorbike | sofa |
|:-----------:|:---------:|:------:|:--------:|:------:|:------:|:-------:|:-------------:|:------:|:-------:|:--------:|:-------------:|:-------:|:-------:|:-----------:|:------:|:------:|:------:|:-----------:|:------:|
| 85.873 | 85.576 | 66.554 | 67.776 | 87.936 | 88.366 | 63.925 | 64.852 | 85.325 | 85.267 | 78.948 | 49.353 | 76.907 | 85.293 | 77.127 | 41.074 | 75.418 | 68.892 | 68.620 | 54.369 |
[03/29 16:11:13 fsdet.evaluation.pascal_voc_evaluation]: Evaluate overall bbox:
| AP | AP50 | AP75 | bAP | bAP50 | bAP75 | nAP | nAP50 | nAP75 |
|:------:|:------:|:------:|:------:|:-------:|:-------:|:------:|:-------:|:-------:|
| 45.616 | 72.873 | 48.901 | 48.464 | 76.605 | 51.869 | 37.073 | 61.674 | 40.000 |
[03/29 16:11:13 fsdet.engine.defaults]: Evaluation results for voc_2007_test_all1 in csv format:
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: Task: bbox
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: AP,AP50,AP75,bAP,bAP50,bAP75,nAP,nAP50,nAP75
[03/29 16:11:13 fsdet.evaluation.testing]: copypaste: 45.6161,72.8726,48.9014,48.4638,76.6053,51.8685,37.0729,61.6745,40.0001
[03/29 16:11:13 fsdet.utils.events]: eta: 0:00:00 iter: 14999 total_loss: 0.4749 loss_cls: 0.04555 loss_box_reg: 0.04276 loss_contrast: 0.3756 loss_rpn_cls: 0.002355 loss_rpn_loc: 0.004225 time: 0.4634 data_time: 0.0396 lr: 0.00025 max_mem: 2058M
[03/29 16:11:14 fsdet.engine.hooks]: Overall training speed: 14996 iterations in 1:55:51 (0.4635 s / it)
[03/29 16:11:14 fsdet.engine.hooks]: Total training time: 3:16:01 (1:20:10 on hooks)
from fsce.
> What if I only use 1 GPU? Will this affect the result?
> In addition, the config file shows that the backbone seems to be trained.
> However, during training, it shows
== 1 ==
I don't think 1 GPU can reproduce the same results. All experiments are performed on 8 GPUs.
== 2 ==
ResNet layers are frozen; FPN lateral and top-down convs are fine-tuned.
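For readers who want to verify this in code, here is a minimal sketch (assuming PyTorch; the `backbone.bottom_up` default prefix is an assumption about the model's parameter naming, not taken from this repo) of freezing the ResNet while leaving the rest trainable:

```python
import torch.nn as nn

def freeze_backbone(model: nn.Module, frozen_prefix: str = "backbone.bottom_up"):
    """Freeze all parameters whose name starts with `frozen_prefix`
    and return the names of the parameters that remain trainable."""
    for name, p in model.named_parameters():
        if name.startswith(frozen_prefix):
            p.requires_grad = False
    return [n for n, p in model.named_parameters() if p.requires_grad]
```

Printing the returned list once at the start of fine-tuning is an easy way to confirm which layers are actually being updated.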
from fsce.
@yhcao6 This is the final checkpoint; did you check the best checkpoint?
from fsce.
@bsun0802 I checked the best nAP50 is 62.346
from fsce.
@yhcao6 Seems odd. I would say above 62.6 should be easy to reach.
We won't have time to inspect it until this weekend.
from fsce.
Thanks for taking the time to check it.
from fsce.
> @yhcao6 Seems odd. I would say above 62.6 should be easy to reach.
This is my rerun today; it does reach 62.5+ without any change.
Since the few-shot task is not stable and the result reported in the paper is the best over multiple runs, I think a slight difference is normal.
from fsce.
> This is my rerun today; it does reach 62.5+ without any change. Since the few-shot task is not stable and the result reported in the paper is the best over multiple runs, I think a slight difference is normal.
Is this the result on seed 0?
Thanks for your reply.
from fsce.
@Chauncy-Cai Thanks for your reply. One possible reason may come from the randomness of the surgery. If convenient, would you mind uploading your model_reset_surgery.pth?
from fsce.
Yes, just as in TFA, seed 0 is actually manually sampled, so it always gives the best result.
What does seed0 mean? In http://dl.yf.io/fs-det/datasets/vocsplit/, there is no seed0 folder.
from fsce.
In Table 1, the 10-shot performance is 61.4, while in Table 2 the result is 63.4 and the average over 10 random seeds is 59.7. These results are confusing.
from fsce.
I used your base model and trained it with 'Stage 2: Fine-tune for novel data' on 4 GPUs, but the results are much lower than reported. I used the txt files of the seed1 folder.
from fsce.
> What does seed0 mean? In http://dl.yf.io/fs-det/datasets/vocsplit/, there is no seed0 folder.
OK, I should describe it more accurately: "http://dl.yf.io/fs-det/datasets/vocsplit/*.txt" instead of "seed0".
> In Table 1, the 10-shot performance is 61.4, while in Table 2 the result is 63.4 and the average over 10 random seeds is 59.7. These results are confusing.
All experiments we have done, except the average performance over 10 random seeds in Table 2, are based on "http://dl.yf.io/fs-det/datasets/vocsplit/*.txt".
> I used your base model and trained it with 'Stage 2: Fine-tune for novel data' on 4 GPUs, but the results are much lower than reported. I used the txt files of the seed1 folder.
First, we got our results with 8 GPUs for training/fine-tuning, so we don't know the performance with 4 GPUs. Moreover, nAP depends greatly on the fine-tuning data you choose.
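Since the thread keeps comparing single runs against the 10-seed average, a small sketch of how such an average is computed (the example scores are placeholders, not results from the paper):

```python
import statistics

def summarize_nap50(per_seed_scores):
    """Mean and sample standard deviation of nAP50 over multiple seed runs."""
    return statistics.mean(per_seed_scores), statistics.stdev(per_seed_scores)

# Placeholder values; substitute the nAP50 from your own seed1..seed10 runs.
mean_nap50, sd_nap50 = summarize_nap50([61.2, 59.8, 58.4, 60.1, 62.0])
```

The standard deviation is worth reporting alongside the mean, since single-run gaps of a point or two are within the variance this thread describes.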
from fsce.
@Chauncy-Cai Can you tell me how to get 59.7 over 10 random seeds? Do I just use the source code to train 10 times?
Or edit this code?
meta_pascal_voc.py line 69: split_dir = os.path.join(split_dir, "seed{}".format(1))
from fsce.
Simply train the code with the data in the seed [1-10] files in "http://dl.yf.io/fs-det/datasets/vocsplit/".
You can change the train & test dataset in the yaml directly.
For instance, (coco_trainval_all_30shot) -> (coco_trainval_all_30shot_seed1) to use the seed1 file.
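As a sketch, that edit in the fine-tuning yaml might look like the following (the key and dataset name follow the pattern above but are assumptions; check the names actually registered in your copy of the code):

```yaml
DATASETS:
  # append _seedN to switch to another sampled few-shot split
  TRAIN: ("coco_trainval_all_30shot_seed1",)
```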
from fsce.
@Chauncy-Cai Thanks for your reply!
from fsce.
Recently, I used the original split-1 10-shot config and the base model downloaded from FsDet, training with 8 GPUs. But why can't I reproduce the result over 10 random seeds?
I only get a bAP50 of 71.6 and an nAP50 of 57.5. I use the final checkpoint, not the best checkpoint. Does the final model need to combine the base-model classifier and the fine-tuned classifier?
Could you provide a model you have trained that reaches the expected result?
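On combining the two classifiers: the surgery step that produces model_reset_surgery.pth does this before fine-tuning, but as a minimal sketch of the idea (assuming PyTorch; function and tensor names here are hypothetical, not the repo's API):

```python
import torch

def combine_classifiers(base_cls_w: torch.Tensor, novel_cls_w: torch.Tensor) -> torch.Tensor:
    """Stack base-class rows on top of novel-class rows into one classifier
    weight matrix (hypothetical shapes: [num_base, d] and [num_novel, d])."""
    return torch.cat([base_cls_w, novel_cls_w], dim=0)
```

If your evaluation uses a checkpoint that skipped this step, the base classes can look fine while novel-class AP collapses.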
from fsce.
How can I download the txt files all at once from http://dl.yf.io/fs-det/datasets/vocsplit/ ? Do I need to copy them manually?
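One way to fetch them all at once (assuming the server exposes a plain HTTP directory listing; verify the downloaded files afterwards) is wget's recursive mode:

```shell
# Recursively fetch only the .txt split files into ./vocsplit/
wget -r -np -nd -A '*.txt' -P vocsplit/ http://dl.yf.io/fs-det/datasets/vocsplit/
```

Here -np stops wget from ascending to parent directories, -nd flattens the mirrored tree, and -A restricts the download to the split files.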
from fsce.
> What if I only use 1 GPU? Will this affect the result?
> In addition, the config file shows that the backbone seems to be trained.
> However, during training, it shows
> == 1 == I don't think 1 GPU can reproduce the same results. All experiments are performed on 8 GPUs. == 2 == ResNet layers are frozen; FPN lateral and top-down convs are fine-tuned.
Why can't 1 GPU reproduce the same results? Can I get the same results with the same batch size and learning rate?
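Bit-identical numbers are unlikely on 1 GPU (gradient noise and batch statistics differ), but when the per-GPU batch size is kept fixed, the usual heuristic is the linear scaling rule, sketched below (the 0.02 base LR is an illustrative assumption, not this repo's config value):

```python
def scale_lr(base_lr: float, base_gpus: int, gpus: int) -> float:
    """Linear scaling rule: LR proportional to total batch size,
    assuming the per-GPU batch size stays the same."""
    return base_lr * gpus / base_gpus

# Illustrative: an 8-GPU run at LR 0.02 maps to LR 0.0025 on 1 GPU.
single_gpu_lr = scale_lr(0.02, 8, 1)
```

If you instead keep the total batch size by raising images-per-GPU, the LR stays unchanged; either way, expect run-to-run variance of the kind discussed above.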
from fsce.
> What if I only use 1 GPU? Will this affect the result?
> In addition, the config file shows that the backbone seems to be trained.
> However, during training, it shows
> == 1 == I don't think 1 GPU can reproduce the same results. All experiments are performed on 8 GPUs. == 2 == ResNet layers are frozen; FPN lateral and top-down convs are fine-tuned.
> Why can't 1 GPU reproduce the same results? Can I get the same results with the same batch size and learning rate?
May I ask whether you have successfully reproduced the results on one of your GPUs?
from fsce.