liheyoung / unimatch
414 stars · 3 watchers · 57 forks · 6.23 MB

[CVPR 2023] Revisiting Weak-to-Strong Consistency in Semi-Supervised Semantic Segmentation

Home Page: https://arxiv.org/abs/2208.09910

License: MIT License

Languages: Python 97.57% · Shell 2.43%
Topics: fixmatch · semi-supervised-learning · semi-supervised-segmentation

unimatch's People

Contributors: liheyoung

unimatch's Issues

Remote Sensing Interpretation

Hello, I have read your excellent semi-supervised semantic segmentation method closely, and I noticed the change detection content in the supplementary experiments. I'd like to ask how you applied this method to the change detection task. I made a simple modification to the source code to turn it into binary classification, but I don't know the details of the other parts. If the source code is available, could you upload the relevant code, or describe the implementation details of applying the method to change detection?

Running `sh train.sh 1 20024` reports the error below, can you help take a look?

It still doesn't work and errors out. Could you give me some pointers?
warnings.warn(
usage: launch.py [-h] [--nnodes NNODES] [--nproc_per_node NPROC_PER_NODE] [--rdzv_backend RDZV_BACKEND] [--rdzv_endpoint RDZV_ENDPOINT] [--rdzv_id RDZV_ID] [--rdzv_conf RDZV_CONF] [--standalone]
[--max_restarts MAX_RESTARTS] [--monitor_interval MONITOR_INTERVAL] [--start_method {spawn,fork,forkserver}] [--role ROLE] [-m] [--no_python] [--run_path] [--log_dir LOG_DIR]
[-r REDIRECTS] [-t TEE] [--node_rank NODE_RANK] [--master_addr MASTER_ADDR] [--master_port MASTER_PORT] [--use_env]
training_script ...
launch.py: error: the following arguments are required: training_script, training_script_args
train.sh: 28: unimatch.py: not found

Can a single GPU be used for training?

Hi, thanks for your great work!
Can a single GPU be used for training? How should I evaluate after training? Could you provide an eval.sh file?
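In the meantime, a minimal single-GPU mIoU evaluation sketch (not the repo's script; the model/loader interface and the ignore value 255 are assumptions):

```python
import torch

@torch.no_grad()
def evaluate_miou(model, loader, num_classes, device="cuda"):
    """Accumulate per-class intersection/union over a val loader, then average."""
    model.eval()
    inter = torch.zeros(num_classes, device=device)
    union = torch.zeros(num_classes, device=device)
    for img, mask in loader:
        img, mask = img.to(device), mask.to(device)
        pred = model(img).argmax(dim=1)
        valid = mask != 255                      # skip ignored pixels
        for c in range(num_classes):
            p, t = (pred == c) & valid, (mask == c) & valid
            inter[c] += (p & t).sum()
            union[c] += (p | t).sum()
    return (inter / union.clamp(min=1)).mean().item()
```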

Details of the FixMatch implementation

Hello, when you implemented FixMatch, does the noisy-label filtering discard the low-confidence pixels and keep only the high-confidence pixels for training? Or does it discard entire low-confidence images, so that only pixels from high-confidence images participate in training?
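For context, pixel-level filtering (the first variant described) is what most FixMatch-style segmentation code does; a minimal sketch, with illustrative names:

```python
import torch
import torch.nn.functional as F

def pixel_filtered_loss(pred_s, pred_w, conf_thresh=0.95, ignore_index=255):
    """Keep only high-confidence *pixels* of the weak prediction as pseudo-labels."""
    prob_w = pred_w.detach().softmax(dim=1)
    conf, pseudo = prob_w.max(dim=1)            # per-pixel confidence and label
    pseudo[conf < conf_thresh] = ignore_index   # drop low-confidence pixels, not images
    return F.cross_entropy(pred_s, pseudo, ignore_index=ignore_index)
```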

CutMix

Did you use CutMix to reproduce the results in the PASCAL tables?

Pretrained Backbone issue

Hello! Thanks for the good research!

I tried to use ResNet-50/101, but I can't download them from the link given under Pretrained Backbone.

Please check! 🙏 Thank you!

Reproduce results

Hi, thanks for your excellent work. I have some problems reproducing the results:

  • When I reproduced UniMatch on VOC (1464), I could not reach the reported best result of 81.2, and I also couldn't match the paper when using a 513x513 training size for 1/16 (662) and so on. Could this be due to the environment or other reasons?
  • By the way, I see the code has an OHEM loss, criterion_l = ProbOhemCrossEntropy2d(**cfg['criterion']['kwargs']).cuda(local_rank), but it isn't mentioned in the paper. Could you share the details of how the OHEM loss should be used? (See the generic sketch below.)
    Thank you very much!
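For reference, a generic OHEM cross-entropy sketch, not the repo's ProbOhemCrossEntropy2d (the threshold and min-kept values are illustrative):

```python
import torch
import torch.nn.functional as F

def ohem_ce_loss(logits, target, thresh=0.7, min_kept=100_000, ignore_index=255):
    """Online hard example mining: average CE only over the hardest pixels."""
    loss = F.cross_entropy(logits, target, ignore_index=ignore_index, reduction="none")
    loss = loss.flatten()
    loss = loss[target.flatten() != ignore_index]
    # keep pixels whose probability of the true class is below `thresh`
    # (CE > -log(thresh)), but never fewer than `min_kept` pixels
    hard = loss[loss > -torch.log(torch.tensor(thresh))]
    if hard.numel() < min_kept:
        hard, _ = loss.topk(min(min_kept, loss.numel()))
    return hard.mean()
```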

Compared to the previous version

I noticed that, compared to the previous version, the latest code removes 'multi_grid' from the backbone (ResNet) and changes the settings of parameters such as the dilations.

May I ask whether the adjusted version improves performance?

About feature perturbation

Hello, where exactly is the code for the feature perturbation? I couldn't find it. Could you point it out? Thanks!

Config file for high resolution training for Pascal VOC

Hi,

Thanks for sharing your code. I found that only the config file for Pascal at 321x321 is provided. Could you please share your training settings for high-resolution (513x513) training, or do they share the same settings? Thanks.

Question for CutMix

Hi, thanks for the great work!

I found that you use an additional dataloader as the source of CutMix patches.

Since most works mix using images from the same loaded batch, I was wondering what the intention of drawing mixing sources from another batch is, and whether the common setting, which uses a single dataloader and mixes within the same batch, would affect the training result. Thanks!
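For comparison, a minimal sketch of the common single-dataloader setting that mixes within the batch (box sampling and names are illustrative):

```python
import torch

def cutmix_in_batch(img, mask, ratio=0.5):
    """Paste a random box from a shuffled copy of the same batch into each sample."""
    b, _, h, w = img.shape
    perm = torch.randperm(b)                     # mixing partner within the batch
    bh, bw = int(h * ratio), int(w * ratio)
    y = torch.randint(0, h - bh + 1, (1,)).item()
    x = torch.randint(0, w - bw + 1, (1,)).item()
    img2, mask2 = img.clone(), mask.clone()
    img2[:, :, y:y+bh, x:x+bw] = img[perm, :, y:y+bh, x:x+bw]
    mask2[:, y:y+bh, x:x+bw] = mask[perm, y:y+bh, x:x+bw]
    return img2, mask2
```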

An observation of using 513x513 crop size under 92 split.

Great work! According to your paper, UniMatch performs well on most splits, especially the small ones (like 92 and 183). However, when I tried to reproduce it and explore a bit further, I found an interesting phenomenon.

In most semi-supervised semantic segmentation methods, a larger crop size (513 vs. 321) usually brings a performance improvement. On the 92 split, UniMatch achieves good performance (74.5-75.0) with crop size 321. With a crop size of 513x513, however, performance only reaches around 72.5-73.0, overfitting appears very early, and performance then degrades.

I noticed that the results reported in the paper are also for the 321 crop size. Have you run experiments with a 513 crop size on a small split (92 or 183)?

Attached is the mIoU curve during training on the 92 split at 513 crop size: the peak (72.8) is reached at about epoch 10, and after 80 epochs performance sits around 68. [mIoU curve]

About Pascal Voc 2012 results

Hi,
Great work. I have some confusion about the Supervised-baseline vs. UniMatch comparison on the Pascal VOC 2012 dataset.

  1. What is the Supervised baseline in your study?
  2. If we use the entire labeled Pascal VOC dataset, how can the training still be semi-supervised? I have attached the table you presented; please explain its last column. [table screenshot]
    Thanks

About cudnn acceleration

Hello, during local training and inference I found that whether cudnn acceleration is used (torch.backends.cudnn.enabled) makes a significant difference. I noticed that your ST++ code does not set it (keeping the default), while the UniMatch code sets it to True.
What I'd like to ask:
1. Should torch.backends.cudnn.enabled be set consistently between training and testing? (I noticed that if the settings differ, the results seem to come out wrong?)
2. What is the default value of torch.backends.cudnn.enabled? (I couldn't find the default described in the official documentation, so I'm asking you.)
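For what it's worth, in current PyTorch releases torch.backends.cudnn.enabled defaults to True when cuDNN is available, while cudnn.benchmark and cudnn.deterministic default to False (worth verifying on your own install); a quick check:

```python
import torch

print(torch.backends.cudnn.enabled)        # True by default when cuDNN is available
print(torch.backends.cudnn.benchmark)      # False by default: no conv-algorithm autotuning
print(torch.backends.cudnn.deterministic)  # False by default: faster, non-deterministic kernels

# For comparable numbers, keep the same settings at train and test time:
torch.backends.cudnn.enabled = True
torch.backends.cudnn.benchmark = True      # speeds up fixed-size inputs
```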

Try to apply teacher model?

Thank you so much for your wonderful work! Have you considered introducing a teacher model (EMA) to produce the output for the weak view? In semi-supervised semantic segmentation, a teacher model can produce more robust outputs.
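For reference, a minimal EMA-teacher update sketch (illustrative only; UniMatch itself supervises with the student's own weak-view predictions):

```python
import copy
import torch

def make_teacher(student):
    """Initialize the teacher as a frozen copy of the student."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

@torch.no_grad()
def update_ema(teacher, student, momentum=0.999):
    """Exponential moving average of student weights into the teacher, once per step."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1 - momentum)
```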

Ask questions

1. The first question:

Example:
loss_u_s2 = criterion_u(pred_u_s2, mask_u_w_cutmixed2)
loss_u_s2 = loss_u_s2 * ((conf_u_w_cutmixed2 >= cfg['conf_thresh']) & (ignore_mask_cutmixed2 != 255))
loss_u_s2 = torch.sum(loss_u_s2) / torch.sum(ignore_mask_cutmixed2 != 255).item()

Taking the loss above as an example: after the unsupervised loss is computed, the per-pixel loss is filtered by two conditions (second line). Then, on the third line, there is a division. Can this be understood as a weight normalization? And why is the denominator the region where ignore_mask != 255? (See the sketch below.)

2. The second question:
Since this paper targets semantic segmentation, you use two conditions when assigning pseudo-labels to unlabeled data: argmax() and a confidence threshold above 0.95. My current research does not involve multiple classes; it is a binary classification task, which implicitly makes 0.5 the first condition. If I want to borrow your idea, do I still need the 0.95 threshold?

3. The third question (possibly related to the second):
If I don't use 0.95 as the threshold, can the weight of the unlabeled loss be fixed at 0.5 or 0.25? Do you think there are other good approaches?
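On question 1, my reading of the normalization as a small sketch with illustrative shapes: the denominator counts all non-ignored pixels, so low-confidence pixels contribute zero loss but still dilute the average; the confidence filter acts as a per-pixel weight rather than a change of the averaging region:

```python
import torch

# illustrative tensors: per-pixel CE loss, confidence, and ignore mask
loss = torch.rand(2, 321, 321)
conf = torch.rand(2, 321, 321)
ignore_mask = torch.randint(0, 256, (2, 321, 321))

weight = (conf >= 0.95) & (ignore_mask != 255)   # low-confidence pixels get weight 0
# the denominator counts all valid (non-ignored) pixels, so filtered pixels still
# dilute the average instead of shrinking the normalizer
loss_u = (loss * weight).sum() / (ignore_mask != 255).sum().clamp(min=1)
```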

Why not FlexMatch?

Your model is based on the FixMatch method. I see that FlexMatch is also mentioned in your paper, so why not use FlexMatch as the baseline?

a question about data split

Nice work!
Is the data split used for the Cityscapes dataset the same as in previous work (i.e., ST++)? I noticed there were many splits in the previous work. Is it the same as theirs? Thank you!

About training on a custom dataset

Hello, first of all thank you very much for your contribution!
My question: when trying to train on my own dataset (medical vessel segmentation, so I used more_scenarios/medical directly), I tried to modify the files under more_scenarios/medical/splits/acdc and found many files there: [screenshot of the split files]
I'd like to ask:

  1. Do the splits here refer to the partitioned training data? If so, can I create a single folder holding just my labeled and unlabeled training lists, e.g. more_scenarios/medical/splits/mydataset/1/labeled.txt and more_scenarios/medical/splits/mydataset/1/unlabeled.txt (see the sketch below)?
  2. For the other four files under more_scenarios/medical/splits/acdc, what data should I put in each?
  3. Besides the changes above, does anything else need to be modified to train on my own dataset?

Thanks again for your contribution, and please don't hesitate to advise!
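If it helps, a small sketch for generating such list files (the exact per-line format must match what the repo's medical dataset class parses; one case id per line is an assumption):

```python
import random
from pathlib import Path

def write_splits(case_ids, out_dir, labeled_ratio=0.1, seed=0):
    """Randomly split case ids into labeled.txt / unlabeled.txt list files."""
    random.Random(seed).shuffle(case_ids)
    n = max(1, int(len(case_ids) * labeled_ratio))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "labeled.txt").write_text("\n".join(case_ids[:n]) + "\n")
    (out / "unlabeled.txt").write_text("\n".join(case_ids[n:]) + "\n")
```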

Pre-trained model on the Cityscapes dataset

Could you provide pre-trained models on the Cityscapes dataset (under the different data partitions)?
I think it would help other researchers quickly reproduce this excellent work. Thank you!

About weak and strong image perturbations

Hello,
The DusPerb framework is described as using a shared weak view of an image to supervise the two strong views. But in the UniMatch code I see that each strong view has a corresponding weak pseudo-label generated from the forward pass of the weakly augmented image, so there seems to be a weak1 pseudo-label supervising the strong1 image and a weak2 pseudo-label for the strong2 image. The two strong views are generated by attaching different random patches, so each would need its own pseudo-label created in a weak manner. Could you help me with this?

Also, a second question: does the DusPerb framework by itself, without feature perturbation, outperform the FixMatch baseline?

Many thanks
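For discussion, here is how I read the shared-weak-view design, as a hedged sketch (names are illustrative, not the repo's code):

```python
import torch
import torch.nn.functional as F

def dual_stream_loss(model, x_w, x_s1, x_s2, conf_thresh=0.95):
    """One weak forward pass supplies the pseudo-label for both strong views."""
    with torch.no_grad():
        prob_w = model(x_w).softmax(dim=1)       # single shared weak view
        conf, pseudo = prob_w.max(dim=1)
    mask = (conf >= conf_thresh).float()
    loss = 0.0
    for x_s in (x_s1, x_s2):                     # same pseudo-label for both streams
        ce = F.cross_entropy(model(x_s), pseudo, reduction="none")
        loss = loss + (ce * mask).sum() / mask.sum().clamp(min=1)
    return loss / 2
```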

A question about the medical scenario

Hello, I replaced the backbone in the medical scenario from UNet with DeepLabv3+. Since medical images are single-channel, pretrained weights cannot be used and I have to train from scratch, but I ran into the following problem. I've excerpted one epoch of the log for clarity: in the evaluation stage, every Dice score is 0. Is this normal, or has something gone wrong? I'd appreciate your advice.

[2023-04-10 16:48:11,793][ INFO] ===========> Epoch: 3, LR: 0.00095, Previous best: 0.00
[2023-04-10 16:48:11,981][ INFO] Iters: 0, Total loss: 0.349
[2023-04-10 16:49:00,196][ INFO] Iters: 379, Total loss: 0.165
[2023-04-10 16:49:48,422][ INFO] Iters: 758, Total loss: 0.163
[2023-04-10 16:50:37,289][ INFO] Iters: 1137, Total loss: 0.161
[2023-04-10 16:51:25,787][ INFO] Iters: 1516, Total loss: 0.160
[2023-04-10 16:52:14,801][ INFO] Iters: 1895, Total loss: 0.159
[2023-04-10 16:53:03,536][ INFO] Iters: 2274, Total loss: 0.159
[2023-04-10 16:53:52,277][ INFO] Iters: 2653, Total loss: 0.157
[2023-04-10 16:54:40,751][ INFO] Iters: 3032, Total loss: 0.156
[2023-04-10 16:54:57,018][ INFO] ***** Evaluation ***** >>>> Class [0 Right Ventricle] Dice: 0.00
[2023-04-10 16:54:57,019][ INFO] ***** Evaluation ***** >>>> Class [1 Myocardium] Dice: 0.00
[2023-04-10 16:54:57,019][ INFO] ***** Evaluation ***** >>>> Class [2 Left Ventricle] Dice: 0.00
[2023-04-10 16:54:57,019][ INFO] ***** Evaluation ***** >>>> MeanDice: 0.00

Segmentation model and backbone

Is there any difference between the DeepLabv3+ & ResNet code here and the code in your previous work (ST++)?
I noticed that there seems to be a difference in the parameter settings.

Training finishes but the process does not exit automatically

Probably not a big deal, just raising it briefly.
[screenshot]
By the time I saw it, almost two hours had passed, but the process had not exited. Do you happen to know the cause?

Questions about weak augmentation of the raw image

Hello, your work has inspired me a lot! But I have a question about the weak augmentation of the raw image. The paper mentions applying weak perturbations such as crop and flip to the input images at the image level. However, I couldn't find the specific code for these perturbations. Could you please let me know which perturbations were applied and point to the relevant code if possible? Thank you very much!
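Weak augmentation in this line of work typically means random rescale, crop, and horizontal flip applied jointly to image and mask; an illustrative PIL-based sketch (the parameters are assumptions, not the repo's exact values):

```python
import random
from PIL import Image

def weak_augment(img: Image.Image, mask: Image.Image, crop=321, scale=(0.5, 2.0)):
    """Random rescale + crop + horizontal flip, applied jointly to image and mask."""
    s = random.uniform(*scale)
    w, h = int(img.width * s), int(img.height * s)
    img = img.resize((w, h), Image.BILINEAR)
    mask = mask.resize((w, h), Image.NEAREST)    # nearest keeps label ids intact
    x = random.randint(0, max(0, w - crop))
    y = random.randint(0, max(0, h - crop))
    # PIL pads with zeros if the box extends past the border of a small image
    img = img.crop((x, y, x + crop, y + crop))
    mask = mask.crop((x, y, x + crop, y + crop))
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
        mask = mask.transpose(Image.FLIP_LEFT_RIGHT)
    return img, mask
```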

About the "ignore_mask" in the training code

Amazing work! Simple and effective!

I noticed that building the "ignore_mask" requires the "mask" (dataset/semi.py): [code screenshot]

Then I noticed that "ignore_mask" and "ignore_mask_mix" are involved in building the loss, so is some label (GT mask) information used in computing the loss on unlabeled data? (unimatch.py, also in fixmatch.py) [code screenshots]

Thank you!
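My understanding, as a hedged sketch: for unlabeled images only the ignore regions (value 255, e.g. padded borders) of the stored mask are consumed, never the class ids, so no ground-truth semantics enter the unlabeled loss (illustrative code, not the repo's):

```python
import torch

def build_ignore_mask(gt_mask: torch.Tensor) -> torch.Tensor:
    """Only the 255 (ignore) region of the stored mask is used for unlabeled data;
    the actual class ids are never read by the unlabeled loss."""
    ignore_mask = torch.zeros_like(gt_mask)
    ignore_mask[gt_mask == 255] = 255   # mark padded / void pixels only
    return ignore_mask
```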

About the dataset

Hello, I'd like to ask: the segmentclass.zip file differs somewhat from the official dataset. What is the difference?

Meaning of the numbers in the log file names under training-logs

Hello, what do the numbers at the start of the log file names under training-logs mean? For example, the 1464, 183, and 366 in 1464-run1.log, 183-run1.log, and 366-run1.log under training-logs/Pascal-VOC-2012/High-Quality-Split-Size321/ResNet-101/.

Pretrained backbone

Hi,

Great work! I was wondering where your pretrained backbones come from?

Renaud

Weight File for PASCALVOC

I can find the Cityscapes and COCO weights, but couldn't find any PASCAL VOC weight file. I reproduced your code without modifications and checked the config file (below), yet there was a gap of 2-3%: I got 76.91 on the 732 split, whereas you report 79.9. Could you upload a weight file for PASCAL VOC? Thank you, and I really appreciate your work!

[2023-04-10 19:41:31,486][ INFO] {'backbone': 'resnet101',
'batch_size': 2,
'conf_thresh': 0.95,
'config': 'configs/pascal.yaml',
'criterion': {'kwargs': {'ignore_index': 255}, 'name': 'CELoss'},
'crop_size': 321,
'data_root': '/data2/ksy/PASCALVOC2012/',
'dataset': 'pascal',
'dilations': [6, 12, 18],
'epochs': 80,
'labeled_id_path': 'splits/pascal/732/labeled.txt',
'local_rank': 0,
'lr': 0.001,
'lr_multi': 10.0,
'model': 'deeplabv3plus',
'nclass': 21,
'ngpus': 2,
'port': 20024,
'replace_stride_with_dilation': [False, False, True],
'save_path': 'exp/pascal/unimatch/r101/732',
'unlabeled_id_path': 'splits/pascal/732/unlabeled.txt'}

How should I understand that image-level and feature-level perturbations are separated into independent streams?

  1. Isn't the auxiliary perturbation stream in the UniMatch architecture also a mix of image-level and feature-level perturbation? The raw image xu undergoes weak image-level perturbation to become xw, which is fed into encoder g, and dropout is then applied to the resulting feature ew. Isn't that a mix of image-level and feature-level perturbation?
  2. At the bottom-left of page 7 the paper says "we inject the dropout on the features of strongly perturbed images". Does this mean the features of xs1 and xs2 output by encoder g also go through dropout before being fed into decoder h? (Sketched below.)
    Those are my questions; I look forward to your clarification. If I have misunderstood the paper, please point it out and correct me. Thanks!
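On question 2, a minimal sketch of the feature-perturbation stream as I read the paper (Dropout between encoder and decoder; module names are illustrative):

```python
import torch.nn as nn

class FeaturePerturbedSeg(nn.Module):
    """Weak view -> encoder -> channel dropout -> decoder (the feature-perturbation stream)."""
    def __init__(self, encoder: nn.Module, decoder: nn.Module, p=0.5):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.drop = nn.Dropout2d(p)   # zeroes whole channels of the feature map

    def forward(self, x_w):
        e_w = self.encoder(x_w)       # features of the weakly perturbed image
        return self.decoder(self.drop(e_w))
```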

About COCO Suponly

Hello, thanks for your inspiring work!
I tried to reproduce the COCO (SupOnly) experiments with your provided supervised.py, but the results are far from those you reported. Would you please provide your logs and checkpoints for COCO (SupOnly)? [results screenshot]

Hello, how can I run single-GPU prediction with a trained model?

Hello. I want to use a trained model to produce prediction results. Following the distributed model construction used in the training function, multi-GPU prediction works fine. But when building the model and predicting without the distributed setup, I ran into the following problem:

[error screenshot]

The model-building code is as follows:

[code screenshot]
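One common cause (a guess, since the screenshots are not readable here) is the 'module.' prefix that DistributedDataParallel adds to state_dict keys; a hedged single-GPU loading sketch (the nested "model" key is an assumption about the checkpoint layout):

```python
import torch

def load_for_single_gpu(model, ckpt_path):
    """Strip the 'module.' prefix DDP adds so the checkpoint loads on a plain model."""
    state = torch.load(ckpt_path, map_location="cuda")
    if "model" in state:              # some checkpoints nest the weights under a key
        state = state["model"]
    state = {k.removeprefix("module."): v for k, v in state.items()}
    model.load_state_dict(state)
    return model.cuda().eval()
```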

About the reproduction of results on Pascal dataset

Hello,
Would it be easy to reproduce the results for the Pascal dataset on a single GPU? What learning rate is required for single-GPU training to reproduce the FixMatch / UniMatch results?

Thanks
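A common heuristic, not something the authors state: scale the learning rate linearly with the total batch size. A tiny helper, using the lr=0.001, ngpus=2 values from the config dump above as the reference point:

```python
def scaled_lr(base_lr=0.001, base_ngpus=2, ngpus=1):
    """Linear scaling rule: if the total batch shrinks by k, shrink the LR by k
    (assumes the per-GPU batch size is kept fixed)."""
    return base_lr * ngpus / base_ngpus

print(scaled_lr())  # 0.0005 for single-GPU training under this assumption
```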

DDP bug: the code reports the following error during training

  File "/public/home/wdc/project/UniMatch/more-scenarios/remote-sensing/unimatch.py", line 39, in main
    rank, world_size = setup_distributed(port=args.port)
  File "/public/home/wdc/project/UniMatch/more-scenarios/remote-sensing/util/dist_helper.py", line 36, in setup_distributed
    dist.init_process_group(
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 627, in init_process_group
    _store_based_barrier(rank, store, timeout)
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 255, in _store_based_barrier
    raise RuntimeError(
RuntimeError: Timed out initializing process group in store based barrier on rank: 0, for key: store_based_barrier_key:1 (world_size=8, worker_count=1, timeout=0:30:00)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 139953) of binary: /public/home/wdc/anaconda3/envs/unimatch/bin/python
Traceback (most recent call last):
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/public/home/wdc/anaconda3/envs/unimatch/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

unimatch.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
  time      : 2023-06-19_18:05:03
  host      : gpu1
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 139953)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
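For what it's worth, world_size=8, worker_count=1 in the message suggests the process group expected 8 ranks but only one worker joined, i.e. the launch command and the configured world size disagree; a hedged sketch of a matching initialization (illustrative, not the repo's dist_helper):

```python
import os
import torch.distributed as dist

def init_distributed(backend="nccl"):
    """Initialize one rank; WORLD_SIZE must equal the number of launched processes."""
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    dist.init_process_group(backend, rank=rank, world_size=world_size)
    # If world_size is effectively 8 but the launcher spawns fewer processes
    # (e.g. --nproc_per_node=1 on one node), ranks 1..7 never arrive and the
    # store-based barrier times out exactly as in the traceback above.
    return rank, world_size
```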

Unlabeled Images Mask in Split

Hi, thank you for the work!

Not sure if I'm missing something, but why are the masks for the unlabeled images included in the split folder? I see that in semi.py the masks for the unlabeled images are processed as well, but according to Algorithm 1 in the paper the predicted mask comes from the perturbed images.

Is it possible to run this model on a dataset for which we don't have masks for the unlabeled images?

A small question about gradient backpropagation

I see that your feature perturbation is implemented by applying dropout to the original features and concatenating them with the originals as input to the decoder; the loss computed this way therefore updates both the encoder and the decoder.

Have you tried detaching the dropped features and feeding them directly to the decoder? In that case the dropped-feature stream would train only the decoder, i.e., gradients would flow through the decoder only.
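For concreteness, a sketch of the detached variant the question proposes (illustrative; as described above, the released code does not detach, so the encoder is updated as well):

```python
import torch.nn.functional as F

def forward_fp_detached(encoder, decoder, x_w, p=0.5):
    """Feature-perturbation stream where gradients update only the decoder."""
    e_w = encoder(x_w).detach()                   # block gradients into the encoder
    e_fp = F.dropout2d(e_w, p=p, training=True)   # channel dropout as the perturbation
    return decoder(e_fp)
```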

How much memory do you use during training?

Hello, I appreciate your generosity in sharing your code. I attempted to replicate your work on two RTX 4090s with 24 GB of memory. However, during training on the Cityscapes dataset the GPU memory allocation proved insufficient, forcing me to reduce the batch size to 1 and the backbone to ResNet-50. Despite these adjustments, the program still consumes around 23 GB of memory. How much memory is typically required during your own training?
