l2g's People

Contributors

pengtaojiang

l2g's Issues

ResNet model

Hello, could you provide the file 'ilsvrc-cls_rna-a1_cls1000_ep-0001.params'? The Google Drive link has expired. Thank you very much.

Question about saliency maps

Hello, if I want to train on other datasets, do I need to regenerate the corresponding saliency maps? Is there training code for that?

Problem downloading the dataset

Hello, the PASCAL VOC 2012 download link you provide seems to yield a 0 KB file. I am not sure whether the upload went wrong; I would appreciate your help.

AttributeError: 'Net' object has no attribute 'module'

This error occurred when running test_l2g_voc.py:
validating ... models.resnet38
Weights cannot be loaded:
[]
0it [00:05, ?it/s]
Traceback (most recent call last):
File "/home/work/Desktop/Projects/xy/L2G/scripts/test_l2g_voc.py", line 123, in
validate(args)
File "/home/work/Desktop/Projects/xy/L2G/scripts/test_l2g_voc.py", line 76, in validate
cam_map = model.module.get_heatmaps()
File "/home/work/anaconda3/envs/l2g/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1178, in getattr
type(self).name, name))
AttributeError: 'Net' object has no attribute 'module'

Process finished with exit code 1
What might be causing this?
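
For context, the .module attribute exists only after a model has been wrapped in torch.nn.DataParallel; if the checkpoint is loaded into a bare Net instance, the attribute is missing. A minimal defensive sketch (the helper name is hypothetical; get_heatmaps is the method called in the script above):

import torch.nn as nn

def get_heatmaps_safely(model):
    # Unwrap nn.DataParallel if present; a bare Net has no .module attribute.
    net = model.module if isinstance(model, nn.DataParallel) else model
    return net.get_heatmaps()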

Network architecture

Hello, a quick question: which network architectures do the local and global networks in the source code use, exactly? Is it ResNet-38 plus DeepLab-v2?

A question

How does the code pass several cropped images to the local model at a time? Does it stitch the crops together into one picture?
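
For reference, a common pattern (not necessarily what this repository does) is to stack the crops along the batch dimension rather than stitching them into a single image, so the local network sees each crop separately in one forward pass. A minimal sketch with hypothetical sizes:

import torch

# Six hypothetical 320x320 RGB crops taken from one image
crops = [torch.randn(3, 320, 320) for _ in range(6)]
batch = torch.stack(crops, dim=0)  # shape (6, 3, 320, 320)
# local_cams = local_net(batch)    # one forward pass over all crops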

denseCRF

Hello. Most papers currently apply denseCRF post-processing when comparing performance. How can denseCRF refinement be applied to the localization maps and segmentation results in this project's code?
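
For reference, a typical denseCRF refinement with the pydensecrf package looks like the sketch below; the kernel parameters are common illustrative defaults, not values taken from this repository:

import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(img, probs, n_iters=10):
    # img: (H, W, 3) uint8 RGB image; probs: (C, H, W) class probabilities
    c, h, w = probs.shape
    d = dcrf.DenseCRF2D(w, h, c)
    d.setUnaryEnergy(unary_from_softmax(probs))
    d.addPairwiseGaussian(sxy=3, compat=3)    # smoothness kernel
    d.addPairwiseBilateral(sxy=80, srgb=13, rgbim=np.ascontiguousarray(img),
                           compat=10)         # appearance kernel
    q = d.inference(n_iters)
    return np.argmax(np.array(q).reshape(c, h, w), axis=0)  # (H, W) labels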

Pretrained weights

May I ask what kind of weights 'ilsvrc-cls_rna-a1_cls1000_ep-0001.params' are, and on which dataset they were trained?

Question about parameter training

Hello, I would like to ask: the global and local networks use the same model in the code but are separate instances, and the local network's output goes through .clone().detach(). What does that line do? Do the two branches share parameters? If not, how are the local network's parameters trained; are they updated independently but simultaneously with the global network?
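
For context, .clone().detach() copies a tensor and cuts the copy out of the autograd graph, so a loss computed against the copy sends no gradient back into the network that produced it. A minimal illustration:

import torch

x = torch.randn(3, requires_grad=True)
y = (x * 2).clone().detach()  # same values, but detached from the graph
print(y.requires_grad)        # False: losses on y send no gradient to x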

About the number of epochs

Hello, I trained with your experimental settings, but the model had already converged by epoch 4 or 5. Why is the number of epochs set to 10? Thanks!

How do I use my own dataset?

Hello, if I want to train on my own dataset and generate pseudo labels, how should I do that? There does not seem to be any code here for training on one's own data.

Question about attention maps

Hello, while testing your code on the VOC2012 dataset, I found that the attention maps were missing at the pseudo-label generation stage, yet I could not find any code you provide that loads a checkpoint to generate them. At which stage of the code are these attention maps generated?

Discrepancy in Pseudo Label Accuracy & Repository Settings for COCO

Hello Pengtao,
I reproduced the weakly supervised semantic segmentation on COCO using this repo and followed all the default settings. However, I achieved a pseudo-label accuracy of ~40%. To my knowledge, the final performance rarely exceeds the pseudo-label accuracy by a large margin, so there seems to be a discrepancy compared to the paper's reported results on the COCO val set.

I was wondering if there are specific settings or adjustments required for COCO that aren't set by default in the repository?

Thank you,
TC

Training without saliency maps

Hello, if I want to reproduce the results without using saliency maps, is it enough to comment out the following code?
# Binarize the local pseudo label against the background threshold
feat_local_label_bool = (feat_local_label > args.bg_thr).type_as(feat_local_label)
# Resize the cropped saliency maps to the feature-map resolution
crop_sals = F.interpolate(crop_sals, (h, w)).type_as(feat_local_label)
# Gate the foreground channels with the saliency map
feat_local_label[:, :-1, :, :] = feat_local_label_bool[:, :-1, :, :] * crop_sals.repeat(1, args.num_classes, 1, 1)
# Gate the background channel with the saliency complement
feat_local_label[:, -1, :, :] = feat_local_label[:, -1, :, :] * ((1 - crop_sals).squeeze(1))

Performance on the PASCAL VOC 2012 test set

After downloading the test set from the official PASCAL website, there are no _cls classification label files for the images. When pseudo labels were generated for the test images in the experiments, had the dataset been given classification labels in advance?

Predictions on the test set come out empty

Hello, I used the trained check_final.pth to predict segmentation results on the test set, but the output directory contains no segmentation maps; I am not sure whether this is because the test set has no labels. This has been bothering me for quite a while. Could you help me figure it out?

cam_png

Is cam_png in the code the pseudo label? How is it related to the pseudo labels used for segmentation?

Data question

Do data/coco14/JPEGImages and SegmentationClass hold the training-set images or the validation-set images?

A problem in the code

Excuse me, what is the folder named SegmentationOnClassAug for? And what is used to calculate the mIoU of the 'Attentions' in the test stage? I assumed the 'Attentions' were evaluated against the 1464 segmentation-labeled images of VOC12, but the code does not seem to match my understanding.

PASCAL VOC test-result problem

Hello, I generated pseudo_seg_labels following the README and got an mIoU of 70.3. I then used the pseudo masks to supervise deeplabv2_resnet training (replacing every SegmentationClassAug directory in train_aug.txt with the pseudo_seg_labels directory). After training, I loaded the resulting checkpoint_final.pth for testing, but the result on val is very poor, only about 22% mIoU. My pretrained weights were converted following the DeepLab README (generate deeplabv2_resnet101_msc-vocaug.pth from train2_iter_20000.caffemodel), i.e. the vocaug.pth. Could that be the cause?

Question about crop_size

Hello, does crop_size affect the model's accuracy in your method? The labels of the local crops are identical to the global image's labels, but if crop_size is too small, a crop may contain only background while its class label is still 1. Could that degrade accuracy?

Question about saliency maps

Are the VOC12 saliency maps generated directly with PoolNet's pretrained weights, or was PoolNet first trained on the VOC dataset?

DeepLab training problem

Hello, when training DeepLab with the generated VOC pseudo masks, I used the same pretrained models as the paper but cannot reach the paper's accuracy. I get vgg16_v1: 65%, vgg16_v2: 66.2%, resnet_v1: 68.5%, resnet_v2: 68.9% (all mIoU on the val set), while the paper reports 68.1%, 68.5%, 72%, and 72.1% respectively. Are there any hyperparameter settings to watch out for when training DeepLab on the generated pseudo masks, e.g. does the choice of random seed matter? Thanks.

Local loss becomes NaN during training with this code

Hello, I used the code from this repository and installed the dependencies (frameworks such as torch and torchvision at the versions pinned in requirements.txt; other libraries without pinned versions, due to compatibility issues). Because of hardware limits I reduced the batch size from 3 to 1 and trained the classification model on PASCAL VOC 2012, with no other hyperparameters changed. During training, the local loss became NaN. Do you know what the problem might be?
[screenshot]
The training configuration is as follows:
[screenshot]

MS COCO dataset question

Where can the SegmentationClass images of the coco14 dataset be downloaded? The official site only provides the JPEGImages.

How are the attention maps in the framework figure generated?

At which step are the local attention maps in the paper's framework figure generated? I did not see them after running the code. The attention maps produced are global ones, and cam_png is not it either. Do I need to process them myself?

Letter of thanks

Very nice work! Thanks for the authors' contribution. The proposed method is effective and helpful.

Training a binary classifier

Hello, and thanks for the open-source code! I am running experiments with it on a dataset that contains only one category: images containing the foreground object, plus images containing only background. How should I set num_classes in this case? Currently I use two classes, labeling background-only images 0 and foreground images 1. Could this cause problems, and do you have any suggestions?
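
For reference, many weakly supervised codebases treat a single foreground category as a one-class multi-label problem rather than a two-class softmax. A minimal sketch, not taken from this repository, with illustrative names:

import torch
import torch.nn.functional as F

# Hypothetical single-class setup: the label is a length-1 multi-hot vector,
# 1.0 if the foreground object is present in the image, 0.0 otherwise.
logits = torch.randn(4, 1)                       # (batch, num_classes=1)
labels = torch.tensor([[1.], [0.], [1.], [0.]])  # presence / absence
loss = F.multilabel_soft_margin_loss(logits, labels)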

The aggregation of the saliency map

Hi, thanks for your work.
I am confused about the aggregation of the saliency map. After obtaining the local CAM, the saliency map is used to revise the pseudo mask, as in https://github.com/PengtaoJiang/L2G/blob/main/scripts/train_l2g_sal_voc.py#L203 . It seems that a CAM foreground region is removed if it has no overlap with the saliency map, i.e., the transfer loss would be L_st = ||0 - G_i|| when |S_i| = 0, instead of L_st = ||A_i - G_i|| when |S_i| = 0 as the paper reports.
Thank you.

Large gap in mIoU: where did it go wrong?

Hello, I followed the workflow in the provided md file and ran the VOC dataset, but the resulting mIoU is far off: after training the DeepLab-v2 provided with the code, the mIoU is only 0.49. Could you advise where the problem might be?

Below are the hyperparameters in each .sh file (only the file paths were changed):

In train_l2g_sal_voc.sh:

EXP=exp_voc
RUN_FILE=train_l2g_sal_voc.py
BASH_FILE=train_l2g_sal_voc.sh
GPU_ID=0
CROP_SIZE=320
PATCH_NUM=6

mkdir -p runs/${EXP}/model/
cp ${BASH_FILE} runs/${EXP}/model/${BASH_FILE}
cp scripts/${RUN_FILE} runs/${EXP}/model/${RUN_FILE} 

CUDA_VISIBLE_DEVICES=0 python3 ./scripts/${RUN_FILE} \
    --img_dir=/mnt/data/zcw/segmentation/data/VOCdevkit/VOC2012/ \
    --train_list=./data/voc12/train_cls.txt \
    --test_list=./data/voc12/val_cls.txt \
    --epoch=10 \
    --lr=0.001 \
    --batch_size=3 \
    --iter_size=1 \
    --dataset=pascal_voc \
    --input_size=448 \
    --crop_size=${CROP_SIZE} \
    --disp_interval=100 \
    --num_classes=20 \
    --num_workers=16 \
    --patch_size=${PATCH_NUM} \
    --snapshot_dir=./runs/${EXP}/model/  \
    --att_dir=./runs/${EXP}/  \
    --decay_points='5' \
    --kd_weights=10 \
    --bg_thr=0.001
    # --load_checkpoint="./runs/${EXP}/model/author_pretrained/pascal_voc_epoch_9.pth" \
    # --current_epoch=10

In test_l2g_voc.sh:

EXP=exp_voc
TYPE=ms
THR=0.25

CUDA_VISIBLE_DEVICES=1 python3 ./scripts/test_l2g_voc.py \
    --img_dir=/mnt/data/zcw/segmentation/data/VOCdevkit/VOC2012/JPEGImages/ \
    --test_list=./data/voc12/train_cls.txt \
    --arch=vgg \
    --batch_size=1 \
    --dataset=pascal_voc \
    --input_size=224 \
    --num_classes=20 \
    --thr=${THR} \
    --restore_from=./runs/${EXP}/model/author_pretrained/pascal_voc_epoch_9.pth \
    --save_dir=./runs/${EXP}/${TYPE}/attention/ \
    --multi_scale \
    --cam_png=./runs/${EXP}/cam_png/

CUDA_VISIBLE_DEVICES=1 python3 scripts/evaluate_mthr_voc.py \
    --datalist ./data/voc12/train_aug.txt \
    --gt_dir /mnt/data/zcw/segmentation/data/VOCdevkit/VOC2012/SegmentationClassAug/ \
    --save_path ./runs/${EXP}/${TYPE}/result.txt \
    --pred_dir ./runs/${EXP}/${TYPE}/attention/

In gen_gt_voc.sh:

EXP=exp_voc
TYPE=ms

CUDA_VISIBLE_DEVICES=0 python3 gen_gt.py \
   --dataset=pascal_voc \
   --datalist=data/voc12/train_aug.txt \
   --gt_dir=/mnt/data/zcw/segmentation/data/VOCdevkit/VOC2012/JPEGImages/ \
   --save_path=/mnt/data/zcw/segmentation/data/VOCdevkit/VOC2012/pseudo_seg_labels/ \
   --pred_dir=./runs/${EXP}/${TYPE}/attention/ \
   --num_workers=16

Ablation study

In experiment 1, when comparing mIoU among sliding windows, local images, and the L2G method, CAM is used as the baseline, but the original CAM was proposed without a segmentation step. So in this experiment, is CAM's way of generating attention maps replaced by the attention-generation method of this paper?

about reproducing

Hello.
Thank you for sharing the impressive work, L2G!

I tried to reproduce L2G following the implementation scripts; however, I obtained about 66% mIoU on VOC2012, which is far behind the paper's 72.1%.

I used the scripts below.

  1. use the pretrained models for VOC, without running ./train_l2g_sal_voc.sh
  2. run ./test_l2g_voc.sh
  3. run ./gen_gt_voc.sh
  4. run deeplab v2 python main.py train --config-path configs/voc12_resnet_dplv2.yaml

Could you check the reproduction of this work?
Do you have any recommendations regarding this result?

# data/scores/voc12_resnet_v2/deeplabv2_resnet101_msc/val/scores_crf.json
{
    "Class IoU": {
        "0": 0.8893978389769717,
        "1": 0.7332514237124688,
        "2": 0.2647600116462558,
        "3": 0.7530019963539774,
        "4": 0.6229389456497264,
        "5": 0.6844399420943811,
        "6": 0.8477865759662723,
        "7": 0.7323184116750477,
        "8": 0.8361064618817893,
        "9": 0.268236496479961,
        "10": 0.7934048627881579,
        "11": 0.4293026338035794,
        "12": 0.7782026771867929,
        "13": 0.7793843619160901,
        "14": 0.6993258587392808,
        "15": 0.7108989092335944,
        "16": 0.5175071559996408,
        "17": 0.7581466783853126,
        "18": 0.4200272117472957,
        "19": 0.8026335398365552,
        "20": 0.5178384991250041
    },
    "Frequency Weighted IoU": 0.839895707892765,
    "Mean Accuracy": 0.8471590439388089,
    "Mean IoU": 0.6589957377713407,
    "Pixel Accuracy": 0.9036374933774816
}

The role of box_generation

Hello, after reading your code I find the box_generation function in LoadData.py somewhat hard to understand. Could you explain how box_generation produces the boxes, and the meaning and use of the resulting boxes? Many thanks!
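
For orientation, a hypothetical re-creation of what a crop-box generator of this kind typically does (this is not the repository's actual implementation; crop_size and patch_num mirror the training-script arguments):

import random

def box_generation_sketch(img_h, img_w, crop_size=320, patch_num=6):
    # Hypothetical sketch: sample patch_num random crop_size x crop_size
    # boxes inside the image; each (x1, y1, x2, y2) box is later used to
    # cut out a local view for the local network.
    boxes = []
    for _ in range(patch_num):
        x1 = random.randint(0, img_w - crop_size)
        y1 = random.randint(0, img_h - crop_size)
        boxes.append((x1, y1, x1 + crop_size, y1 + crop_size))
    return boxes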
