wdhudiekou / umf-cmgr
[IJCAI2022 Oral] Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration
License: MIT License
Would it be convenient to provide a dataset that the code can actually run on?
Would it be convenient to share the code and loss functions of the cross-modality perceptual style transfer network (CPSTN)? I have tried registration-plus-fusion before, but because the modality gap between infrared and visible images is so large, directly designing a registration network was not feasible. I also tried modality translation first, but the translation results were unsatisfactory. Your translation results are truly impressive, so I would like to learn from your design for modality translation.
Hello author,
The registration loss consists of two parts. In the first part, the features of the pseudo-infrared image acted on by the deformation field are matched to the features of the actually deformed infrared image. This part I can understand: it is where the deformation field is learned from.
In the second part, the features of the registered infrared image estimated by the MRRN network are matched to the features of the pseudo-infrared image, and this part I do not quite understand. Why match against the pseudo-infrared features? Is it because the pseudo-infrared features are treated as the ground truth of the infrared image for the current network, so that the features of the generated registered image are pushed to resemble them?
So the deformation field is learned from the distorted infrared image, while the infrared features are learned from the pseudo-infrared image? Is that because the distorted infrared image is the real source image, but since it is distorted its features cannot be used directly? Would this reduce the content fidelity of the aligned image?
I am not sure whether my understanding is correct. Sorry to bother you with this question; I look forward to your reply. Thank you.
Hello, when I run test_reg.py I get the error below, so I went to look at the reported location:
File "../models/layers.py", line 124, in forward
new_locs = self.grid + flow
RuntimeError: The size of tensor a (128) must match the size of tensor b (236) at non-singleton dimension 3
I checked, and flow is [1, 2, 170, 236]
while grid is [1, 2, 128, 128].
How can I solve this problem?
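A likely cause (an assumption, not confirmed by the authors): the pretrained SpatialTransformer caches a sampling grid built for the 128x128 training crops, while this test pair is 170x236. A minimal sketch of a workaround that resizes test inputs to the training resolution before feeding the network (TRAIN_SIZE and to_train_size are hypothetical names):

```python
import torch
import torch.nn.functional as F

# Assumption: the checkpoint was trained on 128x128 crops, so test images of
# any other size must be resized to that shape before the registration net.
TRAIN_SIZE = (128, 128)

def to_train_size(img: torch.Tensor) -> torch.Tensor:
    """Resize a [B, C, H, W] image to the network's training resolution."""
    return F.interpolate(img, size=TRAIN_SIZE, mode="bilinear", align_corners=False)

ir = torch.rand(1, 1, 170, 236)   # deformed infrared test image
it = torch.rand(1, 1, 170, 236)   # pseudo-infrared test image
ir_128, it_128 = to_train_size(ir), to_train_size(it)
print(ir_128.shape)  # torch.Size([1, 1, 128, 128])
```

The alternative is to rebuild the transformer's grid from the actual input size at test time, but resizing is the smaller change.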
What exactly do --pretrained and --ckpt mean in Trainer/train_reg.py, and which step generates the corresponding files? (A detailed description would be appreciated.)
What exactly does --ckpt mean in Trainer/train_fuse.py, and where is the corresponding file generated? (A detailed description would be appreciated.)
What exactly does --ckpt mean in Trainer/train_reg_fusion.py, and where is its file generated? Also, where are the files for --load_model_reg and --load_model_fuse generated? (A detailed description would be appreciated.)
What exactly do --disp, --ckpt_reg, --dst_reg, --ckpt_fus, and --dst_fus mean in Test/test_reg_fusion.py, and where are the corresponding files generated? (A detailed description would be appreciated.)
What exactly do --ckpt and --dst mean in Test/test_fuse_ycbcr.py and Test/test_fuse.py, and where are the corresponding files generated? (A detailed description would be appreciated.)
What exactly do --disp, --ckpt, and --dst mean in Test/test_reg.py, and where are the corresponding files generated? (A detailed description would be appreciated.)
Thank you for your great work! I noticed that you apply deformation fields to infrared images to meet the requirement of misaligned image pairs. So I wonder how to make the model work with real misaligned image pairs, where the resolutions of the infrared and visible images are very different?
Hello, my main research area is medical image fusion. Regarding UMF-CMGR: in general, the displacement-vector information of a deformed image is unknown or impossible to obtain in full (for example, when the image undergoes local deformation or a global transformation, or for more complex deformations whose degrees of freedom cannot be predicted). Your approach effectively knows the deformation parameters in advance and then draws the corresponding deformation-grid or dense-optical-flow visualization. But if the deformation parameters are unknown, and all we have is the network's predicted deformation field (2D -> (B, H, W, 2), 3D -> (B, X, Y, Z, 3)) plus the warped image produced by the STN, how should the warped image with an overlaid deformation grid be drawn? In this paper you obtain it by inverting the known deformation-parameter matrix (and a deformation field is not necessarily smooth and invertible).
Secondly, can the proposed cross-modality style transfer network CPSTN be applied to misaligned 2D multi-modal medical image fusion, e.g., MRI-CT, MRI-PET, or MRI-Ultrasound fusion? Considering the differences between sensor or tomographic imaging modalities, would this GAN-based cross-modality style transfer still be applicable?
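For the grid-drawing question above, one common recipe (a sketch under my own assumptions, not the authors' plotting code) is to rasterize a regular grid image and warp it with the predicted dense flow itself, here assumed to be in pixel units with channel order (dx, dy):

```python
import torch
import torch.nn.functional as F

def make_grid_image(h: int, w: int, step: int = 8) -> torch.Tensor:
    """White lines every `step` pixels on a black [1, 1, h, w] canvas."""
    img = torch.zeros(1, 1, h, w)
    img[:, :, ::step, :] = 1.0
    img[:, :, :, ::step] = 1.0
    return img

def warp_with_flow(img: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp a [B, 1, H, W] image with a pixel-unit flow of shape [B, 2, H, W].

    Assumption: flow channels are (dx, dy) displacements, as in many STN codebases.
    """
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).float().unsqueeze(0)  # [1, H, W, 2]
    coords = base + flow.permute(0, 2, 3, 1)                   # add (dx, dy)
    # normalize to [-1, 1] as grid_sample expects
    coords[..., 0] = 2.0 * coords[..., 0] / (w - 1) - 1.0
    coords[..., 1] = 2.0 * coords[..., 1] / (h - 1) - 1.0
    return F.grid_sample(img, coords, align_corners=True)

grid_img = make_grid_image(128, 128)
flow = torch.zeros(1, 2, 128, 128)   # identity flow, for illustration only
warped_grid = warp_with_flow(grid_img, flow)
```

Replacing the zero flow with the network's predicted field and saving `warped_grid` gives the deformation-grid visualization without ever inverting a parameter matrix.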
Why do the pseudo-infrared images generated with both the official pretrained [CPSTN] model and a CPSTN model trained on my own dataset have extremely low resolution and severe loss of texture detail?
What do latest_net_D_A.pth, latest_net_D_B.pth, latest_net_G_A.pth, and latest_net_G_B.pth each mean?
In latest_net_D_A.pth, what do D and A stand for? Does it mean taking an infrared image and producing a visible image?
In latest_net_G_A.pth, what do G and A stand for? Does it mean taking an infrared image and producing a visible image?
Hello,
How are the three performance metrics mentioned in the paper (CC, VIF, and SSIM) computed?
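For reference, CC is usually computed as the Pearson correlation between the fused image and each source image; below is a minimal numpy sketch. The exact implementations used in the paper are not published here, and SSIM/VIF are typically taken from libraries such as skimage.metrics.structural_similarity or sewar.full_ref.vifp, so treat this as an assumption rather than the authors' code.

```python
import numpy as np

def cc(a: np.ndarray, b: np.ndarray) -> float:
    """Correlation coefficient: Pearson correlation over flattened pixels."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    return float((a * b).sum() /
                 (np.sqrt((a ** 2).sum()) * np.sqrt((b ** 2).sum()) + 1e-12))

x = np.arange(16, dtype=np.float64).reshape(4, 4)
print(cc(x, x))  # ~1.0 for identical images
```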
Hello author,
In my dataset the visible and infrared images have different original resolutions, so I simply resized them and started training, unlike your FLIR dataset where every image pair shares the same resolution. At registration test time, the registered results written to ir_reg show distorted edges and black borders, which I suspect is caused by mismatched keypoints.
Could you explain what causes this problem and how to fix it?
Looking forward to your reply!
Thanks!
Why do I see no registration effect at all when testing with the registration pretrained model you provided? My it input is exactly the pseudo-infrared image generated by your CPSTN, and ir is the deformed infrared image.
Hello senior:
When using the CPSTN model you provided (https://pan.baidu.com/s/1JO4hjdaXPUScCI6oFtPEnQ, code: i9ju) to generate pseudo-infrared images, I ran into the error below. Could the provided pretrained model and the code be mismatched? Could you upload a matching one?
RuntimeError: Error(s) in loading state_dict for ResnetGenerator:
Missing key(s) in state_dict: "model.10.conv_block.6.weight", "model.10.conv_block.6.bias", "model.11.conv_block.6.weight", "model.11.conv_block.6.bias", "model.12.conv_block.6.weight", "model.12.conv_block.6.bias", "model.13.conv_block.6.weight", "model.13.conv_block.6.bias", "model.14.conv_block.6.weight", "model.14.conv_block.6.bias", "model.15.conv_block.6.weight", "model.15.conv_block.6.bias", "model.16.conv_block.6.weight", "model.16.conv_block.6.bias", "model.17.conv_block.6.weight", "model.17.conv_block.6.bias", "model.18.conv_block.6.weight", "model.18.conv_block.6.bias".
Unexpected key(s) in state_dict: "model.10.conv_block.5.weight", "model.10.conv_block.5.bias", "model.11.conv_block.5.weight", "model.11.conv_block.5.bias", "model.12.conv_block.5.weight", "model.12.conv_block.5.bias", "mod
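The missing conv_block.6 / unexpected conv_block.5 pattern usually means the checkpoint's ResnetGenerator was built without dropout (CycleGAN's --no_dropout), so the second conv in each residual block sits at index 5 instead of 6. Under that assumption, a minimal sketch that remaps the keys before loading (remap_resnet_keys is a hypothetical helper, not part of the repository):

```python
# Sketch under the assumption stated above: rename each residual block's
# second-conv keys from index 5 to index 6 so a model built WITH dropout can
# load a checkpoint saved WITHOUT dropout.
def remap_resnet_keys(state_dict: dict) -> dict:
    return {k.replace(".conv_block.5.", ".conv_block.6."): v
            for k, v in state_dict.items()}

old_keys = {"model.10.conv_block.5.weight": 1,
            "model.10.conv_block.5.bias": 2,
            "model.1.weight": 3}
print(sorted(remap_resnet_keys(old_keys)))
# ['model.1.weight', 'model.10.conv_block.6.bias', 'model.10.conv_block.6.weight']
```

The cleaner alternative is to build the generator with the use_dropout flag matching the checkpoint, so no remapping is needed.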
Thank you for your great work!
I have a question about your datasets.
I got the RoadScene dataset from here, and it contains two types of thermal images (cropinfrared and infrared).
Which infrared images are the correct ones for making the test images with this code?
To my understanding, I should use the crop infrared images to build the test dataset. Is that right?
Thank you. I'm waiting for your reply.
Hi, this is good work!
I saw a similar paper: Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration. That paper uses the same loss function and method, but its code (https://github.com/wdhudiekou/IMF) is not open source, so I hope it will be made public.
If you're reading this message, please reply soon. Thank you very much!
At the top of the code, from dataloader.fuse_data_vsm import FuseTestDataYcbcr imports FuseTestDataYcbcr, but I cannot find any FuseTestDataYcbcr defined in dataloader/fuse_data_vsm.py.
May I ask whether this registration network is equally effective for registering images with viewpoint changes?
Hi, when I train with train_reg_fusion.py, I get an error: KeyError: 'spatial_transform.grid'.
The information about this error is as follows:
Loading pre-trained RegNet checkpoint ../reg_0280.pth
Traceback (most recent call last):
File "E:/Study/fusing/UMF-CMGR/Trainer/train_reg_fusion.py", line 218, in
main(args, visdom)
File "E:/Study/fusing/UMF-CMGR/Trainer/train_reg_fusion.py", line 107, in main
RegNet.load_state_dict(state)
File "E:/Study/fusing/UMF-CMGR/models/deformable_net.py", line 74, in load_state_dict
state_dict.pop('spatial_transform.grid')
KeyError: 'spatial_transform.grid'
Can you help me? T-T
My anaconda env is as follows:
Kornia 0.5.11
pytorch 1.6.0
CUDA 10.2
opencv-contrib-python 3.4.2.16
visdom 0.1.5
torchvision 0.7.0
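The traceback above shows deformable_net.py calling state_dict.pop('spatial_transform.grid') unconditionally, which raises KeyError when the checkpoint does not contain that buffer. A minimal defensive sketch, assuming the buffer is safe to drop because the sampling grid is rebuilt from the input size (strip_grid is a hypothetical helper, not the authors' fix):

```python
# Pop with a default so loading does not crash when the checkpoint lacks
# the non-trainable 'spatial_transform.grid' buffer.
def strip_grid(state_dict: dict) -> dict:
    state_dict = dict(state_dict)  # copy so the caller's dict is untouched
    state_dict.pop('spatial_transform.grid', None)  # no KeyError if absent
    return state_dict

print(strip_grid({'conv1.weight': 0, 'spatial_transform.grid': 1}))
# {'conv1.weight': 0}
print(strip_grid({'conv1.weight': 0}))
# {'conv1.weight': 0}
```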
Hello author, in Section 3.2 you quantitatively compare the registration results of CGRP with FlowNet and VoxelMorph; the code for this part is in the /Evaluation/metrics.py file you provide. That file contains two paths:
root_in = '/home/zongzong/WD/Fusion/JointRegFusion/results_Road/Reg/220507_Deformable_2Fe_10Grad/ir_reg/'
root_gt = '../dataset/raw/ctest/Road/ir_121/'
What do these two paths represent? Is root_in the registered infrared images? And is root_gt the visible images, the pre-registration infrared images, or something else?
I would like to run metrics.py and reproduce the evaluation results in your paper, i.e., MSE 0.004, NCC 0.926, and MI 1.648.
Hello,
Could you please provide a link to your training and testing data, pre-trained models, etc. on Google Drive?
Thank you in advance.
Mojtaba
The pcp implementation inside vggloss in the code does not seem to match the definition in the paper.
Running test_reg.py raises RuntimeError: The size of tensor a (128) must match the size of tensor b (320) at non-singleton dimension 3.
The detailed error message is as follows:
===> Starting Testing
Traceback (most recent call last):
File "E:\UMF-CMGR\UMF-CMGR-main\Test\test_reg.py", line 189, in
main(args)
File "E:\UMF-CMGR\UMF-CMGR-main\Test\test_reg.py", line 66, in main
test(net, test_data_loader, args.dst, device)
File "E:\UMF-CMGR\UMF-CMGR-main\Test\test_reg.py", line 86, in test
ir_pred, f_warp, flow, int_flow1, int_flow2, disp_pred = net(it, ir)
File "D:\Users\Anaconda3\envs\UMF_CMGR\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\UMF-CMGR\UMF-CMGR-main\models\deformable_net.py", line 106, in forward
features_s_warped, _ = self.spatial_transform_f(c11, up_int_flow2) # torch.Size([16, 16, 128, 128])
File "D:\Users\Anaconda3\envs\UMF_CMGR\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\UMF-CMGR\UMF-CMGR-main\models\layers.py", line 124, in forward
new_locs = self.grid + flow
RuntimeError: The size of tensor a (128) must match the size of tensor b (320) at non-singleton dimension 3
Process finished with exit code 1
Could any expert help solve this problem?
May I ask what disp specifically refers to here? Could you explain it?
When I use test_reg.py and the pretrained model reg_0280.pth to test the registration process, I get a KeyError: 'spatial_transform.grid'. Can you help me? More information about this error is as follows:
===> loading trained model '../reg_0280.pth'
Traceback (most recent call last):
File "test_reg.py", line 187, in
main(args)
File "test_reg.py", line 61, in main
net.load_state_dict(model_state_dict)
File "/home/wu/UMF/Test/../models/deformable_net.py", line 74, in load_state_dict
state_dict.pop('spatial_transform.grid')
KeyError: 'spatial_transform.grid'
Hello:
I recently read your paper and benefited greatly from it. The one thing I do not understand is that the correction expression in the paper uses 1x1 convolutions,
yet the code uses 3x3 kernels. Am I misunderstanding something? Please take a look:
self.query_conv = nn.Conv2d(in_dim, in_dim, 3, 1, 1, bias=True)
self.key_conv = nn.Conv2d(in_dim, in_dim, 3, 1, 1, bias=True)
def forward(self, x, prior):
x_q = self.query_conv(x)
prior_k = self.key_conv(prior)
energy = x_q * prior_k
attention = self.sig(energy)
Thank you for your reply!
作者您好,在readme中您给出训练好的 G 权重,为可见光迁移红外的权重,请问可以给我一份 红外迁移可见光的权重吗?期待您的回复
warped, affine_param = self.trs(input): why does the trs call fail?
The error says a single return value is being unpacked into two variables.
Hello author, I have recently been reproducing your work. I configured the environment following the requirements you listed, and CUDA and torch are matched, but running train_reg.py produces the error below. I don't know the cause and hope you can explain it:
Traceback (most recent call last):
File "E:\fdr\UMF-CMGR\Trainer\train_reg.py", line 164, in
main(args, visdom)
File "E:\fdr\UMF-CMGR\Trainer\train_reg.py", line 83, in main
train(training_data_loader, optimizer, net, criterion, epoch, elastic, affine)
File "E:\fdr\UMF-CMGR\Trainer\train_reg.py", line 105, in train
ir_affine, affine_disp = affine(ir)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\fdr\UMF-CMGR\functions\affine_transform.py", line 26, in forward
warped, affine_param = self.trs(input) # [batch_size, 3, 3]
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\augmentation\base.py", line 244, in forward
output = self.apply_func(in_tensor, in_transform, self._params, return_transform)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\augmentation\base.py", line 204, in apply_func
output = self.apply_transform(in_tensor, params, trans_matrix)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\augmentation\augmentation.py", line 719, in apply_transform
padding_mode=self.padding_mode.name.lower(),
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\geometry\transform\imgwarp.py", line 170, in warp_affine
dst_norm_trans_src_norm: torch.Tensor = normalize_homography(M_3x3, (H, W), dsize)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\geometry\transform\homography_warper.py", line 378, in normalize_homography
src_pix_trans_src_norm = _torch_inverse_cast(src_norm_trans_src_pix)
File "E:\fdr\conda\envs\UMF-CMGR\lib\site-packages\kornia\utils\helpers.py", line 50, in _torch_inverse_cast
return torch.inverse(input.to(dtype)).to(input.dtype)
RuntimeError: CUDA error: no kernel image is available for execution on the device
In line 96 of the code, ir_pred, f_warp, flow, int_flow1, int_flow2, disp_pred = RegNet(it, ir), the registration network is called with only two inputs, which causes an error. How should this be solved?
I see that deformable_net's forward indeed requires three values: def forward(self, tgt, src, real, shape=None):
Moreover, when I do pass three values, the ir_pred output is all zeros in grayscale, and I wonder whether this is the cause.
If return_transform is not None, an error is raised. Can someone help me out?
ValueError: return_transform is deprecated. Please access the transformation matrix with .transform_matrix. For chained matrices, please use AugmentationSequential.
The purpose of registration is to align the two images as closely as possible. At what stage does the code of this work transform the images to produce the training/test datasets?
Is it enough to set dataset_mode to unaligned when training CPSTN?
Hello, I'm really interested in this work and I'm looking forward to trying CPSTN to generate pseudo-infrared images, but unfortunately I cannot download your pretrained model from Baidu because it is really difficult to create an account from Europe. Would it be possible for you to make it available from other sources (like Google Drive) or to let me have the pre-trained models in another way?
Thank you in advance!
Hello, thank you very much for your work.
In test_reg I removed the disp-related code, but then the registered images no longer show any spatial transformation. What is the reason for this?
Also, what should be passed to --it in test_reg?
I run into the same problem when generating deformed infrared images and when training and testing registration:
warped, affine_param = self.trs(input) # [batch_size, 3, 3]
ValueError: not enough values to unpack (expected 2, got 1)