jorge-pessoa / pytorch-msssim
This project forked from po-hsun-su/pytorch-ssim
PyTorch differentiable Multi-Scale Structural Similarity (MS-SSIM) loss
License: Other
Under "Stability and normalization" in README.md, normalized="relu" should be normalize="relu".
Currently the MS-SSIM calculation may return NaN when comparing two images with very low MS-SSIM scores, breaking the training process unless it is accounted for.
This is easy to reproduce and can be worked around, but the root cause should be found and fixed.
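Until the root cause is fixed, one way to keep a training run alive (a sketch, not part of the library) is to sanitize the loss value before calling backward():

```python
import torch

def safe_loss(loss: torch.Tensor, fallback: float = 0.0) -> torch.Tensor:
    # Replace any NaN in the loss with a fixed fallback value so a single
    # bad batch does not break the whole run; `fallback` is an arbitrary
    # choice for this sketch.
    return torch.nan_to_num(loss, nan=fallback)
```

Note that this only masks the symptom; batches that produce NaN contribute nothing useful to the gradient.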
Based on the wiki and the paper, to compute contrast you need
In this line you are using
So I implemented this as follows:
import pytorch_msssim
[...]
lr_loss = pytorch_msssim.MSSSIM()
[...]
lr_tensor = torch.tensor(np.expand_dims(lr_img.astype(np.float32), axis=0)).type('torch.DoubleTensor').to(DEVICE)
in_tensor = torch.tensor(np.expand_dims(sr_img.astype(np.float32), axis=0)).type('torch.DoubleTensor').to(DEVICE)
[...]
ds_in_tensor = bds(in_tensor, nhwc=True)
lr_l = lr_loss(ds_in_tensor, lr_tensor)
l2_l = l2_loss(in_tensor, org_tensor)
l = lr_l + LAMBDA * l2_l
l.backward()
And I'm getting this error:
Traceback (most recent call last):
File "/usr/xtmp/superresoluter/superresolution/tester_msssim.py", line 137, in
lr_l = lr_loss(ds_in_tensor, lr_tensor)
File "/home/home5/abarnett/sr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/usr/project/xtmp/superresoluter/superresolution/pytorch_msssim/init.py", line 133, in forward
return msssim(img1, img2, window_size=self.window_size, size_average=self.size_average)
File "/usr/project/xtmp/superresoluter/superresolution/pytorch_msssim/init.py", line 78, in msssim
sim, cs = ssim(img1, img2, window_size=window_size, size_average=size_average, full=True, val_range=val_range)
File "/usr/project/xtmp/superresoluter/superresolution/pytorch_msssim/init.py", line 41, in ssim
mu1 = F.conv2d(img1, window, padding=padd, groups=channel)
RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'weight'
Any ideas?
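The traceback points at a dtype mismatch: the Gaussian window built inside ssim() is float32, while the inputs were cast to DoubleTensor. A sketch of the likely fix is to keep the inputs in float32 (alternatively, cast the whole module to double). `lr_img` below is a toy stand-in for the real image array:

```python
import numpy as np
import torch

# Keep the image tensor in float32 so it matches the float32 conv weight;
# the original code's .type('torch.DoubleTensor') is what triggered the
# "Expected Double but got Float" error.
lr_img = np.random.rand(16, 16, 1)
lr_tensor = torch.from_numpy(np.expand_dims(lr_img.astype(np.float32), axis=0))
```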
There is a TODO in the code:
# TODO: store window between calls if possible
An easy way to do this is to wrap create_window with the functools.lru_cache wrapper.
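A minimal sketch of the idea, using a simplified stand-in for the library's create_window (the real one lives in pytorch_msssim/__init__.py):

```python
from functools import lru_cache

import torch

@lru_cache(maxsize=None)
def create_window(window_size: int, channel: int) -> torch.Tensor:
    # Simplified stand-in: a normalized 2D Gaussian kernel repeated per
    # channel for a grouped convolution. sigma=1.5 follows the SSIM paper.
    sigma = 1.5
    coords = torch.arange(window_size, dtype=torch.float32) - window_size // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = (g / g.sum()).unsqueeze(0)                # shape (1, W)
    window = (g.t() @ g).unsqueeze(0).unsqueeze(0)  # shape (1, 1, W, W)
    return window.expand(channel, 1, window_size, window_size)

w1 = create_window(11, 3)
w2 = create_window(11, 3)  # cache hit: same tensor object is returned
```

One caveat: lru_cache returns the same tensor object on every hit, so callers must treat it as read-only.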
Hello, I ran into a problem while running your code.
Using the MNIST dataset, the two inputs to ssim have shape (batch_size, 1, 28, 28),
but it seems the variable "win" only supports 3-channel tensors by default.
Is there any workaround?
Thank you.
P.S. the following is my error:
File "/home/*/anaconda3/lib/python3.7/site-packages/pytorch_msssim/ssim.py", line 37, in gaussian_filter
out = F.conv2d(out, win.transpose(2, 3), stride=1, padding=0, groups=C)
RuntimeError: Given groups=1, weight of size [3, 1, 11, 1], expected input[128, 3, 28, 18] to have 1 channels, but got 3 channels instead
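One simple workaround (a sketch, not necessarily the intended fix) is to replicate the single gray channel so the 3-channel window applies; a cleaner fix is to build the window for the actual channel count if the API allows it:

```python
import torch

# MNIST-shaped batch: the gray channel is copied to 3 channels so the
# shapes line up with a 3-channel Gaussian window, at the cost of some
# redundant computation.
x = torch.rand(128, 1, 28, 28)
x3 = x.repeat(1, 3, 1, 1)
```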
In your implementation:
https://github.com/jorge-pessoa/pytorch-msssim/blob/master/pytorch_msssim/__init__.py
line 96:
output = torch.prod(pow1[:-1] * pow2[-1])
Should it be:
output = torch.prod(pow1[:-1]) * pow2[-1]
Otherwise pow2[-1] would be multiplied in too many times?
Thanks!
Hello
How are you?
Thanks for contributing to this project.
Can we use this msssim loss in binary image segmentation?
What about using the ssim loss rather than msssim?
Hi
This code contains the same error as skimage; you can read the full description here: scikit-image/scikit-image#5192
In short, when SSIM is used to estimate perceptual quality, the authors of the original paper proposed downsampling the images first, so that SSIM focuses on the major differences between the reference and distorted inputs.
So what?
If you are using this implementation as a loss function for a CNN, you are likely steering it in the wrong direction.
Alternatives
You can find correct implementations of SSIM, MS-SSIM and some other metrics here:
https://github.com/photosynthesis-team/piq
Hi @jorge-pessoa,
Thanks a lot for sharing the code. I have some doubts about the usage of your implementation.
This is the way I am using the msssim in my code as a loss function.
loss_SSIM_A = pytorch_msssim.msssim(G_BA(real_A), real_A, normalize=True)
loss_SSIM_B = pytorch_msssim.msssim(G_AB(real_B), real_B, normalize=True)
loss_SSIM = (loss_SSIM_A + loss_SSIM_B) / 2
The resulting value is between 0.85 and 0.97 until epoch 20, and then decreases to 0.65-0.81. Is this fine, or do I need to define a threshold to maximize the MS-SSIM value as you did in max-ssim.py?
Am I using msssim correctly? In max-ssim.py you add a minus sign; should I do the same, and what is the reason for the minus sign?
msssim_out = -loss_func(img1, img2)
Thanks in advance!
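For context on the minus sign: MS-SSIM is a similarity score in [0, 1] where higher is better, so when it is used as a training loss the optimizer must minimize its negative, or equivalently 1 - msssim. A toy sketch (`msssim_value` stands in for the value the loss function returns):

```python
# Higher similarity should mean lower loss, hence the sign flip.
msssim_value = 0.9
loss = 1.0 - msssim_value   # or equivalently: loss = -msssim_value
```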
Sorry to bother you, but I used ssim as a loss function and found some strange outputs. The gray values of the inputs are normalized to [0, 1], but the network output can fall outside this range (e.g. values around -100), which is meaningless. In other words, the network is optimized in the wrong direction. I do not know the reason; could you please give me some suggestions? By the way, the network works fine with other loss functions.
Hi,
Thanks for this tool. I use both pytorch_mssim.ssim and skimage.measure.compare_ssim to compute ssim, but the results are different. For example, ssim evaluation on an image sequence:
pytorch_msssim.ssim: [0.9655, 0.9500, 0.9324, 0.9229, 0.9191, 0.9154]
skimage.measure.compare_ssim: [0.97794482, 0.96226299, 0.948432, 0.9386946, 0.93113704, 0.92531453]
Why does this happen?
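The gap is most likely down to default parameters: skimage's SSIM defaults to a 7x7 uniform window, while this implementation uses an 11x11 Gaussian window. Aligning skimage's settings with the Gaussian formulation from the original paper should narrow the difference (a sketch; the right data_range depends on your images):

```python
import numpy as np
from skimage.metrics import structural_similarity  # newer name for compare_ssim

a = np.random.rand(64, 64)
b = np.clip(a + 0.05 * np.random.rand(64, 64), 0.0, 1.0)

# gaussian_weights/sigma/use_sample_covariance reproduce the Gaussian-window
# SSIM from the original paper, which is closer to what pytorch_msssim computes.
score = structural_similarity(a, b, gaussian_weights=True, sigma=1.5,
                              use_sample_covariance=False, data_range=1.0)
```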
Thanks for making this implementation.
I found the current implementation may have a small bug, if I am not mistaken.
When passing batches of images with batch size > 1, the cs returned from the ssim function has size 1.
This may be expected behaviour when the user wants the average MS-SSIM score over the batch (i.e. size_average=True), but it is not ideal when the user wants cs for each image.
As a result, images within the same batch get very similar MS-SSIM scores even with size_average=False, since they all share the same mcs.
Here is some minimal reproduction on Google Colab.
https://colab.research.google.com/drive/1tNWb0QTqn3clnKcMlFeA8QOGJZDDRJe3?usp=sharing
I would appreciate it if it were possible to specify whether to average cs, like ret, in the implementation here.
Proposed change:
Use the size_average flag to specify the behaviour of the mean operation on cs, i.e.:
cs = v1 / v2  # contrast sensitivity
ssim_map = ((2 * mu1_mu2 + C1) * v1) / ((mu1_sq + mu2_sq + C1) * v2)
if size_average:
    cs = cs.mean()
    ret = ssim_map.mean()
else:
    cs = cs.mean(1).mean(1).mean(1)
    ret = ssim_map.mean(1).mean(1).mean(1)
Additional reference:
The TensorFlow implementation of MS-SSIM keeps cs for each image in the batch (there, they take the mean over width and height first and the mean over each channel later):
https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/ops/image_ops_impl.py#L3581
From the TensorFlow implementation, line 182, I saw that they first apply an average-pooling filter and then downsample by slicing with step 2:
for _ in range(levels):
    ssim, cs = _SSIMForMultiScale(...)
    ...
    filtered = [convolve(im, downsample_filter, mode='reflect')
                for im in [im1, im2]]
    im1, im2 = [x[:, ::2, ::2, :] for x in filtered]
But in your implementation, you only apply a 2x2 AvgPool, so the images are not downsampled by 2 as in the paper.
for _ in range(levels):
    sim, cs = ssim(...)
    ...
    img1 = F.avg_pool2d(img1, (2, 2))
    img2 = F.avg_pool2d(img2, (2, 2))
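For what it's worth, F.avg_pool2d defaults its stride to the kernel size, so a (2, 2) average pool both smooths and downsamples by a factor of 2; a quick check:

```python
import torch
import torch.nn.functional as F

# With no explicit stride, avg_pool2d uses stride = kernel_size, so the
# spatial resolution is halved at each scale.
x = torch.rand(1, 3, 32, 32)
y = F.avg_pool2d(x, (2, 2))
print(y.shape)  # torch.Size([1, 3, 16, 16])
```

The remaining difference from TensorFlow is then the choice of filter (a 2x2 box filter versus their reflection-padded downsample filter), not the downsampling factor itself.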
Hi, I am wondering how I can compute the residual map for MS-SSIM?
Quick question on the output range: does the SSIM implementation output numbers in the range [0, 1], or in [-1, 1] as described in the original SSIM paper (http://www.cns.nyu.edu/pub/lcv/wang03-reprint.pdf)?
How to deal with the problem of nan loss during training?
Thanks for making this implementation.
I found the current implementation may have a small typo, if I am not mistaken.
The current output calculation seems to have a misplaced bracket, which changes the resulting MS-SSIM score.
https://github.com/jorge-pessoa/pytorch-msssim/blob/master/pytorch_msssim/__init__.py#L104
I think it should be
output = torch.prod(pow1[:-1]) * pow2[-1]
instead of
output = torch.prod(pow1[:-1] * pow2[-1])
Reference:
The original Implementation of MS-SSIM in Matlab: https://ece.uwaterloo.ca/~z70wang/research/iwssim/
Their calculation takes the product over pow1[:-1] first and then multiplies by pow2[-1]:
overall_msssim = prod(mcs_array(1:level-1).^weight(1:level-1))*(msssim_array(level).^weight(level));
The original paper of MS-SSIM: https://www.cns.nyu.edu/pub/eero/wang03b.pdf
There, the calculation is pow2[-1] multiplied by the product over pow1.
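A toy computation (with made-up per-scale values) shows the two bracketings really do differ:

```python
import torch

pow1 = torch.tensor([0.9, 0.8, 0.7])     # mcs ** weights per scale (toy values)
pow2 = torch.tensor([0.95, 0.85, 0.75])  # ssim ** weights per scale (toy values)

wrong = torch.prod(pow1[:-1] * pow2[-1])  # pow2[-1] multiplied into every factor
right = torch.prod(pow1[:-1]) * pow2[-1]  # matches the Matlab reference
```

Here `wrong` evaluates to 0.405 while `right` evaluates to 0.54, because the misplaced bracket applies pow2[-1] once per remaining scale instead of once overall.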
I have seen that you use while value < threshold: in your code - could you explain whether it is necessary and why?
Thanks!