taohuang2018 / neighbor2neighbor
Neighbor2Neighbor: Self-Supervised Denoising from Single Noisy Images
License: BSD 3-Clause "New" or "Revised" License
Will you release the pretrained model for SIDD? And is there an RRG-based model for illustration?
Thanks for your replies!
I have a question about the training dataset. Why can you assume that multiple degraded versions of the same latent image are available in unsupervised denoising? At line 370 of the "Train.py" script, you add a different noise sample to each clean image in every epoch, which generates multiple degraded versions of the same latent image. I believe this contradicts the statement in your paper: "The proposed self-supervised framework aims at training denoising networks with only single noisy images available" (not multiple noisy images, as in N2N).
This is a crucial problem. If we can generate multiple noisy versions of the same latent image, for example 100 versions as you do (100 epochs), we can simply average these 100 counterparts to produce a pseudo ground truth and use it for supervised learning. In fact, collecting multiple noisy versions of the same latent image and averaging them to produce a pseudo ground truth is how real-world datasets such as DND and SIDD obtain their ground truth.
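The averaging argument above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes i.i.d. zero-mean Gaussian noise with sigma = 25 on a toy one-dimensional "image", and the helper names (`add_noise`, `average_noisy_copies`, `rmse`) are hypothetical. Under these assumptions, averaging N independent copies should shrink the noise standard deviation by roughly sqrt(N).

```python
import math
import random


def add_noise(clean, sigma, rng):
    """Return one noisy realization of `clean` (i.i.d. Gaussian noise)."""
    return [v + rng.gauss(0.0, sigma) for v in clean]


def average_noisy_copies(clean, sigma, n_copies, rng):
    """Average n_copies independent noisy realizations pixel by pixel."""
    acc = [0.0] * len(clean)
    for _ in range(n_copies):
        for i, v in enumerate(add_noise(clean, sigma, rng)):
            acc[i] += v
    return [a / n_copies for a in acc]


def rmse(a, b):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)
clean = [rng.uniform(0.0, 255.0) for _ in range(2000)]  # toy "image"
sigma = 25.0

single_rmse = rmse(add_noise(clean, sigma, rng), clean)
averaged_rmse = rmse(average_noisy_copies(clean, sigma, 100, rng), clean)
print(f"single copy RMSE:      {single_rmse:.2f}")
print(f"100-copy average RMSE: {averaged_rmse:.2f}")
```

With 100 copies the residual error should be close to sigma / 10, which is why access to many realizations of one image effectively provides a clean target.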
Hi there, is the UNet architecture reported in your paper similar to the blind-spot UNet in the "High-Quality Self-Supervised Deep Image Denoising" paper, or is it just a plain UNet as used in medical image segmentation?
Hi, thanks for this great work. I trained the denoising network on ILSVRC2012_img_val (with noisetype = gauss25) and on my local sRGB dataset, using the same configuration parameters. When testing the trained models, the first model performs well, as expected, while the second performs poorly and seems to have little denoising effect.
Testing on the BSD datasets.
From left to right: clean image, image with added noise, denoised image.
Testing on local images.
Left is the noisy input, right is the denoised output.
It seems that the model performs well when trained on a synthetic dataset but poorly when trained on real local noisy data. I would appreciate it if anyone could give some advice.
Hello
How are you?
Thanks for contributing to this project.
It seems that this method is designed ONLY for denoising.
Is it possible to apply this method, without any change, to general image restoration tasks such as image deblurring and deraining?
Thanks for your great work.
I applied the neighbor sub-sampler to the packed 4-channel raw images and cannot reproduce the results.
Did I do anything wrong?
Here is my data-processing code.
import os
import random

import h5py
import numpy as np
import torch
import torch.utils.data as data

# BayerUnifyAug (bayer_unify / bayer_aug) is the poster's external helper;
# its import is not shown in the original post.

NOISY_PATH = ['_NOISY_RAW_010.MAT', '_NOISY_RAW_011.MAT']
MODEL_BAYER = {'GP': 'BGGR', 'IP': 'RGGB', 'S6': 'GRBG', 'N6': 'BGGR', 'G4': 'BGGR'}
TARGET_PATTERN = 'RGGB'


class Dataset(data.Dataset):
    def __init__(self, path, crop_size, is_train=True):
        super(Dataset, self).__init__()
        self.crop_size = crop_size
        self.is_train = is_train
        if self.is_train:
            self.unify_mode = 'crop'
        else:
            self.unify_mode = 'pad'
        # Collect the noisy raw files of every scene folder.
        self.file_lists = []
        folder_names = os.listdir(os.path.join(path, 'Data'))
        for folder_name in folder_names:
            scene_instance_number, scene_number, _ = meta_read(folder_name)
            for file_path in NOISY_PATH:
                self.file_lists.append(os.path.join(path, 'Data', folder_name, scene_instance_number + file_path))

    def __getitem__(self, index):
        noisy_path = self.file_lists[index]
        gt_path = noisy_path.replace('NOISY', 'GT')
        _, _, bayer_pattern = meta_read(noisy_path.split('/')[-2])
        # Load both images and convert them to the common Bayer pattern.
        noisy = h5py_loadmat(noisy_path)
        noisy = BayerUnifyAug.bayer_unify(noisy, bayer_pattern, TARGET_PATTERN, self.unify_mode)
        gt = h5py_loadmat(gt_path)
        gt = BayerUnifyAug.bayer_unify(gt, bayer_pattern, TARGET_PATTERN, self.unify_mode)
        if self.is_train:
            # Apply the same random augmentation to noisy and ground truth.
            augment = np.random.rand(3) > 0.5
            noisy = BayerUnifyAug.bayer_aug(noisy, augment[0], augment[1], augment[2], TARGET_PATTERN)
            gt = BayerUnifyAug.bayer_aug(gt, augment[0], augment[1], augment[2], TARGET_PATTERN)
        # Pack the Bayer mosaic into 4 channels (H/2 x W/2 x 4).
        noisy = pack_raw_np(noisy[:, :, None])
        gt = pack_raw_np(gt[:, :, None])
        if self.crop_size[0] != 0 and self.crop_size[1] != 0:
            # Random crop at the same location in both images.
            H, W, _ = noisy.shape
            rnd_h = random.randint(0, max(0, H - self.crop_size[0]))
            rnd_w = random.randint(0, max(0, W - self.crop_size[1]))
            noisy = noisy[rnd_h:rnd_h + self.crop_size[0], rnd_w:rnd_w + self.crop_size[1], :]
            gt = gt[rnd_h:rnd_h + self.crop_size[0], rnd_w:rnd_w + self.crop_size[1], :]
        noisy = torch.from_numpy(noisy.transpose(2, 0, 1))
        gt = torch.from_numpy(gt.transpose(2, 0, 1))
        return noisy, gt, bayer_pattern

    def __len__(self):
        return len(self.file_lists)


def meta_read(info):
    info = info.split('_')
    scene_instance_number = info[0]
    scene_number = info[1]
    smartphone_code = info[2]
    # ISO_level = info[3]
    # shutter_speed = info[4]
    # illuminant_temperature = info[5]
    # illuminant_brightness_code = info[6]
    return scene_instance_number, scene_number, MODEL_BAYER[smartphone_code]


def pack_raw_np(im):
    img_shape = im.shape
    H = img_shape[0]
    W = img_shape[1]
    # Channel order: R G G B
    out = np.concatenate((im[0:H:2, 0:W:2, :],
                          im[0:H:2, 1:W:2, :],
                          im[1:H:2, 0:W:2, :],
                          im[1:H:2, 1:W:2, :]), axis=2)
    return out


def h5py_loadmat(file_path: str):
    with h5py.File(file_path, 'r') as f:
        return np.array(f.get('x'), dtype=np.float32)
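For anyone debugging the packed-raw case above, it may help to compare against a bare-bones neighbor sub-sampler. The sketch below is an illustration under stated assumptions, not the repository's implementation: it works on a single channel stored as a list of rows, the names `ADJACENT_PAIRS` and `neighbor_subsample` are hypothetical, and for a packed 4-channel image one would presumably reuse the same per-cell choice for every channel.

```python
import random

# Ordered pairs of adjacent positions inside a 2x2 cell, where a position is
# indexed as 2 * row + col. The diagonal pairs (0, 3) and (1, 2) are excluded.
ADJACENT_PAIRS = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 3), (3, 1), (2, 3), (3, 2)]


def neighbor_subsample(img, rng):
    """Split one channel into two half-resolution images whose pixels are
    adjacent neighbors taken from the same 2x2 cell."""
    h, w = len(img), len(img[0])
    sub1, sub2 = [], []
    for r in range(0, h - 1, 2):
        row1, row2 = [], []
        for c in range(0, w - 1, 2):
            a, b = rng.choice(ADJACENT_PAIRS)  # independent choice per cell
            row1.append(img[r + a // 2][c + a % 2])
            row2.append(img[r + b // 2][c + b % 2])
        sub1.append(row1)
        sub2.append(row2)
    return sub1, sub2


# 4x4 toy channel whose value encodes the pixel position (row * 4 + col).
img = [[r * 4 + c for c in range(4)] for r in range(4)]
sub1, sub2 = neighbor_subsample(img, random.Random(0))
print(sub1)
print(sub2)
```

In the toy image, paired values differ by 1 (horizontal neighbors) or by 4 (vertical neighbors), which gives a quick sanity check that the two sub-images really come from adjacent pixels.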
Looking forward to reading the mathematical proof.
Hi, I am confused about the code in the training_cript.md:
with torch.no_grad():
    noisy_denoised = network(noisy)
It seems that disabling gradients here is not mentioned in the paper. Could you please explain the purpose of doing this?
When will your code be uploaded?
The paper proposes a novel method for self-supervised image denoising. However, I find some experimental results inaccurate due to a problem in the training code.
The problem is that clean images should not be used during training. In your current implementation, noise is randomly generated and added to the clean images in every epoch. In this way, you obtain different noisy images from the same clean image across epochs. In effect, the training scheme falls back to "noise2noise" across epochs.
The correct implementation is:
I repeated the experiment with the corrected implementation and found the following results on Gaussian (
| Dataset | PSNR | SSIM |
|---|---|---|
| Kodak | 31.81 | 0.8668 |
| BSD300 | 30.59 | 0.8610 |
| Set14 | 30.56 | 0.8408 |
(Using model parameters from training epoch 91)
The PSNR results are about 0.4-0.5 dB lower than the numbers reported in Table 1 of the paper.
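The difference between the two training schemes discussed in this issue can be illustrated with a small sketch. It is not the poster's corrected code (which is not shown here); the function names `fresh_noisy` and `fixed_noisy` and the idea of seeding by a hypothetical `image_id` are assumptions made for illustration. The point is only that a fixed scheme returns the same noisy realization in every epoch, while the criticized scheme does not.

```python
import random


def fresh_noisy(clean, sigma, rng):
    """Criticized scheme: a new noise sample is drawn on every call (every epoch)."""
    return [v + rng.gauss(0.0, sigma) for v in clean]


def fixed_noisy(clean, sigma, image_id):
    """Single-noisy-image scheme: the noise depends only on image_id,
    so every epoch sees the same noisy realization of this image."""
    rng = random.Random(image_id)
    return [v + rng.gauss(0.0, sigma) for v in clean]


clean = [10.0, 20.0, 30.0, 40.0]
shared_rng = random.Random(123)

epoch1_fresh = fresh_noisy(clean, 25.0, shared_rng)
epoch2_fresh = fresh_noisy(clean, 25.0, shared_rng)
epoch1_fixed = fixed_noisy(clean, 25.0, image_id=7)
epoch2_fixed = fixed_noisy(clean, 25.0, image_id=7)

print("fresh noise differs across epochs:", epoch1_fresh != epoch2_fresh)
print("fixed noise identical across epochs:", epoch1_fixed == epoch2_fixed)
```

An equivalent alternative is to generate the noisy training set once, save it to disk, and never touch the clean images during training.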
Thank you for recently open-sourcing the code.
I reproduced the experiment on SIDD raw-RGB, but it only achieves about 49.07 dB. I suspect I have some details wrong and hope to get your help.
Here is my code:
```python
for valid_name, valid_data in valid_dict.items():
    psnr_result = []
    ssim_result = []
    num_img, num_block, valid_noisy, valid_gt = valid_data
    for idx in range(num_img):
        for idy in range(num_block):
            im = valid_gt[idx, idy][:, :, np.newaxis]
            noisy_im = valid_noisy[idx, idy][:, :, np.newaxis]
            origin255 = im.copy() * 255.0
            origin255 = origin255.astype(np.uint8)
            noisy255 = noisy_im.copy() * 255.0
            noisy255 = noisy255.astype(np.uint8)
            # padding to square
            H = noisy_im.shape[0]
            W = noisy_im.shape[1]
            val_size = (max(H, W) + 31) // 32 * 32
            noisy_im = np.pad(
                noisy_im,
                [[0, val_size - H], [0, val_size - W], [0, 0]],
                'reflect')
            transformer = transforms.Compose([transforms.ToTensor()])
            noisy_im = transformer(noisy_im)
            noisy_im = torch.unsqueeze(noisy_im, 0)
            noisy_im = noisy_im.cuda()
            # pack raw data
            noisy_im = space_to_depth(noisy_im, block_size=2)
            with torch.no_grad():
                prediction = network(noisy_im)
                # unpack raw data
                prediction = depth_to_space(prediction, block_size=2)
                prediction = prediction[:, :, :H, :W]
            prediction = prediction.permute(0, 2, 3, 1)
            prediction = prediction.cpu().data.clamp(0, 1).numpy()
            prediction = prediction.squeeze(0)
            pred255 = np.clip(prediction * 255.0 + 0.5, 0,
                              255).astype(np.uint8)
            # calculate psnr
            cur_psnr = calculate_psnr(origin255.astype(np.float32),
                                      pred255.astype(np.float32))
            psnr_result.append(cur_psnr)
            cur_ssim = calculate_ssim(origin255.astype(np.float32),
                                      pred255.astype(np.float32))
            ssim_result.append(cur_ssim)
            # visualization
            save_path = os.path.join(
                validation_path,
                "{}_{:03d}-{:03d}-{:03d}_clean.png".format(
                    valid_name, idx, idy, epoch))
            Image.fromarray(origin255.squeeze()).save(save_path)
            save_path = os.path.join(
                validation_path,
                "{}_{:03d}-{:03d}-{:03d}_noisy.png".format(
                    valid_name, idx, idy, epoch))
            Image.fromarray(noisy255.squeeze()).save(save_path)
            save_path = os.path.join(
                validation_path,
                "{}_{:03d}-{:03d}-{:03d}_denoised.png".format(
                    valid_name, idx, idy, epoch))
            Image.fromarray(pred255.squeeze()).save(save_path)
    psnr_result = np.array(psnr_result)
    avg_psnr = np.mean(psnr_result)
    avg_ssim = np.mean(ssim_result)
    log_path = os.path.join(validation_path,
                            "A_log_{}.csv".format(valid_name))
    with open(log_path, "a") as f:
        f.writelines("{},{},{}\n".format(epoch, avg_psnr, avg_ssim))
```
It would be extremely helpful if training logs were provided (for any dataset). They would help others check whether custom training (i.e., versions modified by other people) is progressing as expected.
Why do you need to pad the image during validation to this size:
val_size = (max(H, W) + 31) // 32 * 32
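As a worked example of what this expression computes: it rounds max(H, W) up to the next multiple of 32. A common reason for such padding, stated here as an assumption rather than the authors' answer, is that a UNet with five 2x downsampling stages needs input sides divisible by 2^5 = 32 so that the upsampled features line up with the skip connections. The helper name `padded_size` is hypothetical.

```python
def padded_size(h, w, multiple=32):
    """Smallest multiple of `multiple` that is >= max(h, w)."""
    return (max(h, w) + multiple - 1) // multiple * multiple


for h, w in [(321, 481), (512, 512), (500, 500)]:
    size = padded_size(h, w)
    print((h, w), "->", size, "| divisible by 32:", size % 32 == 0)
```

An image that is already a multiple of 32 on its longer side is left at that size; anything else is padded up and the prediction is cropped back to H x W afterwards.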
Hello
Thank you for your amazing work!
According to your paper, gamma (in Section 4.2) is the most important parameter, but I don't see it in your code. Instead, the code has increase_ratio, which doesn't appear in your paper. What does increase_ratio mean?
And what do Lambda1 and Lambda2 mean? They don't appear in your paper either, and it seems that they only balance loss1 and loss2. Is that right?
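One plausible reading of these options, offered as an assumption and not as the authors' answer, is that the regularization weight ramps up linearly with the epoch until it reaches increase_ratio, while Lambda1 and Lambda2 are fixed weights on the two loss terms. The sketch below only illustrates that reading; the function names and the default values are hypothetical.

```python
def regularizer_weight(epoch, n_epoch, increase_ratio):
    """Assumed schedule: the weight grows linearly and reaches
    increase_ratio at the final epoch."""
    return epoch / n_epoch * increase_ratio


def total_loss(loss_rec, loss_reg, epoch, n_epoch,
               increase_ratio=2.0, lambda1=1.0, lambda2=1.0):
    """Assumed combination: lambda1 * reconstruction term
    plus lambda2 * ramped regularization term."""
    ramp = regularizer_weight(epoch, n_epoch, increase_ratio)
    return lambda1 * loss_rec + lambda2 * ramp * loss_reg


for epoch in (1, 50, 100):
    print(epoch,
          regularizer_weight(epoch, 100, 2.0),
          total_loss(0.5, 0.1, epoch, 100))
```

Under this reading, increase_ratio would play the role of the final value of the paper's gamma, with the regularizer contributing little early in training.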
Thank you in advance!
Hello, I don't seem to see the test code here. Can you upload it?
Hi, Tao:
Thanks for your great work.
I ran into some difficulties when trying to reproduce the results of Neighbor2Neighbor. Specifically, I strictly followed the modified UNet in https://github.com/NVlabs/selfsupervised-denoising and implemented the training code in PyTorch, but I can't reproduce the results in Table 1. For Gaussian noise (std: 5-50), the PSNR on BSD300, Set14, and KODAK is 31.18, 31.2, and 32.44, respectively. Since I strictly followed the experimental setup reported in Neighbor2Neighbor, I guess your architecture may differ slightly from the modified UNet mentioned above. If possible, I would like to know what the differences are, or any other relevant details.
Thanks for your excellent work.
I have run into a problem with Table 1 while reproducing your method. When σ∈[5, 50], is σ selected randomly from [5, 50] or fixed at 25 during evaluation?
The following is the error I get after following the steps and executing the training script. I haven't gone through the script yet; any clues as to what the issue might be?
Traceback (most recent call last):
  File "train.py", line 378, in <module>
    noisy_denoised = network(noisy)
  File "/home/abhigyan2/env1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/abhigyan2/flcp/Neighbor2Neighbor/arch_unet.py", line 206, in forward
    x = self.up5(x, pool4)
  File "/home/abhigyan2/env1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/abhigyan2/flcp/Neighbor2Neighbor/arch_unet.py", line 43, in forward
    x1 = self.deconv(x1)
  File "/home/abhigyan2/env1/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/abhigyan2/env1/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 916, in forward
    return F.conv_transpose2d(
TypeError: conv_transpose2d(): argument 'output_padding' (position 6) must be tuple of ints, not tuple