jwsoh / mzsr
Meta-Transfer Learning for Zero-Shot Super-Resolution (CVPR, 2020)
Could you please explain how these loss weights are calculated?
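def get_loss_weights(self):
    # Start with uniform weights over the TASK_ITER inner-loop steps.
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0 / self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value = 0.03 / self.TASK_ITER
    # Weights of all steps except the last decay linearly with global_step, floored at min_value.
    loss_weights_pre = tf.maximum(loss_weights[:-1] - (tf.multiply(tf.to_float(self.global_step), decay_rate)), min_value)
    # The last step's weight grows by the total amount removed from the others, capped so the sum stays 1.
    loss_weight_cur = tf.minimum(loss_weights[-1] + (tf.multiply(tf.to_float(self.global_step), (self.TASK_ITER - 1) * decay_rate)), 1.0 - ((self.TASK_ITER - 1) * min_value))
    # The result has shape [1, TASK_ITER].
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights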
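The schedule in get_loss_weights can be traced outside TensorFlow. The following is a minimal NumPy sketch of the same arithmetic, not the repository's code: `loss_weights_at` is a hypothetical helper name, and it returns a flat vector, whereas the TensorFlow version returns shape [1, TASK_ITER]. All weights start at 1/TASK_ITER, the earlier inner-loop steps decay linearly to a floor of 0.03/TASK_ITER, and the last step absorbs the difference so the weights always sum to 1.

```python
import numpy as np

def loss_weights_at(task_iter: int, global_step: int) -> np.ndarray:
    """Per-step loss weights at a given training step (hypothetical helper)."""
    uniform = 1.0 / task_iter
    decay_rate = 1.0 / task_iter / (10000 / 3)
    min_value = 0.03 / task_iter
    # Every step except the last loses weight linearly, down to min_value.
    pre = np.maximum(np.full(task_iter - 1, uniform) - global_step * decay_rate, min_value)
    # The last step gains what the others lose, up to the matching cap.
    cur = min(uniform + global_step * (task_iter - 1) * decay_rate,
              1.0 - (task_iter - 1) * min_value)
    return np.concatenate([pre, [cur]])

for step in (0, 1000, 10000):
    print(step, loss_weights_at(5, step))
```

Both clamps become active at the same point, when global_step exceeds 0.97 * 10000 / 3 (about 3233 steps), after which the weights stop changing.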
Hi,
While trying to set up TensorBoard, I am unable to create the event log file. There seems to be an issue with tf.summary.FileWriter(). It only happens on one particular computer, and I could not find a related solution online. Could you suggest how to fix it?
Error message:
T:\src\github\tensorflow\tensorflow\core\util\events_writer.cc:104] Write failed because file could not be opened.
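One thing that may be worth ruling out for a "file could not be opened" message is the log directory itself. The sketch below is a generic standard-library check, not part of the repository: `ensure_writable_dir` is a hypothetical helper. It creates the directory if needed and confirms that a file can actually be created there before the path is handed to tf.summary.FileWriter.

```python
import os
import tempfile

def ensure_writable_dir(path: str) -> str:
    """Create `path` if it is missing and verify that a file can be written inside it."""
    os.makedirs(path, exist_ok=True)
    # Creating a temporary file raises OSError if the directory is not writable.
    # The file is deleted again when the context manager exits.
    with tempfile.NamedTemporaryFile(dir=path, prefix="write_test_"):
        pass
    return os.path.abspath(path)
```

Calling it on the intended log directory before training starts turns a silent write failure into an immediate OSError that names the offending path.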
(base) wit@wit:/media/wit/Data1/WH/MZSR-master-new/Large-Scale_Training$ python3 main.py --gpu 0 --trial 2 --step 0
Initialize Training
Build Model MODEL
Initialize weights MODEL
Setting Train Configuration
Model Params: 225 K
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
Training Starts
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_STRING], dense_keys=["image", "label"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?,16,16,3]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 49, in <module>
main()
File "main.py", line 46, in main
Trainer.run()
File "/media/wit/Data1/WH/MZSR-master-new/Large-Scale_Training/train.py", line 88, in run
label_train_, input_train_ = sess.run([label_train, input_train])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_STRING], dense_keys=["image", "label"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3], [?,16,16,3]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Hello, this method is really great. Can you tell me when the training code will be released?
My question is: would training the model on a dataset more specific to the use case be a better choice than training on a general dataset such as DIV2K?
Is it possible to set up distributed training (running on multiple GPUs)?
When I set parser.add_argument('--gpu', type=str, dest='gpu', default='0,1'), it still runs on only one GPU.
Thank you
Thank you for your outstanding work. When I run the training steps, the X2 and X4 models are not saved. Can you help me?
I ran:
python main.py --train --gpu 0 --trial 1 --step 0
The output contains only a log1 file. Where are the models?
Hi,
On your GitHub page you said to "Remove 'label' key in 'write_to_tfrecord()' function" and to save the ground-truth patches. However, the label key corresponds to the ground truth. Is this wrong?
I followed "Remove all contents regarding low-resolution images in the 'generate_TFRecord()' function" as you said, and kept the label key.
Then I got the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING], dense_keys=["image"], dense_shapes=[[]], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,64,64,3]], output_types=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Does anyone have any ideas?
Dear Sir,
Amazing work! Congratulations!
I have a question. Could you provide the full checkpoint path I should set for the model trained in large-scale training, so that I can use it as the pre-trained model for meta-transfer training?
I look forward to your reply.
Thanks in advance.
Thanks for your great work!
I want to use my own dataset for testing, but I get errors due to the different image resolutions. How should I change the model parameters to handle a different input image resolution?
Thanks for your work. I found a problem when reproducing the results in the paper with the pre-trained models you provided. With the g^d_2.0 setting, my result matches the paper (using kernel.mat and the demo test command line). But when I simply replaced kernel.mat with the kernel for g^d_0.2, the performance dropped a lot: the PSNR is only 30.185 even with 10 iterations, whereas the paper reports 33.74 for MZSR (10). What is wrong with my test process? Could it be that the model tested for g^d_2.0 and the one for g^d_0.2 are not the same model?
Thanks for the nice work and the very good documentation.
I have some real low-resolution images without any ground truth, and I wanted to test your state-of-the-art MZSR model on them. I noticed that for testing, as mentioned in the README, I need "Ready for the input data (low-resolution) and corresponding kernel (kernel.mat file.)". I could not find any information in either the paper or the repo on how to obtain the kernel file. Could you please point me to the code for that? Did you use a method from another paper to compute the kernel? Thank you.
==================== PRETRAINED MODEL Loading Succeeded ====================
==================== Reading Checkpoints ====================
=================== Fail to find a Checkpoint ====================
==================== No model to load ======================================
[*] Training Starts
Traceback (most recent call last):
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
return fn(*args)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node IteratorGetNext}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "G:/class/papper/MZSR-master/main.py", line 67, in <module>
main()
File "G:/class/papper/MZSR-master/main.py", line 26, in main
Trainer()
File "G:\class\papper\MZSR-master\train.py", line 170, in __call__
inputa, labela, inputb, labelb = self.data_generator.make_data_tensor(sess, self.scale_list, noise_std=0.0)
File "G:\class\papper\MZSR-master\dataGenerator.py", line 21, in make_data_tensor
label_train_=sess.run(self.label_train)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[node IteratorGetNext (defined at G:\class\papper\MZSR-master\dataGenerator.py:82) ]]
Errors may have originated from an input operation.
Input Source operations connected to node IteratorGetNext:
OneShotIterator (defined at G:\class\papper\MZSR-master\dataGenerator.py:80)
Original stack trace for 'IteratorGetNext':
File "G:/class/papper/MZSR-master/main.py", line 67, in <module>
main()
File "G:/class/papper/MZSR-master/main.py", line 20, in main
task_batch_size=TASK_BATCH_SIZE,tfrecord_path=TFRECORD_PATH)
File "G:\class\papper\MZSR-master\dataGenerator.py", line 16, in __init__
self.label_train = self.load_tfrecord()
File "G:\class\papper\MZSR-master\dataGenerator.py", line 82, in load_tfrecord
label_train = iterator.get_next()
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 426, in get_next
output_shapes=self._structure._flat_shapes, name=name)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 1973, in iterator_get_next
output_shapes=output_shapes, name=name)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I ran your code and trained a model, but its results on the Set5 dataset differ from those in the paper. What is the reason for this?
Command:
python main.py --gpu 0 --inputpath Input/g20/Set5/ --gtpath GT/Set5/ --savepath results/Set5 --kernelpath Input/g20/kernel.mat --model 0
My result after 100,000 iterations:
[*] Average PSNR ** Initial: 14.9049, Final : 34.2266
Your result:
[*] Average PSNR ** Initial: 15.6594, Final : 35.1986
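When comparing numbers like these, it helps to be sure both sides compute PSNR the same way. Below is a minimal sketch of the standard definition, assuming float images scaled to [0, 1]. The repository's evaluation may differ in details such as colour space or border cropping, which this sketch does not reproduce; the name `psnr` and its `peak` parameter are illustrative.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A gap of about 1 dB, as in the two results above, corresponds to roughly a 26% difference in mean squared error, so it is worth checking that the same value range and the same cropping are used on both sides before comparing.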
Hi, nice work. Do you plan to share your training code?
Thanks for your great work!
I have read the paper, but I cannot find the range of lambda used when you synthesize the isotropic Gaussian blur kernels for training.
I set the sigma to np.asarray([[lamda, 0],[0, lamda]]) to synthesize the isotropic Gaussian blur kernel. Is that right?
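For reference, here is a minimal sketch of building a normalized Gaussian blur kernel from a 2x2 matrix, under the assumption that the matrix is used as the covariance of the Gaussian (so its diagonal entries are variances, that is, squared widths). Whether `lamda` in the question is meant as a variance or a standard deviation depends on the repository's kernel code, which is not reproduced here; `gaussian_kernel` and the 15x15 size are illustrative choices.

```python
import numpy as np

def gaussian_kernel(cov: np.ndarray, size: int = 15) -> np.ndarray:
    """Normalized 2-D Gaussian kernel with covariance matrix `cov` (assumed convention)."""
    inv_cov = np.linalg.inv(cov)
    half = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    # Pixel coordinates relative to the kernel centre, shape (size, size, 2).
    coords = np.stack([xs - half, ys - half], axis=-1)
    # Quadratic form z^T cov^{-1} z evaluated at every pixel.
    quad = np.einsum("...i,ij,...j->...", coords, inv_cov, coords)
    kernel = np.exp(-0.5 * quad)
    return kernel / kernel.sum()

lamda = 2.0  # illustrative value only
isotropic = gaussian_kernel(np.array([[lamda, 0.0], [0.0, lamda]]))
```

With equal diagonal entries and zero off-diagonal entries, the kernel is rotationally symmetric, which is what "isotropic" means here. Unequal eigenvalues or a rotated covariance would give an anisotropic kernel.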
Thank you for your admirable and meaningful work!
My question is why the downsampling operator is implemented by a model rather than an algorithm in the meta-test step. I only found the models for bicubic x2, direct x2, direct x4, and multi-scale, and I would like to know how they are implemented. Could you please upload the code? Thank you very much!
Nice job solving the time problem of ZSSR!
I notice that the input includes a kernel. The kernel is important for the SR problem, but you do not provide any code for obtaining it. How do you get the kernel?
Does the experimental environment have to be "CUDA 9.0 & cuDNN 7.1"? I only have a "CUDA 10.0 & cuDNN 7.4" environment.
Thank you for sharing the code. I ran into the following problem. Do you know how to solve it? Thanks.
==================== PRETRAINED MODEL Loading Succeeded ====================
==================== Reading Checkpoints ====================
=================== Fail to find a Checkpoint ====================
==================== No model to load ======================================
[*] Training Starts
Traceback (most recent call last):
File "D:/2020/ReferenceCode/MZSR-master/main.py", line 71, in <module>
main()
File "D:/2020/ReferenceCode/MZSR-master/main.py", line 30, in main
Trainer()
File "D:\2020\ReferenceCode\MZSR-master\train.py", line 171, in __call__
inputa, labela, inputb, labelb = self.data_generator.make_data_tensor(sess, self.scale_list, noise_std=0.0)
File "D:\2020\ReferenceCode\MZSR-master\dataGenerator.py", line 19, in make_data_tensor
label_train_=sess.run(self.label_train)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 958, in run
run_metadata_ptr)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1181, in _run
feed_dict_tensor, options, run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1359, in _do_run
run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 86: invalid continuation byte
Process finished with exit code 1
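A UnicodeDecodeError raised from inside sess.run on Windows can sometimes mean that an underlying error message or file path contains bytes that are not valid UTF-8, for example when a path includes non-ASCII characters or the file is missing. That is only a guess at the cause here. The sketch below is a generic standard-library check, with `check_data_path` as a hypothetical helper, that reports whether a dataset path exists and is pure ASCII before training starts.

```python
import os

def check_data_path(path: str) -> list:
    """Return a list of human-readable problems found with a dataset path."""
    problems = []
    if not os.path.isfile(path):
        problems.append("file does not exist: " + path)
    # Non-ASCII characters in paths are a frequent source of encoding errors on Windows.
    if not all(ord(ch) < 128 for ch in path):
        problems.append("path contains non-ASCII characters: " + path)
    return problems
```

An empty list means neither problem was detected; it does not prove the path is the cause of the error above.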
What is the purpose of SECOND_ORDER_GRAD_ITER=0 and self.total_loss1?
As for SECOND_ORDER_GRAD_ITER=0:
if step == SECOND_ORDER_GRAD_ITER:
    second_grad=sess.run(self.second_grad_on)
If we have finished pre-training on large scale datasets, I think it is useless in this meta-transfer learning step.
As for self.total_loss1:
self.total_loss1 = tf.reduce_sum(self.lossesa) / tf.to_float(self.META_BATCH_SIZE)
self.pretrain_op = tf.train.AdamOptimizer(self.META_LR).minimize(self.total_loss1)
self.gvs = self.opt.compute_gradients(self.weighted_total_losses2)
self.metatrain_op= self.opt.apply_gradients(self.gvs)
sess.run(self.metatrain_op, feed_dict=feed_dict)
In this meta-transfer learning step, total_loss1 is never used by an optimizer. Is that correct?
Dear Sir,
Amazing work! Congratulations!
I have a problem when training, as follows:
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
Did I do something wrong in the "Generate TFRecord Dataset" step?
Hi, thanks for your meaningful work.
I want to ask which model I will get if I do the following.
Dataset: train_MZSR.tfrecord (7.09GB)
Command: python main.py --train --gpu 0 --trial 0 --step 0
I trained for 27522 iterations, but when I test the trained model on Input/g20/Set5/, I get PSNR=33.6947 and SSIM=0.9265.
Can you give me some suggestions?
Hello. Is there any way to use MZSR without CUDA and cuDNN?
Thanks.
Hello, thank you very much for sharing your codes.
During large-scale training, I came across this error. Could you kindly let me know where I went wrong?
The error output is below.
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
WARNING:tensorflow:From /content/MZSR/Large-Scale_Training/train.py:74: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
Training Starts
Traceback (most recent call last):
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __inference_Dataset_map_Train._parse_function_673}} Feature: image (data type: string) is required but could not be found.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[IteratorGetNext]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 51, in <module>
main()
File "main.py", line 48, in main
Trainer.run()
File "/content/MZSR/Large-Scale_Training/train.py", line 87, in run
label_train_, input_train_ = sess.run([label_train, input_train])
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[IteratorGetNext]]
My question is that I have images where I need to perform super resolution, but do not have the high-resolution ground truth for them. How can I perform inference on these images?
It seems that the PSNR and SSIM results of “bicubic” downsampling scenario in the paper cannot be obtained using your current released model. Could you please upload the LR images for “bicubic” downsampling test and code for bicubic downsampling?
In addition, when training the bicubicx2 model, the PSNR/SSIM results obtained by using the existing training code are quite different from those mentioned in the paper. Should I adjust some parameters corresponding to the bicubic downsampling scenario?
First, error message:
Traceback (most recent call last):
File ".../Large-Scale_Training/main.py", line 49, in <module>
main()
File ".../Large-Scale_Training/main.py", line 46, in main
Trainer.run()
File "...\Large-Scale_Training\train.py", line 39, in run
self.calc_loss()
File "...\Large-Scale_Training\train.py", line 33, in calc_loss
self.loss=tf.losses.absolute_difference(self.MODEL.output , self.label)
File "...\lib\site-packages\tensorflow\python\ops\losses\losses_impl.py", line 271, in absolute_difference
predictions.get_shape().assert_is_compatible_with(labels.get_shape())
File "...\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 844, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (?, 96, 96, 3) and (?, 48, 48, 3) are incompatible
The error seems to arise because the model, with its current parameter settings, does not upscale the patches read from the dataset: its output keeps the input size, so the dimensions do not match when the L1 loss between the model output and the ground truth is computed.
So, to reproduce the experimental results reported in the paper, how should the data be preprocessed for Large-Scale_Training?
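If the network keeps the spatial size of its input, as the shapes in the traceback suggest, one common arrangement is to upsample the low-resolution patch to the ground-truth size before it enters the network, so that the output and label shapes agree. The sketch below illustrates only the shape bookkeeping with NumPy. It uses nearest-neighbour repetition as a stand-in for a proper bicubic resize, and `upsample_nearest` is a hypothetical helper, not a function from the repository.

```python
import numpy as np

def upsample_nearest(batch: np.ndarray, scale: int) -> np.ndarray:
    """Upsample an NHWC batch by an integer factor using pixel repetition."""
    return batch.repeat(scale, axis=1).repeat(scale, axis=2)

# Shapes taken from the error message: 48x48 patches versus 96x96 labels.
low_res = np.zeros((4, 48, 48, 3), dtype=np.float32)
label = np.zeros((4, 96, 96, 3), dtype=np.float32)

net_input = upsample_nearest(low_res, scale=2)
# A size-preserving network would now produce 96x96 outputs,
# so an L1 loss against the label is well defined.
l1 = np.mean(np.abs(net_input - label))
print(net_input.shape, l1)
```

Whether the upsampling belongs in the TFRecord generation or in the input pipeline is a question for the repository's own preprocessing, which this sketch does not settle.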
Hi, thank you for sharing your wonderful work.
When running large-scale training, I encountered this error:
Traceback (most recent call last):
File "main.py", line 49, in <module>
main()
File "main.py", line 45, in main
scale=SCALE,num_of_data=NUM_OF_DATA, conf=conf)
TypeError: __init__() missing 1 required positional argument: 'model_num'
It will be very helpful for me if you could help me out with this.
Thank you
Hi, thanks for your work.
I noticed that in the paper you mention using both isotropic and anisotropic Gaussian kernels as blur kernels, while in the code I found that only random anisotropic Gaussian kernels are generated.
Maybe I have misunderstood something. Could you give me some guidance on this?
Thanks.
Hi, thanks for your work.
I wonder what the difference is between [inputa, labela] and [inputb, labelb]?
I am facing a problem when I load the pre-trained model, specifically when it reads the checkpoint.
This is the error. Could you kindly tell me how to solve it?
NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]
Hi, I only found the pre-trained model, but I would like to see the code for this part.