
mzsr's Issues

I do not understand how the loss weights are calculated

Could you please explain how these loss weights are calculated?

def get_loss_weights(self):
    loss_weights = tf.ones(shape=[self.TASK_ITER]) * (1.0 / self.TASK_ITER)
    decay_rate = 1.0 / self.TASK_ITER / (10000 / 3)
    min_value = 0.03 / self.TASK_ITER

    loss_weights_pre = tf.maximum(loss_weights[:-1] - (tf.multiply(tf.to_float(self.global_step), decay_rate)), min_value)

    loss_weight_cur = tf.minimum(loss_weights[-1] + (tf.multiply(tf.to_float(self.global_step), (self.TASK_ITER - 1) * decay_rate)), 1.0 - ((self.TASK_ITER - 1) * min_value))
    loss_weights = tf.concat([[loss_weights_pre], [[loss_weight_cur]]], axis=1)
    return loss_weights
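
One way to read the function above is to evaluate the same arithmetic outside the TensorFlow graph. The sketch below is a NumPy re-implementation for illustration, not code from the repository: the name `loss_weights_at` and the value `task_iter=5` are assumptions, and it returns a 1-D array rather than the `[1, TASK_ITER]` tensor built by `tf.concat`.

```python
import numpy as np

def loss_weights_at(step, task_iter):
    """NumPy mirror of get_loss_weights for a given global step (illustrative)."""
    base = 1.0 / task_iter                       # uniform starting weight
    decay_rate = 1.0 / task_iter / (10000 / 3)   # per-step decrease of each earlier weight
    min_value = 0.03 / task_iter                 # floor for the earlier weights

    # Weights of all inner iterations except the last shrink linearly to min_value.
    pre = np.maximum(np.full(task_iter - 1, base) - step * decay_rate, min_value)
    # The last iteration's weight grows by exactly what the others lose.
    cur = min(base + step * (task_iter - 1) * decay_rate,
              1.0 - (task_iter - 1) * min_value)
    return np.concatenate([pre, [cur]])

for step in (0, 1000, 5000):
    w = loss_weights_at(step, task_iter=5)  # task_iter=5 is an assumed value
    print(step, np.round(w, 4), round(float(w.sum()), 6))
```

Under this reading, the weights start uniform at 1/TASK_ITER, always sum to 1, and gradually shift toward the loss of the last inner iteration. They stop changing after roughly 3,233 steps (0.97 × 10000/3), at which point every earlier iteration has weight 0.03/TASK_ITER.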

Unable to create event file

Hi

While trying to set up TensorBoard, I am unable to create the event log file. There seems to be an issue with tf.summary.FileWriter(). The issue only happens on one particular computer. I could not find any related solution online. Could you suggest how to fix it?

Error message
T:\src\github\tensorflow\tensorflow\core\util\events_writer.cc:104] Write failed because file could not be opened.
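
This message is emitted when the event writer cannot open its output file. Possible causes include a log directory that does not exist or is not writable; the actual cause on this machine is not known from the report. The sketch below only tests those two conditions with the standard library. `check_logdir` and the `mzsr_logs` directory name are hypothetical and not part of TensorFlow or the repository.

```python
import os
import tempfile

def check_logdir(logdir):
    """Return True if logdir exists (or can be created) and a file can be written in it."""
    try:
        os.makedirs(logdir, exist_ok=True)
        # Create and delete a throwaway file to confirm write permission.
        with tempfile.NamedTemporaryFile(dir=logdir, prefix="probe_", delete=True):
            pass
        return True
    except OSError as err:
        print("Cannot write to %r: %s" % (logdir, err))
        return False

if __name__ == "__main__":
    # Hypothetical log directory under the system temp folder.
    print(check_logdir(os.path.join(tempfile.gettempdir(), "mzsr_logs")))
```

If the check fails for the directory passed to tf.summary.FileWriter(), trying a short, plain-ASCII path on a local drive is a cheap next experiment.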


When I ran the large-scale training code, I had some problems. Could you help me?

(base) wit@wit:/media/wit/Data1/WH/MZSR-master-new/Large-Scale_Training$ python3 main.py --gpu 0 --trial 2 --step 0
Initialize Training
Build Model MODEL
Initialize weights MODEL
Setting Train Configuration
Model Params: 225 K
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
Training Starts
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_STRING], dense_keys=["image", "label"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3], [?,16,16,3]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 49, in <module>
main()
File "main.py", line 46, in main
Trainer.run()
File "/media/wit/Data1/WH/MZSR-master-new/Large-Scale_Training/train.py", line 88, in run
label_train_, input_train_ = sess.run([label_train, input_train])
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_STRING], dense_keys=["image", "label"], dense_shapes=[[], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3], [?,16,16,3]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

about the training code

Hello, this method is really great. Can you tell me when the training code will be released?

distributed training

Is it possible to set up distributed training (running on multiple GPUs)?
When I set parser.add_argument('--gpu', type=str, dest='gpu', default='0,1') it still runs on one GPU
Thank you
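
Assuming the `--gpu` flag is only forwarded to the `CUDA_VISIBLE_DEVICES` environment variable (an assumption about the repository's main.py, not confirmed here), passing `0,1` makes both devices visible to TensorFlow but does not split the work. In TensorFlow 1.x graph mode, ops are placed on the first visible GPU unless the graph is built with explicit per-device placement. The sketch below shows only the visibility part; `set_visible_gpus` is a hypothetical helper.

```python
import os

def set_visible_gpus(gpu_arg):
    """Expose the listed GPU ids to CUDA and return them as a list of strings."""
    ids = [g.strip() for g in gpu_arg.split(",") if g.strip()]
    # Must be set before TensorFlow initialises CUDA.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(ids)
    return ids

gpus = set_visible_gpus("0,1")
print("visible GPUs:", gpus)
# Inside the process the visible devices are renumbered /gpu:0, /gpu:1, ...
# Using both for one training run still requires building the graph per device
# and combining the gradients, which this sketch does not do.
```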

train the model

Thank you for your outstanding work. When I run the training steps, the X2 and X4 models are not saved. Can you help me?
I ran:
python main.py --train --gpu 0 --trial 1 --step 0
The result is only a log1 file. Where are the models?

How is the TFRecord generated?

Hi,

In your GitHub you said to "Remove 'label' key in 'write_to_tfrecord()' function." and to save the ground-truth patches. However, the label key is related to the ground truth. Is this wrong?

As you said, I removed all contents regarding low-resolution images in the 'generate_TFRecord()' function and kept the label key.

Then I got the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING], dense_keys=["image"], dense_shapes=[[]], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const)]]
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,64,64,3]], output_types=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Does anyone have any ideas?

Reproduction with the given model

Thanks for your work. I found a problem when reproducing the results in the paper using the pre-trained models you provided. The result for the g^d_2.0 setting matches the paper (using the kernel.mat and the demo test command line). But when I replaced kernel.mat with the kernel for g^d_0.2, the performance dropped a lot: the PSNR is only 30.185 even with 10 iterations, whereas the paper reports 33.74 (MZSR(10)). What is wrong with my test process? Could it be that the model tested for g^d_2.0 and the one for g^d_0.2 are not the same model?

How to get kernel for the real image dataset?

Thanks for the nice work and the very good documentation.

I have some real low-resolution images without any ground truth, and I wanted to test your state-of-the-art MZSR model on them. I noticed that for testing, as mentioned in the readme file, I need "Ready for the input data (low-resolution) and corresponding kernel (kernel.mat file.)". I couldn't find any information in either the paper or the repo on how to get the kernel file. Could you please let me know which code to use for that? Did you use another paper's method to compute the kernel? Thank you.

Hello, there is a problem when I run your code, could you please help me to solve it?

==================== PRETRAINED MODEL Loading Succeeded ====================
==================== Reading Checkpoints ====================
=================== Fail to find a Checkpoint ====================
==================== No model to load ======================================

[*] Training Starts
Traceback (most recent call last):
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
return fn(*args)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node IteratorGetNext}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "G:/class/papper/MZSR-master/main.py", line 67, in <module>
main()
File "G:/class/papper/MZSR-master/main.py", line 26, in main
Trainer()
File "G:\class\papper\MZSR-master\train.py", line 170, in __call__
inputa, labela, inputb, labelb = self.data_generator.make_data_tensor(sess, self.scale_list, noise_std=0.0)
File "G:\class\papper\MZSR-master\dataGenerator.py", line 21, in make_data_tensor
label_train_=sess.run(self.label_train)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[node IteratorGetNext (defined at G:\class\papper\MZSR-master\dataGenerator.py:82) ]]

Errors may have originated from an input operation.
Input Source operations connected to node IteratorGetNext:
OneShotIterator (defined at G:\class\papper\MZSR-master\dataGenerator.py:80)

Original stack trace for 'IteratorGetNext':
File "G:/class/papper/MZSR-master/main.py", line 67, in <module>
main()
File "G:/class/papper/MZSR-master/main.py", line 20, in main
task_batch_size=TASK_BATCH_SIZE,tfrecord_path=TFRECORD_PATH)
File "G:\class\papper\MZSR-master\dataGenerator.py", line 16, in __init__
self.label_train = self.load_tfrecord()
File "G:\class\papper\MZSR-master\dataGenerator.py", line 82, in load_tfrecord
label_train = iterator.get_next()
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 426, in get_next
output_shapes=self._structure._flat_shapes, name=name)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 1973, in iterator_get_next
output_shapes=output_shapes, name=name)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "D:\learningtool\python\1\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()

Model results

I ran your code and trained a model, but its performance on the Set5 dataset differs from that reported in the paper. What is the reason for this?
Command used:
python main.py --gpu 0 --inputpath Input/g20/Set5/ --gtpath GT/Set5/ --savepath results/Set5 --kernelpath Input/g20/kernel.mat --model 0
The result after I trained for 100,000 iterations:
[*] Average PSNR ** Initial: 14.9049, Final : 34.2266
Your result:
[*] Average PSNR ** Initial: 15.6594, Final : 35.1986

Training code

Hi, nice job. Do you plan to share your training code?

About the range of lambda of isotropic Gaussian blur kernel

Thanks for your great job!
I have read the paper, but I cannot find the range of lambda used when you synthesize the isotropic Gaussian blur kernel for training.
I set the sigma to np.asarray([[lamda, 0],[0, lamda]]) to synthesize the isotropic Gaussian blur kernel. Is that right?
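
For reference, a 2-D Gaussian kernel can be built from a 2x2 covariance matrix as sketched below. This is a generic NumPy example rather than the repository's kernel generator: `gaussian_kernel`, the kernel size of 15, the value of `lamda`, and treating `lamda` as a variance (not a standard deviation) are all assumptions. A diagonal matrix with equal entries gives an isotropic kernel.

```python
import numpy as np

def gaussian_kernel(cov, size=15):
    """Normalised size x size Gaussian kernel with zero mean and 2x2 covariance cov."""
    half = (size - 1) / 2.0
    coords = np.arange(size) - half
    xx, yy = np.meshgrid(coords, coords)
    pts = np.stack([xx, yy], axis=-1)        # (size, size, 2) offsets from the centre
    inv_cov = np.linalg.inv(cov)
    # Quadratic form p^T cov^{-1} p for every pixel offset p.
    expo = np.einsum("...i,ij,...j->...", pts, inv_cov, pts)
    kernel = np.exp(-0.5 * expo)
    return kernel / kernel.sum()             # normalise so the kernel sums to 1

lamda = 2.0  # assumed value, for illustration only
k = gaussian_kernel(np.asarray([[lamda, 0.0], [0.0, lamda]]))
# An isotropic kernel is unchanged by transposition and by 90-degree rotation.
print(k.shape, np.allclose(k, k.T), np.allclose(k, np.rot90(k)))
```

Giving the two diagonal entries different values, or rotating the matrix, would produce an anisotropic kernel with the same function.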

How can you get the kernel?

Nice job solving the run-time problem of ZSSR!
I notice that the input includes a kernel. The kernel is important for the SR problem, but you do not provide any code to obtain it. How can I get the kernel?

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 86: invalid continuation byte

Thank you for sharing the code. I have the following problem. Do you know how to solve it? Thanks.

==================== PRETRAINED MODEL Loading Succeeded ====================
==================== Reading Checkpoints ====================
=================== Fail to find a Checkpoint ====================
==================== No model to load ======================================
[*] Training Starts
Traceback (most recent call last):
File "D:/2020/ReferenceCode/MZSR-master/main.py", line 71, in <module>
main()
File "D:/2020/ReferenceCode/MZSR-master/main.py", line 30, in main
Trainer()
File "D:\2020\ReferenceCode\MZSR-master\train.py", line 171, in __call__
inputa, labela, inputb, labelb = self.data_generator.make_data_tensor(sess, self.scale_list, noise_std=0.0)
File "D:\2020\ReferenceCode\MZSR-master\dataGenerator.py", line 19, in make_data_tensor
label_train_=sess.run(self.label_train)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 958, in run
run_metadata_ptr)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1181, in _run
feed_dict_tensor, options, run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1359, in _do_run
run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "G:\Anaconda\envs\tensorflow2.3-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 86: invalid continuation byte

Process finished with exit code 1

What is the usage of SECOND_ORDER_GRAD_ITER=0 and self.total_loss1?

What is the usage of SECOND_ORDER_GRAD_ITER=0 and self.total_loss1?
As for SECOND_ORDER_GRAD_ITER=0:

if step == SECOND_ORDER_GRAD_ITER:
    second_grad = sess.run(self.second_grad_on)

If we have finished pre-training on large scale datasets, I think it is useless in this meta-transfer learning step.
As for self.total_loss1:

self.total_loss1 = tf.reduce_sum(self.lossesa) / tf.to_float(self.META_BATCH_SIZE)
self.pretrain_op = tf.train.AdamOptimizer(self.META_LR).minimize(self.total_loss1)
        
self.gvs = self.opt.compute_gradients(self.weighted_total_losses2)
self.metatrain_op= self.opt.apply_gradients(self.gvs)

sess.run(self.metatrain_op, feed_dict=feed_dict)

In this meta-transfer learning step, total_loss1 is never used for optimizers. Is it correct?

Sir, I have a problem when training

Dear Sir,
Amazing work! Congratulations!
I have a problem when training, as follows:
========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========

Did I do something wrong in the Generate TFRecord Dataset step?

Which model will I get if I run python main.py --train

Hi, thanks for your meaningful work.
I want to ask which model I will have if I do the following:

  1. I create the training set using MainSR to get train_MZSR.tfrecord (7.09GB);
  2. I run python main.py --train --gpu 0 --trial 0 --step 0.

I trained for 27522 iters, but when I test the trained model on Input/g20/Set5/, I get PSNR=33.6947 and SSIM=0.9265.

Can you give me some suggestions?

Use MZSR without CUDA?

Hello. Is there any possible method of utilizing MZSR without CUDA and cuDNN?

Thanks.

Error during large scale training

Hello, thank you very much for sharing your codes.

During large-scale training, I came across this error. Could you kindly let me know where I went wrong?
The error code is below.

========== Reading Checkpoints ============
============= Failed to find a checkpoint =============
========== No model to load ===========
WARNING:tensorflow:From /content/MZSR/Large-Scale_Training/train.py:74: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

Training Starts
Traceback (most recent call last):
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __inference_Dataset_map_Train._parse_function_673}} Feature: image (data type: string) is required but could not be found.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[IteratorGetNext]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 51, in <module>
main()
File "main.py", line 48, in main
Trainer.run()
File "/content/MZSR/Large-Scale_Training/train.py", line 87, in run
label_train_, input_train_ = sess.run([label_train, input_train])
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Feature: image (data type: string) is required but could not be found.
[[{{node ParseSingleExample/ParseSingleExample}}]]
[[IteratorGetNext]]

During reproducing “bicubic” downsampling scenario...

It seems that the PSNR and SSIM results for the “bicubic” downsampling scenario in the paper cannot be obtained with your currently released model. Could you please upload the LR images for the “bicubic” downsampling test and the code for bicubic downsampling?
In addition, when training the bicubic x2 model, the PSNR/SSIM results obtained with the existing training code are quite different from those reported in the paper. Should I adjust some parameters for the bicubic downsampling scenario?

Variable dimensions are incompatible while calculating l1_loss (during Large-Scale_Training)

First, the error message:
Traceback (most recent call last):
File ".../Large-Scale_Training/main.py", line 49, in <module>
main()
File ".../Large-Scale_Training/main.py", line 46, in main
Trainer.run()
File "...\Large-Scale_Training\train.py", line 39, in run
self.calc_loss()
File "...\Large-Scale_Training\train.py", line 33, in calc_loss
self.loss=tf.losses.absolute_difference(self.MODEL.output , self.label)
File "...\lib\site-packages\tensorflow\python\ops\losses\losses_impl.py", line 271, in absolute_difference
predictions.get_shape().assert_is_compatible_with(labels.get_shape())
File "...\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 844, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (?, 96, 96, 3) and (?, 48, 48, 3) are incompatible

The error arises because, with the model's own parameter settings, the patch read from the dataset is not resized to the size required by the scale factor but keeps its original size. This leads to a dimension mismatch when computing the l1_loss between the model output and the ground truth.
So I am curious: to obtain the experimental results reported in the paper, how should the data be preprocessed in Large-Scale_Training?

Error during Large Scale Training

Hi, thank you for sharing your wonderful work.
When running large scale training, I encountered this error

Traceback (most recent call last):
File "main.py", line 49, in <module>
main()
File "main.py", line 45, in main
scale=SCALE,num_of_data=NUM_OF_DATA, conf=conf)
TypeError: __init__() missing 1 required positional argument: 'model_num'

It would be very helpful if you could help me out with this.

Thank you

About blur kernel generate for meta learning

Hi, thanks for your work.

I noticed that in the paper you mention using both isotropic and anisotropic Gaussian kernels as blur kernels, while in the code I found that only random anisotropic Gaussian kernels are generated.

Maybe I have misunderstood; could you give me some guidance on this?

Thanks.

Problem when I load the pretrained model, specifically when it reads the checkpoint

I am facing a problem when I load the pretrained model, specifically when it reads the checkpoint.
This is the error. Could you kindly tell me how to solve it?

NotFoundError (see above for traceback): Key MODEL/conv7/kernel/Adam_3 not found in checkpoint
[[Node: save/RestoreV2_69 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_69/tensor_names, save/RestoreV2_69/shape_and_slices)]]
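
Names ending in `Adam`, `Adam_1`, and so on are optimizer slot variables rather than model weights, so one plausible reading of this error is that the saver is asking for optimizer state that the checkpoint does not contain. A common workaround in TensorFlow 1.x is to restore only the non-optimizer variables by passing a filtered `var_list` to `tf.train.Saver`. The filter itself can be sketched in plain Python; `is_optimizer_slot` and the example names are hypothetical, and whether this applies to the released checkpoint is an assumption.

```python
import re

# Matches names such as "MODEL/conv7/kernel/Adam" or ".../Adam_3",
# plus Adam's beta1_power / beta2_power accumulators.
_SLOT_PATTERN = re.compile(r"(/Adam(_\d+)?$)|(^|/)beta[12]_power(_\d+)?$")

def is_optimizer_slot(var_name):
    """True if var_name looks like Adam optimizer state rather than a model weight."""
    name = var_name.split(":")[0]  # drop a trailing ":0" tensor suffix if present
    return bool(_SLOT_PATTERN.search(name))

names = [  # hypothetical variable names
    "MODEL/conv7/kernel:0",
    "MODEL/conv7/kernel/Adam_3:0",
    "beta1_power:0",
]
model_vars = [n for n in names if not is_optimizer_slot(n)]
print(model_vars)
```

With TensorFlow 1.x available, the same predicate could be applied to `tf.global_variables()` (using each variable's `name`) to build the `var_list` for the restoring saver, while a separate saver keeps writing the full state during training.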
