
deep_motion_mag's People

Contributors

12dmodel

deep_motion_mag's Issues

Unzipped dataset description.

I'm wondering if there is any description of the unzipped dataset.
I downloaded the tfrecords files from the list, merged them into one file, and unzipped it. However, there were some warnings during unzipping, and the result is a single file, train.tfrecords. Is this expected?
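
A single train.tfrecords file sounds plausible, but the unzip warnings are worth checking. A quick sanity check (my suggestion, not from the repo; assumes TensorFlow 1.x-style APIs) is to count the records in the merged file; a DataLossError partway through usually means the concatenation or unzip step corrupted the TFRecord stream:

import tensorflow as tf

# Iterate the serialized records; a corrupted file raises DataLossError.
count = 0
for _ in tf.compat.v1.io.tf_record_iterator("train.tfrecords"):
    count += 1
print("records:", count)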

Got dimension mismatch during testing

I ran into a really weird issue during testing: the dimensions mismatch at
enc = tf.concat([texture_enc, shape_enc], axis=3)
and the error message is
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,84,68,32] vs. shape[1] = [1,83,68,32]
[[Node: ynet_3frames/decoder/concat = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](ynet_3frames/decoder/Relu, ynet_3frames/manipulator/add_1, ynet_3frames/decoder/concat/axis)]]

Are there restrictions on the image size?

I have faced an error (below) while running run_on_test_videos.sh on my own sequence of images.
I'm not sure what the cause is; the only clue I got was tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,322,364,32] vs. shape[1] = [1,321,364,32]. Somewhere the shapes do not match, but I do not know why, or what I should change in the code.
However, looking at the images I put in data/vids/'videoname', their size is 727 x 642, and I realized that the error shows numbers close to half of these image dimensions.
The problem does not seem to be with 727 (ceil(727/2) = 364 matches in both shape[0] and shape[1] in the error description), so the problem might be with the 642 dimension, since 322 does not match 321 (642/2 + 1 and 642/2, respectively).

So, my question is: is there any known restriction on the image sizes? Must they be odd? Which image sizes cause the rounding errors that break the program?
Knowing this would help me crop the images to a size that works.

Thank you in advance.

The full error:

Traceback (most recent call last):
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,322,364,32] vs. shape[1] = [1,321,364,32]
	 [[{{node ynet_3frames/decoder/concat}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 102, in <module>
    main(arguments)
  File "main.py", line 83, in main
    args.velocity_mag)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 279, in run
    out_amp = self.inference(prev_frame, frame, amplification_factor)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 235, in inference
    [amplification_factor]})
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,322,364,32] vs. shape[1] = [1,321,364,32]
	 [[node ynet_3frames/decoder/concat (defined at /home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'ynet_3frames/decoder/concat':
  File "main.py", line 102, in <module>
    main(arguments)
  File "main.py", line 83, in main
    args.velocity_mag)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 265, in run
    self.setup_for_inference(checkpoint_dir, image_width, image_height)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 209, in setup_for_inference
    self._build_feed_model()
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 196, in _build_feed_model
    False)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 145, in image_transformer
    return self._decoder(self.texture_b, self.out_shape_enc)
  File "/home/jonathan/Dropbox/UnB/motionMag/deep_motion_mag-master/magnet.py", line 119, in _decoder
    enc = tf.concat([texture_enc, shape_enc], axis=3)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 1420, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 1257, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/jonathan/Dropbox/UnB/motionMag/momagenv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
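
A plausible workaround for this and the other ConcatOp mismatches reported here (an educated guess, not a fix confirmed by the authors): the network appears to downsample the input twice, so heights and widths that are not divisible by 4 can end up one pixel off when the decoder concatenates the skip connection. Cropping each frame to a multiple of 4 before running the script avoids the rounding:

import cv2  # cv2 is just one option; any image library works

def crop_to_multiple(img, m=4):
    # Drop at most m-1 rows/columns so both dimensions divide evenly by m.
    h, w = img.shape[:2]
    return img[:h - h % m, :w - w % m]

path = "data/vids/videoname/000000.png"  # hypothetical frame name
cv2.imwrite(path, crop_to_multiple(cv2.imread(path)))

For the 727 x 642 frames above, this yields 724 x 640, which survives two halvings without an off-by-one.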

What are the meanings of `shape_type`, `color_mult`, and `third_frame_motion` in the meta files?

{  
   "background_file":"/home/tiam/project_data/tf_deep_mag/raw_ingredients/coco_train2017/000000225345.jpg",
   "third_frame_motion":1.0,
   "sample_properties":[  
      {  
         "color_add":[  
            1.925357951045477,
            9.502529585992523,
            7.880988956721406
         ],
         "fg_idx":1311,
         "orientation":3.301299684358326,
         "motion_dir":5.8994013477181255,
         "shape_type":2,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":2.191457788030521,
         "scaling":0.195800786759281,
         "motion_amount":0.3861106423132381,
         "center":[  
            190.46849859664286,
            208.1545052627958
         ]
      },
      {  
         "color_add":[  
            -10.01393090981086,
            -7.44447289150088,
            -3.836791152177721
         ],
         "fg_idx":4712,
         "orientation":1.2146459169689545,
         "motion_dir":2.8215923059585046,
         "shape_type":2,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":1.575751231357717,
         "scaling":0.7695978265063551,
         "motion_amount":0.9252538154237787,
         "center":[  
            138.44686122207858,
            274.563682226355
         ]
      },
      {  
         "color_add":[  
            13.653071468187918,
            14.028489528534774,
            -8.215312919870719
         ],
         "fg_idx":5718,
         "orientation":2.054352406981038,
         "motion_dir":0.10612433944136192,
         "shape_type":3,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":1.759935147917405,
         "scaling":0.799292509942318,
         "motion_amount":1.008468522062427,
         "center":[  
            299.6627111293551,
            244.2818327143585
         ]
      },
      {  
         "color_add":[  
            11.15283742278356,
            -2.560571143035471,
            13.147733175999939
         ],
         "fg_idx":3507,
         "orientation":0.4777896338358551,
         "motion_dir":3.184019233633535,
         "shape_type":1,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":2.7324625189657796,
         "scaling":0.10427110663734562,
         "motion_amount":0.4029012709278865,
         "center":[  
            374.5391294523404,
            85.92793533987069
         ]
      },
      {  
         "color_add":[  
            -7.741979262278923,
            10.189081151868574,
            14.194209720121702
         ],
         "fg_idx":1416,
         "orientation":4.676984784008412,
         "motion_dir":4.486443504656097,
         "shape_type":2,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":0.11384226239960371,
         "scaling":0.5548262804937897,
         "motion_amount":0.0843749640576295,
         "center":[  
            215.72180825160055,
            9.213579577060514
         ]
      },
      {  
         "color_add":[  
            -2.3170936238533315,
            -8.917824166917462,
            3.6573973430047566
         ],
         "fg_idx":4341,
         "orientation":5.684192681269192,
         "motion_dir":1.8791328006121593,
         "shape_type":3,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":2.6267749876376825,
         "scaling":0.6529448261831385,
         "motion_amount":0.43769959396502617,
         "center":[  
            41.803534791626944,
            61.853745682737994
         ]
      },
      {  
         "color_add":[  
            -7.576833772172929,
            8.619156100953902,
            6.769010895167728
         ],
         "fg_idx":1717,
         "orientation":6.17109738993578,
         "motion_dir":0.6974625676398899,
         "shape_type":1,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":0.37686866068980446,
         "scaling":0.6900220937610626,
         "motion_amount":0.20012596632715263,
         "center":[  
            125.69403191253784,
            93.55391836687608
         ]
      },
      {  
         "color_add":[  
            -8.870124580096775,
            -14.199621328432295,
            -2.6015452878369505
         ],
         "fg_idx":663,
         "orientation":3.8206371992641466,
         "motion_dir":3.524134968760174,
         "shape_type":1,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":1.3452644797115627,
         "scaling":0.45703039713457094,
         "motion_amount":0.5315456823000649,
         "center":[  
            337.18518966576335,
            293.3950208811551
         ]
      },
      {  
         "color_add":[  
            -12.987699689796031,
            5.82086276136825,
            -5.900104564786595
         ],
         "fg_idx":3452,
         "orientation":0.5694792141183294,
         "motion_dir":5.072288980793892,
         "shape_type":2,
         "color_mult":[  
            1.0,
            1.0,
            1.0
         ],
         "aspect_ratio":0.8311780579834941,
         "scaling":0.34532467780479253,
         "motion_amount":0.9680977920488166,
         "center":[  
            167.71642386436253,
            142.27622959742757
         ]
      }
   ],
   "color_amplification_factor":1.0,
   "amplification_factor":28.351030206327763,
   "bg_properties":{  
      "motion_dir":5.543601538543549,
      "color_mult":[  
         1.0,
         1.0,
         1.0
      ],
      "color_add":[  
         -10.990955175457486,
         -14.4096269084179,
         -13.60394413364702
      ],
      "motion_amount":0.030528508874763105,
      "bg_blur_amount":4.732548688741219
   }
}

I downloaded the whole dataset to figure out how you prepare the data for training, and I understand most of the parameters in the meta files. However, I am not clear about shape_type, color_mult, and third_frame_motion. Thank you for your help!

About dataset synthesis

Hello,
In the JSON files of the training dataset, the values of the parameters "motion_amount" and "center" are floating-point. How do you paste foreground objects onto the background image when pixel positions are integers? The paper mentions that you reconstructed the image in the continuous domain with bicubic interpolation. Do you first enlarge the image using bicubic interpolation, then paste the foreground objects and downsample the synthesized image? Can you describe the reconstruction process?
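
For illustration only (a guess at the kind of operation involved, not the authors' code): a foreground can be composited at a fractional position by shifting it and its alpha mask by the fractional offset with cubic-spline interpolation, then pasting at the integer part. A minimal sketch that ignores image-boundary checks and treats the position as a top-left corner; paste_subpixel is a hypothetical helper:

import numpy as np
from scipy.ndimage import shift

def paste_subpixel(bg, fg, alpha, pos):
    # pos = (row, col) may be fractional; split off the fractional part.
    dy, dx = pos[0] % 1.0, pos[1] % 1.0
    fg_s = shift(fg, (dy, dx, 0), order=3)  # cubic-spline subpixel shift
    a_s = np.clip(shift(alpha, (dy, dx), order=3), 0.0, 1.0)
    y, x = int(pos[0]), int(pos[1])
    h, w = fg.shape[:2]
    out = bg.copy()
    out[y:y + h, x:x + w] = (a_s[..., None] * fg_s
                             + (1.0 - a_s[..., None]) * out[y:y + h, x:x + w])
    return out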

get the heart rate

Can you please briefly explain how to get the heart rate from the baby clip example using the output of the decoder?
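
Not an official answer, but a common post-processing approach (the filename, ROI coordinates, and frequency band below are all assumptions): average a patch of skin in each frame of the magnified output, then take the dominant frequency of that time series within a plausible heart-rate band.

import numpy as np
import cv2

cap = cv2.VideoCapture("baby_magnified.mp4")          # hypothetical output video
fps = cap.get(cv2.CAP_PROP_FPS)
signal = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    signal.append(frame[100:200, 100:200, 1].mean())  # green channel of a skin patch
cap.release()

signal = np.asarray(signal) - np.mean(signal)
freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
spectrum = np.abs(np.fft.rfft(signal))
band = (freqs > 0.7) & (freqs < 3.0)                  # roughly 42-180 beats per minute
print("estimated heart rate: %.1f bpm"
      % (60.0 * freqs[band][np.argmax(spectrum[band])]))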

TF ConcatOp error: dimensions do not match

When running on custom data:

InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,360,248,32] vs. shape[1] = [1,360,247,32]

I have no idea where the numbers 360, 248 are coming from. Image width/height parameters are correct for my data. Any ideas?
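
This looks like the same rounding issue as the image-size question above: dimensions that are not divisible by 4 disagree by one pixel at the decoder's skip concat. If cropping is undesirable, a padding variant (again a suggestion, not a confirmed fix) is to reflect-pad up to the next multiple of 4 and crop the magnified output back afterwards:

import numpy as np

def pad_to_multiple(img, m=4):
    # Reflect-pad so height and width divide evenly by m; return the
    # original size so the output can be cropped back after inference.
    h, w = img.shape[:2]
    padded = np.pad(img, ((0, (-h) % m), (0, (-w) % m), (0, 0)), mode="reflect")
    return padded, (h, w)

# after inference on the padded frame: out = out_padded[:h, :w]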

Working Google Colaboratory Notebook.

I was working with this code on Google Colab but faced various issues. To spare other users, I am uploading my notebook with proper instructions so that you do not have to go through the same problems.
Hope it will help you.

You can find it on:
https://github.com/its-mayank/Video-Magnification/blob/master/Video_Magnification.ipynb

This is the original work of the paper's authors, and all rights remain with them. I have just implemented and shared it so that others do not have to face the same problems.

Dataset

Hello,
I am wondering if there is any source code showing how you synthesize the amplified ground truth for your training dataset?
If so, can you share it?
Looking forward to your reply.
Thanks.

Is frameC the color-perturbed frameB in the dataset?

I downloaded the image dataset and found that there are four image directories.
I guess that 'amplified', 'frameA', and 'frameB' are Y, Xa, and Xb in the paper.
I wonder whether 'frameC' is X'b or not.

And do you also have the loss term L1(V'b,V'Y) in the code?
I couldn't find the perturbed Y in the dataset.

Best,

About regularization loss

Hi,
In the paper, regularization is used to drive the separation of the texture and the shape representations.
I'm doing similar work, but I don't have a frameC to do regularization with. If there is no regularization, will the separation be poor? And are there alternatives to regularization?
Thank you~
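
For reference, the regularization in question can be sketched as L1 penalties on the encoder outputs (a paraphrase of the paper's idea, not the repo's exact loss code): the shape representation should be unchanged by a color perturbation of the same frame, while texture representations of frames with the same appearance should match.

import tensorflow as tf

def disentangle_losses(shape_b, shape_b_pert, tex_a, tex_b):
    # shape_b / shape_b_pert: shape codes of frame B and its color-perturbed copy.
    # tex_a / tex_b: texture codes of two frames with the same appearance.
    l_shape = tf.reduce_mean(tf.abs(shape_b - shape_b_pert))
    l_tex = tf.reduce_mean(tf.abs(tex_a - tex_b))
    return l_shape + l_tex

Without a frameC, applying a color jitter to frameB on the fly could in principle play the same role, though how much the separation degrades without any regularization is an empirical question.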

about training

Hello,
I want to ask how the txt file used in the training process is generated, and what its function is. Also, the txt file contains more than 100,000 entries, so they presumably do not correspond one-to-one with the parameters? Looking forward to your reply!

Wrongly configured license in the repository

According to the repository's MIT license, I could use and distribute it commercially, but that's not what the repository description says.

[screenshot of the repository description]

It would be important to adjust the repository's license to avoid future licensing problems with people who did not read the description, or who wanted to take advantage of the incorrect licensing configuration on GitHub.

Could you please send me the raw videos?

Could you please send me the raw videos?
For the throat video, the FPS is 1900. Do we have to use such a high-speed camera to achieve this effect?
Thank you!

Best regards!
