Comments (57)
I'm going to train a 512x512 face model and release it to the public domain.
from first-order-model.
Here it is: https://github.com/adeptflax/motion-models with any additional info you might want to know. I uploaded the model to mediafire. Hopefully that doesn't cause any issues.
@adeptflax First off, thanks for doing this :)
I'm having an issue:
_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.
from here:
File "demo.py", line 42, in load_checkpoints checkpoint = torch.load(checkpoint_path)
I think it has something to do with the file format of the checkpoint? Any ideas?
Yes, you should use scale_factor = 0.0625. In other words, kp_detector and dense_motion should always operate at the same 64x64 resolution.
This sigma is a parameter of the anti-aliasing for downsampling; in principle any value could be used, and I selected the one used by default in scikit-image. So sigma=1.5 is the default for 256x256. But I don't think it affects the results that much. So you can leave it equal to 1.5, or you can avoid loading the dense_motion_network.down.weight parameter by removing it from the state_dict.
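For reference, the kernel-size arithmetic behind this can be sketched from the formula in modules/util.py (the same two lines quoted in the diff further down this thread). It shows why a 256-trained checkpoint carries a 13x13 Gaussian kernel while a 512 or 1024 config would build a larger one:

```python
# How first-order-model derives the anti-aliasing Gaussian from scale_factor
# (formula taken from the modules/util.py snippet quoted later in this thread).

def antialias_params(scale):
    sigma = (1 / scale - 1) / 2
    kernel_size = 2 * round(sigma * 4) + 1
    return sigma, kernel_size

print(antialias_params(0.25))    # 256px config  -> (1.5, 13): 13x13 kernel
print(antialias_params(0.125))   # 512px config  -> (3.5, 29): 29x29 kernel
print(antialias_params(0.0625))  # 1024px config -> (7.5, 61): 61x61 kernel
```

This is exactly the 13x13 vs 29x29 size mismatch reported further down; pinning sigma to 1.5 keeps the kernel at 13x13 so the 256 checkpoint still loads.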
@AliaksandrSiarohin @5agado I have run some tests using the method detailed in point 2.
Generally the result looks like this:
It would be good to get your thoughts on whether this is an issue of using a checkpoint trained on 256x256 images, or if I am doing something wrong...
Many thanks for your excellent work.
Hi all,
I was wondering if anyone has succeeded in retraining the network to support 512x512 (or higher) images?
Before attempting this myself, I thought it might be a good idea to check whether anyone has succeeded in retraining, and if so, whether that person would be kind enough to share the checkpoints/configuration with the community? 🙏
Kind regards
@adeptflax Thank you so much for your hard work. I managed to run your 512 version. Just for comparison, here are my old 256 footage and the new 512 version:
result256.mp4
result512.mp4
- The only reliable method is to retrain on high-resolution videos.
- You can also try an off-the-shelf video super-resolution method.
- Since all the networks are fully convolutional, you can actually try to use the pretrained checkpoints trained on 256 images. In order to do this, change the size (Line 121 in 2ed57e0) and the scale_factor values in first-order-model/config/vox-256.yaml (Line 26 and Line 38 in 2ed57e0).

If you have any luck with these, please share your findings.
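For a 512x512 attempt, the config edits in point 3 would look roughly like this (a sketch; the key names and nesting follow the vox-256.yaml diff posted later in this thread, and vox-512.yaml is an illustrative copy):

```yaml
# config/vox-512.yaml - copy of vox-256.yaml with both scale factors halved,
# so kp_detector and dense_motion keep working at 64x64 (512 * 0.125 = 64)
model_params:
  kp_detector_params:
    scale_factor: 0.125   # was 0.25
  generator_params:
    dense_motion_params:
      scale_factor: 0.125   # was 0.25
```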
@pidginred I've used it for rather artistic purposes (applying it to face-like imagery), so I cannot confirm 100%. It definitely behaved very similarly at 1024 and 256 resolutions, though.
Speaking of animation quality, quite a lot has been said here about the necessity of similarity in poses (or facial expressions) between the source image and the starting video frame. I think you may want to check that first.
I got it trained. I will be uploading it shortly.
@AliaksandrSiarohin thanks for the feedback.
Note however that point 3 doesn't work out of the box. If I change the scale factors as you mention, I get an error for incompatible shapes.
Also, as I'm planning to try out some super-resolution methods for this, I'm curious what you mean by an off-the-shelf video super-resolution method?
Can you post the error message you got?
I mean some video super-resolution method, like the ones here: https://paperswithcode.com/task/video-super-resolution
Error(s) in loading state_dict for OcclusionAwareGenerator:
size mismatch for dense_motion_network.down.weight: copying a param with shape torch.Size([3, 1, 13, 13]) from checkpoint, the shape in current model is torch.Size([3, 1, 29, 29]).
Ah yes, you are right.
Can you try hard-setting sigma=1.5 in first-order-model/modules/util.py (Line 205 in 2ed57e0)?
Cool, that worked! Could it be generalized to other resolutions?
I'll do some tests and comparisons using super-resolution.
What do you mean? Generalized?
Is the scale factor proportional to the image size? Like, if I wanted to try 1024x1024, should I use scale_factor = 0.0625?
Also, is the fixed sigma (1.5) valid only for size 512? What about size 1024?
I was interested in generalizing my setup so that these values can be derived automatically from the given image size.
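A hedged sketch of that generalization, combining the two rules discussed above (the 64x64 internal resolution for kp_detector/dense_motion, and a sigma pinned to the value baked into the 256 checkpoint); the function name is illustrative:

```python
# Derive scale_factor from the source image size: the keypoint detector and
# dense-motion network should keep operating at 64x64, while sigma stays
# fixed at 1.5 because the pretrained checkpoint's 13x13 anti-aliasing
# kernel was built with that value.

def derive_params(image_size, internal_size=64, checkpoint_sigma=1.5):
    return internal_size / image_size, checkpoint_sigma

print(derive_params(256))   # (0.25, 1.5)
print(derive_params(512))   # (0.125, 1.5)
print(derive_params(1024))  # (0.0625, 1.5)
```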
Thanks so much for the support, really valuable info here!
Hi, have you retrained on high-resolution videos? If I don't retrain on a new dataset and instead just do what point 3 mentions, can I get a good result?
See https://github.com/tg-bomze/Face-Image-Motion-Model for point 2.
sigma=1.5 does not work for 1024x1024 source images (with a scale factor of 0.0625). I get the following error:
File "C:\Users\admin\git\first-order-model\modules\util.py", line 180, in forward
out = torch.cat([out, skip], dim=1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 1 and 2 in dimension 2 at c:\a\w\1\s\tmp_conda_3.6_061433\conda\conda-bld\pytorch_1544163532679\work\aten\src\thc\generic/THCTensorMath.cu:83
But I can confirm that hard-coding sigma=1.5 works only for 512x512 images (with a scale factor of 0.125).
Can you please let us know the correct setting for 1024x1024 images? Thank you for your wonderful work.
@pidginred Can you provide the full stack trace and your configs?
@AliaksandrSiarohin Certainly! Here are the changes I made (for 1024x1024 / 0.0625) & the full error stack:
Diffs
diff --git a/config/vox-256.yaml b/config/vox-256.yaml
index abfe9a2..10fce42 100644
--- a/config/vox-256.yaml
+++ b/config/vox-256.yaml
@@ -23,7 +23,7 @@ model_params:
temperature: 0.1
block_expansion: 32
max_features: 1024
- scale_factor: 0.25
+ scale_factor: 0.0625
num_blocks: 5
generator_params:
block_expansion: 64
@@ -35,7 +35,7 @@ model_params:
block_expansion: 64
max_features: 1024
num_blocks: 5
- scale_factor: 0.25
+ scale_factor: 0.0625
discriminator_params:
scales: [1]
block_expansion: 32
diff --git a/demo.py b/demo.py
index 848b3df..28bea70 100644
--- a/demo.py
+++ b/demo.py
@@ -134,7 +134,7 @@ if __name__ == "__main__":
reader.close()
driving_video = imageio.mimread(opt.driving_video, memtest=False)
- source_image = resize(source_image, (256, 256))[..., :3]
+ source_image = resize(source_image, (1024, 1024))[..., :3]
driving_video = [resize(frame, (256, 256))[..., :3] for frame in driving_video]
generator, kp_detector = load_checkpoints(config_path=opt.config, checkpoint_path=opt.checkpoint, cpu=opt.cpu)
diff --git a/modules/util.py b/modules/util.py
index 8ec1d25..cb8b149 100644
--- a/modules/util.py
+++ b/modules/util.py
@@ -202,7 +202,7 @@ class AntiAliasInterpolation2d(nn.Module):
"""
def __init__(self, channels, scale):
super(AntiAliasInterpolation2d, self).__init__()
- sigma = (1 / scale - 1) / 2
+ sigma = 1.5 # Hard coded as per issues/20#issuecomment-600784060
kernel_size = 2 * round(sigma * 4) + 1
self.ka = kernel_size // 2
self.kb = self.ka - 1 if kernel_size % 2 == 0 else self.ka
Full Errors
(base) C:\Users\admin\git\first-order-model-1024>python demo.py --config config/vox-256.yaml --driving_video driving.mp4 --source_image source.jpg --checkpoint "C:\Users\admin\Downloads\vox-cpk.pth.tar" --relative --adapt_scale
demo.py:27: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
config = yaml.load(f)
Traceback (most recent call last):
File "demo.py", line 150, in <module>
predictions = make_animation(source_image, driving_video, generator, kp_detector, relative=opt.relative, adapt_movement_scale=opt.adapt_scale, cpu=opt.cpu)
File "demo.py", line 65, in make_animation
kp_driving_initial = kp_detector(driving[:, :, 0])
File "C:\Users\admin\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\admin\Anaconda3\lib\site-packages\torch\nn\parallel\data_parallel.py", line 141, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\admin\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\admin\git\first-order-model-1024\modules\keypoint_detector.py", line 53, in forward
feature_map = self.predictor(x)
File "C:\Users\admin\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\admin\git\first-order-model-1024\modules\util.py", line 196, in forward
return self.decoder(self.encoder(x))
File "C:\Users\admin\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\admin\git\first-order-model-1024\modules\util.py", line 180, in forward
out = torch.cat([out, skip], dim=1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 1 and 2 in dimension 2 at c:\a\w\1\s\tmp_conda_3.6_061433\conda\conda-bld\pytorch_1544163532679\work\aten\src\thc\generic/THCTensorMath.cu:83
@pidginred A fixed sigma worked on my side for any resolution, including 1024x1024; it's not the cause of your problem.
@eps696 What was your scale factor for 1024x1024? And did you get a proper output?
@pidginred Same as yours, 0.0625.
But I also resize driving_video, not only source_image (which I see you don't).
@eps696 Confirmed that worked. However, I lost almost all eye & mouth tracking (compared to 256x256), and it results in lots of weird artifacts and very poor quality output.
Are you getting good quality results (in terms of animation) using 1024x1024 compared to 256x256?
@AliaksandrSiarohin @5agado I have run some tests using the method detailed in point 2.
Generally the result looks like this:
It would be good to get your thoughts on whether this an issue of using a checkpoint trained on 256 x 256 images, or if I am doing something wrong...
Many thanks for your excellent work.
I had the same problem
@eps696
Can you share the revised file? After I followed the above steps, the facial movements were normal, but the mouth could not open.
@zpeiguo That project is not released yet, sorry.
Also, this topic is about high-resolution images; check the other issues for the 'normality' of movements.
@eps696
Can you share the revised file? After I followed the above steps, the facial movements were normal, but the mouth could not open.
Same here. The mouth won't open. I believe the best option is to retrain everything at 512 resolution.
@eps696 Confirmed that worked. However, I lost almost complete eye & mouth tracking (compared to 256x256), and it results in lots of weird artifacts and very poor quality output.
Are you getting good quality results (in terms of animation) using 1024x1024 compared to 256x256?
I have also tested the third method with 512; the animation quality is lower than with 256. I have no explanation as to why, since I expected the quality to be the same given the same 64 keypoints.
I got method 3 working on Windows 10 following the steps above and successfully output a 512 version. However, the results are of much lower quality animation wise. Hoping we can get a 512 or higher checkpoint trained soon.
I got method 3 working on Windows 10 following the steps above and successfully output a 512 version. However, the results are of much lower quality animation wise. Hoping we can get a 512 or higher checkpoint trained soon.
I also followed method 3 and the animation is not acceptable :-( The mouth does not open at all and the face is distorted all the time.
Maybe we have to use AI to upscale the 256 video to 512 :-)
I got method 3 working on Windows 10 following the steps above and successfully output a 512 version. However, the results are of much lower quality animation wise. Hoping we can get a 512 or higher checkpoint trained soon.
I also followed method 3 and the animation is not acceptable :-( Mouth does not open at all and the face is distorted all the time.
Maybe have to use AI to upscale 256 to 512 video :-)
Yes, in theory. It depends on the video output quality, I suppose. I have tried Topaz Labs software and it also enhances the distortions.
@AliaksandrSiarohin @5agado I have run some tests using the method detailed in point 2.
Generally the result looks like this:
It would be good to get your thoughts on whether this an issue of using a checkpoint trained on 256 x 256 images, or if I am doing something wrong...
Many thanks for your excellent work.
Which super resolution network did you end up using? :)
I got method 3 working on Windows 10 following the steps above and successfully output a 512 version. However, the results are of much lower quality animation wise. Hoping we can get a 512 or higher checkpoint trained soon.
In demo.py, I tried also resizing "driving_video", and it works:
driving_video = [resize(frame, (512, 512))[..., :3] for frame in driving_video]
In demo.py, I tried also resizing "driving_video", and it works:
driving_video = [resize(frame, (512, 512))[..., :3] for frame in driving_video]
Yes, it ran. But my result (animation) was terrible.
How can I change blending mask size?
@AliaksandrSiarohin @5agado I have run some tests using the method detailed in point 2.
Generally the result looks like this:
It would be good to get your thoughts on whether this an issue of using a checkpoint trained on 256 x 256 images, or if I am doing something wrong...
Many thanks for your excellent work.
hi @LopsidedJoaw, which super-resolution method did you use to get the 320×320 result from the 256×256 input, as your gif shows?
I used the same method described in the first 10 or so entries on this post.
Got that, thank you. :)
Can't wait.
Please also share the process. I think many people are interested.
Thanks.
It's going to take 5 days to train on an RTX 3090. I'm also going to train a 512x512 motion-cosegmentation model and release it to the public domain as well.
I need these models for a project I'm working on, so I might as well release them to the public.
When trying to run the 512 model with this command:
python demo.py --config config/vox-512.yaml --driving_video videos/2.mp4 --source_image images/4.jpg --checkpoint checkpoints/first-order-model-checkpoint-94.pth.tar --relative --adapt_scale --cpu
I get the following error:
/home/USER/miniconda3/envs/first/lib/python3.7/site-packages/imageio/core/format.py:403: UserWarning: Could not read last frame of /home/USER/General/Creating animated characters/First order motion model/first-order-model/videos/2.mp4.
warn('Could not read last frame of %s.' % uri)
/home/USER/miniconda3/envs/first/lib/python3.7/site-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
warn("The default mode, 'constant', will be changed to 'reflect' in "
/home/USER/miniconda3/envs/first/lib/python3.7/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
demo.py:27: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
config = yaml.load(f)
Traceback (most recent call last):
File "demo.py", line 144, in <module>
generator, kp_detector = load_checkpoints(config_path=opt.config, checkpoint_path=opt.checkpoint, cpu=opt.cpu)
File "demo.py", line 44, in load_checkpoints
generator.load_state_dict(checkpoint['generator'])
File "/home/USER/miniconda3/envs/first/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for OcclusionAwareGenerator:
size mismatch for dense_motion_network.down.weight: copying a param with shape torch.Size([3, 1, 13, 13]) from checkpoint, the shape in current model is torch.Size([3, 1, 29, 29]).
It runs fine with the 256 model.
Has anyone run into the same problem or does anyone know how it could be fixed?
Update: I've fixed the problem. I had to change sigma to 1.5 as described here:
https://github.com/adeptflax/motion-models
#20 (comment) (it's also described there how to change 256 to 512 in the demo.py file)
Steps to fix:
- In demo.py, change everything from 256 to 512 around this line: source_image = resize(source_image, (256, 256))[..., :3]
- In modules/util.py, change sigma = (1 / scale - 1) / 2 to sigma = 1.5
- Use videos of 512x512 resolution
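The demo.py side of those steps can be sketched like this (illustrative; dummy arrays stand in for the image and frames that demo.py actually reads with imageio):

```python
import numpy as np
from skimage.transform import resize  # the same resize demo.py uses

size = 512  # must match the checkpoint's training resolution

# Dummy stand-ins for imageio.imread / imageio.mimread results:
source_image = np.random.rand(720, 1280, 4)        # e.g. an RGBA photo
driving_video = [np.random.rand(256, 256, 3)] * 3  # e.g. decoded frames

# Resize BOTH the source image and every driving frame:
source_image = resize(source_image, (size, size))[..., :3]
driving_video = [resize(f, (size, size))[..., :3] for f in driving_video]

print(source_image.shape)      # (512, 512, 3)
print(driving_video[0].shape)  # (512, 512, 3)
```

Resizing only source_image and not the driving frames is exactly the mismatch that produced the torch.cat size error reported earlier in the thread.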
Update: I've fixed the problem, I had to change sigma to 1.5 as described here: https://github.com/adeptflax/motion-models #20 (comment) [...] It runs fine with the 256 model. Has anyone run into the same problem or does anyone know how it could be fixed?
I have the same issue
@adeptflax First off, thanks for doing this :)
im having an issue
_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.
from here
File "demo.py", line 42, in load_checkpoints checkpoint = torch.load(checkpoint_path)
I think it has something to do with the file format of the checkpoint? any ideas?
Same error here: "_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified."
@bigboss97 did you do anything to the 512 checkpoint from @adeptflax to get it to work?
@william-nz No, the only thing I've done... I downloaded the file and ran it with the above (512) modifications. I saw that a 512x512 video was generated. That's all. I'm not very convinced by my result, so I posted it here hoping people can judge it for themselves. Probably I still did something wrong 😆
When I get the time I'll do more experiments with that.
@AliaksandrSiarohin @5agado I have run some tests using the method detailed in point 2.
Generally the result looks like this:
It would be good to get your thoughts on whether this an issue of using a checkpoint trained on 256 x 256 images, or if I am doing something wrong...
Many thanks for your excellent work.
Hi @LopsidedJoaw, The video you showed looks very high resolution, which super resolution method did you use to get the result? Thanks!
Hi @AliaksandrSiarohin,
I am training a model on a 512x512 dataset from scratch. After 15 epochs, the loss is decreasing, but the keypoints seem to have become confined to a small region.
Any idea why this is happening? I haven't made any changes to the code; I have only modified the config file.
dataset_params:
  root_dir: 512_dataset/
  frame_shape: [512, 512, 3]
  id_sampling: False
  pairs_list: data/vox256.csv
  augmentation_params:
    flip_param:
      horizontal_flip: True
      time_flip: True
    jitter_param:
      brightness: 0.1
      contrast: 0.1
      saturation: 0.1
      hue: 0.1
model_params:
  common_params:
    num_kp: 10
    num_channels: 3
    estimate_jacobian: True
  kp_detector_params:
    temperature: 0.1
    block_expansion: 32
    max_features: 1024
    scale_factor: 0.25
    num_blocks: 5
  generator_params:
    block_expansion: 64
    max_features: 512
    num_down_blocks: 2
    num_bottleneck_blocks: 6
    estimate_occlusion_map: True
    dense_motion_params:
      block_expansion: 64
      max_features: 1024
      num_blocks: 5
      scale_factor: 0.25
  discriminator_params:
    scales: [1]
    block_expansion: 32
    max_features: 512
    num_blocks: 4
    sn: True
train_params:
  num_epochs: 100
  num_repeats: 75
  epoch_milestones: [60, 90]
  lr_generator: 2.0e-4
  lr_discriminator: 2.0e-4
  lr_kp_detector: 2.0e-4
  batch_size: 4
  scales: [1, 0.5, 0.25, 0.125]
  checkpoint_freq: 5
  transform_params:
    sigma_affine: 0.05
    sigma_tps: 0.005
    points_tps: 5
  loss_weights:
    generator_gan: 0
    discriminator_gan: 1
    feature_matching: [10, 10, 10, 10]
    perceptual: [10, 10, 10, 10, 10]
    equivariance_value: 10
    equivariance_jacobian: 10
reconstruction_params:
  num_videos: 1000
  format: '.mp4'
animate_params:
  num_pairs: 50
  format: '.mp4'
  normalization_params:
    adapt_movement_scale: True
    use_relative_movement: True
    use_relative_jacobian: True
visualizer_params:
  kp_size: 5
  draw_border: True
  colormap: 'gist_rainbow'
Here is the loss up to epoch 16:
00000000) perceptual - 121.42917; equivariance_value - 0.71458; equivariance_jacobian - 0.75562
00000001) perceptual - 109.27000; equivariance_value - 0.35340; equivariance_jacobian - 0.65690
00000002) perceptual - 100.28600; equivariance_value - 0.16266; equivariance_jacobian - 0.56337
00000003) perceptual - 96.12051; equivariance_value - 0.14541; equivariance_jacobian - 0.51318
00000004) perceptual - 93.17576; equivariance_value - 0.14200; equivariance_jacobian - 0.48087
00000005) perceptual - 90.71331; equivariance_value - 0.15415; equivariance_jacobian - 0.47770
00000006) perceptual - 88.90341; equivariance_value - 0.22227; equivariance_jacobian - 0.49095
00000007) perceptual - 86.39249; equivariance_value - 0.21560; equivariance_jacobian - 0.47799
00000008) perceptual - 84.61519; equivariance_value - 0.20801; equivariance_jacobian - 0.46283
00000009) perceptual - 84.08470; equivariance_value - 0.21185; equivariance_jacobian - 0.46702
00000010) perceptual - 82.73890; equivariance_value - 0.20613; equivariance_jacobian - 0.45508
00000011) perceptual - 81.45905; equivariance_value - 0.19839; equivariance_jacobian - 0.44276
00000012) perceptual - 81.00780; equivariance_value - 0.20207; equivariance_jacobian - 0.44244
00000013) perceptual - 80.08536; equivariance_value - 0.19849; equivariance_jacobian - 0.43349
00000014) perceptual - 79.34811; equivariance_value - 0.19838; equivariance_jacobian - 0.42291
00000015) perceptual - 78.98586; equivariance_value - 0.19916; equivariance_jacobian - 0.41774
00000016) perceptual - 78.48245; equivariance_value - 0.19998; equivariance_jacobian - 0.41450
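One thing that may be worth checking (an observation, not a confirmed diagnosis): with frame_shape at 512 but both scale_factor values still at 0.25, the keypoint detector and dense-motion network run at 128x128 rather than the 64x64 working resolution mentioned earlier in this thread:

```python
# Quick check, using values from the config above: the resolution that
# kp_detector and dense_motion actually see after downsampling.
frame_size = 512
scale_factor = 0.25  # as in the posted config

print(int(frame_size * scale_factor))  # 128 -> networks operate at 128x128
print(int(frame_size * 0.125))         # 64  -> 0.125 would keep them at 64x64
```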
@Animan8000 @william-nz Did you guys ever manage to get over the "_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified." error? I'm getting the same thing.
@Animan8000 @william-nz Did you guys ever manage to get over the "_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified." error? I'm getting the same thing.
Nope
@Animan8000 @william-nz Did you guys ever manage to get over the "_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified." error? I'm getting the same thing.
Another user seems to have fixed it for themselves; I haven't tried it myself yet, so I haven't run into the error.
adeptflax/motion-models#2 (comment)
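For what it's worth, this particular UnpicklingError usually means a checkpoint saved by PyTorch >= 1.6 (zip-based serialization) is being loaded by an older PyTorch. One possible workaround (a sketch, not verified against this exact checkpoint): re-save it in the legacy format using a newer PyTorch, then load that file from the old environment.

```python
import torch

# Dummy stand-in for the downloaded checkpoint (the real one would be loaded
# from e.g. first-order-model-checkpoint-94.pth.tar):
ckpt = {'generator': {'weight': torch.zeros(3, 1, 13, 13)}}

torch.save(ckpt, 'ckpt_zip.pth.tar')  # default zip format on torch >= 1.6
torch.save(ckpt, 'ckpt_legacy.pth.tar', _use_new_zipfile_serialization=False)

reloaded = torch.load('ckpt_legacy.pth.tar')
print(reloaded['generator']['weight'].shape)  # torch.Size([3, 1, 13, 13])
```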
With the 512x512 model shared above, the code runs smoothly with the suggested changes. First of all, thank you for sharing it. However, the results I get are no different from upscaled 256x256 results. The animation is not bad, but the output video is blurry, as if I had upscaled a 256x256 output to 512x512. Is this expected?