
advimman / lama

7.2K stars · 79 watchers · 784 forks · 6.46 MB

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Home Page: https://advimman.github.io/lama-project/

License: Apache License 2.0

Python 9.75% Shell 0.33% Jupyter Notebook 89.89% Dockerfile 0.03%
inpainting inpainting-methods inpainting-algorithm computer-vision cnn deep-learning deep-neural-networks image-inpainting fourier fourier-transform

lama's People

Contributors

ankuprk, calpt, chenbinghui1, cohimame, hanswolff, mallman, moldoteck, sanster, senya-ashukha, tobi823, windj007, zeerizvee


lama's Issues

Spectral Positional Encoding

I see in your FourierUnit you added an optional spectral_pos_encoding argument. Have you experimented at all with this? Has it improved/reduced performance?

Influence: Amount of training data and data augmentation via Detectron2

Hi,

Thank you for sharing LaMa! The inpainting quality is really impressive!

I was wondering:

  1. Comparing "LaMa-Fourier" with "Big LaMa-Fourier": How much did the larger training data (4.5M images from the Places-Challenge dataset) contribute to the improved quality of Big LaMa-Fourier? Do you think that similar results could have also been achieved for Big LaMa-Fourier with less data?

  2. You have proposed a sophisticated approach for data augmentation. How much did the training and the inference benefit from data augmentation using segmentation masks from Detectron2?

Best wishes,
Alex

Integrate into iOS

I am trying to integrate this into an iOS project, but I couldn't find a way to do it. Can anyone help me with this?

There are some strange white areas in my results

First of all, thanks for your exciting work.
When I use -cn lama-fourier to train on my own dataset, I find there are some white areas in some train and test images (not in all images, and from my observation it is unrelated to mask size), like the two examples below (selected from epoch 33/40):

[two example images showing the white areas]

Do you know how to avoid this? Thanks in advance.

PS:
My dataset is a food image set with 150,000 images, and I use this command to train the model:
CUDA_VISIBLE_DEVICES=0,1,2,3 python bin/train.py -cn lama-fourier location=food data.batch_size=10 data.num_workers=8 trainer.kwargs.gpus=[0,1,2,3] trainer.kwargs.limit_train_batches=12360 optimizers.generator.lr=0.001 optimizers.discriminator.lr=0.0001

Training batch size and other parameters?

Hi, thanks for the great work!
I am trying to reproduce the training results.
I used the default batch size and ran the lama-fourier model on 4 V100 GPUs for 40 epochs. The training takes about 12 hours, and the results on the training dataset look very nice, but things go wrong on the test set and other validation images. There are texture artifacts like this:

[example output with texture artifacts]

I wonder what the reason could be: batch size, number of training epochs, or something else?
If I set the batch size to 10, the training time for lama-fourier will be too long.

How long should training usually take on 4 V100 GPUs (1 day or 1 week), and what batch size should be set so that the model generalizes well to other images?

Thanks so much!

Image2Image model

Congrats on your great work and thanks for sharing your code. I'm using this model for image-to-image translation, but after training for some time I'm losing details from the input image in the predicted results. Do you have any suggestions on how to modify the loss functions for better detail preservation? Thanks.

How to convert bestresult to onnx

At present, the best-result checkpoint is in CKPT format, and I hope to convert it to ONNX. I tried this code:

import torch
from saicinpainting.training.trainers import load_checkpoint

device = torch.device('cpu')
# train_config and checkpoint_path come from the model directory, as in bin/predict.py
model = load_checkpoint(train_config, checkpoint_path, strict=False, map_location='cpu')
model.freeze()
model.to(device)
torch.save(model, 'bestresult.pth')

img = torch.rand(1, 3, 320, 320, requires_grad=False)
img = img.to(device)
torch.onnx.export(model, img, 'bestresult.onnx', opset_version=11)
print('================= best onnx result is saved! =================')

But it reported an error. How can I solve it? Thank you.

  [2021-11-26 06:25:35,036][__main__][CRITICAL] - Prediction failed due to too many indices for tensor of dimension 4:
Traceback (most recent call last):
  File "bin/predict.py", line 68, in main

integrate with Lightning ecosystem CI

Hello and so happy to see you use Pytorch-Lightning! 🎉
Just wondering if you have already heard about the new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to join. You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI.
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates to our future releases. At the moment you run tests against a particular PL version, but it may accidentally happen that the next version is incompatible with your project... 😕 We do not intend to change anything on our project side, but we have a solution: the ecosystem CI, which tests both your latest code and our latest development head, so we can catch incompatibilities very early and prevent releasing a bad version... 👍

What needs to be done?

What will you get?

  • scheduled nightly testing configured for development/stable versions
  • Slack notification if something goes wrong, so you can investigate
  • testing on a multi-GPU machine as well, as our gift to you 🐰

Environment variable 'USER' not found

Thanks for sharing your work! When I run python bin/train.py -cn lama-fourier location=my_dataset data.batch_size=10, the following error occurred:
omegaconf.errors.InterpolationResolutionError: ValidationError raised while resolving interpolation: Environment variable 'USER' not found
full_key: hydra.run.dir
object_type=dict

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "bin/train.py", line 74, in
main()
File "/root/lama-main/saicinpainting/utils.py", line 163, in new_main
main_func(*args, **kwargs)
File "/root/.local/lib/python3.7/site-packages/hydra/main.py", line 53, in decorated_main
config_name=config_name,
File "/root/.local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 368, in _run_hydra
lambda: hydra.run(
File "/root/.local/lib/python3.7/site-packages/hydra/_internal/utils.py", line 270, in run_and_report
cur.tb_lasti = iter_tb.tb_lasti
AttributeError: 'NoneType' object has no attribute 'tb_lasti'

Could you tell me how to solve it? Thanks!
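
For anyone hitting the same thing: since the failing key is hydra.run.dir, the training config most likely builds the run directory from an ${env:USER} interpolation, so the error simply means the USER environment variable is unset in your shell (common inside Docker containers). Exporting it before launching, e.g. export USER=$(whoami), should let the interpolation resolve; this is inferred from the error message rather than verified against every config.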

Export to Onnx

Got this failed attempt to convert the model to ONNX

Code example:
import torch

# `model` is the loaded LaMa inpainting module
save_onnx_path = "/content/lama.onnx"

img = torch.rand(1, 3, 120, 120)
mask = torch.rand(1, 1, 120, 120)
inputs = {
    "image": img,
    "mask": mask
}

torch.onnx.export(model,
                  inputs,
                  save_onnx_path,
                  opset_version=12,
                  do_constant_folding=True,
                  input_names=['img', 'mask'],
                  output_names=['output'],
                  dynamic_axes={
                      'img': {0: 'batch_size', 2: 'width', 3: 'height'},
                      'mask': {0: 'batch_size', 2: 'width', 3: 'height'},
                      'output': {0: 'batch_size', 2: 'width', 3: 'height'},
                  })

Stack trace:
RuntimeError: Exporting the operator fft_rfftn to ONNX opset version 12 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

As far as I understood, the problem is that ONNX does not support the "torch.fft.rfftn" operation used in the FourierUnit module.
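
One workaround people use for FFT-based models (not something this repo ships) is to re-express the real FFT as matrix multiplications against a precomputed DFT basis, so the exported graph contains only MatMul ops. A minimal sketch for a single dimension; a full replacement of FourierUnit's rfftn/irfftn over two spatial dims would need to apply this along each axis and also implement the inverse, and newer ONNX opsets may offer native DFT support instead:

import math
import torch

def rfft_last_dim_as_matmul(x):
    # Real FFT over the last dimension expressed as two matmuls against a
    # precomputed DFT basis; returns the real and imaginary parts separately.
    n = x.shape[-1]
    k = torch.arange(n // 2 + 1, dtype=torch.float32).unsqueeze(1)  # frequency bins
    t = torch.arange(n, dtype=torch.float32).unsqueeze(0)           # sample positions
    angle = -2.0 * math.pi * k * t / n
    real = x @ torch.cos(angle).T.to(x.dtype)  # (..., n // 2 + 1)
    imag = x @ torch.sin(angle).T.to(x.dtype)
    return real, imag

# Sanity check against the reference implementation:
# ref = torch.fft.rfft(x, dim=-1)
# assert torch.allclose(real, ref.real, atol=1e-4) and torch.allclose(imag, ref.imag, atol=1e-4)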

sync_batchnorm

I tried to set "sync_batchnorm: True" in configs/training/trainer, and the training just got stuck. Why?

Questions about training big-lama and the full-checkpoint

Hi, thanks again for your excellent work.
Is the big-lama model trained on the Places-Challenge dataset? Does it perform significantly better than a big-lama trained on Places2-Standard?
Is it possible to release the full checkpoints of the big-lama model, so we can fine-tune it on other data? Thanks.

Random Mask Generation

Can you please give some guidance on how random masks can be created for custom images? The link to the script given in the description does not seem to be working.
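
In the meantime, irregular masks can be generated locally. Below is a rough sketch (not the repo's own mask generator) that draws random thick strokes with OpenCV; the output naming follows the <image>_mask*.png convention mentioned in other issues, and all parameters are illustrative:

import cv2
import numpy as np

def random_stroke_mask(height, width, max_strokes=5, max_thickness=40, rng=None):
    # Draw a few random thick polylines on an empty canvas; 1 marks the region
    # to be inpainted, 0 is kept as-is.
    rng = rng if rng is not None else np.random.default_rng()
    mask = np.zeros((height, width), dtype=np.uint8)
    for _ in range(int(rng.integers(1, max_strokes + 1))):
        num_points = int(rng.integers(2, 6))
        xs = rng.integers(0, width, num_points)
        ys = rng.integers(0, height, num_points)
        points = np.stack([xs, ys], axis=1).reshape(-1, 1, 2).astype(np.int32)
        thickness = int(rng.integers(10, max_thickness))
        cv2.polylines(mask, [points], isClosed=False, color=1, thickness=thickness)
    return mask

# cv2.imwrite('img1_mask001.png', random_stroke_mask(512, 512) * 255)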

Is there a way to train in the fp16 mode?

It looks like fp16 is not supported for this operation in PyTorch:

ffted = torch.fft.rfftn(x, dim=fft_dim, norm=self.fft_norm)
RuntimeError: Unsupported dtype Half

Is there a way to make the parts of the network that do not support fp16 run in fp32, while the parts that do support it run in fp16?
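
A common pattern (a sketch, not something wired into this repo) is to keep the overall training in AMP/fp16 but locally disable autocast and upcast around the FFT calls inside FourierUnit, for example:

import torch

def rfftn_fp32(x, fft_dim=(-2, -1), fft_norm='ortho'):
    # Run the half-unsupported FFT in fp32 while the rest of the network stays
    # in fp16 under AMP; the spectral conv that follows would consume the fp32
    # result, and the output can be cast back to x.dtype afterwards.
    with torch.cuda.amp.autocast(enabled=False):
        return torch.fft.rfftn(x.float(), dim=fft_dim, norm=fft_norm)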

No inpainted results generated on JPG

I have followed all the instructions to set up lama on my system. I went with the conda installation and the big-lama.zip model.

I'm running on a custom .jpg image and have accordingly modified configs/prediction/default.yaml, changing .png to .jpg.

I created a folder named ts_images containing img.jpg and img_mask.jpg (also tried with img_mask001.jpg), and ran the command:

python3 bin/predict.py model.path=$(pwd)/big-lama indir=$(pwd)/ts_images outdir=$(pwd)/output

I get a "Detectron v2 is not installed" message, and then after some processing I get:

[2022-02-14 16:18:45,353][saicinpainting.training.trainers.base][INFO] - BaseInpaintingTrainingModule init done
[2022-02-14 16:18:49,151][saicinpainting.training.data.datasets][INFO] - Make val dataloader default from /home2/varungupta/lama/ts_images/
0it [00:00, ?it/s]

An outputs folder gets created in lama/ (and NO output folder), containing the file predict.log, the first 5 lines of which are:

[2022-02-14 16:18:44,873][saicinpainting.utils][WARNING] - Setting signal 10 handler <function print_traceback_handler at 0x14fe587bcf28>
[2022-02-14 16:18:44,905][root][INFO] - Make training model default
[2022-02-14 16:18:44,905][saicinpainting.training.trainers.base][INFO] - BaseInpaintingTrainingModule init called
[2022-02-14 16:18:44,905][root][INFO] - Make generator ffc_resnet
[2022-02-14 16:18:45,352][saicinpainting.training.trainers.base][INFO] - Generator

and ending with:

[2022-02-14 16:18:45,353][saicinpainting.training.trainers.base][INFO] - BaseInpaintingTrainingModule init done
[2022-02-14 16:18:49,151][saicinpainting.training.data.datasets][INFO] - Make val dataloader default from /home2/varungupta/lama/ts_images/

I created my mask image using OpenCV, and it looks like this:

[img_mask image]

No inpainting results are being generated. What am I missing?
Kindly help me out,
Thanks :)

@windj007

Edit: I converted my images from .jpg to .png, and now the code works! As mentioned, I updated the .yaml file:

indir: no  # to be overriden in CLI
outdir: no  # to be overriden in CLI

model:
  path: no  # to be overriden in CLI
  checkpoint: best.ckpt

dataset:
  kind: default
  img_suffix: .jpg
  pad_out_to_modulo: 8

device: cuda
out_key: inpainted

So, has anyone else tested the model with .jpg images and got it working?
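
If anyone just wants the quick workaround from the edit above, here is a minimal sketch (the folder name is only an example) that re-saves a folder of .jpg inputs and masks as .png with Pillow:

from pathlib import Path
from PIL import Image

# Re-save every .jpg (images and masks alike) as a .png next to the original
for jpg in Path('ts_images').glob('*.jpg'):
    Image.open(jpg).save(jpg.with_suffix('.png'))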

How to use ddp during training

Hi, I use the Places365-Standard dataset to train the model, which has more than 1.8 million images. When I set batch_size=15, accelerator=ddp, gpus=8, I would expect the number of batches in one epoch to be 1.8M/15/8, since each GPU only gets visibility into a subset of the overall dataset. But I found that the number of batches on each GPU is 1.8M/15 in lama.
I run python3 bin/train.py -cn mylama-fourier to start training.
Is this the right way to use DDP to accelerate training?
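
For reference, and assuming the trainer options follow the same trainer.kwargs.* pattern as the training commands in the other issues here (the exact keys depend on the config and the PyTorch Lightning version), DDP across 8 GPUs would be requested with something like:

python3 bin/train.py -cn mylama-fourier data.batch_size=15 trainer.kwargs.gpus=8 trainer.kwargs.accelerator=ddp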

ValueError: `val_check_interval` (25000) must be less than or equal to the number of the training batches (3006)

Thank you for sharing your great work!!

When I ran train.py with my own dataset, I got the following error.

[2021-12-07 11:28:26,947][__main__][CRITICAL] - Training failed due to `val_check_interval` (25000) must be less than or equal to the number of the training batches (3006). If you want to disable validation set `limit_val_batches` to 0.0 instead.:
Traceback (most recent call last):
  File "/home/naoki/lama/bin/train.py", line 64, in main
    trainer.fit(training_model)
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
    self.dispatch()
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 546, in dispatch
    self.accelerator.start_training(self)
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 73, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 114, in start_training
    self._results = trainer.run_train()
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 620, in run_train
    self.train_loop.reset_train_val_dataloaders(model)
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/training_loop.py", line 218, in reset_train_val_dataloaders
    self.trainer.reset_train_dataloader(model)
  File "/home/naoki/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/data_loading.py", line 243, in reset_train_dataloader
    raise ValueError(
ValueError: `val_check_interval` (25000) must be less than or equal to the number of the training batches (3006). If you want to disable validation set `limit_val_batches` to 0.0 instead.

My dataset consists of the following numbers of images:

  • Train: 12022
  • val_source: 2068
  • visual_test_source: 188
  • eval_source: 2032

Is this due to having too few training images?
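
Not an authoritative answer, but the message itself points at the mismatch: the default val_check_interval (25000) is larger than the 3006 training batches that one epoch of this dataset produces, so lowering it to at most 3006 should unblock training. Assuming the trainer options follow the trainer.kwargs.* pattern used in the training commands in other issues, an override could look like:

python bin/train.py -cn lama-fourier location=my_dataset trainer.kwargs.val_check_interval=3000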

Evaluate predictions on multiple GPUs

Hi, thanks for the amazing codebase. Is it possible to run the bin/evaluate_predicts.py process on multiple GPUs? As far as I understand from my experiments, it only uses a single GPU, which makes evaluation on Places2 relatively slow.

Batch/Video Processing

I was looking to install this and was wondering if it supports some way to process a batch of images from a sequence, or even a video? Or do I have to manually highlight the object I want removed in each image? Thanks, everyone.

[Proposal] Proposal for a better code hygiene

Hi, I love the project, and the code quality is relatively high.

But there could be a few more minor steps to make it even more readable.

Could you please consider adding

to the project?

It should be relatively straightforward and will not change any code logic but benefit maintainers and other users.

At some point, I wrote a blog post on the topic with examples of all the steps: I trained a model. What is next?

Or you can see how we do it in Albumentations: https://github.com/albumentations-team/albumentations

The repository will also become a role model for other research projects on GitHub :)

P.S. If you consider doing this and encounter problems, I will be happy to answer any questions.

Perceptual loss weight

In your paper, you write that:
Naive supervised losses require the generator to reconstruct the ground truth precisely. However, the visible parts of the image often do not contain enough information for the exact reconstruction of the masked part. Therefore, using naive supervision leads to blurry results due to the averaging of multiple plausible modes of the inpainted content. In contrast, perceptual loss evaluates a distance between features extracted from the predicted and the target images by a base pre-trained network.

But in some of the main configs, you set the perceptual weight to 0.
Is this a config issue, or do you train the models without the perceptual loss?

perceptual:
    weight: 0

In the lama-fourier config

no yaml file

hydra.main(config_path='../configs/training', config_name='tiny_test.yaml')

Where is the config file? configs/training/tiny_test.yaml does not exist.

Invalid signal value

Hello, and thank you for your great work. I encountered this error trying to run predict.py... what does it tell me? I'm not quite sure how to react and what to change. :D

python bin\predict.py model.path=B:\ProgrammierDateiBaum\Bachelor\lama\big-lama indir=B:\ProgrammierDateiBaum\Bachelor\bachelorarbeit\Image_data\lama\In outdir=B:\ProgrammierDateiBaum\Bachelor\bachelorarbeit\Image_data\lama\Out
Detectron v2 is not installed
[2022-02-26 16:09:34,785][saicinpainting.utils][WARNING] - Setting signal 1 handler <function print_traceback_handler at 0x0000028C6828E670>
[2022-02-26 16:09:34,787][__main__][CRITICAL] - Prediction failed due to invalid signal value:
Traceback (most recent call last):
  File "bin\predict.py", line 41, in main
  File "B:\ProgrammierDateiBaum\Bachelor\lama\bin\saicinpainting\utils.py", line 109, in register_debug_signal_handlers
    signal.signal(sig, handler)
  File "B:\Python3\lib\signal.py", line 47, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: invalid signal value

Edit:
I made everything work by commenting out register_debug_signal_handlers() and changing the device to cpu in the default.yaml file. It seems like torch 1.8.0 struggles with the CUDA version I have?! I'm not an expert here...

I'm working on a Windows machine, by the way. I had some trouble creating the env at first; it seems that with conda on Windows some packages are not available. I switched to a "normal" Python install and used pip, and then it worked!

I would still like to know what the signal error is, and whether torch 1.8.1 would be sufficient as well, because that version seems to provide better CUDA support.
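
On the signal part: the debug handlers are registered for POSIX-only signals (the log shows "Setting signal 1 handler", i.e. SIGHUP), and signal.signal() rejects those on Windows, hence "invalid signal value". Instead of commenting the call out, a platform guard is enough; a minimal sketch of how the call in bin/predict.py could be wrapped:

import sys

from saicinpainting.utils import register_debug_signal_handlers

# Register the POSIX-only debug signal handlers only off Windows;
# on Windows signal.signal() raises ValueError for signals like SIGHUP.
if sys.platform != 'win32':
    register_debug_signal_handlers()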

[Bug] In `predict.py`, images are not unpadded

For inference, images are padded to be divisible by 8, but after the prediction is done they are not unpadded.

=> when one uses bin/predict.py, the input images and output images have different shapes.
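
A minimal sketch of the missing step (the names are hypothetical, not the repo's own variables): record the original size before padding and crop the prediction back to it before saving:

import numpy as np

def unpad_to_original(pred_hwc: np.ndarray, orig_height: int, orig_width: int) -> np.ndarray:
    # Crop the padded H x W x C prediction back to the pre-padding size
    return pred_hwc[:orig_height, :orig_width]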

Mask "shadows" in some images?

In some predicted images, there is a noticeable imprint of the applied masks. I attached a small example here (part of a bigger image): https://imgur.com/a/19pIbQo
I circled in red the "shadows" that I'm referring to. They are exactly where the random masks were applied.

Worth mentioning: this is a model trained on my own data, using the architecture and configuration proposed here: https://github.com/saic-mdal/lama/blob/main/configs/training/lama-regular.yaml

I am not certain what other information is relevant here but I will provide more if necessary.

Any suggestions on what the issue is here and what could fix it are welcomed. Thanks and thank you for the great project.

Discriminator config

Hello, your paper mentions that the discriminator uses Fourier or dilated convolutions, but I see that the discriminator is set to 'pix2pixhd_nlayer' in the config files. What is the reason?

Question about L1 loss: weight_known vs weight_missing

Hi! First of all, thanks for sharing the code. I have a question about the L1 loss. First, this loss does not appear in the paper, right? Second, about weight_known vs weight_missing: why do you set weight_missing to 0 in most of the configs? As I understand it, this weights the masked part of the image so that the network matches the prediction to the ground truth in the region to be inpainted, i.e. where mask == 1. Why do you set it to 0? Have you studied the effect of this parameter on convergence?

Proposal for the handling of B&W images

Hello!
Thank you for this fantastic model.

I am using it with black-and-white (i.e. grayscale) images, which do not have three channels.
Your code reads images using matplotlib, which loads a B&W image without the expected three channels.

It would be better to read images using OpenCV, which adds the missing channels.

So, replace everything that reads the image like this:

img = plt.imread(fname)[:,:,:3] #this won't work in BnW

by using:

img = cv2.imread(fname) #this adds the extra channel and converts to np array on the fly

I can make the changes in your code and create a pull request if you want.

Cheers, Lucia.
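
A sketch of what the loader could look like (an illustration of the proposal, not the repo's code); note that cv2.imread returns BGR, so converting keeps the result consistent with the existing matplotlib-based RGB loading:

import cv2

def load_rgb(fname):
    # IMREAD_COLOR makes OpenCV return 3 channels even for grayscale files;
    # convert BGR -> RGB to match what the plt.imread-based code expects.
    img = cv2.imread(fname, cv2.IMREAD_COLOR)
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)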

the input_size and out_size of big-lama

Hi,
First of all, thank you for making such a great project open source.
I found that out_size in the released big-lama config.yaml is 256. Was the big-lama model trained with images of size 256?

Doubt about optimizer_idx

Looking at the code, I found the _do_step function; as far as I can tell, via PyTorch Lightning it trains the generator in one iteration and the discriminator in the other. Is this always the recipe in image inpainting? Have you studied other strategies for training the generator and the discriminator, maybe both in the same iteration?

GPU load is severely unbalanced

I have 8 2080 Ti GPUs (11 GB each). When I train, I can only use 4 cards, and the batch size can only be set to 5. GPU 0 occupies 10481 MB, while the other three cards occupy 6906 MB each. I don't know how to solve this; I am not very familiar with PyTorch Lightning. When DDP is used for parallel training in plain PyTorch, the load is balanced.

Question about big-lama.yaml

Hi, thanks for your excellent work.
I have a question about what "weights_path: ${env:TORCH_HOME}" means in big-lama.yaml. When I retrain the model, I don't know what I should write here.
Thank you for your time!
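
A partial answer from the config-mechanics side: ${env:TORCH_HOME} is standard OmegaConf syntax for "resolve to the TORCH_HOME environment variable at runtime", so the value ends up being whatever directory that variable points to. Exporting TORCH_HOME before training (or replacing the interpolation with an explicit path) is enough to make the config resolve; which pretrained weights that directory is expected to contain is best checked in the repo itself.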

Colab error

FileNotFoundError: [Errno 2] No such file or directory: '/content/output/1224276_original_mask.png'


It happens with both custom images and the example ones.
