younggod / sturcture-inpainting

Source code of AAAI 2020 paper 'Learning to Incorporate Structure Knowledge for Image Inpainting'

Python 99.94% Shell 0.04% PureBasic 0.02%
image-inpainting image-completion attention-mechanism pyramid-structure-loss structure-embedding

sturcture-inpainting's Introduction

Learning to Incorporate Structure Knowledge for Image Inpainting

Introduction and source code of the AAAI 2020 paper 'Learning to Incorporate Structure Knowledge for Image Inpainting'. You can get the paper in the AAAI proceedings or here.

Citation

@inproceedings{jie2020inpainting,
  title={Learning to Incorporate Structure Knowledge for Image Inpainting},
  author={Yang, Jie and Qi, Zhiquan and Shi, Yong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={7},
  pages={12605-12612},
  year={2020}
}

Introduction

This project develops a multi-task learning framework that incorporates image structure knowledge to assist image inpainting, which has not been well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets both quantitatively and qualitatively.

The overview of our multi-task framework is shown in the figure below. It leverages structure knowledge through multi-task learning (simultaneous image and structure generation), structure embedding, and an attention mechanism.

architecture
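
For intuition only, the sketch below shows how a shared encoder-decoder might emit the completed image together with multi-scale 6-channel Sobel-gradient predictions, which the pyramid structure loss described next can supervise. This is a simplified, hypothetical illustration, not the repository's inpaint_model.py; the layer widths, depths, and output heads are assumptions, and structure embedding and attention are omitted.

import tensorflow as tf

def shared_generator_sketch(x, cnum=32):
    # Hypothetical simplified generator: one shared trunk, with a structure head
    # (6-channel Sobel-gradient prediction) at each decoder scale and an image head
    # at full resolution. The real model additionally uses structure embedding and attention.
    with tf.variable_scope('generator_sketch'):
        # Encoder
        e1 = tf.layers.conv2d(x, cnum, 3, strides=2, padding='same', activation=tf.nn.elu)
        e2 = tf.layers.conv2d(e1, 2 * cnum, 3, strides=2, padding='same', activation=tf.nn.elu)

        # Decoder, with a structure prediction at each scale for the pyramid loss
        d1 = tf.layers.conv2d_transpose(e2, cnum, 3, strides=2, padding='same', activation=tf.nn.elu)
        struct_half = tf.layers.conv2d(d1, 6, 3, padding='same')   # gradients at 1/2 resolution

        d2 = tf.layers.conv2d_transpose(d1, cnum, 3, strides=2, padding='same', activation=tf.nn.elu)
        struct_full = tf.layers.conv2d(d2, 6, 3, padding='same')   # gradients at full resolution
        image_out = tf.layers.conv2d(d2, 3, 3, padding='same', activation=tf.nn.tanh)

        # The structure predictions are what pyramid_structure_loss(..) receives as `predicts`.
        return image_out, [struct_half, struct_full]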

Pyramid structure loss

We propose a pyramid structure loss to guide structure generation and embedding, thus incorporating structure information into the generation process. Here, the gradient and edge information, which is captured in the Sobel gradient maps shown in the figure below, is used as the structure information.

The loss function pyramid_structure_loss(...) is implemented in structure_loss.py.

def pyramid_structure_loss(image, predicts, edge_alpha, grad_alpha):
    """Pyramid structure loss: supervises multi-scale structure (gradient) predictions,
    with extra weight on edge regions (Canny edges softened by priority_loss_mask)."""
    _, H, W, _ = image.get_shape().as_list()
    loss = 0.
    for predict in predicts:
        _, h, w, _ = predict.get_shape().as_list()
        # Resize the ground truth to the prediction's scale when they differ.
        if h != H:
            gt_img = tf.image.resize_nearest_neighbor(image, size=(h, w))
        else:
            gt_img = image

        # Gradient term: L1 error against the 6-channel Sobel map (3 colors x 2 directions).
        gt_grad = tf.image.sobel_edges(gt_img)
        gt_grad = tf.reshape(gt_grad, [-1, h, w, 6])    # 6 channels
        grad_error = tf.abs(predict - gt_grad)

        # Edge term: Canny edges blurred into a priority mask so nearby pixels also count.
        gt_edge = tf.py_func(canny_edge, [gt_img], tf.float32, stateful=False)
        edge_priority = priority_loss_mask(gt_edge, ksize=5, sigma=1, iteration=2)

        grad_loss = tf.reduce_mean(grad_alpha * grad_error)
        edge_weight = edge_alpha * edge_priority
        edge_loss = tf.reduce_sum(edge_weight * grad_error) / tf.reduce_sum(edge_weight) / 6.    # 6 channels

        loss = loss + grad_loss + edge_loss

    return loss
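
The helpers canny_edge and priority_loss_mask are defined elsewhere in the repository; the following is only a minimal sketch of what they might look like. The Canny thresholds, value ranges, and Gaussian-blur scheme are assumptions, not the repository's exact code.

import cv2
import numpy as np
import tensorflow as tf

def canny_edge(images):
    # Hypothetical sketch: per-image Canny edge maps for a batch assumed to be
    # RGB with values in [0, 255]; returns a float32 [N, H, W, 1] edge mask.
    edges = []
    for img in images:
        gray = cv2.cvtColor(img.astype(np.uint8), cv2.COLOR_RGB2GRAY)
        edges.append(cv2.Canny(gray, 100, 200).astype(np.float32) / 255.)
    return np.expand_dims(np.stack(edges, axis=0), axis=-1)

def priority_loss_mask(mask, ksize=5, sigma=1, iteration=2):
    # Hypothetical sketch: repeatedly blur a binary edge mask ([N, H, W, 1]) with a
    # Gaussian kernel so pixels near an edge also receive (smaller) weights.
    ax = np.arange(ksize, dtype=np.float32) - (ksize - 1) / 2.
    kernel = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2. * sigma ** 2))
    kernel = (kernel / kernel.sum())[:, :, None, None]
    blurred = mask
    for _ in range(iteration):
        blurred = tf.nn.conv2d(blurred, kernel, strides=[1, 1, 1, 1], padding='SAME')
    # Keep original edge pixels at full weight, neighbors at reduced weight.
    return tf.clip_by_value(blurred + mask, 0., 1.)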

Attention Layer

Our attention operation is inspired by the non-local mean mechanism, which has been used for denoising and super-resolution. It computes the response at each position of the output feature map as a weighted sum of the features over the whole input feature map, where the weight (attention score) is measured by feature similarity. When the patch size k=1, it works just like self-attention. Through attention, similar features from the surroundings can be transferred to the missing regions to refine the generated contents and structures (e.g., smoothing artifacts and enhancing details).
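
For illustration, a pooled non-local attention block in TF 1.x might look like the sketch below. This is a hedged approximation, not the repository's attention / attention_with_pooling implementation in painter/ops.py; the 1x1-convolution embeddings, pooling sizes, and zero-initialized residual scale are assumptions.

import tensorflow as tf

def non_local_attention_sketch(x, channels, down_scale=2, name='attention_sketch'):
    # Hypothetical pooled non-local attention block. `channels` is assumed to equal
    # the channel count of `x` so the residual addition at the end is valid.
    with tf.variable_scope(name):
        b = tf.shape(x)[0]
        h, w = tf.shape(x)[1], tf.shape(x)[2]

        # 1x1 convolutions produce query / key / value embeddings.
        q = tf.layers.conv2d(x, channels // 8, 1, name='query')
        k = tf.layers.conv2d(x, channels // 8, 1, name='key')
        v = tf.layers.conv2d(x, channels, 1, name='value')

        # Pool keys/values so the attention matrix stays tractable.
        if down_scale > 1:
            k = tf.layers.average_pooling2d(k, down_scale, down_scale, padding='same')
            v = tf.layers.average_pooling2d(v, down_scale, down_scale, padding='same')

        q_flat = tf.reshape(q, [b, h * w, channels // 8])
        k_flat = tf.reshape(k, [b, -1, channels // 8])
        v_flat = tf.reshape(v, [b, -1, channels])

        # Attention scores: similarity of every query position to every (pooled) key position.
        attn = tf.nn.softmax(tf.matmul(q_flat, k_flat, transpose_b=True))
        o = tf.reshape(tf.matmul(attn, v_flat), [b, h, w, channels])

        # Learnable residual scale, initialized to zero so attention is blended in gradually.
        gamma = tf.get_variable('gamma', [], initializer=tf.zeros_initializer())
        return gamma * o + x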

Some qualitative results

Qualitative

qualitative qualitative

Ablation

ablation

Real-life object removal

Code

Painter

To evaluate the generalization ability of our inpainting models, we carry out object removal experiments in user scenarios. We develop an interactive image removal and completion tool with OpenCV. You may download the checkpoint of the inpainting model pretrained on Places2 training and validation data from here (pass code: uiqn).

Or from Google Drive.

Run painter.py from the command line (we implement our model using TensorFlow 1.15.2 and Python 3.7):

python painter.py --checkpoint checkpoint/places2 --save_path imgs

When doing object removal experiments, it will work like this:


License

CC BY-NC 4.0 (Attribution-NonCommercial 4.0 International). The software is for educational and academic research purposes only.

sturcture-inpainting's People

Contributors

younggod


sturcture-inpainting's Issues

About the pretrained model and results.

First, I'm sorry to bother you. I'm interested in your inpainting method, but I have the following problem: when I ran painter.py with the pretrained model you provided, the result looked like the image below. I sincerely want to know whether the pretrained model needs to be trained again, and why this happened. Looking forward to your reply.
asd_result
mask1

New easy-to-use inpainting method with transformers

Dear researcher, please also consider checking our newly introduced face inpainting method, which addresses the symmetry problems of general inpainting methods by using a Swin transformer and semantic-aware discriminators.
Our proposed method showed better results in terms of FID score and a newly proposed metric that focuses on face symmetry, compared to some of the state-of-the-art methods, including LaMa.
Our paper is available at:
https://www.researchgate.net/publication/366984165_SFI-Swin_Symmetric_Face_Inpainting_with_Swin_Transformer_by_Distinctly_Learning_Face_Components_Distributions

The code will also be published at:
https://github.com/mohammadrezanaderi4/SFI-Swin

Training code

Hi,
Your work is great!
Recently I have been doing some inpainting tasks in specific scenes. Could you please release the training code and settings? I would like to try fine-tuning your pre-trained model to solve some problems that I could not break through before.

Thanks!

Variable size input

Hi, Great work.

I tried feeding the model variable-size input, i.e. 500x500, but it ended up raising an error that dimensions are not equal somewhere inside the model:

Traceback (most recent call last):
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1607, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 124 and 125 for 'inpaint_net/attention_pooling_64/add' (op: 'AddV2') with input shapes: [1,124,124,256], [1,125,125,256].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "painter.py", line 359, in <module>
    ge = Paint(config)
  File "painter.py", line 61, in __init__
    self.setup()
  File "painter.py", line 107, in setup
    args=self.config, reuse=self.reuse)
  File "/home/asim/Desktop/Esper-WorkSpace/Inpainting/sturcture-inpainting/painter/inpaint_model.py", line 117, in evaluate
    mask, args, reuse=reuse,training=training, padding=args.PADDING)
  File "/home/asim/Desktop/Esper-WorkSpace/Inpainting/sturcture-inpainting/painter/inpaint_model.py", line 58, in build_inpaint_net
    x = attention(x, 4 * cnum, down_scale=2, pool_scale=2, name='attention_pooling_64')
  File "/home/asim/Desktop/Esper-WorkSpace/Inpainting/sturcture-inpainting/painter/ops.py", line 437, in attention
    x = attention_with_pooling(x, channels, down_scale=down_scale, pool_scale=pool_scale, name=name)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "/home/asim/Desktop/Esper-WorkSpace/Inpainting/sturcture-inpainting/painter/ops.py", line 475, in attention_with_pooling
    x = gamma * o + x_origin
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/ops/math_ops.py", line 899, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/ops/math_ops.py", line 1197, in _add_dispatch
    return gen_math_ops.add_v2(x, y, name=name)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 549, in add_v2
    "AddV2", x=x, y=y, name=name)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1770, in __init__
    control_input_ops)
  File "/home/asim/anaconda3/envs/structure_inpainting/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1610, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 124 and 125 for 'inpaint_net/attention_pooling_64/add' (op: 'AddV2') with input shapes: [1,124,124,256], [1,125,125,256].
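
This kind of mismatch arises when odd spatial sizes are downsampled and re-upsampled along different paths inside the attention pooling. A possible workaround, offered only as an assumption rather than an official fix (the multiple of 8 and the helper below are illustrative), is to pad the input so its height and width are divisible by the network's overall downsampling factor and crop the output back; run_inpainting is a hypothetical stand-in for the actual inference call.

import numpy as np

def pad_to_multiple(img, multiple=8):
    # Reflect-pad an HxWxC image so H and W are multiples of `multiple`;
    # also return the original size so the output can be cropped back.
    h, w = img.shape[:2]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    padded = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)), mode='reflect')
    return padded, (h, w)

# Usage sketch (run_inpainting is hypothetical, not a function from this repository;
# the mask is assumed to be HxWx1):
# padded_img, (h, w) = pad_to_multiple(image, multiple=8)
# padded_mask, _ = pad_to_multiple(mask, multiple=8)
# output = run_inpainting(padded_img, padded_mask)
# result = output[:h, :w]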
