aimagelab / dress-code

Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022

License: Other

Python 100.00%
dress-code virtual-try-on computer-vision deep-learning artificial-intelligence eccv2022

Dress Code Dataset

This repository presents the virtual try-on dataset proposed in:

D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara
Dress Code: High-Resolution Multi-Category Virtual Try-On

[Paper] [Dataset Request Form] [Try-On Demo]

IMPORTANT!

  • By making any use of the Dress Code Dataset, you accept and agree to comply with the terms and conditions reported here.
  • The dataset will not be released to private companies.
  • When filling the dataset request form, non-institutional emails (e.g. gmail.com) are not allowed.
  • The signed release agreement form is mandatory (see the dataset request form for more details). Incomplete or unsigned release agreement forms are not accepted and will not receive a response. Typed signatures are not allowed.

Please cite with the following BibTeX:

@inproceedings{morelli2022dresscode,
  title={{Dress Code: High-Resolution Multi-Category Virtual Try-On}},
  author={Morelli, Davide and Fincato, Matteo and Cornia, Marcella and Landi, Federico and Cesari, Fabio and Cucchiara, Rita},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}

Dataset

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.
The dataset contains more than 50k high-resolution model-clothing image pairs divided into three categories (i.e., dresses, upper-body clothes, lower-body clothes).

Summary

  • 53792 garments
  • 107584 images
  • 3 categories
    • upper body
    • lower body
    • dresses
  • 1024 x 768 image resolution
  • additional info
    • keypoints
    • skeletons
    • human label maps
    • human dense poses

Additional Info

Along with each model and garment image pair, we also provide the keypoints, skeleton, human label map, and dense pose.

Keypoints

For all image pairs of the dataset, we stored the joint coordinates of human poses. In particular, we used OpenPose [1] to extract 18 keypoints for each human body.

For each image, we provide a JSON file containing a dictionary with the keypoints key. The value of this key is a list of 18 elements, one for each joint of the human body. Each element is a list of 4 values, of which the first two are the x and y coordinates, respectively.
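As an illustration of the layout described above, the following minimal sketch reads the x and y coordinates from a keypoints JSON string. The sample string is made up and shortened to two joints (a real file would hold 18), and the helper name load_xy is hypothetical. Only the first two values of each element are used, since those are the only ones documented here.

```python
import json

# Hypothetical, shortened sample in the documented layout (values are made up).
sample = '{"keypoints": [[152.0, 0.0, 0.88, 0.0], [172.0, 59.0, 0.94, 1.0]]}'


def load_xy(json_text):
    """Return the (x, y) coordinates of every joint in a keypoints JSON string."""
    joints = json.loads(json_text)["keypoints"]
    # Only the first two values of each element are documented as x and y.
    return [(joint[0], joint[1]) for joint in joints]


xy = load_xy(sample)
print(xy)  # [(152.0, 0.0), (172.0, 59.0)]
```

For a file on disk, the same function can be applied to the text returned by reading the file.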

Skeletons

Skeletons are RGB images obtained by connecting keypoints with lines.
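The idea of connecting keypoints can be sketched as follows. This is not the code used to render the dataset's skeleton images: the joint pairs in LIMBS are a hypothetical subset chosen for illustration, and the function name skeleton_segments is an assumption. The sketch only computes the line segments; drawing them (for example with Pillow's ImageDraw.line) is left out.

```python
# Hypothetical subset of joint index pairs to connect; the pairs actually used
# to render the dataset's skeleton images are not stated in this README.
LIMBS = [(0, 1), (1, 2), (2, 3)]


def skeleton_segments(points, limbs=LIMBS):
    """Return one ((x1, y1), (x2, y2)) segment per connected joint pair."""
    segments = []
    for a, b in limbs:
        # Skip pairs that refer to joints outside the provided list.
        if a < len(points) and b < len(points):
            segments.append((points[a], points[b]))
    return segments


points = [(0, 0), (1, 1), (2, 2)]
print(skeleton_segments(points))  # [((0, 0), (1, 1)), ((1, 1), (2, 2))]
```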

Human Label Map

We employed a human parser to assign each pixel of the image to a specific category, thus obtaining a segmentation mask for each target model. Specifically, we used the SCHP model [2] trained on the ATR dataset, a large single-person human parsing dataset focused on fashion images with 18 classes.

The resulting images have a single channel in which each pixel holds its category label value. Categories are mapped as follows:

 0    background
 1    hat
 2    hair
 3    sunglasses
 4    upper_clothes
 5    skirt
 6    pants
 7    dress
 8    belt
 9    left_shoe
10    right_shoe
11    head
12    left_leg
13    right_leg
14    left_arm
15    right_arm
16    bag
17    scarf
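
A minimal sketch of how this mapping might be used, for example to derive a binary mask of selected categories from a label map. The label map here is a toy nested list rather than a real dataset image, and the names LABELS, NAME_TO_ID, and binary_mask are hypothetical. With an image loaded as a NumPy array, the same selection could be written with np.isin.

```python
# Category ids as listed in the table above.
LABELS = {
    0: "background", 1: "hat", 2: "hair", 3: "sunglasses",
    4: "upper_clothes", 5: "skirt", 6: "pants", 7: "dress", 8: "belt",
    9: "left_shoe", 10: "right_shoe", 11: "head", 12: "left_leg",
    13: "right_leg", 14: "left_arm", 15: "right_arm", 16: "bag", 17: "scarf",
}
NAME_TO_ID = {name: idx for idx, name in LABELS.items()}


def binary_mask(label_map, wanted_ids):
    """Return a 0/1 mask that is 1 where the label is one of wanted_ids."""
    return [[1 if value in wanted_ids else 0 for value in row] for row in label_map]


# Toy 2x3 label map (made up): background, upper_clothes, dress / hair, upper_clothes, pants.
toy = [[0, 4, 7], [2, 4, 6]]
wanted = {NAME_TO_ID["upper_clothes"], NAME_TO_ID["dress"]}
print(binary_mask(toy, wanted))  # [[0, 1, 1], [0, 1, 0]]
```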

Human Dense Pose

We also extracted dense labels and UV mappings from all the model images using DensePose [3].

Experimental Results

Low Resolution 256 x 192

Name             SSIM     FID      KID
CP-VTON [4]      0.803    35.16    2.245
CP-VTON+ [5]     0.902    25.19    1.586
CP-VTON* [4]     0.874    18.99    1.117
PFAFN [6]        0.902    14.38    0.743
VITON-GT [7]     0.899    13.80    0.711
WUTON [8]        0.902    13.28    0.771
ACGPN [9]        0.868    13.79    0.818
OURS             0.906    11.40    0.570

Code

Because of a collaboration with a company, we cannot release the code. However, we provide an empty PyTorch project for loading the data.

References

[1] Cao, et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." IEEE TPAMI, 2019.

[2] Li, et al. "Self-Correction for Human Parsing." arXiv, 2019.

[3] Güler, et al. "DensePose: Dense Human Pose Estimation in the Wild." CVPR, 2018.

[4] Wang, et al. "Toward Characteristic-Preserving Image-based Virtual Try-On Network." ECCV, 2018.

[5] Minar, et al. "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On." CVPR Workshops, 2020.

[6] Ge, et al. "Parser-Free Virtual Try-On via Distilling Appearance Flows." CVPR, 2021.

[7] Fincato, et al. "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations." ICPR, 2020.

[8] Issenhuth, et al. "Do Not Mask What You Do Not Need to Mask: a Parser-Free Virtual Try-On." ECCV, 2020.

[9] Yang, et al. "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content." CVPR, 2020.

Contact

If you have general questions about our dataset, please use the public issues section of this GitHub repository. Alternatively, drop us an e-mail at davide.morelli [at] unimore.it or marcella.cornia [at] unimore.it.

dress-code's Issues

Why even make a repo???

I don't care if you have a deal with a corporation. Don't make a repo if you aren't going to show the code from your paper. Your paper is not reproducible and therefore likely just hard coded (seeing as you have no option to upload clothing or a model).

Pathetic unscientific work.

mask of cloth

Hello
How are you?
Thanks for contributing to this project.
I found that your dataset does NOT support the binary mask of cloth.
We need to use it.
Could u help me?

Could we tune the model with more pictures of one garment?

If full-angle 360-degree pictures of a garment are available, can they be used to improve the final generated result?

The current model is trained on clothing images taken from a single angle.

The details of the generated results cannot be used directly for commercial purposes.

Warper and HPE training hyperparams

I noticed that, according to the paper, the learning rate for both the warper and the HPE was fixed during training (100k/50k iterations). With these parameters I see the loss stagnating after a few thousand iterations.

Have you tried LR decay or different params? Why did you use such a high number of iterations, did you have steady decrease of loss over 50k iterations?

question about keypoint

Thanks for your work! In the keypoints data, the first two elements of each point are coordinates, and I guess the third element is the confidence. What does the fourth element represent?
I originally thought that the fourth element was the index, but I came across a counterexample, shown below:
{"keypoints": [ [152.0, 0.0, 0.8896152973175049, 0.0], [172.0, 59.0, 0.9442344903945923, 1.0], [137.0, 64.0, 0.9105486273765564, 2.0], [122.0, 143.0, 0.9645923376083374, 3.0], [171.0, 182.0, 0.7813410758972168, 4.0], [211.0, 56.0, 0.9262291193008423, 5.0], [212.0, 138.0, 0.8982189297676086, 6.0], [191.0, 210.0, 0.8128728866577148, 7.0], [143.0, 194.0, 0.841970682144165, 9.0], [141.0, 316.0, 0.8266096711158752, 10.0], [140.0, 443.0, 0.8687180876731873, 11.0], [198.0, 194.0, 0.787242591381073, 12.0], [215.0, 316.0, 0.8372030258178711, 13.0], [239.0, 453.0, 0.8752306699752808, 14.0], [144.0, 0.0, 0.6165196299552917, 15.0], [163.0, 0.0, 0.6212925910949707, 16.0], [141.0, 0.0, 0.13327936828136444, 17.0], [184.0, 0.0, 0.7300758361816406, 18.0] ]}
There is no point 8 here, but point 18 appears instead.

Are keypoints taken at different resolution?

Hi! Thank you for your work on the dataset!
When doing some data exploration I encountered something peculiar.
When I visualize the skeleton defined by the OpenPose keypoints, I get a scaled and offset skeleton relative to the image at 768x1024 resolution:
[image]
When I perform the visualization on images rescaled to 384x512, the skeleton aligns perfectly:
[image]

So my question is whether the keypoints were created using the 384x512 images rather than the full-resolution ones.

Kind Regards!

get Dataset

Hi marcellacornia,
I have already filled in the form linked in the README, but why haven't I received any feedback e-mails or download links?

dataset link

Hello, I am interested in this dataset, but I have not received any reply after filling out the form. Is the link still valid?

Incorrect Segmentations

Hello,

I was using the segmentations to crop the garment portion for each model image and ran across troubles for 010560_0.jpg and 018112_0.jpg. I believe these segmentations might not be correct.

Thanks so much for such a well put-together dataset!

dataset request form link

Hi! I'm very interested in this dataset, but I can't open the dataset request form link. Does it still work?

how to align image

Thanks for your great work. I find that the images in the dataset are aligned in some way. Could you share the code you used to align them, so that I can test images downloaded from the web? Thanks!

Human Parsing Technique

If I have my own dataset, how can I obtain human parsing maps like the ones in this dataset?

Could you please share about that?

Details around m

Great work, thanks for sharing it.

I am trying to better understand the following part in the paper:

to produce such masked
representation, we use both the target label map to extract the clothes area and
the pose map to extract the area of the limbs. These areas are then merged to
form the mask which is then dilated to avoid the model getting information about
the target shape. Finally, all the non-modifiable areas in the image (e.g. face,
hands, hairs, etc.) are subtracted from the generated mask. The final mask is
then applied to the image I.

  • Were the masks for the limbs extracted from the densepose output?

  • What was the dilation size?

Thank you!

keypoints json file generation from person image.

Hi, thanks for your great work.
How can I get a keypoints JSON file from a person image?
In pytorch-openpose, the keypoints JSON has the "body_pose" and "had_pose" keys, but in the dataset it has only the "keypoints" key. How can I resolve this?
