
plutoyuxie / autoencoder-ssim-for-unsupervised-anomaly-detection-


Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders

Python 100.00%
ssim-loss mvtec-ad unsupervised-anomaly-detection autoencoder anomaly-localization anomaly-segmentation

autoencoder-ssim-for-unsupervised-anomaly-detection-'s Introduction

AutoEncoder with SSIM loss

This is a third-party implementation of the paper "Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders".

Requirements

tensorflow==2.2.0
scikit-image (imported as skimage)

Datasets

MVTec AD datasets https://www.mvtec.com/company/research/datasets/mvtec-ad/

Code examples

Step 1. Set the DATASET_PATH variable.

Set the DATASET_PATH to the root path of the downloaded MVTec AD dataset.
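
Before training, it can help to confirm that the path really points at the dataset root. The sketch below is not part of the repository. It assumes the usual MVTec AD layout of one folder per category containing train/good and test, and the helper names missing_dirs and count_training_images are hypothetical.

```python
from pathlib import Path

# Assumed layout (not taken from the repository): <root>/<category>/train/good
# holds the anomaly-free training images and <root>/<category>/test holds one
# sub-folder per defect type.
EXPECTED_SUBDIRS = ("train/good", "test")


def missing_dirs(dataset_path, category):
    """Return the expected sub-directories that are absent for one category."""
    root = Path(dataset_path) / category
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


def count_training_images(dataset_path, category):
    """Count the PNG files in the anomaly-free training folder."""
    good_dir = Path(dataset_path) / category / "train" / "good"
    return len(list(good_dir.glob("*.png")))
```

For example, missing_dirs(DATASET_PATH, "bottle") should return an empty list when the archive was extracted directly under DATASET_PATH.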

Step 2. Train SSIM-AE and Test.

  • bottle object
python train.py --name bottle --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0.
python test.py --name bottle --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask W
  • cable object
python train.py --name cable --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name cable --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500
  • capsule object
python train.py --name capsule --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name capsule --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask W
  • carpet texture
python train.py --name carpet --loss ssim_loss --im_resize 512 --patch_size 128 --z_dim 100 --do_aug --rotate_angle_vari 10
python test.py --name carpet --loss ssim_loss --im_resize 512 --patch_size 128 --z_dim 100
  • grid texture
python train.py --name grid --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --grayscale --do_aug 
python test.py --name grid --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --grayscale
  • hazelnut object
python train.py --name hazelnut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate_crop 0.
python test.py --name hazelnut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B 
  • leather texture
python train.py --name leather --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug
python test.py --name leather --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100
  • metal_nut object
python train.py --name metal_nut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate_crop 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name metal_nut --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B 
  • pill object
python train.py --name pill --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_horizonal_flip 0. --p_vertical_flip 0.
python test.py --name pill --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --bg_mask B
  • screw object
python train.py --name screw --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --do_aug --p_rotate 0.
python test.py --name screw --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --bg_mask W
  • tile texture
python train.py --name tile --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug
python test.py --name tile --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100
  • toothbrush object
python train.py --name toothbrush --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_vertical_flip 0.
python test.py --name toothbrush --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500
  • transistor object
python train.py --name transistor --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --do_aug --p_rotate 0. --p_vertical_flip 0.
python test.py --name transistor --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 
  • wood texture
python train.py --name wood --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 --do_aug --rotate_angle_vari 15
python test.py --name wood --loss ssim_loss --im_resize 256 --patch_size 128 --z_dim 100 
  • zipper object
python train.py --name zipper --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale --do_aug --p_rotate 0.
python test.py --name zipper --loss ssim_loss --im_resize 266 --patch_size 256 --z_dim 500 --grayscale 

Overview of Results

Classification
At test time, I classify an image as defective if the residual map contains any anomalous response. This rule is strict for anomaly-free images, which leads to relatively low accuracy in the ok column below.
Note that the threshold strongly affects the outcome and should be selected carefully.
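
The rule described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's test code: the residual map is taken to be a 2-D array of per-pixel scores, and is_defective and accuracy_by_class are hypothetical names. Treating the third value as overall accuracy over all test images is also an assumption about how the average column is computed.

```python
import numpy as np


def is_defective(residual_map, threshold):
    """Flag an image as defective if any residual pixel exceeds the threshold."""
    return bool(np.any(np.asarray(residual_map) > threshold))


def accuracy_by_class(predictions, labels):
    """Return (ok accuracy, nok accuracy, overall accuracy) in percent.

    labels: True for defective (nok) images, False for anomaly-free (ok) ones.
    Assumes both classes are present in labels.
    """
    pred = np.asarray(predictions, dtype=bool)
    lab = np.asarray(labels, dtype=bool)
    ok_acc = 100.0 * np.mean(~pred[~lab])    # ok images predicted as ok
    nok_acc = 100.0 * np.mean(pred[lab])     # nok images predicted as nok
    overall = 100.0 * np.mean(pred == lab)   # all images together
    return ok_acc, nok_acc, overall
```

Because a single pixel above the threshold is enough to flag an image, lowering the threshold trades ok accuracy for nok accuracy, which is why its choice matters so much.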

category      ok      nok     average
bottle        90.0    98.4    96.4
cable         0.0     45.7    28.0
capsule       34.8    89.6    78.0
carpet        42.9    98.9    88.9
grid          100     94.7    96.2
hazelnut      55.0    98.6    82.7
leather       71.9    92.4    87.1
metal_nut     22.7    67.7    59.1
pill          11.5    75.9    65.9
screw         0.5     90.0    68.1
tile          100.0   3.6     30.8
toothbrush    83.3    100     95.2
transistor    23.3    97.5    53.0
wood          89.5    76.7    79.7
zipper        68.8    81.5    78.8
* Accuracy in percent. SSIM loss, 200 epochs, different thresholds.

Discussion

  • SSIM + L1 metrics
    SSIM measures similarity between grayscale images only, so it misses color defects in some cases. I therefore use SSIM together with the L1 distance for anomaly segmentation.
  • VAE
    I tried a VAE and observed no performance improvement.
  • InstanceNorm
    I also tried adding an IN layer to accelerate convergence, but droplet artifacts appear in some cases. This artifact is also discussed in the StyleGAN2 paper.
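
To make the SSIM + L1 point concrete, here is a minimal NumPy-only sketch. It is not the repository's code: the uniform window, the OR combination of the two residuals, and the names to_gray, ssim_map and combined_mask are all assumptions. In practice skimage.metrics.structural_similarity with full=True returns a comparable SSIM map.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def to_gray(img):
    """Average the channels of an H x W x C image; pass 2-D images through."""
    img = np.asarray(img, dtype=np.float64)
    return img.mean(axis=-1) if img.ndim == 3 else img


def ssim_map(x, y, win_size=11, data_range=1.0):
    """SSIM per window position for two grayscale images (uniform window).

    The output covers only positions where the window fits, so its shape is
    (H - win_size + 1, W - win_size + 1). Memory grows with win_size ** 2.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    wx = sliding_window_view(x, (win_size, win_size))
    wy = sliding_window_view(y, (win_size, win_size))
    mu_x, mu_y = wx.mean(axis=(-2, -1)), wy.mean(axis=(-2, -1))
    var_x, var_y = wx.var(axis=(-2, -1)), wy.var(axis=(-2, -1))
    cov = (wx * wy).mean(axis=(-2, -1)) - mu_x * mu_y
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))


def combined_mask(img, recon, ssim_threshold, l1_threshold, win_size=11):
    """Mark a pixel anomalous if either residual exceeds its threshold.

    win_size must be odd so that both residual maps can be aligned.
    """
    img = np.asarray(img, dtype=np.float64)
    recon = np.asarray(recon, dtype=np.float64)
    ssim_res = 1.0 - ssim_map(to_gray(img), to_gray(recon), win_size)
    l1_res = np.abs(img - recon)
    if l1_res.ndim == 3:
        l1_res = l1_res.mean(axis=-1)
    r = win_size // 2  # crop the L1 map to the region covered by the SSIM map
    l1_res = l1_res[r:l1_res.shape[0] - r, r:l1_res.shape[1] - r]
    return (ssim_res > ssim_threshold) | (l1_res > l1_threshold)
```

A pure color shift that leaves the channel average unchanged produces no SSIM residual here, while the L1 residual still flags it. That is the failure case the first bullet describes.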

Supplementary materials

My notes https://www.yuque.com/books/share/8c7613f7-7571-4bfa-865a-689de3763c59?# password ixgg

References

Paul Bergmann, Sindy Löwe, Michael Fauser, David Sattlegger, Carsten Steger. Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders; January 2019, pp. 372-380. doi: 10.5220/0007364503720380

Paul Bergmann, Michael Fauser, David Sattlegger, Carsten Steger. MVTec AD - A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection; in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

autoencoder-ssim-for-unsupervised-anomaly-detection-'s Issues

Threshold

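Hello,

  I trained for 200 epochs with the default parameters and dataset, then ran inference with the final trained model. For each MVTec category, what are the best ssim_threshold and l1_threshold values for reproducing the classification results listed in the README?

  If I simply use the ssim_threshold and l1_threshold that the code derives at inference time, are those the best thresholds? Thanks.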
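
One common way to derive such a threshold automatically is to take a high percentile of the residual values measured on anomaly-free images. The sketch below illustrates that idea only. Whether the repository's percent option (98.0 in the opt.txt dump further down this page) works this way is an assumption, and estimate_threshold is a hypothetical name.

```python
import numpy as np


def estimate_threshold(residual_maps, percent=98.0):
    """Return the given percentile of all residual values.

    residual_maps: iterable of arrays computed on anomaly-free images.
    """
    values = np.concatenate([np.ravel(r) for r in residual_maps])
    return float(np.percentile(values, percent))
```

A threshold chosen this way accepts that roughly (100 - percent) percent of normal pixels will exceed it, so it is a starting point rather than a guaranteed optimum for a given category.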

Good results for bottle data, but worse for cable data

[Figure: bottle.jpg]
The figure above shows the ground truths and the outputs obtained for the bottle data. They look good.

[Figure: untitled]
The figure above shows the ground truths and the outputs obtained for the cable data. The results are not good. I used the same arguments as in the README for training and testing. Is there any way to make the results more accurate?

Reconstructed Image

Thank you for providing the code. In my case, the reconstructed images are completely gray and the residual images are completely black. Please help me with this issue.

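Stuck in a local optimum

Thank you for the open-source code! I found that without BatchNormalization after the conv or deconv layers, the trained model outputs only gray images and gets stuck in a local optimum. After adding BatchNormalization, the reconstructions work. Did you run into this problem in your own experiments?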

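Help needed

Thank you for the quick reply. After debugging, I found that the cause was insufficient GPU memory. My GPU is a 2080 Ti with 11 GB of memory. With batch_size set to 20 it barely runs, but I feel the training results are poor this way. Do you have any tuning suggestions? I am new to the field and would appreciate your guidance.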

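Request for evaluation metrics

Could you provide evaluation metrics that quantitatively compute the classification AUC and the segmentation AUC, as well as the F1-score and AUPRO?
A Dice segmentation metric and Youden's statistic for binary classification would be even better.
The training code is good as it stands, but the test code only produces segmentation residuals, so quantitative evaluation is not possible.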
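
Several of the requested metrics can be computed from anomaly scores and binary predictions with a few lines of NumPy. The sketch below is not part of the repository and does not cover AUPRO. The names roc_auc, f1_and_youden and dice are hypothetical, and every function assumes that both classes are present.

```python
import numpy as np


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. The pairwise comparison uses O(P * N) memory, so it
    suits image-level scores rather than full pixel maps.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


def f1_and_youden(pred, labels):
    """F1-score and Youden's J (sensitivity + specificity - 1)."""
    pred = np.asarray(pred, dtype=bool)
    labels = np.asarray(labels, dtype=bool)
    tp = np.sum(pred & labels)
    fp = np.sum(pred & ~labels)
    fn = np.sum(~pred & labels)
    tn = np.sum(~pred & ~labels)
    f1 = 2 * tp / (2 * tp + fp + fn)
    youden = tp / (tp + fn) + tn / (tn + fp) - 1
    return float(f1), float(youden)


def dice(pred_mask, gt_mask):
    """Dice coefficient between a predicted and a ground-truth binary mask."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    return float(2 * np.sum(pred & gt) / (pred.sum() + gt.sum()))
```

For image-level classification, one possible score is the maximum of each residual map. For segmentation, the same F1 and Dice functions can be applied to flattened pixel masks.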

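Help needed

Hello, I have a question. When the code reaches:
autoencoder.fit(data_train, epochs=EPOCHS, validation_data=data_valid, callbacks=[checkpoint, earlystopping])
it raises the error: buffer_size must be greater than zero.
I debugged for a long time without solving it, so I would like to ask:
1. How should the training dataset folders be arranged? I used the extracted archive as is.
2. How should the folders under the cfg.aug_dir path be arranged? Do I need to put the training data into train_patches manually?
Thanks!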
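
One plausible cause, not confirmed by the source: tf.data.Dataset.shuffle raises this error when it is given a buffer size of zero, which would happen if the buffer size is derived from the number of training patches and none were found. A quick check before training is sketched below. The "*.png" pattern and the names count_patches and check_patches are assumptions.

```python
from pathlib import Path


def count_patches(aug_dir, pattern="*.png"):
    """Count files directly inside aug_dir that match the (assumed) pattern."""
    return sum(1 for _ in Path(aug_dir).glob(pattern))


def check_patches(aug_dir):
    """Fail early with a readable message when no training patches exist."""
    n = count_patches(aug_dir)
    if n == 0:
        raise FileNotFoundError(
            f"No training patches found in {aug_dir}; "
            "check the dataset path and the augmentation step."
        )
    return n
```

If the count is zero, the training images were probably not found at train_data_dir, or the augmentation step that fills train_patches did not run.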

Some code problems

I created the environment as required and changed the related paths and parameters, but the error "UnboundLocalError: local variable 'logs' referenced before assignment" always appears. I tried repeatedly to solve the problem and consulted other people, but the result only got worse. Could you share more detailed version information for the related libraries, and tell me whether anything else needs attention?

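Can a general model be trained?

Hello, and thank you for the open-source implementation. It works well on my own dataset, but I have a question. My positive samples may contain several different texture backgrounds. Training on a dataset with a single texture background gives good results, but the model is only valid for that background, and training on images with different backgrounds together hardly converges. Is there a way to train a general model? Or what kind of optimization would allow samples such as leather and wood from the MVTec AD dataset to be trained together? Thanks!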

opt.txt

------------ Options -------------
aug_dir: C:/Users/Machine/Desktop/liuQ/6.25/code1/AE_results/carpet/train_patches
augment_num: 100
batch_size: 20
chechpoint_dir: C:/Users/Machine/Desktop/liuQ/6.25/code1/AE_results/carpet/chechpoints/ssim_loss
decay: 1e-05
depress_edge_pixel: 10
depress_edge_ratio: 0.2
do_aug: False
early_stop_n: 50
epochs: 100
flc: 32
grayscale: False
im_resize: 256
input_channel: 3
l1_threshold: None
loss: ssim_loss
lr: 0.0002
name: carpet
p_crop: 1
p_horizonal_flip: 0.3
p_rotate: 0.3
p_rotate_crop: 1.0
p_vertical_flip: 0.3
patch_size: 32
percent: 98.0
rotate_angle_vari: 45.0
save_dir: C:/Users/Machine/Desktop/liuQ/6.25/code1/AE_results/carpet/reconst/ssim_l1_metric_ssim_loss
save_model_frequency: 1
save_snapshot: False
simplified: False
ssim_threshold: None
ssim_win_size: 11
stride: 32
sub_folder: ['000.png', '001.png', '002.png', '003.png', '004.png', '005.png', '006.png', '007.png', '008.png', '009.png', '010.png', '011.png', '012.png', '013.png', '014.png', '015.png', '016.png', '017.png', '018.png']
test_dir: C:/Users/Machine/Desktop/liuQ/6.25/code1/data/carpet/test/color
train_data_dir: C:/Users/Machine/Desktop/liuQ/6.25/code1/data/carpet/train/good
valid_data_ratio: 0.2
weight: 10
weight_file: None
z_dim: 512
-------------- End ----------------
Is this the file you mentioned? Could we add each other as contacts? QQ: 1103511435

Code is not running

Hi, I want to use a VAE architecture for unsupervised anomaly detection, but this code does not run and raises many errors. I suppose some deprecated functions are used in the code. I would appreciate it if you could share the code for the VAE as well.

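Question about metric evaluation

Hello author, have you measured the metrics of this reimplementation on MVTec? I ran it, and the results seem to be roughly 5 to 10 points lower than those shown in the paper.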
