This project forked from jnhwkim/orthoad

Semi-Orthogonal Embedding for Efficient Unsupervised Anomaly Segmentation

License: GNU General Public License v3.0

We use a semi-orthogonal embedding for unsupervised anomaly segmentation. Multi-scale features from pre-trained CNNs have recently been used to compute localized Mahalanobis distances with strong performance. Here, we aim for a robust approximation that cubically reduces the computational cost of inverting the multi-dimensional covariance tensor. The proposed method achieves a new state of the art by a significant margin on the MVTec AD (.942 and .982 for PRO and ROC, respectively), KolektorSDD, KolektorSDD2, and mSTC datasets.
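The idea can be illustrated with a minimal NumPy sketch for a single spatial location. It is not the code in this repository: it assumes the semi-orthogonal matrix W comes from the QR decomposition of a random Gaussian matrix and that the covariance is estimated and regularized after embedding. The function names, `eps`, and the toy dimensions are all illustrative.

```python
import numpy as np


def semi_orthogonal(f, k, rng):
    """Return an (f, k) matrix W with orthonormal columns, so W.T @ W = I_k."""
    # Reduced QR of a Gaussian matrix; q has shape (f, k) when f >= k.
    q, _ = np.linalg.qr(rng.standard_normal((f, k)))
    return q


def fit(train_feats, w, eps=1e-2):
    """Estimate mean and precision from (N, F) features of anomaly-free images."""
    z = train_feats @ w                                # embed: (N, F) -> (N, k)
    mean = z.mean(axis=0)
    cov = np.cov(z, rowvar=False) + eps * np.eye(w.shape[1])  # regularized (k, k)
    return mean, np.linalg.inv(cov)                    # inverts k x k, not F x F


def mahalanobis(feat, w, mean, precision):
    """Mahalanobis distance of one (F,) feature vector in the embedded space."""
    d = feat @ w - mean
    return float(np.sqrt(d @ precision @ d))


rng = np.random.default_rng(0)
w = semi_orthogonal(f=64, k=8, rng=rng)                # toy sizes, chosen arbitrarily
train = rng.standard_normal((200, 64))
mean, precision = fit(train, w)
print(mahalanobis(train[0], w, mean, precision))
```

Matrix inversion costs roughly the cube of the matrix size, so embedding F-dimensional features into k dimensions before inverting is where the cubic saving in this sketch comes from.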

Requirements

  • PyTorch 1.2 (not tested for < 1.2)
  • Install dependencies using
conda install --file requirements.txt

or

pip install -r requirements.txt
apt-get install libxrender1 libsm6 libglib2.0-0 libxext6 libgl1-mesa-glx  # for opencv

MobileNetv3

git clone https://github.com/d-li14/mobilenetv3.pytorch.git ../mobilenetv3.pytorch
ln -s ../mobilenetv3.pytorch/mobilenetv3.py mobilenetv3.py

Dataset

For the MVTec AD dataset, please download the MVTec AD dataset and place it under the --dataroot path.

For the Kolektor Surface-Defect Dataset (KolektorSDD), please visit this site and this site.

For the ShanghaiTech Campus dataset (mSTC), the link to the official site is broken. Please use the Baidu Disk link given on the MLEP GitHub page to obtain the dataset and place it under the --dataroot path. Once the dataset is downloaded, run bash stc_preprocess.sh to preprocess it. The script unzips the .zip files under ./converted and converts the training videos (.avi format) and pixel masks (.npy format) into frames (.jpg format) under ./archive. It takes about 1.5 hours on an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.

Training

MVTec AD

The environment variable DATA is used for the option --dataroot. For the MVTec AD dataset, source ./script/setup.sh MVTec_AD will recursively find the path to the MVTec_AD directory and set the environment variable. A similar approach should work for KolektorSDD and KolektorSDD2.
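One common way to wire an environment variable into a command-line option is to use it as the option's default. The sketch below shows that pattern with argparse. `build_parser` is a hypothetical helper, and train.py's actual argument handling may differ.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser: --dataroot falls back to the DATA environment variable."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dataroot",
        default=os.environ.get("DATA"),  # None if DATA is not set
        help="dataset root directory; defaults to the DATA environment variable",
    )
    return parser


# Usage: an explicit --dataroot on the command line overrides DATA.
# args = build_parser().parse_args()
# print(args.dataroot)
```

With this pattern, running source ./script/setup.sh MVTec_AD in the shell first means the option can be omitted from later commands in that shell.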

Run train.py with the --category option; it performs the evaluation after training. You might need 12 GB or more of GPU memory to run this script.

python train.py --category carpet --metric auproc --fpr 0.3 # auproc
python train.py --category carpet --metric auroc --fpr 1.0  # auroc

KolektorSDD

The script below runs the three folds of the KolektorSDD dataset and the KolektorSDD2 dataset.

./script/run_kolektor.sh

mSTC

For preprocessing, please run ./tools/stc_preprocess.sh as described above.

./script/run_stc.sh

For more options, please run:

python train.py -h

Visualization

After running train.py, run the command below to visualize the results using matplotlib. The PDF file is saved under the path given by --ckpt.

python visualizer.py --category carpet --ckpt /path/to/save

Performance

Previous work [Bergmann'19] proposes a threshold-free metric based on the per-region overlap (PRO). Like the area under the receiver operating characteristic (ROC) curve, it integrates over thresholds, but the true positive rate is averaged over the connected components of the ground truth instead of being computed over all anomalous pixels. Because a single large region can otherwise overwhelm the scores of small regions, PRO rewards sensitivity to every region regardless of its size. The area is computed up to a false-positive rate of 30% (100% for ROC). The ROC is a natural basis for cost-benefit analysis of anomaly decisions.
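The following pure-Python sketch illustrates the metric on a toy image. It is not this repository's evaluation code. It assumes 4-connectivity for the ground-truth regions, and it assumes the area is normalized by the false-positive-rate limit. All function names are illustrative.

```python
from collections import deque


def connected_regions(mask):
    """Yield each 4-connected region of truthy pixels as a list of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield region


def pro_and_fpr(scores, gt, threshold):
    """Per-region overlap and pixel-wise false-positive rate at one threshold."""
    pred = [[s >= threshold for s in row] for row in scores]
    overlaps = [sum(pred[y][x] for y, x in region) / len(region)
                for region in connected_regions(gt)]
    normal = [(y, x) for y, row in enumerate(gt) for x, v in enumerate(row) if not v]
    fpr = sum(pred[y][x] for y, x in normal) / len(normal)
    return sum(overlaps) / len(overlaps), fpr


def normalized_area(points, max_fpr):
    """Trapezoidal area under (fpr, value) points up to max_fpr, divided by max_fpr."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:  # clip the last segment by linear interpolation
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area / max_fpr


# Toy example: one 4-pixel defect is fully detected, one 1-pixel defect is missed.
gt = [[1, 1, 0, 0, 0, 0],
      [1, 1, 0, 0, 0, 1],
      [0, 0, 0, 0, 0, 0]]
scores = [[.9, .9, .1, .1, .1, .1],
          [.9, .9, .1, .1, .1, .2],
          [.1, .1, .1, .6, .1, .1]]
pro, fpr = pro_and_fpr(scores, gt, threshold=0.5)
print(pro, fpr)  # PRO is 0.5, FPR is 1/13
```

In this toy case the pixel-wise true positive rate would be 4/5 = 0.8, while the PRO is 0.5 because the missed small region counts as much as the large one. Sweeping the threshold and passing the resulting (FPR, PRO) pairs to `normalized_area` with `max_fpr=0.3` gives a score of the kind described above.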

MVTec AD

Model                  PRO   ROC
L2-AE [Bergmann'20]    .790  .820
SSIM-AE [Yi'20]        -     .818
Student [Bergmann'20]  .857  -
VE VAE [Liu'20]        -     .861
VAE Proj [Dehaene'20]  -     .893
Patch-SVDD [Yi'20]     -     .957
SPADE [Cohen'20]       .917  .965
PaDiM [Defard'20]      .921  .979
Ours                   .942  .982

Note that the SSIM-AE results reported in [Bergmann'20] and [Yi'20] are not consistent. 😕

Unsupervised KolektorSDD and KolektorSDD2

We use only anomaly-free images for unsupervised training. Results for ResNet-18 with k=100:

Model                  Fold 1  Fold 2  Fold 3  Avg (Std)    KolektorSDD2
Student [Bergmann'20]  .904    .883    .902    .896 (.012)  .950 (.005)
PaDiM [Defard'20]      .939    .935    .962    .945 (.015)  .956
Ours                   .953    .951    .976    .960 (.014)  .981

mSTC

Model                         ROC
CAVGA-RU [Venkataramanan'19]  .85
SPADE [Cohen'20]              .899
PaDiM [Defard'20]             .912
Ours                          .921

License

GNU General Public License version 3.0
