danieltan07 / dagmm

My attempt at reproducing the paper Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection

Python 9.82% Jupyter Notebook 90.18%

dagmm's Issues

Can anyone run this code?

Hello. Thanks for your great work.

I am using Google Colab, and the following error occurs in my notebook.

`---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in ()
----> 1 solver = main(hyperparams(defaults))

5 frames
/content/drive/MyDrive/dagmm/main.py in main(config)
23
24 if config.mode == 'train':
---> 25 solver.train()
26 elif config.mode == 'test':
27 solver.test()

/content/drive/MyDrive/dagmm/solver.py in train(self)
95 input_data = self.to_var(input_data)
96
---> 97 total_loss,sample_energy, recon_error, cov_diag = self.dagmm_step(input_data)
98 # Logging
99 loss = {}

/content/drive/MyDrive/dagmm/solver.py in dagmm_step(self, input_data)
162 enc, dec, z, gamma = self.dagmm(input_data)
163
--> 164 total_loss, sample_energy, recon_error, cov_diag = self.dagmm.loss_function(input_data, dec, z, gamma, self.lambda_energy, self.lambda_cov_diag)
165
166 self.reset_grad()

/content/drive/MyDrive/dagmm/model.py in loss_function(self, x, x_hat, z, gamma, lambda_energy, lambda_cov_diag)
166
167 phi, mu, cov = self.compute_gmm_params(z, gamma)
--> 168 sample_energy, cov_diag = self.compute_energy(z, phi, mu, cov)
169
170 loss = recon_error + lambda_energy * sample_energy + lambda_cov_diag * cov_diag

/content/drive/MyDrive/dagmm/model.py in compute_energy(self, z, phi, mu, cov, size_average)
133 cov_inverse.append(torch.inverse(cov_k).unsqueeze(0))
134
--> 135 #det_cov.append(np.linalg.det(cov_k.data.cpu().numpy()* (2*np.pi)))
136 det_cov.append((Cholesky.apply(cov_k.cpu() * (2*np.pi)).diag().prod()).unsqueeze(0))
137 cov_diag = cov_diag + torch.sum(1 / cov_k.diag())

/content/drive/MyDrive/dagmm/model.py in forward(ctx, a)
10 class Cholesky(torch.autograd.Function):
11 def forward(ctx, a):
---> 12 l = torch.cholesky(a, False)
13 ctx.save_for_backward(l)
14 return l

AttributeError: module 'torch' has no attribute 'potrf'`

After searching for this error, I learned that torch.potrf was removed and replaced with torch.cholesky.

pytorch/pytorch#50379

But after I changed torch.potrf to torch.cholesky, the same error still occurs.

Can anyone run this code? Please help.
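
One possible workaround, offered as a sketch rather than a confirmed fix for this repository: recent PyTorch releases provide `torch.linalg.cholesky`, which supports autograd directly, so a custom `torch.autograd.Function` wrapper is not needed to get a differentiable determinant. The helper name `det_two_pi_cov` is hypothetical. Note also that Python does not re-import an edited `model.py` until the Colab runtime is restarted or the module is reloaded, which may be why the traceback still mentions `potrf`.

```python
import math

import torch


def det_two_pi_cov(cov_k: torch.Tensor) -> torch.Tensor:
    """Return det(2 * pi * cov_k) for a symmetric positive-definite matrix."""
    # Lower-triangular factor L with L @ L.T == 2 * pi * cov_k.
    chol = torch.linalg.cholesky(cov_k * (2 * math.pi))
    # det(L @ L.T) = det(L) ** 2, and det(L) is the product of its diagonal.
    return chol.diagonal().prod() ** 2


cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]], requires_grad=True)
det = det_two_pi_cov(cov)
det.backward()  # gradients flow without a hand-written backward()
```

The product of the Cholesky diagonal alone is the square root of the determinant, so whether to square it depends on how the surrounding code uses the value.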

Reconstruction Loss issue

Hi Daniel,
I executed the given code. However, the reconstruction loss of the AE does not decrease after several iterations; instead, it keeps increasing in some epochs. Could you please explain this behavior?

Model without training works well?

Out of curiosity, I ran test mode directly on the KDD data.
I did not load any pretrained model; I only set the config to test mode and varied the threshold.
It turns out that the F1 score is quite high even without training.
Does anyone know why?

Threshold for percentile (100 - 10): 14.92182731628418
Accuracy : 0.8065, Precision : 0.9339, Recall : 0.4433, F-score : 0.6013
Threshold for percentile (100 - 12): 11.708395004272461
Accuracy : 0.8364, Precision : 0.9388, Recall : 0.5378, F-score : 0.6838
Threshold for percentile (100 - 14): 8.14496498107912
Accuracy : 0.8659, Precision : 0.9419, Recall : 0.6313, F-score : 0.7559
Threshold for percentile (100 - 16): 3.5739990234374943
Accuracy : 0.8905, Precision : 0.9397, Recall : 0.7130, F-score : 0.8108
Threshold for percentile (100 - 18): 1.8419023990631087
Accuracy : 0.9084, Precision : 0.9210, Recall : 0.7894, F-score : 0.8501
Threshold for percentile (100 - 20): 1.555919885635376
Accuracy : 0.9339, Precision : 0.9169, Recall : 0.8786, F-score : 0.8974
Best Threshold : 1.555919885635376, with Accuracy : 0.9339, Precision : 0.9169, Recall : 0.8786, F-score : 0.8974
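
For readers following the numbers above, the threshold scan can be sketched as below. This is a minimal pure-Python illustration, not the repository's evaluation code: the function names are hypothetical, and it assumes that a sample is flagged as an anomaly when its energy is strictly greater than the percentile threshold.

```python
def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def evaluate(energies, labels, anomaly_percent):
    """Threshold at percentile (100 - anomaly_percent) and score the result."""
    thresh = percentile(energies, 100 - anomaly_percent)
    pred = [1 if e > thresh else 0 for e in energies]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    accuracy = sum(1 for p, y in zip(pred, labels) if p == y) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return thresh, accuracy, precision, recall, f_score


# Toy data: eight low-energy normal samples and two high-energy anomalies.
energies = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 5.0, 6.0]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
thresh, accuracy, precision, recall, f_score = evaluate(energies, labels, 20)
```

Choosing the "best" threshold by F-score uses the test labels, so scores obtained that way are optimistic compared with a threshold fixed in advance.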

nan values during training

Hello,

When I run your training code on my own data, I get NaN values for total_loss and sample_energy.

Any suggestions?

Why are samples with high energy predicted as anomalies?

Dear author,
I have recently been reading the DAGMM paper and reproducing it with your code. One point confuses me: why are samples with high energy predicted as anomalies? As far as I understand, the energy represents the likelihood of a sample, that is, how well the model explains it. If the training set consists of normal samples, test samples with high energy/likelihood should therefore be classified as normal. Is that correct?
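
A small numeric illustration may help with this question. In the DAGMM paper the sample energy is defined as the negative log of the Gaussian mixture density, so energy and likelihood move in opposite directions. The sketch below uses a one-dimensional mixture with made-up parameters; the function names are hypothetical and this is not the repository's `compute_energy`.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def sample_energy(x, weights, means, variances):
    """Energy = negative log of the mixture density at x."""
    density = sum(
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )
    return -math.log(density)


# Made-up mixture: two equally weighted components centred at 0 and 5.
weights, means, variances = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]

near = sample_energy(0.1, weights, means, variances)   # close to a component
far = sample_energy(20.0, weights, means, variances)   # far from both
```

The point far from both components has low likelihood and therefore high energy, which is why high-energy samples are the ones flagged as anomalies.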

backward not implemented error

When I run the code, I get the following error:

File "/usr/local/lib/python3.5/dist-packages/torch/autograd/function.py", line 180, in backward
raise NotImplementedError
NotImplementedError

Problem with Shuffling

Isn't it kind of cheating that the samples are being randomly shuffled for testing?
It is possible that samples already seen during training also end up in the test set.
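
Whether shuffling leaks training samples into the test set depends on how the split is made. As a sketch under the assumption that both subsets are sliced from a single permutation (the function name `disjoint_split` is hypothetical, and this is not a statement about what data_loader.py does), a random split can still be disjoint:

```python
import random


def disjoint_split(n_samples, train_fraction=0.5, seed=0):
    """Shuffle indices once, then cut the permutation into two parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    # Both parts come from the same permutation, so they cannot overlap.
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = disjoint_split(1000, train_fraction=0.5, seed=42)
```

Overlap only arises if training and test indices are drawn from independent shuffles or sampled with replacement.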

use csv file instead of npz

How can we use a CSV file instead of an npz file?
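
One way to approach this, sketched with NumPy only, is to convert the CSV into an npz archive once and keep the rest of the pipeline unchanged. The assumptions here are loud ones: the CSV is purely numeric with one header row, the label is the last column, and the array is stored under the key "kdd". Check the key and column layout your loader actually expects; `csv_to_npz` is a hypothetical helper.

```python
import os
import tempfile

import numpy as np


def csv_to_npz(csv_path, npz_path, key="kdd", skip_header=1):
    """Read a numeric CSV and store it as a single array in an npz archive."""
    data = np.genfromtxt(csv_path, delimiter=",", skip_header=skip_header)
    np.savez_compressed(npz_path, **{key: data})
    return data.shape


# Usage with a tiny throwaway CSV (two rows, two features plus a label).
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    with open(csv_path, "w") as handle:
        handle.write("f1,f2,label\n0.1,0.2,0\n0.3,0.4,1\n")
    npz_path = os.path.join(tmp, "data.npz")
    shape = csv_to_npz(csv_path, npz_path)
    with np.load(npz_path) as archive:
        loaded = archive["kdd"]
```

Categorical columns would need to be encoded as numbers before this step, since `np.genfromtxt` turns non-numeric fields into NaN.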

Hello!

dagmm/solver.py

Line 177 in 6912116

self.data_loader.dataset.mode="train"

Could you explain why this should be "train"? I would have expected "test".

Unable to reproduce reported performance

Hello,

Thanks for the good work on this paper reproduction. I tried running it a few times to get a better understanding of DAGMM. However, I can't manage to reproduce the performance reported in the readme/paper.

My methodology is as follows:

Run in train mode with no pretrained argument.
Switch the "mode" argument to test and the "pretrained_model" argument to "200_194".

The run outputs are in this range (minor variations, but nowhere close to 95%+) :
Threshold : -0.8012019991874695
Accuracy : 0.6735, Precision : 0.5066, Recall : 0.2911, F-score : 0.3698

I ran it 5 times (train + test) with no modification of the code, using PyTorch 0.4 and Python 3.6. On the last run I tried deleting the "models" folder before running.

If anyone knows how to tackle this issue, it would be of great help! Thanks :)

About max_val

For stability, you compute max_val = torch.max((exp_term_tmp).clamp(min=0), dim=1, keepdim=True)[0], which I read as taking the max value for the first sample only (max_val would have size 1 because of the [0] indexing). I think the max value should be computed for each sample separately, without the [0], so that max_val has size N*1. Is that right?
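
A quick shape check may help here. When `torch.max` is called with a `dim` argument it returns a pair of tensors (values, indices), so indexing the result with `[0]` selects the values tensor rather than the first sample. The tensor below is made-up illustration data with N=2 samples and K=3 components.

```python
import torch

# Made-up exponent terms: N=2 samples (rows), K=3 mixture components (columns).
exp_term_tmp = torch.tensor([[-3.0, -1.0, -2.0],
                             [-0.5, -4.0, -6.0]])

result = torch.max(exp_term_tmp.clamp(min=0), dim=1, keepdim=True)

max_val = result[0]   # same tensor as result.values: one maximum per row
arg_max = result[1]   # same tensor as result.indices: position of each maximum

print(max_val.shape)  # torch.Size([2, 1])
```

So `max_val` already has one entry per sample (shape N x 1) with the `[0]` in place.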

Requesting an open-source license (e.g. MIT)

Thanks for the great work. Have you considered adding a license to your code so that people can reuse, modify, and distribute it?

When a repository has no license, the default on GitHub is that no one has the right to reuse, modify, or redistribute the code (https://choosealicense.com/no-permission/) unless you choose an open-source license such as MIT.

Selecting random data for training and testing: is that suitable?

I am not sure about that. Won't it lead to overfitting?

Why the loss value does not decrease

I ran the program without any changes, and the training loss has stayed at roughly the same value since the first epoch. Why?

401.4998250745006
401.7638197436775
401.7303499831367
401.77844033782014
401.32889832172197
401.61040724921475
401.97929602554166
401.8664882699239
401.83383383210173
401.61515635067656
401.429258366221
401.7632902479663
401.8252330662049
401.62056960273037
401.67191652907536
401.90117543014054
401.84600122196156
401.89923410317334
401.4744226514679
402.3445039729482

Training loss does not converge

Dear danieltan07,
This is vujadeyoon.

Unfortunately, the updated code may have some issues: the training loss oscillates in this version.

When I ran the code as it was before the July 2018 update, the training loss converged to a small value, and the accuracy, precision, recall, and F-score were all reasonable.

With the updated version, however, it is difficult to trust any of the scores, because the training loss does not converge. I think the training loss should converge to a very small value.

I recommend reverting the code to its original state. I have checked that the earlier version performs stably.

If I have misunderstood the intent of your code, please feel free to correct me.

Thank you very much for sharing your valuable DAGMM code.

Best regards,

Vujadeyoon

negative probability

Hi, I am trying to use the model for another task, and the energy turned out to be negative, apparently because of a negative likelihood. I think this is unlikely, but it also happens with the original code on the KDD dataset.
Can you please tell me why?
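
One thing worth checking, shown here as a pure-Python sketch with made-up numbers: a negative energy does not require a negative likelihood. A probability density (unlike a probability) can exceed 1 when the variance is small, and the negative log of a value above 1 is negative.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


# A narrow Gaussian (standard deviation 0.1) evaluated at its own mean.
density = gaussian_pdf(0.0, mean=0.0, var=0.01)

# Energy defined as the negative log-density.
energy = -math.log(density)

print(density > 1.0, energy < 0.0)  # True True
```

Negative energies are therefore expected whenever the fitted covariances are tight around the data, and only the ranking of energies matters for thresholding.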

Good results even with random initialization

I consistently get great results at test time even when DAGMM is untrained. The results look similar to this:

Accuracy : 0.9637, Precision : 0.9573, Recall : 0.9312, F-score : 0.9441

The only change I made was to replace the Cholesky method with a direct torch.det call for the det_cov calculation. The rest is the same as this repository.
To test a randomly initialized DAGMM, I commented out the code that loads the pretrained model and removed all pretrained models, so I don't think anything was loaded by accident. The 4 mus and covs are equal to each other, so I believe no training was done.

Do you have an idea where the issue might be?

This seems to contradict many issues posted here, so I am quite baffled.

run the code without GPU devices?

Hi,
Is it possible to run the Jupyter notebook example without a GPU device?
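
In general PyTorch code can run on a CPU if tensors and modules are placed on a device chosen at run time instead of calling `.cuda()` unconditionally. The sketch below uses a stand-in `torch.nn.Linear` model rather than the DAGMM class, and an in-memory buffer instead of a checkpoint file; how much of the notebook needs changing is not something this example establishes.

```python
import io

import torch

# Fall back to the CPU when no GPU is visible.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model; any torch.nn.Module is moved the same way.
model = torch.nn.Linear(4, 2).to(device)
batch = torch.randn(3, 4, device=device)
output = model(batch)

# Checkpoints saved on a GPU machine can be remapped while loading.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
state = torch.load(buffer, map_location=device)
model.load_state_dict(state)
```

The `map_location` argument is the part that matters when a pretrained model was saved from a CUDA device.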

Thanks

why only attack data during training

self.train = attack_data[randIdx[:N_train]]
self.train_labels = attack_labels[randIdx[:N_train]]

The training data is generated in data_loader.py, lines 38 and 39. But why is only attack data used, and not normal data?

Change the AE model every time the dataset changes?

After reading the paper and running the experiments, I wonder whether I need to change the autoencoder architecture every time the dataset changes. If so, isn't the manual effort too large?

data_loader KDD99Loader inheritance problem

According to the PyTorch tutorial, the KDD99Loader class in data_loader.py should inherit from the "Dataset" class, but the author inherits from "object" and it still works.
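
This behaviour can be reproduced with a toy class. For map-style data, `DataLoader` only relies on `__len__` and `__getitem__`, so a plain class works through duck typing. `PlainLoader` below is a hypothetical stand-in, not the repository's KDD99Loader.

```python
from torch.utils.data import DataLoader


class PlainLoader(object):
    """Minimal map-style dataset that does not inherit from Dataset."""

    def __init__(self, n_items):
        self.items = [float(i) for i in range(n_items)]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]


# Five items with batch_size=2 give batches of sizes 2, 2 and 1.
batches = list(DataLoader(PlainLoader(5), batch_size=2, shuffle=False))
```

Inheriting from `torch.utils.data.Dataset` is still the documented convention and makes the intent clearer to readers and type checkers.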

Large standard deviation when reproducing experiment results

Has anyone tried this model on benchmark datasets like Arrhythmia or Thyroid ?

I used ten different seeds [000, 111, 222, ..., 999] and evaluated the performance of DAGMM (the autoencoder structure, learning rate, and batch size are exactly the same across runs). Below are the AUC and precision results on Thyroid:

AUC: 0.5562 0.5546 0.9403 0.9439 0.5592 0.6733 0.9156 0.7703 0.6353 0.8264
Precision: 0.0968 0.0108 0.6129 0.4301 0.0538 0.3226 0.4731 0.2366 0.1505 0.2366

Three of the precision values are clearly close to, or even better than, the one reported in the original paper. However, the standard deviation over 10 independent trials is quite large...
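
To put a number on the spread, the ten values listed above can be summarized with the standard library. This only restates the reported figures; it uses the sample standard deviation (n - 1 in the denominator), which is an assumption about the convention intended.

```python
import statistics

# Values copied from the ten Thyroid runs reported above.
auc = [0.5562, 0.5546, 0.9403, 0.9439, 0.5592,
       0.6733, 0.9156, 0.7703, 0.6353, 0.8264]
precision = [0.0968, 0.0108, 0.6129, 0.4301, 0.0538,
             0.3226, 0.4731, 0.2366, 0.1505, 0.2366]

for name, values in (("AUC", auc), ("Precision", precision)):
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    print(f"{name}: mean={mean:.4f}, std={std:.4f}")
```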

I'm not sure whether there is something wrong with my experiment code or whether the model is inherently unstable.

Has anyone else observed such a large standard deviation?

Thanks :-)