
mscred's Introduction

1 It is a TensorFlow implementation of the following paper:
  A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data.

2 How to use:
  First, run generation_signature_matrice.py to generate the signature matrices.
  Second, run convlstm.py to train and test the model.
  Finally, run evalution.py to evaluate the results.

3 The demo code provided by the author: https://github.com/7fantasysz/MSCRED.

mscred's People

Contributors

wxdang


mscred's Issues

Is the data really a multivariate time series?

From what I see in the data of this repository and the author's repository, the data has the following shape:

(number of time series * length of time series)

However, a multivariate time series should have the following shape:

(number of time series * length of time series * number of features)

Have I missed something, or is there a problem with the data?
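To make the two shapes concrete, here is a toy illustration in numpy (all array sizes are invented for this example, not taken from the repository's data):

```python
import numpy as np

# Toy sizes, assumed for illustration only.
# Repo-style data: each row is one univariate channel.
repo_style = np.zeros((30, 20000))           # (number of series, length of series)

# A genuinely multivariate dataset would carry a third feature axis.
multivariate_set = np.zeros((5, 20000, 30))  # (number of series, length, number of features)

print(repo_style.ndim, multivariate_set.ndim)  # 2 3
```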

information lost

Regarding line 44 of generation_signature_matrice.py:

for t in range(win, self.signature_matrices_number):

This does not make sense, since you are only looping over the first 2000 data points. Is the result still correct when so much information is discarded?

gap time between each segment

The gap time between segments is set to 10, i.e. the start timestamps of two adjacent signature matrices should differ by 10, but when each signature matrix is generated, the gap time is effectively 1.

In generation_signature_matrice.py:

    for t in range(win, self.signature_matrices_number):
        raw_data_t = raw_data[:, t - win:t]
        signature_matrices[t] = np.dot(raw_data_t, raw_data_t.T) / win

    return signature_matrices

I think we should account for the gap time when computing raw_data_t, e.g.:

    for t in range(signature_matrices_number):
        raw_data_t = raw_data[:, t * gap_time : t * gap_time + window_size]
        signature_matrices[t] = np.dot(raw_data_t, raw_data_t.T) / window_size
    return signature_matrices
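A self-contained sketch of that suggestion (the function name, the names window_size and gap_time, and the toy data are assumptions for illustration, not the repository's actual variables):

```python
import numpy as np

def signature_matrices_with_gap(raw_data, window_size, gap_time, n_matrices):
    """Sketch: signature matrix t is the inner-product matrix of a window
    that starts gap_time steps after the previous window's start."""
    n_series = raw_data.shape[0]
    mats = np.zeros((n_matrices, n_series, n_series))
    for t in range(n_matrices):
        seg = raw_data[:, t * gap_time : t * gap_time + window_size]
        mats[t] = seg @ seg.T / window_size
    return mats

# Toy data: 2 channels, 100 time steps.
raw = np.arange(2 * 100, dtype=float).reshape(2, 100)
mats = signature_matrices_with_gap(raw, window_size=30, gap_time=10, n_matrices=7)
print(mats.shape)  # (7, 2, 2)
```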

num_anom calculation may be wrong?

I suppose your code forgot to add [0] inside the np.where. "num_anom = len(np.where(error > util.threhold))" will cause every valid_anomaly_score to be 2. Maybe you should change it to num_anom = len(np.where(error > util.threhold)[0]).
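The difference is easy to verify with a hypothetical error vector (values and threshold are made up for this illustration):

```python
import numpy as np

error = np.array([0.1, 0.9, 0.2, 0.8])
threshold = 0.5

# np.where with only a condition returns a tuple of index arrays, one per
# dimension, so len() of it is the number of dimensions (always 1 for a
# 1-D array), not the number of matching elements.
wrong = len(np.where(error > threshold))      # 1  (tuple length)
right = len(np.where(error > threshold)[0])   # 2  (actual anomaly count)
print(wrong, right)  # 1 2
```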

Reconstructing the data thats already in the input?

Hi,

Firstly, very nice paper and I quite like the idea; I'd really like to apply it to other areas. So I'm hoping that you are still monitoring this repo and are open to discussion.

I do have a fairly straightforward question about the way the model is used and trained. In Figure 2 of the paper and in the code, more specifically:

loss = tf.reduce_mean(tf.square(data_input[-1] - deconv_out))

It seems that you are using the tensor at the last time step of the input as the model's output tensor? Maybe I have missed something obvious, but doesn't that imply the input contains complete information about the output, i.e. the model can directly "see" the output in the input? Which means that by "selecting the last tensor in the input" (e.g. by setting the weights for those input images to 1 and the rest to 0), we get a perfect estimator?

So my point is: when reconstructing something, shouldn't the input contain a very lossy, or at least incomplete, version of the output, instead of complete information about what it is supposed to reconstruct?

I'm doing experiments with random walks on my own implementation of the network, and by using the last step of the input as the model's output, I was still able to get very small losses ("reconstructed perfectly"). So I suspect that this is exactly what the model is doing, i.e. selecting one step of the input as the output.

In that case, my guess about why it still works is that by "half training" the model, the trainer was able to adjust the weights for the most common sample patterns, while the learning rate was not high enough to turn the model into a simple "input-selecting model" yet. However, if you had let the training converge, this ability would be lost, since the model would end up "selecting" the output from the inputs.
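A toy demonstration of the concern (shapes and data are invented for this example, not taken from the repo): if the target is the last input step, a degenerate "model" that simply copies that step achieves zero reconstruction loss.

```python
import numpy as np

rng = np.random.default_rng(0)
data_input = rng.standard_normal((5, 30, 30))  # 5 time steps of 30x30 signature matrices

def copy_last_step(x):
    # Degenerate "reconstruction": select the final input step verbatim.
    return x[-1]

recon = copy_last_step(data_input)
loss = np.mean(np.square(data_input[-1] - recon))
print(loss)  # 0.0
```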
