marknzed / artimenab.jl

ARTime detector for the Numenta Anomaly Benchmark

License: GNU Affero General Public License v3.0

Julia 100.00%

artimenab.jl's Issues

Non NAB Version

Hello, is there a version of ARTime that can be used outside of NAB? I am looking for an online structural break detection algorithm and do not necessarily want to use NAB.

Thanks

[Q] processing larger batch - initial phase

Hi @markNZed

Firstly, congratulations on taking 1st place on the NAB scoreboard, that is quite an achievement 🎉 and great contribution.

It is quite amazing that Julia code can run directly in Python, a fine testament to a community effort.

I have a few questions that perhaps you could answer for me.

The implementation here, or more specifically in https://github.com/markNZed/NAB/tree/ARTimeNAB, is aimed at running through the dataset iteratively (as per NAB) to score each data point. Is it possible to process the dataset in large batches? For example, could one process 90% of the data in one shot for training/learning, without being concerned with anomaly scores (p) in this phase, and then iterate over the last 10% of the dataset as per the ARTimeNAB method to determine anomaly scores (p)?
If so, would that be quicker than the iteration method?

I ran a test passing a list of values to jl.ARTime rather than a single value. It returned an object with all the expected data, just as if it had been given one value. I then iterated over the final part of the data and did not get the expected result (an anomaly that is present with the iterative method). So the method I tried does not work, and I am wondering whether there is a way to do it that will.
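
To make the question concrete, here is a minimal sketch of the "warm up on 90%, score the last 10%" pattern. It does not call ARTime: RollingZScoreDetector, process_sample and warm_fraction are hypothetical names, and the rolling z-score is only a stand-in for a detector that updates its state one sample at a time. Whether ARTime accepts a batch in this way is not established in this thread.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScoreDetector:
    """Hypothetical stand-in for an online detector; NOT ARTime."""

    def __init__(self, window=50):
        self.history = deque(maxlen=window)

    def process_sample(self, value):
        """Score one value against recent history, then learn from it."""
        if len(self.history) < 2:
            score = 0.0
        else:
            mu = mean(self.history)
            sigma = pstdev(self.history) or 1e-9  # avoid division by zero
            score = abs(value - mu) / sigma
        self.history.append(value)
        return score


def warm_up_then_score(values, detector, warm_fraction=0.9):
    """Feed the first part sample by sample, discarding scores; score the rest."""
    split = int(len(values) * warm_fraction)
    for v in values[:split]:
        detector.process_sample(v)  # state is updated, score is ignored
    return [detector.process_sample(v) for v in values[split:]]


series = [float(i % 5) for i in range(100)]
series[95] = 50.0  # inject an obvious spike into the scored tail
scores = warm_up_then_score(series, RollingZScoreDetector())
```

In this sketch the warm-up phase still does the same per-sample work as the scoring phase, so it saves no time by itself; a speed-up would need a detector that can update its state without computing a score.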

Cannot find reference 'Main' in 'init.py'

Hi @markNZed

When I tried to merge ARTime into the NAB library, I found that it would not work and produced some error messages.
The output from the IDE is:

The error in the code is reported as:

Does the above error report have anything to do with the Python version, or am I missing something?
After I added some printing to the above code, the output of the IDE was:

Threshold of anomaly

Hi, I have a quick question about how to decide the threshold for the predicted anomaly. I see there is always a pre-defined threshold in thresholds.json for NAB. How do you decide the threshold in ARTime? Do you find it with the ROC curve or something? What do you think is a good threshold strategy (e.g., a dynamic threshold) for online settings?
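
As an illustration of what a dynamic threshold could look like in an online setting, the sketch below flags a score when it exceeds a rolling quantile of the scores seen before it. This is one generic strategy, not how ARTime or NAB choose thresholds; dynamic_threshold_flags and its parameters (window, quantile, min_history) are assumptions made for the example.

```python
from collections import deque


def dynamic_threshold_flags(scores, window=100, quantile=0.99, min_history=20):
    """Flag scores that exceed a rolling quantile of earlier scores.

    All parameter names and defaults are illustrative assumptions.
    """
    recent = deque(maxlen=window)
    flags = []
    for s in scores:
        if len(recent) >= min_history:
            ordered = sorted(recent)
            # Nearest-rank style quantile over the scores currently in the window.
            idx = min(len(ordered) - 1, int(quantile * len(ordered)))
            flags.append(s > ordered[idx])
        else:
            flags.append(False)  # not enough history to set a threshold yet
        recent.append(s)
    return flags


scores = [0.1, 0.2, 0.15, 0.12, 0.18] * 10 + [0.95] + [0.1] * 5
flags = dynamic_threshold_flags(scores)
```

The threshold here adapts to the recent score distribution without labels, at the cost of two extra tuning parameters (window length and quantile).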

Unbounded memory growth on stream data

Hi @markNZed
Congrats on taking the top spot in NAB! Your contribution pointed my interest towards this entirely new (to me) area of neuroscience, for which you have my sincerest thanks.

I took ARTime for a spin, and wanted to see how it fares in a streaming scenario (i.e. ~infinite series), but I noticed something which slightly worries me:
It seems that the internal DVFA structures are growing without any limits.

Please take a look at this snippet:

julia> using ARTime, Random, Distributions
julia> Random.seed!(123)

julia> p = ARTime.P(); ARTime.init(-2,2,210000,p)
julia> size(p.cs.art.W)[1] * size(p.cs.art.W)[2]  + size(p.cs.art.M)[1] + size(p.cs.art.Me)[1]
0 #Size before any processing

# Let's say we have a slightly noisy sine wave
julia> for x in range(0, 200π, length=10000) 
    y = sin(x) + 0.1 * randn() 
    ARTime.process_sample!(y, p)
end


julia> size(p.cs.art.W)[1] * size(p.cs.art.W)[2]  + size(p.cs.art.M)[1] + size(p.cs.art.Me)[1]
2788  # Internal struct size after 10K points

# Now there's a longer period where the noise is more pronounced
julia> for x in range(0, 2000π, length=100000) 
    y = sin(x) + 0.2 * randn()
    ARTime.process_sample!(y, p)
end 

julia> size(p.cs.art.W)[1] * size(p.cs.art.W)[2]  + size(p.cs.art.M)[1] + size(p.cs.art.Me)[1]
27336 # Internal struct size after 110K points

# Noise levels are down, but the frequency of the sine wave has changed
julia> for x in range(0, 200π, length=100000) 
    y = sin(x) + 0.1 * randn() 
    ARTime.process_sample!(y, p)
end 

julia> size(p.cs.art.W)[1] * size(p.cs.art.W)[2]  + size(p.cs.art.M)[1] + size(p.cs.art.Me)[1]
237422 # Internal struct size after 210K points

julia> p.cs.art.n_categories
6983

julia> p.cs.art.n_clusters
739

(I'm aware that there are more internal state variables than W, M, Me, but they grow at similar pace so I omitted them here)
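
The growth rate per phase can be read off the three size figures reported in the session. The short calculation below uses only those numbers; "size" means the same sum of W, M and Me dimensions computed in the snippet, and the function name is just a label for this example.

```python
# (samples processed so far, reported size) taken from the session output
checkpoints = [(0, 0), (10_000, 2788), (110_000, 27336), (210_000, 237422)]


def growth_per_sample(points):
    """Return the added size per processed sample for each phase."""
    rates = []
    for (n0, s0), (n1, s1) in zip(points, points[1:]):
        rates.append((s1 - s0) / (n1 - n0))
    return rates


rates = growth_per_sample(checkpoints)
for phase, rate in enumerate(rates, start=1):
    print(f"phase {phase}: {rate:.4f} size units per sample")
```

The first two phases add roughly 0.28 and 0.25 units per sample, while the third phase, after the frequency change, adds about 2.1 units per sample. None of the phases shows the size levelling off.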

I know that this example is a bit nasty, but this is just to illustrate something that I also see on my real data i.e., that with enough time, the ARTime process will eventually run out of memory and crash (which is not the case for e.g., HTM). It seems that the DVFA never ceases to create new clusters and categories.

Is this an intentional behavior (or maybe some sort of optimization for NAB)?

Is there any way to limit the memory usage (or e.g., somehow compact the current state) without forgetting catastrophically (i.e. full state reset)?

I would like the algorithm to keep adapting to the stream (rather than use a learned state), but it seems to have an infinite appetite for memory.
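
For readers wondering what "limiting memory without a full reset" might look like in general, here is a toy sketch of a prototype store with a hard cap that evicts the least recently matched category. It is not DVFA and not part of ARTime; BoundedCategoryStore, max_categories and match_radius are hypothetical, and whether such eviction would preserve ARTime's detection quality is an open question.

```python
from collections import OrderedDict


class BoundedCategoryStore:
    """Keep at most max_categories prototypes, evicting the least recently matched."""

    def __init__(self, max_categories, match_radius):
        self.max_categories = max_categories
        self.match_radius = match_radius
        self.prototypes = OrderedDict()  # category id -> prototype value
        self._next_id = 0

    def observe(self, value):
        """Return the id of the matching category, creating one if needed."""
        for cid, proto in self.prototypes.items():
            if abs(value - proto) <= self.match_radius:
                self.prototypes.move_to_end(cid)  # mark as recently used
                return cid
        cid = self._next_id
        self._next_id += 1
        self.prototypes[cid] = value
        if len(self.prototypes) > self.max_categories:
            self.prototypes.popitem(last=False)  # drop the stalest category
        return cid


store = BoundedCategoryStore(max_categories=3, match_radius=0.5)
ids = [store.observe(v) for v in [0.0, 10.0, 20.0, 0.1, 30.0]]
```

The trade-off is explicit: memory stays bounded, but an evicted category is forgotten, so a pattern that returns after a long absence would be treated as new again.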

Can I understand your method as an online learning method which updates the model step-by-step over time?

I'm really impressed by this method and am going to use it as a comparison method in my experiment.

Can I interpret it as an advanced version of the Hierarchical Temporal Memory method? Also, is this method an unsupervised and online learning method?

Julia error when running ARTime in NAB

Sorry to bother you. I got an error after running the following command:

python run.py -d ARTime --detect --optimize --score --normalize --skipConfirmation

ERROR: LoadError: setfield!: const field .name of type TypeName cannot be changed
Stacktrace:
 [1] setproperty!(x::Core.TypeName, f::Symbol, v::Symbol)
   @ Base ./Base.jl:39
 [2] top-level scope
   @ ~/.julia/packages/RedefStructs/JMYNd/src/RedefStructs.jl:138
 [3] include
   @ ./Base.jl:419 [inlined]
 [4] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
   @ Base ./loading.jl:1554
 [5] top-level scope
   @ stdin:1

Allocation problem stops running.

Hi, I really like your method and want to use it as a comparison method. I'm currently running ARTime under NAB. I think I have set everything up correctly, and I tried to run the following command:

python run.py -d ARTime --detect --optimize --score --normalize --windowsFile labels/combined_windows_new.json

However, I encounter the following error. I have spent a lot of time on it and do not know how to fix it. I hope you can provide some guidance.

Running detection step
0: Beginning detection with ARTime for realAWSCloudwatch/ec2_cpu_utilization_77c1ca.csv
2: Beginning detection with ARTime for realAWSCloudwatch/ec2_network_in_5abac7.csv
1: Beginning detection with ARTime for realAWSCloudwatch/ec2_disk_write_bytes_1ef3de.csv

signal (11): Segmentation fault
in expression starting at none:0
Allocations: 2494562 (Pool: 2493386; Big: 1176); GC: 2
Segmentation fault (core dumped)
