numenta / nab
The Numenta Anomaly Benchmark
License: GNU Affero General Public License v3.0
For the detectors posted to the scoreboard, it would be a nice feature to include a breakdown of which do best/worst on the scoring metrics TP, FP, and FN.
Similarly, we may wish to offer several scoreboards, one for each scoring profile.
The readme specifies NuPIC is only needed if running the Numenta detector, yet run.py always executes "from nab.detectors.numenta.numenta_detector import NumentaDetector", which imports from NuPIC.
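A minimal sketch of one way to address this, assuming a hypothetical helper (getDetectorClass is illustrative, not the actual run.py API): defer the NuPIC-dependent import until the Numenta detector is actually selected, so other detectors can run without NuPIC installed.

def getDetectorClass(name):
    # Hypothetical helper: only touch NuPIC when the Numenta detector is requested.
    if name == "numenta":
        from nab.detectors.numenta.numenta_detector import NumentaDetector
        return NumentaDetector
    raise ValueError("Detector %r has no import path wired up in this sketch" % name)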
Need to have a scoreboard of detector results posted.
We received a new data file with a real anomaly. Get the file "9f78a90223174506a8ad262b0.csv" from Subutai and add to dataset.
This is a subtask of #132
Pull some examples of customer data and add to NAB (needs to be hand labeled as well).
These tests should replicate the last parts of run_tests.sh. They should output everything to temporary directories and check the results. Then they should explicitly pass or fail.
Need to test new installation of the NAB repo, and update readme accordingly.
All FPs are scored against the total, yet only one TP per window is added to the total. Since we estimate that x% of a data file is anomalous (i.e. lies within the anomaly windows) and (1-x)% is normal data, TPs are limited to x% of the data while FPs can occur in the other (1-x)%. Scaling the FP weight by x will therefore level out the contributions that TPs and FPs make to the total score; e.g. if x = 10%, the FP weight should be multiplied by 0.10.
Also add this to the writeup.
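As a tiny worked example of the scaling above (the weight value is hypothetical, not taken from any NAB profile):

x = 0.10                        # assume ~10% of each data file lies in anomaly windows
fpWeight = 1.0                  # hypothetical unscaled FP weight from a scoring profile
scaledFpWeight = x * fpWeight   # -> 0.10, matching the example above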
As a reproducible case:
python run_anomaly.py --inputFile data/artificialWithAnomaly/art_daily_flatmiddle.csv --detector cla
The results file for that command differs between the two platforms. They should be identical.
The F1 score is regarded as the standard metric for evaluating anomaly detection algorithms. It would be nice to have NAB calculate this as well.
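For reference, a minimal sketch of computing F1 from window-level counts; the example counts are made up and this is not NAB code:

def f1Score(tp, fp, fn):
    # Precision and recall from true positives, false positives, false negatives.
    precision = tp / float(tp + fp) if (tp + fp) else 0.0
    recall = tp / float(tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1Score(tp=8, fp=3, fn=2))  # example counts only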
Larger window sizes (>10%) are preferred, with windows centered about the true anomaly. This rewards early detection, even if humans can't see the anomaly visually, and still rewards late detection, because it's better to identify an anomaly late than not at all.
Add instructions to the writeup (flowchart, diagram) on how a new detector can be added to NAB -- the three methods.
The readme should point to this diagram.
Because of anomalies close to the probationary period, we may consider removing "realAWSCloudwatch/iio_us-east-1_i-a2eb1cd9_NetworkIn.csv" from the benchmark corpus in the future. For now (NAB v0.1) we're manually setting the windows so they don't overlap the probationary period.
Need an organized directory for the labels files, including a naming convention for the individual labelers.
Output a warning if there are data files without any corresponding entries in the label file. The code should not crash in this case, nor should it silently ignore those files. After giving the warning, the code should just ignore that data file.
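A minimal sketch of the warn-and-skip behavior; filterLabeledFiles and labelWindows are illustrative names, not the actual CorpusLabel API:

import warnings

def filterLabeledFiles(dataFiles, labelWindows):
    # Keep only data files that have label entries; warn about the rest.
    labeledFiles = []
    for path in dataFiles:
        if path not in labelWindows:
            warnings.warn("No label entry for %s; skipping this data file." % path)
            continue
        labeledFiles.append(path)
    return labeledFiles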
Currently we only have the standard profile, where all metric weights (TP, FP, FN) are set to 1. There should be at least two more -- one that penalizes Type I errors (false positives) more heavily, and one that penalizes Type II errors (false negatives).
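For illustration only, the additional profiles could look something like this; the key names and weight values are assumptions, not finalized profile definitions:

SCORING_PROFILES = {
    "standard":      {"tpWeight": 1.0, "fpWeight": 1.0, "fnWeight": 1.0},
    "reward_low_FP": {"tpWeight": 1.0, "fpWeight": 2.0, "fnWeight": 1.0},  # penalize Type I errors
    "reward_low_FN": {"tpWeight": 1.0, "fpWeight": 1.0, "fnWeight": 2.0},  # penalize Type II errors
}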
The wiki should have the final versions of the NAB writeup and labeling instructions docs.
Need tests for optimizer routines
Users should be able to easily set parameters, particularly which individual detectors and profiles to run, since full runs take a while. They can do this manually in the code, but it would be much better from the command line (with supporting instructions and examples in the readme).
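A sketch of what such a command-line interface might look like, using argparse; the flag names and defaults are hypothetical:

import argparse

parser = argparse.ArgumentParser(description="Run NAB")
parser.add_argument("--detectors", nargs="+", default=["numenta"],
                    help="Which detectors to run, e.g. numenta skyline random")
parser.add_argument("--profiles", nargs="+", default=["standard"],
                    help="Which scoring profiles to evaluate")
args = parser.parse_args()
print(args.detectors, args.profiles)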
For reference we need to include each human created label file in the labels directory. The ground truth file is constructed from these files. These human files should be versioned in the repository, perhaps under labels/human_labels/
Cleaning up the code throughout NAB is needed. This includes, but is not limited to:
Plotting the anomaly detection results is helpful for debugging, and could also be included in a "NAB Data Analysis" section of the repo, along with the data visualizer. The visualizer does produce results plots, but they're not very intuitive (confusing) and not too useful (they don't show the ground truth windows).
Scale/normalize the final scores such that adding files to the corpus will not (necessarily) decrease the scores, i.e. make a detector appear worse. E.g. the current best detector under test (DUT) may have a NAB score of -7, and then, with a new data file added to the corpus, score -8 and appear to be performing worse.
We may also want to modify the scoring profiles such that a perfect TP detection actually yields a 1; currently it is 0.98661.
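One possible normalization, shown only for illustration (the baseline scores below are made up, and this is not necessarily the scheme NAB will adopt): express each raw score relative to a null detector that flags nothing and a perfect detector, so that added data files shift all reference points together.

def normalize(rawScore, nullScore, perfectScore):
    # Map rawScore onto a 0-100 scale anchored by the null and perfect detectors.
    return 100.0 * (rawScore - nullScore) / (perfectScore - nullScore)

print(normalize(rawScore=-7.0, nullScore=-20.0, perfectScore=5.0))  # made-up numbers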
We need a means of both defining anomaly windows and calculating the subsequent relaxed windows such that the relaxed windows do not overlap.
Currently detectors get the entire dataset so that the min/max can be computed. Instead, the min/max should be passed in. We could put the min/max values in the label files themselves, as another field, or have the DataFile compute them when it reads in a data file.
For debugging the labeler, it would be helpful to print out the raw labels that do not qualify for the combined labels (i.e. do not beat the threshold).
Need tests for LabelCombiner class
For the data files for which we know the anomaly cause(s), we should manually label the windows. This manual labeling should also be noted in the writeup.
Should have separate unit tests just for validating the scoring function.
Trying to run NAB, I ran into a problem:
nupic_py27mmm@mmm-U2442:~/nupic/NAB$ python run.py
{'dataDir': 'data',
'detect': False,
'detectors': ['numenta'],
'labelDir': 'labels',
'numCPUs': None,
'optimize': False,
'probationaryPercent': 0.15,
'profilesPath': 'config/user_profiles.yaml',
'resultsDir': 'results',
'score': False,
'thresholdPath': 'config/threshold_config.yaml'}
Proceed? (y/n): y
/home/mmm/nupic/env_py27/local/lib/python2.7/site-packages/pandas/core/frame.py:1771: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
"DataFrame index.", UserWarning)
Traceback (most recent call last):
  File "run.py", line 152, in <module>
    main(args)
  File "run.py", line 65, in main
    runner.initialize()
  File "/home/mmm/nupic/NAB/nab/runner.py", line 90, in initialize
    self.corpusLabel.initialize()
  File "/home/mmm/nupic/NAB/nab/labeler.py", line 124, in initialize
    self.getLabels()
  File "/home/mmm/nupic/NAB/nab/labeler.py", line 158, in getLabels
    labels["label"].values[indices] = 1
IndexError: unsupported iterator index
I've tried pulling #15, which mentions a bugfix to labeling, but that gives me another error (so without a vanilla run working I can't tell what's going on).
Label combiner should handle overlapping relaxed windows. If two relaxed windows overlap, they should be merged together (per discussion with Jeff).
Create [email protected] contact, as well as mailing list.
Anomalies that differ by a few time steps should not be thrown away when combining labels, but then a ground-truth start time must be determined.
Running run.py only uses the Numenta detector, but should include the Skyline and random detectors too.
The results directory in the repo is empty. Once we have finalized scoring, this should be populated with results for the three detectors and three profiles.
If combined labels overlap, merge them into one anomaly. Justification is (i) multiple anomalies don't occur in the same window, and (ii) although anomalies close together in time may appear distinct, they're likely(?) correlated.
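A minimal sketch of merging overlapping windows, assuming each window is a (start, end) pair of timestamps or indices; this is not the actual LabelCombiner code:

def mergeWindows(windows):
    # Merge any window that overlaps or touches the previous one after sorting by start.
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

print(mergeWindows([(0, 10), (8, 15), (30, 40)]))  # -> [(0, 15), (30, 40)]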
We may wish to set Numenta detector parameters specific to scoring profiles.
Modify the aggregate report to add the number of windows that were missed. Currently it outputs false negatives as the total number of records; however, this value is not used in the scoring function -- we only count the number of missed windows.
As of PR #97 checkWindows() in labeler.py deletes a window which overlaps with the probationary period; code for throwing a ValueError is commented out. We should decide between throwing out the overlapping window, throwing out the datafile, or some other means of solving the issue.
Details should be added to the NAB writeup.
Label combiner should not allow relaxed windows to fall within the probationary period. If a window creeps into the probationary period, it should get truncated so that no part of the window overlaps the probationary period; see the sketch below.
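A minimal sketch of that truncation, assuming windows are (start, end) pairs and probationEnd marks the end of the probationary period; the names are illustrative:

def clipToProbation(windows, probationEnd):
    # Drop windows wholly inside the probationary period; clip any that start before its end.
    clipped = []
    for start, end in windows:
        if end <= probationEnd:
            continue
        clipped.append((max(start, probationEnd), end))
    return clipped

print(clipToProbation([(5, 20), (1, 8)], probationEnd=10))  # -> [(10, 20)]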
Should be using simplejson rather than json for faster loading/dumping. It's not significant now with such small JSON files, but it can only help as they grow -- e.g. with more user profiles.
After running NAB you get a _scores.csv file with some useful information. This file should contain most (if not all) of the information needed to manually calculate the score. For example, the fn column currently contains the total number of records; instead it should contain the number of anomaly windows that were missed, because that is what the score is based on.
Test detector performance with parameter optimization -- resolution, averaging window for anomaly likelihood, etc.
Includes #93
Add feature to run.py to report statistics for each datafile as well as aggregates: number of records, number of anomalies, min, max, etc.
Need to be consistent in using either simplejson or json throughout the nab files. The convention is a try/except clause, as sketched below.
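That convention is simply:

try:
    import simplejson as json   # prefer simplejson when available
except ImportError:
    import json                 # fall back to the standard library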
Need tests for CorpusLabel class
Implement a versioning system for NAB. This will regulate NAB updates including additions to the benchmark dataset, the scoreboard, and any code changes.
The Numenta detector picks up two FPs in "artificialWithAnomaly/art_increase_spike_density.csv" that are clearly not anomalous. This could be a good file for debugging the detector.
Need tests for Corpus class