Comments (8)
This is likely the same issue as #374
Would you be able to drop in the cut.rs from the current PR (just changing that one file should not affect anything else)? Or is this arising in the current PR itself? The current PR performs the check against the length.
from random-cut-forest-by-aws.
Also, one of the main motivations for the new PR was to avoid panics and instead make it possible to propagate errors (there is still a way to go in terms of messages and error details -- hopefully those can be handled iteratively).
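To illustrate the general idea of propagating errors instead of panicking, here is a minimal sketch. It is not the code from the PR: `CutError`, `choose_cut`, and `cut_for_box` are hypothetical names, and the assumption is that a cut is picked as a fraction `factor` of the range in one dimension.

```rust
use std::fmt;

/// Hypothetical error type for illustration; the actual crate may use a different one.
#[derive(Debug, PartialEq)]
enum CutError {
    /// The requested dimension does not exist in the bounding box.
    DimensionOutOfRange { dimension: usize, len: usize },
    /// The range in this dimension is empty (or NaN), so no cut can separate it.
    EmptyRange { dimension: usize },
}

impl fmt::Display for CutError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CutError::DimensionOutOfRange { dimension, len } => {
                write!(f, "dimension {} out of range for length {}", dimension, len)
            }
            CutError::EmptyRange { dimension } => {
                write!(f, "no separating cut exists in dimension {}", dimension)
            }
        }
    }
}

impl std::error::Error for CutError {}

/// Returns a cut value with min <= cut < max, or an error instead of a panic.
/// `factor` is assumed to lie in [0, 1).
fn choose_cut(min: f32, max: f32, dimension: usize, factor: f32) -> Result<f32, CutError> {
    // `!(min < max)` also rejects NaN bounds.
    if !(min < max) {
        return Err(CutError::EmptyRange { dimension });
    }
    let mut cut = min + factor * (max - min);
    // Guard against rounding (or a bad factor) pushing the cut outside [min, max).
    if !(cut >= min && cut < max) {
        cut = min;
    }
    Ok(cut)
}

/// Caller that checks the index against the slice lengths and forwards errors with `?`.
fn cut_for_box(mins: &[f32], maxs: &[f32], dimension: usize, factor: f32) -> Result<f32, CutError> {
    let len = mins.len().min(maxs.len());
    if dimension >= len {
        return Err(CutError::DimensionOutOfRange { dimension, len });
    }
    let cut = choose_cut(mins[dimension], maxs[dimension], dimension, factor)?;
    Ok(cut)
}
```

With this shape, a caller higher up can decide whether to log, retry, or abort, rather than having the whole thread go down on an explicit panic.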
Amusingly, I just converted the logic from the Java version. I'll try your update.
The RCF v4 update is interesting; I'm trying to understand everything you are saying. We currently struggle with a 10 GB heap running many multidimensional TRCFs. The way I read that is that we can have a single forest handling many?
"we can have a single forest handling many" -> yes, at a price. If the time series are very different, then precision/recall (measured via implanted anomalies) will drop because the model can get confounded. This is ameliorated slightly by late detection (the relative_index field) -- but again, mileage may vary.
I get a new error now. Could there be some dependent code elsewhere?
deleting wrong node; looking for 4511 found 5311
want 1.1760913 found 2.0043213
want 0 found 0
want 0 found 0
want 0 found 0
want 0.90309 found 0.69897
want 0 found 0
want 0 found 0
want 0 found 0
want 1.2787536 found 1.1760913
want 0 found 0
want 0 found 0
want 0 found 0
want 1.0791812 found 0.7781513
want 0 found 0
want 0 found 0
want 0 found 0
want 1.7160033 found 1.8129133
want 0 found 0
want 0 found 0
want 0 found 0
want 0.47712123 found 0.60206
want 0 found 0
want 0 found 0
want 0 found 0
want 1.6627579 found 1.6232493
want 0 found 0
want 0 found 0
want 0 found 0
want 0.845098 found 0.845098
want 0 found 0
want 0 found 0
want 0 found 0
thread panicked at 'explicit panic', deps/rcf/Rust/src/samplerplustree/randomcuttree.rs:222:17
What is happening is that the code is looking for (want) a specific value to delete and finding something else in that position; that explains "deleting wrong node; looking for 4511 found 5311". This would happen if the trees are corrupted (that is, they violate the assumption that every point on the left is less than or equal to the cut value in the splitting coordinate, and every point on the right is greater). If the cut subroutine produces a cut that does not separate the bounding box into two parts (when called from the add subroutine), then the tree could get corrupted. Are you seeing this error with forests restored from checkpoints/serialization, or with new ones?
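A rough sketch of the two conditions described above, under stated assumptions: `BoundingBox`, `cut_separates`, and `partition_is_valid` are hypothetical names for illustration and are not the structures used in randomcuttree.rs. The sketch assumes points with coordinate <= cut go left and the rest go right, so a cut separates the box only if min <= cut < max in the chosen dimension.

```rust
/// Axis-aligned bounding box; a hypothetical layout for illustration only.
struct BoundingBox {
    min: Vec<f32>,
    max: Vec<f32>,
}

/// True if the cut splits the box into two non-empty parts in `dimension`,
/// assuming "value <= cut" goes left and "value > cut" goes right.
fn cut_separates(bbox: &BoundingBox, dimension: usize, cut: f32) -> bool {
    match (bbox.min.get(dimension), bbox.max.get(dimension)) {
        (Some(&lo), Some(&hi)) => lo <= cut && cut < hi,
        // A dimension outside the box cannot be separated.
        _ => false,
    }
}

/// True if every left point is <= cut and every right point is > cut
/// in the splitting coordinate; a missing coordinate counts as a violation.
fn partition_is_valid(left: &[Vec<f32>], right: &[Vec<f32>], dimension: usize, cut: f32) -> bool {
    left.iter().all(|p| p.get(dimension).map_or(false, |v| *v <= cut))
        && right.iter().all(|p| p.get(dimension).map_or(false, |v| *v > cut))
}
```

If a cut equal to the box maximum were accepted, every point would land on the left, leaving a node with an empty side; a later delete that walks the tree by comparing against the cut can then arrive at a leaf other than the one it is looking for.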
I upgraded to your rcfv4 branch. I'm pleased to report:
- no more crashes
- uses 1/3 of the heap
- runs significantly faster
Well done.
Thanks, and thanks for trying the new branch. I am happy that it worked for you.