Comments (9)
@magrenimish This is likely a memory issue. XGBoost allocates its memory outside the JVM, so H2O and XGBoost compete for the same RAM. Please see the documentation to find out how to limit H2O's memory so that XGBoost also fits.
from h2o-3.
@tomasfryda As mentioned in the documentation, I allocated less than 2/3 of the total available RAM to H2O, leaving the rest for XGBoost. The memory available to XGBoost is well beyond 100 GB.
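A minimal sketch of the memory split discussed above, assuming the H2O backend is started from the Python client. The helper names `split_memory_gb` and `start_h2o` are hypothetical; the 2/3 share is the figure quoted in this thread and is treated here as an upper bound, not a tuned value. `max_mem_size` is the `h2o.init` argument that caps the JVM heap.

```python
from fractions import Fraction


def split_memory_gb(total_gb, h2o_share=Fraction(2, 3)):
    """Return (h2o_gb, remaining_gb) for a machine with total_gb of RAM.

    h2o_gb is rounded down so the JVM heap never exceeds the requested share;
    the remainder is what stays available to native XGBoost and the OS.
    """
    h2o_gb = int(total_gb * h2o_share)
    return h2o_gb, total_gb - h2o_gb


def start_h2o(total_gb):
    """Start (or connect to) H2O with a heap capped at the computed share."""
    import h2o  # imported lazily so the helper above works without h2o installed

    h2o_gb, _ = split_memory_gb(total_gb)
    h2o.init(max_mem_size=f"{h2o_gb}G")
```

For example, `split_memory_gb(300)` gives 200 GB to H2O and leaves 100 GB outside the JVM.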
@magrenimish Thanks for adding the available memory information. I don't see any obvious reason why it should fail like this. Would you be able to provide us with logs? Please make sure there is no confidential data in the logs (they might contain user names, column names, loaded file names, etc.).
@tomasfryda Here is the log:
automodeler.log
Thank you @magrenimish. Unfortunately, that's not the H2O (backend) log. Please see https://docs.h2o.ai/h2o/latest-stable/h2o-docs/logs.html to find out how to get the H2O (backend) logs.
Hi @tomasfryda I tried to access the H2O logs zip folder and after downloading it, I only see a 'nohup.out' file as attached here:
automodeler_h2o_logs (1).zip
That's exactly the kind of log we need, thank you @magrenimish.
It looks like the failure occurs during the data load, so it doesn't even get to AutoML.
The log ends in the middle of a line, which I think might be due to an OOM error, but that's odd because the file should be much smaller than the available memory. @wendycwong I think this is a bug related to the Parquet parser.
The end of the log:
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-43 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 122370
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-113 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 123275
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-105 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 121590
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-87 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 126722
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-47 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 125185
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-19 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 125131
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-43 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 122371
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-113 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 123276
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-105 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value: 121591
12-26 21:21:23.722 127.0.0.1:16822 9972 FJ-3-87 DEBUG org.apache.parquet.hadoop.InternalParquetRecordReader: read value
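The "ends in the middle of a line" observation can be checked mechanically. The sketch below is only an illustration: `truncated_tail` is a hypothetical helper, and it assumes that a log which was cut off abruptly does not end with a newline.

```python
def truncated_tail(log_text):
    """Return the final partial line if the log does not end with a newline.

    Returns None for an empty log or one whose last line is complete.
    """
    if not log_text or log_text.endswith("\n"):
        return None
    return log_text.rsplit("\n", 1)[-1]
```

Applied to the contents of `nohup.out`, a non-None result would be the incomplete last record, such as the final `read value` line above.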
Hi @tomasfryda @wendycwong, were you able to confirm if this was an error related to parquet parsing?
Hi Nimish:
I don't have your parquet file, so I created one for myself. I started my backend using this command:
java -Xmx50g -jar build/h2o.jar
I ran the following code. Please change the directory path to your path if you want to run my code:
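import h2o
from h2o.estimators import H2OXGBoostEstimator

h2o.init()  # connects to the backend started above if it is already running
# Build a synthetic all-numeric frame with a binary response column
fr = h2o.create_frame(rows=163481, cols=851, real_fraction=1.0, categorical_fraction=0, has_response=True,
                      response_factors=2, seed=12345, missing_fraction=0.0)
h2o.export_file(fr, "/Users/wendycwong/temp/gh_16011.parquet", header=True, format="parquet")  # export as parquet file
h2o.remove_all()
fr = h2o.import_file("/Users/wendycwong/temp/gh_16011.parquet")  # re-import through the Parquet parser
m = H2OXGBoostEstimator(ntrees=10, seed=1234)
m.train(x=list(range(1, fr.ncol)), y="response", training_frame=fr)  # predictors: every column after the first
print("Done")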
The code ran fine for me, so the file size is not the issue here (I was worried about that).
Without access to your Parquet file, I cannot debug what the problem with it is. If you can convert your Parquet file to CSV, that may work for you.
Thanks,
Wendy
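One way to try the CSV workaround suggested above is to convert the file outside H2O. This is a sketch under assumptions, not a step confirmed in the thread: it uses pandas (which needs `pyarrow` or `fastparquet` installed to read Parquet), loads the whole file into memory, and the helper names `csv_path_for` and `parquet_to_csv` are hypothetical.

```python
from pathlib import Path


def csv_path_for(parquet_path):
    """Return the sibling path with a .csv suffix (an arbitrary naming choice)."""
    return Path(parquet_path).with_suffix(".csv")


def parquet_to_csv(parquet_path):
    """Convert a Parquet file to CSV with pandas and return the CSV path."""
    import pandas as pd  # imported lazily; requires pyarrow or fastparquet

    out = csv_path_for(parquet_path)
    pd.read_parquet(parquet_path).to_csv(out, index=False)
    return out
```

The resulting file can then be loaded with `h2o.import_file(str(csv_path))`, which goes through the CSV parser instead of the Parquet one.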