
dask-lightgbm's Introduction

Dask


Dask is a flexible parallel computing library for analytics. See documentation for more information.

LICENSE

New BSD. See License File.

dask-lightgbm's People

Contributors

cotterpl, jacobtomlinson, jameslamb, jsignell, mlemainque, sfinxcz, shcherbin, striajan, strikerrus, thomasjpfan



dask-lightgbm's Issues

Migrate CI to GitHub Actions

Due to changes in the Travis CI billing, the Dask org is migrating CI to GitHub Actions.

This repo contains a .travis.yml file which needs to be replaced with an equivalent .github/workflows/ci.yml file.

See dask/community#107 for more details.
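A minimal workflow of the kind requested might look like the following (the Python versions, action versions, and test command are assumptions, not this repo's actual configuration):

```yaml
# .github/workflows/ci.yml — minimal sketch; details are assumptions
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.7", "3.8"]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e . pytest
      - run: pytest
```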

Cannot pass the tests using latest dask/distributed

Using the latest dask and distributed, I cannot pass any unit test.

The first error I encounter is that @gen_cluster no longer accepts the check_new_threads parameter. After removing it, all tests still fail. My environment (pip freeze):

appnope==0.1.0
asn1crypto==0.24.0
atomicwrites==1.3.0
attrs==19.2.0
backcall==0.1.0
bokeh==1.3.4
certifi==2019.9.11
cffi==1.12.3
chardet==3.0.4
Click==7.0
cloudpickle==1.2.2
cryptography==2.7
cytoolz==0.10.0
dask==2.5.0
dask-glm==0.2.0
-e git+https://github.com/dask/dask-lightgbm.git@15c44e7d441ac4f13053fe64543e3868574589fb#egg=dask_lightgbm
dask-ml==1.0.0
decorator==4.4.0
distributed==2.5.1
fsspec==0.5.1
future==0.17.1
greenlet==0.4.15
HeapDict==1.0.1
idna==2.8
importlib-metadata==0.23
ipython==7.8.0
ipython-genutils==0.2.0
jedi==0.14.1
Jinja2==2.10.1
joblib==0.14.0
lightgbm==2.2.3
llvmlite==0.29.0
locket==0.2.0
MarkupSafe==1.1.1
more-itertools==7.2.0
msgpack==0.6.2
multipledispatch==0.6.0
neovim==0.3.1
numba==0.45.1
numpy==1.17.2
olefile==0.46
packaging==19.2
pandas==0.25.1
parso==0.5.1
partd==1.0.0
pexpect==4.7.0
pickleshare==0.7.5
Pillow==6.2.0
pluggy==0.13.0
prompt-toolkit==2.0.9
psutil==5.6.3
ptyprocess==0.6.0
py==1.8.0
pycparser==2.19
Pygments==2.4.2
pynvim==0.3.2
pyOpenSSL==19.0.0
pyparsing==2.4.2
PySocks==1.7.1
pytest==5.2.0
python-dateutil==2.8.0
python-jsonrpc-server==0.2.0
python-language-server==0.28.3
pytz==2019.2
PyYAML==5.1.2
requests==2.22.0
scikit-learn==0.21.3
scipy==1.3.1
six==1.12.0
sortedcontainers==2.1.0
sparse==0.8.0
tblib==1.4.0
toolz==0.10.0
tornado==6.0.3
traitlets==4.3.2
urllib3==1.25.6
wcwidth==0.1.7
zict==1.0.0
zipp==0.6.0

KeyError in build_network_params

2020-11-21 09:05:51,923 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/data/jon/h2oai.fullcondatest3/h2oaicore/models.py", line 1739, in dask_fit
2020-11-21 09:05:51,924 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     func(X, y, **kwargs_dask)
2020-11-21 09:05:51,925 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/dask_lightgbm/core.py", line 187, in fit
2020-11-21 09:05:51,926 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     model = train(client, X, y, params, model_factory, sample_weight, **kwargs)
2020-11-21 09:05:51,926 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/dask_lightgbm/core.py", line 131, in train
2020-11-21 09:05:51,927 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     results = client.gather(futures_classifiers)
2020-11-21 09:05:51,927 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/client.py", line 1974, in gather
2020-11-21 09:05:51,928 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     asynchronous=asynchronous,
2020-11-21 09:05:51,928 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/client.py", line 824, in sync
2020-11-21 09:05:51,929 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
2020-11-21 09:05:51,929 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/utils.py", line 339, in sync
2020-11-21 09:05:51,930 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     raise exc.with_traceback(tb)
2020-11-21 09:05:51,930 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/utils.py", line 323, in f
2020-11-21 09:05:51,931 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     result[0] = yield future
2020-11-21 09:05:51,931 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
2020-11-21 09:05:51,932 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     value = future.result()
2020-11-21 09:05:51,932 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jon/minicondadai/lib/python3.6/site-packages/distributed/client.py", line 1833, in _gather
2020-11-21 09:05:51,933 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |     raise exception.with_traceback(traceback)
2020-11-21 09:05:51,933 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jenkins/minicondadai/lib/python3.6/site-packages/dask_lightgbm/core.py", line 60, in _train_part
2020-11-21 09:05:51,934 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   |   File "/home/jenkins/minicondadai/lib/python3.6/site-packages/dask_lightgbm/core.py", line 36, in build_network_params
2020-11-21 09:05:51,934 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   | KeyError: 'tcp://172.16.2.192:43141'
2020-11-21 09:05:51,935 C:  3% D:42.6GB  M:76.9GB  NODE:LOCAL2      20010  DATA   | ].
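The failure is a KeyError raised inside build_network_params when it looks up the local worker's address (tcp://172.16.2.192:43141) in the worker-to-port mapping, which can happen if the set of workers changes between task scheduling and training. A pure-Python sketch of that lookup pattern (hypothetical names and details, not dask-lightgbm's exact code):

```python
def build_network_params(worker_addresses, local_worker_ip,
                         local_listen_port, listen_time_out=120):
    # Assign every known worker address a distinct LightGBM port,
    # then look up this worker's own entry. A KeyError here means
    # local_worker_ip was not in worker_addresses when the map was
    # built, e.g. because a worker joined or restarted mid-run.
    addr_port_map = {addr: local_listen_port + i
                     for i, addr in enumerate(sorted(worker_addresses))}
    machines = ",".join(f"{addr.split('://')[-1].split(':')[0]}:{port}"
                        for addr, port in addr_port_map.items())
    return {
        "machines": machines,
        "local_listen_port": addr_port_map[local_worker_ip],  # <- KeyError site
        "listen_time_out": listen_time_out,
        "num_machines": len(addr_port_map),
    }
```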

Any usage examples?

I came across this project, but it is unclear how to use it. Does it also work with LightGBM's cross-validation functionality? Thanks.
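A minimal end-to-end sketch of the intended usage, assuming dask, lightgbm, and dask-lightgbm are all installed; the toy data and parameter values are illustrative, not an official example. As far as I can tell, dask-lightgbm only wraps the scikit-learn-style estimators (LGBMClassifier, LGBMRegressor), so LightGBM's lightgbm.cv helper is not covered by it.

```python
def run_example():
    # Imports live inside the function so the sketch can be loaded
    # without the heavy dependencies present.
    import dask.array as da
    from dask.distributed import Client, LocalCluster
    import dask_lightgbm.core as dlgbm

    cluster = LocalCluster(n_workers=2)
    client = Client(cluster)

    # Toy data: 1000 rows, 10 features, random binary labels.
    X = da.random.random((1000, 10), chunks=(250, 10))
    y = (da.random.random((1000,), chunks=(250,)) > 0.5).astype(int)

    # Same sklearn-style API as plain lightgbm.
    clf = dlgbm.LGBMClassifier(n_estimators=10, tree_learner="data")
    clf.fit(X, y)
    preds = clf.predict(X, client=client)

    accuracy = (y == preds).mean().compute()
    client.close()
    return accuracy

if __name__ == "__main__":
    print(run_example())
```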

Final release

In #30 we added a warning that dask-lightgbm is deprecated. It would be great to issue one final release with this change before archiving this repo.

@SfinxCZ I'm not sure what your availability is for pushing out such a release. If you'd prefer for me to handle it, I'm happy to do so; could you add me to the dask-lightgbm project on PyPI? My PyPI username is the same as my GitHub username.

Install via pip

Will this be provided as a pip package similar to dask_ml.xgboost?

Please recompile with CMake option -DUSE_GPU=1

[LightGBM] [Fatal] GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1
distributed.worker - WARNING -  Compute Failed

If I try to use GPUs I get this message, even though my base lightgbm was compiled with GPU support and non-Dask models work fine.

Is this expected?
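Note that the error comes from the lightgbm package imported inside the Dask worker processes, so every worker's environment needs a GPU-enabled build, not just the environment used for non-Dask models. A build sketch following LightGBM's installation guide (paths and exact steps may vary by version):

```shell
# Build LightGBM with GPU support (requires OpenCL headers/drivers and CMake).
git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM
mkdir build && cd build
cmake -DUSE_GPU=1 ..
make -j4
# Install the Python package that wraps the freshly built library,
# repeating this in each Dask worker's environment.
cd ../python-package
python setup.py install --precompiled
```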

Move default branch from "master" -> "main"

@jrbourbeau and I are in the process of moving the default branch for this repo from master to main.

  • Changed in GitHub
  • Merged PR to change branch name in code (xref #28)

What you'll see

Once the name on GitHub is changed (the first box above is checked, or this issue is closed), when you try to git pull you'll get:

Your configuration specifies to merge with the ref 'refs/heads/master'
from the remote, but no such ref was fetched.

What you need to do

First, head to your fork and rename the default branch there. Then run:

git branch -m master main
git fetch origin
git branch -u origin/main main

MPI support

Since LightGBM itself supports MPI, would it be possible to use MPI for high-performance communication in Dask?

[LightGBM] [Warning] Set TCP_NODELAY failed

What happened:

d_classif = dlgbm.LGBMClassifier(n_estimators=50, local_listen_port=12400)
d_classif.fit(dX, dy)
Parameter tree_learner not set or set to incorrect value (None), using "data" as default
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Info] Trying to bind port 12401...
[LightGBM] [Info] Binding port 12401 succeeded
[LightGBM] [Info] Listening...
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Info] Trying to bind port 12400...
[LightGBM] [Info] Binding port 12400 succeeded
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Info] Listening...
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Info] Connected to rank 1
[LightGBM] [Warning] Set TCP_NODELAY failed
[LightGBM] [Info] Local rank: 0, total number of machines: 2
[LightGBM] [Info] Connected to rank 0
[LightGBM] [Info] Local rank: 1, total number of machines: 2
[LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1
[LightGBM] [Warning] num_threads is set=1, n_jobs=-1 will be ignored. Current value: num_threads=1

import os

import dask.dataframe as dd
from dask.distributed import Client
import dask_lightgbm.core as dlgbm

client = Client() 

data = dd.read_csv('/data/platform/code/dask-lightgbm/system_tests/data/*.gz', compression='gzip', blocksize=None)

dX = data.iloc[:, :-1]
dy = data.iloc[:, -1]

d_classif = dlgbm.LGBMClassifier(n_estimators=50, local_listen_port=12400)
d_classif.fit(dX, dy)

Anything else we need to know?:

Environment:

  • Dask version:
    dask 2.27.0
    dask-glm 0.2.0
    dask-lightgbm 0.1.0
    dask-ml 1.7.0

  • Python version:
    3.6

  • Operating System:
    centos7

Local cluster fails. LightGBMError: Machine list file doesn't contain the local machine

What happened:
The model.fit() call fails with the following error:
LightGBMError: Machine list file doesn't contain the local machine

What you expected to happen:
I expected the dask-lightgbm wrapper to handle the machine file / setup

Minimal Complete Verifiable Example:
X_train / y_train are both dask.dataframe.core.DataFrame

from dask.distributed import Client
import dask.dataframe as dd
client = Client()

X_train, y_train = get_data()

model = dlgbm.LGBMRegressor(n_estimators=50, tree_learner='data')
model.fit(
    X_train, 
    y_train
)

Anything else we need to know?:

---------------------------------------------------------------------------
LightGBMError                             Traceback (most recent call last)
<ipython-input-22-71879a8a97e6> in <module>
      2 model.fit(
      3     X_train,
----> 4     y_train
      5 )

d:\thor-dask\env\lib\site-packages\dask_lightgbm\core.py in fit(self, X, y, sample_weight, client, **kwargs)
    219         model_factory = lightgbm.LGBMRegressor
    220         params = self.get_params(True)
--> 221         model = train(client, X, y, params, model_factory, sample_weight, **kwargs)
    222 
    223         self.set_params(**model.get_params())

d:\thor-dask\env\lib\site-packages\dask_lightgbm\core.py in train(client, data, label, params, model_factory, weight, **kwargs)
    129                            for worker, list_of_parts in worker_map.items()]
    130 
--> 131     results = client.gather(futures_classifiers)
    132     results = [v for v in results if v]
    133     return results[0]

d:\thor-dask\env\lib\site-packages\distributed\client.py in gather(self, futures, errors, direct, asynchronous)
   1991                 direct=direct,
   1992                 local_worker=local_worker,
-> 1993                 asynchronous=asynchronous,
   1994             )
   1995 

d:\thor-dask\env\lib\site-packages\distributed\client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    832         else:
    833             return sync(
--> 834                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    835             )
    836 

d:\thor-dask\env\lib\site-packages\distributed\utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    337     if error[0]:
    338         typ, exc, tb = error[0]
--> 339         raise exc.with_traceback(tb)
    340     else:
    341         return result[0]

d:\thor-dask\env\lib\site-packages\distributed\utils.py in f()
    321             if callback_timeout is not None:
    322                 future = asyncio.wait_for(future, callback_timeout)
--> 323             result[0] = yield future
    324         except Exception as exc:
    325             error[0] = sys.exc_info()

d:\thor-dask\env\lib\site-packages\tornado\gen.py in run(self)
    733 
    734                     try:
--> 735                         value = future.result()
    736                     except Exception:
    737                         exc_info = sys.exc_info()

d:\thor-dask\env\lib\site-packages\distributed\client.py in _gather(self, futures, errors, direct, local_worker)
   1850                             exc = CancelledError(key)
   1851                         else:
-> 1852                             raise exception.with_traceback(traceback)
   1853                         raise exc
   1854                     if errors == "skip":

d:\thor-dask\env\lib\site-packages\dask_lightgbm\core.py in _train_part()
     69     try:
     70         model = model_factory(**params)
---> 71         model.fit(data, label, sample_weight=weight)
     72     finally:
     73         _safe_call(_LIB.LGBM_NetworkFree())

d:\thor-dask\env\lib\site-packages\lightgbm\sklearn.py in fit()
    741                                        verbose=verbose, feature_name=feature_name,
    742                                        categorical_feature=categorical_feature,
--> 743                                        callbacks=callbacks)
    744         return self
    745 

d:\thor-dask\env\lib\site-packages\lightgbm\sklearn.py in fit()
    598                               verbose_eval=verbose, feature_name=feature_name,
    599                               categorical_feature=categorical_feature,
--> 600                               callbacks=callbacks)
    601 
    602         if evals_result:

d:\thor-dask\env\lib\site-packages\lightgbm\engine.py in train()
    226     # construct booster
    227     try:
--> 228         booster = Booster(params=params, train_set=train_set)
    229         if is_valid_contain_train:
    230             booster.set_train_data_name(train_data_name)

d:\thor-dask\env\lib\site-packages\lightgbm\basic.py in __init__()
   1707                                      local_listen_port=params.get("local_listen_port", 12400),
   1708                                      listen_time_out=params.get("listen_time_out", 120),
-> 1709                                      num_machines=params.get("num_machines", num_machines))
   1710                     break
   1711             # construct booster object

d:\thor-dask\env\lib\site-packages\lightgbm\basic.py in set_network()
   1838                                          ctypes.c_int(local_listen_port),
   1839                                          ctypes.c_int(listen_time_out),
-> 1840                                          ctypes.c_int(num_machines)))
   1841         self.network = True
   1842         return self

d:\thor-dask\env\lib\site-packages\lightgbm\basic.py in _safe_call()
     43     """
     44     if ret != 0:
---> 45         raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
     46 
     47 

LightGBMError: Machine list file doesn't contain the local machine

Environment:

  • Dask version: 2.28.0
  • Python version: 3.6
  • Operating System: windows 10
  • Install method (conda, pip, source): pip

Program blocks for a long time

import os
import logging

import dask.dataframe as dd
from dask.distributed import Client
import dask_lightgbm.core as dlgbm

logging.basicConfig(format='%(asctime)s, %(message)s', level=logging.INFO)
listen_port = 12400

def make_client():
    return Client('127.0.0.1:51127')

def test_classify_newsread(client, listen_port):
    logging.info("Read data ...")
    data = dd.read_csv('./system_tests/data/*.gz', compression='gzip', blocksize=None)
    dX = data.iloc[:, :-1]
    dy = data.iloc[:, -1]

    logging.info("Train classifier ...")
    d_classif = dlgbm.LGBMClassifier(n_estimators=50, local_listen_port=listen_port)
    d_classif.fit(dX, dy)

    logging.info("Predict ...")
    dy_pred = d_classif.predict(dX, client=client)

    logging.info("Evaluate metrics ...")
    acc_score = (dy == dy_pred).sum() / len(dy)
    acc_score = acc_score.compute()
    print(acc_score)

    assert acc_score > 0.8

test_classify_newsread(make_client(), listen_port)

I launched a dask.distributed.Client object at '127.0.0.1:51127', but the program never finishes.

