Comments (7)

strani avatar strani commented on August 21, 2024

This stuff works if used correctly ;-)

You are missing one step: converting the XGBoost model into an RTEnsemble model (the class that represents an ensemble of regression trees in RankEval).

For example, take a look at the notebook showing how to use the RTEnsemble model together with LightGBM (the XGBoost case is almost the same).

So to summarize:

  • train the model with the xgboost API
  • save the model to a file in text format (not binary), using the dump_model method of the XGBoost API
  • load the model from the aforementioned file with the RTEnsemble class
  • score/predict/analyze/do whatever you want with this model, from now on in the RankEval format

P.S. The dataset also has to be in the RankEval format. You cannot pass a raw numpy array directly, but you can easily create a RankEval dataset from numpy arrays (using X, y and query_ids).
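The steps above can be sketched roughly as follows. This is a minimal sketch, not the official recipe: the function name `xgboost_to_rankeval` and the default dump path are made up for illustration, `xgb_model` is assumed to be an already fitted scikit-learn-style XGBoost model, and the `Dataset(X, y, query_ids, name=...)` constructor and `score` method reflect my reading of the RankEval API and may differ between versions.

```python
def xgboost_to_rankeval(xgb_model, X, y, query_ids, dump_path="xgb.model.txt"):
    """Dump a fitted XGBoost model as text and reload it as a RankEval model.

    Hypothetical helper for illustration. X, y and query_ids are expected to
    be numpy arrays, with one query id per row of X.
    """
    # Imported lazily so this module can be loaded without RankEval installed.
    from rankeval.dataset import Dataset
    from rankeval.model import RTEnsemble

    # Step 2: write the trained booster to a text (not binary) dump.
    xgb_model.get_booster().dump_model(dump_path)

    # Step 3: load the text dump as a RankEval ensemble of regression trees.
    model = RTEnsemble(dump_path, name="XGB model", format="XGBoost")

    # P.S.: wrap the raw arrays in a RankEval dataset (assumed signature).
    dataset = Dataset(X, y, query_ids, name="my dataset")

    # Step 4: from here on everything happens in the RankEval format.
    scores = model.score(dataset)
    return model, dataset, scores
```

Step 1 (training) is left out on purpose, since it is plain XGBoost code and does not involve RankEval at all.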

from rankeval.

paulperry avatar paulperry commented on August 21, 2024

Closer, but I'm not able to reload the model. Do I need to worry about how the model was trained?

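from rankeval.model import RTEnsemble

# xgb_model is the already trained scikit-learn-style XGBoost model
filename = 'xgb.model'
xgb_model.get_booster().dump_model(filename)
rankeval_model = RTEnsemble(filename, name="XGB model", format="XGBoost")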

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-728-5975ee1d8cc2> in <module>()
      6 rankeval_model = None
      7 xgb_model.get_booster().dump_model(filename)
----> 8 rankeval_model = RTEnsemble(filename, name="XGB model", format="XGBoost")

~/rankeval/rankeval/rankeval/model/rt_ensemble.py in __init__(self, file_path, name, format, base_score, learning_rate, n_trees)
    124         elif format == "XGBoost":
    125             from rankeval.model import ProxyXGBoost
--> 126             ProxyXGBoost.load(file_path, self)
    127         elif format == "ScikitLearn":
    128             from rankeval.model import ProxyScikitLearn

~/rankeval/rankeval/rankeval/model/proxy_XGBoost.py in load(file_path, model)
    104                     node_id = int(match_leaf.group(1).strip()) + root_node
    105                     leaf_value = float(match_leaf.group(2).strip())
--> 106                     model.trees_nodes_value[node_id] = leaf_value
    107 
    108                 if match_node or match_leaf:

IndexError: index 344 is out of bounds for axis 0 with size 344

strani avatar strani commented on August 21, 2024

I suspect your problem is related to open issue #12, which I discovered recently. It occurs when XGBoost, after fitting a tree, prunes some of its nodes before continuing with the boosting phase.

I'll try to fix it tomorrow. Your code looks correct.
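To check whether a dump is affected, something like the following can help. It is only a diagnostic sketch under assumptions: it assumes pruning leaves gaps in the node ids of the text dump, and that a loader which sizes its arrays by the number of node lines but indexes them by node id would then run past the end of the array. `find_id_gaps` and `SAMPLE` are hypothetical names, not part of RankEval or XGBoost.

```python
import re

TREE_HEADER = re.compile(r"^booster\[(\d+)\]:")  # start of a tree in the dump
NODE_LINE = re.compile(r"^\s*(\d+):")            # node id at the start of a line


def find_id_gaps(dump_text):
    """Return {tree_index: (n_nodes, max_node_id)} for every tree whose
    node ids are not exactly 0..n_nodes-1."""
    ids_per_tree = {}
    current = None
    for line in dump_text.splitlines():
        header = TREE_HEADER.match(line)
        if header:
            current = int(header.group(1))
            ids_per_tree[current] = []
            continue
        node = NODE_LINE.match(line)
        if node and current is not None:
            ids_per_tree[current].append(int(node.group(1)))
    return {
        tree: (len(ids), max(ids))
        for tree, ids in ids_per_tree.items()
        if ids and max(ids) != len(ids) - 1
    }


# Hand-written sample: the second tree has 5 nodes but its largest id is 6,
# as if the children of node 1 (ids 3 and 4) had been pruned away.
SAMPLE = (
    "booster[0]:\n"
    "0:[f0<0.5] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.1\n"
    "\t2:leaf=-0.1\n"
    "booster[1]:\n"
    "0:[f1<1.5] yes=1,no=2,missing=1\n"
    "\t1:leaf=0.2\n"
    "\t2:[f0<2.5] yes=5,no=6,missing=5\n"
    "\t\t5:leaf=0.3\n"
    "\t\t6:leaf=-0.3\n"
)

print(find_id_gaps(SAMPLE))
```

On a real model you would pass the contents of the dumped file instead of `SAMPLE`. An empty result means every tree uses contiguous node ids.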

strani avatar strani commented on August 21, 2024

The problem has been solved. The fix is in the develop branch and will be merged into master soon.

paulperry avatar paulperry commented on August 21, 2024

I'm able to build and run master, but I'm not able to run develop:
import rankeval

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-c66b5899c31b> in <module>()
----> 1 import rankeval

~/anaconda3/lib/python3.6/site-packages/rankeval-0.7.2-py3.6-macosx-10.7-x86_64.egg/rankeval/__init__.py in <module>()
      9 
     10 __version__ = io.open(os.path.join(cur_dir, '..', 'VERSION'),
---> 11                       encoding='utf-8').read().strip()

FileNotFoundError: [Errno 2] No such file or directory: '/Users/paulperry/anaconda3/lib/python3.6/site-packages/rankeval-0.7.2-py3.6-macosx-10.7-x86_64.egg/rankeval/../VERSION'

I just copied the VERSION file over and it worked. I'm now past this problem and the model now loads.

strani avatar strani commented on August 21, 2024

Could you kindly retry? This bug should already have been fixed in master, so I have now merged all the PRs made there into the develop branch as well. Let me know.

paulperry avatar paulperry commented on August 21, 2024

I'm past this issue, so closing.
