
Explore and compare 1K+ accurate decision trees in your browser!

Home Page: https://poloclub.github.io/timbertrek

License: MIT License


timbertrek's Introduction

TimberTrek

DOI: 10.1109/VIS54862.2022.00021

Curate decision trees that align with your knowledge and values!

🚀 Live Demo 📺 Demo Video 👨🏻‍🏫 Conference Talk 📖 Research Paper

Web Demo

For a live web demo, visit: https://poloclub.github.io/timbertrek.

You can use the web demo to explore your own Rashomon Sets! Just choose the "my own set" tab below the tool and upload a JSON file containing all decision paths in your Rashomon Set.

Check out this example notebook to see how to generate the whole Rashomon Set and the JSON file.
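
Before uploading, it can help to confirm that the file you wrote is valid JSON. The following is a minimal sketch using only the Python standard library. The `decision_paths` value here is a placeholder and does not reflect the real structure TimberTrek expects; the helper name `save_json_and_verify` and the output file name are also assumptions made for illustration.

```python
import json
import os
import tempfile


def save_json_and_verify(obj, path):
    """Write `obj` to `path` as JSON, then reload it to confirm it round-trips."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f)
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)
    return reloaded == obj


# Placeholder only: this is NOT the real TimberTrek decision-path schema.
decision_paths = {"placeholder_key": ["placeholder", "values"], "count": 2}

# Hypothetical output location in the system temp directory
out_path = os.path.join(tempfile.gettempdir(), "my_rashomon_set.json")
ok = save_json_and_verify(decision_paths, out_path)
print("file is valid JSON and round-trips:", ok)
```

In practice, you would replace the placeholder with the decision paths produced in the example notebook.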

Notebook Demos

You can directly use TimberTrek in your favorite computational notebooks (e.g. Jupyter Notebook/Lab, Google Colab, and VS Code Notebook).

Check out three live notebook demos below.

Jupyter Lite | Binder | Google Colab

Install

To use TimberTrek in a notebook, install it with pip:

pip install timbertrek
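
To check that the install landed in the environment your notebook is using, a small sketch like the one below may help. It relies only on the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is an assumption made for illustration and is not part of TimberTrek.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("timbertrek")
if version is None:
    print("timbertrek is not installed in this environment")
else:
    print(f"timbertrek {version} is installed")
```

If the package is reported as missing, the notebook kernel is likely using a different environment from the one where you ran pip.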

Development

Clone or download this repository:

git clone git@github.com:poloclub/timbertrek.git

Install the dependencies:

npm install

Then run TimberTrek:

npm run dev

Navigate to localhost:3000. You should see TimberTrek running in your browser :)

Credits

Led by Jay Wang, TimberTrek is the result of a collaboration between ML and visualization researchers from Georgia Tech, Duke University, Fujitsu Laboratories, and the University of British Columbia. TimberTrek was created by Jay Wang, Chudi Zhong, Rui Xin, Takuya Takagi, Zhi Chen, Polo Chau, Cynthia Rudin, and Margo Seltzer.

Citation

To learn more about TimberTrek, please read our research paper (published at IEEE VIS 2022). To learn more about the algorithm to generate the whole Rashomon set of sparse decision trees, please read our TreeFARMS paper (published at NeurIPS'22). If you find TimberTrek useful for your research, please consider citing our paper. Thanks!

@inproceedings{wangTimberTrekExploringCurating2022,
  title = {{{TimberTrek}}: {{Exploring}} and {{Curating Trustworthy Decision Trees}} with {{Interactive Visualization}}},
  booktitle = {2022 {{IEEE Visualization Conference}} ({{VIS}})},
  author = {Wang, Zijie J. and Zhong, Chudi and Xin, Rui and Takagi, Takuya and Chen, Zhi and Chau, Duen Horng and Rudin, Cynthia and Seltzer, Margo},
  year = {2022}
}

License

The software is available under the MIT License.

Contact

If you have any questions, feel free to open an issue or contact Jay Wang.


timbertrek's Issues

Integration with Qualtrics?

Hi there!

Thank you once again for developing this amazing tool.

Would it be possible to embed the interactive in a Qualtrics Survey?

Thanks!

Issue with timbertrek.transform_trie_to_rules

Trying to convert a trie calculated by the treeFARMS package into a rules JSON for timbertrek using code suggested in another ticket (#2), but getting an error. The trie has 80739 trees according to the message.

Code:

from json import dump

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from treefarms.model.threshold_guess import compute_thresholds, cut
from treefarms import TREEFARMS
from treefarms.model.model_set import ModelSetContainer
import timbertrek

# `data`, `df`, and `metric_cols` are assumed to be defined earlier in the notebook
X = data
# h = list(X.columns)
for met in metric_cols:
    y = df[met]
    config = {
        # Regularization penalizes trees with more leaves. A relatively high
        # value is recommended to find sparse trees.
        "regularization": 0.02,
        # The Rashomon bound multiplier controls how large a Rashomon set to generate.
        "rashomon_bound_multiplier": 0.25,
        "depth_budget": 0,
    }
    model = TREEFARMS(config)
    print('configed')
    model.fit(X, y)
    print('fitted')
    # Get the Rashomon set in a trie structure
    trie = model.model_set.to_trie()
    print('trie-d')
    # Use a separate name so the outer `df` (the source of `y`) is not overwritten
    dataset_df = model.dataset
    # Convert the trie to decision paths
    feature_names = dataset_df.columns
    decision_paths = timbertrek.transform_trie_to_rules(
        trie,
        dataset_df,
        feature_names=feature_names,
    )
    # Save the decision paths in a JSON file
    with open('tree_for_' + str(met) + '.json', 'w') as f:
        dump(decision_paths, f)

Error:

IndexError Traceback (most recent call last)
/tmp/ipykernel_16824/365095990.py in <module>
26 trie,
27 df,
---> 28 feature_names=feature_names,
29 )
30 # Save the decision paths in a JSON file

/opt/conda/lib/python3.7/site-packages/timbertrek/timbertrek.py in transform_trie_to_rules(trie, data_df, feature_names, feature_description)
683 # Construct trees
684 decision_rule_hierarchy, tree_map = get_decision_rule_hierarchy_dict(
--> 685 trie, keep_position=False
686 )
687 new_tree_map = get_tree_map_hierarchy(tree_map)

/opt/conda/lib/python3.7/site-packages/timbertrek/timbertrek.py in get_decision_rule_hierarchy_dict(trie, keep_position)
483 for i in tree_map["map"]:
484 cur_string = tree_map["map"][i][0]
--> 485 all_rules = get_decision_rules(cur_string)
486
487 # Iterate the set and build the hierarchy dict

/opt/conda/lib/python3.7/site-packages/timbertrek/timbertrek.py in get_decision_rules(tree_strings)
238 cur_feature, pre_features = working_queue.popleft()
239
--> 240 cur_string = tree_strings[i]
241 cur_string_split = cur_string.split()
242

IndexError: list index out of range

Including screenshots.
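
When a conversion like this runs inside a loop over several targets, one way to narrow down the failure is to catch the `IndexError` per target and keep going, so you can see whether every target fails or only some. The sketch below shows that pattern under stated assumptions: `convert_each` and `fake_convert` are hypothetical names, and `fake_convert` is a stand-in for the fit-and-transform step above. It does not call TreeFARMS or TimberTrek and does not diagnose the underlying cause of the error.

```python
from typing import Callable, Dict, Iterable, Tuple


def convert_each(
    names: Iterable[str],
    convert: Callable[[str], object],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run `convert` per name; collect results and IndexError messages separately."""
    results: Dict[str, object] = {}
    failures: Dict[str, str] = {}
    for name in names:
        try:
            results[name] = convert(name)
        except IndexError as err:
            # Record the failure and continue so the other targets still run
            failures[name] = str(err)
    return results, failures


def fake_convert(name: str) -> object:
    """Hypothetical stand-in for the fit + transform_trie_to_rules step."""
    if name == "metric_b":
        raise IndexError("list index out of range")
    return {"target": name}


results, failures = convert_each(["metric_a", "metric_b", "metric_c"], fake_convert)
print("converted:", sorted(results))
print("failed:", failures)
```

Knowing which targets fail can make it easier to attach a small reproducible example to the issue.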

Link to example notebook for "generate your own JSON" is broken

Source URL: https://github.com/poloclub/timbertrek/blob/master/README.md

You can use the web demo to explore your own Rashomon Sets! You just need to choose the my own set tab below the tool and upload a JSON file containing all decision paths in your Rashomon Set.

Check out this example notebook to see how to generate this JSON file.

This link is broken. I liked the paper and wanted to see if I could use my own data, but it is unclear how to do this. On the TimberTrek github.io page, the link in the widget that lets you pick your own data is also broken: it redirects to the README, and the link in the README is broken.

Thank you
