benbo / interactive-weak-supervision

Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling

License: MIT License



Code for text data experiments in:

Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
Benedikt Boecking, Willie Neiswanger, Eric Xing, and Artur Dubrawski
International Conference on Learning Representations (ICLR), 2021
arXiv:2012.06046

In brief: Please check out the IWS.ipynb notebook. The notebook walks you through an example application of IWS from start to finish. You can answer the queries about proposed labeling functions yourself, or simulate an oracle that answers them.

Dependencies

If you have access to GPUs and want to ensure they can be used, please first check the correct way to install PyTorch on your system here: https://pytorch.org

Once you have installed PyTorch (or if you don't need GPU support for now), you can install all requirements by running:

pip install -r requirements/requirements_pip.txt
# or
conda install -c conda-forge -c pytorch --file requirements/requirements_conda.txt
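
After installation, a quick check like the following can confirm whether PyTorch is importable and whether it sees a GPU. This is a minimal sketch, not part of the repository; the helper name `describe_torch_device` is hypothetical, and the only PyTorch call used is `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_device() -> str:
    """Report which device PyTorch would use, or that PyTorch is missing."""
    # find_spec avoids an ImportError when torch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # is_available() is True only if a CUDA-capable GPU and a CUDA build are present.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_torch_device())
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build is likely a CPU-only one, and the install instructions at https://pytorch.org are the place to look.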

Data

To download all data for the text experiments, run:

cd datasets
# quote the URL so the shell does not treat "?" as a glob character
wget "https://ndownloader.figshare.com/files/25732838?private_link=860788136944ad107def" -O iws_datasets.tar.gz
tar -xzvf iws_datasets.tar.gz
rm iws_datasets.tar.gz

Please see datasets/README.md for links and references to the original data sources and please cite the original sources where appropriate.

Running the IWS Notebook

To run text data experiments, please see the IWS.ipynb notebook.

This notebook will walk you through a full example of interactive weak supervision (IWS), from start to finish. It allows you to choose a text dataset, generate a family of labeling functions (LFs), and then run IWS on this family of LFs, either with an automated oracle or by querying you directly for feedback on LFs. It then trains a downstream classifier via weak supervision methods, using the LFs learned during IWS.
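
The workflow described above can be illustrated with a toy sketch. It is not the code from the notebook or the repository: the corpus, the function names, and the 0.7 accuracy threshold are all assumptions made for illustration. It also simplifies heavily. It asks the simulated oracle about every candidate LF, rather than choosing which LFs to query, and it aggregates LF votes with a plain majority vote instead of a weak supervision label model and a downstream classifier.

```python
from collections import Counter

# Hypothetical toy corpus with ground-truth labels (+1 positive, -1 negative).
DOCS = [
    ("great movie loved it", 1),
    ("great acting wonderful plot", 1),
    ("loved the wonderful cast", 1),
    ("terrible movie hated it", -1),
    ("boring plot terrible acting", -1),
    ("hated the boring cast", -1),
]


def make_keyword_lfs(docs, min_count=2):
    """Generate a family of LFs: one (word, label) pair per frequent word and class."""
    counts = Counter(w for text, _ in docs for w in set(text.split()))
    vocab = sorted(w for w, c in counts.items() if c >= min_count)
    return [(w, y) for w in vocab for y in (1, -1)]


def apply_lf(lf, text):
    """Vote with the LF's label if its keyword occurs in the text, else abstain (0)."""
    word, label = lf
    return label if word in text.split() else 0


def oracle_says_useful(lf, docs, threshold=0.7):
    """Simulated oracle: an LF is useful if its accuracy where it fires exceeds threshold."""
    fired = [(apply_lf(lf, t), y) for t, y in docs if apply_lf(lf, t) != 0]
    if not fired:
        return False
    accuracy = sum(vote == y for vote, y in fired) / len(fired)
    return accuracy > threshold


def majority_vote(lfs, text):
    """Aggregate LF votes into a weak label; 0 means no LF fired or the votes tied."""
    total = sum(apply_lf(lf, text) for lf in lfs)
    return 1 if total > 0 else -1 if total < 0 else 0


lfs = make_keyword_lfs(DOCS)
useful = [lf for lf in lfs if oracle_says_useful(lf, DOCS)]
weak_labels = [majority_vote(useful, text) for text, _ in DOCS]
print(useful)
print(weak_labels)
```

In this toy setting, words that appear in both classes (such as "movie") are rejected by the oracle, while class-specific words (such as "great" or "boring") are kept and used to label the documents.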

License

The code in this repository is shared under the MIT license, available in the LICENSE file.

Citation

Please cite our paper if you use code from this repo:

@inproceedings{boecking2021interactive,
  title={Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling},
  author={Boecking, Benedikt and Neiswanger, Willie and Xing, Eric and Dubrawski, Artur},
  booktitle={International Conference on Learning Representations},
  year={2021}
}


interactive-weak-supervision's Issues

Question regarding usage

Hi, first of all, thanks for the great paper and the implementation. I tried running the notebook on the IMDB dataset. When I run it in interactive mode (answering the queries myself, not using the oracle), I get the first 4 random questions, but then it does not move forward. Any idea what might be happening? Thanks again!
