ayushbits / robust-aggregate-lfs

Source code of our ACL 2022 paper 'Learning to robustly aggregate labeling functions for semi-supervised data programming'

Home Page: https://arxiv.org/abs/2109.11410

Python 64.11% Jupyter Notebook 35.89%
data-programming machine-learning semi-supervision

robust-aggregate-lfs's Introduction

How to reproduce results

  1. CUDA_LAUNCH_BLOCKING=0 python3 gpu_rewt_ss_generic.py /tmp l1 0 l3 l4 0 l6 qg 5 <dataset_path> <num_class> nn 0 <batch_size> <lr_learning_rate> <gm_learning_rate> normal f1
  • <dataset_path> is the path to the directory of the stored LFs
  • <num_class> is the number of classes in the dataset (e.g., TREC has 6 classes and SMS has 2)
  • <batch_size> is kept at 32 in all our experiments
  • <lr_learning_rate> is set to 0.0003
  • <gm_learning_rate> is set to 0.01
  • The last argument can be either f1 or accuracy, where f1 refers to macro-F1.
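
As a minimal sketch of how the placeholders above might be filled in, the following Python helper assembles the argument list using the values given in the bullets (batch size 32, learning rates 0.0003 and 0.01). The helper name `build_train_command` and the dataset path `path/to/trec_lfs` are hypothetical and not part of the repository; the fixed tokens are copied verbatim from the command above.

```python
import shlex

# Fixed tokens copied verbatim from the command above.
FIXED_HEAD = ["python3", "gpu_rewt_ss_generic.py", "/tmp",
              "l1", "0", "l3", "l4", "0", "l6", "qg", "5"]


def build_train_command(dataset_path, num_class, batch_size=32,
                        lr_learning_rate=0.0003, gm_learning_rate=0.01,
                        metric="f1"):
    """Return the training command as a list of tokens (hypothetical helper)."""
    if metric not in ("f1", "accuracy"):
        raise ValueError("metric must be 'f1' or 'accuracy'")
    return FIXED_HEAD + [
        str(dataset_path), str(num_class), "nn", "0",
        str(batch_size), str(lr_learning_rate), str(gm_learning_rate),
        "normal", metric,
    ]


# Hypothetical path to a directory of stored LFs; TREC has 6 classes.
cmd = build_train_command("path/to/trec_lfs", num_class=6)
print("CUDA_LAUNCH_BLOCKING=0 " + shlex.join(cmd))
```

To launch training rather than just print the command, the list could be passed to `subprocess.run(cmd, env={**os.environ, "CUDA_LAUNCH_BLOCKING": "0"})` from the directory that contains `gpu_rewt_ss_generic.py`.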

How to automatically generate LFs

  1. cd reef/
  2. python generate_human_lfs.py dataset(imdb/trec/sms/youtube) count/lemma savetype(dict/lemma)
  • 1st argument is the dataset name (one of imdb/trec/sms/youtube/sst5/twitter)
  • 2nd argument selects generation of raw (count) or lemmatized (lemma) features
  • 3rd argument is the path of the directory in which to save the generated LFs
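
The argument checks implied by the bullets above can be sketched in Python as follows. This is only an illustration: `human_lf_command` is a hypothetical wrapper, the allowed values are taken from the bullets, and the third argument is passed through unchanged because its accepted values are not fully specified here (the example value `dict` is copied from the usage line).

```python
import shlex

# Allowed values as listed in the bullets above.
DATASETS = {"imdb", "trec", "sms", "youtube", "sst5", "twitter"}
FEATURE_TYPES = {"count", "lemma"}


def human_lf_command(dataset, feature_type, third_arg):
    """Validate the first two arguments and return the command tokens."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if feature_type not in FEATURE_TYPES:
        raise ValueError(f"feature type must be one of {sorted(FEATURE_TYPES)}")
    # The third argument is forwarded as given; its valid values are an assumption.
    return ["python", "generate_human_lfs.py", dataset, feature_type, str(third_arg)]


print(shlex.join(human_lf_command("sms", "lemma", "dict")))
```

The resulting command is meant to be run from inside `reef/`, for example with `subprocess.run(tokens, cwd="reef")`.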

Generate LFs with Snuba

  1. cd reef/
  2. python generic_generate_labels.py youtube normal dt 1 26 yt_val2.5_sup5_dt1 count
  • 1st argument is the dataset name (one of imdb/trec/sms/youtube/sst5/twitter)
  • 2nd argument is the prefix of the generated pkl files
  • 3rd argument is the number of LFs per step
  • 4th argument is the number of epochs
  • 5th argument is the storage path (LFs/data/youtube/<storage_path>) where the pkl files will be stored
  • 6th argument is the type of features
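
After a run, it can be handy to check which pkl files were written. The sketch below resolves the storage directory described above and lists the pickles in it. It rests on several assumptions: that `LFs/data` is relative to the current working directory, that the same `LFs/data/<dataset>/<storage_path>` layout applies to datasets other than youtube, that `yt_val2.5_sup5_dt1` is the storage-path value in the example command, and that generated file names start with the given prefix. Both function names are hypothetical.

```python
from pathlib import Path


def lf_storage_dir(dataset, storage_path, root="LFs/data"):
    """Directory where the pkl files are expected (layout is an assumption)."""
    return Path(root) / dataset / storage_path


def list_generated_pickles(directory, prefix=""):
    """Return sorted names of .pkl files in `directory` that start with `prefix`."""
    directory = Path(directory)
    if not directory.is_dir():
        return []  # nothing generated yet, or a different layout than assumed
    return sorted(p.name for p in directory.glob(f"{prefix}*.pkl"))


out_dir = lf_storage_dir("youtube", "yt_val2.5_sup5_dt1")
print(out_dir.as_posix())
print(list_generated_pickles(out_dir, prefix="normal"))
```

An empty list means either that no files exist yet or that the assumed layout does not match the actual one.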
