MCCWS

Code for experiments on multi-criteria Chinese word segmentation (MCCWS).

Installation

# Some datasets are distributed as RAR archives, so unrar is needed to extract them.
apt install unrar

# Install the Python dependencies.
pipenv install

Development Installation

pipenv install --dev

Execution

# First run preprocessing.
python -m src.preprocess \
  --dev_ratio 0.1 \
  --exp_name my_pre_exp \
  --model_name bert-base-chinese \
  --max_len 60 \
  --seed 42 \
  --use_dset as \
  --use_dset cityu \
  --use_dset msr \
  --use_dset pku \
  --use_width_norm 1 \
  --use_num_norm 1 \
  --use_alpha_norm 1 \
  --use_mix_alpha_num_norm 1 \
  --use_unc 1
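
As a rough illustration of what a flag such as `--use_width_norm` might do, the sketch below folds full-width characters into their half-width forms with Unicode NFKC normalization. This is an assumption made for illustration only: this README does not describe how `src.preprocess` implements width normalization, and `width_normalize` is a hypothetical name, not a function from this repository.

```python
import unicodedata


def width_normalize(text: str) -> str:
    """Fold full-width characters into their half-width forms.

    Assumption: NFKC is used here as a stand-in; the actual
    preprocessing in this repository may work differently.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # Full-width Latin letters and digits become plain ASCII.
    print(width_normalize("ＡＢＣ１２３"))
```

Note that NFKC also rewrites other compatibility characters (ligatures, for example), so a dedicated mapping table may be preferable when only character width should change.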

# Then use the preprocessing result to train MCCWS.
python -m src.train_mccws \
  --batch_size 64 \
  --ckpt_step 5000 \
  --exp_name my_model_exp \
  --gpu 0 \
  --log_step 1000 \
  --lr 2e-5 \
  --max_norm 10.0 \
  --pre_exp_name my_pre_exp \
  --p_drop 0.1 \
  --seed 42 \
  --total_step 200000 \
  --use_unc 1 \
  --warmup_step 50000 \
  --weight_decay 0.0
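
To make the interaction of `--lr`, `--warmup_step`, and `--total_step` concrete, here is a minimal sketch of one common schedule: linear warmup to the peak learning rate, then linear decay to zero. Whether `src.train_mccws` uses exactly this shape is an assumption; this README only lists the flags. `lr_at` is a hypothetical helper.

```python
def lr_at(
    step: int,
    base_lr: float = 2e-5,
    warmup_step: int = 50000,
    total_step: int = 200000,
) -> float:
    """Learning rate at a given step under linear warmup and linear decay.

    Assumption: this schedule shape is illustrative and is not confirmed
    by the repository.
    """
    if step < warmup_step:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / warmup_step
    # Decay linearly from base_lr down to 0 at total_step.
    remaining = max(0, total_step - step)
    return base_lr * remaining / (total_step - warmup_step)


if __name__ == "__main__":
    for s in (0, 25000, 50000, 125000, 200000):
        print(s, lr_at(s))
```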

# After training, run inference on the dev sets.
python -m src.infer_mccws \
  --batch_size 512 \
  --exp_name my_dev_infer_exp \
  --first_ckpt 100000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --model_exp_name my_model_exp \
  --seed 42 \
  --split dev \
  --use_unc 0
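
The `--first_ckpt` and `--last_ckpt` flags select a range of saved checkpoints. Assuming checkpoints are saved every `--ckpt_step` steps and that both bounds are inclusive (neither is stated in this README), the checkpoints covered can be listed as in this sketch. `checkpoint_steps` is a hypothetical helper.

```python
from typing import List


def checkpoint_steps(first_ckpt: int, last_ckpt: int, ckpt_step: int) -> List[int]:
    """List checkpoint steps from first_ckpt to last_ckpt, both inclusive.

    Assumption: checkpoints exist at every multiple of ckpt_step in the range.
    """
    return list(range(first_ckpt, last_ckpt + 1, ckpt_step))


if __name__ == "__main__":
    steps = checkpoint_steps(100000, 200000, 5000)
    print(len(steps), steps[0], steps[-1])
```

With the values used in the commands here (`--ckpt_step 5000`, `--first_ckpt 100000`, `--last_ckpt 200000`), this would cover 21 checkpoints.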

# Run dev F1 evaluation and find the checkpoint with the highest dev F1.
python -m src.eval_mccws_f1 \
  --exp_name my_dev_infer_exp \
  --first_ckpt 100000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --split dev \
  --use_unc 0
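
For readers unfamiliar with the metric, the sketch below shows the usual word-level F1 for segmentation: a predicted word counts as correct only when its character span matches a gold word exactly. It also shows how the best checkpoint could be picked from per-checkpoint scores. Both helpers are hypothetical illustrations; the actual logic and output format of `src.eval_mccws_f1` are not described in this README.

```python
from typing import Dict, List, Set, Tuple


def to_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Convert a segmented sentence into a set of (start, end) character spans."""
    spans: Set[Tuple[int, int]] = set()
    start = 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def word_f1(gold: List[str], pred: List[str]) -> float:
    """Word-level F1 between a gold and a predicted segmentation of one sentence."""
    gold_spans, pred_spans = to_spans(gold), to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


def best_checkpoint(f1_by_step: Dict[int, float]) -> int:
    """Return the checkpoint step with the highest F1."""
    return max(f1_by_step, key=lambda step: f1_by_step[step])


if __name__ == "__main__":
    # Toy sentence: the prediction wrongly splits the second word.
    print(word_f1(["我", "喜欢", "北京"], ["我", "喜", "欢", "北京"]))
```

In practice the counts would be accumulated over a whole dataset before computing precision and recall, rather than averaged per sentence.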

# Use the checkpoint with the highest dev F1 to run inference on the test sets.
python -m src.infer_mccws \
  --batch_size 512 \
  --exp_name my_test_infer_exp \
  --first_ckpt 175000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --model_exp_name my_model_exp \
  --seed 42 \
  --split test \
  --use_unc 0

# Run test F1 evaluation.
python -m src.eval_mccws_f1 \
  --exp_name my_test_infer_exp \
  --first_ckpt 175000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --split test \
  --use_unc 0

# Run inference on the test sets with UNC.
python -m src.infer_mccws \
  --batch_size 512 \
  --exp_name my_test_unc_infer_exp \
  --first_ckpt 175000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --model_exp_name my_model_exp \
  --seed 42 \
  --split test \
  --use_unc 1

# Run test F1 evaluation with UNC.
python -m src.eval_mccws_f1 \
  --exp_name my_test_unc_infer_exp \
  --first_ckpt 175000 \
  --gpu 0 \
  --last_ckpt 200000 \
  --split test \
  --use_unc 1
