
Code for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation"

Home Page: https://arxiv.org/abs/2201.08702

License: MIT License

Topics: contrastive-learning, text-classification, transformers, bert, deep-learning, neural-networks, natural-language-processing

dual-contrastive-learning's Introduction

Dual-Contrastive-Learning


A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation".

You can download the paper via arXiv: https://arxiv.org/abs/2201.08702 (also indexed on PapersWithCode).

One-Sentence Summary

This paper proposes a novel contrastive learning framework for supervised classification tasks by simultaneously learning the features of input samples and the parameters of classifiers in the same space.

(Figure: overview of the DualCL method.)

Abstract

Contrastive learning has achieved remarkable success in representation learning via self-supervision in unsupervised settings. However, effectively adapting contrastive learning to supervised learning tasks remains a challenge in practice. In this work, we introduce a dual contrastive learning (DualCL) framework that simultaneously learns the features of input samples and the parameters of classifiers in the same space. Specifically, DualCL regards the classifier parameters as augmented samples associated with different labels and then exploits contrastive learning between the input samples and these augmented samples. Empirical studies on five benchmark text classification datasets and their low-resource versions demonstrate improvements in classification accuracy and confirm DualCL's capability of learning discriminative representations.
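To make the idea concrete, below is a minimal sketch of the dual objective in PyTorch. It assumes each sample yields a feature z and a matrix of label-aware features theta (one row per class), and instantiates both contrastive terms with a standard InfoNCE form; the tensor names, shapes, and exact positive/negative construction are assumptions for illustration, not the repository's actual code.

import torch
import torch.nn.functional as F

def info_nce(sim, pos_mask):
    # InfoNCE with possibly multiple positives per anchor; anchors that
    # have no positive in the batch are skipped.
    B = sim.size(0)
    diag = torch.eye(B, dtype=torch.bool, device=sim.device)
    logits = sim.masked_fill(diag, float('-inf'))             # drop self-pairs
    log_prob = logits - logits.logsumexp(dim=1, keepdim=True)
    pos_mask = pos_mask & ~diag
    has_pos = pos_mask.any(dim=1)
    pos_sum = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(pos_sum[has_pos] / pos_mask.sum(dim=1)[has_pos]).mean()

def dual_contrastive_loss(z, theta, labels, tau=0.1):
    # z: (B, D) sample features; theta: (B, K, D) label-aware features;
    # labels: (B,) ground-truth class indices.
    B = z.size(0)
    z = F.normalize(z, dim=-1)
    # theta_j^{y_j}: each sample's feature row for its own true class
    theta_y = F.normalize(theta[torch.arange(B), labels], dim=-1)
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)          # same-label pairs
    loss_z = info_nce(z @ theta_y.t() / tau, pos)             # anchor: z_i
    loss_theta = info_nce(theta_y @ z.t() / tau, pos)         # anchor: theta_i^{y_i}
    return loss_z + loss_theta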

Requirements

  • Python = 3.7
  • torch = 1.11.0
  • numpy = 1.17.2
  • transformers = 4.19.2

Preparation

Clone

git clone https://github.com/hiyouga/Dual-Contrastive-Learning.git
cd Dual-Contrastive-Learning

Create an anaconda environment:

conda create -n dualcl python=3.7
conda activate dualcl
pip install -r requirements.txt

Usage

python main.py --method dualcl

Citation

If this work is helpful, please cite as:

@article{chen2022dual,
  title={Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation},
  author={Qianben Chen and Richong Zhang and Yaowei Zheng and Yongyi Mao},
  journal={arXiv preprint arXiv:2201.08702},
  year={2022}
}

Contact

hiyouga [AT] buaa [DOT] edu [DOT] cn

License

MIT

dual-contrastive-learning's People

Contributors

chenqianben, hiyouga


dual-contrastive-learning's Issues

Some questions with baselines

Your work is very good and effective, but I have some questions about the baseline approaches. I tried different hyperparameters to fine-tune BERT with supervised contrastive learning or unsupervised contrastive learning and then classify, but I have never been able to do better than plain cross-entropy. I wonder what I failed to take into account? Many papers report that contrastive learning helps classification, yet here I always get the opposite. Could you share the hyperparameters you used when running the comparison?
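Not the authors' answer, but one thing often worth checking: in much of the literature the contrastive term supplements the cross-entropy head rather than replacing it. A hedged sketch of that combination (the 0.1 weight and all names are assumptions, not the authors' settings):

import torch.nn.functional as F

def combined_loss(logits, features, labels, supcon_loss, lam=0.1):
    # Keep the CE head and add a weighted contrastive term; supcon_loss is
    # any supervised contrastive loss, lam is a tunable weight.
    return F.cross_entropy(logits, labels) + lam * supcon_loss(features, labels)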

Some logical problems

Using the contrastive loss computation in the source code, the calculated loss may be negative. My inputs have shapes (batch_size × dim), (batch_size × class_num × dim), and (class_num,), and L_z and L_θ may both be negative at the same time.
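For reference, each per-anchor InfoNCE term is minus a log-softmax probability and therefore non-negative by construction; a quick self-contained check of that property:

import torch
import torch.nn.functional as F

sim = torch.randn(8, 8) / 0.1                    # arbitrary similarity logits
log_prob = F.log_softmax(sim, dim=1)             # log-probabilities are <= 0
pos = torch.randint(0, 8, (8,))                  # arbitrary positive indices
terms = -log_prob[torch.arange(8), pos]          # per-anchor InfoNCE terms
assert (terms >= 0).all()                        # so each term is >= 0

If L_z and L_θ both come out negative, the way positives are masked and averaged in the local copy of the loss is worth inspecting first.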

Why shuffle the order of labels when using DualCL?

It's great work, but I have a question: why does DualCL deliberately shuffle the order of the labels? This operation does not change the true label in binary classification, but it does change it in multi-class classification. I don't understand the significance of this.

In fact, I followed this setup and trained on my own dataset, a binary classification task like dialogue intention recognition, for 30 epochs using RoBERTa, with very poor results. Is DualCL not suitable for this kind of task? I hope you can point out my misunderstanding.
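One reading of the shuffle is that the order of the label tokens is permuted while the target index is remapped consistently, so the ground-truth label never actually changes; a sketch of that invariant (names are hypothetical, not the repo's API):

import torch

def shuffle_label_tokens(label_token_ids, target):
    # Permute the order in which the K label tokens are prepended to the
    # input, and remap the target index so the true label is preserved.
    perm = torch.randperm(len(label_token_ids))
    shuffled = [label_token_ids[i] for i in perm]
    new_target = (perm == target).nonzero(as_tuple=True)[0].item()
    return shuffled, new_target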

Issue regarding the evaluation procedure

Hi, thank you for your exciting work. I've noticed a potential problem with the evaluation procedure. To the best of my knowledge, the best model is currently selected based on the test data. This is not desirable, since in real conditions it is not possible to choose the model based on the test data. Beyond making the reported numbers hard to compare, this invites overfitting: although the test data is not used for gradient updates, the model is chosen by its test performance, so we have no way of knowing whether the proposed model is simply better at leaking information through model selection. As an extreme case, if you randomly guess enough times on the test set, you can reach 100%. That is generally why prior works use a validation split.
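For contrast, the standard protocol selects the checkpoint on a held-out validation split and touches the test set exactly once; a minimal sketch (model, loaders, and the training/evaluation helpers are hypothetical names, not the repo's functions):

import copy

best_val_acc, best_state = 0.0, None
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)
    val_acc = evaluate(model, val_loader)    # held-out split, never the test set
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        best_state = copy.deepcopy(model.state_dict())

model.load_state_dict(best_state)
test_acc = evaluate(model, test_loader)      # evaluated once, at the very end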

About Chinese datasets

Hello author,
if the model runs on a Chinese dataset, what parts need to be modified and what should be paid attention to?
Thank you!
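A minimal starting point would be swapping the backbone and tokenizer for Chinese ones and making sure the label words fed to the model are Chinese tokens; the checkpoint name below is one example, not the repository's configuration:

from transformers import AutoModel, AutoTokenizer

# Replace the English backbone with a Chinese checkpoint; the label words
# prepended to the input must be re-specified in Chinese as well.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
backbone = AutoModel.from_pretrained("bert-base-chinese")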

t-SNE plot visualization

Hi there!
I think this code and the paper are awesome! When I run the code, I can see the accuracy increasing.

But I also want to see how the class representations and the sentence feature representations move. Could you please upload the t-SNE visualization code to GitHub as well?

Have a good day.
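Until that lands, a generic scikit-learn recipe is enough to eyeball the space; features and labels stand for whatever arrays you extract from the model (placeholder names, not the repo's API):

import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Project (N, D) features to 2-D and color the points by class label.
emb2d = TSNE(n_components=2, perplexity=30).fit_transform(features)
plt.scatter(emb2d[:, 0], emb2d[:, 1], c=labels, s=5, cmap="tab10")
plt.savefig("tsne.png")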

Problem when saving the model!

Hi, thank you for your exciting work.
When I try to save the model, I get this error:

'Transformer' object has no attribute 'save_pretrained'

How can I save the model after training it with your code? In fact, I want to save the model and upload it to Hugging Face so that I can load and use it later.
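The training wrapper is a plain torch module rather than a Hugging Face model, so one option is saving its state dict directly; if the Hugging Face backbone is exposed as an attribute, it can also be exported in Hub format (the attribute name below is an assumption about the wrapper's internals):

import torch

# Option 1: save the whole wrapper's weights as a plain checkpoint.
torch.save(model.state_dict(), "dualcl.pt")

# Option 2: export only the Hugging Face backbone for upload to the Hub
# ("encoder" is a hypothetical attribute name).
model.encoder.save_pretrained("dualcl-bert")
tokenizer.save_pretrained("dualcl-bert")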

Why did the dual loss gradient collapse on my own Chinese dataset?

Dear author, your framework works on English datasets, but when I used the dual contrastive loss on my own Chinese dataset, gradient collapse occurred. My Chinese labels are two characters each; could this be the cause? Or do I need to adjust something else? Thank you very much; I look forward to hearing from you soon.
