Giter Site home page Giter Site logo

ennengyang / adatask Goto Github PK

View Code? Open in Web Editor NEW
19.0 1.0 1.0 15 KB

AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning. AAAI, 2023.

Home Page: https://arxiv.org/abs/2211.15055

Python 100.00%
multi-task-learning multi-objective-optimization negative-transfer adaptive-learning-rate

adatask's Introduction

AdaTask

AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning. AAAI, 2023.

Abstract

Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different tasks on each parameter still remains unclear. In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter. Specifically, we compute the total updates by the exponentially decaying Average of the squared Updates (AU) on a parameter from the corresponding task. Based on this novel metric, we observe that many parameters in existing MTL methods, especially those in the higher shared layers, are still dominated by one or several tasks. The dominance of AU is mainly due to the dominance of accumulative gradients from one or several tasks. Motivated by this, we propose a Task-wise Adaptive learning rate approach, AdaTask in short, to separate the accumulative gradients and hence the learning rate of each task for each parameter in adaptive learning rate approaches (e.g., AdaGrad, RMSProp, and Adam). Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting SOTA average task-wise performance. Analysis on both synthetic and real-world datasets shows AdaTask balance parameters in every shared layer well.

Citation

If you find our paper or this resource helpful, please consider cite:

@article{AdaTask_AAAI2023,
  title={AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning},
  author={{Yang, Enneng and Pan, Junwei and Wang, Ximei and Yu, Haibin and Shen, Li and Chen, Xihua and Xiao, Lei and Jiang, Jie and Guo, Guibing},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={37},
  number={9},
  pages={10745-10753},
  year={2023}
}

DataSet

  • Download CityScapes dataset and put it in the dataset directory.

Train and Evaluate Method

  python3  main_cityscapes.py --method=adam
  python3  main_cityscapes.py --method=adam_with_adatask

adatask's People

Contributors

ennengyang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Forkers

zhanglangjd

adatask's Issues

How to Define Shared and Task-Specific Parameters in the PLE Framework

Thank you for sharing your very meaningful work and sharing your codes on this repository. I have a question regarding parameter definition about the PLE framework when using AdaTask.

Could you please provide guidance or examples on how to:

  1. Define shared parameters that are used across all tasks (just shared experts?).
  2. Define task-specific parameters that are unique to each task (task-specific experts and towers?).

I really appreciate your help!!!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.