Comments (8)

CPFLAME avatar CPFLAME commented on May 19, 2024

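Graph mode does have this problem. To use a custom LR scheduler you must define _generate_conf_for_graph, which in turn has to call a C++ interface. As a result, users effectively cannot define their own LR scheduler.
However, libai currently focuses mainly on graph mode, so libai users can only use the lr_schedulers built into oneflow.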

from libai.

L1aoXingyu avatar L1aoXingyu commented on May 19, 2024

We can use a try/except mechanism to catch this error, so there is no need to copy the same code again.
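One possible reading of the try/except idea, sketched in plain Python with no OneFlow dependency. Everything here is hypothetical: `BaseScheduler`, `GraphUnsupportedError`, `BuiltinStepLR`, `CustomHalvingLR`, and `build_lr_scheduler` are stand-in names, not libai or oneflow APIs, and the real `_generate_conf_for_graph` signature and the error it raises are not shown in this thread.

```python
class GraphUnsupportedError(NotImplementedError):
    """Hypothetical error: the scheduler has no graph-mode configuration."""


class BaseScheduler:
    """Stand-in for a framework scheduler base class (NOT the oneflow class)."""

    def __init__(self, base_lr):
        self.base_lr = base_lr

    def get_lr(self, step):
        raise NotImplementedError

    def _generate_conf_for_graph(self):
        # Assumption: only built-in schedulers override this hook.
        raise GraphUnsupportedError(
            f"{type(self).__name__} has no graph-mode configuration"
        )


class BuiltinStepLR(BaseScheduler):
    """Stand-in for a built-in scheduler that supports graph mode."""

    def __init__(self, base_lr, step_size, gamma=0.1):
        super().__init__(base_lr)
        self.step_size = step_size
        self.gamma = gamma

    def get_lr(self, step):
        return self.base_lr * self.gamma ** (step // self.step_size)

    def _generate_conf_for_graph(self):
        return {"type": "step", "step_size": self.step_size, "gamma": self.gamma}


class CustomHalvingLR(BaseScheduler):
    """User-defined in pure Python: works in eager mode, no graph hook."""

    def get_lr(self, step):
        return self.base_lr / (2 ** step)


def build_lr_scheduler(scheduler_cls, graph_mode, **kwargs):
    """Build a scheduler once and catch the graph-mode error in one place."""
    scheduler = scheduler_cls(**kwargs)
    if graph_mode:
        try:
            scheduler._generate_conf_for_graph()
        except GraphUnsupportedError as err:
            raise ValueError(
                f"{scheduler_cls.__name__} cannot be used in graph mode; "
                "use a built-in scheduler or run in eager mode"
            ) from err
    return scheduler
```

With this shape, the same builder serves both modes, and an unsupported custom scheduler fails early with a readable message instead of requiring a second copy of the construction code.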

rentainhe avatar rentainhe commented on May 19, 2024

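Graph mode does have this problem. To use a custom LR scheduler you must define _generate_conf_for_graph, which in turn has to call a C++ interface. As a result, users effectively cannot define their own LR scheduler. However, libai currently focuses mainly on graph mode, so libai users can only use the lr_schedulers built into oneflow.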

Sure, let's write a build function to solve this.

rentainhe avatar rentainhe commented on May 19, 2024

We can use a try/except mechanism to catch this error, so there is no need to copy the same code again.

OKKK

dangkai4u avatar dangkai4u commented on May 19, 2024

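First, let's take stock: which schedulers we plan to support, which are supported today (graph + eager), and which are not yet supported. Make a list. If the commonly used ones are all covered, the occasional outlier can be ignored. If some essential scheduler is not supported yet, then let's add it to oneflow.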

rentainhe avatar rentainhe commented on May 19, 2024

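First, let's take stock: which schedulers we plan to support, which are supported today (graph + eager), and which are not yet supported. Make a list. If the commonly used ones are all covered, the occasional outlier can be ignored. If some essential scheduler is not supported yet, then let's add it to oneflow.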

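The current plan is to implement two first, WarmupMultiStepLR and WarmupCosineLR, which cover our present training needs, and to open follow-up PRs for the others. We can do the survey, since some LR schedulers will almost never be used; users can add the uncommon ones themselves.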
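For reference, a minimal sketch of what these two schedules commonly compute, written as plain Python functions of the step. This follows the usual convention (linear warmup, then milestone or cosine decay) and is an assumption, not the libai implementation; the function names and parameters (`warmup_factor`, `milestones`, `gamma`, `min_lr`, and so on) are illustrative.

```python
import math
from bisect import bisect_right


def warmup_factor(step, warmup_steps, warmup_start=0.0):
    """Linear ramp from warmup_start to 1.0 over warmup_steps."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return 1.0
    alpha = step / warmup_steps
    return warmup_start * (1.0 - alpha) + alpha


def warmup_multistep_lr(step, base_lr, milestones, gamma=0.1, warmup_steps=0):
    """Multiply the LR by gamma at each milestone, with linear warmup first."""
    # Number of milestones already reached at this step.
    num_decays = bisect_right(sorted(milestones), step)
    return base_lr * warmup_factor(step, warmup_steps) * gamma ** num_decays


def warmup_cosine_lr(step, base_lr, max_steps, warmup_steps=0, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return base_lr * warmup_factor(step, warmup_steps)
    decay_steps = max(1, max_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For example, `warmup_multistep_lr(step, 1.0, [10, 20], warmup_steps=5)` ramps linearly over the first 5 steps, stays at the base rate until step 10, and is scaled by `gamma` at steps 10 and 20.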

dangkai4u avatar dangkai4u commented on May 19, 2024

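What is the update rule for WarmupMultiStepLR? These two don't seem like enough. Transformer models are usually trained with an inverse square root scheduler and fine-tuned with a polynomial decay scheduler, and both support warmup. Our models are mainly transformers, so these two should be supported as well.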
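To make the two requested rules concrete, here is a minimal sketch in plain Python. It assumes the common formulation (linear warmup to the base rate, then decay proportional to 1/sqrt(step), or polynomial decay to an end value); the function names and parameters are illustrative and not taken from oneflow or libai.

```python
def inverse_sqrt_lr(step, base_lr, warmup_steps):
    """Linear warmup to base_lr, then decay proportional to 1/sqrt(step)."""
    step = max(step, 1)                  # avoid division by zero at step 0
    warmup_steps = max(warmup_steps, 1)  # treat "no warmup" as a 1-step warmup
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (warmup_steps / step) ** 0.5


def polynomial_decay_lr(step, base_lr, total_steps, warmup_steps=0,
                        end_lr=0.0, power=1.0):
    """Linear warmup to base_lr, then polynomial decay down to end_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return (base_lr - end_lr) * (1.0 - progress) ** power + end_lr
```

With `power=1.0` the polynomial schedule reduces to linear decay, which is the variant often used for fine-tuning.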

rentainhe avatar rentainhe commented on May 19, 2024

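What is the update rule for WarmupMultiStepLR? These two don't seem like enough. Transformer models are usually trained with an inverse square root scheduler and fine-tuned with a polynomial decay scheduler, and both support warmup. Our models are mainly transformers, so these two should be supported as well.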

Sure, these are all easy to support. The inverse square root one will probably need support inside oneflow.
