Question about training with teacher prob (docee, closed)

spico197 avatar spico197 commented on June 9, 2024
Question about training with teacher prob

from docee.

Comments (7)

Spico197 avatar Spico197 commented on June 9, 2024

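This directly follows Doc2EDAG: scheduled sampling is applied to the predictions from the previous step, so that they act as a teacher model guiding the updates of the downstream modules. They also ran an ablation study, and removing scheduled sampling lowers the final metrics.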
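For readers who have not seen scheduled sampling before, here is a minimal sketch of the general idea, not the actual docee or Doc2EDAG code. The function names, the linear decay shape, and the default values (`start_epoch`, `length`, `min_teacher_prob`) are all assumptions made for illustration: early in training the downstream module is fed gold labels, and as `teacher_prob` decays it is fed the model's own predictions more often.

```python
import random


def teacher_prob_at(epoch, start_epoch=10, length=10, min_teacher_prob=0.1):
    """Hypothetical schedule: stay at 1.0 until start_epoch, then decay
    linearly to min_teacher_prob over `length` epochs and stay there."""
    if epoch < start_epoch:
        return 1.0
    progress = min(1.0, (epoch - start_epoch) / length)
    return 1.0 - progress * (1.0 - min_teacher_prob)


def choose_input(gold, predicted, teacher_prob, rng=random):
    """With probability teacher_prob feed the gold label to the next
    module; otherwise feed the model's own prediction."""
    return gold if rng.random() < teacher_prob else predicted


if __name__ == "__main__":
    for epoch in (0, 10, 15, 20, 30):
        p = teacher_prob_at(epoch)
        picked = choose_input("gold", "predicted", p)
        print(f"epoch={epoch} teacher_prob={p:.2f} input={picked}")
```

In this sketch `min_teacher_prob` is the floor that the schedule never goes below, so it controls how much the model still relies on gold labels late in training.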


xxllp avatar xxllp commented on June 9, 2024

Got it, that makes sense. I hadn't come across this technique before.


kelvennn avatar kelvennn commented on June 9, 2024

Regarding the scheduled sampling explanation above: have you tried tuning the minimum value of teacher_prob? How did it turn out? The final evaluation results seem to fluctuate.

Spico197 avatar Spico197 commented on June 9, 2024

I haven't tuned that one. If you find anything, feel free to share it here so we can discuss.

kelvennn avatar kelvennn commented on June 9, 2024

By fluctuation I mean that the sampling probability teacher_prob seems to make the final performance of the trained model unstable. I thought I had made a mistake, so I ran it several more times, and the final scores came out as 79.4, 78.9, 79.1, and 79.3.

kelvennn avatar kelvennn commented on June 9, 2024

Or do the files generated during training need to be cleaned up after each run? Could the wrong results have been read?

xxllp avatar xxllp commented on June 9, 2024

It's normal for each run to differ, since there is some randomness involved.
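As a rough illustration of that point, the sketch below summarizes the four scores reported earlier in this thread and shows that sampling decisions only repeat when the random seed is fixed. The helper names are hypothetical, and the code uses only Python's standard library; a real training run would also need to seed NumPy and PyTorch, which is not shown here.

```python
import random
import statistics


def summarize(scores):
    """Return the mean and sample standard deviation of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def sample_decisions(seed, teacher_prob=0.5, n=10):
    """Hypothetical helper: draw n gold-vs-prediction decisions from a
    generator seeded with `seed`, so the same seed repeats the same draws."""
    rng = random.Random(seed)
    return [rng.random() < teacher_prob for _ in range(n)]


if __name__ == "__main__":
    scores = [79.4, 78.9, 79.1, 79.3]  # scores reported in this thread
    mean, stdev = summarize(scores)
    print(f"mean={mean:.3f} stdev={stdev:.3f}")
    # Same seed gives identical sampling decisions across runs.
    print(sample_decisions(42) == sample_decisions(42))
```

For the scores above the mean is about 79.2 with a sample standard deviation of about 0.22, so a spread of half a point between runs is consistent with ordinary run-to-run noise rather than a bug.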

