Comments (13)

brightmart avatar brightmart commented on May 30, 2024

Hello, shouldn't config_path be changed as well? The configurations differ, so otherwise some of the parameters won't load.

from fewclue.

PineappleWill avatar PineappleWill commented on May 30, 2024

Hello, I did change config_path as well.
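A minimal sketch of how the config_path check discussed above could be automated. It assumes the checkpoint directory name encodes the architecture in the form `L-24_H-1024_A-16` (as in the model name mentioned later in this thread) and that the config file is a BERT-style JSON with the keys `num_hidden_layers`, `hidden_size` and `num_attention_heads`. The helper names are hypothetical and not part of FewCLUE.

```python
import json
import re


def parse_model_dir_name(name):
    """Extract (layers, hidden_size, heads) from a name like '..._L-24_H-1024_A-16'."""
    match = re.search(r"L-(\d+)_H-(\d+)_A-(\d+)", name)
    if match is None:
        raise ValueError("no L-*_H-*_A-* pattern found in %r" % name)
    return tuple(int(group) for group in match.groups())


def config_mismatches(config, model_dir_name):
    """Return {key: (value_in_config, value_from_name)} for every key that disagrees."""
    layers, hidden, heads = parse_model_dir_name(model_dir_name)
    expected = {
        "num_hidden_layers": layers,
        "hidden_size": hidden,
        "num_attention_heads": heads,
    }
    return {
        key: (config.get(key), value)
        for key, value in expected.items()
        if config.get(key) != value
    }


def check_config_file(config_path, model_dir_name):
    """Load a BERT-style JSON config (assumed format) and compare it with the name."""
    with open(config_path, encoding="utf-8") as handle:
        return config_mismatches(json.load(handle), model_dir_name)
```

An empty result means the config agrees with the directory name. A non-empty result (for example a base config paired with a large checkpoint) lists the fields that would stop the weights from loading correctly.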

wellinxu avatar wellinxu commented on May 30, 2024

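@PineappleWill
Hello. We had not tested P-tuning with the large model before, so I just ran the large model on dataset 0. The first epoch does look random, but the results keep improving afterwards, as shown below. If your later epochs are also random, check whether the model download is intact (you could try the model on another classification task), whether the code has been changed in any other way, or whether something else unusual is going on.
image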

PineappleWill avatar PineappleWill commented on May 30, 2024

Thank you for your reply.

I switched to testing on the CSLDCP dataset and the results are still abnormal. I have also verified that the pretrained model itself is fine.

image

Could you share your test environment, including the CUDA and TF versions?

wellinxu avatar wellinxu commented on May 30, 2024

@PineappleWill CUDA 10.1. For the rest of the environment, see https://github.com/CLUEbenchmark/FewCLUE/blob/main/baselines/models_keras/ptuning_origin/README.md

Also, could you post your training results on the TNEWS task?

PineappleWill avatar PineappleWill commented on May 30, 2024

OK. My CUDA version is 10.2; the rest of the environment is identical, and I used the default hyperparameters in ptuning_tnews.py.
These are my results on TNEWS dataset 0, which are far from the ones you posted.
image

PineappleWill avatar PineappleWill commented on May 30, 2024

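@wellinxu

Below are my results with the base model. It gets close to the numbers in the paper within the first few epochs, so I suspect the pretrained model is the problem.
image

The model I am using is chinese_roberta_wwm_large_ext_L-24_H-1024_A-16, downloaded from https://github.com/ymcui/Chinese-BERT-wwm. I have tried re-downloading it and using it elsewhere, with no problems.
image

Are you using the same large model as I am? If it is convenient, could you send the model to my email [email protected] so I can try it? Thanks.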

wellinxu avatar wellinxu commented on May 30, 2024

@PineappleWill
You can download the model here.
Link: https://pan.baidu.com/s/1m9SDYBhS1KHRqD8SqqaMoA Password: e7j1
By the way, the file you should be running is https://github.com/CLUEbenchmark/FewCLUE/blob/main/baselines/models_keras/ptuning_origin/ptuning_tnews_old.py,
but earlier you said you were running ptuning_tnews.py.

PineappleWill avatar PineappleWill commented on May 30, 2024

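Hello, the file I am running is
https://github.com/CLUEbenchmark/FewCLUE/tree/main/baselines/models_keras/ptuning/ptuning_tnews.py
Also, the results are still very poor with your model... I also tried the IFLYTEK and CSLDCP datasets and got 1.15% and 2.02% respectively. Tuning the hyperparameters raises the accuracy by about 1% at most...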
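As a rough illustration of why accuracies such as 1.15% and 2.02% read as "random", they can be compared with the uniform-guess baseline of 1 / (number of classes). The sketch below is not from the FewCLUE code. The class counts used in the usage note (119 for IFLYTEK, 67 for CSLDCP) are assumptions taken from the task descriptions and should be checked against the data, and the `tolerance` factor is an arbitrary illustrative threshold.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over num_classes labels."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    return 1.0 / num_classes


def looks_random(accuracy, num_classes, tolerance=2.0):
    """Heuristic: True if accuracy is at most `tolerance` times the chance level.

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    return accuracy <= tolerance * chance_accuracy(num_classes)
```

Under the assumed class counts, `looks_random(0.0115, 119)` and `looks_random(0.0202, 67)` both return `True`, which is consistent with a model that has not learned anything yet.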

wellinxu avatar wellinxu commented on May 30, 2024

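@PineappleWill Run this file:
https://github.com/CLUEbenchmark/FewCLUE/blob/main/baselines/models_keras/ptuning_origin/ptuning_tnews_old.py

The code in the ptuning/ folder follows Su Jianlin's implementation and differs somewhat from the method in the paper. The code in the ptuning_origin/ folder is our own reproduction of the paper's method and stays closer to it.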

PineappleWill avatar PineappleWill commented on May 30, 2024

Problem solved, thank you very much T_T

caiyuchen-ustc avatar caiyuchen-ustc commented on May 30, 2024

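Hello, I am a first-year graduate student at the School of Computer Science, ** University of Science and Technology. I recently ran into a similar problem when running a dataset with chinese-roberta-wwm-ext-large: after 10 epochs the accuracy still would not go up. Could you tell me how you solved it?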

PineappleWill avatar PineappleWill commented on May 30, 2024
