Comments (8)

zezhishao commented on May 26, 2024

Sorry, I haven't run into this problem. Have you made any changes to the code?

from step.

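stanli124 commented on May 26, 2024

[image]
[image]
I only changed the two places shown in the images. The first was a function argument that was being passed under the wrong name; the second was the GPU count, which has to match the number of GPUs on the server, otherwise an error is raised.
Those are the only two changes; nothing else was modified.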

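zezhishao commented on May 26, 2024

In principle that should not cause any problem. I can't reproduce the issue, so it is hard for me to offer concrete help. I suggest checking two things:
1 Reinstall the dependencies strictly following the STEP readme, preferably in a fresh conda environment.
2 Debug the run to see where the program terminates.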
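
One way to approach the second suggestion when a run appears stuck is to have Python print its own stack periodically. The following is a minimal sketch using the standard-library `faulthandler` module; the helper names and the timeout value are illustrative assumptions, not part of STEP or easytorch.

```python
import faulthandler
import sys


def enable_hang_diagnostics(timeout_s: float = 600.0, stream=sys.stderr) -> None:
    """Dump the stack of every thread to `stream` every `timeout_s` seconds.

    `stream` must be a real file (it needs a file descriptor) and must stay
    open for as long as the diagnostics are enabled.
    """
    # Also report fatal errors such as segmentation faults.
    faulthandler.enable(file=stream)
    # Repeated dumps show whether the program keeps sitting on the same line.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disable_hang_diagnostics() -> None:
    """Stop the periodic dumps and the fatal-error handler."""
    faulthandler.cancel_dump_traceback_later()
    faulthandler.disable()
```

Calling `enable_hang_diagnostics()` at the top of the training entry script would, under these assumptions, show which line each process is sitting on if the run stalls. In a multi-process run every worker would need to call it.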

stanli124 commented on May 26, 2024

Thanks! I'll try again. The program did not terminate; it just hangs at the point shown in the image and never continues.

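stanli124 commented on May 26, 2024

Thanks to the author, I solved it. I reinstalled the environment; beyond that, easy-torch may have a problem with multi-GPU support, because after switching to a single GPU training runs normally.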
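
For readers who want to reproduce the single-GPU workaround, one generic option is to hide all but one device from CUDA before any CUDA library is loaded. This is only a sketch: the helper name is hypothetical, and it does not replace the project's own GPU-count setting, which (as noted earlier in the thread) still has to match the number of visible GPUs.

```python
import os


def restrict_to_single_gpu(index: int = 0) -> str:
    """Expose only GPU `index` to CUDA libraries that are loaded afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Call this before importing torch (or anything else that initializes CUDA);
# the variable is read when CUDA starts up, so changing it later has no effect.
restrict_to_single_gpu(0)
```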

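zezhishao commented on May 26, 2024

> Thanks! I'll try again. The program did not terminate; it just hangs at the point shown in the image and never continues.

If you are judging whether the program is still running by the GPU memory usage you see in the background, that can be misleading.
In a multi-GPU run, when the program exits unexpectedly, GPUs other than GPU 0 may still hold memory that has not been released, even though the program has in fact stopped.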
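
A more direct check than watching GPU memory is to test whether the process that launched training still exists, and to list which PIDs are holding GPU memory. The sketch below assumes a POSIX system, that `nvidia-smi` is on the PATH, and that you noted the launcher's PID yourself; the function names are illustrative.

```python
import os
import subprocess
from typing import List, Tuple


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only).

    Signal 0 performs the existence check without delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def parse_compute_apps(csv_text: str) -> List[Tuple[int, int]]:
    """Parse 'pid, used_memory' rows into (pid, MiB) tuples."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not all(p.isdigit() for p in parts):
            continue  # skip blank lines and values such as "[N/A]"
        rows.append((int(parts[0]), int(parts[1])))
    return rows


def gpu_processes() -> List[Tuple[int, int]]:
    """List (pid, MiB) for every compute process reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_compute_apps(out)
```

If `pid_is_alive(launcher_pid)` is false while `gpu_processes()` still lists entries, the memory is most likely held by leftover worker processes rather than by a run that is still making progress.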

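zezhishao commented on May 26, 2024

> Thanks to the author, I solved it. I reinstalled the environment; beyond that, easy-torch may have a problem with multi-GPU support, because after switching to a single GPU training runs normally.

easytorch's multi-GPU support is decent. I normally train on multiple GPUs myself, and I have tested it on many multi-GPU machines.

I would still suggest debugging to find out where the problem lies. Training STEP on multiple GPUs is preferable, since it is faster~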

stanli124 commented on May 26, 2024

Thanks, I'll debug it later ╰(°▽°)╯
