Comments (3)
Got it, we will try to reproduce the problem.
from libai.
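Hello, when you say "multi-node training failed", did you stop the program manually with CTRL + C, or did it fail because the code raised an exception?
I ran the 2-node BERT demo from https://libai.readthedocs.io/en/latest/tutorials/get_started/quick_run.html on my side. After pressing CTRL + C, once the master (node0) exited, the program on node1 terminated normally.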
It failed because the code raised an exception.
Related Issues (20)
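- Multi-node training error HOT 13
- Question about the benchmark experiment results HOT 2
- [Bug] libai test error: File exists: './data_test/bert_data' HOT 3
- The WeChat group is full HOT 3
- CI test no longer works
- Pure tensor-parallel training uses different collective communication operators on 4 GPUs and 8 GPUs HOT 2
- TypeError: __init__() got an unexpected keyword argument 'flags' HOT 5
- GLM inference error in libai HOT 2
- Difference between MT5 and T5 HOT 4
- [Multi-node multi-GPU][MT5] failed to connect to all addresses HOT 1
- GPT2 pretraining: libai throughput does not match earlier figures HOT 1
- Testing the parallel framework: tensor-parallel results differ from the figures on the official site
- GLM 10B CN inference acceleration latency HOT 1
- Running the tutorial command bash tools/train.sh tools/train_net.py configs/vit_imagenet.py 8 raises an error
- Multi-GPU training error for MAE under Project
- Error when running the GLM example: module 'oneflow._C' has no attribute 'fused_multi_head_attention_inference_v2' HOT 1
- Suggestion: pin a specific version of requests in requirements
- Problem encountered when running gpt2_pretrain.py on a single node with multiple GPUs
- LLaMA-7B SFT died with <Signals.SIGABRT: 6>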