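Thanks for your open-source work. After converting my own model, I can run inference with lightseq successfully, and the speedup is indeed significant. However, the vocab size can only be set to a very small value; once it gets too large, GPU memory runs out. The example model's v…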

PyTorch GPT model runs out of GPU memory (OOM); cannot run with CUDA 11 (lightseq, CLOSED)

bytedance commented on May 20, 2024

PyTorch GPT model runs out of GPU memory (OOM); cannot run with CUDA 11

from lightseq.

Comments (11)

bigprince97 commented on May 20, 2024 2

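The mismatched-results problem is solved. It was not caused by a difference in model structure; it was a transposition issue with the PyTorch parameter matrices.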

bigprince97 commented on May 20, 2024 1

By max_step you mean the maximum length of the position embeddings, right? Changing it to 100 did let me raise the vocab size to 50257. Thanks!

Taka152 commented on May 20, 2024 1

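The mismatched-results problem is solved. It was not caused by a difference in model structure; it was a transposition issue with the PyTorch parameter matrices.

@bigprince97 Hello, I also ran into lightseq predictions that do not match the PyTorch version of gpt2 (https://github.com/yangjianxin1/GPT2-chitchat). After a lot of head-scratching I was lucky enough to find your explanation. Does the PyTorch parameter-matrix transposition issue you mention arise in torch.load()? Could you share some more details? Thanks.

@Majokiki Many PyTorch weights are stored as [out_dim, in_dim], whereas lightseq needs them stored as [in_dim, out_dim].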
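
The layout difference described above can be illustrated with a small sketch. It uses NumPy arrays as stand-ins for model weights, and the helper name `to_in_out_layout` is hypothetical (it is not part of lightseq or PyTorch). The only facts it relies on are that a `torch.nn.Linear` weight has shape [out_dim, in_dim] and that the target layout is [in_dim, out_dim], as stated in this thread. Note that for square matrices (hidden x hidden) a missing transpose raises no shape error, so the output is silently wrong.

```python
import numpy as np


def to_in_out_layout(weight_out_in: np.ndarray) -> list:
    """Transpose an [out_dim, in_dim] matrix to [in_dim, out_dim], then flatten row-major.

    Hypothetical helper for illustration; not a lightseq API.
    """
    return np.ascontiguousarray(weight_out_in.T).reshape(-1).tolist()


rng = np.random.default_rng(0)
hidden = 4
x = rng.standard_normal((2, hidden))              # two input vectors
w_out_in = rng.standard_normal((hidden, hidden))  # same layout as torch.nn.Linear.weight

# What a Linear layer computes (bias omitted): y = x @ W^T
reference = x @ w_out_in.T

# Export with the transpose: a consumer expecting [in_dim, out_dim] computes y = x @ W
w_in_out = np.array(to_in_out_layout(w_out_in)).reshape(hidden, hidden)
converted = x @ w_in_out

# Export without the transpose: shapes still fit because the matrix is square
w_untransposed = np.array(w_out_in.reshape(-1).tolist()).reshape(hidden, hidden)
wrong = x @ w_untransposed

matches_reference = bool(np.allclose(reference, converted))
untransposed_matches = bool(np.allclose(reference, wrong))
print("transposed export matches reference:", matches_reference)
print("untransposed export matches reference:", untransposed_matches)
```

In a real conversion script the same transpose would be applied to each weight stored as [out_dim, in_dim] before it is written to the model file. Which tensors need it depends on the source model and has to be checked per layer.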

Taka152 commented on May 20, 2024

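@bigprince97 Thanks for using lightseq and for successfully applying it to your own model. Here are answers to the two issues you raised:

1. GPU memory runs out when the vocab size is set too large.
Besides vocab size, the max_batch_size and max_step settings also affect GPU memory usage. We suggest reducing these two parameters to leave more room for a larger vocab size.

2. Using lightseq on a 3090.
There is currently no restriction on the specific GPU model; the only requirement is cuda>=10.1, so in theory it should work on a 3090. Could you share more information so we can look into the cause, including the error message, the CUDA version, and so on?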
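
For point 1, a rough back-of-the-envelope sketch shows why lowering max_step or max_batch_size frees room for a larger vocab size. It assumes a single float32 buffer of shape [max_batch_size, max_step, vocab_size]; this is an illustrative assumption, not lightseq's actual allocation formula. The function name `logits_buffer_bytes` and the values max_batch_size=8 and max_step=512 are hypothetical; 50257 and max_step=100 are the figures mentioned in this thread.

```python
def logits_buffer_bytes(max_batch_size: int, max_step: int, vocab_size: int,
                        bytes_per_element: int = 4) -> int:
    """Bytes needed by one hypothetical [batch, step, vocab] buffer (float32 by default)."""
    return max_batch_size * max_step * vocab_size * bytes_per_element


# Illustrative settings only: the same vocab size with a long and a short max_step.
before = logits_buffer_bytes(max_batch_size=8, max_step=512, vocab_size=50257)
after = logits_buffer_bytes(max_batch_size=8, max_step=100, vocab_size=50257)

print(f"max_step=512: {before / 2**20:.0f} MiB")
print(f"max_step=100: {after / 2**20:.0f} MiB")
```

Under this assumption the buffer scales with the product of the three settings, so cutting any one of them leaves proportionally more memory for the others.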

bigprince97 commented on May 20, 2024

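The CUDA version is 11.0.
When creating the container, tensorrt server reports that the 3090 is not supported by this container.

When running the example code, the following error is reported: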

bigprince97 commented on May 20, 2024

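In addition, I converted the parameters of a PyTorch gpt2 into the corresponding model file according to the proto. lightseq runs inference normally, but its predictions differ greatly from those of the PyTorch gpt2. The PyTorch version follows the structure in the gpt2 paper, which may differ slightly from the gpt model structure in lightseq. Could you provide a concrete PyTorch or TF implementation of that model structure?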

Taka152 commented on May 20, 2024

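Glad to see that your problems are solved. The one remaining issue is running on the 3090. The main cause is that the Nvidia Triton inference server image that the build depends on does not support the 3090.

A few days ago we updated the CMake build method, which removes the dependency on the Triton inference server image. You are welcome to try building with the method described in doc/build.md; it should resolve the problem of running on the 3090, or more generally under CUDA 11.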

bigprince97 commented on May 20, 2024

The build failed. I am using a CUDA 11 environment.

Taka152 commented on May 20, 2024

The project contains submodules; try git submodule update --init

Majokiki commented on May 20, 2024

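The mismatched-results problem is solved. It was not caused by a difference in model structure; it was a transposition issue with the PyTorch parameter matrices.

@bigprince97 Hello, I also ran into lightseq predictions that do not match the PyTorch version of gpt2 (https://github.com/yangjianxin1/GPT2-chitchat). After a lot of head-scratching I was lucky enough to find your explanation. Does the PyTorch parameter-matrix transposition issue you mention arise in torch.load()? Could you share some more details? Thanks.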

YINGPENGZH commented on May 20, 2024

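The mismatched-results problem is solved. It was not caused by a difference in model structure; it was a transposition issue with the PyTorch parameter matrices.

@bigprince97 Hello, I also ran into lightseq predictions that do not match the PyTorch version of gpt2 (https://github.com/yangjianxin1/GPT2-chitchat). After a lot of head-scratching I was lucky enough to find your explanation. Does the PyTorch parameter-matrix transposition issue you mention arise in torch.load()? Could you share some more details? Thanks.

@Majokiki Many PyTorch weights are stored as [out_dim, in_dim], whereas lightseq needs them stored as [in_dim, out_dim].

Hello, I also ran into a mismatch between the PyTorch and lightseq gpt2. How exactly did you solve it? Is there something that needs to be done before converting the PyTorch model?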
