
triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 254208, Hardware limit: 101376.

xypjq commented on August 15, 2024

triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 254208, Hardware limit: 101376.
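To put the two numbers in the error in perspective, here is a minimal sketch that parses the message and reports how far the request exceeds the limit. It assumes both values are in bytes, and the helper name `parse_shared_memory_error` is made up for illustration; it is not part of Triton.

```python
import re


def parse_shared_memory_error(message: str) -> tuple[int, int]:
    """Extract (required, limit) from an OutOfResources shared-memory message."""
    match = re.search(r"Required:\s*(\d+),\s*Hardware limit:\s*(\d+)", message)
    if match is None:
        raise ValueError("not a shared-memory OutOfResources message")
    return int(match.group(1)), int(match.group(2))


message = (
    "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, "
    "Required: 254208, Hardware limit: 101376."
)
required, limit = parse_shared_memory_error(message)

# Assumption: both values are byte counts.
print(f"required:  {required / 1024:.2f} KiB")
print(f"limit:     {limit / 1024:.2f} KiB")
print(f"overshoot: {required / limit:.2f}x")
```

Under that assumption the kernel asks for roughly two and a half times the shared memory the device offers per block, so a small change in tile size alone would not be enough to fit.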


Comments (6)

hhhhpaaa commented on August 15, 2024

This is a Triton problem. Please uninstall triton and install triton-nightly. Reference: issues/438 @xypjq @zzzendurance
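Before and after swapping packages, it can help to confirm which Triton distribution is actually installed in the active environment. The sketch below uses only the standard library; the distribution names `triton` and `triton-nightly` are taken from the comment above, and the helper name is hypothetical.

```python
from importlib import metadata


def installed_triton_distributions(names=("triton", "triton-nightly")):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


for name, version in installed_triton_distributions().items():
    print(f"{name}: {version or 'not installed'}")
```

If both distributions show a version, the environment may still be importing the old one, so uninstalling `triton` first, as suggested, is the safer order.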


tridao commented on August 15, 2024

Can you try reducing d_state (e.g. <= 128) and chunk_size (e.g. try 128)?
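A minimal sketch of what that suggestion could look like when constructing a block. It assumes `Mamba2` from `mamba_ssm` accepts `d_model`, `d_state`, and `chunk_size` as keyword arguments; the helper names and the concrete values (`d_model=256`, `d_state=64`) are illustrative only and do not come from this thread.

```python
def reduced_mamba2_kwargs(d_model: int, d_state: int = 64, chunk_size: int = 128) -> dict:
    """Build keyword arguments that follow the suggestion: d_state <= 128, chunk_size 128."""
    if d_state > 128:
        raise ValueError("the suggestion in this thread is d_state <= 128")
    return {"d_model": d_model, "d_state": d_state, "chunk_size": chunk_size}


def build_block(d_model: int):
    """Instantiate a Mamba2 block (assumes mamba_ssm is installed and a CUDA GPU is present)."""
    from mamba_ssm import Mamba2  # imported lazily so the helper above works without a GPU

    return Mamba2(**reduced_mamba2_kwargs(d_model))


print(reduced_mamba2_kwargs(256))
```

Keeping the arguments in one helper makes it easy to try a few `d_state` and `chunk_size` combinations and see whether the reported requirement changes.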


xypjq commented on August 15, 2024

> Can you try reducing d_state (e.g. <= 128) and chunk_size (e.g. try 128)?

Thank you very much for your answer. These are my Mamba2 block parameters. I have reduced d_state and chunk_size, but the reported requirement has not changed: it is still 254208. The error does not occur in the forward pass, only in the backward pass, and it does not occur at all if I use a Mamba1 network.


zzzendurance commented on August 15, 2024

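> Can you try reducing d_state (e.g. <= 128) and chunk_size (e.g. try 128)?

> Thank you very much for your answer. These are my Mamba2 block parameters. I have reduced d_state and chunk_size, but the reported requirement has not changed: it is still 254208. The error does not occur in the forward pass, only in the backward pass, and it does not occur at all if I use a Mamba1 network.

I have run into the same problem. The error occurred when I tried to use the Mamba2Simple module. Have you found a solution?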


zzzendurance commented on August 15, 2024

> This is a Triton problem. Please uninstall triton and install triton-nightly. Reference: issues/438 @xypjq @zzzendurance

Thank you very much, hahaha. I only found your reply after leaving my message here. I have solved the problem, thank you again.


Anri-Lombard commented on August 15, 2024

> This is a Triton problem. Please uninstall triton and install triton-nightly. Reference: issues/438 @xypjq @zzzendurance

Thank you!

