RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time. (sam, closed)

davda54 commented on August 21, 2024

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

from sam.

Comments (6)

davda54 commented on August 21, 2024 3

Yeah, exactly: you shouldn't use the same loss variable for both forward passes. In your code, loss is the sum of the loss_t values from both cycles, so autograd fails to backpropagate "through the graph a second time" :)

AntBlo commented on August 21, 2024 3

I made the mistake of doing:

loss = criterion(outputs, labels)
loss.backward()
optimizer.first_step(zero_grad=True)

criterion(outputs, labels).backward()
optimizer.second_step(zero_grad=True)

But yeah, just make sure you don't reuse model outputs. If you follow davda54's direction and use criterion(model(inputs), labels) on the second backward pass, you should be good. Just as a heads-up if anyone else stumbles on it as I did.
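The difference between reusing outputs and recomputing them can be reproduced without SAM at all. The sketch below is only an illustration in plain PyTorch: the tiny model, the random data, and the names reuse_failed and fresh_ok are assumptions made for the example, and the optimizer calls from the snippet above are left out.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy model and data, only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.Tanh(), torch.nn.Linear(8, 2)
)
criterion = torch.nn.CrossEntropyLoss()
inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))

# First forward-backward pass: the graph behind `outputs` is freed afterwards.
outputs = model(inputs)
criterion(outputs, labels).backward()

# Mistake: a new loss built on the OLD outputs still needs the freed graph.
try:
    criterion(outputs, labels).backward()
    reuse_failed = False
except RuntimeError:
    reuse_failed = True

# Fix: run the model again so the second backward has its own graph.
criterion(model(inputs), labels).backward()
fresh_ok = True

print("reusing outputs failed:", reuse_failed)
print("fresh forward pass worked:", fresh_ok)
```

Calling criterion again is not enough on its own; what matters is that model(inputs) is evaluated a second time.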

Thank you davda54 for implementing this, by the way! I hope I'm not being rude in asking, but are we allowed to use this code in Kaggle contests? I saw there wasn't any license for the code, so I'm just curious.

jeongHwarr commented on August 21, 2024 1

I had a similar problem.
Here is my code that made the mistake.

output, logit_map = model(image)

loss = 0
for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()

optimizer.zero_grad()
loss.backward()
optimizer.first_step(zero_grad=True)
output, logit_map = model(image)

for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()
loss.backward()
optimizer.second_step(zero_grad=True)

I think it is probably due to this issue:

The loss variable for the first backward pass and the loss variable for the second backward pass must be different.
I forgot to reset loss to 0 before the second loop.

Also check whether you forgot to run the model again to get fresh predictions.
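One way to avoid carrying the old loss into the second pass is to wrap the forward pass and the per-task accumulation in a function, so every call starts from a fresh loss. This is a rough sketch, not the code from the post: multi_task_loss is a hypothetical helper, mse_loss stands in for get_loss, and the model and data are made up. The SAM calls are shown only as comments.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins for the model, data and task count in the post.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.Tanh(), torch.nn.Linear(8, 3)
)
image = torch.randn(5, 4)
target = torch.randn(5, 3)
num_tasks = 3


def multi_task_loss(model, image, target):
    """Run a fresh forward pass and sum the per-task losses into a new tensor."""
    output = model(image)
    loss = 0  # re-initialised on every call, so nothing leaks between passes
    for t in range(num_tasks):
        # mse_loss is an assumed placeholder for the post's get_loss(...)
        loss = loss + torch.nn.functional.mse_loss(output[:, t], target[:, t])
    return loss


first_loss = multi_task_loss(model, image, target)
first_loss.backward()
# optimizer.first_step(zero_grad=True) would go here

second_loss = multi_task_loss(model, image, target)
second_loss.backward()
# optimizer.second_step(zero_grad=True) would go here
```

Because each call builds its own graph, the second backward never touches the graph that the first backward already freed.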

davda54 commented on August 21, 2024 1

Of course, feel free to use it wherever you want :) Thanks for asking, I've added an MIT license to the repo to make it clear.

davda54 commented on August 21, 2024

Hi, could you please provide a minimal working example of the code causing this exception? Are you doing two separate forward-backward passes as illustrated in the README file? This error suggests that you do a single forward pass followed by multiple backward passes.

want2333 commented on August 21, 2024

Hello, is there a connection between these two questions?

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2048, 196]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    for i, data in enumerate(tqdm(trainloader)):
        if set == 'CUB':
            images, labels, _, _ = data
        else:
            images, labels = data
        images, labels = images.cuda(), labels.cuda()

        optimizer.zero_grad()


        proposalN_windows_score, proposalN_windows_logits, indices, \
        window_scores, _, raw_logits, local_logits, _ = model(images, epoch, i, 'train')

        def closure():

            raw_loss = criterion(raw_logits, labels)
            local_loss = criterion(local_logits, labels)
            windowscls_loss = criterion(proposalN_windows_logits,
                                        labels.unsqueeze(1).repeat(1, proposalN).view(-1))
            if epoch < 2:
                total_loss = raw_loss
            else:
                total_loss = raw_loss + local_loss + windowscls_loss
            total_loss.backward()             ###problem line
            return total_loss


        raw_loss = criterion(raw_logits, labels)
        local_loss = criterion(local_logits, labels)
        windowscls_loss = criterion(proposalN_windows_logits,
                           labels.unsqueeze(1).repeat(1, proposalN).view(-1))

        if epoch < 2:
            total_loss = raw_loss
        else:
            total_loss = raw_loss + local_loss + windowscls_loss

        total_loss.backward(retain_graph=True)

        optimizer.step(closure)

    scheduler.step()
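The two errors do look related: in the code above the closure reuses logits from the single forward pass made before optimizer.step(closure), and retain_graph=True only keeps that old graph alive. The sketch below reproduces the in-place error in plain PyTorch under one assumption, namely that the optimizer changes the weights in place before it calls the closure. perturb_weights is a hypothetical stand-in for that update, and the model and data are invented for the example.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy model and data, only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.Tanh(), torch.nn.Linear(8, 2)
)
criterion = torch.nn.CrossEntropyLoss()
images = torch.randn(6, 4)
labels = torch.randint(0, 2, (6,))


def perturb_weights(model):
    """Assumed stand-in for an optimizer step: changes parameters in place."""
    with torch.no_grad():
        for p in model.parameters():
            p.add_(0.01)


# Single forward pass, kept alive with retain_graph=True as in the post.
stale_logits = model(images)
criterion(stale_logits, labels).backward(retain_graph=True)
perturb_weights(model)


def stale_closure():
    # Reuses logits whose graph still refers to the pre-update weights.
    loss = criterion(stale_logits, labels)
    loss.backward()
    return loss


def fresh_closure():
    # Runs the forward pass again, so the graph matches the current weights.
    loss = criterion(model(images), labels)
    loss.backward()
    return loss


try:
    stale_closure()
    stale_error = None
except RuntimeError as err:
    stale_error = str(err)

fresh_loss = fresh_closure()
print("stale closure raised:", stale_error is not None)
```

Moving the model(...) call inside the closure, as in fresh_closure, should make retain_graph=True unnecessary as well.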
