
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

davda54 commented on August 21, 2024
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

from sam.

Comments (6)

davda54 commented on August 21, 2024 3

Yeah exactly, you shouldn't use the same loss variable for both forward passes. In your code, loss is never reset, so by the second backward() call it is the sum of the loss_t values from both cycles, and autograd fails to backpropagate "through the graph a second time" :)


AntBlo commented on August 21, 2024 3

I made the mistake of doing:

loss = criterion(outputs, labels)
loss.backward()
optimizer.first_step(zero_grad=True)

criterion(outputs, labels).backward()
optimizer.second_step(zero_grad=True)

But yeah, just make sure you don't reuse model outputs. If you follow davda54's direction and use criterion(model(inputs), labels) on the second backward pass, you should be good. Just as a heads-up if anyone else stumbles on it as I did.
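To make the difference concrete, here is a minimal sketch in plain PyTorch, without SAM. The toy model, inputs, and labels are made up for illustration; only the pattern (reusing outputs versus calling model(inputs) again) comes from the thread.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions for illustration, not from the thread).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))

# Mistake: one forward pass, two backward passes through the same graph.
outputs = model(inputs)
criterion(outputs, labels).backward()  # frees the saved tensors of the model's graph
try:
    criterion(outputs, labels).backward()  # outputs still points at the freed graph
    reuse_failed = False
except RuntimeError:
    reuse_failed = True

# Fix: run the model again so the second backward pass gets a fresh graph.
model.zero_grad()
criterion(model(inputs), labels).backward()
# In the SAM setup, optimizer.first_step(zero_grad=True) would go here.
criterion(model(inputs), labels).backward()
# ... followed by optimizer.second_step(zero_grad=True).
fresh_ok = all(p.grad is not None for p in model.parameters())
```

Passing retain_graph=True would silence the error, but with SAM it would presumably not give the intended result, since the second gradient is meant to be computed at the perturbed weights, which requires a new forward pass.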

Thank you davda54 for implementing this, by the way! I hope I'm not being rude in asking, but are we allowed to use this code in Kaggle contests? I saw there wasn't any license for the code, so just curious.


jeongHwarr commented on August 21, 2024 1

I had a similar problem.
Here is my code that made the mistake.

output, logit_map = model(image)

loss = 0
for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()

optimizer.zero_grad()
loss.backward()
optimizer.first_step(zero_grad=True)
output, logit_map = model(image)

for t in range(num_tasks):
    loss_t, acc_t = get_loss(output, target, t, device, cfg)
    loss += loss_t
    loss_sum[t] += loss_t.item()
    acc_sum[t] += acc_t.item()
loss.backward()
optimizer.second_step(zero_grad=True)

I think the cause is the following:

The loss variable for the first backward pass and the one for the second backward pass must be different. I forgot to reset loss to 0 before the second loop.

Also check whether you forgot to run the model forward again before the second backward pass.
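One way to avoid both mistakes is to put the forward pass and the loss accumulation into a helper that always starts from zero. The sketch below is only an illustration: the toy model, the get_loss stand-in, and the helper name forward_and_sum_losses are hypothetical, and the SAM calls appear only as comments so the snippet runs with plain PyTorch.

```python
import torch
from torch import nn

torch.manual_seed(0)

num_tasks = 3
model = nn.Linear(4, num_tasks)      # toy stand-in for the real model
image = torch.randn(8, 4)            # toy inputs
target = torch.rand(8, num_tasks)    # toy targets, one column per task


def get_loss(output, target, t):
    """Hypothetical stand-in for the thread's get_loss: loss for task t."""
    return nn.functional.mse_loss(output[:, t], target[:, t])


def forward_and_sum_losses():
    """Run a fresh forward pass and sum the per-task losses, starting from 0."""
    output = model(image)
    loss = 0
    for t in range(num_tasks):
        loss = loss + get_loss(output, target, t)
    return loss


# First forward-backward pass.
first_loss = forward_and_sum_losses()
first_loss.backward()
# With SAM: optimizer.first_step(zero_grad=True)
model.zero_grad()  # plain-PyTorch substitute for zero_grad=True

# Second forward-backward pass: new outputs and a new loss tensor.
second_loss = forward_and_sum_losses()
second_loss.backward()
# With SAM: optimizer.second_step(zero_grad=True)
```

Because each call builds its own loss from scratch, nothing from the first graph leaks into the second backward pass.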


davda54 commented on August 21, 2024 1

Of course, feel free to use it wherever you want :) Thanks for asking, I've added an MIT license to the repo to make it clear.


davda54 commented on August 21, 2024

Hi, could you please provide a minimal working example of the code causing this exception? Are you doing two separate forward-backward passes as illustrated in the README file? This error suggests that you are doing a single forward pass followed by multiple backward passes.


want2333 commented on August 21, 2024

Hello, is there a connection between these two errors?

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2048, 196]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    for i, data in enumerate(tqdm(trainloader)):
        if set == 'CUB':
            images, labels, _, _ = data
        else:
            images, labels = data
        images, labels = images.cuda(), labels.cuda()

        optimizer.zero_grad()

        proposalN_windows_score, proposalN_windows_logits, indices, \
        window_scores, _, raw_logits, local_logits, _ = model(images, epoch, i, 'train')

        def closure():
            raw_loss = criterion(raw_logits, labels)
            local_loss = criterion(local_logits, labels)
            windowscls_loss = criterion(proposalN_windows_logits,
                                        labels.unsqueeze(1).repeat(1, proposalN).view(-1))
            if epoch < 2:
                total_loss = raw_loss
            else:
                total_loss = raw_loss + local_loss + windowscls_loss
            total_loss.backward()             ###problem line
            return total_loss

        raw_loss = criterion(raw_logits, labels)
        local_loss = criterion(local_logits, labels)
        windowscls_loss = criterion(proposalN_windows_logits,
                                    labels.unsqueeze(1).repeat(1, proposalN).view(-1))

        if epoch < 2:
            total_loss = raw_loss
        else:
            total_loss = raw_loss + local_loss + windowscls_loss

        total_loss.backward(retain_graph=True)

        optimizer.step(closure)

    scheduler.step()
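The thread does not answer this question, so the following is only one plausible reading. In the code above, the closure reuses logits from a forward pass done outside it. If optimizer.step(closure) changes the weights in place before calling the closure, as a SAM-style first step does, the retained graph still refers to the old weights and autograd reports the in-place modification. The sketch below reproduces that pattern with plain PyTorch; the toy model and the perturb helper are hypothetical stand-ins, not SAM's API.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions for illustration, not from the thread).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(8, 4)
labels = torch.randint(0, 2, (8,))


def perturb(module, eps=0.01):
    """Hypothetical stand-in for an in-place weight update such as a SAM first step."""
    with torch.no_grad():
        for p in module.parameters():
            p.add_(eps)


# Problem pattern: forward pass outside the closure, weights changed in between.
outputs = model(inputs)
criterion(outputs, labels).backward(retain_graph=True)
perturb(model)
try:
    criterion(outputs, labels).backward()  # old graph, modified weights
    stale_graph_failed = False
except RuntimeError:
    stale_graph_failed = True


# Possible fix: the closure performs its own forward pass.
def closure():
    loss = criterion(model(inputs), labels)
    loss.backward()
    return loss


model.zero_grad()
first_loss = closure()   # gradient at the current weights
perturb(model)
model.zero_grad()
second_loss = closure()  # fresh graph at the perturbed weights, no error
```

Under this reading, the two errors are related: both come from running backward() on a graph built by an earlier forward pass, and both go away when each backward pass has its own forward pass.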
