Should the positions of "x_coor" and "y_coor" be swapped in Lines 466 and 468 of det3d/models/necks/rpn_transformer?

Castiel-Lee commented on June 20, 2024
Should the positions of "x_coor" and "y_coor" be swapped in Lines 466 and 468 of det3d/models/necks/rpn_transformer?

from centerformer.

Comments (7)

edwardzhou130 commented on June 20, 2024

Thanks for reporting this error. I think the position is correct. I generate the index in the 2D matrix in this way:

feat_id = (
    neighbor_coords[:, :, :, 1] * (W // (2**i))
    + neighbor_coords[:, :, :, 0]
)  # pixel id [B, 500, k]

Here neighbor_coords[:, :, :, 1] comes from the y_coord and neighbor_coords[:, :, :, 0] comes from the x_coord.
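A minimal sketch of the indexing described above, written in plain Python rather than PyTorch. The helper name `flat_index` and the 3 x 5 grid are assumptions made for illustration and do not come from centerformer; the per-level scale factor `2**i` is left out.

```python
def flat_index(x, y, width):
    """Row-major pixel id for a point at column x and row y."""
    return y * width + x


# Hypothetical feature map with H = 3 rows and W = 5 columns.
H, W = 3, 5
grid = [[(row, col) for col in range(W)] for row in range(H)]
flat = [cell for row in grid for cell in row]  # flattened row by row

# x = 4 (width axis) and y = 2 (height axis) land on row 2, column 4.
assert flat[flat_index(4, 2, W)] == (2, 4)
```

The y value is the one multiplied by the width, which matches index 1 of neighbor_coords in the snippet above.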

However, the bug is probably this:

neighbor_coords = torch.clamp(
    neighbor_coords, min=0, max=H // (2**i) - 1
)  # prevent out of bound

The coordinates should be clamped separately using their own size rather than just assuming H = W.
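As a rough illustration of clamping each axis with its own size, here is a plain-Python sketch. The helpers `clamp` and `clamp_coords` and the 4 x 8 map are hypothetical and not taken from the repository; the scale factor `2**i` is omitted for brevity.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def clamp_coords(x, y, height, width):
    """Clamp x against the width and y against the height."""
    return clamp(x, 0, width - 1), clamp(y, 0, height - 1)


# Hypothetical non-square map: H = 4 rows, W = 8 columns.
H, W = 4, 8
x, y = 6, 9  # x is valid, y is out of bounds

# Using H for both axes (as if H == W) wrongly shrinks the valid x.
shared = (clamp(x, 0, H - 1), clamp(y, 0, H - 1))
separate = clamp_coords(x, y, H, W)
print(shared)    # (3, 3)
print(separate)  # (6, 3)
```

When H is larger than W, the same shared bound would instead let x exceed W - 1.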

I don't have a machine at hand to test. Can you check if changing this can fix your error?

Castiel-Lee commented on June 20, 2024

Hello,

I notice that the reported error disappears when x_coor and y_coor are clamped within [0, H] and [0, W]. But that would probably overlook the fact that x_coor = order // W and y_coor = order % W. I did a small test:
[image]
According to the picture, the revised version seems to correctly give x_coor = 7 and y_coor = 4.

Could you think this over and recheck the implementation when you are available? Of course, if the "transposing" is intentional or serves some other purpose, I apologize for my misunderstanding and for bothering you.

edwardzhou130 commented on June 20, 2024

Oh, I see. So the x_coor and y_coor here do not mean the row and col indexes of this value in the 2D matrix (like [7,4]). Instead, they mean the indexes along the height dimension (y_coor) and the width dimension (x_coor), which is the common way to describe the coordinates of a pixel in an image.
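The two conventions can be compared with a small plain-Python sketch. The function `decode`, the width W = 10, and the flat index 74 are all hypothetical values chosen for illustration; they are not read from the picture in this thread.

```python
def decode(order, width):
    """Split a row-major flat index into (x, y) image coordinates."""
    y, x = divmod(order, width)  # y = order // width, x = order % width
    return x, y


# Hypothetical values: a map of width 10 and flat index 74.
W = 10
order = 74
x, y = decode(order, W)
print((y, x))  # matrix-style (row, col): (7, 4)
print((x, y))  # image-style (x, y): (4, 7)

# Either naming works as long as the height index is the one scaled by W.
assert y * W + x == order
```

Under these assumptions the same cell reads as [7, 4] in row and col order and as x = 4, y = 7 in image order.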

I will recheck the code just in case there are some other bugs.

Castiel-Lee commented on June 20, 2024

> Oh, I see. So the x_coor and y_coor here do not mean the row and col indexes of this value in the 2D matrix (like [7,4]). Instead, they mean the indexes along the height dimension (y_coor) and the width dimension (x_coor), which is the common way to describe the coordinates of a pixel in an image.
>
> I will recheck the code just in case there are some other bugs.

Since y_coor is the height dimension and x_coor is the width dimension, Line 469 should be torch.stack([y_coor, x_coor], dim=2), right?

edwardzhou130 commented on June 20, 2024

The order of the x_coor and y_coor variables does not matter in this case, as long as you remember the correct dimension that each variable represents. The deformable attention layer also requires storing the reference point position in the same way.

Castiel-Lee commented on June 20, 2024

Oh, I see. I will go through it again and recheck it. Thank you so much for the clarification.

edwardzhou130 commented on June 20, 2024

Great! If you have any further questions or concerns, feel free to reopen the issue.
