Comments (8)

abduallahmohamed commented on July 17, 2024

Hi,

For 1.
We used relative positions because there is no global reference frame shared across the different datasets. Using the deltas is common practice; you can see it in previous work such as Social-LSTM and Social-GAN. It is a kind of normalization of the positions. The adjacency matrix A then holds the similarity between the (relative) positions. I hope this clears up the confusion.
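
A minimal NumPy sketch of the idea, for illustration only: the function names are hypothetical, and the inverse-distance kernel is an assumed choice of similarity, not necessarily what the repository uses. It shows that per-step deltas are unaffected by a shift of the coordinate origin.

```python
import numpy as np

def to_relative(abs_pos):
    """(T, N, 2) absolute positions -> (T, N, 2) per-step displacements.
    The first step has no predecessor, so its delta is set to zero."""
    rel = np.zeros_like(abs_pos)
    rel[1:] = abs_pos[1:] - abs_pos[:-1]
    return rel

def inverse_distance_adjacency(rel_t):
    """(N, 2) relative positions at one time step -> (N, N) similarity matrix.
    Assumed kernel: 1 / L2 distance, and 0 where the distance is 0."""
    diff = rel_t[:, None, :] - rel_t[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = np.zeros_like(dist)
    nonzero = dist > 0
    adj[nonzero] = 1.0 / dist[nonzero]
    return adj

# Two pedestrians over three time steps, in some dataset-specific frame.
abs_pos = np.array([[[0.0, 0.0], [10.0, 10.0]],
                    [[1.0, 0.0], [10.0, 12.0]],
                    [[2.0, 0.0], [10.0, 14.0]]])
rel = to_relative(abs_pos)
shifted_rel = to_relative(abs_pos + 100.0)  # same scene, different origin
A = inverse_distance_adjacency(rel[1])
```

Here `rel` and `shifted_rel` are identical, which is the normalization effect described above.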

For 2.
I believe you are correct in using permute instead of view.
I developed the following test:

import torch

v = torch.zeros(1,5,3,2)  # batch, features, time, pedestrian number
v[:,:,0,:] = 10  # Mark first time step
v[:,:,1,:] = 20  # Mark second time step
v[:,:,2,:] = 30  # Mark third time step

v[:,:,:,0] += 1  # Tag the first pedestrian at every time step
v[:,:,:,1] += 2  # Tag the second pedestrian at every time step

print(v)
tensor([[[[11., 12.],
          [21., 22.],
          [31., 32.]],

         [[11., 12.],
          [21., 22.],
          [31., 32.]],

         [[11., 12.],
          [21., 22.],
          [31., 32.]],

         [[11., 12.],
          [21., 22.],
          [31., 32.]],

         [[11., 12.],
          [21., 22.],
          [31., 32.]]]])

Using view:

v.view(v.shape[0],v.shape[2],v.shape[1],v.shape[3])
tensor([[[[11., 12.],
          [21., 22.],
          [31., 32.],
          [11., 12.],
          [21., 22.]],

         [[31., 32.],
          [11., 12.],
          [21., 22.],
          [31., 32.],
          [11., 12.]],

         [[21., 22.],
          [31., 32.],
          [11., 12.],
          [21., 22.],
          [31., 32.]]]])

Using permute:

v.permute(0,2,1,3)
tensor([[[[11., 12.],
          [11., 12.],
          [11., 12.],
          [11., 12.],
          [11., 12.]],

         [[21., 22.],
          [21., 22.],
          [21., 22.],
          [21., 22.],
          [21., 22.]],

         [[31., 32.],
          [31., 32.],
          [31., 32.],
          [31., 32.],
          [31., 32.]]]])

Nonetheless, training with view or with permute gives similar results, with view slightly better: it has features from all relative time steps, unlike permute, which in the example above has no features from the 3rd time step for the first pedestrian. The main goal of TXP-CNN is to treat the time dimension as a feature channel.

from social-stgcnn.

hunterbobo commented on July 17, 2024

Thank you. It is an interesting phenomenon that view() makes it better.
I have another question. Maybe I have misunderstood the code, but I have read both your code and the SGAN code many times. There seems to be a small difference between your computation of ADE and SGAN's when choosing the best trajectory. For example, as shown in the following picture, there are two trajectories in the scene, a red one and a green one. If we predict 20 times, we get 20 predictions of the red one and 20 predictions of the green one. Your code chooses the best prediction of the red and the best prediction of the green to compute the ADE. I think SGAN is different: it selects the best at the scene level, not the trajectory level. SGAN computes the summed error of the two trajectories in the scene and then selects the best scene of the 20 predictions to compute the ADE. Could you please tell me whether I am right or wrong?
[Image: traj]
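
The two selection strategies described above can be sketched as follows. This is an illustrative simplification, not code from either repository: `err[k, n]` is assumed to already hold the ADE of sample `k` for pedestrian `n`, and the scene-level variant averages over pedestrians (for equal-length trajectories this picks the same sample as summing).

```python
import numpy as np

def ade_best_per_trajectory(err):
    """Pick the best sample separately for each pedestrian, then average."""
    return err.min(axis=0).mean()

def ade_best_per_scene(err):
    """Pick the single sample whose scene-wide mean error is lowest."""
    return err.mean(axis=1).min()

# Toy errors: 3 samples (rows) for 2 pedestrians (columns).
err = np.array([[1.0, 4.0],
                [3.0, 2.0],
                [2.5, 2.5]])

per_traj = ade_best_per_trajectory(err)   # best of column 0 is 1.0, of column 1 is 2.0
per_scene = ade_best_per_scene(err)       # every row averages to 2.5
print(per_traj, per_scene)                # 1.5 2.5
```

In this toy case no single sample is best for both pedestrians, so the per-trajectory number (1.5) is lower than the per-scene number (2.5).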

abduallahmohamed commented on July 17, 2024

One property to note about the TXP-CNN layer is that it is not permutation invariant: changes in the graph embedding right before TXP-CNN lead to different results. Apart from this, if the order of pedestrians is permuted starting from the input to Social-STGCNN, the predictions are invariant.

You are right regarding how the ADE is computed on my side. For S-GAN, I'm not sure about their ADE implementation.

What I notice in SGAN https://github.com/agrimgupta92/sgan/blob/master/scripts/evaluate_model.py#L81 is that they use the raw FDE and then take the minimum using https://github.com/agrimgupta92/sgan/blob/master/scripts/evaluate_model.py#L53. TBH, in my case it doesn't make sense to take the minimum over the scene, as we sample from a distribution. For each pedestrian and each time step we have a distribution; I treat the distribution for each pedestrian as a whole, sample from it, and take the minimum. SGAN doesn't produce a distribution; it produces an expectation, so their case is slightly different from ours.

Let me know if something doesn't sound right.

hunterbobo commented on July 17, 2024

To evaluate fairly, I think you should use the same evaluation method. You could treat the distributions of all pedestrians in the same scene as a whole and sample at the scene level, selecting the best scene; that would give an ADE comparable with SGAN's. Or you could compute SGAN's ADE by selecting the best trajectory. No matter how the trajectories are predicted, selecting the best trajectory will always give an ADE at least as good as selecting the best scene.
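
The last claim can be written as an inequality. A sketch, assuming $e_{k,n}$ denotes the ADE of sample $k$ for pedestrian $n$, with $K$ samples and $N$ pedestrians in a scene (notation introduced here for illustration, not taken from either codebase):

```latex
\[
\frac{1}{N}\sum_{n=1}^{N}\,\min_{1\le k\le K} e_{k,n}
\;\le\;
\min_{1\le k\le K}\,\frac{1}{N}\sum_{n=1}^{N} e_{k,n}
\]
```

It holds because, for whichever sample $k^{*}$ attains the minimum on the right, $\min_k e_{k,n} \le e_{k^{*},n}$ for every $n$. Equality occurs when a single sample is best for all pedestrians at once, so the per-trajectory number is never worse, though not always strictly better.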
Could you also tell me how the other methods listed in Table 2 calculate ADE? Many of them have not released their code, so I am not sure how they calculate it.

abduallahmohamed commented on July 17, 2024

I'm not sure how they calculated their ADE either; I took the numbers from their papers.

Are you sure that SGAN works at the scene level? If so, we are closer to Social-LSTM, which regresses against a bivariate distribution like ours. In a sense, SGAN generates a whole sample of all trajectories, so they are correlated, unlike ours, where we generate distribution parameters that we need to sample from afterwards.

golnazhabibi3 commented on July 17, 2024

Hi, I am reading the comments and it is still not clear to me how ADE is calculated. Could you confirm whether this is correct: for each time step, you sample 20 points from the distribution, select the one with minimum error, then continue the estimation from that point (treat it as the new starting point, predict again, sample from the distribution, select the best point, and so on)? Thanks!

abduallahmohamed commented on July 17, 2024

Hi @golnazhabibi3, please follow the code at https://github.com/abduallahmohamed/Social-STGCNN/blob/master/test.py#L95
For each user, I sample 20 trajectories and select the minimum for that user. I sample the whole trajectory as is per user; there is no continuation from point to point, as our model generates the distribution in a single inference step.

I hope this is clear
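
For readers who want to see the shape of that procedure, here is a rough NumPy sketch. It is not the code in test.py: the function name, argument layout, and the use of a Cholesky factor to draw samples are assumptions made for illustration. It assumes the model has already produced a bivariate Gaussian (mean and covariance) for every time step and pedestrian in one pass.

```python
import numpy as np

def min_ade_per_pedestrian(mean, cov, target, k=20, seed=0):
    """
    mean:   (T, N, 2)    predicted means per time step and pedestrian
    cov:    (T, N, 2, 2) predicted covariance matrices
    target: (T, N, 2)    ground-truth positions
    Returns an (N,) array: the lowest ADE among k sampled whole trajectories,
    chosen independently for each pedestrian.
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(cov)                    # (T, N, 2, 2)
    noise = rng.standard_normal((k,) + mean.shape)    # (k, T, N, 2)
    # Draw k whole trajectories at once: sample = mean + L @ z
    samples = mean + np.einsum("tnij,ktnj->ktni", chol, noise)
    dist = np.linalg.norm(samples - target, axis=-1)  # (k, T, N)
    ade = dist.mean(axis=1)                           # (k, N), averaged over time
    return ade.min(axis=0)                            # best sample per pedestrian

# Toy check: 3 time steps, 2 pedestrians, nearly deterministic prediction.
T, N = 3, 2
target = np.arange(T * N * 2, dtype=float).reshape(T, N, 2)
cov = np.tile(1e-12 * np.eye(2), (T, N, 1, 1))
best = min_ade_per_pedestrian(target.copy(), cov, target)
```

Each of the k samples is a full trajectory, and no sampled point is fed back into the model, which matches the single-inference-step description above.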

golnazhabibi3 commented on July 17, 2024

Thanks for your explanation and clarification!
