michigancog / gaze-attention
Integrating Human Gaze into Attention for Egocentric Activity Recognition (WACV 2021)
License: MIT License
Hi, and thanks for the great work.
I'm having difficulty reproducing the result reported on the EGTEA Gaze+ dataset. Using your provided trained weights and following the code-usage guide, I get these numbers on the different splits:
test_split1.txt: acc: 36.85, 49.21 / 0:22:27
test_split2.txt: acc: 47.65, 57.44 / 0:15:22
test_split3.txt: acc: 50.41, 60.14 / 0:15:07
How can I reproduce the reported 69.73%?
I'm using the default parameters:
parser.add_argument('--mode', default='test', help='train | test')
parser.add_argument('--crop', type=int, default=224, help='for spatial cropping')
parser.add_argument('--trange', type=int, default=24, help='temporal range')
parser.add_argument('--stride', type=int, default=8, help='pooling stride for gaze prediction')
parser.add_argument('--b', type=int, default=1, help='batch size')
parser.add_argument('--wd', type=float, default=4e-5, help='weight decay')
parser.add_argument('--it1', type=int, default=8000, help='first decay point')
parser.add_argument('--it2', type=int, default=15000, help='second decay point')
parser.add_argument('--iters', type=int, default=18000, help='number of max iterations for training')
parser.add_argument('--lr', type=float, default=0.032, help='learning rate')
parser.add_argument('--ngpu', type=int, default=1, help='number of GPUs to use')
parser.add_argument('--eps', type=float, default=1000, help='epsilon for the gradient estimator')
parser.add_argument('--anneal', type=float, default=1e-3, help='anneal rate for epsilon')
parser.add_argument('--datapath', default='dataset', help='path to dataset')
parser.add_argument('--datasplit', type=int, default=1, help='data split for the cross validation')
parser.add_argument('--weight', default='weights/i3d_iga_best1_base.pt', help='path to the weight file for the base network')
parser.add_argument('--seed', type=int, default=1, help='random seed')
parser.add_argument('--test_sparse', action='store_true', help='whether to test sparsely for fast evaluation')
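One thing worth checking: the repo ships split-specific weight files (the default is `weights/i3d_iga_best1_base.pt`, i.e. split 1), so each split may need to be evaluated with its matching weights before comparing against the paper's number. A dry-run sketch that only prints the commands; the `best2`/`best3` file naming is an assumption extrapolated from the split-1 name:

```shell
# Dry run: print one evaluation command per cross-validation split.
# Assumption: weights for split s are named i3d_iga_best{s}_base.pt,
# mirroring the split-1 file shipped with the repo.
for s in 1 2 3; do
  echo "python3 main.py --mode test --datasplit $s" \
       "--weight weights/i3d_iga_best${s}_base.pt"
done
```

Drop the `echo` to actually run the evaluations once the file names are confirmed.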
Hello, I'm trying to reproduce your research results.
When I downloaded the EGTEA Gaze+ dataset, I found that it doesn't contain the RGB or flow images.
So I created them using denseflow, which is recommended in another discussion.
After creating the images in dataset/images_flow and dataset/images_rgb, I ran the code below as instructed.
python3 main.py --mode test
However, an error occurred and I failed to reproduce your result.
datasplit: 1
weight: weights/i3d_iga_best1_base.pt
mode: test
test_sparse: False
loading weight file: weights/i3d_iga_best1_base.pt
loading weight file: weights/i3d_iga_best1_gaze.pt
loading weight file: weights/i3d_iga_best1_attn.pt
run on cuda
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P05-R01-PastaSalad-160540-162131-F003848-F003896/u/0048.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P05-R01-PastaSalad-160540-162131-F003848-F003896/v/0048.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P05-R01-PastaSalad-169741-171463-F004069-F004120/u/0051.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P05-R01-PastaSalad-169741-171463-F004069-F004120/v/0051.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P04-R06-GreekSalad-682170-683940-F016368-F016419/u/0051.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P04-R06-GreekSalad-682170-683940-F016368-F016419/v/0051.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P04-R06-GreekSalad-767250-769130-F018410-F018464/u/0054.jpg'): can't open/read file: check file path/integrity
[ WARN:[email protected]] global loadsave.cpp:244 findDecoder imread_('dataset/images_flow/P04-R06-GreekSalad-767250-769130-F018410-F018464/v/0054.jpg'): can't open/read file: check file path/integrity
Traceback (most recent call last):
File "main.py", line 261, in <module>
main()
File "main.py", line 85, in main
test(test_loader, model_base, model_gaze, model_attn, num_action)
File "main.py", line 222, in test
for i, (rgb, flow, label) in enumerate(test_loader, 1):
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/usr/local/lib/python3.8/dist-packages/torch/_utils.py", line 461, in reraise
raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/hayashide/catkin_ws/src/third_party/Gaze-Attention/dataset.py", line 67, in __getitem__
fimg = np.concatenate((fimgu[..., np.newaxis], fimgv[..., np.newaxis]), -1)
TypeError: 'NoneType' object is not subscriptable
Please give me some advice on how to resolve this issue.
Thanks,
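The `TypeError: 'NoneType' object is not subscriptable` comes from `cv2.imread` returning `None` for the missing files flagged in the warnings, so the root cause is an incomplete flow extraction rather than the model code. A small sketch (my own helper, not part of the repo) to scan a clip directory for frames present in only one of the u/v directions before starting the DataLoader, following the path layout visible in the warnings (`dataset/images_flow/<clip>/u/NNNN.jpg`):

```python
from pathlib import Path

def find_missing_flow_frames(clip_dir):
    """Return frame filenames present in only one of the clip's u/ and v/
    flow directories. cv2.imread returns None for a missing file, which is
    what crashes the dataloader, so finding gaps up front avoids failing
    mid-evaluation."""
    clip = Path(clip_dir)
    u = {p.name for p in (clip / "u").glob("*.jpg")}
    v = {p.name for p in (clip / "v").glob("*.jpg")}
    # Symmetric difference: frames in one direction but not the other.
    return sorted(u ^ v)
```

Running this over every clip listed in the test split should pinpoint which clips need their flow re-extracted.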
In the dataloader, I noticed the gaze data is read from .npy files. There must be an intermediate step where you preprocessed the gaze data from the text file in the original labels. Are there any instructions on how to do that? I read in the paper that you use a one-hot encoding where the value "1" is stored in the x-y grid cell the gaze points at. I just want to confirm whether my understanding is correct.
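For concreteness, here is a minimal sketch of the one-hot encoding described above, assuming a normalized gaze point in [0, 1]² and a square spatial grid; the grid size and normalization convention are my assumptions, not values taken from the repository:

```python
import numpy as np

def gaze_to_onehot(gx, gy, grid=7):
    """Encode a normalized gaze point (gx, gy in [0, 1]) as a one-hot map
    on a grid x grid layout: a single 1 at the cell the gaze falls in.
    The grid size is a placeholder, not the repo's actual value."""
    onehot = np.zeros((grid, grid), dtype=np.float32)
    # Clamp to the last cell so gaze points on the right/bottom border
    # still map to a valid index.
    ix = min(int(gx * grid), grid - 1)
    iy = min(int(gy * grid), grid - 1)
    onehot[iy, ix] = 1.0
    return onehot
```

Saving one such array per frame with `np.save` would produce .npy files of the shape the dataloader appears to expect, if this reading of the paper is right.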
Hi, I have a problem with the dataset preparation. In particular, your code takes single RGB images as input, but the EGTEA dataset provides only videos (which they call "cropped clips"), as you can see in the README of the EGTEA Gaze+ dataset.
Can you explain how you obtained those images? Or can you provide a link to them?
Thanks
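Until the authors confirm their exact extraction settings, one way to get per-frame JPEGs from the cropped clips is ffmpeg. A sketch; the clip location, the output layout (`dataset/images_rgb/<clip>/NNNN.jpg`, mirroring the flow paths seen in the error logs above), and the quality setting are all assumptions:

```shell
# Sketch: extract RGB frames from each cropped clip with ffmpeg.
# -qscale:v 2 is a near-lossless JPEG quality; adjust as needed.
for clip in dataset/cropped_clips/*.mp4; do
  name=$(basename "$clip" .mp4)
  mkdir -p "dataset/images_rgb/$name"
  ffmpeg -i "$clip" -qscale:v 2 "dataset/images_rgb/$name/%04d.jpg"
done
```

Note the `%04d` numbering starts at 0001; if the dataloader expects 0-based frame indices, add `-start_number 0`.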
Hi,
I went through the dataset.py code and I can't find exactly where you make use of the gaze information provided by EGTEA Gaze+.
What are pmaps, and how do I generate these numpy files?
dataset.py, line 50 (commit 1e80952)
Thanks,
From where do we obtain the pmaps (as used in training)?
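If "pmaps" stands for gaze probability maps (my guess; the repo does not define the term), a common way to build them is a 2-D Gaussian centered on the gaze point, normalized to sum to 1. A sketch under that assumption; the map size and sigma are placeholders:

```python
import numpy as np

def gaussian_pmap(gx, gy, size=64, sigma=3.0):
    """Build a probability map for a normalized gaze point (gx, gy in
    [0, 1]): a 2-D Gaussian centered on the gaze location, normalized so
    the map sums to 1. size and sigma are illustrative defaults only."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = gx * (size - 1), gy * (size - 1)
    pmap = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return pmap / pmap.sum()
```

Saving one map per frame with `np.save` would give .npy files; whether the repo uses a Gaussian or the one-hot encoding from the paper would need confirmation from the authors.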
Hi,
Could you share the util functions you used to generate the Grad-CAM visualizations?
Thanks.
How can I prepare the EGTEA dataset?
The original dataset provides cropped video clips, but your code expects image files.
Is there a recommended way to convert them?
Hello,
I'm just wondering if you have uploaded the optical flow for this dataset somewhere, so I don't have to compute it again?