rohitgirdhar / cater
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Home Page: https://rohitgirdhar.github.io/CATER/
License: Apache License 2.0
There are several corrupted video files in the downloadable zip datasets, e.g. 'videos/CATER_new_004798.avi' in the max2action task (see screenshot). We downloaded the archive multiple times, and the same files are corrupted each time.
How should we handle this?
Code to reproduce the issue:
wget https://cmu.box.com/shared/static/jgbch9enrcfvxtwkrqsdbitwvuwnopl0.zip
# extract one corrupted video
unzip videos.zip videos/CATER_new_004798.avi
# open in a media player:
# vlc videos/CATER_new_004798.avi
# mplayer videos/CATER_new_004798.avi
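Independent of re-downloading, it may help to scan the extracted archive for truncated files. Below is a minimal stdlib-only sketch (not part of the CATER tooling) that checks each AVI's RIFF header against its on-disk size; it catches truncated downloads, though not codec-level corruption, for which actually decoding the file with ffprobe or OpenCV is a stronger test.

```python
# Scan a directory for truncated AVI files. An AVI starts with a
# "RIFF<size>AVI " header, where <size> declares the payload length;
# a file shorter than declared was cut off mid-download.
import glob
import os
import struct

def find_corrupted(video_dir):
    bad = []
    for path in sorted(glob.glob(os.path.join(video_dir, "*.avi"))):
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            header = f.read(12)
        # reject files with a missing or malformed RIFF/AVI signature
        if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"AVI ":
            bad.append(path)
            continue
        declared = struct.unpack("<I", header[4:8])[0]
        if declared + 8 > size:  # RIFF size excludes the 8-byte preamble
            bad.append(path)
    return bad
```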
Hi, when visualizing the results of the tracking method (e.g. with DEBUG_STORE_VIDEO turned on), I found that the ground-truth location for spl (snitch localization) does not mark the snitch but the ball. Incidentally, I can reproduce the reported number of 33.9 exactly.
As you can see in the following figure and videos, the ground truth (blue rectangle) tracks the yellow ball, not the snitch. I checked that the ground-truth numbers match what is shown in the visualization.
Hi Authors,
Thanks for your work.
I tried generating 3 videos to test the dataset.
While actions_order_dataset seems to return frame, label and classes, the output file (train.txt) under the folder action_order_uniq contains no such information. It contains lines like:
/images/CLEVR_new_000002.avi 53,54,60,69,70,71,72,74,77,78,81,83,129,138,144,153,155,156,157,161,162,165,167,173,179,187,188,195,197,198,200,203,204,207,209,257,263,264,265,270,272,279,281,282,284,287,288,291,292,293,381,382,383,387,389,390,392,396,398,405,407,408,410,411,412,413,414,415,417,419,423,425,430,431,432,434,438,440,447,449,450,452,455,456,459,460,461,465,471,474,480,489,490,491,492,495,497,498,501,502,509,515,518,524,532,533,536,539,545,549,551,555,557,558,560,564,565,566,573,575,576,577,578,580,581,582,585,586,587
How can I get frame-by-frame actions and classes?
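For what it's worth, each line in those list files appears to pair a video path with a comma-separated set of class ids that are active somewhere in that video, i.e. a video-level multi-label target rather than per-frame annotation. A tiny parser under that assumption (the function name is mine):

```python
def parse_list_line(line):
    """Split '<video path> <id,id,...>' into (path, [ids])."""
    path, labels = line.strip().split()
    return path, [int(x) for x in labels.split(",")]
```

If that reading is right, per-frame timing cannot be recovered from these lists alone and would have to come from the scene JSONs.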
Hello Authors,
For our work we require the pixel coordinates of all the objects. We can read the world coordinates (3d_coords) of the objects from the JSON files. Is there a way to convert those directly to pixel_coords?
We tried to compute a homography from corresponding initial 3d_coords and pixel_coords of objects available in the JSON files, but it doesn't produce an accurate transformation.
The function get_camera_coords in utils.py seems to do the job, but we do not have access to the parameters needed to run it.
Thank You
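One note on the question above: a homography can only map between planes, and CATER objects move off the ground plane, so a single homography cannot reproduce the 3d_coords-to-pixel mapping exactly. A full pinhole projection is needed. The sketch below assumes the intrinsics K and extrinsics R, t are available (e.g. read out of Blender, or calibrated from known 3D/pixel correspondences); it is not the repository's implementation.

```python
import numpy as np

def project(points_3d, K, R, t):
    """Map (N, 3) world points to (N, 2) pixel coords via a pinhole model.

    K: 3x3 camera intrinsics; R, t: world-to-camera rotation and translation.
    All values here must be supplied by the caller (assumed, not CATER's).
    """
    cam = R @ points_3d.T + t.reshape(3, 1)  # world -> camera frame
    uv = K @ cam                             # camera frame -> image plane
    return (uv[:2] / uv[2]).T                # perspective divide
```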
Hi, I appreciate your wonderful work. However, when I unzip the pre-generated data downloaded from the direct links, some files inside are reported as broken. I have confirmed that the size of the zip file is correct, and a simple retry doesn't fix it. How can I address this?
Hi,
Is there any documentation for the JSON format of the pre-generated scene files?
Thanks.
Hi, I want to run some segmentation models (e.g. Mask R-CNN), so I need the ground-truth masks of the objects in the scene. CLEVRER provides them, but CATER does not. Do you plan to release object masks? If not, how can we generate them?
Hello Authors,
Thanks for such great work.
I could not find the labels corresponding to the 301 classes of Task 2.
Can you please point me to that list?
Thank You
Dear Authors @rohitgirdhar ,
I tried to download and use the VM spec, but the downloaded spec_v0.img file is corrupted.
Has the download link for the VM spec been updated? Could you kindly share a working URL for the CATER VM spec?
I used the download link on this page: https://cmu.box.com/s/krg7ehliaidruxjk21nfxsa0gge2uf2o, as provided in CATER-master/generate/README.md. However, when I try to open the downloaded 3.0 GB spec_v0.img file, it says "The disc image file is corrupted". If I specify the directory of the downloaded spec_v0.img in launch.py and run it, it prompts "singularity: not found", as if there were no spec_v0.img file at all. I have tried downloading and opening spec_v0.img on macOS, Windows and Ubuntu, all with the same result. I also tried downloading with wget, but I couldn't find the specific URL to use.
I'm interested in your work and would like to learn from it.
Thank you very much!
Hello Rohit, I ran into yet another error, which I could not solve despite repeatedly trying everything I could over the last 4 days.
In file included from /home/u1698461/Downloads/CATER-master/pytorch_once_again/pytorch/caffe2/video/customized_video_io.cc:25:0:
/home/u1698461/Downloads/CATER-master/pytorch_once_again/pytorch/caffe2/video/customized_video_io.h:30:10: fatal error: caffe/proto/caffe.pb.h: No such file or directory
#include "caffe/proto/caffe.pb.h"
^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
caffe2/CMakeFiles/torch_cpu.dir/build.make:6166: recipe for target 'caffe2/CMakeFiles/torch_cpu.dir/video/customized_video_io.cc.o' failed
make[2]: *** [caffe2/CMakeFiles/torch_cpu.dir/video/customized_video_io.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/u1698461/Downloads/CATER-master/pytorch_once_again/pytorch/caffe2/video/customized_video_input_op.cc:25:0:
/home/u1698461/Downloads/CATER-master/pytorch_once_again/pytorch/caffe2/video/customized_video_input_op.h:43:10: fatal error: caffe2/utils/thread_pool.h: No such file or directory
#include "caffe2/utils/thread_pool.h"
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
caffe2/CMakeFiles/torch_cpu.dir/build.make:6153: recipe for target 'caffe2/CMakeFiles/torch_cpu.dir/video/customized_video_input_op.cc.o' failed
make[2]: *** [caffe2/CMakeFiles/torch_cpu.dir/video/customized_video_input_op.cc.o] Error 1
CMakeFiles/Makefile2:9446: recipe for target 'caffe2/CMakeFiles/torch_cpu.dir/all' failed
make[1]: *** [caffe2/CMakeFiles/torch_cpu.dir/all] Error 2
Makefile:159: recipe for target 'all' failed
make: *** [all] Error 2
Steps Followed:
I think the reason for this error is that the provided caffe2_customized_ops/video is a modified version of the older caffe2/video, which has since changed in the official PyTorch repository.
I tried reaching Xiaolong Wang by mail but didn't get a reply.
It would be really helpful if you could spare a couple of hours over the weekend for this issue.
The work I have been doing on the CATER dataset over the last 4 months will be badly affected if I cannot run R3D.
Please let me know if you need any more information.
Thank You
The data generation script doesn't have the singularity/spec_v0.img file
Hello Rohit,
While trying to run the following command for testing:
"python launch.py -c configs_cater/001_I3D_NL_localize_imagenetPretrained_32f_8SR.yaml"
i am getting the following error:
Running
PYTHONPATH=`pwd`/lib/:$PYTHONPATH PYTHONPATH=`pwd`/external_lib/average-precision/python:$PYTHONPATH python tools/test_net_video.py --config_file configs_cater/001_I3D_NL_localize_imagenetPretrained_32f_8SR.yaml CHECKPOINT.DIR outputs/configs_cater/001_I3D_NL_localize_imagenetPretrained_32f_8SR.yaml TEST.TEST_FULLY_CONV True 2>&1 | tee outputs/configs_cater/001_I3D_NL_localize_imagenetPretrained_32f_8SR.yaml/log_test_net_video.py.txt
/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/core/config.py:349: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
yaml_config = AttrDict(yaml.load(fopen))
Ignoring @/caffe2/caffe2/contrib/nccl:nccl_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops as it is not a valid file.
Ignoring @/caffe2/caffe2/contrib/gloo:gloo_ops_gpu as it is not a valid file.
Traceback (most recent call last):
File "tools/test_net_video.py", line 351, in
main()
File "tools/test_net_video.py", line 336, in main
cfg_from_file(args.config_file)
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/core/config.py", line 350, in cfg_from_file
merge_dicts(yaml_config, __C)
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/core/config.py", line 333, in merge_dicts
type(dict_b[key]), type(value), key)
ValueError: Type mismatch (<class 'bytes'> vs. <class 'str'>) for config key: DATADIR
Initially I was getting a similar error while generating the LMDBs by running: python process_data/cater/gen_lmdbs.py
Traceback (most recent call last):
File "create_video_lmdb.py", line 116, in
main()
File "create_video_lmdb.py", line 111, in main
create_an_lmdb_database(args.list_file, args.dataset_dir)
File "create_video_lmdb.py", line 70, in create_an_lmdb_database
video_tensor.string_data.append(video_data)
TypeError: '/home/u1698461/Downloads/CATER-master/max2action/videos/CATER_new_005360.avi' has type str, but expected one of: bytes
which I resolved by changing the following line in create_video_lmdb.py,
from:
def create_an_lmdb_database(list_file, output_file, use_local_file=True):
to:
def create_an_lmdb_database(list_file, output_file, use_local_file=False):
But I am not able to resolve the error at the top.
Can you please point me to a solution, or explain what to look for?
I installed Caffe2 in a python=3.6 environment using: conda install pytorch-nightly-cpu -c pytorch
Thank You
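On the DATADIR ValueError: under Python 3, yaml.load returns str values, while the config defaults written for Python 2 are bytes, hence the type mismatch in merge_dicts. One hypothetical workaround (the helper name is mine, not the repo's) is to coerce each loaded value to the type of the corresponding default before the type check:

```python
def coerce_to_template(value, template):
    """Match a YAML-loaded value to the type of the config default."""
    if isinstance(template, bytes) and isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(template, str) and isinstance(value, bytes):
        return value.decode("utf-8")
    return value
```

e.g. applying value = coerce_to_template(value, dict_b[key]) just before the type comparison in merge_dicts should sidestep the mismatch, assuming no other str/bytes uses downstream.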
Hi Rohit,
Awesome work. Thanks for contributing to the community and for sharing the code.
Can you share the model weights of the non-local network that you fine-tuned on CATER?
Thanks,
Nour
In Table 3 of the paper, you apparently used 1 or 3 frames for the TSN experiments.
What does that mean? Why did you train using 3 frames and test using 250 frames?
Task 3 is really challenging, and it doesn't make sense to solve it using only 3 frames, so I must be misunderstanding the setup.
Does it mean you sampled 3 frames per segment? If so, how many segments are used, and how many total frames are seen at training time?
Also, what is the detailed setup for TSN+LSTM? It appears you used 10 clips for the LSTM on the 3D models, but with TSN did you still use 10 "frames"? Or how did you set it up?
Lastly, do you have any plans to release the TSN code?
Thanks a lot for your awesome research!!
Hi. I downloaded the dataset,
but the .txt files in "actions_order_uniq" and "actions_present" seem to be the same.
It's not clear how to install the dependencies for generation. The instructions say "all CLEVR requirements", but that doesn't really help because the CLEVR repo also lacks clear instructions.
It would be better to list the installation steps in detail or to provide an install script.
In the LSTM code, I notice that 'To run the LSTM code, first extract the features using the TSN trained models'.
I can't work out how to use this code. Can you provide some details, or the h5/pkl files mentioned below?
def read_data(data_dir):
    if osp.exists(args.data_dir + '_val_feats.h5'):
        print('This looks like TSN outputs, reading it so.')
        val_data = read_data_tsn(args.data_dir + '_val_feats.h5')
        train_data = read_data_tsn(args.data_dir + '_train_feats.h5')
    elif osp.exists(osp.join(
            args.data_dir, 'results_probs_test_fullLbl.pkl')):
        print('This looks like NL outputs, reading it so.')
        assert args.lbl_dir is not None, (
            'lbl_dir must be set for NL models, since the labels are not '
            'stored in the PKL file.')
        val_data = read_data_nl(
            osp.join(args.data_dir, 'results_probs_test_fullLbl.pkl'),
            osp.join(args.lbl_dir, 'val.txt'))
        train_data = read_data_nl(
            osp.join(args.data_dir, 'results_probs_train_fullLbl.pkl'),
            osp.join(args.lbl_dir, 'train.txt'))
    else:
        raise NotImplementedError('Dunno how to read data directory {}'.format(
            data_dir))
    return train_data, val_data
Thanks!
Hi,
Thanks for your interesting work.
Could you please explain the spatial relationships data?
F = json.load(open('max2action/scenes/CATER_new_004617.json'))
len(F['relationships']['behind']) = 56
The number of frames: 40
The original semantics of the spatial relations should be: if j is in F['relationships']['behind'][i], then object j is behind object i (see here). However, here the index i cannot be an object index (there aren't 56 objects in the scene). Could you clarify?
Thanks,
Originally posted by @roeiherz in #3 (comment)
I'm working on making CATER one of the available datasets in NVIDIA NeMo. To do so, I need to be able to download all the data programmatically through static download links. As far as I can tell, only the Scenes and Videos directories have static download links; the Lists directory can only be downloaded manually from box.com. Could you also add static download links for the Lists directory? Thank you.
Hello Rohit,
I am getting the following error while running-
"python launch.py -c configs_cater/001_I3D_NL_localize_imagenetPretrained_32f_8SR.yaml -t test"
Traceback (most recent call last):
File "tools/test_net_video.py", line 351, in
main()
File "tools/test_net_video.py", line 346, in main
store_vis=args.store_vis)
File "tools/test_net_video.py", line 294, in test_net
store_vis=store_vis)
File "tools/test_net_video.py", line 110, in test_net_one_section
test_model.build_model()
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/models/model_builder_video.py", line 119, in build_model
train=self.train, force_fw_only=self.force_fw_only
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/models/model_builder_video.py", line 230, in create_data_parallel_model
use_nccl=not cfg.DEBUG, # org: True
File "/home/u1698461/anaconda3/envs/last/lib/python2.7/site-packages/caffe2/python/data_parallel_model.py", line 39, in Parallelize_GPU
Parallelize(*args, **kwargs)
File "/home/u1698461/anaconda3/envs/last/lib/python2.7/site-packages/caffe2/python/data_parallel_model.py", line 236, in Parallelize
input_builder_fun(model_helper_obj)
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/models/model_builder_video.py", line 207, in add_video_input
batch_size=batch_size,
File "/home/u1698461/Downloads/CATER-master/baselines/video-nonlocal-net/lib/models/model_builder_video.py", line 171, in AddVideoInput
data, label = model.net.CustomizedVideoInput(
File "/home/u1698461/anaconda3/envs/last/lib/python2.7/site-packages/caffe2/python/core.py", line 2205, in getattr
",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method CustomizedVideoInput is not a registered operator. Did you mean: []
Can you please describe the procedure you followed to install caffe2?
I used the following command to install caffe2 with python=2.7:
conda install pytorch-nightly-cpu -c pytorch
@rohitgirdhar
We have uploaded the dataset to Baidu Cloud and are now sharing it with users in Mainland China:
Baidu Cloud link
code: d9nk
Hello Rohit,
xiaolonw's caffe2 repository is missing some modules, so I cannot build caffe2 from it.
The official version of caffe2 (merged into PyTorch) doesn't support Python 2.7, so building from the official repository is not an option either.
It seems the only way to use your code is to port it to Python 3.6.
So, is there any way your code can be ported to Python 3.6?
Any other suggestions would also be really helpful.
Thank You
Hello Rohit,
The yaml file in the repository corresponds to Task 3.
Can you please share the yaml file for Task 1?
Thank You
Hello Authors,
Can you please give an idea of whether the R3D network would also be useful for temporally localizing the actions?
As mentioned in the paper, the actions are restricted to occur within one of the 10 slots of 30 frames each.
So instead of feeding in the whole video (12.5 s), if we split each video into 10 parts (1.25 s each) and feed those to the R3D model, would the accuracy be close to the reported 98%?
Thank You
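The slot split being proposed is easy to pin down in code. Below is a trivial sketch (300 frames into 10 contiguous 30-frame clips) purely to fix the indexing; whether per-clip accuracy stays near the full-video number is exactly the open question above.

```python
def split_into_slots(frames, num_slots=10):
    """Split a frame sequence into num_slots equal contiguous clips."""
    assert len(frames) % num_slots == 0, "frame count must divide evenly"
    slot_len = len(frames) // num_slots
    return [frames[i * slot_len:(i + 1) * slot_len]
            for i in range(num_slots)]
```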
Hi, I'd like to get the names of the classes in the ground-truth files for Task 1 and Task 2, which should be 14 and 301 classes respectively. Can you help me?
Hi,
Is there any way to extract per-frame bounding-box annotations?
I have managed to extract the (cx, cy) center per box, but how can I calculate the width and height of the box?
Many thanks,
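One possible route, under two assumptions the source does not confirm: that a world-to-pixel projection function is available, and that a rough 3D half-extent per object can be derived (e.g. from the object scale stored in the scene JSONs). Project the 8 corners of the object's axis-aligned 3D box and take the pixel-space min/max:

```python
import numpy as np

def bbox_from_extent(center, half_extent, project):
    """center, half_extent: (3,) arrays; project: maps (N, 3) world
    points to (N, 2) pixel coords (assumed to exist, not CATER's API).
    Returns (x, y, w, h) of the tight pixel box around the 8 corners."""
    # enumerate the 8 corners of the axis-aligned 3D box
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    corners = center + signs * half_extent
    uv = project(corners)
    x0, y0 = uv.min(axis=0)
    x1, y1 = uv.max(axis=0)
    return x0, y0, x1 - x0, y1 - y0
```

The resulting box is only as tight as the axis-aligned extent; rotated objects would need their actual mesh vertices projected instead.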