
ddshan / hand_object_detector


Project and dataset webpage:

Home Page: https://fouheylab.eecs.umich.edu/~dandans/projects/100DOH/

License: MIT License

Python 68.23% MATLAB 0.35% C++ 2.80% Cuda 13.45% Shell 0.12% C 12.37% Cython 2.69%
Topics: cvpr2020, dataset, handobjectdetection, interactiondetection

hand_object_detector's People

Contributors

ddshan



hand_object_detector's Issues

atomicAdd error when running `python setup.py build develop`

I am trying to build faster_rcnn with `python setup.py build develop`. I get this error every time atomicAdd is used:

error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, double)
          detected during instantiation of "void RoIAlignBackwardFeature(int, const T *, int, T, int, int, int, int, int, int, T *, const T *) [with T=double]" 

The final error log is:

Traceback (most recent call last):
  File "setup.py", line 59, in <module>
    setup(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
    super().run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
    super().run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
    build_ext.build_extensions(self)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

My environment matches the one you describe; one possible issue could be the CUDA driver.
Do you have any idea how to fix it?

PS: it builds perfectly when I build without GPU support.
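For reference: `atomicAdd(double*, double)` is only provided natively on GPUs with compute capability 6.0 and above, so this error usually means nvcc is compiling for an older (or unspecified) architecture. A minimal sketch of one workaround, assuming your GPU is in fact sm_60 or newer, is to pin the target architectures before building (the arch list below is an assumption; adjust it to your GPU):

```python
# sketch: set the target CUDA architectures before the extension build,
# e.g. at the top of setup.py or exported from the shell; torch's
# cpp_extension honors TORCH_CUDA_ARCH_LIST when invoking nvcc
import os
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "6.0;7.0;7.5")
# then run: python setup.py build develop
```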

nvidia semantic segmentation auto-label

Hello, I saw that you also asked about auto-labeling under NVIDIA's semantic segmentation project. I have some questions in this regard as well. Would you please leave contact information? I would like to ask you for advice. Thank you!

RGB & BGR

Hi @ddshan

Thanks for your awesome work.

demo.py supports both image input and webcam input. However, you use imread to read images, which returns RGB, and cv2 to read video, which returns BGR. Is this a bug?

```python
# Get image from the webcam
if webcam_num >= 0:
    if not cap.isOpened():
        raise RuntimeError("Webcam could not open. Please check connection.")
    ret, frame = cap.read()
    im_in = np.array(frame)
# Load the demo image
else:
    im_file = os.path.join(args.image_dir, imglist[num_images])
    im_in = np.array(imread(im_file))
    # resize
    # im_in = np.array(Image.fromarray(im_in).resize((640, 360)))
if len(im_in.shape) == 2:
    im_in = im_in[:, :, np.newaxis]
    im_in = np.concatenate((im_in, im_in, im_in), axis=2)
# rgb -> bgr
im = im_in[:, :, ::-1]
```
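For reference, one way to make the two paths consistent is to flip only the path that is actually RGB; a minimal sketch (function and argument names are illustrative, assuming cv2 frames are BGR and imread returns RGB):

```python
import numpy as np

# sketch: the network downstream expects BGR, so only the file path
# (RGB from imread) needs the channel flip; webcam frames from cv2
# are already BGR and can pass through unchanged
def to_model_input(frame_bgr=None, image_rgb=None):
    if frame_bgr is not None:
        return np.asarray(frame_bgr)          # cv2 capture: already BGR
    return np.asarray(image_rgb)[:, :, ::-1]  # imread: RGB -> BGR
```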

Test images with batching

Thanks for sharing the excellent project. I wonder if I can test images in batches directly, or whether I need to modify some files. Thanks again!

How can I make annotations for my own dataset?

Hi @ddshan, thank you for your excellent code. I want to know how to make annotations for my own dataset so that I can fine-tune on it.
I already know how to make bbox annotations for hands and objects, but I don't know how to annotate hand_side, contact_state, and offset. Could you please tell me how to make these annotations? Thank you!
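For anyone with the same question, a heavily hedged sketch of how the extra fields could be derived, based on the paper's description rather than the repo's exact annotation schema (field names and encodings below are assumptions): hand_side is a left/right label, contact_state is one of the paper's five contact categories, and offset links a hand box to its object box via the direction and magnitude between box centers:

```python
import numpy as np

# assumed encodings (illustrative only, not necessarily the repo's values):
#   hand_side:     0 = left, 1 = right
#   contact_state: 0 = none, 1 = self, 2 = other person, 3 = portable, 4 = fixed
def hand_to_object_offset(hand_box, obj_box):
    """Direction and magnitude from hand-box center to object-box center.

    Boxes are [x1, y1, x2, y2] in pixels.
    """
    center = lambda b: np.array([(b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0])
    v = center(obj_box) - center(hand_box)
    mag = float(np.linalg.norm(v))
    unit = v / mag if mag > 0 else np.zeros(2)
    return unit, mag
```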

Size mismatch error with pre-trained model

Hi, I got the error message "size mismatch for extension_layer.hand_contact_state_layer.3.weight: copying a param with shape torch.Size([5, 32]) from checkpoint, the shape in current model is torch.Size([4, 32])" when running your code with the provided pre-trained model. A temporary workaround is to check out a previous version.
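If checking out an older commit is not an option, a sketch of a blunter workaround is to load only the keys whose shapes match. Note this leaves the mismatched contact-state head randomly initialized, so its outputs are meaningless until retrained; `fasterRCNN` and `load_name` are assumed to be set up as in the repo's demo/test scripts:

```python
import torch

# sketch: partial state_dict load that skips shape-mismatched keys
checkpoint = torch.load(load_name)
model_state = fasterRCNN.state_dict()
compatible = {k: v for k, v in checkpoint['model'].items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
fasterRCNN.load_state_dict(model_state)
```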

License

Hi

Thanks for releasing this wonderful work! It works really well.

I want to use part of the code in my own project, but the license seems to be missing. Would you mind adding one if feasible?

Thanks!

Why is the O label hardcoded?

`obj_state = state_map2[int(obj_dets[i, 5])]` will never map to O, and in `draw_obj_mask()` the O label is hardcoded. Why is that?

Cannot run demo

Hi, great work!

I can't run the demo.
Whenever I run the demo, I get this error while loading the checkpoint weights to the network.

*** RuntimeError: CUDA error: unknown error

If I load the weights in Python interactive mode, I have no problem.
Some dependency related to these imports seems to cause the trouble, but I am not sure:

from model.roi_layers import nms
from model.faster_rcnn.vgg16 import vgg16
from model.faster_rcnn.resnet import resnet

Please help me.

RuntimeError: Not compiled with GPU support

Hi @ddshan, I set up the virtual environment based on the instructions in README.md, but when running both demo.py and test_net.py I get RuntimeError: Not compiled with GPU support. I am not sure what is going wrong during installation.

Do you happen to know the root cause?

(handobj_new) kavitshah@frerd001:~/fair/hand_object_detector$ CUDA_VISIBLE_DEVICES=0 python test_net.py --model_name=handobj_100K --save_name=handobj_100K --cuda --checkepoch=8 --checkpoint=132028
Called with args:
Namespace(cfg_file='cfgs/resnet101.yml', checkepoch=8, checkpoint=132028, checksession=1, class_agnostic=False, cuda=True, dataset='pascal_voc', large_scale=False, load_dir='models', mGPUs=False, model_name='handobj_100K', net='res101', parallel_type=0, save_name='handobj_100K', set_cfgs=None, thresh_hand=0.1, thresh_obj=0.1, vis=False)
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2],
 'ANCHOR_SCALES': [8, 16, 32, 64],
 'CROP_RESIZE_WITH_MAX_POOL': False,
 'CUDA': False,
 'DATA_DIR': '~/fair/hand_object_detector/data',
 'DEDUP_BOXES': 0.0625,
 'EPS': 1e-14,
 'EXP_DIR': 'res101',
 'FEAT_STRIDE': [16],
 'GPU_ID': 0,
 'MATLAB': 'matlab',
 'MAX_NUM_GT_BOXES': 20,
 'MOBILENET': {'DEPTH_MULTIPLIER': 1.0,
               'FIXED_LAYERS': 5,
               'REGU_DEPTH': False,
               'WEIGHT_DECAY': 4e-05},
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'POOLING_MODE': 'align',
 'POOLING_SIZE': 7,
 'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False},
 'RNG_SEED': 3,
 'ROOT_DIR': '~/fair/hand_object_detector',
 'TEST': {'BBOX_REG': True,
          'HAS_RPN': True,
          'MAX_SIZE': 1000,
          'MODE': 'nms',
          'NMS': 0.3,
          'PROPOSAL_METHOD': 'gt',
          'RPN_MIN_SIZE': 16,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'RPN_TOP_N': 5000,
          'SCALES': [600],
          'SVM': False},
 'TRAIN': {'ASPECT_GROUPING': False,
           'BATCH_SIZE': 128,
           'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
           'BBOX_NORMALIZE_TARGETS': True,
           'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
           'BBOX_REG': True,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'BIAS_DECAY': False,
           'BN_TRAIN': False,
           'DISPLAY': 20,
           'DOUBLE_BIAS': False,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'GAMMA': 0.1,
           'HAS_RPN': True,
           'IMS_PER_BATCH': 1,
           'LEARNING_RATE': 0.001,
           'MAX_SIZE': 1000,
           'MOMENTUM': 0.9,
           'PROPOSAL_METHOD': 'gt',
           'RPN_BATCHSIZE': 256,
           'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 8,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 12000,
           'SCALES': [600],
           'SNAPSHOT_ITERS': 5000,
           'SNAPSHOT_KEPT': 3,
           'SNAPSHOT_PREFIX': 'res101_faster_rcnn',
           'STEPSIZE': [30000],
           'SUMMARY_INTERVAL': 180,
           'TRIM_HEIGHT': 600,
           'TRIM_WIDTH': 600,
           'TRUNCATED': False,
           'USE_ALL_GT': True,
           'USE_FLIPPED': True,
           'USE_GT': False,
           'WEIGHT_DECAY': 0.0001},
 'USE_GPU_NMS': True}

--------> dataset path = ~/fair/hand_object_detector/data/VOCdevkit2007_handobj_100K

Loaded dataset `voc_2007_test` for training
Set proposal method: gt
voc_2007_test
Preparing training data...
voc_2007_test gt roidb loaded from ~/fair/hand_object_detector/data/cache_handobj_100K/voc_2007_test_gt_roidb.pkl
done

--------> dataset path = ~/fair/hand_object_detector/data/VOCdevkit2007_handobj_100K

9983 roidb entries

 ---------> which model = models/res101_handobj_100K/pascal_voc/faster_rcnn_1_8_132028.pth

load checkpoint models/res101_handobj_100K/pascal_voc/faster_rcnn_1_8_132028.pth
load model successfully!

---------> det score thres_hand = 0.1


---------> det score thres_obj = 0.1

Traceback (most recent call last):
  File "test_net.py", line 245, in <module>
    rois_label, loss_list = fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/faster_rcnn/faster_rcnn.py", line 57, in forward
    rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/rpn/rpn.py", line 77, in forward
    rois = self.RPN_proposal((rpn_cls_prob.data, rpn_bbox_pred.data,
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/rpn/proposal_layer.py", line 147, in forward
    keep_idx_i = nms(proposals_single, scores_single.squeeze(1), nms_thresh)
RuntimeError: Not compiled with GPU support
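This error is raised by the compiled extension when it was built in an environment where no CUDA toolkit was visible. A diagnostic sketch (not a fix): run this in the environment used for `python setup.py build develop`; if CUDA_HOME resolves to None, the ops compile CPU-only and later raise exactly this RuntimeError:

```python
# diagnostic sketch: check what the extension build can see
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch version :", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("CUDA_HOME     :", CUDA_HOME)  # None => extension builds CPU-only
```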

RuntimeError: CUDA error: device-side assert triggered (can't train the model if batch is not 1)

It seems that batch_size can only be 1.
When I set batch_size = 4 or 8 during training, this error occurs:

Traceback (most recent call last):
  File "trainval_net.py", line 321, in <module>
    rois_label, loss_list = fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/Hand-Object-Interaction-detection/lib/model/faster_rcnn/faster_rcnn.py", line 62, in forward
    roi_data = self.RCNN_proposal_target(rois, gt_boxes, num_boxes, box_info)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/Hand-Object-Interaction-detection/lib/model/rpn/proposal_target_layer_cascade.py", line 52, in forward
    rois_per_image, self._num_classes, box_info)
  File "/content/Hand-Object-Interaction-detection/lib/model/rpn/proposal_target_layer_cascade.py", line 146, in _sample_rois_pytorch
    fg_inds = torch.nonzero(max_overlaps[i] >= cfg.TRAIN.FG_THRESH).view(-1)
RuntimeError: CUDA error: device-side assert triggered
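Device-side asserts are reported asynchronously, so the traceback often points at a later, unrelated op. A debugging sketch (not a fix): force synchronous kernel launches so the stack trace lands on the op that actually failed; with batch sizes above 1, an out-of-range index during ROI sampling is a plausible, though unverified, culprit:

```python
# sketch: put this at the very top of trainval_net.py, before anything touches CUDA
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # synchronous launches -> accurate traceback
```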

Questions about AP on H+O

Hello, I ran the model on the test set of the 100K dataset, but the handobj_100K AP on H+O only reaches about 41 rather than the reported 46.9. The other APs match what is shown in the table and the paper. Have I done anything wrong or missed anything?

Run time for extracting images.

I used demo.py to process a set of 54 test images. On my CPU the process takes about 2 seconds per image; on my A40, for some reason, it takes 8 seconds per image.

I'm only interested in the detection values, not the image generation. Is this a normal extraction time per image? I was curious whether the rate should be faster and whether I'm doing something wrong, as it would take about 14 months to process the Something-Something V2 dataset at this rate.
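A GPU running slower than the CPU usually means the measurement includes one-time setup or the data never reached the device. A timing sketch (variable names are taken from demo.py, so treat this as illustrative) that warms up first and synchronizes around the forward pass:

```python
import time
import torch

# sketch: time only the forward pass, after one warm-up call; without
# torch.cuda.synchronize() the timer measures kernel launches, not execution
with torch.no_grad():
    fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)  # warm-up
    torch.cuda.synchronize()
    t0 = time.time()
    fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
    torch.cuda.synchronize()
print(f"{time.time() - t0:.3f} s per image (model only)")
```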

fatal error: ATen/ceil_div.h: No such file or directory

  1. I got this error executing `python setup.py build develop`:
    ... something else ...
    fatal error: ATen/ceil_div.h: No such file or directory
    #include <ATen/ceil_div.h>
    | ^~~~~~~~~~~~~~~~~
    compilation terminated.
    error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

My environment: Ubuntu 22.04, GCC 11.4.0. Installing opencv-python via "pip install -r requirements.txt" failed, so I manually installed opencv-python 4.3.0.38 in the Python 3.6 environment.

  2. I tried to copy the missing "ceil_div.h" from other places in the anaconda3 directory; it stopped reporting missing files, but ran into a lot of other problems, and they are weird:
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:56:24: error: ‘conditional_t’ does not name a type
    56 | using forwarded_type = conditional_t<
    | ^~~~~~~~~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:62:1: error: ‘forwarded_type’ does not name a type
    62 | forwarded_type<T, U> forward_like(U &&u) {
    | ^~~~~~~~~~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:68:22: error: ‘make_caster’ does not name a type; did you mean ‘type_caster’?
    68 | using key_conv = make_caster;
    | ^~~~~~~~~~~
    | type_caster
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:70:15: error: ‘handle’ has not been declared
    70 | bool load(handle src, bool convert) {
    | ^~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:85:12: error: ‘handle’ does not name a type
    85 | static handle cast(T &&src, return_value_policy policy, handle parent) {

What can I do about it?

How to test on CharadesEgo?

Hi,

Thanks for sharing your excellent work!

I plan to develop an activity recognition method with your hand-object module integrated and conduct experiments on CharadesEgo. I noticed from your paper that you have already tested this method on CharadesEgo, but there are no instructions on how to test on CharadesEgo in this repo. Could you give me some advice (especially on how to organize the corresponding folder structure)? Thank you!

Re-train the network

Dear Dandan,

Thanks for presenting this project!

I am trying to train your network with some additional hand images. However, I am not sure how to feed my dataset to the model. Sorry for the simple question; I am a beginner with neural networks.

Many thanks in advance! @ddshan

Need some assistance to Run the Demo

Hi guys! Earlier I was having trouble compiling, which I have now resolved, but I am having trouble understanding how to run demo.py on my own images. I was wondering if I could connect with you to get a deeper understanding of the implementation, please.

How do I define CUDA_PATH?

Hello,

Thank you for your great project! I am trying to build with GPU support, but I get an error since I am using a Conda virtual environment: CUDA_PATH is None. When I set CUDA_PATH to the path under the conda environment, nvcc is not there, and I get the following error:

/home/user_name/miniconda3/pkgs/cudatoolkit-11.0.221-h6bb024c_0/bin/nvcc: not found

CUDA is not available system-wide. Under these circumstances, is it possible to solve this problem?

Thank you and best regards.

Why does the ONNX model exported from the provided weights have no inputs?

I changed the number of model inputs from 5 to 2, with the other 3 inputs created in the model's initialization. But no matter how I modify it, even when I pass the original 5 parameters, the exported ONNX model has no inputs.

Here are the code changes:
```python
class _fasterRCNN(nn.Module):
    """ faster RCNN """
    def __init__(self, classes, class_agnostic):
        super(_fasterRCNN, self).__init__()
        self.classes = classes
        self.n_classes = len(classes)
        self.class_agnostic = class_agnostic
        # loss
        self.RCNN_loss_cls = 0
        self.RCNN_loss_bbox = 0

        # define rpn
        self.RCNN_rpn = _RPN(self.dout_base_model)
        self.RCNN_proposal_target = _ProposalTargetLayer(self.n_classes)

        # self.RCNN_roi_pool = _RoIPooling(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)
        # self.RCNN_roi_align = RoIAlignAvg(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)

        self.RCNN_roi_pool = ROIPool((cfg.POOLING_SIZE, cfg.POOLING_SIZE), 1.0/16.0)
        self.RCNN_roi_align = ROIAlign((cfg.POOLING_SIZE, cfg.POOLING_SIZE), 1.0/16.0, 0)
        self.extension_layer = extension_layers.extension_layer()
        self.gt_boxes = torch.tensor([[[0., 0., 0., 0., 0.]]], device='cuda:0')
        self.num_boxes = torch.tensor([0], device='cuda:0')
        # self.gt_boxes = torch.tensor([[[0., 0., 0., 0., 0.]]])
        # self.num_boxes = torch.tensor([0])
        self.box_info = torch.tensor([[[0., 0., 0., 0., 0.]]])

    def forward(self, im_data, im_info):
        batch_size = im_data.size(0)
        gt_boxes, num_boxes, box_info = self.gt_boxes, self.num_boxes, self.box_info

        im_info = im_info.data
        gt_boxes = gt_boxes.data
        num_boxes = num_boxes.data
        box_info = box_info.data

        # feed image data to base model to obtain base feature map
        base_feat = self.RCNN_base(im_data)
        # feed base feature map to RPN to obtain rois
        rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)

        # if it is the training phase, then use ground-truth bboxes for refining
        if self.training:
            roi_data = self.RCNN_proposal_target(rois, gt_boxes, num_boxes, box_info)
            rois, rois_label, rois_target, rois_inside_ws, rois_outside_ws, box_info = roi_data
            rois_label_retain = Variable(rois_label.long())
            box_info = Variable(box_info)
            rois_label = Variable(rois_label.view(-1).long())
            rois_target = Variable(rois_target.view(-1, rois_target.size(2)))
            rois_inside_ws = Variable(rois_inside_ws.view(-1, rois_inside_ws.size(2)))
            rois_outside_ws = Variable(rois_outside_ws.view(-1, rois_outside_ws.size(2)))
        else:
            rois_label = None
            rois_target = None
            rois_inside_ws = None
            rois_outside_ws = None
            rpn_loss_cls = 0
            rpn_loss_bbox = 0

        rois = Variable(rois)
        rois_padded = Variable(self.enlarge_bbox(im_info, rois, 0.3))

        # do roi pooling based on predicted rois
        if cfg.POOLING_MODE == 'align':
            pooled_feat = self.RCNN_roi_align(base_feat, rois.view(-1, 5))
            pooled_feat_padded = self.RCNN_roi_align(base_feat, rois_padded.view(-1, 5))
        elif cfg.POOLING_MODE == 'pool':
            pooled_feat = self.RCNN_roi_pool(base_feat, rois.view(-1, 5))
            pooled_feat_padded = self.RCNN_roi_pool(base_feat, rois_padded.view(-1, 5))

        # feed pooled features to top model
        pooled_feat = self._head_to_tail(pooled_feat)
        pooled_feat_padded = self._head_to_tail(pooled_feat_padded)

        # compute bbox offset
        bbox_pred = self.RCNN_bbox_pred(pooled_feat)
        if self.training and not self.class_agnostic:
            # select the corresponding columns according to roi labels
            bbox_pred_view = bbox_pred.view(bbox_pred.size(0), int(bbox_pred.size(1) / 4), 4)
            bbox_pred_select = torch.gather(bbox_pred_view, 1, rois_label.view(rois_label.size(0), 1, 1).expand(rois_label.size(0), 1, 4))
            bbox_pred = bbox_pred_select.squeeze(1)

        # compute object classification probability
        cls_score = self.RCNN_cls_score(pooled_feat)
        cls_prob = F.softmax(cls_score, 1)
        # object_feat = pooled_feat[rois_label==1,:]
        # result = self.lineartrial(object_feat)
        # extension layer
        RCNN_loss_cls = 0
        RCNN_loss_bbox = 0
        loss_list = []

        if self.training:
            # classification loss
            RCNN_loss_cls = F.cross_entropy(cls_score, rois_label)

            # bounding box regression L1 loss
            RCNN_loss_bbox = _smooth_l1_loss(bbox_pred, rois_target, rois_inside_ws, rois_outside_ws)

            # auxiliary layer
            # loss_list = self.extension_layer(pooled_feat, pooled_feat_padded, rois_label_retain, box_info)
            l1, l2, l3 = self.extension_layer(pooled_feat, pooled_feat_padded, rois_label_retain, box_info)
        else:
            # loss_list = self.extension_layer(pooled_feat, pooled_feat_padded, None, box_info)
            l1, l2, l3 = self.extension_layer(pooled_feat, pooled_feat_padded, None, box_info)
        cls_prob = cls_prob.view(batch_size, rois.size(1), -1)
        bbox_pred = bbox_pred.view(batch_size, rois.size(1), -1)

        if self.training:
            return rois, cls_prob, bbox_pred, rpn_loss_cls, rpn_loss_bbox, RCNN_loss_cls, RCNN_loss_bbox, rois_label, l1[0], l2[0], l3[0]
        else:
            return rois, cls_prob, bbox_pred, l1[0], l2[0], l3[0]

    def enlarge_bbox(self, im_info, rois, ratio=0.5):
        rois_width, rois_height = (rois[:, :, 3] - rois[:, :, 1]), (rois[:, :, 4] - rois[:, :, 2])
        rois_padded = rois.clone()
        rois_padded[:, :, 1] = rois_padded[:, :, 1] - ratio * rois_width
        rois_padded[:, :, 2] = rois_padded[:, :, 2] - ratio * rois_height
        rois_padded[:, :, 1][rois_padded[:, :, 1] < 0] = 0
        rois_padded[:, :, 2][rois_padded[:, :, 2] < 0] = 0

        rois_padded[:, :, 3] = rois_padded[:, :, 3] + ratio * rois_width
        rois_padded[:, :, 4] = rois_padded[:, :, 4] + ratio * rois_height
        rois_padded[:, :, 3][rois_padded[:, :, 3] > im_info[:, 0]] = im_info[:, 0]
        rois_padded[:, :, 4][rois_padded[:, :, 4] > im_info[:, 1]] = im_info[:, 1]
        return rois_padded

    def _init_weights(self):
        def normal_init(m, mean, stddev, truncated=False):
            """
            weight initializer: truncated normal and random normal.
            """
            # x is a parameter
            if truncated:
                m.weight.data.normal_().fmod_(2).mul_(stddev).add_(mean)  # not a perfect approximation
            else:
                m.weight.data.normal_(mean, stddev)
                m.bias.data.zero_()

        normal_init(self.RCNN_rpn.RPN_Conv, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_rpn.RPN_cls_score, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_rpn.RPN_bbox_pred, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_cls_score, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_bbox_pred, 0, 0.001, cfg.TRAIN.TRUNCATED)

    def create_architecture(self):
        self._init_modules()
        self._init_weights()
```

The following is the code for exporting to ONNX:

```python
import numpy as np
import torch

from demo import _get_image_blob, parse_args
from model.utils.config import cfg, cfg_from_file, cfg_from_list
from model.faster_rcnn.resnet import resnet

cfg_from_file('cfgs/res101.yml')
cfg.USE_GPU_NMS = True

pascal_classes = np.asarray(['background', 'targetobject', 'hand'])
fasterRCNN = resnet(pascal_classes, 101, pretrained=False, class_agnostic=False)

fasterRCNN.create_architecture()

load_name = 'faster_rcnn_1_8_132028.pth'

print("load checkpoint %s" % (load_name))
checkpoint = torch.load(load_name)
fasterRCNN.load_state_dict(checkpoint['model'])
if 'pooling_mode' in checkpoint.keys():
    cfg.POOLING_MODE = checkpoint['pooling_mode']

fasterRCNN.eval()

print('load model successfully!')

im_data = torch.randn(1, 3, 600, 600)
size = im_data.size()
im_info = torch.tensor([[size[2], size[3], 1.1719]])

print(im_data.size(), im_info)

onnx_path = "faster_rcnn.onnx"
torch.onnx.export(fasterRCNN, (im_data, im_info), onnx_path, opset_version=11, verbose=True)
```
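Two things worth checking, both assumptions rather than verified fixes. First, the modified forward calls `im_info.data` and reads `self.gt_boxes` and friends as stored tensors; `.data` detaches a tensor during tracing, so the exporter can constant-fold it and drop the corresponding graph input. Second, naming the inputs explicitly and inspecting the exported graph shows what actually survived:

```python
# sketch: export with explicit input names, then inspect the graph inputs
import onnx

torch.onnx.export(
    fasterRCNN, (im_data, im_info), onnx_path,
    opset_version=11,
    input_names=["im_data", "im_info"],
)
model = onnx.load(onnx_path)
print("graph inputs:", [i.name for i in model.graph.input])
```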

Hand Mesh Reconstruction

I couldn't find the code for the self-consistent hand mesh evaluation network that can distinguish between good and bad meshes. Is it present in one of the other project repositories?

Failing with Cuda Detection

I want to use your code for hand detection in egocentric videos but am failing at the very beginning.
It is unable to find nvcc in the folder; can you help me with this, or at least suggest what to edit in the code?

API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
unable to execute '/is/software/nvidia/cuda-10.2/bin/nvcc': No such file or directory
error: command '/is/software/nvidia/cuda-10.2/bin/nvcc' failed with exit status 1

/is/software ... is a folder in my home directory with several CUDA versions, and I have added it to the PATH too.

Download not working

How can I download the full dataset (3.0T in total)?

[screenshot of the download link]

When I click on this, nothing happens. Any suggestions?
