
ddshan / hand_object_detector


Project and dataset webpage:

Home Page: https://fouheylab.eecs.umich.edu/~dandans/projects/100DOH/

License: MIT License

Python 68.23% MATLAB 0.35% C++ 2.80% Cuda 13.45% Shell 0.12% C 12.37% Cython 2.69%
Topics: cvpr2020, dataset, handobjectdetection, interactiondetection

hand_object_detector's People

Contributors

ddshan



hand_object_detector's Issues

atomicAdd error when running `python setup.py build develop`

I am trying to build faster_rcnn with `python setup.py build develop`. I get this error every time atomicAdd is used:

error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (double *, double)
          detected during instantiation of "void RoIAlignBackwardFeature(int, const T *, int, T, int, int, int, int, int, int, T *, const T *) [with T=double]" 

The final error log is:

Traceback (most recent call last):
  File "setup.py", line 59, in <module>
    setup(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
    super().run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
    super().run_command(command)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
    build_ext.build_extensions(self)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/itet-stor/sruffino/net_scratch/conda_envs/handobjdet/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

My environment matches the one you describe; one possible issue could be the CUDA driver.
Do you have any idea how to fix it?

PS: it builds perfectly when I build without GPU support.
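For reference: `atomicAdd(double*, double)` is only provided natively on GPUs with compute capability 6.0 and above, so this error usually means nvcc is compiling for an older (or unspecified) architecture. A minimal sketch of one workaround, assuming your GPU is in fact sm_60 or newer, is to pin the target architectures before building (the arch list below is an assumption; adjust it to your GPU):

```python
# sketch: set the target CUDA architectures before the extension build,
# e.g. at the top of setup.py or exported from the shell; torch's
# cpp_extension honors TORCH_CUDA_ARCH_LIST when invoking nvcc
import os
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "6.0;7.0;7.5")
# then run: python setup.py build develop
```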

nvidia semantic segmentation auto-label

Hello, I saw that you also asked about auto-labeling under NVIDIA's semantic segmentation project. I have some questions in this regard as well. Would you please leave contact information? I would like to ask you for advice. Thank you!

RGB & BGR

Hi @ddshan

Thanks for your awesome work.

demo.py supports both image input and webcam input. However, you use imread to read images, which returns RGB, and cv2 to read video, which returns BGR. Is this a bug?

```python
# Get image from the webcam
if webcam_num >= 0:
    if not cap.isOpened():
        raise RuntimeError("Webcam could not open. Please check connection.")
    ret, frame = cap.read()
    im_in = np.array(frame)
# Load the demo image
else:
    im_file = os.path.join(args.image_dir, imglist[num_images])
    im_in = np.array(imread(im_file))
    # resize
    # im_in = np.array(Image.fromarray(im_in).resize((640, 360)))
if len(im_in.shape) == 2:
    im_in = im_in[:, :, np.newaxis]
    im_in = np.concatenate((im_in, im_in, im_in), axis=2)
# rgb -> bgr
im = im_in[:, :, ::-1]
```
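For reference, one way to make the two paths consistent is to flip only the path that is actually RGB; a minimal sketch (function and argument names are illustrative, assuming cv2 frames are BGR and imread returns RGB):

```python
import numpy as np

# sketch: the network downstream expects BGR, so only the file path
# (RGB from imread) needs the channel flip; webcam frames from cv2
# are already BGR and can pass through unchanged
def to_model_input(frame_bgr=None, image_rgb=None):
    if frame_bgr is not None:
        return np.asarray(frame_bgr)          # cv2 capture: already BGR
    return np.asarray(image_rgb)[:, :, ::-1]  # imread: RGB -> BGR
```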

Test images with batching

Thanks for sharing the excellent project. I wonder if I can test images in batches directly, or whether I need to modify some files. Thanks again!

How can I make annotations for my own dataset?

Hi @ddshan, thank you for your excellent code. I want to know how to make annotations for my own dataset so that I can fine-tune on it.
I already know how to make bbox annotations for hands and objects, but I don't know how to annotate hand_side, contact_state, and offset. Could you please tell me how to make these annotations? Thank you!
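For anyone with the same question, a heavily hedged sketch of how the extra fields could be derived, based on the paper's description rather than the repo's exact annotation schema (field names and encodings below are assumptions): hand_side is a left/right label, contact_state is one of the paper's five contact categories, and offset links a hand box to its object box via the direction and magnitude between box centers:

```python
import numpy as np

# assumed encodings (illustrative only, not necessarily the repo's values):
#   hand_side:     0 = left, 1 = right
#   contact_state: 0 = none, 1 = self, 2 = other person, 3 = portable, 4 = fixed
def hand_to_object_offset(hand_box, obj_box):
    """Direction and magnitude from hand-box center to object-box center.

    Boxes are [x1, y1, x2, y2] in pixels.
    """
    center = lambda b: np.array([(b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0])
    v = center(obj_box) - center(hand_box)
    mag = float(np.linalg.norm(v))
    unit = v / mag if mag > 0 else np.zeros(2)
    return unit, mag
```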

Size mismatch error with pre-trained model

Hi, I got the error message "size mismatch for extension_layer.hand_contact_state_layer.3.weight: copying a param with shape torch.Size([5, 32]) from checkpoint, the shape in current model is torch.Size([4, 32])" when running your code with the provided pre-trained model. A temporary workaround is to check out a previous version.
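If checking out an older commit is not an option, a sketch of a blunter workaround is to load only the keys whose shapes match. Note this leaves the mismatched contact-state head randomly initialized, so its outputs are meaningless until retrained; `fasterRCNN` and `load_name` are assumed to be set up as in the repo's demo/test scripts:

```python
import torch

# sketch: partial state_dict load that skips shape-mismatched keys
checkpoint = torch.load(load_name)
model_state = fasterRCNN.state_dict()
compatible = {k: v for k, v in checkpoint['model'].items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
fasterRCNN.load_state_dict(model_state)
```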

License

Hi

Thanks for releasing this wonderful work! It works really well.

I want to use part of the code in my own project, but the license seems to be missing. Would you mind adding one if feasible?

Thanks!

Why is the O label hardcoded?

`obj_state = state_map2[int(obj_dets[i, 5])]` will never map to O, and in `draw_obj_mask()` the O label is hardcoded. Why is that?

Cannot run demo

Hi, great work!

I can't run the demo.
Whenever I run the demo, I get this error while loading the checkpoint weights to the network.

*** RuntimeError: CUDA error: unknown error

If I load the weights in Python interactive mode, I have no problem.
Some dependency related to these imports seems to cause the trouble, but I am not sure:

from model.roi_layers import nms
from model.faster_rcnn.vgg16 import vgg16
from model.faster_rcnn.resnet import resnet

Please help me.

RuntimeError: Not compiled with GPU support

Hi @ddshan, I set up the virtual environment based on the instructions in README.md, but when running both demo.py and test_net.py I get RuntimeError: Not compiled with GPU support. I am not sure what is going wrong during installation.

Do you happen to know the root cause?

(handobj_new) kavitshah@frerd001:~/fair/hand_object_detector$ CUDA_VISIBLE_DEVICES=0 python test_net.py --model_name=handobj_100K --save_name=handobj_100K --cuda --checkepoch=8 --checkpoint=132028
Called with args:
Namespace(cfg_file='cfgs/resnet101.yml', checkepoch=8, checkpoint=132028, checksession=1, class_agnostic=False, cuda=True, dataset='pascal_voc', large_scale=False, load_dir='models', mGPUs=False, model_name='handobj_100K', net='res101', parallel_type=0, save_name='handobj_100K', set_cfgs=None, thresh_hand=0.1, thresh_obj=0.1, vis=False)
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2],
 'ANCHOR_SCALES': [8, 16, 32, 64],
 'CROP_RESIZE_WITH_MAX_POOL': False,
 'CUDA': False,
 'DATA_DIR': '~/fair/hand_object_detector/data',
 'DEDUP_BOXES': 0.0625,
 'EPS': 1e-14,
 'EXP_DIR': 'res101',
 'FEAT_STRIDE': [16],
 'GPU_ID': 0,
 'MATLAB': 'matlab',
 'MAX_NUM_GT_BOXES': 20,
 'MOBILENET': {'DEPTH_MULTIPLIER': 1.0,
               'FIXED_LAYERS': 5,
               'REGU_DEPTH': False,
               'WEIGHT_DECAY': 4e-05},
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'POOLING_MODE': 'align',
 'POOLING_SIZE': 7,
 'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False},
 'RNG_SEED': 3,
 'ROOT_DIR': '~/fair/hand_object_detector',
 'TEST': {'BBOX_REG': True,
          'HAS_RPN': True,
          'MAX_SIZE': 1000,
          'MODE': 'nms',
          'NMS': 0.3,
          'PROPOSAL_METHOD': 'gt',
          'RPN_MIN_SIZE': 16,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 300,
          'RPN_PRE_NMS_TOP_N': 6000,
          'RPN_TOP_N': 5000,
          'SCALES': [600],
          'SVM': False},
 'TRAIN': {'ASPECT_GROUPING': False,
           'BATCH_SIZE': 128,
           'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
           'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
           'BBOX_NORMALIZE_TARGETS': True,
           'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
           'BBOX_REG': True,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'BIAS_DECAY': False,
           'BN_TRAIN': False,
           'DISPLAY': 20,
           'DOUBLE_BIAS': False,
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'GAMMA': 0.1,
           'HAS_RPN': True,
           'IMS_PER_BATCH': 1,
           'LEARNING_RATE': 0.001,
           'MAX_SIZE': 1000,
           'MOMENTUM': 0.9,
           'PROPOSAL_METHOD': 'gt',
           'RPN_BATCHSIZE': 256,
           'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
           'RPN_CLOBBER_POSITIVES': False,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 8,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POSITIVE_WEIGHT': -1.0,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 12000,
           'SCALES': [600],
           'SNAPSHOT_ITERS': 5000,
           'SNAPSHOT_KEPT': 3,
           'SNAPSHOT_PREFIX': 'res101_faster_rcnn',
           'STEPSIZE': [30000],
           'SUMMARY_INTERVAL': 180,
           'TRIM_HEIGHT': 600,
           'TRIM_WIDTH': 600,
           'TRUNCATED': False,
           'USE_ALL_GT': True,
           'USE_FLIPPED': True,
           'USE_GT': False,
           'WEIGHT_DECAY': 0.0001},
 'USE_GPU_NMS': True}

--------> dataset path = ~/fair/hand_object_detector/data/VOCdevkit2007_handobj_100K

Loaded dataset `voc_2007_test` for training
Set proposal method: gt
voc_2007_test
Preparing training data...
voc_2007_test gt roidb loaded from ~/fair/hand_object_detector/data/cache_handobj_100K/voc_2007_test_gt_roidb.pkl
done

--------> dataset path = ~/fair/hand_object_detector/data/VOCdevkit2007_handobj_100K

9983 roidb entries

 ---------> which model = models/res101_handobj_100K/pascal_voc/faster_rcnn_1_8_132028.pth

load checkpoint models/res101_handobj_100K/pascal_voc/faster_rcnn_1_8_132028.pth
load model successfully!

---------> det score thres_hand = 0.1


---------> det score thres_obj = 0.1

Traceback (most recent call last):
  File "test_net.py", line 245, in <module>
    rois_label, loss_list = fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/faster_rcnn/faster_rcnn.py", line 57, in forward
    rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/rpn/rpn.py", line 77, in forward
    rois = self.RPN_proposal((rpn_cls_prob.data, rpn_bbox_pred.data,
  File "~/miniconda3/envs/hos/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/fair/hand_object_detector/lib/model/rpn/proposal_layer.py", line 147, in forward
    keep_idx_i = nms(proposals_single, scores_single.squeeze(1), nms_thresh)
RuntimeError: Not compiled with GPU support
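This error is raised by the compiled extension when it was built in an environment where no CUDA toolkit was visible. A diagnostic sketch (not a fix): run this in the environment used for `python setup.py build develop`; if CUDA_HOME resolves to None, the ops compile CPU-only and later raise exactly this RuntimeError:

```python
# diagnostic sketch: check what the extension build can see
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch version :", torch.__version__)
print("cuda available:", torch.cuda.is_available())
print("CUDA_HOME     :", CUDA_HOME)  # None => extension builds CPU-only
```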

RuntimeError: CUDA error: device-side assert triggered (can't train the model if batch is not 1)

It seems that batch_size can only be 1.
When I set batch_size = 4 or 8 during training, this error occurs:

Traceback (most recent call last):
  File "trainval_net.py", line 321, in <module>
    rois_label, loss_list = fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/Hand-Object-Interaction-detection/lib/model/faster_rcnn/faster_rcnn.py", line 62, in forward
    roi_data = self.RCNN_proposal_target(rois, gt_boxes, num_boxes, box_info)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/Hand-Object-Interaction-detection/lib/model/rpn/proposal_target_layer_cascade.py", line 52, in forward
    rois_per_image, self._num_classes, box_info)
  File "/content/Hand-Object-Interaction-detection/lib/model/rpn/proposal_target_layer_cascade.py", line 146, in _sample_rois_pytorch
    fg_inds = torch.nonzero(max_overlaps[i] >= cfg.TRAIN.FG_THRESH).view(-1)
RuntimeError: CUDA error: device-side assert triggered
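Device-side asserts are reported asynchronously, so the traceback often points at a later, unrelated op. A debugging sketch (not a fix): force synchronous kernel launches so the stack trace lands on the op that actually failed; with batch sizes above 1, an out-of-range index during ROI sampling is a plausible, though unverified, culprit:

```python
# sketch: put this at the very top of trainval_net.py, before anything touches CUDA
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # synchronous launches -> accurate traceback
```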

Questions about AP on H+O

Hello, I ran the model on the test set of the 100K dataset, but the handobj_100K AP on H+O only reaches about 41 rather than the reported 46.9. The other APs match what is shown in the table and the paper. Have I done anything wrong or missed anything?

Run time for extracting images.

I used demo.py to process a set of 54 test images. On my CPU the process takes about 2 seconds per image; on my A40, for some reason, it takes 8 seconds per image.

I'm only interested in the detection values, not the image generation. Is this a normal extraction time per image? I was curious whether the rate should be faster and whether I'm doing something wrong, as it would take about 14 months to process the Something-Something V2 dataset at this rate.
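A GPU running slower than the CPU usually means the measurement includes one-time setup or the data never reached the device. A timing sketch (variable names are taken from demo.py, so treat this as illustrative) that warms up first and synchronizes around the forward pass:

```python
import time
import torch

# sketch: time only the forward pass, after one warm-up call; without
# torch.cuda.synchronize() the timer measures kernel launches, not execution
with torch.no_grad():
    fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)  # warm-up
    torch.cuda.synchronize()
    t0 = time.time()
    fasterRCNN(im_data, im_info, gt_boxes, num_boxes, box_info)
    torch.cuda.synchronize()
print(f"{time.time() - t0:.3f} s per image (model only)")
```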

fatal error: ATen/ceil_div.h: No such file or directory

  1. I got this error executing `python setup.py build develop`:
    ... something else ...
    fatal error: ATen/ceil_div.h: No such file or directory
    #include <ATen/ceil_div.h>
    | ^~~~~~~~~~~~~~~~~
    compilation terminated.
    error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

My environment: Ubuntu 22.04, GCC 11.4.0. Installing opencv-python via "pip install -r requirements.txt" failed, so I manually installed opencv-python 4.3.0.38 in the Python 3.6 environment.

  2. I tried to copy the missing "ceil_div.h" from other places in the anaconda3 directory; it stopped reporting missing files, but ran into a lot of other problems, and they are weird:
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:56:24: error: ‘conditional_t’ does not name a type
    56 | using forwarded_type = conditional_t<
    | ^~~~~~~~~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:62:1: error: ‘forwarded_type’ does not name a type
    62 | forwarded_type<T, U> forward_like(U &&u) {
    | ^~~~~~~~~~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:68:22: error: ‘make_caster’ does not name a type; did you mean ‘type_caster’?
    68 | using key_conv = make_caster;
    | ^~~~~~~~~~~
    | type_caster
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:70:15: error: ‘handle’ has not been declared
    70 | bool load(handle src, bool convert) {
    | ^~~~~~
    /home/zjt/anaconda3/envs/handobj/lib/python3.6/site-packages/torch/lib/include/pybind11/stl.h:85:12: error: ‘handle’ does not name a type
    85 | static handle cast(T &&src, return_value_policy policy, handle parent) {

What can I do about it?

How to test on CharadesEgo?

Hi,

Thanks for sharing your excellent work!

I plan to develop an activity recognition method with your hand-object module integrated and conduct experiments on CharadesEgo. I noticed from your paper that you have already tested this method on CharadesEgo, but there are no instructions on how to test on CharadesEgo in this repo. Could you give me some advice (especially on how to organize the corresponding folder structure)? Thank you!

Re-train the network

Dear Dandan,

Thanks for presenting this project!

I am trying to train your network with some additional hand images. However, I am not sure how to feed my dataset to the model. Sorry for the simple question; I am a beginner with neural networks.

Many thanks in advance! @ddshan

Need some assistance to Run the Demo

Hi guys! Earlier I was having trouble compiling, which I have now resolved, but I am having trouble understanding how to run demo.py on my own images. I was wondering if I could connect with you to get a deeper understanding of the implementation, please.

How do I define CUDA_PATH?

Hello,

Thank you for your great project! I am trying to build with GPU support, but I get an error since I am using a Conda virtual environment: CUDA_PATH is None. When I set CUDA_PATH to the path under the conda environment, nvcc is not there, and I get the following error:

/home/user_name/miniconda3/pkgs/cudatoolkit-11.0.221-h6bb024c_0/bin/nvcc: not found

CUDA is not available system-wide. Under these circumstances, is it possible to solve this problem?

Thank you and best regards.

Why does the ONNX model exported from the provided weights have no inputs?

I changed the number of model inputs from 5 to 2, with the other 3 inputs created in the model's initialization. But no matter how I modify it, even when I pass the original 5 parameters, the exported ONNX model has no inputs.

Here are the code changes:
```python
class _fasterRCNN(nn.Module):
    """ faster RCNN """
    def __init__(self, classes, class_agnostic):
        super(_fasterRCNN, self).__init__()
        self.classes = classes
        self.n_classes = len(classes)
        self.class_agnostic = class_agnostic
        # loss
        self.RCNN_loss_cls = 0
        self.RCNN_loss_bbox = 0

        # define rpn
        self.RCNN_rpn = _RPN(self.dout_base_model)
        self.RCNN_proposal_target = _ProposalTargetLayer(self.n_classes)

        # self.RCNN_roi_pool = _RoIPooling(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)
        # self.RCNN_roi_align = RoIAlignAvg(cfg.POOLING_SIZE, cfg.POOLING_SIZE, 1.0/16.0)

        self.RCNN_roi_pool = ROIPool((cfg.POOLING_SIZE, cfg.POOLING_SIZE), 1.0/16.0)
        self.RCNN_roi_align = ROIAlign((cfg.POOLING_SIZE, cfg.POOLING_SIZE), 1.0/16.0, 0)
        self.extension_layer = extension_layers.extension_layer()
        self.gt_boxes = torch.tensor([[[0., 0., 0., 0., 0.]]], device='cuda:0')
        self.num_boxes = torch.tensor([0], device='cuda:0')
        # self.gt_boxes = torch.tensor([[[0., 0., 0., 0., 0.]]])
        # self.num_boxes = torch.tensor([0])
        self.box_info = torch.tensor([[[0., 0., 0., 0., 0.]]])

    def forward(self, im_data, im_info):
        batch_size = im_data.size(0)
        gt_boxes, num_boxes, box_info = self.gt_boxes, self.num_boxes, self.box_info

        im_info = im_info.data
        gt_boxes = gt_boxes.data
        num_boxes = num_boxes.data
        box_info = box_info.data

        # feed image data to base model to obtain base feature map
        base_feat = self.RCNN_base(im_data)
        # feed base feature map to RPN to obtain rois
        rois, rpn_loss_cls, rpn_loss_bbox = self.RCNN_rpn(base_feat, im_info, gt_boxes, num_boxes)

        # if it is the training phase, then use ground-truth bboxes for refining
        if self.training:
            roi_data = self.RCNN_proposal_target(rois, gt_boxes, num_boxes, box_info)
            rois, rois_label, rois_target, rois_inside_ws, rois_outside_ws, box_info = roi_data
            rois_label_retain = Variable(rois_label.long())
            box_info = Variable(box_info)
            rois_label = Variable(rois_label.view(-1).long())
            rois_target = Variable(rois_target.view(-1, rois_target.size(2)))
            rois_inside_ws = Variable(rois_inside_ws.view(-1, rois_inside_ws.size(2)))
            rois_outside_ws = Variable(rois_outside_ws.view(-1, rois_outside_ws.size(2)))
        else:
            rois_label = None
            rois_target = None
            rois_inside_ws = None
            rois_outside_ws = None
            rpn_loss_cls = 0
            rpn_loss_bbox = 0

        rois = Variable(rois)
        rois_padded = Variable(self.enlarge_bbox(im_info, rois, 0.3))

        # do roi pooling based on predicted rois
        if cfg.POOLING_MODE == 'align':
            pooled_feat = self.RCNN_roi_align(base_feat, rois.view(-1, 5))
            pooled_feat_padded = self.RCNN_roi_align(base_feat, rois_padded.view(-1, 5))
        elif cfg.POOLING_MODE == 'pool':
            pooled_feat = self.RCNN_roi_pool(base_feat, rois.view(-1, 5))
            pooled_feat_padded = self.RCNN_roi_pool(base_feat, rois_padded.view(-1, 5))

        # feed pooled features to top model
        pooled_feat = self._head_to_tail(pooled_feat)
        pooled_feat_padded = self._head_to_tail(pooled_feat_padded)

        # compute bbox offset
        bbox_pred = self.RCNN_bbox_pred(pooled_feat)
        if self.training and not self.class_agnostic:
            # select the corresponding columns according to roi labels
            bbox_pred_view = bbox_pred.view(bbox_pred.size(0), int(bbox_pred.size(1) / 4), 4)
            bbox_pred_select = torch.gather(bbox_pred_view, 1, rois_label.view(rois_label.size(0), 1, 1).expand(rois_label.size(0), 1, 4))
            bbox_pred = bbox_pred_select.squeeze(1)

        # compute object classification probability
        cls_score = self.RCNN_cls_score(pooled_feat)
        cls_prob = F.softmax(cls_score, 1)
        # object_feat = pooled_feat[rois_label==1,:]
        # result = self.lineartrial(object_feat)
        # extension layer
        RCNN_loss_cls = 0
        RCNN_loss_bbox = 0
        loss_list = []

        if self.training:
            # classification loss
            RCNN_loss_cls = F.cross_entropy(cls_score, rois_label)

            # bounding box regression L1 loss
            RCNN_loss_bbox = _smooth_l1_loss(bbox_pred, rois_target, rois_inside_ws, rois_outside_ws)

            # auxiliary layer
            # loss_list = self.extension_layer(pooled_feat, pooled_feat_padded, rois_label_retain, box_info)
            l1, l2, l3 = self.extension_layer(pooled_feat, pooled_feat_padded, rois_label_retain, box_info)
        else:
            # loss_list = self.extension_layer(pooled_feat, pooled_feat_padded, None, box_info)
            l1, l2, l3 = self.extension_layer(pooled_feat, pooled_feat_padded, None, box_info)
        cls_prob = cls_prob.view(batch_size, rois.size(1), -1)
        bbox_pred = bbox_pred.view(batch_size, rois.size(1), -1)

        if self.training:
            return rois, cls_prob, bbox_pred, rpn_loss_cls, rpn_loss_bbox, RCNN_loss_cls, RCNN_loss_bbox, rois_label, l1[0], l2[0], l3[0]
        else:
            return rois, cls_prob, bbox_pred, l1[0], l2[0], l3[0]

    def enlarge_bbox(self, im_info, rois, ratio=0.5):
        rois_width, rois_height = (rois[:, :, 3] - rois[:, :, 1]), (rois[:, :, 4] - rois[:, :, 2])
        rois_padded = rois.clone()
        rois_padded[:, :, 1] = rois_padded[:, :, 1] - ratio * rois_width
        rois_padded[:, :, 2] = rois_padded[:, :, 2] - ratio * rois_height
        rois_padded[:, :, 1][rois_padded[:, :, 1] < 0] = 0
        rois_padded[:, :, 2][rois_padded[:, :, 2] < 0] = 0

        rois_padded[:, :, 3] = rois_padded[:, :, 3] + ratio * rois_width
        rois_padded[:, :, 4] = rois_padded[:, :, 4] + ratio * rois_height
        rois_padded[:, :, 3][rois_padded[:, :, 3] > im_info[:, 0]] = im_info[:, 0]
        rois_padded[:, :, 4][rois_padded[:, :, 4] > im_info[:, 1]] = im_info[:, 1]
        return rois_padded

    def _init_weights(self):
        def normal_init(m, mean, stddev, truncated=False):
            """
            weight initializer: truncated normal and random normal.
            """
            # x is a parameter
            if truncated:
                m.weight.data.normal_().fmod_(2).mul_(stddev).add_(mean)  # not a perfect approximation
            else:
                m.weight.data.normal_(mean, stddev)
                m.bias.data.zero_()

        normal_init(self.RCNN_rpn.RPN_Conv, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_rpn.RPN_cls_score, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_rpn.RPN_bbox_pred, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_cls_score, 0, 0.01, cfg.TRAIN.TRUNCATED)
        normal_init(self.RCNN_bbox_pred, 0, 0.001, cfg.TRAIN.TRUNCATED)

    def create_architecture(self):
        self._init_modules()
        self._init_weights()
```

The following is the code for exporting to ONNX:

```python
import numpy as np
import torch

from demo import _get_image_blob, parse_args
from model.utils.config import cfg, cfg_from_file, cfg_from_list
from model.faster_rcnn.resnet import resnet

cfg_from_file('cfgs/res101.yml')
cfg.USE_GPU_NMS = True

pascal_classes = np.asarray(['background', 'targetobject', 'hand'])
fasterRCNN = resnet(pascal_classes, 101, pretrained=False, class_agnostic=False)

fasterRCNN.create_architecture()

load_name = 'faster_rcnn_1_8_132028.pth'

print("load checkpoint %s" % (load_name))
checkpoint = torch.load(load_name)
fasterRCNN.load_state_dict(checkpoint['model'])
if 'pooling_mode' in checkpoint.keys():
    cfg.POOLING_MODE = checkpoint['pooling_mode']

fasterRCNN.eval()

print('load model successfully!')

im_data = torch.randn(1, 3, 600, 600)
size = im_data.size()
im_info = torch.tensor([[size[2], size[3], 1.1719]])

print(im_data.size(), im_info)

onnx_path = "faster_rcnn.onnx"
torch.onnx.export(fasterRCNN, (im_data, im_info), onnx_path, opset_version=11, verbose=True)
```
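Two things worth checking, both assumptions rather than verified fixes. First, the modified forward calls `im_info.data` and reads `self.gt_boxes` and friends as stored tensors; `.data` detaches a tensor during tracing, so the exporter can constant-fold it and drop the corresponding graph input. Second, naming the inputs explicitly and inspecting the exported graph shows what actually survived:

```python
# sketch: export with explicit input names, then inspect the graph inputs
import onnx

torch.onnx.export(
    fasterRCNN, (im_data, im_info), onnx_path,
    opset_version=11,
    input_names=["im_data", "im_info"],
)
model = onnx.load(onnx_path)
print("graph inputs:", [i.name for i in model.graph.input])
```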

Hand Mesh Reconstruction

I couldn't find the code for the self-consistent hand mesh evaluation network that can distinguish between good and bad meshes. Is it present in one of the other project repositories?

Failing with Cuda Detection

I want to use your code for hand detection in egocentric videos but am failing at the very beginning.
It is unable to find nvcc in the folder; can you help me with this, or at least suggest what to edit in the code?

API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
unable to execute '/is/software/nvidia/cuda-10.2/bin/nvcc': No such file or directory
error: command '/is/software/nvidia/cuda-10.2/bin/nvcc' failed with exit status 1

/is/software ... is a folder in my home directory with several CUDA versions, and I have added it to the PATH too.

Download not working

How can I download the full dataset (3.0T in total)?

[screenshot of the download link]

When I click on this, nothing happens. Any suggestions?
