
Comments (4)

zerchen commented on July 2, 2024

Hi,

Sorry for the slow response.
The images in Fig. 2, Fig. 7, Fig. C.2 and Fig. C.3 are taken directly from the original DexYCB images and are only used for visualization. In practice, the cropped images are fed into the networks.
You are right. In the AlignSDF paper, I place the hand wrist at the center of the cropped image and crop a 480x480 region out of the 640x480 frame. Finally, I resize the crop to 256x256 and feed it into the network.
In the latest gSDF work, I use another strategy to crop images and find it to be a better solution. The code will be ready before the main CVPR conference. Here I attach some relevant preprocessing code (the derived bbox will be used to crop the images) for reference. Hope it helps.

Best,
Zerui

import numpy as np
import torch
import os
import os.path as osp
from tqdm import tqdm
from fire import Fire
import json
import pickle
import shutil
import trimesh
from scipy.spatial import cKDTree as KDTree
import sys
from glob import glob
import cv2
import lmdb
sys.path.insert(0, '../common')
from mano.manolayer import ManoLayer
from utils.img_utils import generate_patch_image, process_bbox
sys.path.insert(0, '..')
from datasets.dexycb.toolkit.factory import get_dataset
from datasets.dexycb.toolkit.dex_ycb import _SUBJECTS, _SERIALS


def preprocess(data_root='../datasets/dexycb', split='s0', mode='test', side='right'):
    sdf_data_root = os.path.join(data_root, 'data', 'sdf_data')
    if mode == 'test':
        hand_mesh_data_root = os.path.join(data_root, 'data', 'mesh_data', 'mesh_hand')
        obj_mesh_data_root = os.path.join(data_root, 'data', 'mesh_data', 'mesh_obj')
        os.makedirs(hand_mesh_data_root, exist_ok=True)
        os.makedirs(obj_mesh_data_root, exist_ok=True)

    dataset_name = f'{split}_{mode}'
    dataset = get_dataset(dataset_name)
    selected_ids = []

    with open(f'{data_root}/dexycb_{split}_{mode}_t.json', 'w') as json_data:
        coco_file = dict()
        data_images = []
        data_annos = []

        for i in tqdm(range(len(dataset))):
            sample = dataset[i]
            img_info = dict()
            anno_info = dict()
            sample_id = i

            if sample['mano_side'] in side:
                img_path = sample['color_file']
                subject = int(img_path.split('/')[6].split('-')[-1])
                video_id = img_path.split('/')[7]
                sub_video_id = img_path.split('/')[8]
                frame_idx = int(img_path.split('/')[-1].split('_')[-1].split('.')[0])

                if frame_idx % 5 != 0:
                    continue

                img_info['id'] = sample_id
                img_info['file_name'] = '_'.join([str(subject), video_id, sub_video_id, str(frame_idx)])

                if (os.path.exists(os.path.join(sdf_data_root, 'norm', img_info['file_name'] + '.npz')) and mode == 'train') or mode == 'test':
                    anno_info['id'] = sample_id
                    anno_info['image_id'] = sample_id
                    anno_info['fx'] = sample['intrinsics']['fx']
                    anno_info['fy'] = sample['intrinsics']['fy']
                    anno_info['cx'] = sample['intrinsics']['ppx']
                    anno_info['cy'] = sample['intrinsics']['ppy']

                    label = np.load(sample['label_file'])
                    mano_layer = ManoLayer(flat_hand_mean=False, ncomps=45, side=sample['mano_side'], mano_root='../common/mano/assets/', use_pca=True)
                    betas = torch.tensor(sample['mano_betas'], dtype=torch.float32).unsqueeze(0)
                    hand_verts, _, hand_poses, _, _ = mano_layer(torch.from_numpy(label['pose_m'][:, 0:48]), betas, torch.from_numpy(label['pose_m'][:, 48:51]))
                    hand_verts = hand_verts[0].numpy().tolist()

                    anno_info['hand_joints_3d'] = label['joint_3d'][0].tolist()
                    anno_info['hand_poses'] = hand_poses[0].numpy().tolist()
                    anno_info['hand_trans'] = label['pose_m'][0, 48:].tolist()
                    anno_info['hand_shapes'] = sample['mano_betas']

                    grasp_obj_id = sample['ycb_ids'][sample['ycb_grasp_ind']]
                    anno_info['ycb_id'] = grasp_obj_id
                    obj_rest_mesh = trimesh.load(dataset.obj_file[grasp_obj_id], process=False)
                    offset = (obj_rest_mesh.vertices.min(0) + obj_rest_mesh.vertices.max(0)) / 2
                    obj_rest_corners = trimesh.bounds.corners(obj_rest_mesh.bounds) - offset
                    pose_y = label['pose_y'][sample['ycb_grasp_ind']]
                    R, t = pose_y[:3, :3], pose_y[:, 3:]
                    new_t = R @ offset.reshape(3, 1) + t
                    obj_affine_transform = np.concatenate([np.concatenate([R, new_t], axis=1), np.array([[0.0, 0.0, 0.0, 1.0]], dtype=np.float32)])
                    obj_corners = (R @ obj_rest_corners.transpose(1, 0) + new_t).transpose(1, 0)

                    homo_obj_verts = np.ones((obj_rest_mesh.vertices.shape[0], 4))
                    homo_obj_verts[:, :3] = obj_rest_mesh.vertices
                    obj_verts = np.dot(pose_y, homo_obj_verts.transpose(1, 0)).transpose(1, 0)
                    anno_info['obj_transform'] = obj_affine_transform.tolist()
                    anno_info['obj_center_3d'] = new_t.squeeze().tolist()
                    anno_info['obj_corners_3d'] = obj_corners.tolist()
                    anno_info['obj_rest_corners_3d'] = obj_rest_corners.tolist()
                
                    hand_joints_2d = label['joint_2d'][0]
                    # skip frames where the hand is not annotated (all 2D joints are -1)
                    if np.all(hand_joints_2d == -1.0):
                        continue
                    obj_corners_2d = np.zeros((8, 2))
                    obj_corners_2d[:, 0] = anno_info['fx'] * obj_corners[:, 0] / obj_corners[:, 2] + anno_info['cx']
                    obj_corners_2d[:, 1] = anno_info['fy'] * obj_corners[:, 1] / obj_corners[:, 2] + anno_info['cy']
                    tl = np.min(np.concatenate([hand_joints_2d, obj_corners_2d], axis=0), axis=0)
                    br = np.max(np.concatenate([hand_joints_2d, obj_corners_2d], axis=0), axis=0)
                    box_size = br - tl
                    bbox = np.concatenate([tl-10, box_size+20],axis=0)
                    bbox = process_bbox(bbox)
                    anno_info['bbox'] = bbox.tolist()

                    if mode == 'train':
                        sdf_norm_data = np.load(os.path.join(sdf_data_root, 'norm', img_info['file_name'] + '.npz'))
                        anno_info['sdf_scale'] = sdf_norm_data['scale'].tolist()
                        anno_info['sdf_offset'] = sdf_norm_data['offset'].tolist()
                    elif mode == 'test':
                        hand_points_kd_tree = KDTree(hand_verts)
                        obj2hand_distances, _ = hand_points_kd_tree.query(obj_verts)
                        if obj2hand_distances.min() > 0.005:
                            continue
                        
                        hand_faces = np.load('../common/mano/assets/closed_fmano.npy')
                        hand_mesh = trimesh.Trimesh(vertices=hand_verts, faces=hand_faces)
                        obj_faces = obj_rest_mesh.faces
                        obj_mesh = trimesh.Trimesh(vertices=obj_verts, faces=obj_faces)
                        mesh_filename = '_'.join([str(subject), video_id, sub_video_id, str(frame_idx)]) + '.obj'
                        hand_mesh.export(os.path.join(hand_mesh_data_root, mesh_filename))
                        obj_mesh.export(os.path.join(obj_mesh_data_root, mesh_filename))
                    
                    selected_ids.append(str(sample_id).rjust(8, '0'))
                    data_images.append(img_info)
                    data_annos.append(anno_info)

        coco_file['images'] = data_images
        coco_file['annotations'] = data_annos
        json.dump(coco_file, json_data, indent=2)

        with open(f'{data_root}/splits/{split}_{mode}.json', 'w') as f:
            json.dump(selected_ids, f, indent=2)
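
For reference, below is a minimal sketch of the wrist-centered crop described above. This is only an illustration (not the actual AlignSDF code) and assumes the frame is a 480x640 image array and the wrist is the first entry of the (21, 2) joint_2d array:

import cv2
import numpy as np


def wrist_centered_crop(img, joints_2d, crop_size=480, out_size=256):
    # img: HxWx3 array (480x640 for DexYCB); joints_2d: (21, 2) pixel coordinates,
    # with the wrist assumed to be joint index 0
    h, w = img.shape[:2]
    wrist_x, wrist_y = joints_2d[0]
    # place the crop window so the wrist sits at its center, clamped to the image bounds
    x0 = int(np.clip(round(wrist_x - crop_size / 2), 0, max(w - crop_size, 0)))
    y0 = int(np.clip(round(wrist_y - crop_size / 2), 0, max(h - crop_size, 0)))
    patch = img[y0:y0 + crop_size, x0:x0 + crop_size]
    # resize the square crop to the network input resolution
    return cv2.resize(patch, (out_size, out_size), interpolation=cv2.INTER_LINEAR)

Since the crop spans the full 480-pixel image height, only horizontal centering on the wrist is possible, and the window is clamped near the left and right borders. With the variables from the snippet above, usage would look like wrist_centered_crop(cv2.imread(sample['color_file']), label['joint_2d'][0]).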


JackMa-coder commented on July 2, 2024

Thanks for your detailed reply. I have followed your latest work gSDF, which makes many outstanding contributions. When I read the code snippets above, I found that 'from utils.img_utils import process_bbox' is not included in the AlignSDF code.
Thanks for your selfless help.


zerchen commented on July 2, 2024

Hi,

Thanks for mentioning this issue. I attach the relevant code here for your reference.

Best,
Zerui

import numpy as np
import cv2


def gen_trans_from_patch_cv(c_x, c_y, src_width, src_height, dst_width, dst_height, scale, rot, inv=False):
    """
    @description: Modified from https://github.com/mks0601/3DMPPE_ROOTNET_RELEASE/blob/master/data/dataset.py.
                  get affine transform matrix
    ---------
    @param: image center, original image size, desired image size, scale factor, rotation degree, whether to get inverse transformation.
    -------
    @Returns: affine transformation matrix
    -------
    """

    def rotate_2d(pt_2d, rot_rad):
        x = pt_2d[0]
        y = pt_2d[1]
        sn, cs = np.sin(rot_rad), np.cos(rot_rad)
        xx = x * cs - y * sn
        yy = x * sn + y * cs
        return np.array([xx, yy], dtype=np.float32)

    # augment size with scale
    src_w = src_width * scale
    src_h = src_height * scale
    src_center = np.array([c_x, c_y], dtype=np.float32)

    # augment rotation
    rot_rad = np.pi * rot / 180
    src_downdir = rotate_2d(np.array([0, src_h * 0.5], dtype=np.float32), rot_rad)
    src_rightdir = rotate_2d(np.array([src_w * 0.5, 0], dtype=np.float32), rot_rad)

    dst_w = dst_width
    dst_h = dst_height
    dst_center = np.array([dst_w * 0.5, dst_h * 0.5], dtype=np.float32)
    dst_downdir = np.array([0, dst_h * 0.5], dtype=np.float32)
    dst_rightdir = np.array([dst_w * 0.5, 0], dtype=np.float32)

    src = np.zeros((3, 2), dtype=np.float32)
    src[0, :] = src_center
    src[1, :] = src_center + src_downdir
    src[2, :] = src_center + src_rightdir

    dst = np.zeros((3, 2), dtype=np.float32)
    dst[0, :] = dst_center
    dst[1, :] = dst_center + dst_downdir
    dst[2, :] = dst_center + dst_rightdir

    if inv:
        trans = cv2.getAffineTransform(np.float32(dst), np.float32(src))
    else:
        trans = cv2.getAffineTransform(np.float32(src), np.float32(dst))

    return trans


def generate_patch_image(cvimg, bbox, input_shape, scale, rot):
    """
    @description: Modified from https://github.com/mks0601/3DMPPE_ROOTNET_RELEASE/blob/master/data/dataset.py.
                  generate the patch image from the bounding box and other parameters.
    ---------
    @param: input image, bbox (x1, y1, w, h), destination image shape, scale factor, rotation in degrees.
    -------
    @Returns: processed image, affine_transform matrix to get the processed image.
    -------
    """

    img = cvimg.copy()
    img_height, img_width, _ = img.shape

    bb_c_x = float(bbox[0] + 0.5 * bbox[2])
    bb_c_y = float(bbox[1] + 0.5 * bbox[3])
    bb_width = float(bbox[2])
    bb_height = float(bbox[3])

    trans = gen_trans_from_patch_cv(bb_c_x, bb_c_y, bb_width, bb_height, input_shape[1], input_shape[0], scale, rot, inv=False)
    img_patch = cv2.warpAffine(img, trans, (int(input_shape[1]), int(input_shape[0])), flags=cv2.INTER_LINEAR)
    new_trans = np.zeros((3, 3), dtype=np.float32)
    new_trans[:2, :] = trans
    new_trans[2, 2] = 1

    return img_patch, new_trans

def merge_handobj_bbox(hand_bbox, obj_bbox):
    # inputs are xyxy boxes; the merged box is returned as (x, y, w, h)
    tl = np.min(np.concatenate([hand_bbox.reshape((2, 2)), obj_bbox.reshape((2, 2))], axis=0), axis=0)
    br = np.max(np.concatenate([hand_bbox.reshape((2, 2)), obj_bbox.reshape((2, 2))], axis=0), axis=0)
    box_size = br - tl
    bbox = np.concatenate([tl, box_size], axis=0)

    return bbox
    

def process_bbox(bbox):
    # aspect ratio preserving bbox
    w = bbox[2]
    h = bbox[3]
    c_x = bbox[0] + w / 2.
    c_y = bbox[1] + h / 2.
    aspect_ratio = 1.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio
    bbox[2] = w * 1.25
    bbox[3] = h * 1.25
    bbox[0] = c_x - bbox[2] / 2.
    bbox[1] = c_y - bbox[3] / 2.

    return bbox
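
As a rough usage sketch (reusing img_path and anno_info from the preprocessing snippet above, assuming a 256x256 network input and no augmentation; the exact call site in gSDF may differ), the derived bbox would be combined with these helpers as follows:

import cv2
import numpy as np

# anno_info['bbox'] is the square, 1.25x-padded (x, y, w, h) box stored by preprocess()
img = cv2.imread(img_path)                                  # original 640x480 DexYCB frame
bbox = np.array(anno_info['bbox'], dtype=np.float32)
img_patch, trans = generate_patch_image(img, bbox, (256, 256), scale=1.0, rot=0.0)
# img_patch: the 256x256 crop that is fed to the network
# trans: 3x3 affine matrix mapping original-image pixel coordinates into the patch

The returned trans matrix can also be used to map 2D annotations (e.g., the projected hand joints) into the cropped patch.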


JackMa-coder commented on July 2, 2024

Thank you for your patient reply. Wishing you success in your research.
