Giter Site home page Giter Site logo

multiview_preprocessing's Introduction

Multiview Camera Preprocessing Scripts

Requirements

  1. Run pip install -r requirements.txt

  2. (Optional for checking camera pose) Install PyTorch (1.7.1, 1.9.0 tested)

  3. (Optional for generating camera poses) COLMAP

  4. Foreground image generation BackgroundMattingV2

Usage

  1. Download data and extract

  2. Run sh extract_frame.sh [data_root_directory] [output_directory] [intrinsics_folder]

  3. Use the precalibrated camera poses or generate your own (see below).

    (Optional) Acquire camera pose with COLMAP using python get_pose.py [scene_directory] --scene_ids [desired scene ids]

    For example, python get_pose.py [output_directory]/test/2_7k/2_8/ --scene_ids 18 to process only scene018 or omit --scene_ids to process all scenes

    Might need to check COLMAP results to see if camera poses are reasonable (looks like a grid). If not, consider supplying a pose prior as follows.

    Run python get_pose_with_prior.py --root_dir [images_folder] --prior_path [poses_bounds_npy] --extension [jpg_or_png]

    For example, python get_pose_with_prior.py --root_dir [output_directory]/test/2_7k/2_8/scene018/images --prior_path scene018_pb.npy --extension .jpg to acquire pose for scene018 using a pose prior stored in the npy file.

  4. Generate background images with sh gen_bg.sh [output_directory]

  5. Generate foreground images (requires BackgroundMattingV2)

    Drop gen_fg.py and gen_fg.sh into BackgroundMattingV2's folder and run sh gen_fg.sh [ckpt_path] [resnet101 | resnet50] [output_directory]

  6. Compress images with sh compress_data.sh [output_directory] [compressed_output_directory]

  7. Collect the poses with sh collect_poses.sh [output_directory] [compressed_output_directory]

Dataset usage

The camera poses are stored in LLFF convention in npy format.

E.g. camera poses are stored in shape (#views, 17) where the camera parameters are flattened.

It can be converted to intrinsic and world-to-camera matrix with:

import numpy as np

def pose2mat(pose):
    """Convert pose matrix (3x5) to extrinsic matrix (4x4) and
       intrinsic matrix (3x3)
    
    Args:
        pose: 3x5 pose matrix
    Returns:
        Extrinsic matrix (4x4) and intrinsic matrix (3x3)
    """
    extrinsic = np.eye(4)
    extrinsic[:3, :] = pose[:, :4]
    h, w, focal_length = pose[:, 4]
    intrinsic = np.array([[focal_length, 0, w/2],
                        [0, focal_length, h/2],
                        [0,            0,   1]])

    return extrinsic, intrinsic

def convert_llff(pose):
    """Convert LLFF poses to OpenCV convention (w2c extrinsic and hwf)
    """
    hwf = pose[:3, 4:] 

    ext = np.eye(4)

    ext[:3, :4] = pose[:3, :4]

    ext = np.concatenate([ext[:, 1:2], 
                        ext[:, 0:1], 
                        -ext[:, 2:3], 
                        ext[:, 3:4]], axis=1)

    mat = np.linalg.inv(ext)

    return np.concatenate([mat[:3, :4], hwf], -1)
input_poses = np.load('scene000_pb.npy')
pose = input_poses[0, :-2].reshape([3, 5])
bounds = input_poses[0, -2:]
w2c, k = pose2mat(convert_llff(pose))
print(w2c.shape)
print(k.shape)

The image data after compression is stored in h5 format with keys:

rgb: original RGB frames in shape (#views, #frames, height, width, 3)

fg_rgb: processed foreground RGB frames in shape (#views, #frames, height, width, 4)

bg_rgb: median-filtered background RGB images in shape (#views, height, width, 3)

Example script:

import h5py
with h5py.File('scene000.h5', 'r') as hf:
    rgb = hf['rgb'] # (#views, #frames, height, width, 3)

Troubleshooting

Sometimes the last frame could be corrupted when doing frame extraction. Rerun undistort_opencv.py with argument --fix_last to fix it.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.