sunset1995 / horizonnet


Pytorch implementation of HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation.

Home Page: https://sunset1995.github.io/HorizonNet/

License: MIT License

horizonnet room-layout computer-vision cvpr2019 360-photo pano-stretch-augmentation


HorizonNet

This is the implementation of our CVPR'19 paper "HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation" (project page).

Update

Features

This repo is a pure Python implementation with which you can:

  • Run inference on your own images to get cuboid or general-shaped room layouts
  • View the estimated layout in 3D
  • Correct the camera rotation pose to ensure Manhattan alignment
  • Copy and paste the Pano Stretch augmentation into your own task
  • Run quantitative evaluation (2D IoU, 3D IoU, Corner Error, Pixel Error) for cuboid/general shapes
  • Prepare your own dataset and train on it

Method overview

Installation

PyTorch installation is machine dependent; please install the correct version for your machine. The tested version is PyTorch 1.8.1 with Python 3.7.6.

Dependencies (click to expand)
  • numpy
  • scipy
  • sklearn
  • Pillow
  • tqdm
  • tensorboardX
  • opencv-python>=3.1 (for pre-processing)
  • pylsd-nova
  • open3d>=0.7 (for layout 3D viewer)
  • shapely
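
If you manage packages with pip, a one-liner along the following lines should cover the list above (an untested convenience, not an official requirements file; note that sklearn is installed as scikit-learn):

    pip install numpy scipy scikit-learn Pillow tqdm tensorboardX "opencv-python>=3.1" pylsd-nova "open3d>=0.7" shapely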

Download

Dataset

  • PanoContext/Stanford2D3D Dataset
    • Download preprocessed pano/s2d3d for training/validation/testing
      • Put all of them under the data directory so you get:
        HorizonNet/
        ├── data/
        │   └── layoutnet_dataset/
        │       ├── finetune_general/
        │       ├── test/
        │       ├── train/
        │       └── valid/
        
      • test, train, valid are processed from LayoutNet's cuboid dataset.
      • finetune_general is re-annotated by us from train and valid. It contains 65 general shaped rooms.
  • Structured3D Dataset
    • See the tutorial to prepare training/validation/testing for HorizonNet.
  • Zillow Indoor Dataset
    • See the tutorial to prepare training/validation/testing for HorizonNet.

Pretrained Models

Please download the pre-trained models here

  • resnet50_rnn__panos2d3d.pth
    • Trained on PanoContext/Stanford2d3d 817 pano images.
    • Trained for 300 epochs.
  • resnet50_rnn__st3d.pth
    • Trained on Structured3D 18362 pano images.
    • Data setup: original furniture and lighting.
    • Trained for 50 epochs.
  • resnet50_rnn__zind.pth
    • Trained on Zillow Indoor 20077 pano images.
    • Data setup: layout_visible, is_primary, is_inside, is_ceiling_flat.
    • Trained for 50 epochs.

Inference on your images

In the explanation below, assets/demo.png is used as an example.

  • (modified from the PanoContext dataset)

1. Pre-processing (Align camera rotation pose)

  • Execution: Pre-process the above assets/demo.png by running the command below.
    python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
    • --img_glob path to your 360 room image(s).
      • Shell-style wildcards are supported if quoted (e.g. "my_fasinated_img_dir/*png").
    • --output_dir path to the directory for dumping the results.
    • See python preprocess.py -h for more detailed script usage help.
  • Outputs: Under the given --output_dir, you will get results like below, prefixed with the source image's basename.
    • The aligned RGB image [SOURCE BASENAME]_aligned_rgb.png and line-segment image [SOURCE BASENAME]_aligned_line.png
      • demo_aligned_rgb.png demo_aligned_line.png
    • The detected vanishing points [SOURCE BASENAME]_VP.txt (here demo_VP.txt); a small parsing sketch follows after this list.
      -0.002278 -0.500449 0.865763
      0.000895 0.865764 0.500452
      0.999999 -0.001137 0.000178
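
    If you need the vanishing points programmatically, the file appears to be plain text with one 3D direction per row, so it can be read with numpy (a minimal sketch; the path below is the demo output above):

        import numpy as np

        # Each row of *_VP.txt is one detected vanishing-point direction (x, y, z).
        vp = np.loadtxt('assets/preprocessed/demo_VP.txt')   # shape (3, 3)
        for i, v in enumerate(vp):
            print('VP %d:' % i, v / np.linalg.norm(v))       # normalize, just in case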
      

2. Estimating layout with HorizonNet

  • Execution: Predict the layout from the above aligned image and line segments by running the command below.
    python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize
    • --pth path to the trained model.
    • --img_glob path to the preprocessed image.
    • --output_dir path to the directory to dump results.
    • --visualize optional; visualize the model's raw outputs.
    • --force_cuboid add this option if you want to estimate a cuboid layout (4 walls).
  • Outputs: You will get results like below, prefixed with the source image's basename.
    • The 1D representation is visualized under the file name [SOURCE BASENAME].raw.png
    • The extracted layout corners [SOURCE BASENAME].json (a small sketch for reading it follows after this list)
      {"z0": 50.0, "z1": -59.03114700317383, "uv": [[0.029913906008005142, 0.2996523082256317], [0.029913906008005142, 0.7240479588508606], [0.015625, 0.3819984495639801], [0.015625, 0.6348703503608704], [0.056027885526418686, 0.3881891965866089], [0.056027885526418686, 0.6278984546661377], [0.4480381906032562, 0.3970482349395752], [0.4480381906032562, 0.6178648471832275], [0.5995567440986633, 0.41122356057167053], [0.5995567440986633, 0.601679801940918], [0.8094607591629028, 0.36505699157714844], [0.8094607591629028, 0.6537724137306213], [0.8815288543701172, 0.2661873996257782], [0.8815288543701172, 0.7582473754882812], [0.9189453125, 0.31678876280784607], [0.9189453125, 0.7060701847076416]]}
      

3. Layout 3D Viewer

  • Execution: Visualize the predicted layout in 3D as a point cloud.
    python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
    • --img path to the preprocessed image
    • --layout path to the json output from inference.py
    • --ignore_ceiling do not show the ceiling
    • See python layout_viewer.py -h for usage help.
  • Outputs: In the window, you can use the mouse and scroll wheel to change the viewport.

Your own dataset

See the tutorial on how to prepare it.

Training

To train on a dataset, see python train.py -h for a detailed explanation of the options.
Example:

python train.py --id resnet50_rnn
  • Important arguments:
    • --id required; experiment id used to name checkpoints and logs
    • --ckpt folder for output checkpoints (default: ./ckpt)
    • --logs folder for logging (default: ./logs)
    • --pth finetune mode if given; path to the saved checkpoint to load (see the example command after this list)
    • --backbone backbone of the network (default: resnet50)
      • other options: {resnet18,resnet34,resnet50,resnet101,resnet152,resnext50_32x4d,resnext101_32x8d,densenet121,densenet169,densenet161,densenet201}
    • --no_rnn whether to remove the rnn (default: False)
    • --train_root_dir root directory of the training dataset (default: data/layoutnet_dataset/train)
    • --valid_root_dir root directory of the validation dataset (default: data/layoutnet_dataset/valid/)
      • If given, the epoch with the best 3D IoU on the validation set will be saved as {ckpt}/{id}/best_valid.pth
    • --batch_size_train training mini-batch size (default: 4)
    • --epochs epochs to train (default: 300)
    • --lr learning rate (default: 0.0001)
    • --device set the CUDA device by id (not to be used if --multi_gpu is used)
    • --multi_gpu enable parallel computing on all available GPUs
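
For example, a fine-tuning run on the general-shaped subset could look like the command below (an illustrative sketch only; the checkpoint and directory names follow the defaults and dataset layout described above):

python train.py --id finetune_general --pth ckpt/resnet50_rnn/best_valid.pth --train_root_dir data/layoutnet_dataset/finetune_general --valid_root_dir data/layoutnet_dataset/valid --epochs 50 --lr 0.00005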

Quantitative Evaluation - Cuboid Layout

To evaluate on the PanoContext/Stanford2d3d dataset, first run the cuboid-trained model on all testing images:

python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir output/panos2d3d/resnet50_rnn/ --force_cuboid
  • --img_glob shell-style wildcards for all testing images.
  • --output_dir path to the directory to dump results.
  • --force_cuboid enforce cuboid layout output (4 walls); otherwise PE and CE can't be evaluated.

To get the quantitative result:

python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
  • --dt_glob shell-style wildcards for all model estimations.
  • --gt_glob shell-style wildcards for all ground truths.

If you want to:

  • evaluate PanoContext only: python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"
  • evaluate Stanford2d3d only: python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"

📋 The quantitative result for the released resnet50_rnn__panos2d3d.pth is shown below:

Testing Dataset   3D IoU(%)   Corner error(%)   Pixel error(%)
PanoContext       83.39       0.76              2.13
Stanford2D3D      84.09       0.63              2.06
All               83.87       0.67              2.08

Quantitative Evaluation - General Layout

TODO

  • Faster pre-processing script (top-front alignment) (maybe a Cython implementation or fernandez2018layouts)

Acknowledgement

  • Credit for this repo is shared with ChiWeiHsiao.
  • Thanks to limchaos for the suggestion about the potential boost from fixing the unexpected behaviour of the PyTorch dataloader (see Issue #4).

Citation

@inproceedings{SunHSC19,
  author    = {Cheng Sun and
               Chi{-}Wei Hsiao and
               Min Sun and
               Hwann{-}Tzong Chen},
  title     = {HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch
               Data Augmentation},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition, {CVPR}
               2019, Long Beach, CA, USA, June 16-20, 2019},
  pages     = {1047--1056},
  year      = {2019},
}

horizonnet's People

Contributors

chiweihsiao, dimabendera, shi-yan, snd-ml, sunset1995


horizonnet's Issues

Point Cloud Registration

Hi all,

I am pretty new to 3D modelling and I was wondering if anyone could point me in the right direction regarding point cloud registration.

Was anyone successful in registering the results from room layouts to "compose" a 3D floorplan?

I am grateful for any suggestion.

Thank you very much!
R.

Non-2x1 images

How can I tweak this to support non-2x1 panorama sizes? That is, 360 degrees yaw but not 180 (~150) degrees pitch?

The result seems to be inverted

This is the 360 photo. As you can see, when facing the wall that has the main entrance, the main entrance should be on the right-hand side.

demo2_aligned_rgb

The result is inverted along one direction; the door is now on the left-hand side.

Screen Shot 2020-02-08 at 10 03 32 PM

Missing model weights

Hi, thanks for your great work.

In the readme, it states to run

python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize

however, this ckpt/resnet50_rnn__mp3d.pth model checkpoint is not released publicly, from what I could see. Is that correct?

Getting error while training the model

I tried to train this model on the LayoutNet dataset with all the default parameters mentioned here (https://github.com/sunset1995/HorizonNet).
I executed the following command:
(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn

I am getting the following error

Epoch: 0%| | 0/300 [00:00<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 181, in
iterator_train = iter(loader_train)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 279, in iter
return _MultiProcessingDataLoaderIter(self)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\utils\data\dataloader.py", line 719, in init
w.start()
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function at 0x0000023358098158>: attribute lookup on main failed

(HorizonNet) D:\HorizonNet-master>Traceback (most recent call last):
File "", line 1, in
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Anaconda3\envs\HorizonNet\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Then I modified "train.py" at line number 114 to set "num_workers=0".
I am using Anaconda, in which a new environment named HorizonNet was created with Python version 3.6.
Now I am getting the following error

(HorizonNet) D:\HorizonNet-master>python train.py --id resnet50_rnn --epochs 50
Train ep1: 0%| | 0/204 [00:01<?, ?it/s]
Epoch: 0%| | 0/50 [00:01<?, ?ep/s]
Traceback (most recent call last):
File "train.py", line 191, in
losses = feed_forward(net, x, y_bon, y_cor)
File "train.py", line 26, in feed_forward
y_bon_, y_cor_ = net(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 242, in forward
feature = self.reduce_height_module(conv_list, x.shape[3]//self.step_cols)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 166, in forward
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "D:\HorizonNet-master\model.py", line 166, in
for f, x, out_c in zip(self.ghc_lst, conv_list, self.cs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 138, in forward
x = self.layer(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 124, in forward
return self.layers(x)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
input = module(input)
File "C:\Anaconda3\envs\HorizonNet\lib\site-packages\torch\nn\modules\module.py", line 550, in call
result = self.forward(*input, **kwargs)
File "D:\HorizonNet-master\model.py", line 31, in forward
return lr_pad(x, self.padding)
File "D:\HorizonNet-master\model.py", line 21, in lr_pad
return torch.cat([x[..., -padding:], x, x[..., :padding]], dim=3)
RuntimeError: CUDA out of memory. Tried to allocate 66.00 MiB (GPU 0; 6.00 GiB total capacity; 4.28 GiB already allocated; 4.91 MiB free; 4.34 GiB reserved in total by PyTorch)

Please help

How to estimate room dimensions from reconstructed layout?

@sunset1995, thank you for the great work.

Is it possible to estimate the dimensions of the room, either from panorama or reconstruction, using known camera parameters, such as camera height or focal length? I would like to know how HorizonNet can recover the actual dimensions of the room in terms of height x width of the walls, for example. Thank you.

Customized dataset

Sorry, I wonder how to define my own data format for occluded areas.
In README_PREPARE_DATASET.md:
"Please note that 713 100 and it floor correspondent 713 415 are occluded."
How should I annotate coordinates for occlusion?

about `np_coorx2u` and `np_coory2v ` function in `post_proc.py`

Hi sunset,
I am very interested in your work.
I have a question about the np_coorx2u and np_coory2v functions in post_proc.py:

def np_coorx2u(coorx, coorW=1024):
    return ((coorx + 0.5) / coorW - 0.5) * 2 * PI


def np_coory2v(coory, coorH=512):
    return -((coory + 0.5) / coorH - 0.5) * PI

I think coorx / coorW or coory / coorH already gives the ratio, so why is the +0.5 needed?

Finetuned model

Hello, when training, in order to get the finetuned model, what is the meaning of "Finetuned on finetune_general/ 66 images"? I just want to know the training and validation process. Can you help me? Thank you!

Assertion Error while training with custom datasets

I have tried to train this model with my own custom dataset, but I encountered the error below.

The format of the dataset follows the one from the tutorial.
The images I used are 360-degree pictures of my room/office taken by myself, and I labeled them without pre-processing (aligning the camera rotation pose). Is this step necessary?

Looking forward to any suggestions.

Traceback (most recent call last): | 0/2 [00:00<?, ?it/s]
File "train.py", line 171, in
x, y_bon, y_cor = next(iterator_train)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
AssertionError: Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ubuntu/Projects/HorizonNet_Original/dataset.py", line 68, in getitem
assert (cor[0::2, 1] > cor[1::2, 1]).sum() == 0
AssertionError

Is there some model adjustment in it?

RuntimeError: Error(s) in loading state_dict for HorizonNet:
Missing key(s) in state_dict: "feature_extractor.encoder.conv1.1.weight", "feature_extractor.encoder.bn1.weight", "feature_extractor.encoder.bn1.bias", "feature_extractor.encoder.bn1.running_mean", "feature_extractor.encoder.bn1.running_var", "feature_extractor.encoder.layer1.0.conv1.weight", "feature_extractor.encoder.layer1.0.bn1.weight", "feature_extractor.encoder.layer1.0.bn1.bias", "feature_extractor.encoder.layer1.0.bn1.running_mean", "feature_extractor.encoder.layer1.0.bn1.running_var", "feature_extractor.encoder.layer1.0.conv2.1.weight", "feature_extractor.encoder.layer1.0.bn2.weight", "feature_extractor.encoder.layer1.0.bn2.bias", "feature_extractor.encoder.layer1.0.bn2.running_mean", "feature_extractor.encoder.layer1.0.bn2.running_var",
...
Unexpected key(s) in state_dict: "stage1.0.layer.1.weight", "stage1.0.layer.1.bias", "stage1.0.layer.2.weight", "stage1.0.layer.2.bias", "stage1.0.layer.2.running_mean", "stage1.0.layer.2.running_var", "stage1.0.layer.2.num_batches_tracked", "stage1.0.layer.5.weight", "stage1.0.layer.5.bias", "stage1.0.layer.6.weight", "stage1.0.layer.6.bias", "stage1.0.layer.6.running_mean", "stage1.0.layer.6.running_var", "stage1.0.layer.6.num_batches_tracked", "stage1.0.layer.9.weight", "stage1.0.layer.9.bias",
...

finetune model

Hello, I fine-tuned the model as described in the paper, but my results are different from yours, and much worse. I don't know why. Can you help me figure it out? Thank you!

Run command: python train.py --id finetune --freeze_earlier_blocks 4

This is the code I modified.

Create dataloader

####################### modify #########################
dataset_train_finetune = PanoCorBonDataset(
    root_dir=args.train_finetune_dir,
    flip=not args.no_flip, rotate=not args.no_rotate, gamma=not args.no_gamma,
    stretch=not args.no_pano_stretch)

loader_train_finetune = DataLoader(dataset_train_finetune, args.batch_size_train,
                                   shuffle=True, drop_last=True,
                                   num_workers=args.num_workers,
                                   pin_memory=not args.no_cuda,
                                   worker_init_fn=lambda x: np.random.seed())
####################### modify #########################
dataset_train = PanoCorBonDataset(
    root_dir=args.train_root_dir,
    flip=not args.no_flip, rotate=not args.no_rotate, gamma=not args.no_gamma,
    stretch=not args.no_pano_stretch)
loader_train = DataLoader(dataset_train, args.batch_size_train,
                          shuffle=True, drop_last=True,
                          num_workers=args.num_workers,
                          pin_memory=not args.no_cuda,
                          worker_init_fn=lambda x: np.random.seed())
if args.valid_root_dir:
    dataset_valid = PanoCorBonDataset(
        root_dir=args.valid_root_dir,
        flip=False, rotate=False, gamma=False,
        stretch=False)
    loader_valid = DataLoader(dataset_valid, args.batch_size_valid,
                              shuffle=False, drop_last=False,
                              num_workers=args.num_workers,
                              pin_memory=not args.no_cuda)

Start training

for ith_epoch in trange(1, args.epochs + 1, desc='Epoch', unit='ep'):

    # Train phase
    net.train()
    if args.freeze_earlier_blocks != -1:
        b0, b1, b2, b3, b4 = net.feature_extractor.list_blocks()
        blocks = [b0, b1, b2, b3, b4]
        for i in range(args.freeze_earlier_blocks + 1):
            for m in blocks[i]:
                m.eval()
    iterator_train = iter(loader_train)
    iterator_train_finetune = iter(loader_train_finetune)

    ith_batch = 0

    ####################################### modify ##################################
    for _ in trange(len(loader_train_finetune),
                    desc='Train ep%s' % ith_epoch, position=1):
        # Set learning rate
        adjust_learning_rate(optimizer, args)

        args.cur_iter += 1
        x1, y_bon1, y_cor1 = next(iterator_train_finetune)
        x2, y_bon2, y_cor2 = next(iterator_train)

        x = torch.cat([x1, x2], 0)
        y_bon = torch.cat([y_bon1, y_bon2], 0)
        y_cor = torch.cat([y_cor1, y_cor2], 0)

        losses = feed_forward(net, x, y_bon, y_cor)
        for k, v in losses.items():
            k = 'train/%s' % k
            tb_writer.add_scalar(k, v.item(), args.cur_iter)
        tb_writer.add_scalar('train/lr', args.running_lr, args.cur_iter)
        loss = losses['total']

        ####################################### modify ##################################
        # backprop
        optimizer.zero_grad()
        loss.backward()
        nn.utils.clip_grad_norm_(net.parameters(), 3.0, norm_type='inf')
        optimizer.step()
        ith_batch += 1

Why z0 is set to be 50?

Hi, thanks for your excellent work! I have two questions.

  1. According to my understanding, z0 represents the height of the ceiling from the camera plane; isn't it 1.6 meters? Why is z0 set to 50?
    line 95, inference.py:
    # Init floor/ceil plane
    z0 = 50
    _, z1 = post_proc.np_refine_by_fix_z(*y_bon_, z0)
  2. Does the variable tol mean the signal threshold used when recovering wall planes? tol is 0.05 in the paper but abs(0.16 * z1 / 1.6) in the code.
    line 106, inference.py:
    cor, xy_cor = post_proc.gen_ww(xs_, y_bon_[0], z0, tol=abs(0.16 * z1 / 1.6), force_cuboid=force_cuboid)
    line 81, post_proc.py:
    invalid = (n < len(vec) * 0.4) | (l > tol)

Magic number in decoder fc layer

Hi, thanks for sharing the code. In the decoder's bias initialization, what is the meaning of these magic numbers? Can we just initialize the bias randomly in the normal way?

HorizonNet/model.py

Lines 214 to 216 in e6d7e03

self.linear.bias.data[0*self.step_cols:1*self.step_cols].fill_(-1)
self.linear.bias.data[1*self.step_cols:2*self.step_cols].fill_(-0.478)
self.linear.bias.data[2*self.step_cols:3*self.step_cols].fill_(0.425)

How to train the network in three GPUs

I noticed you mentioned that it takes four hours to finish training on three NVIDIA GTX 1080 Ti GPUs. However, you do not describe how to train the network on three GPUs in README.md.
When I run
python train.py --id resnet50_rnn --use_rnn
it only takes a single GPU, and its batch size is eight, which is different from the one mentioned in your paper.
Could you please describe the training process in detail?

Question about ordering corner coordinates

Hi, I am preparing my own custom dataset of 360 panorama images. I read in issue #20 that I have to order the corner coordinates based on the 3D skeleton of the room layout. Can you elaborate more on this? So I first generate the 3D layout of the room from the panorama image using the layout viewer. Then, what is the correct viewing angle from which to read off the order of the corners?

max of 4 walls to model

Hi,
when I try to create a 3D model with more than 4 walls, it looks like it still builds it with only 4 walls.

Can I please get exactly the same ResNet that you used in your example (resnet50_rnn__mp3d)?

Are there any tutorials / guidelines for spherical projection?

Hi sunset1995, thank you so much for your contribution!

I am currently going through this repository trying to understand what the code is doing, but I find some of the functions in inference, post-processing and evil_utils quite difficult to understand, as I do not know the naming conventions and lack the required background knowledge of the formulas used for the 3D-2D transformations.

Would you be so kind as to point me in the right direction in order to understand the code? thank u soooo much > v <

I want to save the generated model as an obj+mtl file and import it into Blender. May I ask which UV coordinate values to pass in?

I want to save the generated model as an .obj + .mtl file and import it into Blender. May I ask which UV coordinate values to pass in?
if args.vis:
    mesh = o3d.geometry.TriangleMesh()
    mesh.vertices = o3d.utility.Vector3dVector(points[:, :3])
    mesh.vertex_colors = o3d.utility.Vector3dVector(points[:, 3:] / 255.)
    mesh.triangles = o3d.utility.Vector3iVector(faces)

    text = cv2.imread('assets/demo.png')

    mesh.triangle_uvs = o3d.open3d.utility.Vector2dVector(xyzrgb)  # ? -- which values should go into triangle_uvs?
    mesh.triangle_material_ids = o3d.utility.IntVector([0] * len(faces))
    mesh.textures = [o3d.geometry.Image(text)]

    # save .obj & .mtl
    o3d.io.write_triangle_mesh('/home/demo01.obj', mesh)


error while using customized data

While running training, I am getting the error below:

print('Skip ground truth invalid (%s)' % gt_path)
NameError: name 'gt_path' is not defined

Please let me know how to correct this error.

Error in inference.py

After training on my customized data, I want to evaluate the model on the test set. But when I run inference.py for general shape estimation, the following error comes up and the process is interrupted:

Traceback (most recent call last):
File "inference.py", line 198, in
args.min_v, args.r)
File "inference.py", line 112, in inference
if not Polygon(xy2d).is_valid:
File "/home/anaconda3/lib/python3.7/site-packages/shapely/geometry/polygon.py", line 240, in init
ret = geos_polygon_from_py(shell, holes)
File "/home/anaconda3/lib/python3.7/site-packages/shapely/geometry/polygon.py", line 494, in geos_polygon_from_py
ret = geos_linearring_from_py(shell)
File "shapely/speedups/_speedups.pyx", line 239, in shapely.speedups._speedups.geos_linearring_from_py
ValueError: A LinearRing must have at least 3 coordinate tuples

Is the reason that the corner estimations are so bad that a polygon could not be generated? Could you tell me why this happens? Thanks a lot.

boundary training loss encounters impulse

Hi, thanks for sharing your code~
I ran your code with the hyper-parameters described in your paper on the dataset (PanoContext + Stanford 2D-3D) extracted from .t7 by your LayoutNet-Pytorch code. However, I find that the boundary training loss encounters impulses in some iterations, although it converges in the overall trend. Can you tell me why, please? Are some annotations wrong? Thanks a lot!

Adding Multi GPU support and solving dependency issues!

  1. Adding environment.yml for better dependency handling.
  2. Updating the code so that it can run on multiple GPUs at once, or on a specified CUDA-capable device.
  3. Implemented automatic mixed precision (AMP) training to reduce the GPU memory overhead. This helps accommodate a larger batch size.
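
For reference, the AMP part of such a change typically wraps the forward/backward pass with torch.cuda.amp. The snippet below is a self-contained sketch using a toy model, not the actual diff of this PR:

import torch
import torch.nn as nn

# Minimal AMP training step (toy model standing in for HorizonNet).
device = 'cuda'
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                    nn.Flatten(), nn.Linear(8, 1)).to(device)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(4, 3, 512, 1024, device=device)    # fake batch of panoramas
y = torch.randn(4, 1, device=device)

optimizer.zero_grad()
with torch.cuda.amp.autocast():                     # forward pass in mixed precision
    loss = nn.functional.l1_loss(net(x), y)
scaler.scale(loss).backward()                       # scale loss to avoid fp16 gradient underflow
scaler.step(optimizer)                              # unscales grads; skips step on inf/nan
scaler.update()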

3D IoU evaluation bug in `eval_general.py`

Let:

  • A be the 2d area of prediction
  • B be the 2d area of ground-truth
  • I be the 2d area of intersection of prediction and ground-truth
  • Ha be the layout height of prediction
  • Hb be the layout height of ground-truth

The 3D IoU should be:

area3d_I = I * min(Ha, Hb)
area3d_A = A * Ha
area3d_B = B * Hb
iou3d = area3d_I / (area3d_A + area3d_B - area3d_I)

However, the original implementation is wrong:

iou2d = I / (A + B - I)
iouH = min(Ha, Hb) / max(Ha, Hb)
iou3d = iou2d * iouH

For an easier comparison, let us rewrite it in the same form:

iou3d = iou2d * iouH
iou3d = I / (A + B - I) * iouH
iou3d = I / (A + B - I) * min(Ha, Hb) / max(Ha, Hb)
iou3d = I * min(Ha, Hb) / (A + B - I) / max(Ha, Hb)
iou3d = area3d_I / ((A + B - I) * max(Ha, Hb))

Without loss of generality, let us say Ha >= Hb. Then the difference is:

  • area3d_I / (A * Ha + B * Ha - I * Ha) (my fault)
  • area3d_I / (A * Ha + B * Hb - I * Hb) (the correct one)

As B >= I and Ha >= Hb, my 3D IoU is less than or equal to the correct 3D IoU.
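
In code, the corrected formula is just a few lines; a minimal sketch (the variable names are illustrative, not the ones in eval_general.py):

def iou_3d(area_pred, area_gt, area_inter, h_pred, h_gt):
    # Extrude the 2D areas by their layout heights, then take a volume IoU.
    vol_inter = area_inter * min(h_pred, h_gt)
    vol_pred = area_pred * h_pred
    vol_gt = area_gt * h_gt
    return vol_inter / (vol_pred + vol_gt - vol_inter)

# Hypothetical numbers with unequal areas/heights:
# corrected ~= 0.628, while the old iou2d * iouH formula gives ~= 0.623.
print(iou_3d(area_pred=12.0, area_gt=10.0, area_inter=9.0, h_pred=3.0, h_gt=2.7))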
