
ZoeDepthRefine

A small network to improve ZoeDepth on the HabitatDyn dataset

Related reading for a better understanding of the problem:

Paper: Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

Paper: On regression losses for deep depth estimation

Paper: On the Metrics for Evaluating Monocular Depth Estimation

Paper: ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Blog: Monocular Depth Estimation using ZoeDepth: Our Experience

TODO:

  • standalone config file
  • how to pass parameters to train() more cleanly? A wrapper function? The PyTorch Ray Tune example uses load_data() and passes a data_dir to the trainer
  • a better val video clip to cover most of the depth range in the HabitatDyn dataset
  • Ray Tune: use the metric to determine which loss function, epoch count, and learning rate is the best combo
  • better loss function
  • val and test
  • implement the metrics used to compare different models: Relative Square (SqRel), Relative Absolute (AbsRel), and RMS
  • extend this repo as a standalone method that lifts Relative Depth Estimation (RDE) to Metric Depth Estimation (MDE); this should be a small plug-in module, since the RDE model does the hard work and this module only tailors the output to a specific dataset with metric depth info (see the alignment sketch below). Application?
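
A common baseline for the RDE-to-MDE step in the last item is a least-squares scale-and-shift alignment between relative and metric depth. A minimal sketch; the function name and the per-image masking convention are my assumptions, not this repo's API:

import torch

def align_scale_shift(pred_rel, gt_metric, mask):
    # Fit gt_metric ~ s * pred_rel + t over valid pixels by least squares.
    x, y = pred_rel[mask], gt_metric[mask]
    A = torch.stack([x, torch.ones_like(x)], dim=1)        # (N, 2) design matrix
    sol = torch.linalg.lstsq(A, y.unsqueeze(1)).solution   # (2, 1): [s, t]
    s, t = sol[0, 0], sol[1, 0]
    return s * pred_rel + t, s, t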

Metrics to evaluate different loss functions:

MAE, SqRel, RMSE, RMSLE, Delta (threshold accuracy)
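
These metrics are straightforward to compute directly; a minimal sketch (the gt > 0 validity mask and the 1.25 delta threshold are the usual conventions, assumed here):

import torch

def depth_metrics(pred, gt, eps=1e-6):
    # Evaluate only pixels with valid ground-truth depth.
    mask = gt > 0
    pred, gt = pred[mask].clamp(min=eps), gt[mask]
    mae    = (pred - gt).abs().mean()
    sq_rel = ((pred - gt) ** 2 / gt).mean()
    rmse   = ((pred - gt) ** 2).mean().sqrt()
    rmsle  = ((pred.log() - gt.log()) ** 2).mean().sqrt()
    ratio  = torch.maximum(pred / gt, gt / pred)
    delta1 = (ratio < 1.25).float().mean()   # threshold accuracy
    return {"MAE": mae, "SqRel": sq_rel, "RMSE": rmse, "RMSLE": rmsle, "Delta1": delta1}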

HYPERPARAMETER TUNING WITH RAY TUNE

Original Blog
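
Following the linked tutorial's pattern, the search space and the train() entry point might look like the sketch below, using the older tune.report API from that tutorial; train_model, its hyperparameters, and the data path are placeholders, not this repo's actual interface:

from functools import partial
from ray import tune

def train_model(config, data_dir=None):
    # Build dataloaders from data_dir and an optimizer with config["lr"], then loop:
    for epoch in range(10):
        val_loss = 0.0  # ...train one epoch at config["batch_size"], evaluate on the val clip...
        tune.report(loss=val_loss)  # lets Ray Tune rank this trial

search_space = {
    "lr": tune.loguniform(1e-4, 1e-1),
    "batch_size": tune.choice([16, 32, 64]),
}

# Pass data_dir via partial(), as the tutorial does with load_data().
tune.run(partial(train_model, data_dir="./data"),
         config=search_space, metric="loss", mode="min", num_samples=10)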

Research: Dataloader bottleneck: the PyTorch IO problem

Local Specs:

PyTorch model:
    PyTorch U-Net model with a ResNet18 backbone
Hardware spec:
    CPU: AMD Ryzen 9 3900X 12-Core Processor
    GPU: RTX 3090
    Disk: unknown
Data:
    image data of size 480*640: rgb.jpg and depth.png
    resized to 192*256 and concatenated in the Dataset class using torch
    1 video clip of 240 rgb.jpg frames is 15.5 MB on disk
    1 depth folder with the corresponding .png files is 3.5 MB
    1 pseudo-depth folder in 16-bit format from the Zoe model is 16 MB
    so about 35 MB is loaded into memory per video
    1 scene has about 54 videos, so 10 scenes will use about 20 GB of memory: but 27
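
A minimal Dataset matching the layout above; the exact folder structure, file naming, and the millimetre-to-metre scale for the 16-bit depth PNGs are assumptions:

from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torchvision.transforms import functional as TF

class HabitatDynDepthDataset(torch.utils.data.Dataset):
    # Loads 480*640 rgb.jpg / 16-bit depth.png pairs, resized to 192*256.
    def __init__(self, root, size=(192, 256)):
        self.rgb_paths = sorted(Path(root).glob("**/rgb/*.jpg"))
        self.size = list(size)

    def __len__(self):
        return len(self.rgb_paths)

    def __getitem__(self, idx):
        rgb_path = self.rgb_paths[idx]
        depth_path = rgb_path.parent.parent / "depth" / (rgb_path.stem + ".png")
        rgb = Image.open(rgb_path).copy()      # .copy() releases the file handle
        depth = Image.open(depth_path).copy()  # 16-bit PNG
        rgb_t = TF.resize(TF.to_tensor(rgb), self.size)
        depth_t = torch.from_numpy(np.array(depth, dtype=np.float32))[None] / 1000.0
        depth_t = TF.resize(depth_t, self.size)
        return torch.cat([rgb_t, depth_t], dim=0)  # concatenated, as in the notes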

For batch_size=64, shuffle=True, pin_memory=True, num_workers=4, prefetch_factor=2:

dataloader time: 2.9199090003967285
memory to cuda time: 0.03172612190246582
train 1 batch time: 0.08431768417358398

Observed CPU usage is around 35% on all threads; GPU usage peaks at 99%, but only for short periods because of the data-loading bottleneck.
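
The timings above come from simple wall-clock instrumentation around each stage of the loop; a minimal sketch, assuming model, criterion, optimizer, and train_dataloader already exist and batches are 4-channel RGB+depth tensors:

import time
import torch

# model, criterion, optimizer, train_dataloader are assumed to be defined elsewhere.
end = time.time()
for batch in train_dataloader:
    data_time = time.time() - end                 # dataloader time

    t0 = time.time()
    batch = batch.cuda(non_blocking=True)
    torch.cuda.synchronize()                      # make the async copy measurable
    h2d_time = time.time() - t0                   # memory-to-cuda time

    t0 = time.time()
    loss = criterion(model(batch[:, :3]), batch[:, 3:])
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
    torch.cuda.synchronize()
    step_time = time.time() - t0                  # train-1-batch time

    print(f"dataloader {data_time:.3f}s | to-cuda {h2d_time:.3f}s | train {step_time:.3f}s")
    end = time.time()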

Solutions

  • skipping the transform saves about 2 seconds per folder, so do the transform on CUDA and set pin_memory=False
  • Usually, big gains over torch.DataParallel using apex.DistributedDataParallel. Moving from "one main process + worker processes + multiple GPUs with DataParallel" to "one process per GPU with apex (and presumably torch) DistributedDataParallel" has always improved performance for me. Remember to (down)scale your worker processes per training process accordingly. Higher GPU utilization and less waiting for synchronization usually result; the variance in batch times will shrink, with the average time moving closer to the peak.
  • multi-threading vs. Distributed (multi-server) vs. DataParallel (multi-GPU) vs. multi-GPU (https://pytorch.org/docs/stable/notes/multiprocessing.html)
  • voom voom repo:
import random

import numpy as np
from torch.utils.data import DataLoader
from tqdm import tqdm

# dataset_list, dataset_path, ImaginationDataset, config, batch_size
# come from the surrounding script.
for epoch in tqdm(range(1000), desc="EPOCH", position=0):
    # Every 4 epochs, swap in a random subset of the scene files so the
    # whole dataset never has to sit in memory at once.
    if epoch % 4 == 0:
        print("change the scenes")
        datasets = []
        for record_file in tqdm(random.sample(dataset_list, 432), desc="Load the datasets"):
            record = np.load(dataset_path + record_file, allow_pickle=True).tolist()
            datasets += record
        print(f"change the scenes finished, data length: {len(datasets)}")
        imagination_dataset = ImaginationDataset(datasets=datasets, config=config)
        train_dataloader = DataLoader(imagination_dataset, batch_size=batch_size, shuffle=True)
  • check the AMP (automatic mixed precision) implementation, old (apex.amp) vs. new (torch.cuda.amp)
  • Nvidia DALI
  • Best Practices
  • set gradients to None instead of zeroing them (optimizer.zero_grad(set_to_none=True)) for a small performance improvement
  • when shuffle=True and sampler is None in DataLoader, the default RandomSampler shuffles without replacement, which makes the model harder to train because it receives totally different data in each batch, but commonly makes the final model perform better
  • set x.cuda(non_blocking=True) if pin_memory=True (see the sketch after this list)
  • preprocess input data into a fast database format like HDF5 (h5py), LMDB, or WebDataset, or write a custom streaming data loader class
  • load data into memory in Dataset.__init__; if the dataset is too big to fit in memory, write a streaming data loader class or swap in a mini dataset during the epoch loop, like Dr. Shen's proposal (the scene-swapping snippet above). But if disk read IO speed is the bottleneck, this only helps to reduce data-transfer overhead
  • do batched data augmentation on the GPU using Kornia
  • num_workers too high can be problematic (https://pytorch.org/docs/stable/data.html#multi-process-data-loading)
  • using TorchData instead of DataLoader might help, but it does not help if disk IO is the bottleneck, and TorchData does not support batch transformations either
  • having the Dataset return a CUDA tensor is problematic when using pin_memory=True
  • using PIL to read images in the Dataset is slow; switching to Pillow-SIMD gives roughly a 20% improvement
  • when there is heavy data processing in the Dataset class, num_workers>1 can help, but when IO is the bottleneck all worker threads end up waiting for IO, making things even worse
  • avoid Python objects in the Dataset; use vectorized operations (NumPy, torch) instead
  • set the cuDNN auto-benchmark flag to True (torch.backends.cudnn.benchmark = True)
  • use an LRU cache if the same data is reused across batches (because of data augmentation, etc.)
  • use Image.open(...).copy(): otherwise the server will go down because the Image object is never closed; .copy() dereferences the ImageFile object so its destructor runs and the file is released
  • random resized crop to 224x224
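
Several of the tips above (pinned memory with non-blocking copies, gradients set to None, cuDNN autotuning) combine into one training loop; a minimal sketch, with model, criterion, optimizer, and dataset assumed to be defined elsewhere:

import torch
from torch.utils.data import DataLoader

torch.backends.cudnn.benchmark = True   # cuDNN auto-benchmark; pays off for fixed input sizes

loader = DataLoader(dataset, batch_size=64, shuffle=True,   # RandomSampler without replacement
                    num_workers=4, pin_memory=True, prefetch_factor=2)

for batch in loader:
    # pin_memory=True lets non_blocking=True overlap the host-to-device copy.
    batch = batch.cuda(non_blocking=True)
    loss = criterion(model(batch[:, :3]), batch[:, 3:])
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)   # cheaper than writing zeros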
