fyu / drn

Dilated Residual Networks

Home Page: https://www.vis.xyz/pub/drn

License: BSD 3-Clause "New" or "Revised" License

Python 72.82% Makefile 0.66% C 6.93% C++ 1.48% Cuda 17.74% Shell 0.37%

drn's Issues

test error : 'unexpected key "base.0.0.weight" in state_dict'

Hi, @fyu ,

When I run the test as described in the README, I get an 'unexpected key "base.0.0.weight" in state_dict' error:

(tf1.3_pytorch0.2) root@milton-OptiPlex-9010:/data/code/drn# python segment.py test -d /data2/cityscapes_dataset -c 19 -s 896 --arch drn_d_22 --pretrained ./models/drn_d_22_cityscapes.pth --phase test --batch-size 1
segment.py test -d /data2/cityscapes_dataset -c 19 -s 896 --arch drn_d_22 --pretrained ./models/drn_d_22_cityscapes.pth --phase test --batch-size 1
Namespace(arch='drn_d_22', batch_size=1, classes=19, cmd='test', crop_size=896, cuda=True, data_dir='/data2/cityscapes_dataset', epochs=10, evaluate=False, load_rel=None, lr=0.01, momentum=0.9, no_cuda=False, phase='test', pretrained='./models/drn_d_22_cityscapes.pth', resume='', step=200, weight_decay=0.0001, workers=8)
momentum : 0.9
pretrained : ./models/drn_d_22_cityscapes.pth
resume : 
batch_size : 1
cuda : True
weight_decay : 0.0001
workers : 8
load_rel : None
evaluate : False
no_cuda : False
step : 200
phase : test
classes : 19
arch : drn_d_22
lr : 0.01
crop_size : 896
cmd : test
data_dir : /data2/cityscapes_dataset
epochs : 10
Traceback (most recent call last):
  File "segment.py", line 556, in <module>
    main()
  File "segment.py", line 553, in main
    test_seg(args)
  File "segment.py", line 485, in test_seg
    model.load_state_dict(torch.load(args.pretrained))
  File "/root/anaconda3/envs/tf1.3_pytorch0.2/lib/python3.5/site-packages/torch/nn/modules/module.py", line 355, in load_state_dict
    .format(name))
KeyError: 'unexpected key "base.0.0.weight" in state_dict'

Any suggestions on how to fix it? My system already uses PyTorch 0.2.0. There seems to be an inconsistency with PyTorch's internal module handling when loading the trained model "drn_d_22_cityscapes.pth" ...

Thanks!
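
One plausible cause, assuming the checkpoint was saved from the bare model while the script loads it into a model already wrapped in torch.nn.DataParallel (which prefixes every parameter name with "module."), is a key-name mismatch. The helpers below are a minimal sketch, not the repository's fix: add_prefix and strip_prefix are hypothetical names, and they work on any mapping, including the OrderedDict returned by torch.load.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a copy with `prefix` removed from keys that start with it."""
    return OrderedDict(
        (k[len(prefix):] if k.startswith(prefix) else k, v)
        for k, v in state_dict.items()
    )


def add_prefix(state_dict, prefix="module."):
    """Return a copy with `prefix` added to keys that lack it."""
    return OrderedDict(
        (k if k.startswith(prefix) else prefix + k, v)
        for k, v in state_dict.items()
    )


# Placeholder integers stand in for tensors.
saved = OrderedDict([("base.0.0.weight", 1), ("base.0.1.bias", 2)])
wrapped = add_prefix(saved)
```

With PyTorch available, model.load_state_dict(add_prefix(torch.load(path))) would then match a DataParallel model. Loading the weights into the unwrapped model before wrapping it avoids the renaming altogether.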

How do you train DRN-D-22?

I trained it with python3 segment.py train -d <data_folder> -c <category_number> -s 896 --arch drn_d_22 --batch-size 14 --epochs 250 --lr 0.001 --momentum 0.99 --step 100, and the multi-scale test only reaches 66.49.

It seems that you have many GPUs, so you may have used a very large batch size or a large crop size, right?

It seems that the synced var is not the true var

The code just averages the variances from all GPUs instead of computing the total variance across all GPUs.
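
To illustrate the concern with made-up numbers (a sketch, not the repository's kernel code): averaging per-GPU variances ignores the spread between the per-GPU means. Assuming equal-sized shards, the variance over the whole batch is the mean of E[x^2] across GPUs minus the square of the global mean.

```python
from statistics import fmean, pvariance

# Hypothetical activations for one channel, split across two GPUs.
shards = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

# What the issue describes: average the per-GPU variances.
avg_of_vars = fmean(pvariance(s) for s in shards)

# Total variance from per-GPU first and second moments (equal shard sizes).
global_mean = fmean(fmean(s) for s in shards)
mean_of_squares = fmean(fmean(x * x for x in s) for s in shards)
total_var = mean_of_squares - global_mean ** 2

# Reference: variance computed directly over the concatenated batch.
direct_var = pvariance([x for s in shards for x in s])
```

Here the moment-based value matches the direct computation, while the averaged variance is much smaller because the two shards have very different means.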

Drn-c model

How can I download the drn-c-* models?

sync-bn efficiency?

Hi, thanks for your work on sync-bn.
1. How does the efficiency of your sync-bn implementation compare with unsynchronized BN?
2. By the way, has anyone else seen the program hang? For me it hangs at some iterations when 2 GPUs are used, and right at the beginning when 4 GPUs are used.

hard to train to see the pole object in the CamVid

Thanks for your code for segmenting the CamVid dataset with DRN. It works well on most objects, but it struggles to segment the pole class in CamVid. What can I do to get poles segmented?

CUDA_LAUNCH_BLOCKING stucks

I find that if I train with CUDA_LAUNCH_BLOCKING=1, segment.py hangs as shown below:

I have searched repeatedly, but nobody else seems to complain about hangs when using CUDA_LAUNCH_BLOCKING=1.

how to use the codes under the folder of the "lib"?

I'm really interested in your work, but I'm not familiar with PyTorch and am confused about how to use the code under the "lib" folder. There are many kinds of files, such as "*.cu", "*.c", and "*.so". Would you mind giving some advice or links on how to build them? Thanks. @fyu

trained model for classification

Hi, @fyu ,

Can you share your trained model on imagenet?

Thanks!

Question: purpose of BatchNormSync

Is it meant to make better use of multiple GPUs?

Are there any other differences from the standard BatchNorm?
Thanks!

After running the program on ubuntu14, the program becomes a disk sleep state and cannot be killed.

cmd:
CUDA_VISIBLE_DEVICES=0,1 python classify.py train --arch drn_d_38 -j 8 --batch-size 32 data/goods --epochs 120 2> &1 |tee log1 &

State:
wust 13620 0.3 0.5 1394124 253628 pts/9 D 19:18 0:00 python classify.py train --arch drn_d_38 -j 8 --batch-size 128 data/goods --epochs 120

The above is my running command and status.

about train_images.txt

Hi, @fyu ,

Thanks for releasing drn. Training on the Cityscapes dataset requires train_images.txt / train_labels.txt, which are explained in the README. Could you upload these files, or can we download them from the official Cityscapes site?

Thanks!

Dilation in bottleneck layers

For a BasicBlock, the dilation is d for all layers except the first; and for the first, it is d/2. What is the equivalent of this logic in the bottleneck layers? In the BottleNeck subroutine, only dilation[1] is used, so does it mean that the same dilation is used for all the layers in the block?

on saving test images error

Hi! Following your instructions, I got testing to work. But I found that it saves the predicted images into the original image folder, "test", and overwrites the original test images. It does not create the output folder as the "save_colorful_images" and "save_output_images" functions describe.
Can you tell me why? Thanks!

drn_a_50 missing keys when testing

Thank you so much for the code! I've encountered an issue when testing the trained DRN-A-50 model. A very long list of missing keys appears when running the segmentation file in test mode. This is by loading the latest checkpoint as pre-trained. Can anyone advise me on this?

"AssertionError "

Thanks for the "Dilated Residual Networks" pipeline. I have followed all the instructions in the documentation, but I have encountered the following error. Is anyone able to run the complete training pipeline?

(py36) gpu3@GPU3:~/anue/drn-master$ python3 segment.py train -d /home/gpu3/anue/drn-master/datasets/ -c 19 -s 896 --arch drn_d_22 --batch-size 32 --epochs 250 --lr 0.01 --momentum 0.9 --step 100
segment.py train -d /home/gpu3/anue/drn-master/datasets/ -c 19 -s 896 --arch drn_d_22 --batch-size 32 --epochs 250 --lr 0.01 --momentum 0.9 --step 100
Namespace(arch='drn_d_22', batch_size=32, bn_sync=False, classes=19, cmd='train', crop_size=896, data_dir='/home/gpu3/anue/drn-master/datasets/', epochs=250, evaluate=False, list_dir=None, load_rel=None, lr=0.01, lr_mode='step', momentum=0.9, ms=False, phase='val', pretrained='', random_rotate=0, random_scale=0, resume='', step=100, test_suffix='', weight_decay=0.0001, with_gt=False, workers=8)
segment.py train -d /home/gpu3/anue/drn-master/datasets/ -c 19 -s 896 --arch drn_d_22 --batch-size 32 --epochs 250 --lr 0.01 --momentum 0.9 --step 100
data_dir : /home/gpu3/anue/drn-master/datasets/
cmd : train
list_dir : None
classes : 19
crop_size : 896
step : 100
arch : drn_d_22
batch_size : 32
epochs : 250
lr : 0.01
lr_mode : step
momentum : 0.9
weight_decay : 0.0001
evaluate : False
resume :
pretrained :
workers : 8
load_rel : None
phase : val
random_scale : 0
random_rotate : 0
bn_sync : False
ms : False
with_gt : False
test_suffix :
Traceback (most recent call last):
  File "segment.py", line 745, in <module>
    main()
  File "segment.py", line 739, in main
    train_seg(args)
  File "segment.py", line 371, in train_seg
    list_dir=args.list_dir),
  File "segment.py", line 133, in __init__
    self.read_lists()
  File "segment.py", line 157, in read_lists
    assert len(self.image_list) == len(self.label_list)
AssertionError
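
The failing assertion compares the lengths of the image list and the label list, so a quick check of the list files can narrow things down. The sketch below assumes the lists are named <phase>_images.txt and <phase>_labels.txt, as the README's train_images.txt / train_labels.txt suggest; check_lists is a hypothetical helper, not part of segment.py.

```python
import os
import tempfile


def count_entries(path):
    """Count non-empty lines in a list file."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())


def check_lists(list_dir, phase):
    """Return (n_images, n_labels) for one phase; raise if they differ."""
    n_img = count_entries(os.path.join(list_dir, phase + "_images.txt"))
    n_lab = count_entries(os.path.join(list_dir, phase + "_labels.txt"))
    if n_img != n_lab:
        raise ValueError(
            "%s: %d images but %d labels" % (phase, n_img, n_lab))
    return n_img, n_lab


# Self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "train_images.txt"), "w") as f:
        f.write("a.png\nb.png\n")
    with open(os.path.join(d, "train_labels.txt"), "w") as f:
        f.write("a_trainIds.png\nb_trainIds.png\n")
    demo_counts = check_lists(d, "train")
```

Running check_lists on the real list directory for both "train" and "val" would show which phase has mismatched lists.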

BatchNormsync with Adam Optimizer

Is the bnsync code written specifically for the SGD optimizer? The loss does not converge when I train the model with the Adam optimizer.

name 'batchnormsync' is not defined

Traceback (most recent call last):
  File "segment.py", line 743, in <module>
    main()
  File "segment.py", line 735, in main
    args = parse_args()
  File "segment.py", line 729, in parse_args
    drn.BatchNorm = batchnormsync.BatchNormSync
NameError: name 'batchnormsync' is not defined

The meaning of fill_up_weights

Hi, I am reading the code and wondering what the function fill_up_weights is for.
The code is here.
It seems that fill_up_weights is used to initialize the parameters of ConvTranspose2d. However, the parameters of ConvTranspose2d do not seem to be updated during training. So why freeze the weights of ConvTranspose2d?
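
For context, a transposed convolution initialized with a bilinear kernel performs bilinear upsampling, so it gives a sensible result even without training. The sketch below computes such a kernel in pure Python using the widely used FCN-style formula. It is an assumption about what fill_up_weights is meant to achieve, not a copy of it, and bilinear_kernel is a hypothetical name.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation kernel as nested lists."""
    factor = (size + 1) // 2
    # Kernel centre: an integer index for odd sizes, a half index for even ones.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    row = [1 - abs(i - center) / factor for i in range(size)]
    # The 2-D kernel is the outer product of the 1-D profile with itself.
    return [[a * b for b in row] for a in row]


kernel = bilinear_kernel(4)
```

Because such a kernel already implements interpolation, keeping it fixed is a common design choice. Whether that is the author's reason is not stated in the thread.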

how to do Object Localization on imagenet ? Can you share the code?

Does BatchNormSync support torch.nn.parallel.DistributedDataParallel?

What is the "score" while training?

While training it shows a score next to the loss. What is this exactly? I thought this would be mIOU at first, but the results are way too high to be that.
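
The traceback in the "train error" issue on this page passes eval_score=accuracy to the training loop, so the score may well be pixel accuracy rather than mIoU; treat that as an assumption. The sketch below uses a made-up confusion matrix to show why pixel accuracy can be far higher than mIoU when one class dominates.

```python
def pixel_accuracy(conf):
    """conf[i][j] = number of pixels with true class i predicted as j."""
    correct = sum(conf[i][i] for i in range(len(conf)))
    total = sum(sum(row) for row in conf)
    return correct / total


def mean_iou(conf):
    """Average per-class intersection-over-union, skipping absent classes."""
    n = len(conf)
    ious = []
    for k in range(n):
        tp = conf[k][k]
        fn = sum(conf[k]) - tp
        fp = sum(conf[i][k] for i in range(n)) - tp
        if tp + fn + fp > 0:
            ious.append(tp / (tp + fn + fp))
    return sum(ious) / len(ious)


# Hypothetical 2-class example: 'road' dominates, 'pole' is rare.
conf = [[950, 10],
        [30, 10]]
acc = pixel_accuracy(conf)
miou = mean_iou(conf)
```

In this example the rare class has an IoU of only 0.2, which pulls the mean IoU well below the pixel accuracy.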

Is the architecture of DRN layers 3-6 different from what the paper describes?

Upload pretrained DRN models as they no longer seem available on the Princeton website

Hello @fyu — I am not sure if you moved institutions—if so, congratulations!—but the URLs for the pretrained models all return 404s, even at the root URL.

Would it be possible to upload them to e.g. GDrive along the Cityscape ones? I need them for replicating some downstream work :)

Hope you had good holidays!

Best,

Can't access model files on princeton site

Hi, everyone!

I cannot download the file from https://tigress-web.princeton.edu/~fy/drn/models/drn_d_54-0e0534ff.pth.
In the browser it says: "Forbidden You don't have permission to access /~fy/drn/models/drn_d_54-0e0534ff.pth on this server."

Please restore access to the files!

Best regards,
Artem

train error

Hi,
when I train with segment.py, it crashes. The error is listed below:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503968623488/work/torch/lib/THC/generic/THCStorage.c line=32 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "segment.py", line 553, in <module>
    main()
  File "segment.py", line 548, in main
    train_seg(args)
  File "segment.py", line 355, in train_seg
    eval_score=accuracy)
  File "segment.py", line 249, in train
    losses.update(loss.data[0], input.size(0))
RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1503968623488/work/torch/lib/THC/generic/THCStorage.c:32

How can I fix this problem?
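
A device-side assert during the loss computation is often caused by label values outside the range the loss expects, though that is only one possible cause and is not confirmed in this thread. Assuming 19 classes and 255 as the ignored value (both appear in other issues on this page), a hypothetical helper such as invalid_labels can scan a label map for offending values:

```python
def invalid_labels(label_values, num_classes, ignore_index=255):
    """Return the sorted label values a num_classes-way loss cannot index."""
    return sorted(
        v for v in set(label_values)
        if v != ignore_index and not 0 <= v < num_classes
    )


# Hypothetical flattened label map that still contains raw dataset ids.
pixels = [0, 7, 18, 255, 26, 33]
bad = invalid_labels(pixels, num_classes=19)
```

A non-empty result would point to labels that were not converted to training ids. Running with CUDA_LAUNCH_BLOCKING=1 or on the CPU usually makes the failing line more explicit.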

Could you provide script to train on BDD100K for segmentation task?

Great job. Could you also provide the dataset loader and script to train on your dataset (BDD100K) for segmentation? Thanks

evaluation results(mIou) with validation dataset

hello @fyu , when I run as follows with your released pretrained model:
python segment.py test -d ./dataset/cityscapes/ -c 19 --arch drn_d_105 --pretrained output/drn-d-105_ms_cityscapes.pth --phase val --batch-size 1 --ms

it gives a lower mIoU (44%) than your reported performance.

The mean and std used in preprocess are not same as those used in pretraining the ImageNet model.

As far as I know, when people fine-tune a pretrained model for semantic segmentation, they still use the ImageNet mean and std that were used to pretrain the model.

So if we use the models from torchvision to fine-tune, the official advice is:

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

However you use:

{
    "mean": [
        0.290101,
        0.328081,
        0.286964
    ],
    "std": [
        0.182954,
        0.186566,
        0.184475
    ]
}

to normalize. Any idea why?

Can Aligned-Inception-ResNet benefit the semantic segmentation task?

I read the DRN architecture, and I learned from the paper "Deformable Convolutional Networks" that the Aligned-Inception-ResNet architecture combines the benefits of both Inception and ResNet. I wonder whether replacing the backbone with it would give a better score.

Why is 'reflection' mode used for padding the image?

May I know the reason? If we use 'reflect' mode, we predict labels for the original part, but for the reflected part we should predict 255, am I right? I can't understand it. Thank you, Dr. Yu!
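
As a sketch of the idea in the question, assume the image is padded by reflection while the label map is padded with the constant 255 so that the loss can ignore the padded region. The thread does not confirm that the code does exactly this, and both helper names are hypothetical.

```python
def pad_reflect(row, n):
    """Mirror n values at each end without repeating the edge value.

    Requires n < len(row).
    """
    left = row[1:n + 1][::-1]
    right = row[-n - 1:-1][::-1]
    return left + row + right


def pad_constant(row, n, value=255):
    """Pad n copies of a constant at each end."""
    return [value] * n + row + [value] * n


image_row = [10, 20, 30, 40]
label_row = [0, 0, 1, 1]
padded_image = pad_reflect(image_row, 2)
padded_label = pad_constant(label_row, 2)
```

Under that assumption the network sees plausible image content near the border, while the padded label positions carry 255 and contribute nothing to the loss.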

Drivable Area baseline model BDD100k

Can you provide me with the baseline drivable area segmentation model (DRN-D-22) or steps to achieve the results shown in the BDD100k paper for the Drivable Area Segmentation task?

Error in cityscapes data preparation

I am following the instructions to set up the Cityscapes data. I downloaded the dataset. The script prepare_data.py, which converts the original segmentation label ids to one of 19 training ids, runs without any error, but it doesn't generate the ___trainIds.png images.

Can anyone please help with this?

Was the weight file (drn_d_105-12b40979.pth) trained only on ImageNet?

I am working on weakly-supervised semantic segmentation.
I used drn-d-105 as the backbone for deeplabv2 and it worked very well.
During training, I load drn_d_105-12b40979.pth from this website as the pretrained weights.
Was this weight file (drn_d_105-12b40979.pth) trained only on ImageNet?
I have to make sure the weights do not depend on any labels other than image-level ones.

Problems about the test results and the number of categories for Cityscapes

Hi, @fyu ,

Thanks for your code.
When I downloaded the Cityscapes dataset, I found that the number of classes is 30, not the 19 stated in README.md. So I can't reproduce the mAP on the validation set. I am so confused.

And I am also confused about the difference between IOUclass and IOU category in the evaluation for Cityscapes.

Thanks.

How to calculate pixel accuracy?

Will the pre-trained classification model be shared?

@fyu Thanks for your excellent work on dilated convolution and on combining this technique with the residual network. I would like to know whether you will share the pre-trained classification model; I would like to use it for pose estimation.
Looking forward to your classification model.

tarfile.ReadError: invalid header

Hi.
I tried to run the test, but this error was returned.
Is drn_d_22_cityscapes.pth broken?

Command
python3 segment.py test -d cityscapes/ -c 19 --arch drn_d_22 --pretrained drn_d_22_cityscapes.pth --phase test --batch-size 1

Error

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/tarfile.py", line 182, in nti
    n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ons\nOrde'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/tarfile.py", line 2281, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/local/lib/python3.5/tarfile.py", line 1083, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/local/lib/python3.5/tarfile.py", line 1025, in frombuf
    chksum = nti(buf[148:156])
  File "/usr/local/lib/python3.5/tarfile.py", line 184, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "segment.py", line 553, in <module>
    main()
  File "segment.py", line 550, in main
    test_seg(args)
  File "segment.py", line 467, in test_seg
    single_model.load_state_dict(torch.load(args.pretrained))
  File "/usr/local/lib/python3.5/site-packages/torch/serialization.py", line 248, in load
    return _load(f, map_location, pickle_module)
  File "/usr/local/lib/python3.5/site-packages/torch/serialization.py", line 314, in _load
    with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar, \
  File "/usr/local/lib/python3.5/tarfile.py", line 1577, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/local/lib/python3.5/tarfile.py", line 1607, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/local/lib/python3.5/tarfile.py", line 1470, in __init__
    self.firstmember = self.next()
  File "/usr/local/lib/python3.5/tarfile.py", line 2293, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header
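
The fragment 'ons\nOrde' in the ValueError is readable text, which suggests (without proving) that the file is not a checkpoint at all, for example an HTML or text page saved by a failed download. A quick heuristic check, sketched with a hypothetical helper looks_like_text:

```python
import os
import tempfile


def looks_like_text(path, n=256):
    """Heuristic: True if the first n bytes are all printable ASCII."""
    with open(path, "rb") as f:
        head = f.read(n)
    if not head:
        return False
    return all(b in (9, 10, 13) or 32 <= b < 127 for b in head)


# Demo on two throwaway files: a fake HTML page and fake binary data.
with tempfile.NamedTemporaryFile(suffix=".pth", delete=False) as f:
    f.write(b"<!DOCTYPE html><html><body>Not found</body></html>")
    html_path = f.name
with tempfile.NamedTemporaryFile(suffix=".pth", delete=False) as f:
    f.write(bytes([0x80, 0x02, 0x8A, 0x0A, 0x6C, 0xFC]))
    bin_path = f.name

html_is_text = looks_like_text(html_path)
bin_is_text = looks_like_text(bin_path)

os.remove(html_path)
os.remove(bin_path)
```

If the downloaded .pth file looks like text, downloading it again and comparing the file size with the expected one is a reasonable next step.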

mIOU isn't working?

When I try to run the mIOU calculation in segment.py test, it runs all the way through and then fails with a TypeError saying that the value must be a real number, not NoneType. Presumably something in the mIOU calculation turns it into None. The weird thing is that this has worked for me in the past, so I'm pretty confused.

RuntimeError: CUDA error: out of memory for "DRN-D-105" while testing

Has anyone been able to test the "DRN-D-105" architecture on test data?
I am able to train and validate, but testing fails with "RuntimeError: CUDA error: out of memory", even with a small crop size of 256*256 and a batch size of 1.
I checked resources while testing, and there is enough free GPU memory and system RAM.
I am using an NVIDIA P100 GPU with 16 GB of memory.

Any thoughts?

(bhakti) user@user:/mnt/komal/bhakti/anue$ python3 segment.py test -d dataset/ -c 26 --arch drn_d_105 --resume model_best.pth.tar --phase test --batch-size 1 -j2
segment.py test -d dataset/ -c 26 --arch drn_d_105 --resume model_best.pth.tar --phase test --batch-size 1 -j2
Namespace(arch='drn_d_105', batch_size=1, bn_sync=False, classes=26, cmd='test', crop_size=896, data_dir='dataset/', epochs=10, evaluate=False, list_dir=None, load_rel=None, lr=0.01, lr_mode='step', momentum=0.9, ms=False, phase='test', pretrained='', random_rotate=0, random_scale=0, resume='model_best.pth.tar', step=200, test_suffix='', weight_decay=0.0001, with_gt=False, workers=2)
classes : 26
batch_size : 1
pretrained :
momentum : 0.9
with_gt : False
phase : test
list_dir : None
lr_mode : step
weight_decay : 0.0001
epochs : 10
step : 200
bn_sync : False
ms : False
arch : drn_d_105
random_rotate : 0
random_scale : 0
workers : 2
crop_size : 896
lr : 0.01
load_rel : None
resume : model_best.pth.tar
evaluate : False
cmd : test
data_dir : dataset/
test_suffix :
[2019-09-14 19:14:23,173 segment.py:697 test_seg] => loading checkpoint 'model_best.pth.tar'
[2019-09-14 19:14:23,509 segment.py:703 test_seg] => loaded checkpoint 'model_best.pth.tar' (epoch 1)
segment.py:540: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
  image_var = Variable(image, requires_grad=False, volatile=True)
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f15eff61160>>
Traceback (most recent call last):
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/multiprocessing/queues.py", line 337, in get
    return ForkingPickler.loads(res)
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/multiprocessing/reduction.py", line 181, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/multiprocessing/reduction.py", line 152, in recvfds
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
ConnectionResetError: [Errno 104] Connection reset by peer
Traceback (most recent call last):
  File "segment.py", line 789, in <module>
    main()
  File "segment.py", line 785, in main
    test_seg(args)
  File "segment.py", line 720, in test_seg
    has_gt=phase != 'test' or args.with_gt, output_dir=out_dir)
  File "segment.py", line 544, in test
    final = model(image_var)[0]
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "segment.py", line 142, in forward
    y = self.up(x)
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/anaconda2/envs/bhakti/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 691, in forward
    output_padding, self.groups, self.dilation)
RuntimeError: CUDA error: out of memory

What is the appropriate image size for segment.py?

Hello.
Thank you for a nice implementation!

I would like to know what image size works well when applying segmentation.
If we use Cityscapes, original size is 2048x1024. Should we use the original size?

How to build lib folder for pytorch from scratch

When I tried using the Makefile, I got the following error:

src/batchnormp_cuda_kernel.cu:1:20: fatal error: THCUNN.h: No such file or directory
compilation terminated.
Makefile:33: recipe for target 'dense/batchnormp_kernel.so' failed
make: *** [dense/batchnormp_kernel.so] Error 1

Here is the Makefile:
PYTORCH_LIB_DIR := /users/sudhirkumar/fcn/py3_pytorch0.4/lib/python3.5/site-packages/torch/lib

PYTHON := python3
NVCC_COMPILE := nvcc -c -o
RM_RF := rm -rf

# Library compilation rules.
NVCC_FLAGS := -x cu -Xcompiler -fPIC -shared

# File structure.
BUILD_DIR := dense
INCLUDE_DIRS := TH THC THCUNN include include/TH
TORCH_FFI_BUILD := build.py
BN_KERNEL := $(BUILD_DIR)/batchnormp_kernel.so
TORCH_FFI_TARGET := $(BUILD_DIR)/batch_norm/_batch_norm.so

INCLUDE_FLAGS := $(foreach d, $(INCLUDE_DIRS), -I$(PYTORCH_LIB_DIR)/$d)
#INCLUDE_FLAGS2 := $(foreach d, $(INCLUDE_DIRS), -I$(PYTORCH_LIB_DIR2)/$d)
#INCLUDE_FLAGS3 := $(foreach d, $(INCLUDE_DIRS), -I$(PYTORCH_LIB_DIR3)/$d)

all: $(TORCH_FFI_TARGET)

$(TORCH_FFI_TARGET): $(BN_KERNEL) $(TORCH_FFI_BUILD)
	$(PYTHON) $(TORCH_FFI_BUILD)

$(BUILD_DIR)/batchnormp_kernel.so: src/batchnormp_cuda_kernel.cu
	@mkdir -p $(BUILD_DIR)
	$(NVCC_COMPILE) $@ $? $(NVCC_FLAGS) $(INCLUDE_FLAGS) -Isrc -std=c++11

clean:
	$(RM_RF) $(BUILD_DIR)

Thanks,
Sudhir

KeyError: 'epoch'

I tried to execute

python3 -u /home/dl-box/ld/github/drn/segment.py test -d=/media/dl-box/HDD3/ld/Documents/datasets/CITYSCAPES -c=19 --arch=drn_d_22 --resume=/home/dl-box/ld/github/drn/pretrained/city_seg/drn_d_22_cityscapes.pth --phase=test --batch-size=1

However, it fails with the following error:

Traceback (most recent call last):
  File "/home/dl-box/ld/github/drn/segment.py", line 748, in <module>
    main()
  File "/home/dl-box/ld/github/drn/segment.py", line 744, in main
    test_seg(args)
  File "/home/dl-box/ld/github/drn/segment.py", line 658, in test_seg
    start_epoch = checkpoint['epoch']
KeyError: 'epoch'

When I print the keys of the checkpoint, the result is:

key is  base.0.0.weight
key is  base.0.1.weight
key is  base.0.1.bias
key is  base.0.1.running_mean
key is  base.0.1.running_var
key is  base.1.0.weight
key is  base.1.1.weight
key is  base.1.1.bias
key is  base.1.1.running_mean
key is  base.1.1.running_var

I tried torch-0.3.1, torch-0.4.0, and torch-1.0. All give the same error.
Do you know how to solve it?
Thank you very much!
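
The printed keys are parameter names, which suggests the file is a bare state_dict rather than a training checkpoint with 'epoch' and 'state_dict' entries. Assuming --resume expects the latter (as the checkpoint['epoch'] line in the traceback implies), a tolerant loader could unwrap either layout. unwrap_checkpoint is a hypothetical helper, not part of segment.py:

```python
def unwrap_checkpoint(checkpoint):
    """Return (state_dict, epoch) for either checkpoint layout.

    A training checkpoint is assumed to look like
    {'epoch': int, 'state_dict': {...}}; anything else is treated as a
    bare state_dict with no epoch information.
    """
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"], checkpoint.get("epoch")
    return checkpoint, None


# Placeholder integers stand in for tensors.
bare = {"base.0.0.weight": 1, "base.0.1.weight": 2}
full = {"epoch": 7, "state_dict": bare}
```

Passing the released weights with --pretrained, as the commands in other issues on this page do, sidesteps the 'epoch' lookup.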

--no-cuda is unused

args.cuda is set up properly (via user input or by checking whether CUDA is available), but it is not used in any of the functions: everything is cast to cuda regardless of the value of this option.

What GPU do you use to train?

If you use 8 GPUs with 16 crops per batch, each GPU needs more than 12GB of memory.

Even my GTX 1080Ti has only 11171MiB.

pre-trained classification model?

@fyu Excellent work. Do you plan to share the pre-trained classification models for DRN-C-26 or the other DRN-C-num_layers variants?
Thanks

The pretrained models are much bigger than models you reported

As reported, the parameters in 'D' arch models are:
DRN-D-22 | 16.4M
DRN-D-38 | 26.5M
DRN-D-105 | 54.8M
however, the pretrained models you provided are 60M, 99M, and 207M.
Is there anything wrong?
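
A rough sanity check, assuming the weights are stored as 32-bit floats (4 bytes each) and ignoring batch-norm buffers and file overhead, is to convert the quoted parameter counts into an expected file size:

```python
BYTES_PER_FLOAT32 = 4

# Parameter counts as quoted in the issue, in millions.
params_millions = {"DRN-D-22": 16.4, "DRN-D-38": 26.5, "DRN-D-105": 54.8}

# Expected file size in MiB if every parameter is stored as a float32.
expected_mib = {
    name: millions * 1e6 * BYTES_PER_FLOAT32 / 2 ** 20
    for name, millions in params_millions.items()
}

for name, mib in expected_mib.items():
    print("%-10s ~%.0f MiB" % (name, mib))
```

The estimates land close to the 60M, 99M, and 207M file sizes mentioned above, which is consistent with the reported numbers counting parameters and the file sizes counting bytes.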

[Errno 111] Connection refused

getting a connection refused error when trying to download models:

from torch.utils import model_zoo as mz
mz.load_url("http://dl.yf.io/drn/drn_c_26-ddedf421.pth")

getting the following error:

urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

CityScapes labels: Number of classes

Hello,

I'm going to do a fine-tuning of your model for a segmentation task.
I would like to keep the same 19 semantic classes that you used for training the model on the CityScapes dataset.

Could you please tell me how you pre-processed the CityScapes labels in order to obtain the 19 classes?

Thanks a lot.
Filippo
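
For reference, the table below follows the id to trainId mapping published in the official cityscapesScripts label definitions; please verify it against that file before relying on it. Every raw id not listed is mapped to 255 (ignored). to_train_ids is a hypothetical helper for illustration, not the repository's prepare_data.py.

```python
# Raw Cityscapes label id -> trainId for the 19 evaluated classes.
ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    12: 3,   # wall
    13: 4,   # fence
    17: 5,   # pole
    19: 6,   # traffic light
    20: 7,   # traffic sign
    21: 8,   # vegetation
    22: 9,   # terrain
    23: 10,  # sky
    24: 11,  # person
    25: 12,  # rider
    26: 13,  # car
    27: 14,  # truck
    28: 15,  # bus
    31: 16,  # train
    32: 17,  # motorcycle
    33: 18,  # bicycle
}
IGNORE = 255


def to_train_ids(label_row):
    """Map raw ids to trainIds; anything unlisted becomes IGNORE."""
    return [ID_TO_TRAIN_ID.get(v, IGNORE) for v in label_row]


converted = to_train_ids([7, 0, 26, 33, 29])
```

The remaining raw classes (for example id 0, unlabeled, and id 29, caravan) fall into the ignored value, which is how the larger raw label set reduces to 19 training classes.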

There is a convolution layer with kernel_size=1 and stride=2.

There is a convolution layer with kernel_size=1 and stride=2. I think it's a loss of 50% of the information during processing.

I got this by running this cell of code.

module = drn_d_24(pretrained=False)
module
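
To quantify the concern, assume a 1x1 kernel with stride 2 and no padding applied in both spatial dimensions. Each output element then reads exactly one input position, so only about one position in four is read by that layer. Such layers are typical of ResNet-style shortcut branches, where the main branch still sees every position; that this is where the layer sits is an assumption. conv_out_size is a hypothetical helper implementing the standard output-size formula.

```python
def conv_out_size(n, kernel, stride, padding=0, dilation=1):
    """Standard convolution output-length formula."""
    return (n + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


h, w = 224, 224
out_h = conv_out_size(h, kernel=1, stride=2)
out_w = conv_out_size(w, kernel=1, stride=2)

# With a 1x1 kernel, each output element reads exactly one input position.
fraction_read = (out_h * out_w) / (h * w)
```

For a 224x224 input this layer reads a quarter of the spatial positions, so the share skipped by this particular layer is closer to 75% than 50%.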
