ydhonghit / ddrnet

The official implementation of "Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes"

License: MIT License

Python 100.00%


ddrnet's Issues

Wrong pretrained models for segmentation task

I've tried to use your code and pretrained models for image segmentation. However, it turns out that the files best_val.pth and best_val_smaller.pth from the Google Drive match some unknown class with fields model and seghead_extra rather than DualResNet.
Currently I load the pretrained weights with the following code:

import torch
from DDRNet.segmentation.DDRNet_23_slim import DualResNet_imagenet  # or DDRNet_23

def fix_snapshot(state_dict, prefix='model.'):
    # Keep only the keys that start with `prefix` and strip it from them.
    return {key[len(prefix):]: weight
            for key, weight in state_dict.items()
            if key.startswith(prefix)}

net = DualResNet_imagenet()
state_dict = torch.load('weights/best_val_smaller.pth')  # or best_val.pth
state_dict = fix_snapshot(state_dict)
net.load_state_dict(state_dict, strict=False)  # no missing keys, a lot of unexpected ones

and it works quite well, but it doesn't seem right to me.
Could you please update the pretrained models or provide the code of that class?
(There is no such problem with the classification models.)
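
One way to see what such a checkpoint contains before stripping anything is to count its keys by top-level prefix. The sketch below is only an illustration: `summarize_prefixes` is a hypothetical helper, and the toy dictionary stands in for the real `torch.load(...)` result, whose exact key names are assumed here.

```python
from collections import Counter

def summarize_prefixes(state_dict):
    """Count checkpoint keys by the part before the first dot."""
    return Counter(key.split('.', 1)[0] for key in state_dict)

# Toy stand-in for a loaded checkpoint; the key names are assumptions.
toy_state_dict = {
    'model.conv1.0.weight': 0,
    'model.conv1.0.bias': 0,
    'seghead_extra.conv1.weight': 0,
}

prefix_counts = summarize_prefixes(toy_state_dict)
print(prefix_counts)
```

If everything the network needs sits under a single prefix, stripping that prefix and loading with strict=True is a stricter check than strict=False, because any remaining mismatch raises an error instead of being skipped.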

Not an issue, but applause!

This model is extremely fast and accurate. I had tried many ways to make a model run fast enough at the same accuracy, such as a lighter backbone, a shrunken PPM or ASPP module, or fewer channels, but those changes reduced the accuracy substantially.

Luckily I saw your work on Papers with Code and noticed the very high FPS along with the remarkably high IoU you achieve. So I thought, why not try it in my project?

And here it is: DDRNet improves my task's FPS from 2.5 with my original model to 16! What surprised me most is that the IoU is even 1 point higher!

Many thanks for your work. DDRNet made me aware that there is still huge potential for improvement in the speed and accuracy of semantic segmentation.

Hope to see more!

Question about crop augmentation during training

Thank you very much for your great contribution!

I have a question about the crop augmentation that you used during training. In the paper you say that you cropped the Cityscapes images to 1024x1024 during training. Given this, how did you run inference on the full-resolution images (2048x1024) to get the benchmark results? Did you feed two crops (left and right, 1024x1024 each) into the model at the training size and merge the outputs into the full image afterwards, or did you feed the full-resolution image into the model at inference time even though training used smaller crops?
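
The first option in the question (two side-by-side crops) can be sketched as below. This does not describe how the benchmark results were produced; `crop_windows` is a hypothetical helper that only computes the crop coordinates, assuming non-overlapping tiles that divide the image exactly.

```python
def crop_windows(width, height, tile):
    """Return (left, top, right, bottom) boxes that cover the image with
    non-overlapping square tiles. Assumes width and height are multiples
    of `tile`."""
    if width % tile or height % tile:
        raise ValueError("image size must be a multiple of the tile size")
    return [(x, y, x + tile, y + tile)
            for y in range(0, height, tile)
            for x in range(0, width, tile)]

windows = crop_windows(2048, 1024, 1024)
print(windows)  # [(0, 0, 1024, 1024), (1024, 0, 2048, 1024)]
```

The second option needs no tiling: the full 2048x1024 image goes through the network in a single forward call, provided the model accepts inputs of that size.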

A question about testset result

Hi, thanks for sharing! I trained the model on Cityscapes using my own training code, but the results on the val set and the test set differ a lot: the val mIoU is close to the paper, but the test set mIoU is very low. Why?

pre-trained model on the COCO dataset

Hi, thank you for sharing your great work.
Could you please also share the model pre-trained on the COCO dataset?

Using models

Hello,

Thank you for your work. I was wondering how I can use the included files to run tests.
Let's say I want to use DDRNet_23.py: how do I give it a photo as input to perform semantic segmentation on?

Error in code or paper?

Hi, nice work first of all!
I stumbled across the relu activations in the code:
x = x + self.down3(self.relu(x_))
x_ = x_ + F.interpolate(
    self.compression3(self.relu(layers[2])),
    size=[height_output, width_output],
    mode='bilinear')

DDRNet/segmentation/DDRNet_23_slim.py, line 312 in ba659f9:

x = x + self.down3(self.relu(x_))

In the paper in Fig. 3 there are no activations after blocks, but only after the bilateral fusion. Now I'm wondering what is correct?

Link for trained models

Is there a Google Drive / Dropbox / OneDrive link available for the trained models?

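Question about training

Hello, I saw in your replies to other issues that you merged the training and validation sets into one larger training set (train+val = 2975+500) for training. Have you tried training on the original training set only (train, 2975 images)? Using only the train set, my ddrnet_23_slim reaches an mIoU of only about 74.

Problem with Cityscapes test set submission

I ran the Cityscapes test set with the trained ddrnet23-slim weights saved by the author and uploaded the predictions to the official site. The result was 76.38 rather than 77.44. Did I miss some setting? Quite a few models fall short of the accuracy reported in their papers in this way.

This network is really fast; I recommend it for production projects

It works really well in practice, fast and accurate. Our company uses it in production.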

Where are 'Bottleneck_last' and 'SPP_super' in ddrnet.py?

DAPPM + DDRNet slim for small image segmentation input_size (e.g. 128x128 or 256x256)

What should the kernel sizes and strides of the DAPPM module be for small image sizes? And the number of SPP planes?
Thanks :)

about training code

Hi, the third code doesn't work on my PC. When will you upload your whole project? Thanks.

About mIoU result on Cityscapes validation dataset

Hi,

Thanks for your work and for providing the code. I am confused about the mIoU reported in the paper on the Cityscapes validation set, which is around 77%. Could you let me know whether this mIoU was achieved with an ImageNet-pretrained model? I am training DDR-Net-Slim23 from scratch and I get around 55% mIoU on the Cityscapes validation data.

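About image size

Hello, I'd like to ask why the speed is basically the same for input sizes of 512×512 and 768×1536.

About inference speed on a 2080 Ti

Hello author, I ran the speed measurement code you provided on a 2080 Ti and got only a little over 50 fps. Is this normal? Is it a hardware problem or something else?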
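
When comparing speed numbers like these, it helps to state how they were measured. The following is a generic timing sketch, not the repository's speed measurement script: `measure_fps` and its parameters are hypothetical, and `dummy_forward` stands in for a model call. On a GPU the `sync` argument would typically be `torch.cuda.synchronize`, since CUDA kernels run asynchronously.

```python
import time

def measure_fps(forward, warmup=10, iters=100, sync=None):
    """Average frames per second of `forward` over `iters` timed calls.

    `sync`, if given, is called before reading the clock so that
    asynchronous work (e.g. queued GPU kernels) is included.
    """
    for _ in range(warmup):  # warm-up runs are not timed
        forward()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        forward()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return iters / elapsed

calls = []

def dummy_forward():
    # Stand-in for `model(input_tensor)`.
    calls.append(1)
    sum(range(1000))

fps = measure_fps(dummy_forward, warmup=5, iters=50)
print(f"{fps:.1f} FPS")
```

Without warm-up and synchronization, the first iterations and any still-queued GPU work can distort the result, so two setups are only comparable when both are handled the same way.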

Train model problem

I trained DDRNet_23_slim starting from the pretrained model, but I get a poor result: the mIoU is only 58%, far lower than the result in your paper. The loss also decreases slowly; the best loss is 0.9.
I use a single GPU, lr: 3e-3, epochs: 500, optimizer: SGD, loss: ohemceloss

Pretrained model does not work for me (ddrnet23 slim)

@ydhongHIT I downloaded the pretrained model for ddrnet23 (Cityscapes). I want to try this model on two images. As you can see in the code below, I resized the images to (1024, 2048, 3) as mentioned in the paper. get_seg_model is defined elsewhere, so I did not copy it here.

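import matplotlib.pyplot as plt
import cv2
import torch
import numpy as np
img = cv2.imread('E:/deneme_alani_iha/uavid_data/npy_deneme/ddrNet_city/ss.png')
img2 = cv2.imread('E:/deneme_alani_iha/uavid_data/npy_deneme/ddrNet_city/ss2.png')
img = cv2.resize(img, (2048, 1024))  # cv2.resize takes (width, height)
img2 = cv2.resize(img2, (2048, 1024))
# HWC -> CHW needs a transpose; reshape would scramble the pixel layout.
img = img.transpose(2, 0, 1)
img2 = img2.transpose(2, 0, 1)
data = np.stack([img, img2])
# Assumption: the checkpoint may also expect mean/std-normalized input;
# check the preprocessing used during training.
data = torch.from_numpy(data).float()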

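#%%
model = get_seg_model(1)
model.eval()  # use the BatchNorm running statistics at inference
with torch.no_grad():
    output = model(data)
#%%
# output[0] is (19, 128, 256); transpose to (128, 256, 19) instead of reshaping.
plt.imshow(output[0].cpu().numpy().transpose(1, 2, 0)[:, :, 0])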
#%%
def eight2twoconverter(labels, size):
    # uint8 colors: imshow would clip float values in 0..255 to the range [0, 1].
    colors = np.array([(150,120,90), (153,153,153), (153,153,153), (250,170,30), (220,220,0), (107,142,35), (152,251,152), (70,130,180), (220,20,60), (255,0,0), (0,0,142), (0,0,70), (0,60,100), (0,0,90), (0,0,110), (0,80,100), (0,0,230), (119,11,32), (0,0,142)], dtype=np.uint8)
    # labels holds per-class scores in HWC order; pick the best class per pixel.
    indexx = np.argmax(labels[:size, :256, :], axis=2)
    return colors[indexx]
#%%
# The network output is CHW, so transpose to HWC rather than reshape.
out = eight2twoconverter(output[0].cpu().numpy().transpose(1, 2, 0), 128)
plt.imshow(out)

I passed the correct path and num_classes parameter to this function.

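from collections import OrderedDict

def DualResNet_imagenet(pretrained=True):
    model = DualResNet(BasicBlock, [2, 2, 2, 2], num_classes=19, planes=32, spp_planes=128, head_planes=64, augment=False)
    if pretrained:
        checkpoint = torch.load("E:/deneme_alani_iha/uavid_data/npy_deneme/best_val_smaller.pth", map_location='cpu')
        # The weights may be stored directly or under a 'state_dict' key.
        state_dict = checkpoint.get('state_dict', checkpoint)
        new_state_dict = OrderedDict()
        for k, v in state_dict.items():
            # k[7:] only fits a 'module.' prefix. Which prefix this file
            # uses is an assumption, so strip either one if present.
            for prefix in ('module.', 'model.'):
                if k.startswith(prefix):
                    k = k[len(prefix):]
            new_state_dict[k] = v
        # strict=False skips mismatched keys silently, so print what was skipped.
        print(model.load_state_dict(new_state_dict, strict=False))
    return model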

Raw Image

Output

As you can see above, the output of the network looks like a random picture. How can I solve this?

Complete code

I see that the paper has been accepted. Could you release the complete project code?

Tensorflow Implementation

Hi. A TensorFlow implementation would be good. Do you have any plans to implement this model in TensorFlow?

Is there a TensorRT implementation

Hi, I recently read the paper. It's exciting work that achieves state-of-the-art performance on Cityscapes.
First of all, thanks for sharing the work. I'd like to know whether you have ever made a TensorRT version. If so, can you share it?

Merge conv2d and bn

I am using your model for segmentation, and its accuracy is really amazing. However, the inference speed is not ideal. I noticed that you merged conv2d and batchnorm during inference; the missing merge is probably the reason for my low speed. However, after I started to work on this, I realized it is not easy. Could you provide the source code for merging conv2d and batchnorm? Thanks in advance!
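
For reference, merging a convolution with the BatchNorm that follows it is plain per-channel arithmetic once BatchNorm is in inference mode. The sketch below is not the author's code: it uses pure Python lists and toy numbers, and all function names are hypothetical. With s = gamma / sqrt(var + eps), the fused weights are w * s and the fused bias is (b - mean) * s + beta. A convolution without bias is the special case b = 0.

```python
import math

def fuse_conv_bn(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm statistics into conv weights and biases.

    weights[c] is the flattened kernel of output channel c; the other
    arguments hold one value per output channel.
    """
    fused_w, fused_b = [], []
    for w, b, g, bt, m, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_w.append([wi * scale for wi in w])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b

def conv_at_one_position(weights, biases, patch):
    """Output of each channel for one flattened input patch."""
    return [sum(wi * xi for wi, xi in zip(w, patch)) + b
            for w, b in zip(weights, biases)]

def batchnorm_eval(values, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm using the stored running statistics."""
    return [g * (y - m) / math.sqrt(v + eps) + bt
            for y, g, bt, m, v in zip(values, gamma, beta, mean, var)]

# Toy numbers: 2 output channels, a flattened patch of 3 inputs.
weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
biases = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
patch = [1.0, 2.0, -1.0]

reference = batchnorm_eval(conv_at_one_position(weights, biases, patch),
                           gamma, beta, mean, var)
fused_w, fused_b = fuse_conv_bn(weights, biases, gamma, beta, mean, var)
fused = conv_at_one_position(fused_w, fused_b, patch)
print(reference, fused)  # the two lists agree up to rounding
```

On real modules the same arithmetic is applied to the weight tensors; recent PyTorch releases also provide torch.nn.utils.fusion.fuse_conv_bn_eval, which should do this for a Conv2d and BatchNorm2d pair in eval mode.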

Will the training code be released?

@ydhongHIT, hello, will you release the training code? I would like to try training on my own dataset!
Thank you!!

the pretrained net on imagenet

Could you upload the model pretrained on ImageNet? Thanks!

SPP_super and Bottleneck_last

Going through your code, I cannot seem to find these two classes which are referred to in DDRNET_39.py

These do not exist in the 23 versions so this might be a mistake?

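About pretraining

Hello author! Thank you very much for your work! I noticed that you pretrained on the ImageNet dataset before fine-tuning, but the paper does not state how much ImageNet pretraining improves the model's segmentation accuracy. Could you tell me roughly how much improvement ImageNet pretraining brings?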

about training resolution

Hi, what training resolution do you use?

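About the auxiliary loss

Does the auxiliary loss also use OHEM? The ablation study in the paper seems to show that the auxiliary loss does not improve accuracy?

Sorry, it's me again. I used the train and val sets to train the model, but the mIoU on the test set is still low, just 0.54. Can you tell me more about your training details? Thank you very much!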

About the structure of the DDRNet

Dear Yuanduo

Thanks very much for your perfect Network.
I have a question about the structure of the DDRNet.

In your code, the bottleneck block in the low-resolution branch is built with stride=2, so I think the output size is halved after this bottleneck block.
In your paper, as the image below shows, conv5_1 of the low-resolution branch uses one residual basic block (stride=2) and one bottleneck block (stride=2). Why does the output size change only from 14x14 to 7x7 after two blocks that both have stride 2?

I would appreciate it if you would answer my question.

camvid weights question

Hi, thanks for your work and for sharing it.
I downloaded the DDRNet_23_slim weights pre-trained on the CamVid dataset. However, the model does not produce accurate semantic segmentation results when I normalize the images with mean_std = ([0.406, 0.456, 0.485], [0.225, 0.224, 0.229]). Could you tell me how to evaluate with this model?
Thank you a lot!

PS:

cityscapes pretrained weights of DDRNet-39

Congratulations and thank you for sharing this work. Unfortunately, in two attempts with the paper settings I was not able to reproduce the DDRNet-39 performance on Cityscapes.
Is it possible to provide the weights for the DDRNet-39 model on Cityscapes that achieves ~80% mIoU?
Thanks

resolution of segmentation output map

Is the resolution of the segmentation output map 1/8 of the input image resolution?
How can I get a segmentation output map with the same resolution as the input image?

Thanks,

Noticeable jagged edges at inference after training at 1024*1024

Is this normal? Is there any way to improve it?

Apply on 256*256 input

If the input size is 256*256, should I rescale it to a larger size, such as 1024*1024?

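Could you upload the ImageNet pretraining code?

Hello, I modified the DDRNet network and would therefore like to pretrain it again on ImageNet, but I am worried that a problem in the ImageNet pretraining would hurt the model's results.
Could you upload DDRNet's ImageNet pretraining code?
Thanks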

Multiple questions about DDRNet23 Slim Model

I built the "DDRNet23 Slim" model that you provided. My images have a shape of 1024 H x 1024 W x 3 C, and there are 8 different classes in my dataset. When I check the model with summary(net.cuda(),(3,1024,1024)), I get a summary like this:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 32, 512, 512]             896
       BatchNorm2d-2         [-1, 32, 512, 512]              64
              ReLU-3         [-1, 32, 512, 512]               0
            Conv2d-4         [-1, 32, 256, 256]           9,248
       BatchNorm2d-5         [-1, 32, 256, 256]              64
              ReLU-6         [-1, 32, 256, 256]               0
            Conv2d-7         [-1, 32, 256, 256]           9,216
       BatchNorm2d-8         [-1, 32, 256, 256]              64
              ReLU-9         [-1, 32, 256, 256]               0
           Conv2d-10         [-1, 32, 256, 256]           9,216
      BatchNorm2d-11         [-1, 32, 256, 256]              64
             ReLU-12         [-1, 32, 256, 256]               0
       BasicBlock-13         [-1, 32, 256, 256]               0
           Conv2d-14         [-1, 32, 256, 256]           9,216
      BatchNorm2d-15         [-1, 32, 256, 256]              64
             ReLU-16         [-1, 32, 256, 256]               0
           Conv2d-17         [-1, 32, 256, 256]           9,216
      BatchNorm2d-18         [-1, 32, 256, 256]              64
       BasicBlock-19         [-1, 32, 256, 256]               0
             ReLU-20         [-1, 32, 256, 256]               0
           Conv2d-21         [-1, 64, 128, 128]          18,432
      BatchNorm2d-22         [-1, 64, 128, 128]             128
             ReLU-23         [-1, 64, 128, 128]               0
           Conv2d-24         [-1, 64, 128, 128]          36,864
      BatchNorm2d-25         [-1, 64, 128, 128]             128
           Conv2d-26         [-1, 64, 128, 128]           2,048
      BatchNorm2d-27         [-1, 64, 128, 128]             128
             ReLU-28         [-1, 64, 128, 128]               0
       BasicBlock-29         [-1, 64, 128, 128]               0
           Conv2d-30         [-1, 64, 128, 128]          36,864
      BatchNorm2d-31         [-1, 64, 128, 128]             128
             ReLU-32         [-1, 64, 128, 128]               0
           Conv2d-33         [-1, 64, 128, 128]          36,864
      BatchNorm2d-34         [-1, 64, 128, 128]             128
       BasicBlock-35         [-1, 64, 128, 128]               0
             ReLU-36         [-1, 64, 128, 128]               0
           Conv2d-37          [-1, 128, 64, 64]          73,728
      BatchNorm2d-38          [-1, 128, 64, 64]             256
             ReLU-39          [-1, 128, 64, 64]               0
           Conv2d-40          [-1, 128, 64, 64]         147,456
      BatchNorm2d-41          [-1, 128, 64, 64]             256
           Conv2d-42          [-1, 128, 64, 64]           8,192
      BatchNorm2d-43          [-1, 128, 64, 64]             256
             ReLU-44          [-1, 128, 64, 64]               0
       BasicBlock-45          [-1, 128, 64, 64]               0
           Conv2d-46          [-1, 128, 64, 64]         147,456
      BatchNorm2d-47          [-1, 128, 64, 64]             256
             ReLU-48          [-1, 128, 64, 64]               0
           Conv2d-49          [-1, 128, 64, 64]         147,456
      BatchNorm2d-50          [-1, 128, 64, 64]             256
       BasicBlock-51          [-1, 128, 64, 64]               0
             ReLU-52         [-1, 64, 128, 128]               0
           Conv2d-53         [-1, 64, 128, 128]          36,864
      BatchNorm2d-54         [-1, 64, 128, 128]             128
             ReLU-55         [-1, 64, 128, 128]               0
           Conv2d-56         [-1, 64, 128, 128]          36,864
      BatchNorm2d-57         [-1, 64, 128, 128]             128
             ReLU-58         [-1, 64, 128, 128]               0
       BasicBlock-59         [-1, 64, 128, 128]               0
           Conv2d-60         [-1, 64, 128, 128]          36,864
      BatchNorm2d-61         [-1, 64, 128, 128]             128
             ReLU-62         [-1, 64, 128, 128]               0
           Conv2d-63         [-1, 64, 128, 128]          36,864
      BatchNorm2d-64         [-1, 64, 128, 128]             128
       BasicBlock-65         [-1, 64, 128, 128]               0
             ReLU-66         [-1, 64, 128, 128]               0
           Conv2d-67          [-1, 128, 64, 64]          73,728
      BatchNorm2d-68          [-1, 128, 64, 64]             256
             ReLU-69          [-1, 128, 64, 64]               0
           Conv2d-70           [-1, 64, 64, 64]           8,192
      BatchNorm2d-71           [-1, 64, 64, 64]             128
             ReLU-72          [-1, 128, 64, 64]               0
           Conv2d-73          [-1, 256, 32, 32]         294,912
      BatchNorm2d-74          [-1, 256, 32, 32]             512
             ReLU-75          [-1, 256, 32, 32]               0
           Conv2d-76          [-1, 256, 32, 32]         589,824
      BatchNorm2d-77          [-1, 256, 32, 32]             512
           Conv2d-78          [-1, 256, 32, 32]          32,768
      BatchNorm2d-79          [-1, 256, 32, 32]             512
             ReLU-80          [-1, 256, 32, 32]               0
       BasicBlock-81          [-1, 256, 32, 32]               0
           Conv2d-82          [-1, 256, 32, 32]         589,824
      BatchNorm2d-83          [-1, 256, 32, 32]             512
             ReLU-84          [-1, 256, 32, 32]               0
           Conv2d-85          [-1, 256, 32, 32]         589,824
      BatchNorm2d-86          [-1, 256, 32, 32]             512
       BasicBlock-87          [-1, 256, 32, 32]               0
             ReLU-88         [-1, 64, 128, 128]               0
           Conv2d-89         [-1, 64, 128, 128]          36,864
      BatchNorm2d-90         [-1, 64, 128, 128]             128
             ReLU-91         [-1, 64, 128, 128]               0
           Conv2d-92         [-1, 64, 128, 128]          36,864
      BatchNorm2d-93         [-1, 64, 128, 128]             128
             ReLU-94         [-1, 64, 128, 128]               0
       BasicBlock-95         [-1, 64, 128, 128]               0
           Conv2d-96         [-1, 64, 128, 128]          36,864
      BatchNorm2d-97         [-1, 64, 128, 128]             128
             ReLU-98         [-1, 64, 128, 128]               0
           Conv2d-99         [-1, 64, 128, 128]          36,864
     BatchNorm2d-100         [-1, 64, 128, 128]             128
      BasicBlock-101         [-1, 64, 128, 128]               0
            ReLU-102         [-1, 64, 128, 128]               0
          Conv2d-103          [-1, 128, 64, 64]          73,728
     BatchNorm2d-104          [-1, 128, 64, 64]             256
            ReLU-105          [-1, 128, 64, 64]               0
          Conv2d-106          [-1, 256, 32, 32]         294,912
     BatchNorm2d-107          [-1, 256, 32, 32]             512
            ReLU-108          [-1, 256, 32, 32]               0
          Conv2d-109           [-1, 64, 32, 32]          16,384
     BatchNorm2d-110           [-1, 64, 32, 32]             128
            ReLU-111         [-1, 64, 128, 128]               0
          Conv2d-112         [-1, 64, 128, 128]           4,096
     BatchNorm2d-113         [-1, 64, 128, 128]             128
            ReLU-114         [-1, 64, 128, 128]               0
          Conv2d-115         [-1, 64, 128, 128]          36,864
     BatchNorm2d-116         [-1, 64, 128, 128]             128
            ReLU-117         [-1, 64, 128, 128]               0
          Conv2d-118        [-1, 128, 128, 128]           8,192
     BatchNorm2d-119        [-1, 128, 128, 128]             256
          Conv2d-120        [-1, 128, 128, 128]           8,192
     BatchNorm2d-121        [-1, 128, 128, 128]             256
      Bottleneck-122        [-1, 128, 128, 128]               0
            ReLU-123          [-1, 256, 32, 32]               0
          Conv2d-124          [-1, 256, 32, 32]          65,536
     BatchNorm2d-125          [-1, 256, 32, 32]             512
            ReLU-126          [-1, 256, 32, 32]               0
          Conv2d-127          [-1, 256, 16, 16]         589,824
     BatchNorm2d-128          [-1, 256, 16, 16]             512
            ReLU-129          [-1, 256, 16, 16]               0
          Conv2d-130          [-1, 512, 16, 16]         131,072
     BatchNorm2d-131          [-1, 512, 16, 16]           1,024
          Conv2d-132          [-1, 512, 16, 16]         131,072
     BatchNorm2d-133          [-1, 512, 16, 16]           1,024
      Bottleneck-134          [-1, 512, 16, 16]               0
     BatchNorm2d-135          [-1, 512, 16, 16]           1,024
            ReLU-136          [-1, 512, 16, 16]               0
          Conv2d-137          [-1, 128, 16, 16]          65,536
       AvgPool2d-138            [-1, 512, 8, 8]               0
     BatchNorm2d-139            [-1, 512, 8, 8]           1,024
            ReLU-140            [-1, 512, 8, 8]               0
          Conv2d-141            [-1, 128, 8, 8]          65,536
     BatchNorm2d-142          [-1, 128, 16, 16]             256
            ReLU-143          [-1, 128, 16, 16]               0
          Conv2d-144          [-1, 128, 16, 16]         147,456
       AvgPool2d-145            [-1, 512, 4, 4]               0
     BatchNorm2d-146            [-1, 512, 4, 4]           1,024
            ReLU-147            [-1, 512, 4, 4]               0
          Conv2d-148            [-1, 128, 4, 4]          65,536
     BatchNorm2d-149          [-1, 128, 16, 16]             256
            ReLU-150          [-1, 128, 16, 16]               0
          Conv2d-151          [-1, 128, 16, 16]         147,456
       AvgPool2d-152            [-1, 512, 2, 2]               0
     BatchNorm2d-153            [-1, 512, 2, 2]           1,024
            ReLU-154            [-1, 512, 2, 2]               0
          Conv2d-155            [-1, 128, 2, 2]          65,536
     BatchNorm2d-156          [-1, 128, 16, 16]             256
            ReLU-157          [-1, 128, 16, 16]               0
          Conv2d-158          [-1, 128, 16, 16]         147,456
AdaptiveAvgPool2d-159            [-1, 512, 1, 1]               0
     BatchNorm2d-160            [-1, 512, 1, 1]           1,024
            ReLU-161            [-1, 512, 1, 1]               0
          Conv2d-162            [-1, 128, 1, 1]          65,536
     BatchNorm2d-163          [-1, 128, 16, 16]             256
            ReLU-164          [-1, 128, 16, 16]               0
          Conv2d-165          [-1, 128, 16, 16]         147,456
     BatchNorm2d-166          [-1, 640, 16, 16]           1,280
            ReLU-167          [-1, 640, 16, 16]               0
          Conv2d-168          [-1, 128, 16, 16]          81,920
     BatchNorm2d-169          [-1, 512, 16, 16]           1,024
            ReLU-170          [-1, 512, 16, 16]               0
          Conv2d-171          [-1, 128, 16, 16]          65,536
           DAPPM-172          [-1, 128, 16, 16]               0
     BatchNorm2d-173        [-1, 128, 128, 128]             256
            ReLU-174        [-1, 128, 128, 128]               0
          Conv2d-175         [-1, 64, 128, 128]          73,728
     BatchNorm2d-176         [-1, 64, 128, 128]             128
            ReLU-177         [-1, 64, 128, 128]               0
          Conv2d-178          [-1, 8, 128, 128]             520
     segmenthead-179          [-1, 8, 128, 128]               0
================================================================
Total params: 5,695,272
Trainable params: 5,695,272
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 12.00
Forward/backward pass size (MB): 1181.08
Params size (MB): 21.73
Estimated Total Size (MB): 1214.80
----------------------------------------------------------------

As you can see, the output layer has a resolution of 128 H x 128 W, but my labels have shape 1024 H x 1024 W. I could resize my labels from 1024 pixels to 128 pixels, but this would lose a lot of pixel information. Is this configuration correct for a 1024-pixel input? Does the output have to be 128 pixels, i.e. 1/8 of the input size? @ydhongHIT
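
The 128 x 128 output follows from the strides visible in the summary: three stride-2 convolutions reduce 1024 to 512, 256 and 128 on the high-resolution path. Below is a small sketch of that arithmetic using the standard convolution output-size formula. A kernel size of 3 is consistent with the parameter counts in the table, while padding 1 is an assumption; both helper names are hypothetical.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_ch, out_ch, kernel=3, bias=True):
    """Number of parameters of a square-kernel 2D convolution."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)

sizes = [1024]
for _ in range(3):  # three stride-2 convolutions on the high-resolution path
    sizes.append(conv_out_size(sizes[-1]))
print(sizes)               # [1024, 512, 256, 128]
print(conv_params(3, 32))  # 896, as in the Conv2d-1 row
```

To compare such an output with full-resolution labels, a common choice is to upsample the logits back to the label size (for example with F.interpolate, which the model code already uses internally) rather than shrinking the labels.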

Saved model without pretraining

Hello Yuanduo,

I think your achievement is really awesome. I am looking into your code and your model to develop more on it.

I was wondering whether the "best.pth" model is the trained model without Imagenet pre-training.

I remember I got the file from the Google Drive, and it reaches about 75.xx on the Cityscapes validation set.

Could you confirm it?