
datvuthanh / hybridnets

574 stars | 16 watchers | 118 forks | 55.68 MB

HybridNets: End-to-End Perception Network

License: MIT License

Python 84.91% Jupyter Notebook 6.42% CMake 0.37% C++ 8.30%
detection bifpn segmentation multitask-learning hybridnets autonomous-driving end2end-network

hybridnets's Introduction

loss.backward()

hybridnets's People

Contributors

datvuthanh, xoiga123


hybridnets's Issues

Guide for Custom Dataset Training

Hi,

Is there a guide or tutorial for training on a custom dataset, especially for fine-tuning the pretrained weights on a single downstream task (object detection labels only)?

Kind regards,
Talal

Training log?

Hello,
I want to retrain your model. Could you provide a training log file?
Thanks very much.

A problem in training?

Hello, after finishing one epoch of training, my training process is killed during validation, with no other error message. What could be the cause?

Evaluate the model ?

hi, I have two problems and hope you can help me out, thank you!

  1. I have trained a new model and want to evaluate it. There are two functions in val.py, val() and val_from_cmd(); what is the difference between them?
  2. I have modified the number of categories to 10, and when I run val.py there is an error: ValueError: operands could not be broadcast together with shapes (12,) (4,)

Lightweight backbone

Would you support some lightweight backbones, such as RepVGG, which are GPU friendly?

Checkpoint File?

Hi, I'm trying to reproduce your results. Could you please provide the best checkpoint file? Thanks in advance!

Train flow

Hello,
Regarding the "Training stages" section: are the three steps independent? My understanding is that the second step continues training from the model produced by the first step, the third step continues from the model of the second step, and the weights for the second and third steps are specified with -w.
Is that right?

Thank you

MultiLabel Classification

@datvuthanh @xoiga123 Thanks for sharing the code base, it is really helpful, but I have a few queries:

  1. Can we also get multi-label classification from the existing code? I see two things: the annotations contain an attribute like attribute:{"trafficator": green}, and loss.py has a MULTILABEL mode macro.
  2. I am looking for a single bounding box with a multi-label classification output; what modifications need to be made to the existing code?

Thanks in advance

Out of memory error during call val.py?

During the training phase, when val.py is called to evaluate model performance, the run hangs for about half an hour and then prints "killed". Using the command:

dmesg | egrep -i -B100 "killed process"

it reports that the Python process was killed because an out-of-memory error occurred.

How to solve this problem?

Lane Line mean Intersection Over Union

I've noticed that the abstract states 31.6 as the mean Intersection over Union. I wonder whether 31.6 is actually the lane-line IoU alone (without background) rather than a mean over classes.
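To make the distinction concrete, a toy illustration (invented numbers, not from the paper): the mean IoU averages per-class IoUs, so an easy background class can pull the mean far above the lane-only IoU.

iou_lane = 0.316                        # IoU of the lane class alone
iou_background = 0.95                   # background is usually easy, so high IoU
miou = (iou_lane + iou_background) / 2  # 0.633: averaging inflates the metric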

Did anyone successfully export onnx?

The code is as follows, but nothing gets exported:

import os
import torch
from pathlib import Path

# Imports assumed from the HybridNets repo layout (utils/utils.py and
# backbone.py both appear in tracebacks elsewhere on this page):
from backbone import HybridNetsBackbone
from utils.utils import Params

weight_path = 'weights/hybridnets.pth'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
params = Params(os.path.join(Path(__file__).resolve().parent, "projects/bdd100k.yml"))
model = HybridNetsBackbone(num_classes=len(params.obj_list), compound_coef=3,
                           ratios=eval(params.anchors_ratios), scales=eval(params.anchors_scales),
                           seg_classes=len(params.seg_list), backbone_name=None)
model.load_state_dict(torch.load(weight_path, map_location=device))
model.eval()
inputs = torch.randn(1, 3, 384, 640)
print("begin to convert onnx")
torch.onnx.export(model, inputs, 'HybridNetsBackbone.onnx',
                  verbose=False, opset_version=12, input_names=['images'])
print("done")

shell log:

HybridNets/utils/utils.py:673: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  anchor_boxes = torch.from_numpy(anchor_boxes.astype(dtype)).to(image.device)
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
(this warning repeats 9 times)

...
ONNX export failed: Couldn't export Python operator SwishImplementation
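(Context, not from this thread: SwishImplementation is the custom autograd.Function behind EfficientNet-PyTorch's memory-efficient swish, and torch.onnx cannot export custom Functions. In EfficientNet-PyTorch itself the usual workaround is to switch to the plain swish before export; whether and where HybridNets' backbone exposes the same hook is an assumption to verify.)

# Sketch assuming an EfficientNet-PyTorch backbone; set_swish() is that library's
# hook for swapping the custom autograd swish for a plain x * sigmoid(x):
from efficientnet_pytorch import EfficientNet

net = EfficientNet.from_name('efficientnet-b3')
net.set_swish(memory_efficient=False)  # now the activation is traceable for ONNX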

How to replace Conv2dStaticSamePadding and MaxPool2dStaticSamePadding in order to be able to transfer out dlc on the Qualcomm platform

As shown above, when using the Qualcomm toolchain to export a dlc file, there is an error:

Encountered Error: ERROR_ASYMMETRIC_PADS_VALUES: Asymmetric pads values is not supported

I tried EfficientNet-PyTorch and exported a dlc successfully.
I found the problem arises from Conv2dStaticSamePadding and MaxPool2dStaticSamePadding, which use F.pad for asymmetric padding.
So, how can I replace them?
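One possible direction (a sketch under the assumption of a fixed export input size, not a confirmed fix): compute the TF-style "same" padding for that fixed size and use a plain, symmetrically padded layer whenever the total padding splits evenly; where it doesn't (e.g. a stride-2 conv with an odd kernel on an even input needs an odd total pad), the input size itself has to be adjusted first.

import math
import torch.nn as nn

def same_pad(n, k, s):
    # TF-style 'same' total padding for one spatial dim of size n,
    # kernel k, stride s; it must be even to split symmetrically.
    total = max((math.ceil(n / s) - 1) * s + k - n, 0)
    assert total % 2 == 0, "adjust the input size so the padding is symmetric"
    return total // 2

# Example: a stride-1 3x3 conv on a fixed 384x640 input pads 1 on every side,
# so a plain nn.Conv2d can replace Conv2dStaticSamePadding exactly:
conv = nn.Conv2d(3, 64, kernel_size=3, stride=1,
                 padding=(same_pad(384, 3, 1), same_pad(640, 3, 1)))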

Docker support?

Thanks for sharing great work!

I get a segmentation fault when running the image/video demo with the pre-trained weights. Differing dependency versions (e.g. CUDA) may be causing the error. Would it be possible to provide a Dockerfile to build a container?

Evaluation results not accurate?

Hi,
I've been trying to recreate the results from the HybridNets paper. I've run the eval code on the BDD100K dataset, but the results I'm getting are nowhere close to the ones presented in the paper. I'm sure I'm missing something; could you please tell me what could be going wrong? I suspect the root locations I'm passing are incorrect:

data root - raw images from the 100k dataset
label root - I saved a separate JSON file for each image out of the "bdd100k_labels_images_val.json" downloaded from the BDD100K dataset, i.e. 10,000 separate JSON files (one per image) in a folder named "val".
Road labels in seg_list - I used the drivable masks
Lane labels in seg_list - I used the lane masks

The output I got after evaluating 100 images:
[screenshot of the evaluation output]

The IoU values are inconsistent with the ones mentioned in the paper; they are nowhere near them. I was also wondering why the precision is so low: is there a specific reason for so many false positives? I hope you can help me out, thanks.

inference latency

The paper reports an inference latency of 37 ms on a V100 with FP16.
Was this measured with TensorRT or with plain Python inference?

And how about the speed including preprocessing and NMS postprocessing?

thanks very much!

The loss doesn't converge when training segmentation head only.

I changed the backbone to EfficientNet-B0 and reduced the number of BiFPN layers from 6 to 1, in order to cut down inference runtime. After training 200 epochs with the segmentation head frozen, I tried to train the model with the backbone and detection head frozen. But the training loss of the segmentation head does not seem to converge, while the validation loss and the mIoU are both decreasing at the same time, which doesn't make sense.
Apart from that, I also found that when the segmentation head is frozen, the segmentation loss is not set to 0 in the code, which I suppose can still affect the weight updates in the backbone (see the "[Discussion] Gradient flow" issue below).

Inferring pictures and videos

During model validation, when inferring on pictures and videos, I get: RuntimeError: unexpected EOF, expected 596029 more bytes. The file might be corrupted

To control the network configuration

Hello,
How are you?
I want to train a model for object detection ONLY, without segmentation.
Could you provide a way to control this in the .yaml file?
Thanks.

Could not use Pytorch quantization for model

import copy
import torch
from torch.quantization import quantize_fx

model_to_quantize = copy.deepcopy(model)
qconfig_dict = {"": torch.quantization.get_default_qconfig('qnnpack')}
model_to_quantize.eval()

# prepare
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_dict)
# calibrate (not shown)
# quantize
model_quantized = quantize_fx.convert_fx(model_prepared)

When using this PyTorch quantization example with your model, I get the following error:

~/Documents/DL_course_project/HybridNets/backbone.py in forward(self, inputs)
    100
    101         # p1, p2, p3, p4, p5 = self.backbone_net(inputs)
--> 102         p2, p3, p4, p5 = self.encoder(inputs)[-4:]  # self.backbone_net(inputs)
    103
    104         features = (p3, p4, p5)

NameError: module is not installed as a submodule

How can I avoid this error?
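(An aside, not a confirmed fix: quantize_fx relies on torch.fx symbolic tracing, and the NameError above looks like the tracer failing on a call it cannot resolve as a submodule. Eager-mode dynamic quantization avoids tracing entirely, at the cost of only covering certain module types; a minimal sketch, assuming model is the loaded HybridNets model:)

import torch

# Dynamic quantization quantizes only the listed module types (convolutions are
# not supported by this mode, so coverage on a conv-heavy model is limited):
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)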

How to generate drivable area and lane masks?

I know you shared a drive link with the BDD100K drivable-area and lane masks for the dataloader, but I want to replicate the process to understand what to do for my custom dataset. I looked in the BDD repo and there is a "to_mask.py" with scripts for this; it passes its own tests, but I cannot get results similar to the ones you shared. Can you please explain how to generate those masks? Thanks in advance.

Reproducing training results

First of all, thanks for sharing your study. I am trying to reproduce your results by training the network with the strategy you described. So that I can compare my progress, could you please share your loss plots for each phase (freeze seg; freeze backbone and det; train end-to-end)?

Drivable and Lane Type training

@datvuthanh Thanks for sharing the codebase. I have the following queries:

  1. Since BDD100K has different lane types, e.g. double yellow, single white, dashed, can we use the current source code to train for different lane types? If so, what modifications need to be made in the codebase?
  2. Can we similarly train the current source code with the drivable area and alternate drivable area labels? If so, what changes need to be made?

Please share your thoughts
Thanks in advance

Lane color and Lane type Segmentation

In the traffic-light and traffic-sign detection issue, you mentioned that we just have to change the obj_list in project.yml by adding the classes needed. Does that apply to seg_list as well?

If I want to detect lane color and lane type, can I change seg_list as follows?

seg_list: ['road', 'double white', 'double yellow', 'single white', 'single yellow', 'solid', 'dashed']

Actually, I need classes like double solid yellow, double solid white, single solid yellow, single solid white, single dashed yellow, single dashed white, double dashed yellow, and double dashed white, but BDD100K already labels double white, double yellow, single white, and single yellow under Lane Categories, and solid and dashed under Lane Styles.

Will the change in seg_list shown above work? If not, how should it be done?

[Discussion] Gradient flow

Back when we were toying with mosaic, we removed the segmentation head completely from the model and dataloader. Now that we are trying to add mosaic augmentation officially, we have to decide not to use it for segmentation training.

hybridnets/dataset.py

if self.use_mosaic:
    # honestly, mosaic is not for road and lane segmentation anyway
    # you cant expect road and lane to be split up in 4 separate corners in an image, do you?
    # only use mosaic with freeze_seg :)
    img, labels, seg_label, lane_label, (h0, w0), (h, w), path = self.load_mosaic(idx)

Only the images and object annotations are mosaicked, while the segmentation annotations are kept intact. That produces an incorrect segmentation loss, but we thought it didn't matter because we froze the segmentation head anyway, assuming requires_grad=False removes the segmentation head from the backprop graph. That assumption is wrong: the backbone is still affected by the segmentation loss.
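A minimal sketch of that point (toy modules, not the actual HybridNets graph): freezing a head's parameters does not stop gradients from flowing through it into the shared backbone.

import torch
import torch.nn as nn

backbone = nn.Linear(4, 4)
seg_head = nn.Linear(4, 1)
for p in seg_head.parameters():
    p.requires_grad = False             # "freeze" the segmentation head

seg_loss = seg_head(backbone(torch.randn(2, 4))).mean()
seg_loss.backward()

print(seg_head.weight.grad)             # None: the frozen head gets no gradient
print(backbone.weight.grad is None)     # False: the backbone still gets one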

Check this colab for interactive stuff.

So we've been planning to simply set the losses to 0 when you pass --freeze_det / --freeze_seg, like this:

cls_loss, reg_loss, seg_loss, regression, classification, anchors, segmentation = model(imgs, annot, seg_annot, obj_list=params.obj_list)
cls_loss = cls_loss.mean() if not opt.freeze_det else 0
reg_loss = reg_loss.mean() if not opt.freeze_det else 0
seg_loss = seg_loss.mean() if not opt.freeze_seg else 0

Is this approach too naive? Are there any recommendations regarding this matter? Or should we also mosaic the segmentation labels?

How to select number of gpus?

Hi,
I want to train on multiple GPUs with train_DDP.py, but I don't know which parameter determines the number of GPUs.
Looking forward to your reply! Thank you!

Problem in Training stage

Hello,
I tried to follow your suggested training procedure. Accordingly, I first froze the segmentation head and trained for some epochs.

python train.py -p bdd100k -c 3 -n 4 -b 8 --freeze_seg True --lr 1e-5 --optim adamw --num_epochs 75 --val_interval 1 --log_path D:\HybridNets\rgb-clean --saved_path D:\HybridNets\rgb-clean --save_interval 500 --verbose True --num_gpus 1 --plots True

After that, I froze the backbone and the detection head.

python train.py -p bdd100k -c 3 -n 4 -b 8 --freeze_backbone True --freeze_det True --lr 1e-5 --optim adamw --num_epochs 12 --val_interval 1 --log_path D:\HybridNets\rgb_clean --saved_path D:\HybridNets\rgb_clean --save_interval 500 --verbose True --num_gpus 1 --plots True -w D:\HybridNets\rgb-clean\bdd100k\hybridnets-d3_74_129225_best.pth

But then I get an error (attached as a screenshot).

Can you please suggest how to solve this issue?

Thank you in advance

How to prepare the bdd100k dataset?

Hi,
Following the README.txt, I prepared the BDD100K dataset, but BddDataset fails to load it. I think something is wrong with my preparation process: in particular, I have no idea where to put the colormaps, masks, polygons, and rles folders for drivable and lane, and I don't know where to put the JSON file of detection labels.

Please give the detailed folder structure.

AssertionError BUG

If I only want to segment one class, i.e. seg_list contains only 'road', and I run train.py, then in loss.py line 538, inside soft_tversky_score, the assertion assert output.size() == target.size() fails with an AssertionError.

Debugging, I found output.size() = torch.Size([2, 1, 245760]) and target.size() = torch.Size([2, 1, 491520]), i.e. the target is exactly twice the size of the output.

How can this be fixed?

Issue with FPS mistake in the article

Hello. First of all, thank you for this work.

I noticed a mistake in the code computing inf_time and FPS.

I therefore think the inference time in the article may have been calculated incorrectly: the article says YOLOP has an inference time of 52 ms per frame (batch size 1), which would mean about 20 FPS, although the YOLOP paper reports 41 FPS.

And in hybridnets_test.py, I measured HybridNets' inference time at about 0.06 s (model(x) only), which means 17-20 FPS on a Tesla V100, but about 0.021 s (model(x) only) for YOLOP, which means about 48 FPS on a Tesla V100.

Sadly, it may not be faster than YOLOP, and it may not reach real time.

Tensor size mismatch

@datvuthanh @xoiga123
Hi. I am receiving the error below when I pass a .jpg input image of size (2160, 4096, 3) for testing. Can you please help me resolve this issue? Thank you!

Command I ran - python hybridnets_test.py -w weights/hybridnets.pth --source demo/image --output demo_result --imshow False --imwrite True

Traceback (most recent call last):
  File "hybridnets_test.py", line 121, in <module>
    features, regression, classification, anchors, seg = model(x)
  File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "folder/HybridNets/backbone.py", line 104, in forward
    features = self.bifpn(features)
  File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "folder/HybridNets/hybridnets/model.py", line 179, in forward
    outs = self._forward_fast_attention(inputs)
  File "folder/HybridNets/hybridnets/model.py", line 211, in _forward_fast_attention
    p5_up = self.conv5_up(self.swish(weight[0] * p5_in + weight[1] * self.p5_upsample(p6_up)))
RuntimeError: The size of tensor a (11) must match the size of tensor b (12) at non-singleton dimension 2

amp & channels_last

channels_last:

While PyTorch operators expect all tensors to be in channels-first (NCHW) dimension format, PyTorch supports 3 output memory formats, including:

  • Contiguous: tensor memory is in the same order as the tensor's dimensions.
  • ChannelsLast: irrespective of the dimension order, the 2d (image) tensor is laid out as an HWC or NHWC (N: batch, H: height, W: width, C: channels) tensor in memory. The dimensions could be permuted in any order.
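A minimal channels_last sketch (generic PyTorch, not HybridNets-specific): both the module weights and the input need converting, after which cuDNN can pick NHWC kernels.

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
conv = conv.to(memory_format=torch.channels_last)    # convert the weights
x = torch.randn(8, 3, 384, 640, device="cuda")
x = x.to(memory_format=torch.channels_last)          # convert the input
y = conv(x)
print(y.is_contiguous(memory_format=torch.channels_last))  # True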

amp:
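A minimal AMP training-step sketch (dummy model and data, not the HybridNets loop): autocast runs the forward pass in mixed precision, while GradScaler guards fp16 gradients against underflow.

import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

imgs = torch.randn(2, 3, 384, 640, device="cuda")
optimizer.zero_grad()
with torch.cuda.amp.autocast():      # forward + loss in mixed precision
    loss = model(imgs).mean()
scaler.scale(loss).backward()        # scale the loss so fp16 grads don't underflow
scaler.step(optimizer)               # unscales the grads, then applies the update
scaler.update()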

extra:

prefetch:

Going to train from scratch to see what's good, with a working log this time.
UPDATE 12/07/2022: It seems the bottleneck is in data loading, which takes an unholy amount of time even though I cached everything in RAM. Currently profiling CPU & GPU and trying out this dataloader, which allegedly does real prefetching.
UPDATE: It all makes sense now: PyTorch's DataLoader can only prefetch batches within the currently running epoch. For the next epoch, there is apparently no prefetch whatsoever.

Multi-class vs multi-label segmentation

#15 #38
We were using a multi-label dataloader, loss, and metrics for a multi-class problem. Basically they work fine and the results are correct (maybe the focal-loss segment is a little bit off; who knows, we will check further), but to someone reading the code, the semantic meaning is wrong.

TODO: Generalize to multi-class as default, with a switch to multi-label.
TODO in another issue: Multi-label for object detection.

Issue with FPS calculation code.

Hello. First of all, great work.

While running hybridnets_test_videos.py I found an issue with the FPS calculation. In that script, the FPS is calculated as:

(t2-t1)/frame_count

But that expression gives the inference time per frame, not the FPS. It should most probably be:

1/((t2-t1)/frame_count)

that is, the reciprocal, which equals frame_count/(t2-t1).
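For illustration, a self-contained sketch of the two quantities (a dummy loop standing in for per-frame inference):

import time

frame_count = 100
t1 = time.perf_counter()
for _ in range(frame_count):
    pass                                    # stand-in for model(x) on one frame
t2 = time.perf_counter()

sec_per_frame = (t2 - t1) / frame_count     # what the script currently reports
fps = frame_count / (t2 - t1)               # the actual FPS, i.e. 1 / sec_per_frame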

Please let me know if any updates happen on this front.
