datvuthanh / hybridnets
HybridNets: End-to-End Perception Network
License: MIT License
Hi,
Is there any guide/tutorial to train on a custom dataset? Especially fine tuning the pretrained weights towards a single downstream task (only object detection labels)?
Kind regards,
Talal
Hello
I want to retrain your model.
Could you provide a training log file?
thanks very much
Hi, I have two problems and hope you can help me out. Thank you!
Would you support lightweight, GPU-friendly backbones such as RepVGG?
Hi, I'm trying to reproduce your results. Could you please provide the best checkpoint file? Thanks in advance!
hello,
In the "Training stages" section:
Are these three steps independent? My understanding is that the second step continues training from the model produced by the first step, the third step continues from the model produced by the second, and the weights for the second and third steps are specified with -w.
Is that right?
Thank you
When my seg_list is ['road', 'lane'], the output segmentation mask contains the class values [0, 1, 2]. What do they mean?
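One plausible reading of the three values is sketched below. It assumes that index 0 is an implicit background class placed in front of the entries of seg_list; this is an assumption and is not confirmed anywhere in this thread. The helper name mask_index_to_name is hypothetical.

```python
# Assumption (not confirmed by the maintainers): the segmentation output has
# len(seg_list) + 1 classes, with index 0 reserved for background.
seg_list = ['road', 'lane']


def mask_index_to_name(seg_list):
    """Map each mask value to a class name, treating 0 as background."""
    names = ['background'] + list(seg_list)
    return {index: name for index, name in enumerate(names)}


index_to_name = mask_index_to_name(seg_list)
for index, name in index_to_name.items():
    print(index, name)
```

Under that assumption, a mask containing [0, 1, 2] would simply be using every available class.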
How do I modify bdd100k.yaml so that object detection also covers traffic signs, traffic lights and pedestrians?
Thanks,
@datvuthanh @xoiga123 Thanks for sharing the code base, it is really helpful, but I have a few queries.
Thanks in advance
In the training phase, when val.py is called to evaluate model performance, the process hangs for about half an hour and then prints "Killed". Running the command:
dmesg | egrep -i -B100 "killed process"
reports that the Python process was killed because an out-of-memory error occurred.
How can I solve this problem?
I've noticed that the abstract describes 31.6 as mean Intersection over Union. I wonder whether 31.6 is actually Intersection over Union (without background) rather than mean Intersection over Union.
The code below exports nothing:
weight_path = 'weights/hybridnets.pth'
device = 'cuda' if torch.cuda.is_available() else 'cpu'
params = Params(os.path.join(Path(__file__).resolve().parent, "projects/bdd100k.yml"))
model = HybridNetsBackbone(num_classes=len(params.obj_list), compound_coef=3,
                           ratios=eval(params.anchors_ratios), scales=eval(params.anchors_scales),
                           seg_classes=len(params.seg_list), backbone_name=None)
model.load_state_dict(torch.load(weight_path, map_location=device))
model.eval()
inputs = torch.randn(1, 3, 384, 640)
print("begin to convert onnx")
torch.onnx.export(model, inputs, 'HybridNetsBackbone.onnx',
                  verbose=False, opset_version=12, input_names=['images'])
print("done")
shell log:
HybridNets/utils/utils.py:673: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
anchor_boxes = torch.from_numpy(anchor_boxes.astype(dtype)).to(image.device)
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
Warning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied.
...
ONNX export failed: Couldn't export Python operator SwishImplementation
As shown above, when I use the Qualcomm toolchain to export a DLC file, I get this error:
Encountered Error: ERROR_ASYMMETRIC_PADS_VALUES: Asymmetric pads values is not supported
I can export EfficientNet-PyTorch to DLC successfully.
I found that the problem arises from Conv2dStaticSamePadding and MaxPool2dStaticSamePadding, which use F.pad to apply asymmetric padding.
So, how can I replace them?
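To see where the asymmetric pads come from, here is a minimal sketch of TensorFlow-style "same" padding in one dimension, which is the usual rule behind static same-padding layers. The function names are hypothetical, and the symmetric variant is only one possible workaround, not something the maintainers have endorsed here.

```python
import math


def same_padding_1d(size, kernel, stride):
    """Return (pad_before, pad_after) for TensorFlow-style 'same' padding."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    return before, total - before  # the extra pixel, if any, goes at the end


def symmetric_padding_1d(size, kernel, stride):
    """Hypothetical workaround: pad both sides by the larger of the two amounts."""
    before, after = same_padding_1d(size, kernel, stride)
    pad = max(before, after)
    return pad, pad


def conv_out(size, kernel, stride, pad_before, pad_after):
    """Output length of a convolution or pooling window with explicit padding."""
    return (size + pad_before + pad_after - kernel) // stride + 1


# Stride 2 on an even input needs an odd total pad, so the pads are asymmetric.
asym = same_padding_1d(384, kernel=3, stride=2)
sym = symmetric_padding_1d(384, kernel=3, stride=2)
print(asym, conv_out(384, 3, 2, *asym))
print(sym, conv_out(384, 3, 2, *sym))
```

In this example the symmetric version keeps the same output length, but the kernel windows are shifted by one pixel, so results with pretrained weights would not be numerically identical and accuracy would need to be re-checked.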
Thanks for sharing great work!
I get a segmentation fault when running the image/video demo with the pre-trained weights. Different versions of dependencies (e.g. CUDA) may cause the error. Would it be possible to provide a Dockerfile to build a container?
Hi,
I've been trying to recreate your results from the HybridNets paper. I ran the eval code on the 100k dataset, but the results I'm getting are nowhere close to the ones you presented in the paper. I'm sure I'm missing something here. Could you please tell me what might be going wrong?
I think the root locations that I'm giving are inaccurate.
data root - raw images from the 100k dataset
label root - I saved separate json file for each image from the "bdd100k_labels_images_val.json" that I downloaded from the bdd100k dataset. So in total I saved 10,000 separate json (one for each image) into another folder named "val".
Road labels in Seg_list - I used the drivable masks
Lane labels in Seg_list - I used the lane masks
The output that I got after evaluation of 100 images:
The IoU values are nowhere near the ones reported in the paper. I was also wondering why the precision is so low. Is there a specific reason why there are so many false positives? I hope you can help me out, thanks.
The paper reports an inference latency of 37 ms on a V100 with FP16.
Was this measured with TensorRT or with plain Python inference?
And what is the speed including preprocessing and NMS postprocessing?
Thanks very much!
I changed the backbone to EfficientNet-B0 and reduced the number of BiFPN layers from 6 to 1 in order to cut down inference time. After training for 200 epochs with the segmentation head frozen, I tried to train the model with the backbone and detection head frozen. But I found that the training loss of the segmentation head does not seem to converge, and the validation loss and mIoU are both decreasing at the same time, which doesn't make sense.
Apart from that, I also found that when the segmentation head is frozen, the segmentation loss is not set to 0 in the code, which I suppose can affect the weight updates in the backbone.
Hello,
I'm encountering an issue where loss.py is trying to import a function named display from utils/utils.py but this function is undefined. I couldn't run train.py as a result of this.
During model validation and when running inference on pictures and videos, this error appears: RuntimeError: unexpected EOF, expected 596029 more bytes. The file might be corrupted
Hello
How are you?
I want to train a model for object detection ONLY, without segmentation.
Could you provide a way to control this in the .yaml file?
Thanks.
model_to_quantize = copy.deepcopy(model)
qconfig_dict = {"": torch.quantization.get_default_qconfig('qnnpack')}
model_to_quantize.eval()
# prepare
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_dict)
# calibrate (not shown)
# quantize
model_quantized = quantize_fx.convert_fx(model_prepared)
When I apply the PyTorch quantization example to your model, I get the following error:
~/Documents/DL_course_project/HybridNets/backbone.py in forward(self, inputs)
100
101 # p1, p2, p3, p4, p5 = self.backbone_net(inputs)
102 --> p2, p3, p4, p5 = self.encoder(inputs)[-4:] # self.backbone_net(inputs)
103
104 features = (p3, p4, p5)
NameError: module is not installed as a submodule
How can I avoid this error?
I know you shared a drive link for the bdd100k drivable area and lane masks for the dataloader, but I want to replicate them to understand what to do for my custom dataset. I looked at the BDD repo and found "to_mask.py", which has some scripts to do it. It passes its own tests, but I cannot get results similar to the ones you shared. Can you please explain how to generate those masks? Thanks in advance.
First of all, thanks for sharing your study. I am trying to reproduce your results by training the network with the strategy you gave. For me to compare how I am doing, could you please share your loss plots for each phase (freeze seg, freeze backbone and det, train end-to-end)?
@datvuthanh Thanks for sharing the code base. I have the following queries:
Please share your thoughts
Thanks in advance
In the traffic light and traffic sign detection issue, you mentioned that we just have to change obj_list in project.yml by adding the classes needed. Does that apply to seg_list as well?
If I want to detect lane color and lane type, can I change seg_list as follows?
seg_list: ['road', 'double white', 'double yellow', 'single white', 'single yellow', 'solid', 'dashed']
Actually, I need classes like double solid yellow, double solid white, single solid yellow, single solid white, single dashed yellow, single dashed white, double dashed yellow and double dashed white, but BDD100K already has labeled double white, double yellow, single white and single yellow classes under Lane Categories, and solid and dashed classes under Lane Styles.
Will the change to seg_list shown above work? If not, how should I do it?
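The eight composite classes listed above are the combinations of a lane category and a lane style. A minimal sketch of building those names is given below; composite_name and the three value lists are hypothetical, and whether the HybridNets dataloader can produce one mask per composite class is not confirmed in this thread.

```python
from itertools import product

# Values taken from the question; the split into count/style/color is an
# illustrative assumption about how the category and style labels combine.
counts = ['single', 'double']
styles = ['solid', 'dashed']
colors = ['white', 'yellow']


def composite_name(category, style):
    """Combine a category such as 'double white' with a style such as 'solid'."""
    count, color = category.split()
    return f'{count} {style} {color}'


composite_classes = [
    f'{count} {style} {color}'
    for count, style, color in product(counts, styles, colors)
]
print(composite_classes)
print(composite_name('double white', 'solid'))
```

Each labeled lane would then be assigned to exactly one composite class, instead of being listed once under its category and once under its style.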
If I have two classes, background and lane, should my segmentation mask label contain 0 and 1, or 0 and 255?
Back when we were toying with mosaic, we removed the segmentation head completely from the model and dataloader. Now that we try to add mosaic augmentation officially, we have to make a decision of not using it for segmentation training.
hybridnets/dataset.py
if self.use_mosaic:
    # honestly, mosaic is not for road and lane segmentation anyway
    # you cant expect road and lane to be split up in 4 separate corners in an image, do you?
    # only use mosaic with freeze_seg :)
    img, labels, seg_label, lane_label, (h0, w0), (h, w), path = self.load_mosaic(idx)
Only images and object annotations are mosaicked, while segmentation annotations are kept intact. This produces an incorrect segmentation loss, but we assumed that didn't matter because we froze the segmentation head anyway, thinking that requires_grad=False makes the segmentation head disappear from the backprop graph. But that is wrong: the backbone is still affected by the segmentation loss.
Check this colab for interactive stuff.
So we've been planning to simply set the losses to 0 when you freeze a head (--freeze_det or --freeze_seg), like this:
cls_loss, reg_loss, seg_loss, regression, classification, anchors, segmentation = model(imgs, annot, seg_annot, obj_list=params.obj_list)
cls_loss = cls_loss.mean() if not opt.freeze_det else 0
reg_loss = reg_loss.mean() if not opt.freeze_det else 0
seg_loss = seg_loss.mean() if not opt.freeze_seg else 0
Is this approach too naive? Are there any recommendations regarding this matter? Or should we also apply mosaic to the segmentation labels?
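The effect described above can be reproduced with a scalar toy model, sketched here in plain Python with a finite-difference gradient. It is not the HybridNets code: every weight, input and target below is a made-up number, and "frozen" simply means the head weight is never updated.

```python
def losses(w_backbone, w_seg, w_det, x, y_seg, y_det):
    """Toy network: one shared backbone weight feeding two one-weight heads."""
    feature = w_backbone * x
    seg_loss = (w_seg * feature - y_seg) ** 2
    det_loss = (w_det * feature - y_det) ** 2
    return seg_loss, det_loss


def backbone_grad(total_loss_fn, w_backbone, eps=1e-6):
    """Central finite-difference gradient w.r.t. the backbone weight."""
    upper = total_loss_fn(w_backbone + eps)
    lower = total_loss_fn(w_backbone - eps)
    return (upper - lower) / (2 * eps)


# Hypothetical values: detection is already perfect, segmentation labels are wrong.
w_seg, w_det = 0.5, 2.0          # w_seg is "frozen": it is never updated
x, y_seg, y_det = 1.0, 3.0, 2.0
w_backbone = 1.0


def det_only(w):
    return losses(w, w_seg, w_det, x, y_seg, y_det)[1]


def det_plus_seg(w):
    return sum(losses(w, w_seg, w_det, x, y_seg, y_det))


grad_det_only = backbone_grad(det_only, w_backbone)
grad_with_seg = backbone_grad(det_plus_seg, w_backbone)
print(grad_det_only, grad_with_seg)
```

With the segmentation term left in the total loss, the backbone receives a non-zero gradient even though the segmentation head itself never changes; dropping that term, as in the seg_loss line above, removes it.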
What if I only need to do lane line detection?
Can I achieve faster inference?
Hi,
I want to train on multiple GPUs with train_DDP.py,
but I do not know which parameter determines the number of GPUs.
Looking forward to your reply! Thank you!
Hello,
I tried to follow your suggestion to train the model. Accordingly, I first froze the segmentation head and trained for some epochs.
python train.py -p bdd100k -c 3 -n 4 -b 8 --freeze_seg True --lr 1e-5 --optim adamw --num_epochs 75 --val_interval 1 --log_path D:\HybridNets\rgb-clean --saved_path D:\HybridNets\rgb-clean --save_interval 500 --verbose True --num_gpus 1 --plots True
After that, I froze the backbone and detection head.
python train.py -p bdd100k -c 3 -n 4 -b 8 --freeze_backbone True --freeze_det True --lr 1e-5 --optim adamw --num_epochs 12 --val_interval 1 --log_path D:\HybridNets\rgb_clean --saved_path D:\HybridNets\rgb_clean --save_interval 500 --verbose True --num_gpus 1 --plots True -w D:\HybridNets\rgb-clean\bdd100k\hybridnets-d3_74_129225_best.pth
But I am getting the error
Can you please suggest how to solve this issue?
Thank you in advance
For example, left line and right line, not pixels.
Hi.
According to the README.txt, I prepared the bdd100k dataset, but BddDataset failed to load it. I think something is wrong with my dataset preparation process. In fact, I have no idea where to put the colormaps, masks, polygons and rles folders for drivable area and lane, or where to put the JSON file of detection labels.
Please give the detailed folder structure.
I just want to segment one class, so seg_list only contains 'road'.
When I run train.py, I get this error
in loss.py line 538,
in soft_tversky_score assert output.size() == target.size()
AssertionError
When I debugged the code, I found output.size() = torch.Size([2, 1, 245760]) and target.size() = torch.Size([2, 1, 491520]).
How can I fix that?
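The two sizes in the assertion differ by exactly a factor of two, which the arithmetic below makes explicit. The 384 x 640 resolution is an assumption inferred from 245760 = 384 * 640; the interpretation that the target still carries two channels (for example background plus road) while the prediction has one is a guess, not a confirmed diagnosis.

```python
# Assumed input resolution; chosen because 384 * 640 matches the reported size.
height, width = 384, 640
pixels_per_channel = height * width

output_numel = 245760   # last dimension of output.size() in the report
target_numel = 491520   # last dimension of target.size() in the report

output_channels = output_numel // pixels_per_channel
target_channels = target_numel // pixels_per_channel

print(pixels_per_channel, output_channels, target_channels)
```

If that reading is right, the fix would be on the label side (producing a single-channel target) or on the model side (predicting two channels), depending on how the loss is meant to treat background.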
Hello. First of all, thank you for this work.
I noticed a mistake in the code concerning inf_time and fps.
So I think you may have calculated the inference time incorrectly in the article. Your article shows that YOLOP has an inference time of 52 ms per frame (batch size 1), which would mean about 20 fps (although YOLOP's article reports 41 fps).
In hybridnets_test.py, I tried to measure HybridNets' inference time and got 0.06 s (model(x) only), which means about 17-20 fps on a Tesla V100, whereas YOLOP takes 0.021 s (model(x) only), which means about 48 fps on a Tesla V100.
Sadly, it may not be faster than YOLOP and may not reach real-time speed.
@datvuthanh @xoiga123
Hi. I am receiving this error when I pass a .jpg input image of size (2160,4096,3) for testing. Can you please help me resolve this issue? Thank you!
Command I ran - python hybridnets_test.py -w weights/hybridnets.pth --source demo/image --output demo_result --imshow False --imwrite True
Traceback (most recent call last):
File "hybridnets_test.py", line 121, in <module>
features, regression, classification, anchors, seg = model(x)
File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "folder/HybridNets/backbone.py", line 104, in forward
features = self.bifpn(features)
File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "folder/HybridNets/hybridnets/model.py", line 179, in forward
outs = self._forward_fast_attention(inputs)
File "folder/HybridNets/hybridnets/model.py", line 211, in _forward_fast_attention
p5_up = self.conv5_up(self.swish(weight[0] * p5_in + weight[1] * self.p5_upsample(p6_up)))
RuntimeError: The size of tensor a (11) must match the size of tensor b (12) at non-singleton dimension 2
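A size mismatch of 11 versus 12 is what a feature pyramid produces when a level has an odd size: the next level is its ceiling-halved size, and upsampling that by 2 overshoots. The sketch below assumes five levels (P3 to P7) starting at stride 8, each halved with ceiling rounding, and a maximum stride of 128; these are assumptions about the architecture, and the input height of 352 is a hypothetical value picked only because it reproduces the reported numbers.

```python
import math


def pyramid_sizes(size, first_stride=8, levels=5):
    """Sizes of P3..P7 along one axis, assuming ceil-halving between levels."""
    sizes = [math.ceil(size / first_stride)]
    for _ in range(levels - 1):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def first_mismatch(size):
    """Walk top-down (coarse to fine) and report the first failed 2x upsample.

    Returns (level, fine_size, upsampled_coarse_size) or None if all levels match.
    """
    sizes = pyramid_sizes(size)
    for offset in range(len(sizes) - 2, -1, -1):
        fine, coarse = sizes[offset], sizes[offset + 1]
        if coarse * 2 != fine:
            return offset + 3, fine, coarse * 2
    return None


def round_up(size, multiple=128):
    """Smallest multiple of `multiple` that is >= size."""
    return math.ceil(size / multiple) * multiple


print(pyramid_sizes(352), first_mismatch(352))
print(pyramid_sizes(round_up(352)), first_mismatch(round_up(352)))
```

Under these assumptions, resizing or padding the network input so that both height and width are multiples of 128 keeps every level even and avoids the mismatch.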
When I run on CPU, this error appears. Can you help me fix it?
channels_last:
While PyTorch operators expect all tensors to be in Channels First (NCHW) dimension format, PyTorch operators support 3 output memory formats.
Contiguous: Tensor memory is in the same order as the tensor’s dimensions.
ChannelsLast: Irrespective of the dimension order, the 2d (image) tensor is laid out as an HWC or NHWC (N: batch, H: height, W: width, C: channels) tensor in memory. The dimensions could be permuted in any order.
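The difference between the two layouts can be made concrete by computing element strides for a tensor whose logical shape stays (N, C, H, W). This is a plain-Python sketch of the stride arithmetic, not a call into PyTorch; the function names are hypothetical and the 1 x 3 x 384 x 640 shape is only an example.

```python
def contiguous_strides(n, c, h, w):
    """Element strides for an NCHW-shaped tensor stored in NCHW order."""
    return (c * h * w, h * w, w, 1)


def channels_last_strides(n, c, h, w):
    """Element strides for the same NCHW-shaped tensor stored in NHWC order."""
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Memory offset (in elements) of a logical (n, c, h, w) index."""
    return sum(i * s for i, s in zip(index, strides))


shape = (1, 3, 384, 640)
nchw = contiguous_strides(*shape)
nhwc = channels_last_strides(*shape)

# Moving one step along the channel axis versus one step along the width axis.
print(nchw, offset((0, 1, 0, 0), nchw), offset((0, 0, 0, 1), nchw))
print(nhwc, offset((0, 1, 0, 0), nhwc), offset((0, 0, 0, 1), nhwc))
```

In the channels-last layout the channel values of one pixel sit next to each other in memory, while in the contiguous layout neighbouring pixels of one channel do.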
amp:
extra:
prefetch:
Going to train from scratch to see what's good, with a working log this time.
UPDATE 12/07/2022: Seems like the bottleneck is in dataloading, which takes an unholy amount of time even though I cached everything in RAM. Currently profiling CPU & GPU and trying out this dataloader which allegedly actually does prefetch.
UPDATE: It all makes sense now. PyTorch's DataLoader can only prefetch batches in the currently running epoch. For the next epoch, there is apparently no prefetch whatsoever.
Hello, when I pass an image of size 1920*1080 for testing, I get the following error.
Can you please help me resolve this issue? Thank you!
"IndexError: boolean index did not match indexed array along dimension 0; dimension is 1080 but corresponding boolean dimension is 720"
#15 #38
We were using multi-label dataloader, loss and metrics for a multi-class problem. Basically they work fine and the results are correct (maybe focal loss segment is a little bit off, who knows, will check further) but to someone reading the code, the semantic meaning is wrong.
TODO: Generalize to multi-class as default, with a switch to multi-label.
TODO in another issue: Multi-label for object detection.
Hello. First of all, great work.
While running hybridnets_test_videos.py, I found an issue with the FPS calculation.
In the hybridnets_test_videos.py script, the FPS is calculated as:
(t2-t1)/frame_count
But this gives the inference time per frame, not the FPS. Most probably it should be:
1/((t2-t1)/frame_count)
That is, we need to take the reciprocal.
Please let me know if any updates happen on this front.
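The two quantities can be written side by side to make the difference explicit. This is a minimal sketch that reuses the names t1, t2 and frame_count from the expression quoted above; the function names and the example timings are hypothetical.

```python
def seconds_per_frame(t1, t2, frame_count):
    """Average processing time per frame, in seconds."""
    return (t2 - t1) / frame_count


def frames_per_second(t1, t2, frame_count):
    """Average throughput; the reciprocal of seconds_per_frame."""
    return frame_count / (t2 - t1)


# Hypothetical timings: 100 frames processed in 6 seconds.
t1, t2, frame_count = 0.0, 6.0, 100
spf = seconds_per_frame(t1, t2, frame_count)
fps = frames_per_second(t1, t2, frame_count)
print(spf, fps)
```

Writing the throughput as frame_count / (t2 - t1) is equivalent to 1/((t2-t1)/frame_count) and avoids the nested division.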
Have the authors of this paper trained on COCO? If so, are there any pre-trained weights available?
If not, any suggestions on which parameters need to be tuned to train on the COCO dataset?