Official repo of paper RNGDet++: Road Network Graph Detection by Transformer with Instance Segmentation and Multi-scale Features Enhancement

License: GNU General Public License v3.0

Python 94.87% Shell 1.07% Go 4.05%

rngdetplusplus's Introduction

RNGDet++

This is the official repository of the paper RNGDet++: Road Network Graph Detection by Transformer with Instance Segmentation and Multi-scale Features Enhancement by Zhenhua Xu, Yuxuan Liu, Yuxiang Sun, Ming Liu, and Lujia Wang.

Supplementary materials

For the demo video and supplementary document, please visit our project page.

Update

Mar/1/2023: Paper accepted by RA-L.

Dec/23/2022: Add SpaceNet dataset

Nov/28/2022: Release the initial version of the training code

Oct/23/2022: Upload the Sat2Graph City-Scale dataset to Google Drive, since the raw data link provided by Sat2Graph is no longer valid.

Sep/21/2022: Release the inference code

Platform info

Hardware:

GPU: 4 RTX3090
CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
RAM: 256G
SSD: 4T

Software:

Ubuntu 20.04.3 LTS
CUDA 11.1
Docker 20.10.7
Nvidia-driver 495.29.05

Docker

This repo runs inside a Docker container. All experiments except evaluation are conducted within Docker containers, so make sure Docker is installed. For more information, please refer to install Docker and Docker beginner tutorial.

Docker image

cd docker
./build_image.bash

Docker container

In ./build_continer.bash, set RNGDet_dir to the directory of this repo.

./build_continer_cityscale.bash # to try city scale dataset released by Sat2Graph
./build_continer_spacenet.bash # to try SpaceNet dataset

Note: We keep the original code for the city scale dataset. For the newly added SpaceNet dataset, we modified the processing scripts to better fit RNGDet++ to it, since SpaceNet has smaller images covering smaller regions, which gives it quite different characteristics from the city scale dataset.

Data preparation and pretrained checkpoints

Run the following commands to prepare the dataset and the pretrained checkpoints of RNGDet and RNGDet++.

cd prepare_dataset
./preprocessing.bash

Update Oct/23/2022

The raw data download link provided by MIT is no longer valid. We have uploaded the data to Google Drive.

Update Dec/24/2022

The script that downloads the data from Google Drive is blocked. Please download the data manually and put it into prepare_dataset. The Google Drive link can be found in a comment in ./prepare_dataset/preprocessing.bash

Sampling

Before training, run the sampler to generate training samples:

./bash/run_sampler.sh

Parameters:

edge_move_ahead_length (int, default=30): Maximum distance (in pixels) moved ahead in each step.
noise (int, default=8): Maximum random noise added during the sampling process (uniformly distributed).
max_num_frame (int, default=10000): Maximum number of samples generated for each large aerial image.
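
As a rough illustration of how edge_move_ahead_length and noise might interact, the sketch below advances an agent along a straight ground-truth edge by at most a fixed number of pixels and then perturbs the position with uniform noise. The function name sample_next_position and the exact place where the noise is applied are assumptions made for illustration; they are not taken from the sampler in this repo.

```python
import math
import random


def sample_next_position(current, target, edge_move_ahead_length=30, noise=8, rng=None):
    """Move from `current` toward `target` by at most `edge_move_ahead_length`
    pixels, then add uniform integer noise in [-noise, noise] to each axis.

    Hypothetical helper: the real sampler may apply noise differently.
    """
    rng = rng or random.Random()
    dx, dy = target[0] - current[0], target[1] - current[1]
    dist = math.hypot(dx, dy)
    # Clamp the step so the agent never overshoots the target vertex.
    step = min(dist, edge_move_ahead_length)
    if dist > 0:
        x = current[0] + dx / dist * step
        y = current[1] + dy / dist * step
    else:
        x, y = current
    # Uniform perturbation on each axis (assumed noise model).
    x += rng.randint(-noise, noise)
    y += rng.randint(-noise, noise)
    return int(round(x)), int(round(y))
```

With noise=0 the function reduces to a plain clamped step, which makes it easy to check the step length in isolation before adding perturbation.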

Train

To train RNGDet, run

./bash/run_train_RNGDet.sh

To train RNGDet++, run

./bash/run_train_RNGDet++.sh

Note: Due to the randomness in both sampling and training, the final performance of the models might differ slightly from the numbers reported in the paper. Please open an issue if you cannot reproduce the results.

Inference

To try RNGDet, run

./bash/run_test_RNGDet.sh

To try RNGDet++, run

./bash/run_test_RNGDet++.sh

Parameters:

candidate_filter_threshold (int, default=30): Distance threshold for filtering the initial candidates obtained from segmentation heatmap peaks. A peak that is too close to the road network graph detected so far is filtered out.
logit_threshold (float, 0~1, default=0.75): Threshold for filtering out invalid vertices in the next step.
extract_candidate_threshold (float, 0~1, default=0.7): Threshold for detecting local peaks in the segmentation heatmap to find initial candidates.
alignment_distance (int, default=5): Distance threshold for graph merging and alignment. A predicted vertex that is too close to previously predicted key vertices is merged with them.
instance_seg (bool, default=False): Whether the instance segmentation head is used.
multi_scale (bool, default=False): Whether multi-scale features are used.
process_boundary (bool, default=False): Whether to increase logit_threshold near image boundaries.
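
To make the role of candidate_filter_threshold more concrete, here is a minimal sketch of the filtering idea described above: a heatmap peak is dropped when it lies closer than the threshold to the graph detected so far. The function name filter_candidates is hypothetical, and measuring the distance to graph vertices only (rather than to edges) is a simplifying assumption, not necessarily what the repo does.

```python
import math


def filter_candidates(candidates, graph_vertices, candidate_filter_threshold=30):
    """Return the candidates that are at least `candidate_filter_threshold`
    pixels away from every vertex detected so far.

    Hypothetical helper; distances are measured to vertices only.
    """
    kept = []
    for cx, cy in candidates:
        too_close = any(
            math.hypot(cx - vx, cy - vy) < candidate_filter_threshold
            for vx, vy in graph_vertices
        )
        if not too_close:
            kept.append((cx, cy))
    return kept


# A peak 14 px from an existing vertex is dropped, a distant one is kept.
remaining = filter_candidates([(0, 0), (100, 100)], [(10, 10)])
```

A larger threshold suppresses more duplicate starting points near already traced roads, at the cost of possibly skipping short roads close to the existing graph.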

Note: The inference scripts for RNGDet and RNGDet++ in ./bash contain the parameter settings that achieve the best performance.

Evaluation

Go to {{ DATASET_NAME }}/metrics. For TOPO metrics, run

./topo.bash

For APLS metrics, run

./apls.bash

Remember to set the path to the predicted graphs in the bash scripts.

Note: The evaluation metric scripts cannot be run inside the Docker container. Please run them outside Docker.

Note: Due to the randomness of RNGDet++ and of the evaluation metrics, the actual evaluation results might differ slightly from the numbers reported in the paper.

Contact

For any questions, please open an issue.

Acknowledgement

We thank the following open-source projects:

SAT2GRAPH

DETR

Citation

@article{xu2022rngdet++,
  title={RNGDet++: Road Network Graph Detection by Transformer with Instance Segmentation and Multi-scale Features Enhancement},
  author={Xu, Zhenhua and Liu, Yuxuan and Sun, Yuxiang and Liu, Ming and Wang, Lujia},
  journal={arXiv preprint arXiv:2209.10150},
  year={2022}
}

License

GNU General Public License v3.0

Not allowed for commercial purposes. Only for academic research.

rngdetplusplus's Issues

About training label calculation

In Fig. 7, what is the significance of the intersection mode? Why is the threshold set to 20 instead of 40 in this mode?

As I understand your paper, the starting-point label of a road at an intersection is never the intersection point itself, but a point at distance 20 from the intersection. The penultimate point at the end is also the intersection.

Tensorboard Log

Hello, would it be possible to publish the TensorBoard logs from the training process?

The function of counter_map

output_pred_coords_world = [v for i, v in enumerate(pred_coords_world) if v != self.current_coord and self.counter_map[v[1], v[0]] < 10]

Hello, what is the function of counter_map? Why is the threshold set to 10?

Segmentation operations during training

In agent.py:
for i in range(self.image_size//self.crop_size+1):
    for j in range(self.image_size//self.crop_size+1):
This may slow down testing. My understanding is that setting crop_size to 1024 and producing the output directly would reduce the number of loop iterations. Is that correct?

In addition, in the preprocessing process of the training, the segmentation loss operation is performed on the whole image.
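
For reference, the number of iterations implied by the loop bounds quoted in this issue can be checked with a few lines of Python. This sketch only counts iterations; what the loop body does with out-of-range crops is not visible in the quoted line, so nothing is assumed about it. The helper name count_crops is hypothetical, and image_size 2048 is the value passed via --image_size in the commands quoted on this page.

```python
def count_crops(image_size, crop_size):
    """Number of (i, j) iterations of the nested loop with bounds
    range(image_size // crop_size + 1) on each axis."""
    per_axis = image_size // crop_size + 1
    return per_axis * per_axis


# With the loop bounds as quoted, crop_size = 1024 still yields 3 x 3 iterations.
n_1024 = count_crops(2048, 1024)  # 9
n_2048 = count_crops(2048, 2048)  # 4
```

So, under these bounds, enlarging crop_size reduces the iteration count but does not bring it down to a single pass.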

dataset

Hi, I couldn't find the link to the RNGDet++ dataset on your GitHub page. Could you please provide the Google Drive link for the Sat2Graph City-Scale dataset? I would really appreciate your help!

How long did training take on 4 RTX 4090 GPUs, and how many epochs were needed to reach the current performance?

Thank you for providing the code. I'm relatively new to machine learning, and I'm looking to retrain your model for use in a different domain. I've tried running it on my custom dataset, but the results are quite weak (which was expected, of course). I'd like to know how long it took to train your model on 4 RTX 4090 GPUs, and how many epochs it took to reach the current level of performance.

If there are some turning points ahead, how do you find the turning point?

If there are some turning points ahead, we find the turning point that is closest to v_t as v*_{t+1} (subfigure (b) of Fig. 7).

How did you achieve this? Is the implementation of this step in the published code?

I understand that the path has been simplified in the process of creating the graph.

We look forward to your reply.

Could you please release the code for GT generation during training? Thanks!

train on custom dataset

Hi, Thank you for sharing the code provided. I am looking to train RNGDET++ using a dataset that has a 10cm resolution, yet the model parameters are currently configured for a 1m resolution. Could you kindly advise on how to adjust this parameter or specify which parameter requires modification to accommodate the desired 10cm resolution? Your assistance is greatly appreciated. Thank you

About RNGDet result

Hi, thanks for your wonderful work! I have run RNGDet on the Cityscale testing set, but the results are lower than those reported in the paper. I use a conda environment instead of Docker. Is there anything wrong? I also wonder which version of torch you used; I could not find it in requirement.txt.

Prec.	Rec.	F1	APLS
86.01	68.57	76.13	63.758

About Evaluation Metrics

Hello, in the accuracy evaluation section of the RNGDet paper you write: "For a pixel in B∗, if there exists a pixel in ˆB and the Euclidean distance between them is smaller than δ, then this pixel is treated as correctly retrieved." How did you implement this evaluation? Is it done with a dilation-erosion algorithm? Could you share the relevant code? Thanks.
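
The criterion quoted in this issue can be sketched directly as a brute-force check over pixel coordinates. This is not the authors' evaluation code. It assumes that B∗ is the set of ground-truth road pixels and ˆB the set of predicted ones, so that the quoted rule yields recall; applying the same rule in the other direction to obtain precision is an additional assumption. The function names are hypothetical.

```python
import math


def relaxed_retrieval(source_pixels, reference_pixels, delta):
    """Fraction of `source_pixels` that have at least one pixel of
    `reference_pixels` at Euclidean distance strictly smaller than `delta`."""
    if not source_pixels:
        return 0.0
    hits = 0
    for sx, sy in source_pixels:
        if any(math.hypot(sx - rx, sy - ry) < delta for rx, ry in reference_pixels):
            hits += 1
    return hits / len(source_pixels)


def pixel_precision_recall(pred_pixels, gt_pixels, delta):
    """Relaxed pixel-level precision and recall under tolerance `delta`."""
    # Precision: predicted pixels that lie near some ground-truth pixel (assumed symmetric rule).
    precision = relaxed_retrieval(pred_pixels, gt_pixels, delta)
    # Recall: ground-truth pixels that lie near some predicted pixel (the quoted rule).
    recall = relaxed_retrieval(gt_pixels, pred_pixels, delta)
    return precision, recall


precision, recall = pixel_precision_recall([(1, 0)], [(0, 0), (10, 0)], delta=2)
```

The brute-force loop costs one distance computation per pixel pair, so it is only practical for small examples. On full-size masks, dilating one mask by δ or using a distance transform would be the more usual way to apply the same tolerance.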

Why was the RoadTracer dataset not used for RNGDet++, as it was in the RNGDet experiments?

Training Label Calculation

Hi,
Thanks so much for your work! I have a few questions about the training code, since I am trying to re-implement it.

(ROI, H, S∗, I∗, V∗, P∗) would be a single training sample. I could generate the first four using the existing agent code for cropping and label calculation; however, I am having issues with the last two.
V∗ stands for the valid ground-truth coordinates in the next time step. For this you propose graph simplification (removing degree-2 vertices) and then moving the agent from t to t1, but how is the angle calculated in this case? Is it that we are at vertex v, go over its edges, and, depending on whether it is an intersection or a segment, add a point along the direction of each edge?
P∗ stands for the ground-truth probability of the vertices. Since in the data generation phase we only move along the ground-truth network, shouldn't it always be 1?
Also, regarding the loss function for matching the vertices: is it computed with M=10 vertices, or only after filtering out these vertices?

In general if you could describe the generation of the above 2 labels it would be of great help to me, thanks!

The dataset link can't be opened

Hi, the dataset link can't be opened. Could you share another link (such as Google Drive)?

How to do batch training?

I have reproduced the training code, but it only supports a single 128 x 128 input, with a batch size of 1. I do not know how to batch the samples, since the number of matches differs from sample to sample.
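
One possible way to batch samples whose number of target vertices differs is to pad every sample to a fixed number of slots and keep a validity mask alongside, so that the padded slots can be ignored when the matching loss is computed. The sketch below shows only the padding step in plain Python. The name pad_targets, the padding value, and the default of 10 slots (mirroring the num_queries default mentioned in the issues on this page) are assumptions for illustration; this is not how the repo is known to handle batching.

```python
def pad_targets(batch_targets, num_queries=10, pad_value=(0.0, 0.0)):
    """Pad each sample's list of (x, y) targets to `num_queries` entries.

    Returns (coords, mask): mask[b][k] is 1 for a real target and 0 for padding.
    Raises ValueError if a sample has more targets than available slots.
    """
    coords, mask = [], []
    for targets in batch_targets:
        if len(targets) > num_queries:
            raise ValueError(
                f"sample has {len(targets)} targets but only {num_queries} slots"
            )
        n_pad = num_queries - len(targets)
        coords.append(list(targets) + [pad_value] * n_pad)
        mask.append([1] * len(targets) + [0] * n_pad)
    return coords, mask


coords, mask = pad_targets([[(1.0, 2.0)], []], num_queries=3)
```

The resulting lists all have the same length, so they can be stacked into tensors of shape (batch, num_queries, 2) and (batch, num_queries), with the mask used to exclude padded entries from the loss.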

Some issues related to history_map

With the dataset creation code, the generated data, whether masks or points, is more than one pixel wide.
However, history_map is only one pixel wide. Is this normal?

How does a road without an intersection generate a label?

In create_label.py:
if (len(src.neighbor_edges)<=1 and len(dst.neighbor_edges)<=1)
This code filters out roads without intersection points (roads that have only a start point and an end point).

However, such roads occur frequently in datasets that are divided into smaller tiles.

Also, when do you plan to open-source the training code?

We look forward to your reply. Thank you.

CNN backbones

In backbone.py:
def __init__(self, backbone: nn.Module, train_backbone: bool, num_channels: int, return_interm_layers: bool, history: bool):

I found that the history parameter is not actually used.
Focal loss is also not used in the actual training code.

Inference problem on the new trained weights

First, thanks for releasing your nice code. I wanted to fine-tune your model on my own dataset, but after finishing the fine-tuning process I cannot use the trained weights for testing and inference. In addition, I could not retrain on your dataset and use the resulting weights. Even though training finished successfully, I received the following error at inference time and could not run main_test.py.
ERROR:

For reference, these are the arguments I used.
For training, I ran:

CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 main_train.py --savedir RNGDet --dataroot ./dataset/ --batch_size 8 --ROI_SIZE 128 --nepochs 50 --backbone resnet101 --eos_coef 0.2 --lr 9e-5 --lr_backbone 9e-5 --weight_decay 1e-5 --noise 8 --image_size 2048\
  --candidate_filter_threshold 30 --logit_threshold 0.75 --extract_candidate_threshold 0.55 --alignment_distance 5 --instance_seg --multi_scale --multi_GPU --frozen_weights pretrain_cityscale/RNGDet_multi_ins/RNGDetPP_best.pt

For testing, I ran:

CUDA_VISIBLE_DEVICES=0,1 python main_test.py --savedir RNGDet --image_size 2048\
 --dataroot ./dataset/ --ROI_SIZE 128 --backbone resnet101 --checkpoint_dir RNGDetNet_0.pt\
 --candidate_filter_threshold 30 --logit_threshold 0.7 --extract_candidate_threshold 0.7 --alignment_distance 5 --batch_size 8 \
 --instance_seg --multi_scale --process_boundary

I am kindly asking for your help to solve this problem.

Intersection Detection

Hello, thanks for releasing the code. I trained the RNGDet++ model using data gathered from various European countries, and while the pixel accuracy was similar to the results reported in the paper, the intersection accuracy was significantly lower. I also tried fine-tuning the model on my small lane detection dataset, and the model couldn't detect almost any intersections. However, if I fine-tuned the backbone and the FPN head only with intersection data, the model could detect some intersections correctly. A possible cause for this might be that my data has higher pixel resolution, thus there are fewer intersections on each ROI sample, and the intersection loss is much smaller than the segment loss. I have thought about increasing the positive weight for the intersection loss, adding a third backbone for solely intersection detection and increasing the crop size of ROI. Would these ideas be worth trying? Do you have any other suggestions for improving the intersection detection accuracy?

Also, why are intersections detected at the edges of the satellite image, even if no actual intersections exist? Currently, these detections are filtered out in the code, which means that roads with no intersections on the satellite image won't be detected. These detections at the edges didn't appear when I fine-tuned the backbone separately with only intersection data.

Thank you!

L1 Loss

Do the coordinates need to be normalized to 0-1 for the L1 loss regression?

A problem with rtree.go

When I run ./apls.bash, I get the following errors:
metrics/apls/main.go:106:50: cannot use rtreego.Point{...}.ToRect(tol) (type rtreego.Rect) as type *rtreego.Rect in return argument
metrics/apls/main.go:360:13: cannot use &gNode (type *gpsnode) as type rtreego.Spatial in argument to rt.Insert:

How does it perform at regular grid-shaped junctions?

How well does the model perform at regular grid-shaped (井-shaped) junctions?

aux_loss

I found that many of the parameters are not used, such as aux_loss. The paper also mentions that the auxiliary layers are not used. Have you run any experiments with aux_loss enabled?

Could you release the codes for training RNGDetPlusPlus

I read your paper about RNGDet and found it interesting and innovative. I'm also working on linear object detection and want to compare the performance of your model with mine, so it would be great if you could release the code for the training process. Thanks a lot!

About the preparation of the training dataset.

Thank you very much for sharing the source code. It is great work. I currently want to train the RNGDet++ model on my own dataset, but my ground-truth data is a binary road map in segmentation format, and I would like to know how to convert it to graph format. Could you please share the relevant data processing code? It would enable me to create the sample points and the refined ground-truth graphs (_refine_gt_graph.p) from a binary image. Thanks.

The number of "unexplored_edges" is larger than "num_queries"

Hi @TonyXuQAQ, I had a problem preparing my dataset before training the RNGDet++ model. Some vertices in my JSON file have more unexplored edges than num_queries (default = 10, say), which raises a ValueError during training. How did you build the dataset before running the create_label.py script?