sabadijou / clld_official

Contrastive Learning for Lane Detection via cross-similarity

Home Page: https://www.mdu.se/en/malardalen-university/centre-for-industrial-digitalisation/projects/autodeep-research-project

Language: Python (100.00%)
Topics: contrastive-learning, instance-segmentation, lane-detection, python, pytorch, self-supervised-learning, parallel-programming

Contrastive Learning for Lane Detection via Cross-Similarity

Overview of CLLD

Contrastive Learning for Lane Detection via Cross-Similarity (CLLD) is a self-supervised learning method that makes lane detection models more resilient to real-world conditions that reduce lane visibility. CLLD is a multitask contrastive learning approach that trains lane detection models to detect lane markings even in low-visibility situations by combining local feature contrastive learning (CL) with our proposed cross-similarity operation. Some key details:

  • CLLD employs similarity learning to improve the performance of deep neural networks in lane detection, particularly in challenging scenarios.
  • The approach aims to enrich the prior knowledge that lane detection networks start from, in the form of pretrained backbone weights.
  • Our experiments used ImageNet as the pretraining dataset. To evaluate the impact of our approach on model performance, we used established lane detection models: RESA, CLRNet, and UNet.
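
To make the local feature contrastive learning idea more concrete, here is a minimal, generic sketch. It is not the CLLD implementation and does not reproduce the cross-similarity operation. It only computes pairwise cosine similarities between local feature vectors of two augmented views, plus an InfoNCE-style loss for one local feature. The function names, the toy features, and the temperature value are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(feats_a, feats_b):
    """Pairwise cosine similarity between local features of two views."""
    return [[cosine(u, v) for v in feats_b] for u in feats_a]


def info_nce(sim_row, positive_index, temperature=0.2):
    """InfoNCE-style loss for one local feature.

    sim_row holds its similarities to every local feature of the other view;
    positive_index marks the matching location. The temperature is an
    assumed value, not taken from the repository.
    """
    logits = [s / temperature for s in sim_row]
    peak = max(logits)  # subtract the max for numerical stability
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[positive_index]


# Toy local features (one vector per spatial location) from two views.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.2, 0.8]]

sim = similarity_matrix(view_a, view_b)
losses = [info_nce(row, positive_index=i) for i, row in enumerate(sim)]
print(losses)
```

The loss is small when a local feature is most similar to its matching location in the other view and grows when a non-matching location scores higher.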

CLLD architecture

Get started

  1. Clone the repository

    git clone https://github.com/sabadijou/clld_official.git
    

    We refer to this directory as $RESA_ROOT.

  2. Create an environment and activate it (we used conda, but it is optional)

    conda create -n clld python=3.9 -y
    conda activate clld

  3. Install dependencies

    # Install PyTorch first; the cudatoolkit version should match the CUDA version on your system (you can also install pytorch and torchvision with pip)
    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
      
    # Install kornia and einops
    pip install kornia
    pip install einops
    
    # Install other dependencies
    pip install -r requirements.txt
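
After installation, a quick check along the following lines can confirm that the packages named above are importable. This is a minimal sketch using only the Python standard library. The helper name `missing_packages` is hypothetical and not part of the repository, and packages listed only in requirements.txt are not covered.

```python
import importlib.util

# Packages named in the installation steps above.
REQUIRED = ("torch", "torchvision", "kornia", "einops")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch  # safe to import now that it was found

        print("All packages found. CUDA available:", torch.cuda.is_available())
```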

How to Run CLLD

We conducted pretraining using the training data from ImageNet. However, you are free to utilize other datasets and configurations as needed. The configuration file for our approach can be found in the configs folder.

Once the dataset and new configurations are in place, you can execute the approach using the following command:

python main.py --dataset_path /Imagenet/train --encoder resnet50 --alpha 1 --batch_size 1024 --world_size 1 --gpus_id 0 1 

The following is a quick guide on arguments:

  • dataset_path: Path to training data directory
  • encoder: Encoder to train. Supported values: resnet18, resnet34, resnet50, resnet101, resnet152, resnext50_32x4d, resnext101_32x8d, wide_resnet50_2, wide_resnet101_2.
  • alpha: Cross-similarity window size.
  • batch_size: Batch size. Choose a value that suits your GPU infrastructure.
  • world_size: Total number of GPUs used for training. For example, if you train on a single machine with 4 GPUs, the world size is 4. If you have 2 machines, each with 4 GPUs, and you use all of them, the world size is 8.
  • gpus_id: IDs of all GPUs to use for training.
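
The arguments above could be declared with `argparse` roughly as follows. This is a sketch for illustration, not the parser in the repository's main.py. The defaults, the integer type for `alpha`, and the helper `total_world_size` are assumptions.

```python
import argparse

# Encoder names listed in the argument guide above.
ENCODERS = [
    "resnet18", "resnet34", "resnet50", "resnet101", "resnet152",
    "resnext50_32x4d", "resnext101_32x8d", "wide_resnet50_2", "wide_resnet101_2",
]


def build_parser():
    """Build a parser mirroring the documented arguments (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="CLLD pretraining (illustrative)")
    parser.add_argument("--dataset_path", type=str, help="path to training data directory")
    parser.add_argument("--encoder", choices=ENCODERS, default="resnet50")
    parser.add_argument("--alpha", type=int, default=1, help="cross-similarity window size")
    parser.add_argument("--batch_size", type=int, default=1024)
    parser.add_argument("--world_size", type=int, default=1)
    parser.add_argument("--gpus_id", type=int, nargs="+", default=[0], help="GPU IDs to use")
    return parser


def total_world_size(num_machines, gpus_per_machine):
    """World size as described above: total GPUs across all machines."""
    return num_machines * gpus_per_machine


if __name__ == "__main__":
    args = build_parser().parse_args(
        "--dataset_path /Imagenet/train --encoder resnet50 --alpha 1 "
        "--batch_size 1024 --world_size 1 --gpus_id 0 1".split()
    )
    print(args)
    print(total_world_size(num_machines=2, gpus_per_machine=4))
```

Using `nargs="+"` lets `--gpus_id` accept several space-separated IDs, and `choices` rejects encoder names that are not in the supported list.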

How to publish weights

Upon completing the training phase, you can execute the command below to prepare the trained weights for use as prior knowledge in the backbone of a lane detection model.

python main.py --checkpoint path/to/checkpoint --encoder resnet50 
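
As a general illustration of what preparing pretrained weights for a backbone often involves, the sketch below keeps only the encoder entries of a checkpoint dictionary and strips their prefix. It does not describe what main.py does internally. The `encoder.` prefix, the key names, and the helper name are hypothetical.

```python
def extract_encoder_weights(state_dict, prefix="encoder."):
    """Keep only entries under `prefix` and strip the prefix from their names.

    The prefix is a hypothetical naming convention, not taken from the repository.
    """
    return {
        name[len(prefix):]: value
        for name, value in state_dict.items()
        if name.startswith(prefix)
    }


# Toy stand-in for a pretraining checkpoint; real values would be tensors.
checkpoint = {
    "encoder.conv1.weight": "w0",
    "encoder.layer1.0.conv1.weight": "w1",
    "projector.fc.weight": "w2",  # hypothetical pretraining-only head
}

backbone_weights = extract_encoder_weights(checkpoint)
print(sorted(backbone_weights))
```

With PyTorch, such a dictionary would typically be read with `torch.load(path, map_location="cpu")` and passed to the backbone's `load_state_dict(..., strict=False)`. The actual key layout of CLLD checkpoints should be checked in the repository.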

Our experiments

We chose to evaluate CLLD with UNet because it is a common encoder-decoder architecture used in various methods that treat lane detection as a segmentation problem. In addition, we tested our method with RESA, which is currently the state-of-the-art semantic segmentation lane detection method that is not based on the UNet architecture. This independent validation is necessary to ensure the accuracy of our results. Lastly, we evaluated CLLD with CLRNet, a leading anchor-based lane detection method.

Visualized results

Performance of UNet on CULane and TuSimple with different contrastive learning methods.

| Method | # Epochs | Precision (CULane) | Recall (CULane) | F1-measure (CULane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 73.68 | 67.15 | 70.27 | 95.92 |
| VICRegL | 300 | 67.75 | 63.43 | 65.54 | 93.58 |
| DenseCL | 200 | 63.8 | 58.4 | 60.98 | 96.13 |
| MoCo-V2 | 200 | 63.08 | 57.74 | 60.29 | 96.04 |
| CLLD (α=1) | 100 | 71.98 | 69.2 | 70.56 | 95.9 |
| CLLD (α=2) | 100 | 70.69 | 69.36 | 70.02 | 95.98 |
| CLLD (α=3) | 100 | 71.31 | 69.59 | 70.43 | 96.17 |

Performance of RESA on CULane and TuSimple with different contrastive learning methods.

| Method | # Epochs | Precision (CULane) | Recall (CULane) | F1-measure (CULane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 77.41 | 73.69 | 75.51 | 96.6 |
| VICRegL | 300 | 76.27 | 69.58 | 72.77 | 96.18 |
| DenseCL | 200 | 77.67 | 73.51 | 75.53 | 96.28 |
| MoCo-V2 | 200 | 78.12 | 73.36 | 75.66 | 96.56 |
| CLLD (α=1) | 100 | 79.01 | 72.99 | 75.88 | 96.74 |
| CLLD (α=2) | 100 | 78 | 73.45 | 75.66 | 96.78 |
| CLLD (α=3) | 100 | 78.34 | 74.29 | 76.26 | 96.81 |

Performance of CLRNet on CULane and TuSimple with different contrastive learning methods.

| Method | # Epochs | Precision (CULane) | Recall (CULane) | F1-measure (CULane) | Accuracy (TuSimple) |
|---|---|---|---|---|---|
| PixPro | 100 | 89.19 | 70.39 | 78.67 | 93.88 |
| VICRegL | 300 | 87.72 | 71.15 | 78.72 | 89.01 |
| DenseCL | 200 | 88.07 | 69.67 | 77.8 | 85.15 |
| MoCo-V2 | 200 | 88.91 | 71.02 | 78.96 | 93.87 |
| CLLD (α=1) | 100 | 88.72 | 71.33 | 79.09 | 90.68 |
| CLLD (α=2) | 100 | 87.95 | 71.44 | 78.84 | 93.48 |
| CLLD (α=3) | 100 | 88.59 | 71.73 | 79.27 | 94.25 |
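
The F1-measure columns can be sanity-checked against the precision and recall columns, assuming the standard definition of F1 as the harmonic mean of the two. The sketch below recomputes it for three CLLD rows copied from the tables. The function name and the choice of rows are illustrative, and small deviations are expected because the tabulated precision and recall are already rounded.

```python
def f1_measure(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the tables above.
ROWS = {
    "UNet / CLLD (α=1)": (71.98, 69.2, 70.56),
    "RESA / CLLD (α=3)": (78.34, 74.29, 76.26),
    "CLRNet / CLLD (α=3)": (88.59, 71.73, 79.27),
}

for name, (precision, recall, reported) in ROWS.items():
    recomputed = f1_measure(precision, recall)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported}")
```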
