javiermcebrian / glcapsnet

Global-Local Capsule Network (GLCapsNet) is a capsule-based architecture able to provide context-based eye fixation prediction for several autonomous driving scenarios, while offering interpretability both globally and locally.

License: Apache License 2.0

GLCapsNet

Code for the paper Interpretable Global-Local Dynamics for the Prediction of Eye Fixations in Autonomous Driving Scenarios, publicly available in IEEE Access. Supplementary material (videos and images) is provided alongside the paper on the IEEE Access site.

Global-Local Capsule Network (GLCapsNet) block diagram. It predicts eye fixations based on several contextual conditions of the scene, which are represented as combinations of several spatio-temporal features (RGB, Optical Flow and Semantic Segmentation). Its hierarchical multi-task approach routes Feature Capsules to Condition Capsules both globally and locally, which allows for the interpretation of visual attention in autonomous driving scenarios.

Docker environment

How to use it?

  • Install nvidia-docker
  • Configure environment-manager.sh:
    • image_name: the name of the Docker image
    • data_folder: the path to the storage (mounted as volume)
    • src_folder: the path to the local copy of this source code (mounted as volume)
  • Run environment-manager.sh:
    • service: one of the service names defined in docker-config.json, each of which specifies the path to the child Dockerfile and the tag of the CUDA base image to use.
    • action: what to do with the environment
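
As an illustration of how a service entry could map to an image build, the sketch below parses a configuration shaped like the description above. The JSON layout, the keys dockerfile and cuda_tag, the service name example_service, the build argument CUDA_TAG, and the helper build_command are all assumptions made for this example; the actual docker-config.json and environment-manager.sh may differ.

```python
import json

# Hypothetical content; the real docker-config.json may use different keys.
EXAMPLE_CONFIG = """
{
  "example_service": {
    "dockerfile": "docker/example_service/Dockerfile",
    "cuda_tag": "example-cuda-tag"
  }
}
"""


def build_command(config, service, image_name):
    """Return a `docker build` argument list for one configured service."""
    if service not in config:
        raise KeyError(f"unknown service {service!r}; choose from {sorted(config)}")
    entry = config[service]
    return [
        "docker", "build",
        "-f", entry["dockerfile"],
        # Assumption: the child Dockerfile receives the base tag as a build arg.
        "--build-arg", f"CUDA_TAG={entry['cuda_tag']}",
        "-t", f"{image_name}:{service}",
        ".",
    ]


config = json.loads(EXAMPLE_CONFIG)
print(" ".join(build_command(config, "example_service", "glcapsnet")))
```

Returning the command as a list rather than a single string keeps it safe to pass to subprocess.run without shell quoting issues.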

How to create a new environment?

Experiments

How to run it?

  • Generate the input features:
  • The usage is defined in execute.py:
    • mode: train, test (efficient computation of metrics), predict (sample-wise prediction for saving data to disk)
    • feature: rgb, of (optical flow), segmentation_probabilities (semantic segmentation)
    • conv_block: the kind of convolutional module to use from conv_blocks.py
    • caps_block: the kind of capsule-based module to use from caps_blocks.py
    • experiment_id: folder name of the experiment with datetime
    • do_visual: save visual predictions
  • The execution generates the following:
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/config_train.py
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/checkpoints/weights.h5
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/logs/tensorboard-logs
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/logs/log.csv
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/logs/trace_sampling.npy
/path_output_in_config/[all,rgb,of,segmentation_probabilities]/conv_block/caps_block/experiment_id/predictions/[test_id,prediction_id]/[resulting_files]
  • The training command for each predefined config file is listed below. Note that the dataset and some other files must be generated first, and the paths in each config file must be adapted:
    • 00_branches:
      • rgb: python3.6 execute.py -m train -f rgb --conv_block cnn_generic_branch
      • of: python3.6 execute.py -m train -f of --conv_block cnn_generic_branch
      • segmentation_probabilities: python3.6 execute.py -m train -f segmentation_probabilities --conv_block cnn_generic_branch
    • 01_sf: python3.6 execute.py -m train -f all --conv_block cnn_generic_fusion
    • 02_gf: python3.6 execute.py -m train -f all --conv_block cnn_generic_fusion
    • 03_sc: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block ns_sc
    • 04_ns_sc: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block ns_sc
    • 05_triple_ns_sc: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block triple_ns_sc
    • 06_mask_triple_ns_sc: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block mask_triple_ns_sc
    • 07_mt_mask_triple_ns_sc: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block glcapsnet
    • 08_glcapsnet: python3.6 execute.py -m train -f all --conv_block cnn_generic_branch --caps_block glcapsnet
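
The arguments and output layout described above can be sketched as follows. This is a minimal illustration, not the actual contents of execute.py: the short flags -m and -f and the options --conv_block and --caps_block come from the commands listed above, while the long names --mode and --feature, the flags --experiment_id and --do_visual, the "none" placeholder folder for a missing caps_block, the datetime format, and the helper experiment_dir are assumptions.

```python
import argparse
from datetime import datetime
from pathlib import Path

MODES = ("train", "test", "predict")
FEATURES = ("all", "rgb", "of", "segmentation_probabilities")


def build_parser():
    """Argument parser mirroring the options described above (sketch only)."""
    parser = argparse.ArgumentParser(description="Hypothetical execute.py front end")
    # -m and -f appear in the commands above; the long names are assumed.
    parser.add_argument("-m", "--mode", choices=MODES, required=True)
    parser.add_argument("-f", "--feature", choices=FEATURES, required=True)
    parser.add_argument("--conv_block", required=True)
    # Assumption: caps_block is optional, since some commands above omit it.
    parser.add_argument("--caps_block", default=None)
    # Assumption: flag names for experiment_id and do_visual.
    parser.add_argument("--experiment_id", default=None)
    parser.add_argument("--do_visual", action="store_true")
    return parser


def experiment_dir(path_output, args):
    """Compose <output>/<feature>/<conv_block>/<caps_block>/<experiment_id>."""
    # Assumption: a literal "none" folder stands in for a missing caps_block.
    caps = args.caps_block or "none"
    # Assumption: a new experiment id is a datetime stamp in this format.
    exp_id = args.experiment_id or datetime.now().strftime("%Y%m%d_%H%M%S")
    return Path(path_output) / args.feature / args.conv_block / caps / exp_id


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-m", "train", "-f", "all", "--conv_block", "cnn_generic_branch",
         "--caps_block", "glcapsnet", "--experiment_id", "demo"]
    )
    root = experiment_dir("/path_output_in_config", args)
    print(root / "checkpoints" / "weights.h5")
    print(root / "logs" / "log.csv")
```

Restricting mode and feature with choices makes a mistyped value fail at parse time instead of after data loading has started.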

How to create new models?

Same I/O schema

  • Keep the input features, conditions and targets as for the already developed models:

New I/O schema

Requirements

Model function names must be unique within conv_blocks.py and within caps_blocks.py, because the code selects which model to run by those names.
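
One way such name-based dispatch might work is sketched below, which also shows why a duplicate name is a problem: the second definition would silently replace the first unless it is rejected. The registries CONV_BLOCKS and CAPS_BLOCKS, the register decorator, the resolve helper, and the placeholder function bodies are hypothetical and are not taken from the repository.

```python
from typing import Callable, Dict

# Hypothetical registries standing in for conv_blocks.py and caps_blocks.py.
CONV_BLOCKS: Dict[str, Callable] = {}
CAPS_BLOCKS: Dict[str, Callable] = {}


def register(registry):
    """Return a decorator that stores a model function under its own name."""
    def decorator(fn):
        if fn.__name__ in registry:
            raise ValueError(f"duplicate model function name: {fn.__name__}")
        registry[fn.__name__] = fn
        return fn
    return decorator


@register(CONV_BLOCKS)
def cnn_generic_branch(inputs):
    # Placeholder body; a real block would build and return network layers.
    return ("cnn_generic_branch", inputs)


@register(CAPS_BLOCKS)
def glcapsnet(features):
    # Placeholder body for illustration only.
    return ("glcapsnet", features)


def resolve(registry, name):
    """Look up a block by the name passed on the command line."""
    try:
        return registry[name]
    except KeyError:
        raise KeyError(f"unknown block {name!r}; available: {sorted(registry)}") from None


block = resolve(CONV_BLOCKS, "cnn_generic_branch")
print(block("rgb_input"))
```

Keeping one registry per module means the same name may appear once as a conv_block and once as a caps_block without conflict, which matches the per-module uniqueness requirement stated above.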

Citation

If you use portions of this code or ideas from the paper, please cite our work:

@article{martinez2020glcapsnet,
  title={Interpretable Global-Local Dynamics for the Prediction of Eye Fixations in Autonomous Driving Scenarios},
  author={J. {Martínez-Cebrián} and M. {Fernández-Torres} and F. {Díaz-de-María}},
  journal={IEEE Access},
  volume={8},
  pages={217068-217085},
  year={2020},
  publisher={IEEE},
  doi={10.1109/ACCESS.2020.3041606}
}

Questions

Please email any questions or comments to [email protected]. I will be happy to discuss anything related to the topic of the paper.
