Giter Site home page Giter Site logo

comet0y / miccai21_mmq Goto Github PK

View Code? Open in Web Editor NEW

This project forked from aioznetwork/miccai21_mmq

0.0 0.0 0.0 286 KB

Multiple Meta-model Quantifying for Medical Visual Question Answering (MICCAI 2021)

Home Page: https://blog.ai.aioz.io/research/vqa-mmq/

License: MIT License

Shell 1.71% Python 98.29%

miccai21_mmq's Introduction

Multiple Meta-model Quantifying for Medical Visual Question Answering (MMQ-VQA)

This repository is the implementation of MMQ for the visual question answering task in medical domain. Our proposal achieves superior accuracy in comparison with other state-of-the-art (sota) methods on two public medical VQA datasets: PathVQA dataset and VQA-RAD dataset.

  • On PathVQA dataset, our model achieved 13.9% for free-form and 63.8% for yes/no quetions.
  • On VQA-RAD dataset, our model achieved 56.9% for open-ended and 75.7% for close-ended questions.

For the detail, please refer to link.

This repository is based on and inspired by @Binh D. Nguyen's work. We sincerely thank for their sharing of the codes.

Prerequisites

PYTHON 3.6

CUDA 9.2

Please install dependence package by run following command:

pip install -r requirements.txt

MMQ Progress

Fig-1

Figure 1: Multiple Meta-model Quantifying in medical VQA. Dotted lines denote looping steps, the number of loop equals to $m$ required meta-models.

Preprocessing

Important: Before running any command lines, please run following command to access 'mmq_maml' folder:

$ cd mmq_maml

And now, you are in 'mmq_maml' folder.

  • PathVQA dataset for MMQ task should be downloaded via link. The downloaded file should be extracted to data/pathvqa_maml/ directory.
  • VQA-RAD dataset for MMQ task should be downloaded via link. The downloaded file should be extracted to data/vqarad_maml/ directory.

Traing MAML models with MMQ

Train MAML models with MMQ on PathVQA dataset:

$ sh run_pathVQA.sh

Train MAML models with MMQ on VQA-RAD dataset:

$ sh run_VQA_RAD.sh

VQA Progress

Fig-2

Figure 2: Our VQA framework is designed to integrate robust image features extracted from multiple meta-models outputted from MMQ.

Important: For all VQA experiments, you should be in the 'root' folder.

Preprocessing

PathVQA dataset for VQA task

All data should be downloaded via link. The downloaded file should be extracted to data_PathVQA/ directory.

VQA-RAD dataset for VQA task

All data should be downloaded via link. The downloaded file should be extracted to data_RAD/ directory.

Experimental results

MMQ results on PathVQA test set.

m n Free-form Yes/No Overall
MAML - - 5.8 79.5 42.9
MEVF - - 8.1 81.4 47.1
MMQ 5 3 13.4 84.0 48.8
MMQ + MEVF 5 2 13.9 83.8 49.0

MMQ results on VQA-RAD test set.

m n Open-ended Close-ended Overall
MAML - - 40.1 72.4 59.6
MEVF - - 43.9 75.1 62.7
MMQ 5 3 53.7 75.8 67
MMQ + MEVF 5 2 56.9 75.7 68.2

We have considered the recommendation of our reviewers about integrating MMQ into the MEVF. The setup further improves the overall performance in both PathVQA and VQA-RAD datasets. The number of parameters is only a 3% increase in comparison with our original MMQ. We please to provide the pre-trained weights of our state-of-the-art (SOTA) models in here .

Training

Train MMQ + MEVF model with Bilinear Attention Network on PathVQA dataset.

$ sh run_vqa_PathVQA.sh

Train MMQ + MEVF model with Bilinear Attention Network on VQA-RAD dataset.

$ sh run_vqa_VQA_RAD.sh

Pretrained models and Testing

For our SOTA model on PathVQA dataset MMQ_BAN_MEVF_pathVQA. Please download the link and move to saved_models/MMQ_BAN_MEVF_pathVQA/. The trained MMQ_BAN_MEVF_pathVQA model can be tested in PathVQA test set via:

$ sh run_test_PathVQA.sh

For our SOTA model on VQA-RAD dataset MMQ_BAN_MEVF_vqaRAD. Please download the link and move to saved_models/MMQ_BAN_MEVF_vqaRAD/. The trained MMQ_BAN_MEVF_vqaRAD model can be tested in VQA-RAD test set via:

$ sh run_test_VQA_RAD.sh

The result json file can be found in the directory results/.

We also provides the pretrained meta-models and CDAE models for further investigation as belows:

  • VQA-RAD:
    • The MAML models data_RAD/maml/*.pth is trained by using our MMQ source code.
    • The CDAE model data_RAD/pretrained_ae.pth.
  • PathVQA:
    • The MAML models data_PathVQA/maml/*.pth is trained by using our MMQ source code.
    • The CDAE model data_PathVQA/pretrained_ae.pth.

Citation

If you use this code as part of any published research, we'd really appreciate it if you could cite the following paper:

@inproceedings{aioz_mmq_miccai21,
  author={Tuong Do and Binh X. Nguyen and Erman Tjiputra and Minh Tran and Quang D. Tran and Anh Nguyen},
  title={Multiple Meta-model Quantifying for Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2021}
}

If you find that the Mixture of Enhanced Visual Features (MEVF) model for MedVQA is useful, you could cite the following paper:

@inproceedings{aioz_mevf_miccai19,
  author={Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran},
  title={Overcoming Data Limitation in Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2019}
}

License

MIT License

miccai21_mmq's People

Contributors

gazeal avatar quangduytran avatar xuanbinh-nguyen96 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.