siyi-wind / tip

[ECCV 2024] TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data (an official implementation)

License: Apache License 2.0

TIP

Model architecture and algorithm of TIP: (a) Model overview with its image encoder, tabular encoder, and multimodal interaction module, which are pre-trained using 3 SSL losses: $\mathcal{L}_{itc}$, $\mathcal{L}_{itm}$, and $\mathcal{L}_{mtr}$. (b) Model details for (b-1) $\mathcal{L}_{itm}$ and $\mathcal{L}_{mtr}$ calculation and (b-2) tabular embedding with missing data. (c) Pre-training algorithm.
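To make the role of $\mathcal{L}_{itc}$ more concrete, here is a minimal pure-Python sketch of an image-tabular contrastive loss. It assumes that $\mathcal{L}_{itc}$ takes the common CLIP-style symmetric InfoNCE form, which this README does not confirm; the function name `itc_loss`, the default `temperature`, and the toy embeddings are illustrative only and are not taken from the repository.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _cross_entropy(logits: Sequence[float], target: int) -> float:
    """Numerically stable cross-entropy for a single row of logits."""
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum_exp - logits[target]


def itc_loss(
    image_emb: Sequence[Sequence[float]],
    tabular_emb: Sequence[Sequence[float]],
    temperature: float = 0.1,  # hypothetical value, not from the paper
) -> float:
    """Symmetric InfoNCE loss: row i of each input is treated as a matching pair."""
    if not image_emb or len(image_emb) != len(tabular_emb):
        raise ValueError("need the same non-zero number of image and tabular embeddings")
    img = [_normalize(v) for v in image_emb]
    tab = [_normalize(v) for v in tabular_emb]
    n = len(img)
    # Cosine similarity of every image with every tabular row, scaled by temperature.
    sim = [
        [sum(a * b for a, b in zip(img[i], tab[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-tabular direction: each image should pick out its own tabular row.
    loss_i2t = sum(_cross_entropy(sim[i], i) for i in range(n)) / n
    # Tabular-to-image direction: each tabular row should pick out its own image.
    loss_t2i = sum(
        _cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (loss_i2t + loss_t2i)
```

With correctly paired embeddings the loss approaches zero, and it grows when the pairs are shuffled, which is the behaviour a contrastive objective of this kind is meant to reward.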

This is an official PyTorch implementation for TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data, ECCV 2024. We built the code based on paulhager/MMCL-Tabular-Imaging.

Contact: [email protected] (Siyi Du)

Give us a ⭐ if this repository helps you.

Updates

[11/07/2024] The arXiv paper is released.

[08/07/2024] The code is released.

Requirements

This code is implemented using Python 3.9.15, PyTorch 1.11.0, PyTorch Lightning 1.6.4, CUDA 11.3.1, and cuDNN 8.

cd TIP/
conda env create --file environment.yaml
conda activate tip
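
After activating the environment, a quick sanity check can confirm that the installed packages match the versions listed above. The sketch below uses only the standard library; the helper names (`installed_version`, `check_versions`) are illustrative, and it assumes the packages are registered under the distribution names `torch` and `pytorch-lightning`.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions stated in the Requirements section.
EXPECTED_VERSIONS = {"torch": "1.11.0", "pytorch-lightning": "1.6.4"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str], bool]]:
    """Map each package to (expected, installed, matches)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        # Some builds append a local tag such as "+cu113"; compare the public part only.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions(EXPECTED_VERSIONS).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {wanted}, found {found} [{status}]")
```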

Data

Download DVM data from here

Apply for the UKBB data here

Preparation

  1. Execute data/create_dvm_dataset.ipynb to get train, val, test datasets.
  2. Execute data/image2numpy.ipynb to convert jpg images to numpy format for faster reading during training.
  3. Execute data/create_missing_mask.ipynb to create missing masks (RVM, RFM, MIFM, LIFM) for incomplete data fine-tuning experiments.
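
To illustrate what step 3 produces, here is a minimal sketch of two of the mask types. It assumes that RVM and RFM stand for random value missingness and random feature missingness (the acronyms are not expanded in this README), and it does not reproduce the notebook's actual code or file format. MIFM and LIFM are left out because they would need a feature-importance ranking. All function names are hypothetical.

```python
import random
from typing import List

Mask = List[List[bool]]  # True marks a missing cell


def _check_rate(missing_rate: float) -> None:
    if not 0.0 <= missing_rate <= 1.0:
        raise ValueError("missing_rate must be between 0 and 1")


def random_value_mask(n_samples: int, n_features: int,
                      missing_rate: float, seed: int = 0) -> Mask:
    """RVM-style mask (assumed meaning): cells go missing independently of
    row and column, so every sample can lose a different set of features."""
    _check_rate(missing_rate)
    rng = random.Random(seed)
    n_cells = n_samples * n_features
    missing = set(rng.sample(range(n_cells), round(n_cells * missing_rate)))
    return [
        [(row * n_features + col) in missing for col in range(n_features)]
        for row in range(n_samples)
    ]


def random_feature_mask(n_samples: int, n_features: int,
                        missing_rate: float, seed: int = 0) -> Mask:
    """RFM-style mask (assumed meaning): the same randomly chosen columns
    are missing for every sample."""
    _check_rate(missing_rate)
    rng = random.Random(seed)
    missing_cols = set(rng.sample(range(n_features), round(n_features * missing_rate)))
    row = [col in missing_cols for col in range(n_features)]
    return [list(row) for _ in range(n_samples)]
```

Fixing the seed keeps the masks reproducible across runs, which matters when the same incomplete-data split is reused for several fine-tuning experiments.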

Training

Pre-training & Fine-tuning

CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=pretrain

Fine-tuning

CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=finetune pretrain=False evaluate=True checkpoint={YOUR_PRETRAINED_CKPT_PATH}

Fine-tuning with incomplete data

CUDA_VISIBLE_DEVICES=0 python -u run.py --config-name config_dvm_TIP exp_name=missing pretrain=False evaluate=True checkpoint={YOUR_PRETRAINED_CKPT_PATH} missing_tabular=True missing_strategy=value missing_rate=0.3
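
When running the incomplete-data experiment at several missing rates, it can help to generate the command lines programmatically. The sketch below only assembles strings in the same `key=value` override style as the command above; the helper names are hypothetical, and any rate other than 0.3 is an illustrative value rather than a setting documented here.

```python
import shlex
from typing import Dict, Iterable, List


def build_command(config_name: str, overrides: Dict[str, object], gpu: int = 0) -> str:
    """Assemble a run.py command line with key=value overrides."""
    args = ["python", "-u", "run.py", "--config-name", config_name]
    args += [f"{key}={value}" for key, value in overrides.items()]
    # Quote each argument so paths with spaces survive a shell.
    return f"CUDA_VISIBLE_DEVICES={gpu} " + " ".join(shlex.quote(a) for a in args)


def missing_rate_sweep(checkpoint: str, rates: Iterable[float]) -> List[str]:
    """One fine-tuning command per missing rate, mirroring the command above."""
    return [
        build_command(
            "config_dvm_TIP",
            {
                "exp_name": "missing",
                "pretrain": False,
                "evaluate": True,
                "checkpoint": checkpoint,
                "missing_tabular": True,
                "missing_strategy": "value",
                "missing_rate": rate,
            },
        )
        for rate in rates
    ]


if __name__ == "__main__":
    # The checkpoint path and the extra rates are placeholders.
    for command in missing_rate_sweep("/path/to/pretrained.ckpt", [0.3, 0.5, 0.7]):
        print(command)
```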

Checkpoints

Pre-trained Checkpoints

| Datasets    | DVM      | Cardiac  |
|-------------|----------|----------|
| Checkpoints | Download | Download |

Fine-tuned Checkpoints

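| Task                                | Linear-probing | Fully fine-tuning |
|-------------------------------------|----------------|-------------------|
| Car model prediction (DVM)          | Download       | Download          |
| CAD classification (Cardiac)        | Download       | Download          |
| Infarction classification (Cardiac) | Download       | Download          |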

License & Citation

This repository is licensed under the Apache License, Version 2.0.

If you use this code in your research, please consider citing:

@inproceedings{du2024tip,
  title={{TIP}: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data},
  author={Du, Siyi and Zheng, Shaoming and Wang, Yinsong and Bai, Wenjia and O'Regan, Declan P. and Qin, Chen},
  booktitle={18th European Conference on Computer Vision (ECCV 2024)},
  year={2024}
}

Acknowledgements

We would like to thank the following repositories for their great work:
