SongCi 🐲

SongCi is a multi-modal deep learning model tailored for forensic pathological analyses. The architecture consists of three main parts, i.e., an imaging encoder for WSI feature extraction, a text encoder for the embedding of gross key findings as well as diagnostic queries, and a multi-modal fusion block that integrates the embeddings of WSI and gross key findings to align with those of the diagnostic queries.

Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning

The framework of SongCi and studied large-vocabulary, multi-center datasets.

Updates:

  • 05/06/2024: We are working on refining the code updates for the SongCi model.

Installation:

Pre-requisites:

Python 3.9+
CUDA 12.1
pip
Anaconda

After activating the virtual environment, you can install specific package requirements as follows:

pip install -r requirements.txt

Optional: Conda Environment Setup

For those who prefer using Conda:

conda create --name songci python=3.9.7
conda activate songci
git clone https://github.com/shenxiaochenn/SongCi.git
cd SongCi
pip install -r requirements.txt

WSI preprocessing and text content (gross key findings & forensic pathology diagnosis)

WSI

NOTE: In practice, a single slide can contain several tissue types. To reduce the labeling time required of forensic scientists, we take a simple approach and delineate the area of interest with a rectangular box. Regions comprising a single tissue type, by contrast, are segmented without explicit labeling.

svs_datasets/
  ├── slide_1.svs
  ├── slide_2.svs
  ├── slide_3.svs
  ├── slide_3.json
  ├── slide_4.svs
  └── ...

Here we give an example.

python patch_tmp.py

This splits each WSI into patches at the specified magnification by looping over it. The JSON file in the dataset folder is an annotation file containing the four coordinates of the annotation box. The result is the patch-level dataset:

patch_datasets/
  ├── slide_1/
    ├── slide_1-0_1_.png
    ├── slide_1-0_2_.png
    ├── slide_1-0_3_.png
    └── ...
  ├── slide_2/
    ├── slide_2-0_1_.png
    ├── slide_2-0_2_.png
    ├── slide_2-0_3_.png
    └── ...
  ├── slide_3/
  ├── slide_4/
  └── ...
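
The tiling step can be pictured with a minimal sketch. This is not the code in patch_tmp.py: the function name `iter_patch_origins`, the patch size, and the box format `(x_min, y_min, x_max, y_max)` are assumptions made for illustration. The sketch lists the top-left corner of every full, non-overlapping patch, restricted to the annotation box when one is given.

```python
from typing import Iterator, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max); hypothetical format


def iter_patch_origins(
    width: int, height: int, patch: int, box: Optional[Box] = None
) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) of each full, non-overlapping patch.

    If `box` is given, only the annotated rectangle is tiled;
    otherwise the whole slide is tiled.
    """
    x0, y0, x1, y1 = box if box is not None else (0, 0, width, height)
    # Clamp the box to the slide so an annotation cannot run off the edge.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    for y in range(y0, y1 - patch + 1, patch):
        for x in range(x0, x1 - patch + 1, patch):
            yield x, y


if __name__ == "__main__":
    # A 1024x512 slide tiled with 256-pixel patches: 4 columns x 2 rows.
    print(len(list(iter_patch_origins(1024, 512, 256))))
    # The same slide, restricted to a 512x512 annotation box.
    print(list(iter_patch_origins(1024, 512, 256, box=(256, 0, 768, 512))))
```

In a real pipeline each origin would be passed to a slide reader to extract the pixel region at the chosen magnification; that part is omitted here.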

gross key findings & forensic pathology diagnosis

We provide sample text here in one of our cohorts.

The gross key findings are given as a paragraph, and the forensic pathology diagnoses are text segments separated by /.

text_xianjiaotong.csv

| slide_name | gross key findings | forensic pathology diagnosis |
| --- | --- | --- |
| slide_1 | The mucosa is smooth, complete and pink, there is no bleeding, ulceration or perforation. | Gastrointestinal congestion/Gastrointestinal tissue autolysis |
| slide_2 | There is a tear in the bottom of the heart, which leads inward to the left ventricle, the myocardium is dark red, and the coronary artery is stiff. | Coronary atherosclerotic heart disease/Myocardial infarction with heart rupture/Pericardial tamponade |
| slide_3 | The envelope of both kidneys is complete and easy to peel, the surface and section are brown red, and the boundary between skin and medulla is clear. | Renal autolysis/Congestion of kidney |
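
As a rough illustration of how such a file could be read, the sketch below parses two rows copied from the sample table above and splits the diagnosis field on /. The comma-separated layout, the exact header names, and the helper name `load_text_records` are assumptions; the real text_xianjiaotong.csv may be laid out differently.

```python
import csv
import io

# Two rows copied from the sample table; the comma-separated layout and the
# exact header names are assumptions about text_xianjiaotong.csv.
SAMPLE = '''slide_name,gross key findings,forensic pathology diagnosis
slide_1,"The mucosa is smooth, complete and pink, there is no bleeding, ulceration or perforation.",Gastrointestinal congestion/Gastrointestinal tissue autolysis
slide_3,"The envelope of both kidneys is complete and easy to peel, the surface and section are brown red, and the boundary between skin and medulla is clear.",Renal autolysis/Congestion of kidney
'''


def load_text_records(handle):
    """Return {slide_name: (findings, [diagnosis, ...])} from a CSV handle."""
    records = {}
    for row in csv.DictReader(handle):
        # Diagnoses are stored as one string with "/" between segments.
        diagnoses = [
            d.strip()
            for d in row["forensic pathology diagnosis"].split("/")
            if d.strip()
        ]
        records[row["slide_name"]] = (row["gross key findings"].strip(), diagnoses)
    return records


if __name__ == "__main__":
    for name, (findings, diagnoses) in load_text_records(io.StringIO(SAMPLE)).items():
        print(name, diagnoses)
```

Quoting the findings column matters because the paragraphs themselves contain commas.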

prototypical contrastive learning

  • How to train the prototypical self-supervised contrastive learning model

NOTE: In our study, the CUDA version is 12.1 and the Python version is 3.9. Training should be run on a system with at least eight NVIDIA GeForce RTX 3090 GPUs. In our experiments, fp16 training was unstable, so the command below sets --use_fp16 False.

python -m torch.distributed.launch --nproc_per_node=8  prototype_encoder/main_prototype.py   --use_bn_in_head True  --use_pre_in_head True  --use_fp16 False  --batch_size_per_gpu 96 --data_path /path/to/WSI_patch/train --output_dir /path/to/saving_dir

results:

/path/to/saving_dir/
  ├── log.txt
  ├── checkpoint.pth
  ├── queue.pth
  └── ...

WSI patch generator (prototype-based & instance-based)

For prototype-based generation, use the patch_generation/guided_diffusion/get_ssl_models.py file.

For instance-based generation, use the patch_generation/guided_diffusion/get_sl_models.py file.

Default: prototype-based

train

In the patch_generation folder, just run:

sh train.sh 

sampling:

  • prototype-based: by default, the loop iterates over all prototypes
sh sample_prototype.sh
  • instance-based: choose any instance you like
sh sample.sh

WSI segmentation

First, we convert each WSI into a table. The table records which prototype each patch belongs to, the exact similarity value, and the coordinates of the patch in the WSI.

python wsi_seg/prototype_index.py

This produces the WSI table. For example:

| patch_name | WSI_name | x_axis | y_axis | pro_index | sim_value |
| --- | --- | --- | --- | --- | --- |
| patch_1 | WSI_1 | 0 | 0 | 2 | 0.9623 |
| patch_2 | WSI_1 | 1 | 0 | 56 | 0.8958 |
| patch_3 | WSI_1 | 1 | 2 | 3 | 0.9703 |
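
To make the pro_index and sim_value columns concrete, here is a minimal sketch of assigning a patch embedding to its most similar prototype. The use of cosine similarity, the function names, and the toy vectors are assumptions for illustration; wsi_seg/prototype_index.py may compute the similarity differently.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_prototype(
    embedding: Sequence[float], prototypes: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (pro_index, sim_value) for the most similar prototype."""
    sims = [cosine(embedding, p) for p in prototypes]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


if __name__ == "__main__":
    # Toy 2-D prototypes and one patch embedding (made-up numbers).
    prototypes = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    index, sim = assign_prototype((0.9, 0.1), prototypes)
    print(index, round(sim, 4))
```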

Then run the following to get the final segmentation results:

python wsi_seg/wsi_seg_prototype.py
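
One way to picture how the WSI table turns into a segmentation is to place each patch's prototype index at its grid position. The sketch below does only that, using the three example rows; it assumes x_axis and y_axis are patch-grid column and row indices, and the function name `build_label_map` is hypothetical. It is not the logic of wsi_seg_prototype.py.

```python
from typing import List, Tuple

# (x_axis, y_axis, pro_index) taken from the example table rows.
ROWS = [(0, 0, 2), (1, 0, 56), (1, 2, 3)]


def build_label_map(
    rows: List[Tuple[int, int, int]], background: int = -1
) -> List[List[int]]:
    """Place each patch's prototype index at its grid position.

    Grid cells with no patch keep the `background` value.
    """
    width = max(x for x, _, _ in rows) + 1
    height = max(y for _, y, _ in rows) + 1
    grid = [[background] * width for _ in range(height)]
    for x, y, pro_index in rows:
        grid[y][x] = pro_index
    return grid


if __name__ == "__main__":
    for line in build_label_map(ROWS):
        print(line)
```

Mapping each prototype index to a color would then give a patch-resolution segmentation image.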

cross-modality contrastive learning

how to train the modality fusion block

  • train
python main_fusion.py  --data_path xxx  --depth 2 --checkpoint xxx(prototype-encoder) --output_dir xxx --gate True --noise_ratio 0.5 --saveckp_freq 100 --warmup_epochs 50

At inference time, a CSV file is returned containing the forensic diagnoses predicted by the model for the provided samples.

  • inference
python score_modality.py  --checkpoint xxx(prototype-encoder)  --fusion_checkpoint xxx(fusion block)   --data_path xxx --threshold 0.88  --out_name xx
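
The --threshold flag suggests that a diagnosis is reported only when its score is high enough. The sketch below shows that idea under stated assumptions: the scores are made up, the comparison `>=` is a guess, and `select_diagnoses` is a hypothetical helper, not a function from score_modality.py.

```python
from typing import Dict, List


def select_diagnoses(scores: Dict[str, float], threshold: float = 0.88) -> List[str]:
    """Keep every diagnostic query whose score reaches the threshold,
    ordered from highest to lowest score."""
    kept = [(name, s) for name, s in scores.items() if s >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


if __name__ == "__main__":
    # Made-up scores for one sample; the diagnosis names come from the sample text.
    scores = {
        "Gastrointestinal congestion": 0.93,
        "Gastrointestinal tissue autolysis": 0.89,
        "Renal autolysis": 0.41,
    }
    print(select_diagnoses(scores))
```

Raising the threshold trades recall for precision: fewer diagnoses are returned per sample, but each one is backed by a higher score.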

Multi-modality explainability

We compute the scores for each prototype and each word and collect them in a table.

| WSI_name | disease | img_dict | text_dict |
| --- | --- | --- | --- |
| WSI_1 | The hemorrhage under the scalp | {prototype:score} | {word:score} |
| WSI_2 | Gastrointestinal congestion | {prototype:score} | {word:score} |
| WSI_3 | Gastrointestinal tissue autolysis | {prototype:score} | {word:score} |
  • For a list of samples:
python visual_modality_index.py

Examples: here we show the top 5 prototypes and top 5 words

Multi-modality attention visualization of SongCi
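
Picking the top 5 entries from an img_dict or text_dict cell can be sketched as follows. The word scores are made up and `top_k` is a hypothetical helper; visual_modality_index.py may select and rank entries differently.

```python
import heapq
from typing import Dict, List, Tuple


def top_k(scores: Dict[str, float], k: int = 5) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (key, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up word scores standing in for one text_dict cell.
    text_dict = {
        "tear": 0.31, "ventricle": 0.22, "the": 0.01, "myocardium": 0.18,
        "coronary": 0.15, "stiff": 0.09, "is": 0.02,
    }
    print([word for word, _ in top_k(text_dict)])
```

The same helper works for prototype scores if the dictionary keys are prototype identifiers.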

Connection

✌️ If you have a keen interest in forensic pathology and wish to contribute to this field, whether through data provision, inquiries about algorithm implementation, innovative suggestions, or a desire for comprehensive communication and collaboration, we encourage you to contact us. We eagerly anticipate engaging in discussions with you! 😆 😆 😆

  • Zhenyuan Wang Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University Email: [email protected]
  • Chunfeng Lian School of Mathematics and Statistics, Xi'an Jiaotong University Email: [email protected]
  • Chen Shen Xi'an Jiaotong University Email: [email protected]

Citation

If you find SongCi useful for your research and applications, please cite it using this BibTeX:

@misc{shen2024largevocabularyforensicpathologicalanalyses,
      title={Large-vocabulary forensic pathological analyses via prototypical cross-modal contrastive learning}, 
      author={Chen Shen and Chunfeng Lian and Wanqing Zhang and Fan Wang and Jianhua Zhang and Shuanliang Fan and Xin Wei and Gongji Wang and Kehan Li and Hongshu Mu and Hao Wu and Xinggong Liang and Jianhua Ma and Zhenyuan Wang},
      year={2024},
      eprint={2407.14904},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2407.14904}, 
}

Related Projects

dino

guided-diffusion

flamingo
