AI-breast-cancer

This repo is the implementation for "MutiBCD: A Multimodal model that simulates the human diagnostic process for automated Breast Cancer Detection"

Abstract

In response to the complexities of breast cancer detection, our study introduces MutiBCD, a multimodal model that mimics the human diagnostic process for automated breast cancer detection. Integrating an image classifier with GPT-4, it evaluates mammographic images alongside patient complaints. The model’s dual-head autoencoder efficiently processes image data, eliminating the need for manual lesion delineation, while GPT-4 extracts critical information from patient narratives.

MutiBCD demonstrates superior diagnostic accuracy and efficiency, achieving an F1 score of 86.49% and a recall rate of 94.12%, which marks an improvement over traditional methods. Furthermore, its design, emphasizing interpretability, aligns with the intuitive experience of medical consultations. The encouraging results of MutiBCD in breast cancer detection indicate its potential for application in similar diagnostic contexts.

The MutiBCD model has a compact structure and flexible, efficient coupling between its components. Its code is open source at https://github.com/zhangzihan-is-good/AI-breast-cancer, which enhances the model's practical utility.
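
The F1 score is the harmonic mean of precision and recall, so the two figures reported above together imply a precision value, even though precision itself is not stated in this README. The sketch below works through that arithmetic. The function names are illustrative, and the derived precision is only an inference from the rounded figures above, not a result reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denominator


# Figures reported in the abstract, expressed as fractions.
reported_f1 = 0.8649
reported_recall = 0.9412

# Derived value (an inference, not a reported result).
precision = implied_precision(reported_f1, reported_recall)
print(f"implied precision: {precision:.4f}")
```

Under these assumptions the implied precision comes out at roughly 0.80, which is what a high-recall, lower-precision operating point would look like.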

Dataset

This study utilized the following two datasets:

  • The Chinese Mammography Database (CMMD):
    This database covers 1,775 patients from China with benign or malignant breast disease who underwent mammography examination between July 2012 and January 2016. It consists of 3,728 mammograms from these 1,775 patients, with biopsy-confirmed benign or malignant tumor types. For 749 of these patients (1,498 mammograms), the database also includes the molecular subtype. Image data were acquired on a GE Senographe DS mammography system. The data can be obtained from this link.

  • Chinese Breast Disease Clinical Imaging Database:
    This database includes 176 mammographic images and 84 corresponding patient complaints from 84 female breast disease patients. The data can be obtained from this link.

Code Description

  • cal_mean_std.py: Calculate the mean and standard deviation of mammography image datasets.
  • cal_para_quan.py: Calculate the parameter quantity of models.
  • config.yaml: A configuration file for setting up the project environment.
  • image_only.py, xgboost_muti.py, xgboost_text.py: Decision-makers for image-only, combined text and image, and text-only analysis, respectively.
  • load_data.py: Load datasets for processing.
  • losses.py: Compute loss functions for model training.
  • model.py: Build various types of models for image processing and analysis.
  • reconstruct.py: Test the reconstruction effects of image autoencoder models.
  • test_classifier.py, test_vit.py: Test various types of image classifiers.
  • train_ae.py: Train autoencoder models.
  • train_classifier.py, train_vit.py: Train various types of image classifiers.
  • utilss.py: Contains general utility functions for the project.
  • requirement.txt: The required Python packages.
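
As an illustration of what a script like cal_mean_std.py computes, here is a minimal sketch of a streaming pixel mean and standard deviation over a dataset. It is not the repository's code: the function name and the nested-list image representation are assumptions made for the example, and a real implementation would more likely read image files and use NumPy or PyTorch.

```python
import math
from typing import Iterable, Sequence, Tuple


def dataset_mean_std(images: Iterable[Sequence[Sequence[float]]]) -> Tuple[float, float]:
    """Return the mean and population standard deviation over all pixels.

    Running sums are accumulated so the whole dataset never has to be
    held in memory at once.
    """
    count = 0
    total = 0.0
    total_sq = 0.0
    for image in images:
        for row in image:
            for pixel in row:
                count += 1
                total += pixel
                total_sq += pixel * pixel
    if count == 0:
        raise ValueError("no pixels found")
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / count - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Two tiny hypothetical grayscale "images" with pixel values scaled to [0, 1].
toy_images = [
    [[0.0, 0.5], [0.5, 1.0]],
    [[0.25, 0.25], [0.75, 0.75]],
]
mean, std = dataset_mean_std(toy_images)
print(f"mean={mean:.4f}, std={std:.4f}")
```

Statistics like these are typically used to normalize images before training. On very large datasets the sum-of-squares form can lose precision, in which case Welford's algorithm is a more robust alternative.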

Installation

pip install -r requirements.txt

Usage

Specific parameters can be adjusted in config.yaml. The main experimental results can be reproduced through the following steps.

Train the AutoEncoder

python train_ae.py

Train the Classifier

python train_classifier.py

Train the XGBoost decision-maker

python xgboost_muti.py
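
The three steps above can also be chained from a single script. The following is a minimal sketch, assuming the scripts are run from the repository root and take no required command-line arguments (as in the commands above). The helper names `build_commands` and `run_pipeline` and the `dry_run` flag are hypothetical and not part of the repository.

```python
import subprocess
import sys
from typing import List

# Script names are taken from the steps above, in the order they are listed.
PIPELINE_SCRIPTS = ["train_ae.py", "train_classifier.py", "xgboost_muti.py"]


def build_commands(python_executable: str = sys.executable) -> List[List[str]]:
    """Build one command per training stage."""
    return [[python_executable, script] for script in PIPELINE_SCRIPTS]


def run_pipeline(dry_run: bool = True) -> None:
    """Run the stages in order, stopping at the first failure."""
    for command in build_commands():
        print("running:", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed stage prevents later stages from starting.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical default: only print the commands. Set dry_run=False
    # when running inside the repository root.
    run_pipeline(dry_run=True)
```

Stopping at the first failure matters here because each stage presumably depends on the output of the previous one.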

Citing & Authors

If you find this repository helpful, feel free to cite our publication:

MutiBCD: A Multimodal model that simulates the human diagnostic process for automated Breast Cancer Detection

Contact: ZihanZhang, JuntongDu
