biomedsciai / fuse-med-ml Goto Github PK

A python framework accelerating ML based discovery in the medical field by encouraging code reuse. Batteries included :)

License: Apache License 2.0

Python 94.97% Jupyter Notebook 4.48% Shell 0.44% Makefile 0.05% Batchfile 0.06%

deep-learning machine-learning pytorch collaboration fuse-med-ml fusemedml fuse medical medical-imaging healthcare

fuse-med-ml's People

Contributors

Stargazers

Watchers

fuse-med-ml's Issues

Cache options

Hi,
Is there a way to enable flexible caching? For example if I have a bug which affects only certain cases, I would like to use a list of the cases I want to recache, and not recache the whole dataset.
Thanks

Add format checkers as pre-commit hooks

Is your feature request related to a problem? Please describe.
Currently the format changes must be applied manually before pushing to remote.

Describe the solution you'd like
The format changes will be applied automatically with a pre-commit hook.

Additional context
Ref:
https://pre-commit.com/hooks.html

target_name Null value in FuseLossDefault

If target_name in FuseLossDefault is None, which may be a feasible scenario (e.g. for unsupervised task losses) it crashes.

Add example to Fuse.dl section in the Fuse-med ML paper

Examples in fuse.data and fuse.ml sections are extremely useful. Please expand fuse.dl section to include a similar example to explain its functionality.

urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] on CICD

Describe the bug
get the following error message while trying to download with wget

ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

In the traceback we have:

wget.download(src, destination_file)

It happened before with a different download target (some model's weights I think) - we did a WA and forgot about it. Now it's time to fix it 😄

FuseMedML version
0.2.9

Python version
3.8

To reproduce
run:

KITS21.download(kits_dir, cases)

Additional context
Probably related to CCC server.

Can't install both fuse and tensorflow

Currently FuseMedML (v0.2.9) requires protobuf version 3.20 but (pip installable) tensorflow (v2.11.0) requires <3.20.
Can we resolve this somehow? Thanks!

Trained Model

Dear all,

Is it possible to have the weights for the trained model regarding the DUKE fuse repository ResNet model?

Integration with HuggingFace Transformer

Is your feature request related to a problem? Please describe.
Integration with HuggingFace Transformer

Describe the solution you'd like
Create end-to-end example in fuse_examples that use visual transformer.

Describe alternatives you've considered
N/A

Additional context
N/A

Caching - print the pipeline representation when different hash detected

Pinned protobuf version exposes CVE-2022-1941

Describe the bug
Protobuf dependency currently pinned to 3.20.1 is exposing "high"-score vulnerability CVE-2022-1941.

FuseMedML version
0.2.5

Python version
3.8.1

To reproduce
Check the currently pinned version.

Expected behavior
A secure version of the dependency should be used.

Missing dependency for cv2

Following the instructions for the skin lesions example the following error is encountered,

Traceback (most recent call last):
  File "./fuse-med-ml/fuse_examples/classification/skin_lesion/runner.py", line 57, in <module>
    from fuse.data.visualizer.visualizer_default import FuseVisualizerDefault
  File "./fuse-med-ml/fuse/data/visualizer/visualizer_default.py", line 28, in <module>
    from fuse.utils.utils_image_processing import FuseUtilsImageProcessing
  File "./fuse-med-ml/fuse/utils/utils_image_processing.py", line 25, in <module>
    import cv2
ModuleNotFoundError: No module named 'cv2'

opencv-python is missing from the requirements.txt file and is not installed by default.

`NDict` merge function overrides values

Describe the bug
When calling the ndict.merge(other_ndict) when ndict and other_ndict are NDict objects with the same keys, the values of other_dict are overriding ndict's.

If you are not familiar with the NDict object, please visit read ndict docu.

FuseMedML version
version 0.2.4

Python version
3.8.13

To reproduce

from fuse.utils.ndict import NDict

A = NDict({"a": 0})
B = NDict({"a":1})

A.merge(B)
# expected: {'a': [0, 1]} (?) OR print a warning of the overriding
# but we get: {'a': 1}

Expected behavior
Merge values of the same keys into a sequence (list)
OR
Print a warning of the overriding

Standardize dependency management approach

Standard dependency management practice

Standard Python dependency management practice ([1][2][3]) would look as follows:

setup.py (or setup.cfg / pyproject.toml) for capturing only direct dependencies, with loosest applicable constraints
- this ensures interoperability, i.e. package can operate smoothly alongside others (which have their own dependencies and lifecycle), while mitigating risk of dependency hell
requirements*.txt for exhaustively capturing both direct and transitive dependencies, pinned to specific versions
- this ensures reproducibility, i.e. we can recreate the same environment for our package

Current setup

The approach currently followed by fuse seems to be a bit of a "mix-up":

setup.py does not directly define any dependencies, but rather sources various requirements*.txt files
requirements*.txt files contain various dependencies, but with mixed/unclear semantics:
- scope: I understand they do not contain (all) transitive dependencies => can lead to reproducibility issues
- constraints: mostly unconstrained but some times constrained or even pinned to specific versions => can lead to interoperability issues / "dependency hell"

Proposal

In order to improve interoperability and reproducibility, while also providing clearer semantics for library usage and maintenance, it would be best to align to the dependency management scheme outlined above.

[1] https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/#install-requires-vs-requirements-files
[2] https://pip.pypa.io/en/stable/topics/dependency-resolution/#possible-solutions
[3] https://pip.pypa.io/en/stable/topics/repeatable-installs/

create documentation project on Read the Docs

Hey!
This issue aims to added documentation for FuseMedML using Read the Docs automating building.

Currently we have this, which is clearly irrelevant. The end goal is to have Fuse's documentation nicely documented on Read the Docs to help AI researchers get started with our tools.
Please visit here for a good example.

Also note that the badge should be 'passing'.

References
See casualib library and their read the docs.
Read the Docs website

Please feel free to participate and solve that issue.
Happy Coding !

README budges.

Is your feature request related to a problem? Please describe.
Add more budges to the main README file. See other open-source projects for reference.

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
N/A

ViT

add a few things:

default parameters for depth, heads etc.
standard models like - vit_base, vit_large, …
the interpolation module for diff image sizes.
What about pre trained weights? and an option flag.

Originally posted by @egozi in #200 (comment)

Model deploy

Is your feature request related to a problem? Please describe.
Deploy a model including:

processing pipeline
source code
checkpoint
end-to-end model object, including processing and weights (with the source code)

Describe the solution you'd like
N/A
Describe alternatives you've considered
N/A
Additional context
N/A

integrate clearml to track experiments

Error in complex batches

fuse-med-ml/fuse/utils/data/collate.py

Lines 123 to 127 in fb11d7e

    
           batch_size = None 
        
           for key in keys: 
        
               if isinstance(batch[key], torch.Tensor): 
        
                   batch_size = len(batch[key]) 
        
                   break

When the batch contains various Tensors (is the case of DGL Graphs) it may happen that the first Tensor is not useful to infer the batch size, producing inconsistent behavior (python dictionary doesn't necessarily return sorted keys).

I propose inferring the number of samples in the batch by looking at the length of data.sample_id if it is present and the current behavior otherwise

    if 'data.sample_id' in keys:
        batch_size = len(batch['data.sample_id'])
    else:
        batch_size = None

    if batch_size is None:
        for key in keys:
            if isinstance(batch[key], torch.Tensor):
                batch_size = len(batch[key])
                break

fix `LightningDeprecationWarning`

Describe the bug\

....../python3.8/site-packages/pytorch_lightning/callbacks/base.py:22: LightningDeprecationWarning: pytorch_lightning.callbacks.base.Callback has been deprecated in v1.7 and will be removed in v1.9. Use the equivalent class from the pytorch_lightning.callbacks.callback.Callback class instead.

Example of fuse with knight dataset does not work with cpu

Describe the bug
Example folder: examples/fuse_examples/imaging/classification/knight/baseline
After setting the config.yaml with num_gpus : 0, this error follows:

Failed to run nvidia-smi
Traceback (most recent call last):
  File "/[..]/fuse-med-ml/examples/fuse_examples/imaging/classification/knight/baseline/fuse_baseline.py", line 221, in <module>
    main(config_path)
  File "/[..]/fuse-med-ml/examples/fuse_examples/imaging/classification/knight/baseline/fuse_baseline.py", line 106, in main
    GPU.choose_and_enable_multiple_gpus(cfg["num_gpus"], force_gpus=None)
  File "/[..]/fuse-med-ml/fuse/utils/gpu.py", line 74, in choose_and_enable_multiple_gpus
    raise Exception("could not auto-detect available GPUs")
Exception: could not auto-detect available GPUs

FuseMedML version
commit 6a90bf3

Python version
Python 3.9.13

To reproduce
Following steps:

Modify config.yaml with num_gpus : 0
(set env variables)
Execute examples/fuse_examples/imaging/classification/knight/baseline/fuse_baseline.py

Expected behavior
Script should not search GPUs if config.yaml with num_gpus : 0.

can't install fuse properly as part of a requirements.txt file

when placing fuse in a requirements.txt file, for example by adding a line that references the GitHub repo:
git+ssh://[email protected]/IBM/fuse-med-ml.git, and doing pip install -r requirements.txt, the behavior is different from when doing git clone [email protected]:IBM/fuse-med-ml.git && cd fuse-med-ml && pip install -e .

The latter works fine, so when building a project that depends on fuse and other packages, a workaround is to first prepare an environment in which fuse is installed using: git clone [email protected]:IBM/fuse-med-ml.git && pip install -e ., and THEN install the other packages using pip install -r requirements.txt.

If I try to install fuse by adding it as one of the requirements in requirements.txt, files that reside in the main source code directory, like VERSION.txt don't get copied to the site-packages, which leads to a crash in this line for example:
https://github.com/IBM/fuse-med-ml/blob/cd00ae0dd9ad381f840aabdd88791962d5f77fb6/fuse/data/__init__.py#L5
when doing: from fuse.data import OpBase.
Adding -e . after git+ssh://[email protected]/IBM/fuse-med-ml.git in the requirements.txt file didn't help.

even if the VERSION.txt file did get copied to the expected path, it would be problematic because there would be a version file with a non package specific name in the main site-packages directory, and not under fuse.
but I couldn't get it to copy using package_data argument in setup.py or adding it to MANIFEST.in

Auto-Detect when changing params in `dataset_balanced_division_to_folds(reset_split=False)`

Is your feature request related to a problem? Please describe.
When calling dataset_balanced_division_to_folds(reset_split=False) we don't look at the parameters:

    if os.path.exists(output_split_filename) and not reset_split:
        return load_pickle(output_split_filename)
    else:
...

Meaning that if output_split_filename exists, the changes won't take place. Thus the user needs to change it manually or worse, be confused with the results.

Describe the solution you'd like
When necessary, auto-reset the split file.

Describe alternatives you've considered

Save the parameters in the split file (or hash them) and read them each time, looking for a diff.
Save the parameters (or hash them) in a different file.
Manually inspect the parameters.

Additional context
:)

Add new metric - example + doc

Add an example in the eval package that shows how to add a new metric given a simple function that computes it.
List all the available metrics (with a link to the source file) at the begging of the main REAMDE

Debug Operations in data package

Is your feature request related to a problem? Please describe.
When building a data pipeline - we might need to debug and visualize the intermediate sample.

Describe the solution you'd like
Implement a collection of simple debug ops (operations). When debugging a pipeline, that operation can be added to the pipeline and used.
List of operations to implement:
Core Fuse:
OpPrintSampleDictKeys() - base print the sample keys with the type of the value and shape if it is numpy array or tensor
OpPrintSampleDict().call(keys)

fuseimg:
OpSave2DImage().call(key, pre_process)
OpSave3DImage().call(key, pre_process)
OpShow2DImage().call(key, pre_process)
OpShow3DImage().call(key, pre_process)
OpSaveMulti2DImages()
OpShowMulti2DImages()
.
.
.

Describe alternatives you've considered
N/A

Additional context
N/A

One click cross validation

Support end-to-end cross-validation with minimal user intervention

Support batching variable size tensors using nested tensors

Is your feature request related to a problem? Please describe.
Support batching variable size tensors using nested tensors (https://pytorch.org/tutorials/prototype/nestedtensor.html)
To avoid padding and improve the running time.

Describe the solution you'd like
Add such an option in CollateDefault as an alternative to CollateDefault.pad_all_tensors_to_same_size.

Describe alternatives you've considered
N/A

Additional context
N/A

A package named "examples" is installed separately from "fuse_examples"

Describe the bug
I can (seemingly) access all examples code using "import examples.fuse_examples..." instead of "import fuse_examples..." as intended.
I think this happens after running pip install -e .[all], before even running pip install -e examples.
this is confusing and can lead to users changing imports in a way that's not intended

FuseMedML version
0.2.2

Python version
3.7.12

To reproduce
pip install -e .[all] and then import examples (works also when python is ran from any arbitrary directory, not just the fuse-med-ml root dir)

Expected behavior
It should trigger a "Module not found" error

Test and declare support in python 3.8 and 3.9

Is your feature request related to a problem? Please describe.
Using FuseMedML with python 3.8 and 3.9

Describe the solution you'd like
run all the tests (run_all_unit_tests.sh) with python 3.8 and 3.9. Fix tests that fail if those exist.

Describe alternatives you've considered
N/A

Additional context
N/A

Separate contingency table from McNemar's test

There could be an ambiguity as to what variables to use for calculating the contingency table for McNemar's test.
The test can be applied in general to any two paired variables. Not necessarily supervised data.
In our implementation, we assume the use-case of comparing two classifiers. In some existing libraries, they use for such cases pred1_correct and pred2_correct as the two variables, meaning the boolean comparisons of each classifier's predictions to the ground truth. We currently use just pred1 and pred2 (without taking ground truth into account).

To avoid this ambiguity, I propose to implement a new ContingencyTable metric that can receive any two paired variables. and separate it from McNemar's test which would then get the contingency table as input.

Upgrade of GroupAnalysis is required

In the current version there is a problem of calculation CI for each sub-group in GroupAnalysis

Some updates are required:

managing error messages by setting error message as a result instead of printing entire back-trace
specification of the list of groups for evaluation
allow smooth calculation of CI for each sub-group

Enable classification metrics to use logits and class indicators

The current implementation of the classification metrics expects class labels as integers. The user might want to use a model that outputs logits to estimate the class.

In this case, it would be possible to spare the user from tuning the model, by enabling the metric computation to use logits or class scores directly. This can be easily done by using type_of_target function from sklearn and if the result is some sort of continous value, apply an argmax function.

An alternative is that the user will implement the argmax or even explore the operating point of the model, but the proposed solution would be the default in case the user takes no action.

State of the field section missing from the fuse-med-ML paper

According to the JOSS review guidelines, the authors should provide a section titled 'State of the field' which describes how this software compares to other commonly-used packages. This section is currently missing from the Fuse-Med-ML paper.

Add LossSampleWeighted

Add a loss wrapper that allows to weight per sample.

(In the same way that LossWarmUp wraps a loss)

Multiprocessing error when running KNIGHT baseline

Describe the bug
fatal error when loading KNIGHT dataset using RedHat Linux. Error does not happen in MacOS.

FuseMedML version
commit 6a90bf3

Python version
Python 3.8.13

To reproduce
python fuse-med-ml/examples/fuse_examples/imaging/classification/knight/baseline/fuse_baseline.py

Expected behavior
Loading and caching the dataset

Trace\

multiprocess pool created with 8 workers.
  0%|                                                                                                                                                                                                                                                                                                                                       | 0/240 [00:09<?, ?it/s]
Traceback (most recent call last):
  File "[...]/fuse/lib/python3.8/multiprocessing/pool.py", line 851, in next
    item = self._items.popleft()
IndexError: pop from an empty deque

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "baseline/fuse_baseline.py", line 221, in <module>
    main(config_path)
  File "baseline/fuse_baseline.py", line 116, in main
    train_ds, valid_ds = KNIGHT.dataset(
  File "fuse-med-ml/fuseimg/datasets/knight.py", line 307, in dataset
    train_dataset.create()
  File "fuse-med-ml/fuse/data/datasets/dataset_default.py", line 103, in create
    self._output_sample_ids_info = self._cacher.cache_samples(self._orig_sample_ids)
  File "fuse-med-ml/fuse/data/datasets/caching/samples_cacher.py", line 183, in cache_samples
    all_ans = run_multiprocessed(
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 140, in run_multiprocessed
    ans = [x for x in iter]
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 140, in <listcomp>
    ans = [x for x in iter]
  File "fuse-med-ml/fuse/utils/multiprocessing/run_multiprocessed.py", line 223, in _run_multiprocessed_as_iterator_impl
    for curr_ans in tqdm_func(
  File "[...]/fuse/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "[...]/fuse/lib/python3.8/multiprocessing/pool.py", line 856, in next
    self._cond.wait(timeout)
  File "[...]/fuse/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

KeyboardInterrupt

End to end example using ddp (Distributed Data Parallel) strategy.

Tensorboard and CSV Logger

Is your feature request related to a problem? Please describe.

Display on tensorboard the train curves on top of the validation curve.
Create a csv file with results per epoch

Describe the solution you'd like
Implement a new logger that manages 3 pytorch lightning loggers (tensorboard for validation, tensorbaord for train and csv)
Then pass it to pl.Trainer(logger=logger)

Describe alternatives you've considered
I checked if we can manipulate the default PyTorch lighting logger, but didn't manage to get the behavior I want.

Additional context
N/A.

Make BackboneResnet3D backward compatible

Describe the bug
BackboneResnet3D expects the instance variable self.pool which was not present in models trained and stored with fuse-med-ml<0.2.8
See: https://github.com/BiomedSciAI/fuse-med-ml/blob/master/fuse/dl/models/backbones/backbone_resnet_3d.py#L83

FuseMedML version
FuseMedML 0.2.8 used.

Python version
3.8.15

To reproduce
Run forward pass with a model created with fuse-med-ml<0.2.8.

Expected behavior
Models generated with previous versions should run.

Start training on CPU if couldnt auto detect GPU/s

Describe the bug
If fuse does not auto-detect any GPU, it will just stop raise Exception. Instead I think we should continue with cpu (and leaving a warning saying so)

FuseMedML version: 0.2.4

Python version: Python 3.10.4

To reproduce
Run MNIST example without gpu (m1 mac)

Expected behavior
Start Running on CPU.

Additional context
Solvable in utils/gpu.py:

if available_gpu_ids is None:
    raise Exception("could not auto-detect available GPUs")

lr scheduling

Is your feature request related to a problem? Please describe.
Add more learning rate scheduling options

Describe the solution you'd like
Include a flag/operation to choose other lr scheduling than reduce-on-plateau.
In particular, cosine-annealing scheduling is very popular and should be part of the package.

Describe alternatives you've considered
NA

Additional context
NA

add `isort` as a pre-commit hook

Describe the solution you'd like
Adding isort to the pre-commit hooks and apply it to all files.

Describe alternatives you've considered
There are other libraries, isort is a popular one.

Additional context
references:
https://github.com/PyCQA/isort
https://pycqa.github.io/isort/docs/configuration/pre-commit.html
https://github.com/psf/black/blob/main/.pre-commit-config.yaml#:~:text=github.com/pycqa/-,isort,-rev%3A%205.10.1

Add "silent" parameter for the Evaluator

Currently we always print metric results to screen, regardless of whether we save to file or not, or if for other reasons a user doesn't want prints

Add a flag for clearml start function to enable/disable it

integrate mlflow to track experiments

Python 3.10 incompatibility

Currently fuse is incompatible with Python 3.10.x

for example, I encountered an ImportError: cannot import name 'Iterable' from 'collections'
because they changed the structure of the collections module and it needs to be imported as:
from collections.abc import Iterable rather than from collections import Iterable

Maybe there are other compatibility issues as well, for now I just reverted back to Python 3.7, but we should solve it sometime...

Remove `torchvision` version upper bound

Issue
Currently we have in the fuse/requirements.txt the following line:

torchvision>=0.8.1,<0.14.0  #removal of upper bound causes issues in tests?

This was originally added as a workaround for some issue (forgot which one).

Goal

Find the problem and solve it so we can remove the upper bound without harm the CICD checks

Apply `mypy` typing checker to all files

Is your feature request related to a problem? Please describe.
As described here, we use black, flake8 and mypy to enforce certain format criteria.
Currently we are ignoring some of the issues and some of files for the mypy checker. See ./.mypy.ini file for more details.

Describe the solution you'd like
Apply static typing in the ignored files so mypy will pass without ignoring any files (or at least much smaller amount).

NOTES

NO LOGIC SHOULD BE CHANGED. Only the static typing according the functions/classes properties.
This task might take too much time and effort in one-shot, so it can be done in a small doses - each PR with some amount of files fixed.

walkthrough_template.py is out of date

Need to update it to use the new evaluation package and out-of-date comments.

References missing in the Fuse-med-ML paper

The fuse-med-ml paper is missing references. Please add proper citations for the papers referenced in the text (e.g. the following papers:[@raboh2022context], [@rabinovici2022early], [@rabinovici2022multimodal], [@jubran2021glimpse], [@tlusty2021pre], [@golts2022ensemble], [@radiology])

Vulnerable protobuf version still exposed

Describe the bug
#193 was supposed to fix #192, however depending on the order pip decides to resolve dependencies (e.g. if it resolves the latest possible pytorch_lightning first), protobuf may still get resolved to the vulnerable version 3.20.1.

FuseMedML version
0.2.6

Python version
Exact Python version used. E.g. 3.8.13

To reproduce
Occurrence depends on dependency resolution order, which may vary.

Expected behavior
A secure version should be installed instead.

	batch_size = None
	for key in keys:
	if isinstance(batch[key], torch.Tensor):
	batch_size = len(batch[key])
	break

biomedsciai / fuse-med-ml Goto Github PK

fuse-med-ml's People

Contributors

Stargazers

Watchers

Forkers

fuse-med-ml's Issues

Standard dependency management practice

Current setup

Proposal

Recommend Projects

Recommend Topics

Recommend Org