Comments (6)
@paolodi You made this work with interpretations and tested it on UKBB tensors. What else must be done before the PR?
Also, we should merge my Partners branch with master soon so I can revise my tmaps and `explore` to use interpretations. Can we discuss this with the team next week?
from ml4h.
I need to:
- Resolve difference between intersect and union counts for test cohort.
- Add `missing_fraction` to `summary_stats_string`
- Add `missing_fraction` to `summary_stats_continuous`
- Add `variance` to `summary_stats_continuous`
- Parallelize
@paolodi note the latest version of `explore` is in `recipes.py` in my branch `er_tensorize_partners_ecgs` rather than `er_explore_tensormaps`. I want to delete the latter branch as I have not used it for anything. What do you think?
I'd leave it there, as that's the branch the PR refers to. Deleting it would close the PR again. I'll take care of merging it with `er_tensorize_partners_ecgs`, and once it works again with master we can ask for an additional pair of eyes.
Thanks!
@paolodi I finished more modifications (378a4ab), so please check out `recipes.py` from the latest `er_tensorize_partners_ecgs` branch.
- Resolve difference between intersect and union counts for test cohort.
  Differences are expected behavior because the intersection (∩) across tmaps will be lower than the union. Duh.
- Add `missing_fraction` to `summary_stats_string`
- Add `missing_fraction` to `summary_stats_continuous`
- Add `variance` to `summary_stats_continuous`
- Parallelize: iterating through many hd5s is I/O bound; parallelizing slowed down overall run time.
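To make the intersect versus union point concrete, here is a minimal sketch. The tmap names and sample IDs are made up for illustration and are not taken from the test cohort.

```python
# Hypothetical sample IDs that have a usable tensor for each tmap.
ids_by_tmap = {
    "tmap_a": {"1001", "1002", "1003", "1004"},
    "tmap_b": {"1002", "1003", "1004", "1005"},
    "tmap_c": {"1003", "1004", "1006"},
}

id_sets = list(ids_by_tmap.values())
intersect_ids = set.intersection(*id_sets)  # samples that have every tmap
union_ids = set.union(*id_sets)             # samples that have at least one tmap

intersect_count = len(intersect_ids)
union_count = len(union_ids)
print(f"intersect: {intersect_count}, union: {union_count}")
```

The intersection is always a subset of the union, so its count can never exceed the union count, and it shrinks as more tmaps are required.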
I also now calculate summary stats using Pandas functions whenever possible, since they are fast and work on a `pd.Series` of `np.array`s. I had trouble using NumPy functions with our nested data structure.
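As a rough sketch of what Pandas-based summary stats with `missing_fraction` and `variance` could look like: the function bodies below are my assumption and not the code in `recipes.py`, and they use flat columns for simplicity, with NaN/None marking a missing value.

```python
import numpy as np
import pandas as pd


def summary_stats_continuous(values: pd.Series) -> dict:
    """Summarize one continuous column (hypothetical implementation)."""
    return {
        "count": int(values.count()),                     # non-missing entries
        "missing_fraction": float(values.isna().mean()),  # share of missing entries
        "mean": float(values.mean()),                     # NaN is skipped by default
        "variance": float(values.var()),                  # sample variance (ddof=1)
    }


def summary_stats_string(values: pd.Series) -> dict:
    """Summarize one string column (hypothetical implementation)."""
    return {
        "count": int(values.count()),
        "missing_fraction": float(values.isna().mean()),
        "n_unique": int(values.nunique()),                # distinct non-missing values
    }


continuous_stats = summary_stats_continuous(pd.Series([1.0, 2.0, np.nan, 4.0]))
string_stats = summary_stats_string(pd.Series(["a", None, "b", "a"]))
print(continuous_stats)
print(string_stats)
```

Note that `Series.var()` defaults to the sample variance (`ddof=1`), whereas `np.var` defaults to the population variance (`ddof=0`), so the two give different numbers unless `ddof` is set explicitly.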
The most important change is that `tensor_to_df` is no longer called for each interpretation.
`tensor_to_df` iterates through every tensor and consolidates them into a big dataframe.
Now we call it once, and for each interpretation we only iterate through the keys (of the `df`) that belong to that interpretation.
Calling it once instead of three times reduces the run time to 33% of before!
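The pattern is roughly the following sketch. `tensor_to_df_stub`, the column names, and the interpretation names are placeholders I made up for illustration; the real `tensor_to_df` reads hd5 files and has a different signature.

```python
import pandas as pd


def tensor_to_df_stub() -> pd.DataFrame:
    """Stand-in for the expensive pass over every tensor (hypothetical)."""
    tensor_to_df_stub.calls += 1
    return pd.DataFrame({
        "age_continuous": [61.0, 54.0, 70.0],
        "bmi_continuous": [24.1, None, 30.2],
        "sex_categorical": ["F", "M", "F"],
        "rhythm_language": ["sinus", "afib", None],
    })


tensor_to_df_stub.calls = 0  # counts how often the expensive step runs

# Hypothetical mapping from interpretation to the df keys that belong to it.
keys_by_interpretation = {
    "continuous": ["age_continuous", "bmi_continuous"],
    "categorical": ["sex_categorical"],
    "language": ["rhythm_language"],
}

df = tensor_to_df_stub()  # built once, outside the loop over interpretations

missing_by_interpretation = {}
for interpretation, keys in keys_by_interpretation.items():
    subset = df[keys]  # only the columns for this interpretation
    missing_by_interpretation[interpretation] = subset.isna().mean().to_dict()

print(tensor_to_df_stub.calls)
print(missing_by_interpretation)
```

Selecting columns from an existing dataframe is cheap compared with re-reading every file, which is why hoisting the call out of the loop gives roughly a threefold speedup with three interpretations.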
Other changes for future PRs are:
- Improving performance with JIT compilation via Numba
- Modularizing most of `explore` into functions, and housing those functions inside of `explore.py`
- Plotting histograms for some of the summary stats (especially continuous)