Comments (4)
- Even for regions with large blank spaces / a high background ratio, two-stage HIPT should still learn relevant region-level embeddings. From my own experimentation, running inference + evaluation on regions with empty patches is fine. However, if you are averaging the region-level embeddings as a proxy for your slide-level embeddings (for large WSIs), you may be blurring the relevant signal in the WSI.
- Multiple WSIs per patient are used for survival prediction (risk is predicted at the patient level). CLAM-SB is a variation of Attention MIL with an additional instance subtyping loss that works for discriminative classification tasks. The instance subtyping loss does not translate immediately to regression tasks.
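To make the averaging caveat above concrete, here is a minimal sketch (the function name, the background-ratio threshold, and the shapes are my own illustration, not repo code) of mean-pooling region-level embeddings into a slide-level embedding while skipping background-heavy regions:

```python
import numpy as np

def slide_embedding(region_embs: np.ndarray, bg_ratios: np.ndarray,
                    max_bg: float = 0.5) -> np.ndarray:
    """Average region-level embeddings [M, D] into one slide-level embedding [D],
    dropping regions whose background ratio exceeds `max_bg` so that mostly-empty
    regions do not dilute the signal. Threshold value is a hypothetical choice."""
    keep = bg_ratios <= max_bg
    if not keep.any():  # fall back to all regions if everything is background-heavy
        keep = np.ones_like(keep)
    return region_embs[keep].mean(axis=0)

# toy example: 4 regions with 192-dim embeddings, two of them mostly background
embs = np.random.rand(4, 192)
ratios = np.array([0.1, 0.9, 0.2, 0.95])
print(slide_embedding(embs, ratios).shape)  # (192,)
```

Without the background filter this reduces to the plain mean, which is where the signal blurring comes from.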
from hipt.
Thanks for clarifying!
Hi @Richarizardd, I had a quick follow-up question regarding point 2.
How did you work at patient-level with HIPT when there are multiple slides per patient?
Let's say patient A is mapped to 2 slides, slide1 and slide2:
- in slide1, there are M1 [4096, 4096] regions
- in slide2, there are M2 [4096, 4096] regions
- did you simply extract region-level embeddings for both slides, then feed the last Transformer block the concatenated sequence of embeddings (of length M1 + M2)?
- what's the reason behind only using IDCs for survival prediction?
- are you still planning to upload the survival code to this repo? (#9)
- in MCAT, you discretize survival times into 4 bins (using uncensored patients only), then based on the censorship status (either 0 or 1) you go from 4 to 8 discrete labels (done in these few lines). Yet, you only use the "initial" 4 discrete labels when training: the model outputs 4 logits and the survival dataset returns `disc_label` (between 0 and 3) and not `label` (between 0 and 7) (see this line). Why not use the full 8 discrete labels? (or why bother creating 8 labels in the first place?)
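For reference, my reading of the label construction described in the last question can be sketched as follows (this is an illustration of the 4-bin / 8-label scheme, not the exact MCAT code; the mapping from bin index and censorship bit to `label` is an assumption):

```python
import numpy as np

def discretize_survival(times, censorships, n_bins=4):
    """Quantile-bin survival times into n_bins discrete labels, with bin edges
    computed from UNCENSORED patients only (censorship == 0). Pairing the bin
    index with the censorship bit yields 2 * n_bins (= 8) labels, but only the
    n_bins-way `disc_label` is used for training."""
    times = np.asarray(times, dtype=float)
    censorships = np.asarray(censorships)
    uncensored = times[censorships == 0]
    edges = np.quantile(uncensored, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf      # open-ended outer bins
    disc_label = np.digitize(times, edges[1:-1])  # in [0, n_bins)
    label = disc_label * 2 + censorships          # in [0, 2 * n_bins), assumed pairing
    return disc_label, label

d, l = discretize_survival([1, 2, 3, 4, 5, 6, 7, 8], [0] * 8)
print(list(d))  # [0, 0, 1, 1, 2, 2, 3, 3]
print(list(l))  # [0, 0, 2, 2, 4, 4, 6, 6]
```

This makes the question concrete: `label` carries strictly more information than `disc_label` (it also encodes censorship), yet only `disc_label` reaches the loss.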
After having dived into the MCAT code, I found out that -- at least in MCAT -- you did concatenate the sequences of embeddings when multiple slides were available. I assume you did the same for survival prediction with HIPT.
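The multi-slide concatenation described above can be sketched like this (shapes and variable names are hypothetical; the point is only that the slide-level Transformer accepts a variable-length sequence, so per-slide region sequences can simply be stacked):

```python
import numpy as np

# Hypothetical shapes: patient A has two slides with M1 and M2 [4096, 4096] regions,
# each region already encoded into a D-dim embedding by the lower HIPT stages.
M1, M2, D = 5, 3, 192
slide1_regions = np.random.rand(M1, D)
slide2_regions = np.random.rand(M2, D)

# Concatenate along the sequence axis; the resulting (M1 + M2)-long sequence is
# what would be fed to the final slide-level Transformer for a patient-level risk.
patient_sequence = np.concatenate([slide1_regions, slide2_regions], axis=0)
print(patient_sequence.shape)  # (8, 192)
```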
I'm trying to reproduce HIPT Table 2 results for IDC, hence using the pre-extracted region-level features you kindly provided under `3-Self-Supervised-Eval/embeddings_slide_lib/embeddings_slide_lib/vit256mean_tcga_slide_embeddings/`.
However, features seem to be missing for 61 IDC slides (below are a few slide_ids with missing features):
TCGA-A2-A0T2-01Z-00-DX1.29A5C4C8-6AE8-44EE-98C2-ACBCBFBE9D60
TCGA-A7-A0CD-01Z-00-DX2.609CED8D-5947-4753-A75B-73A8343B47EC
TCGA-A7-A6VX-01Z-00-DX2.9EE94B59-6A2C-4507-AA4F-DC6402F2B74F
TCGA-A8-A06O-01Z-00-DX1.FA4495B2-5B13-4448-ADCB-EF5316E0955B
TCGA-A8-A06P-01Z-00-DX1.37660D0F-1595-43C5-9D30-58D6CB93B52C
TCGA-A8-A06R-01Z-00-DX1.41476D0D-BA72-4FB8-B143-9EB679F26D28
...
Any idea why these features are missing?
Based on `2-Weakly-Supervised-Survival/splits/5foldcv/tcga_brca/splits_0.csv`, these slides should be used for training / validation.
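A quick way to enumerate such gaps is to cross-check the splits CSV against the feature directory. This is a sketch under assumptions (the `train` column name and the `.pt` feature extension are guesses about the repo layout, not confirmed):

```python
import csv
from pathlib import Path

def find_missing_features(split_csv: str, feat_dir: str,
                          column: str = "train") -> list[str]:
    """Return slide_ids listed in `column` of a splits CSV that have no
    corresponding .pt feature file in feat_dir. Column name and extension
    are assumptions about the repo layout."""
    feat_stems = {p.stem for p in Path(feat_dir).glob("*.pt")}
    missing = []
    with open(split_csv, newline="") as f:
        for row in csv.DictReader(f):
            slide_id = (row.get(column) or "").strip()
            if slide_id and slide_id not in feat_stems:
                missing.append(slide_id)
    return missing
```

Running it over `splits_0.csv` and the embeddings folder should reproduce the 61-slide count if the gap is real rather than a naming mismatch.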