Comments (5)
Thank you for your interest in CPA.
About the first question, the notebook will be updated soon to contain more meaningful and curated splits for combinatorial perturbations. I'll update as soon as possible here.
As for your second question, the model.predict() method takes the perturbations and dosages from the perturbation_key and dosage_key columns of your input adata and applies those perturbations to the basal latent obtained from each cell.
So in the tutorial example you mentioned:
- The model takes the perturbations from the cond_harm column of the data and adds those perturbations in the predict method to the output.
cpa.CPA.setup_anndata(
    adata,
    perturbation_key='cond_harm',
    control_group='ctrl',
    dosage_key='dose_value',
    categorical_covariate_keys=['cell_type'],
    is_count_data=True,
    deg_uns_key='rank_genes_groups_cov',
    deg_uns_cat_key='cov_cond',
    max_comb_len=2,
)
So if you'd like to predict a specific perturbation for a given cell, you can change the perturbation or dosage in the mentioned columns of your adata.
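As a rough illustration of the column edit described above, here is a minimal sketch that uses a plain pandas DataFrame as a stand-in for adata.obs. The column names come from the setup_anndata call above; the helper name with_perturbation, the cell names and the cell_type values are made up for the example, and no CPA function is called.

```python
import pandas as pd

# Stand-in for adata.obs (hypothetical cells and cell types).
obs = pd.DataFrame(
    {
        "cond_harm": ["ctrl", "ctrl", "SGK1"],
        "dose_value": ["1.0", "1.0", "1.0"],
        "cell_type": ["A549", "A549", "A549"],
    },
    index=["cell_0", "cell_1", "cell_2"],
)


def with_perturbation(obs, perturbation, dosage,
                      perturbation_key="cond_harm", dosage_key="dose_value"):
    """Return a copy of obs in which every cell carries the requested
    perturbation and dosage strings; the original frame is left untouched."""
    out = obs.copy()
    out[perturbation_key] = perturbation
    out[dosage_key] = dosage
    return out


query_obs = with_perturbation(obs, "SGK1", "1.5")
print(query_obs[["cond_harm", "dose_value"]])
```

In a real workflow the same two assignments would be made on a copy of the adata itself before calling model.predict(), so the original annotations are not lost.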
Feel free to reply if there are further issues.
from cpa.
Ah, okay, thanks for that information.
But the cond_harm column takes a single value and not a list, which means that I can only apply a single perturbation to the basal latent representation. Is that correct?
And the values in the dosage_key column are strings like '1.0+1.0' (not floats). How, then, can I specify a new value (e.g. 1.5) in a way that CPA understands?
Thanks
You can apply combinations of perturbations. CPA uses strings with the following format for specifying perturbations and dosage values in the adata:
- The value of the cond_harm column:
  - "PERT1" --> A single perturbation (e.g. "SGK1")
  - "PERT1+PERT2" --> Combination of perturbations PERT1 and PERT2 (e.g. "FOXL2+HOXB9")
- So you can specify your combination of perturbations using the + character as the separator between different perturbations and CPA will understand them.
- The same thing applies to the dosage column. The dosages are given to the model as strings of the following format:
  - "1.0" --> Dosage 1.0 when we have one perturbation (e.g. "1.5" or any other number)
  - "1.0+1.5" --> Dosages 1.0 and 1.5 for PERT1 and PERT2 respectively.
- CPA splits these strings on the + character and converts the string numbers to floats ("1.0+1.5" --> [1.0, 1.5])
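The string format above can be parsed in a few lines of plain Python. This is only a sketch of the splitting and float conversion described here, not CPA's own code; the function name parse_condition is made up for the example.

```python
from typing import List, Tuple


def parse_condition(perturbation: str, dosage: str) -> Tuple[List[str], List[float]]:
    """Split '+'-separated perturbation and dosage strings into parallel lists."""
    names = perturbation.split("+")
    doses = [float(d) for d in dosage.split("+")]
    if len(names) != len(doses):
        raise ValueError(
            f"{len(names)} perturbation(s) but {len(doses)} dosage(s): "
            f"{perturbation!r} vs {dosage!r}"
        )
    return names, doses


print(parse_condition("SGK1", "1.5"))
print(parse_condition("FOXL2+HOXB9", "1.0+1.5"))
```

The length check is an addition of this sketch: it catches a combination such as "FOXL2+HOXB9" paired with a single dosage "1.0" before it reaches the model.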
It is actually done in the setup_anndata method of the model:
Lines 294 to 317 in c63d5cf
As you can see in the code, setup_anndata creates lists of perturbation IDs and their respective dosages from the strings in the perturbation and dosage columns of adata.obs, saves them in adata.obsm, and uses them as the input data to the model. For example:
- If you check your adata after running setup_anndata, you will see the following obsm values:
  obsm: 'X_pca', 'X_umap', 'perts', 'perts_doses', 'deg_mask', 'deg_mask_r2'
Here perts is the list of perturbation IDs, which is used to retrieve perturbation embeddings from the PerturbationNetwork, and perts_doses holds the respective dosages.
1 perturbation:
- ID zero for perturbations is used for padding because vectors need to be the same length.
2 perturbations:
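To make the padding idea concrete, here is a minimal sketch of how fixed-length ID and dosage vectors could be built. It is not CPA's implementation: the ID assignment (alphabetical, starting at 1), the 0.0 padding value for dosages, and the names build_vocab and encode are all assumptions of this example. Only the use of ID zero for padding and max_comb_len=2 come from the discussion above.

```python
from typing import Dict, List, Tuple

PAD_ID = 0  # ID zero is reserved for padding


def build_vocab(conditions: List[str]) -> Dict[str, int]:
    """Give every distinct perturbation name an integer ID starting at 1."""
    names = sorted({p for cond in conditions for p in cond.split("+")})
    return {name: idx for idx, name in enumerate(names, start=1)}


def encode(cond: str, dose: str, vocab: Dict[str, int],
           max_comb_len: int = 2) -> Tuple[List[int], List[float]]:
    """Turn one condition/dosage string pair into padded ID and dose vectors."""
    ids = [vocab[p] for p in cond.split("+")]
    doses = [float(d) for d in dose.split("+")]
    if len(ids) != len(doses):
        raise ValueError("perturbations and dosages differ in length")
    if len(ids) > max_comb_len:
        raise ValueError("combination is longer than max_comb_len")
    pad = max_comb_len - len(ids)
    # Assumption: padded positions get dose 0.0.
    return ids + [PAD_ID] * pad, doses + [0.0] * pad


vocab = build_vocab(["SGK1", "FOXL2+HOXB9"])
print(encode("SGK1", "1.0", vocab))             # one perturbation, one padded slot
print(encode("FOXL2+HOXB9", "1.0+1.5", vocab))  # two perturbations, no padding
```

With this toy vocabulary a single perturbation fills the first slot and leaves the second at ID zero, while a pair fills both slots, so every cell ends up with vectors of the same length.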
I hope this helps and again, feel free to reply if there are further issues.
Very good, that's what I was looking for!
Actually, I only now looked at your "Batch Correction in Expression Space" tutorial with the description of custom_predict() and how to use it. That is obviously the function I need!
Many thanks
I am sorry, but I have to reopen this :-(
Looking at custom_predict, I see that it lets me select the individual categorical covariates I want to add, but for perturbations it only lets me add all or none. So if I want to add individual perturbations, do I have to follow your advice from above?
I think I'm also confused about the difference between perturbations and categorical covariates. I thought perturbations would be continuous variables, but in many of the tutorials the perturbation takes discrete values (IFN stimulation or not, gene knockout or not, etc.). Does that mean these tutorials could have been written differently, by declaring those 'perturbations' as categorical covariates?
Thanks