Comments (4)

adamklie commented on June 8, 2024

So helpful. Thanks again for taking the time to explain, I really appreciate it!

from decoupler-py.

PauBadiaM commented on June 8, 2024

Hi @adamklie

Glad you find the package useful!

As you know, decoupler takes an input matrix of gene expression (GEX) and a prior knowledge network (Net), which we transform into matrix format internally. In your case, the GEX matrix is made of the contrasts' statistics between conditions (if you only have one contrast, it has only one row). When you run ulm, a univariate linear model is fitted for each "sample" (in your case, each contrast) and each TF. The response variable is the observed GEX (in your case, the observed change between conditions) and the explanatory variable is the vector of weights for that TF. Once the model is fitted, we extract the t-value of the slope as the "activity". If the genes that belong to a certain TF show increased expression when their weights are positive, the slope will be positive and we will get a positive activity. If there is disagreement, for example genes with negative logFCs and positive weights, the slope will be negative.
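To make the description above concrete, here is a minimal sketch in plain NumPy of fitting one such model and taking the t-value of the slope. It is only an illustration of the idea, not decoupler's actual implementation: the function name `ulm_t_value`, the six-gene toy data and the weights are all made up for the example.

```python
import numpy as np

def ulm_t_value(gex, weights):
    """t-value of the slope when regressing gex on weights (with an intercept)."""
    gex = np.asarray(gex, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = gex.size
    x_c = weights - weights.mean()
    y_c = gex - gex.mean()
    sxx = np.sum(x_c ** 2)
    slope = np.sum(x_c * y_c) / sxx
    resid = y_c - slope * x_c
    # Residual variance with n - 2 degrees of freedom (slope + intercept)
    sigma2 = np.sum(resid ** 2) / (n - 2)
    return slope / np.sqrt(sigma2 / sxx)

# Hypothetical contrast: logFCs for six genes
log_fc = np.array([2.0, 1.5, -1.0, 0.2, -0.1, 0.3])
# Hypothetical TF weights: +1 activates, -1 represses, 0 = not a target
tf_weights = np.array([1.0, 1.0, -1.0, 0.0, 0.0, 0.0])

agree = ulm_t_value(log_fc, tf_weights)      # logFCs follow the weights
disagree = ulm_t_value(-log_fc, tf_weights)  # logFCs oppose the weights
print(agree, disagree)
```

When the logFCs agree in sign with the weights the t-value is positive, and flipping the logFCs flips its sign.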

image

To briefly answer your question, yes, we fit a separate model for each TF in ulm. However, in the case of mlm we fit one for each sample/contrast, where we include all TFs at the same time, thus the name "multivariate".
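As a rough illustration of the difference, the sketch below fits a single linear model for one contrast with all TFs as explanatory variables and returns one t-value per TF. It is plain NumPy under made-up assumptions (the function name `mlm_t_values`, eight genes, two TFs), not the code decoupler runs internally.

```python
import numpy as np

def mlm_t_values(gex, net):
    """t-values of every TF coefficient from one multiple regression.

    gex: (n_genes,) statistics for one sample/contrast
    net: (n_genes, n_tfs) matrix of TF weights
    """
    gex = np.asarray(gex, dtype=float)
    net = np.asarray(net, dtype=float)
    n_genes, n_tfs = net.shape
    # Design matrix: intercept column plus one column per TF
    X = np.column_stack([np.ones(n_genes), net])
    coef, _, _, _ = np.linalg.lstsq(X, gex, rcond=None)
    resid = gex - X @ coef
    dof = n_genes - n_tfs - 1
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t = coef / np.sqrt(np.diag(cov))
    return t[1:]  # drop the intercept

# Hypothetical network: columns are TF_A and TF_B, rows are genes
net = np.array([
    [1, 0], [1, 0], [1, 0],
    [0, -1], [0, 1], [0, 1],
    [0, 0], [0, 0],
], dtype=float)
# Hypothetical logFCs: TF_A targets go up, TF_B targets oppose their weights
log_fc = np.array([2.0, 1.5, 1.8, 0.9, -1.2, -1.0, 0.1, -0.2])

t_vals = mlm_t_values(log_fc, net)
print(t_vals)
```

Because all TFs share one model, strongly correlated weight columns make `X.T @ X` close to singular, which is where the estimates become unstable.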

Hope this is helpful! Feel free to ask more questions if needed.

adamklie commented on June 8, 2024

Thank you so much for the detailed explanation and the figure! Makes it very clear how this is working. Now I'm trying to think about when you might expect one to work better than the other. I would expect that many of the explanatory TFs would have correlated weights, which might make it harder to fit an mlm, but also that many target genes should be explained by the action of multiple TFs. I guess it's not hard to try both and inspect the fit, but are there other considerations, or sub-selections of the network or data, that make sense before fitting the mlm?

PauBadiaM commented on June 8, 2024

Very good points. In the end there is no free lunch; there is always a tradeoff.

The advantage of ulm is that we don't care if two TFs have highly correlated weights, because we are testing them independently. The downside is that you might get false positives, and we are not accounting for TF interaction effects when modeling activities.

The advantage of mlm is that, since it is multivariate, it accounts for these interaction effects when modeling activities. However, if the network is too co-linear, it might break for the affected TFs. BTW, you can always check how co-linear a given net is for your data using:

dc.check_corr(net, mat=mat)

If you see that some TF pairs have high correlations (> 0.9), you should definitely double-check how the obtained activities look for these when using mlm.
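For intuition about what such a check looks for, here is a small NumPy sketch that computes pairwise correlations between TF weight columns and lists the pairs above the 0.9 threshold. This is only an illustration on made-up data, not how `dc.check_corr` is implemented; the helper `correlated_pairs`, the TF names and the use of the absolute correlation are assumptions of the example.

```python
import numpy as np

def correlated_pairs(net, names, threshold=0.9):
    """Return (tf_i, tf_j, corr) for TF pairs whose weights correlate strongly."""
    net = np.asarray(net, dtype=float)
    # Columns are TFs, so correlate columns rather than rows
    corr = np.corrcoef(net, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            # Assumption: strong negative correlation is flagged too
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Hypothetical weights for six genes (rows) and three TFs (columns)
names = ["TF1", "TF2", "TF3"]
net = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.8, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, -1.0],
    [0.0, 0.0, 0.0],
])

pairs = correlated_pairs(net, names)
print(pairs)
```

Here TF1 and TF2 share almost the same targets and weights, so they are the only pair reported; these are the TFs whose mlm activities deserve a second look.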

Therefore, if you are not sure of which one to pick, you can always use the consensus score, which by default models activities based on the results of ulm, norm_wsum and mlm. In our benchmarks we have seen that depending on the data, one of these three methods outperforms the other two, but that their consensus is always the slightly better alternative.

Hope this is helpful!
