Comments (2)

cwarden45 commented on August 17, 2024

I think this could be relevant for a number of individuals.

I think these might be the most relevant points:

  • I would consider COHCAP a method to generate hypotheses about candidates, which can then be validated.
  • I think it is already understood for this specific question, but it is common to need (or benefit from) changing parameter settings for different projects.
  • In most cases, the methylation thresholds (methyl.cutoff and unmethyl.cutoff) are being used to help filter results, kind of like the delta beta parameter (delta.beta.cutoff). I think the main exception is the Average-by-Site workflow, where a discrete status per site is being assigned. Otherwise, the statistical test is usually for continuous beta or percent methylation values.
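As a rough illustration of the last point, here is a minimal Python sketch of thresholds acting as filters on group-average beta values, applied after a test on the continuous values. This is not COHCAP's code: the function name, the dictionary keys, the inclusive comparisons, the p-value cutoff of 0.05, and the toy sites are all assumptions made for the example. Only the parameter names methyl.cutoff, unmethyl.cutoff, and delta.beta.cutoff (written here with underscores) and the 0.7 / 0.3 / 0.2 values come from this thread.

```python
def filter_sites(sites, methyl_cutoff=0.7, unmethyl_cutoff=0.3,
                 delta_beta_cutoff=0.2, pvalue_cutoff=0.05):
    """Return IDs of sites that pass all filters (hypothetical helper)."""
    kept = []
    for site in sites:
        trt, ref = site["trt_beta"], site["ref_beta"]
        # Methylation thresholds: one group looks methylated, the other unmethylated.
        gained = trt >= methyl_cutoff and ref <= unmethyl_cutoff
        lost = ref >= methyl_cutoff and trt <= unmethyl_cutoff
        # Delta beta: the difference between groups must also be large enough.
        big_enough = abs(trt - ref) >= delta_beta_cutoff
        # The p-value is assumed to come from a test on the continuous values.
        significant = site["pvalue"] <= pvalue_cutoff
        if (gained or lost) and big_enough and significant:
            kept.append(site["id"])
    return kept


# Toy values, made up purely for illustration.
toy_sites = [
    {"id": "site_1", "trt_beta": 0.85, "ref_beta": 0.15, "pvalue": 0.001},
    {"id": "site_2", "trt_beta": 0.55, "ref_beta": 0.30, "pvalue": 0.001},
    {"id": "site_3", "trt_beta": 0.10, "ref_beta": 0.90, "pvalue": 0.20},
]

default_hits = filter_sites(toy_sites)
relaxed_hits = filter_sites(toy_sites, methyl_cutoff=0.5)
print(default_hits, relaxed_hits)
```

In this toy example, relaxing the methylated threshold lets site_2 through, which is the kind of project-specific parameter change mentioned above.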

In general, I don't want people to only use COHCAP for analysis.

For example, for RRBS analysis, I would tend to use both methylKit and COHCAP for testing. Sometimes I thought methylKit was better, sometimes I emphasized COHCAP more (with familiarity with all of the parameters that can be changed). However, I apologize in advance that I can't provide assistance with using those templates for specific projects (and I think the best solution is to have your own template, rather than using that exact template as a starting point).

While I don't think I currently have papers that I can reference for the DNA Methylation parts, you can see some data for the general principle for RNA-Seq here:

http://cdwscience.blogspot.com/2019/11/requiring-at-least-some-methods-testing.html

So, if you think of COHCAP like the ANOVA test for log2-transformed values for RNA-Seq, there are situations where using the methods with a negative binomial model (such as edgeR and DESeq2) can have advantages. The data linked above also includes limma-voom, which makes different statistical assumptions. However, the standard statistical test is often not horrible, and I think having an independently calculated expression value was helpful in comparing the different methods for every project. In fact, I think the less specific statistical test can sometimes help. In that context, I think the less specialized ANOVA might be helpful for applications like miRNA-Seq (which I am not currently providing data for in that link, and I also don't think I currently have publications to cite). You can also see the sample for E-MTAB-7033, where I might argue that recovery of the causal gene with a relatively more symmetric set of up- and down-regulated genes might have been an advantage for the ANOVA test (at least with the STAR alignment). I think the ANOVA results also tended to have smaller gene lists, if you wanted to focus on a smaller number of individual genes to characterize. I am not sure if things like a larger sample size and/or less commonly tested multivariate models might also be considerations, but the point is that I would not completely take a relatively standard ANOVA test off the table as an option (and ANOVA / t-test / linear regression are often what is used for the p-value calculation in COHCAP).
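For readers less familiar with the "ANOVA on log2-transformed values" idea, a minimal Python sketch of a one-way ANOVA F statistic is below. It is only meant to show the general principle, not any specific pipeline: the function names, the pseudocount of 1, and the toy expression values are assumptions made for the example.

```python
from math import log2


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumption for this sketch."""
    return [log2(v + pseudocount) for v in values]


def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((v - mean) ** 2
                    for group, mean in zip(groups, group_means) for v in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy expression values for one gene, made up purely for illustration.
control = log2_transform([1, 3, 7])
treated = log2_transform([3, 7, 15])
f_stat = anova_f_statistic([control, treated])
print(f_stat)
```

A p-value would then come from the F distribution with (df_between, df_within) degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.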


cwarden45 commented on August 17, 2024

Also, I believe something about the binomial distribution was mentioned (along with mentioning the methylation thresholds in a separate sentence). COHCAP does not use a binomial distribution for a statistical test, but the methylation thresholds are applied to the distribution of beta values (or percent methylation values). Some may call that distribution "bimodal".

Ideally, if the peaks were clear enough, then I think you would have 3 combinations of status for the 2 alleles in a human sample (homozygous unmethylated, hemizygous methylated, and homozygous methylated).

The default thresholds of 0.3 and 0.7 are meant to capture the 2 clearest homozygous peaks (perhaps in something like a cell line experiment). However, if you have heterogeneous data, that may not be possible. So, I think a threshold of 0.3 for both values could be used (basically merging the hemizygous and homozygous peaks, if different cells might have different methylation values).
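To make the two threshold choices concrete, here is a minimal Python sketch that assigns a discrete status to the three idealized peaks (0, 1, or 2 methylated alleles out of 2, so beta values of 0.0, 0.5, and 1.0). The function name, the status labels, and the inclusive comparisons are assumptions for illustration, not COHCAP's implementation.

```python
def site_status(beta, methyl_cutoff=0.7, unmethyl_cutoff=0.3):
    """Assign a discrete status to one beta value (hypothetical helper)."""
    if beta >= methyl_cutoff:
        return "methylated"
    if beta <= unmethyl_cutoff:
        return "unmethylated"
    return "intermediate"


# Idealized beta values: fraction of the 2 alleles that are methylated.
ideal_peaks = {"0 of 2 alleles": 0 / 2, "1 of 2 alleles": 1 / 2, "2 of 2 alleles": 2 / 2}

# Default thresholds (0.7 and 0.3): only the two homozygous peaks get a clear call.
default_calls = {name: site_status(beta) for name, beta in ideal_peaks.items()}

# Both thresholds at 0.3: the middle peak is merged with the fully methylated peak.
merged_calls = {name: site_status(beta, methyl_cutoff=0.3, unmethyl_cutoff=0.3)
                for name, beta in ideal_peaks.items()}

print(default_calls)
print(merged_calls)
```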

If you keep the delta beta threshold of 0.2 (or at the very least 0.1), then that should also help avoid getting results that are only slightly above 0.3 in one group and slightly below 0.3 in the other group.
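A small Python sketch of that edge case, assuming a single shared cutoff of 0.3 and inclusive comparisons (the function name and the example beta values are made up for illustration):

```python
def is_candidate(beta_a, beta_b, cutoff=0.3, delta_beta_cutoff=0.2):
    """True if the two group averages fall on opposite sides of the shared
    cutoff AND differ by at least the delta beta cutoff (hypothetical helper)."""
    opposite_sides = ((beta_a >= cutoff and beta_b <= cutoff) or
                      (beta_b >= cutoff and beta_a <= cutoff))
    return opposite_sides and abs(beta_a - beta_b) >= delta_beta_cutoff


# Barely above 0.3 in one group and barely below 0.3 in the other group.
marginal_with_delta = is_candidate(0.31, 0.29)
marginal_without_delta = is_candidate(0.31, 0.29, delta_beta_cutoff=0.0)

# A larger difference across the same cutoff.
clear_difference = is_candidate(0.55, 0.25)

print(marginal_with_delta, marginal_without_delta, clear_difference)
```

Without the delta beta requirement, the marginal pair would pass on the threshold alone; with it, only the pair with the larger difference is kept.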

That said, I think the threshold may have more to do with being "bimodal" (or trimodal), rather than binomial assignments per read or bead (if you could determine that). For example, that is why the binomial or beta-binomial distribution is used for some BS-Seq analysis (including methylKit, with overdispersion).
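For contrast with the threshold-based view, a binomial model describes the count of methylated reads at one site, given the coverage and an underlying methylation fraction. A minimal sketch using only the Python standard library is below. The function name and the example numbers are assumptions, and this is not how methylKit or COHCAP is implemented.

```python
from math import comb


def binomial_pmf(methylated_reads, total_reads, methylation_fraction):
    """Probability of seeing exactly `methylated_reads` methylated reads out of
    `total_reads`, if each read is independently methylated with probability
    `methylation_fraction`."""
    k, n, p = methylated_reads, total_reads, methylation_fraction
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Example: 10 reads covering a site whose true methylation fraction is 0.5.
prob_five_of_ten = binomial_pmf(5, 10, 0.5)
total_probability = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(prob_five_of_ten, total_probability)
```

A beta-binomial model extends this by letting the methylation fraction itself vary between samples, which is one way to account for overdispersion.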
