Comments (2)

DongzeHE commented on June 14, 2024

Hello @mdmanurung

Thanks so much for choosing alevin-fry.

In short, either removing the ambiguous counts or splitting them 50/50 between the spliced (S) and unspliced (U) counts is fine. Both are common practice in research.

In more detail:
When we say a gene in a cell has an ambiguous UMI, it means the splicing status of the mRNA molecule represented by this UMI cannot be determined; i.e., the reads of this UMI mapped equally well to some spliced transcripts and some introns of this gene. Therefore, when calculating the S/U ratio, these ambiguous UMIs can either be ignored, because their splicing status is unknown, or be split half-and-half between the S and U counts, because their reads mapped equally well to both.
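As a minimal sketch of these two options, assuming you already have the spliced, unspliced, and ambiguous counts as three cells-by-genes arrays of the same shape: the function name `resolve_ambiguous`, the `strategy` argument, and the toy numbers below are made up for illustration and are not part of alevin-fry.

```python
import numpy as np

def resolve_ambiguous(spliced, unspliced, ambiguous, strategy="split"):
    """Return (S, U) after handling ambiguous counts.

    Hypothetical helper for illustration only.
    strategy="drop":  ignore ambiguous counts entirely.
    strategy="split": add half of each ambiguous count to S and half to U.
    """
    s = np.asarray(spliced, dtype=float)
    u = np.asarray(unspliced, dtype=float)
    a = np.asarray(ambiguous, dtype=float)
    if not (s.shape == u.shape == a.shape):
        raise ValueError("all three count matrices must have the same shape")
    if strategy == "drop":
        return s, u
    if strategy == "split":
        return s + 0.5 * a, u + 0.5 * a
    raise ValueError(f"unknown strategy: {strategy!r}")

# Toy data: 2 cells x 2 genes (made-up numbers).
spliced = np.array([[10, 0], [4, 2]])
unspliced = np.array([[3, 1], [0, 5]])
ambiguous = np.array([[2, 0], [1, 4]])

s_drop, u_drop = resolve_ambiguous(spliced, unspliced, ambiguous, "drop")
s_split, u_split = resolve_ambiguous(spliced, unspliced, ambiguous, "split")
```

Splitting keeps the total UMI count per gene and cell unchanged, while dropping reduces it by the ambiguous count, so the two choices can give different S/U ratios for genes with many ambiguous UMIs.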

More generally, this relates to an open research question: can we compare S and U counts directly, without any transfer learning or domain adaptation? The main concern is that introns contain internal poly-A stretches, which can act as additional priming sites. If that happens, the priming mechanism for spliced transcripts (poly-A tail priming) may differ substantially from that for unspliced transcripts (poly-A tail priming plus internal poly-A priming). See this technical note from 10x and this paper.

In addition, one caveat of the spliced and unspliced counts inferred by alevin-fry, and by all other mainstream quantification tools, is that unspliced UMI counts are represented by intronic UMI counts. However, unspliced transcripts also contain exons, so this convention favors assigning UMIs as spliced rather than unspliced. People do this because they (and we) want to include as many UMIs as possible in the (spliced) count matrix.

These are the caveats behind the question. Nonetheless, if we assume that the usual assumptions of single-cell analysis hold and that the effect of these caveats is minor, simply removing ambiguous counts or splitting them 50/50 between S and U is fine.

Best,
Dongze

mdmanurung commented on June 14, 2024

Hi Dongze,

Thank you so much for the detailed answer. I'll need some time to let that sink in.

Best,
Mikhael