What does the AF column really mean? (lofreq issue, closed)

csb5 commented on August 15, 2024
What does the AF column really mean?

from lofreq.

Comments (8)

andreas-wilm commented on August 15, 2024

Hi George, yes, you're right: AF is the allele frequency of the called SNP (or indel). An AF of 1.0 means that 100% of the bases accounted for at that position are the variant base.

george-githinji commented on August 15, 2024

Thank you for your response. So you are saying that if the reference base is A and the alternative is C, then the allele frequency at this position is 1.0, and therefore the choice of reference is particularly important. Does lofreq perform a de novo assembly of the reads before calling the variants, or does it base its calls squarely on the provided BAM file and reference?

I want to imagine a situation where you might want to correct the reference based on the available reads. For example, if a position is an A in the reference, most reads support a T, and a few reads support a G, what is the allele frequency? It is possible that the reference A is distant from, and incorrect relative to, the available reads, and that the G reads are the real minority variants. How does lofreq handle this?

andreas-wilm commented on August 15, 2024

Hi George,
there used to be an option to call a consensus base first before calling,
but many people abused this, misunderstood it and eventually it became
difficult to support in the code, so we removed it. So yes, it's based on
the reference only.

Yes, your example holds if you only have Cs in the column. You usually have
many reads mapping to a particular position. If all of them support a
variant of a particular type, then the AF is 100%.

Andreas
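
As a rough illustration of the answer above, here is a minimal sketch of how a reference-based allele frequency falls out of the bases in one alignment column. This is not lofreq's code: the function name `allele_frequencies` is hypothetical, and the sketch assumes every read base counts equally, ignoring the quality-aware filtering and statistics a real caller applies.

```python
from collections import Counter


def allele_frequencies(column_bases, ref_base):
    """Return {alt_base: fraction of reads} for each non-reference base.

    Hypothetical helper for illustration only; every base in the column
    is counted once, with no base-quality or mapping-quality filtering.
    """
    counts = Counter(column_bases)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: n / depth for base, n in counts.items() if base != ref_base}


# Reference is A and every read shows C: the C variant has AF 1.0.
all_c = allele_frequencies("C" * 50, "A")
print(all_c)  # {'C': 1.0}

# Reference is A, 90 reads show T and 10 show G: both are variants
# relative to the reference, with AF 0.9 and 0.1 respectively.
mixed = allele_frequencies("T" * 90 + "G" * 10, "A")
print(mixed)  # {'T': 0.9, 'G': 0.1}
```

In the second case the minority G reads are still reported as a variant against the reference A; no consensus base is computed first, which matches the reference-only behaviour described above.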

george-githinji commented on August 15, 2024

So the DP4 column may be more important for identifying the actual reads, and one could perhaps ignore the reference-based AF score?

andreas-wilm commented on August 15, 2024

DP4 lists reference and alternate base counts on forward and reverse
strand. It ignores any other alleles/errors at that position, so the allele
frequency derived from DP4 might not correspond 100% with the given AF
value.

Is there something particular you're looking for or trying to do?

Andreas
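
To make the DP4 point above concrete, the following sketch compares the AF value in a VCF INFO string with the frequency recomputed from DP4. The INFO record is invented for illustration (100 reads: 90 supporting the called alternate base, 10 supporting a third base), and the helper names `parse_info` and `af_from_dp4` are hypothetical, not part of lofreq.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'DP=100;AF=0.9' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def af_from_dp4(dp4):
    """Alternate-allele fraction from DP4 = ref-fwd, ref-rev, alt-fwd, alt-rev."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in dp4.split(","))
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    return (alt_fwd + alt_rev) / total if total else 0.0


# Made-up record: DP4 only covers the 90 alternate-supporting reads,
# while the remaining 10 reads carry a third base that DP4 ignores.
info = parse_info("DP=100;AF=0.900000;SB=0;DP4=0,0,45,45")
reported_af = float(info["AF"])
dp4_af = af_from_dp4(info["DP4"])
print(reported_af, dp4_af)  # 0.9 1.0
```

Because DP4 leaves out reads carrying any other allele, its denominator is smaller than the total depth, so the DP4-derived value can exceed the reported AF.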

george-githinji commented on August 15, 2024

Yes. I am calling variants on a number of samples, and initially I used the same reference across all of them. The samples were collected at different time points and from different individuals. However, I realised that the results, particularly the AF scores from lofreq, might be biased by the choice of reference. I would be happy to discuss this by email, where I can provide more details on what I am trying to accomplish and find out whether lofreq would be of assistance, or at least understand its limits.

andreas-wilm commented on August 15, 2024

Using the same reference across all samples is pretty much the only way to
be able to compare across samples. Otherwise you need to call against a
sample-specific assembly, and then the positions don't mean much. Feel free
to discuss [email protected]

george-githinji commented on August 15, 2024

Thank you! Sent an email. :)
