With Lofreq, what does the AF column really represent? My assumption is that it is the

What does the AF column really mean? about lofreq HOT 8 CLOSED

csb5 commented on August 15, 2024

What does the AF column really mean?

from lofreq.

Comments (8)

andreas-wilm commented on August 15, 2024

Hi George, yes, you're right: AF is the allele frequency of the called SNP (or indel). An AF of 1.0 means that 100% of the bases accounted for at that position are the variant base.


george-githinji commented on August 15, 2024

Thank you for your response. So you are saying that if the reference base is A and the alternative is C, then the allele frequency at this position is 1.0. And therefore the choice of the reference is particularly important. Does lofreq perform a de novo assembly of the reads before calling the variants, or does it base its calls squarely on the provided bam file and reference?

I want to imagine a situation where you might want to correct the reference based on the available reads. For example, if a position is an A on the reference, and there are more reads that support a T and a few reads that support a G, what is the allele frequency? It is possible that the reference A is distant from, and incorrect relative to, the available reads, and that the G reads are the real minority variant. How does lofreq handle this?


andreas-wilm commented on August 15, 2024

Hi George,
there used to be an option to call a consensus base first before calling,
but many people abused this, misunderstood it and eventually it became
difficult to support in the code, so we removed it. So yes, it's based on
the reference only.

Your example holds if you only have Cs in the column, yes. You usually have
many reads mapping to a particular position. If all of them support a
variant of a particular type, then the AF is 100%.
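
As a rough illustration of that arithmetic (a sketch with made-up counts, not LoFreq's actual code; the function name `allele_frequencies` is hypothetical, and real calls also depend on which bases survive filtering), each non-reference base's frequency can be thought of as its count divided by all bases counted in the column:

```python
from collections import Counter


def allele_frequencies(column_bases, ref_base):
    """Return {alt_base: fraction of the column} for every non-reference base.

    column_bases: string of bases observed at one position (hypothetical input).
    ref_base: the reference base at that position.
    """
    counts = Counter(column_bases)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    # Each alternate allele is measured against the whole column,
    # not just against the reference-supporting reads.
    return {base: n / depth for base, n in counts.items() if base != ref_base}


# Made-up column: reference is A, every read shows C.
print(allele_frequencies("C" * 50, ref_base="A"))

# Made-up column from the question: reference A, mostly T, a few G.
print(allele_frequencies("T" * 90 + "G" * 8 + "A" * 2, ref_base="A"))
```

In the second column, T and G would each be reported relative to the reference A, with T as the high-frequency variant and G as the low-frequency one.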

Andreas


george-githinji commented on August 15, 2024

So the DP4 column may be more important for identifying the actual read counts, and maybe I should ignore the reference-based AF score?


andreas-wilm commented on August 15, 2024

DP4 lists reference and alternate base counts on forward and reverse
strand. It ignores any other alleles/errors at that position, so the allele
frequency derived from DP4 might not correspond 100% with the given AF
value.
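
To make that difference concrete, here is a minimal sketch that parses an INFO string and compares the two numbers. The INFO values are invented for illustration, the helper names are hypothetical, and the DP4 order is assumed to be ref-forward, ref-reverse, alt-forward, alt-reverse:

```python
def parse_info(info):
    """Split a VCF INFO string such as 'DP=100;AF=0.08' into a dict of strings."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def dp4_allele_fraction(dp4):
    """Return alt / (ref + alt) using only the four DP4 counts.

    Assumed order: ref-forward, ref-reverse, alt-forward, alt-reverse.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in dp4.split(","))
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    return (alt_fwd + alt_rev) / total if total else float("nan")


# Made-up record: depth 100, 2 reference reads, 8 reads for this variant,
# and 90 reads carrying a third allele that DP4 does not count.
info = parse_info("DP=100;AF=0.080000;SB=0;DP4=1,1,4,4")

print("reported AF:", float(info["AF"]))
print("AF from DP4:", dp4_allele_fraction(info["DP4"]))
```

With a third allele dominating the column, the DP4-derived fraction is far higher than the reported AF, because DP4 only sees the reference and this one alternate base.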

Is there something particular you're looking for or trying to do?

Andreas


george-githinji commented on August 15, 2024

Yes. I am calling variants on a number of samples, and initially I used the same reference across all the samples. The samples were collected at different time points and from different individuals. However, I realised that the results might be biased by the choice of reference, particularly the AF scores when using lofreq. I would be happy to discuss this by email, and I can provide more details on what I am trying to accomplish, so I can see whether lofreq would be of assistance or at least understand its limits.


andreas-wilm commented on August 15, 2024

Using the same reference across all samples is pretty much the only way to
be able to compare across samples. Otherwise you need to call against a
sample-specific assembly, and then the positions don't mean much. Feel free
to discuss [email protected]
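
A small sketch of why the shared reference matters for comparison: with one coordinate system, calls from different samples can be lined up by position and allele. The sample names, chromosome name, and calls below are made up, and `af_table` is a hypothetical helper, not part of LoFreq:

```python
def af_table(calls_by_sample):
    """Map (chrom, pos, ref, alt) -> {sample: AF}.

    calls_by_sample: {sample_name: [(chrom, pos, ref, alt, af), ...]}
    The key is only meaningful if every sample was called against
    the same reference sequence.
    """
    table = {}
    for sample, calls in calls_by_sample.items():
        for chrom, pos, ref, alt, af in calls:
            table.setdefault((chrom, pos, ref, alt), {})[sample] = af
    return table


# Invented calls for two samples against one shared reference.
calls = {
    "sample_t1": [("ref1", 120, "A", "T", 0.90), ("ref1", 120, "A", "G", 0.08)],
    "sample_t2": [("ref1", 120, "A", "T", 0.55)],
}

for key, per_sample in sorted(af_table(calls).items()):
    print(key, per_sample)
```

A variant missing from one sample's entry simply was not called there; against sample-specific assemblies the same key could point at different genomic sites.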


george-githinji commented on August 15, 2024

Thank you! Sent an email. :)
