Comments (9)

fgvieira avatar fgvieira commented on May 28, 2024

Hi Williams,

Sorry for the late reply, but I've been away on holiday. Did you manage to fix it?
You can check the size of the input file as described in the readme file.

from ngsf.

RCWilliams avatar RCWilliams commented on May 28, 2024

Thanks for your response!
I can see where I would use the number of sites to calculate the size, but is there a way to see the number of sites? (I'm sorry if this should be very obvious!)

fgvieira avatar fgvieira commented on May 28, 2024

Not quite sure I understand your question... You are trying to run ngsF with a certain number of sites, but it is giving an error. That is usually because the number of sites/individuals does not match the size of the file, either because the file is truncated or because the number of sites/individuals is wrong. You can get an idea of the file size ngsF expects from the formula in the readme file. Briefly: total_size_bytes = 3 * 8 * n_ind * n_sites

What is the command line you are using and how big (in bytes and uncompressed) is your file?
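
The size check described above can be sketched in a few lines of Python. This is only an illustration of the formula quoted from the readme, not part of ngsF itself: the function names `expected_glf_size`, `infer_n_sites` and `check_glf` are made up for this example, and reading the factors as 3 values of 8 bytes each per individual per site is an assumption based on the formula alone.

```python
import os

# Factors taken from the formula quoted above:
#   total_size_bytes = 3 * 8 * n_ind * n_sites
# Naming them "values per site" and "bytes per value" is an assumption.
VALUES_PER_SITE = 3
BYTES_PER_VALUE = 8


def expected_glf_size(n_ind: int, n_sites: int) -> int:
    """Size in bytes that the formula predicts for an uncompressed file."""
    return VALUES_PER_SITE * BYTES_PER_VALUE * n_ind * n_sites


def infer_n_sites(size_bytes: int, n_ind: int) -> int:
    """Invert the formula: number of sites implied by a file size.

    Raises ValueError if the size is not a whole number of sites, which
    would suggest a truncated file or a wrong n_ind.
    """
    bytes_per_site = VALUES_PER_SITE * BYTES_PER_VALUE * n_ind
    if size_bytes % bytes_per_site != 0:
        raise ValueError(
            f"{size_bytes} bytes is not a multiple of {bytes_per_site}; "
            "the file may be truncated or n_ind may be wrong"
        )
    return size_bytes // bytes_per_site


def check_glf(path: str, n_ind: int, n_sites: int) -> bool:
    """True if the file on disk has exactly the size the formula predicts."""
    return os.path.getsize(path) == expected_glf_size(n_ind, n_sites)


# Example with one individual and 29464184 sites:
print(expected_glf_size(1, 29464184))  # 707140416
```

`infer_n_sites` also gives a way to recover the number of sites from the uncompressed file size when only the number of individuals is known.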

RCWilliams avatar RCWilliams commented on May 28, 2024

I just remade the glf (because of what you said about it potentially being truncated) and this has fixed the n_sites problem! The size wasn't matching, so thank you!

My command line is:

ngsF --n_ind 1 --n_sites 29464184 --glf S1_June17.glf --out het_S1

and I am now receiving:

[check_interv] ERROR: value is NaN!

Is this again user error at my end?

fgvieira avatar fgvieira commented on May 28, 2024

Not sure what you are trying to do, but why are you only using one individual?
ngsF needs allele frequencies so, either you have several samples from the same population (ideally around 20), or you need to provide the allele frequencies yourself (--init_values and --freq_fixed).

That said, 29464184 sounds like quite a large number of SNPs for just one sample. Did you call SNPs first?

cheers,

RCWilliams avatar RCWilliams commented on May 28, 2024

That must be why I'm running into issues then, because I don't have population-level data. I have four high-coverage individuals from four species that are each ~10 million years diverged from the reference genome (which I think is why I have a large number of SNPs).

I generated my glf from a bam using pileup in samtools-hybrid, is this not correct?

Thanks so much for the time you’ve spent on this!

fgvieira avatar fgvieira commented on May 28, 2024

Do you have allele frequency data from other sources (reference panels, other datasets, etc.)?
If not, then I'm afraid you can't use ngsF...

RCWilliams avatar RCWilliams commented on May 28, 2024

No, I don't, which is the problem that I keep running into. Thank you so much for your time on this. I think I will keep this analysis on the sidelines until I have more data available. Thanks again!

fgvieira avatar fgvieira commented on May 28, 2024

No problem, and let me know if you have any other questions.
