popstats's People

Contributors

pontussk

popstats's Issues

invalid literal for int() with base 10: 'C'

Hey,

I have converted my VCF file using the vcf2tped.py script. Next, I want to estimate FAB, but when I run it with the command:
python popstats.py --file input.tped --pops INDV1,INDV2,INDV3,INDV4 --ancestor ANC_INDV --FAB

I get the error:
Traceback (most recent call last):
File "/home/uramakri/ryanr/softwares/popstats/popstats.py", line 710, in <module>
chromosome = int(col[0].lstrip('chr'))
ValueError: invalid literal for int() with base 10: 'CM000001.4'

Could you please let me know how to resolve this? The reference genome is dog, not human.

Anubhab
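
The traceback shows popstats converting the first tped column with `int(col[0].lstrip('chr'))`. In Python, `lstrip('chr')` only removes leading `c`, `h` and `r` characters, so an accession such as `CM000001.4` is left unchanged and cannot be parsed as an integer. One possible workaround, sketched below, is to renumber the first column before running popstats. This assumes arbitrary integer chromosome labels are acceptable for the analysis; the function name `renumber_chromosomes` and the example rows are hypothetical and not part of popstats.

```python
def renumber_chromosomes(lines):
    """Replace the first column of tped lines with integers 1, 2, 3, ...

    Contigs are numbered in order of first appearance. Returns the
    rewritten lines and the contig-name -> integer mapping.
    (Hypothetical helper, not part of popstats.)
    """
    mapping = {}
    rewritten = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        name = fields[0]
        if name not in mapping:
            mapping[name] = len(mapping) + 1
        fields[0] = str(mapping[name])
        rewritten.append(" ".join(fields))
    return rewritten, mapping


# Made-up tped rows: chromosome, SNP id, genetic distance, position, alleles
example = [
    "CM000001.4 snp1 0 1000 A A A G",
    "CM000001.4 snp2 0 2000 C C C T",
    "CM000002.4 snp3 0 500 G G G G",
]
rewritten, mapping = renumber_chromosomes(example)
for row in rewritten:
    print(row)
```

Keeping the returned mapping makes it possible to translate results back to the original contig names. The `--not23` flag that appears in another report suggests chromosome 23 may be handled specially, so a genome with more than 22 contigs may need extra care.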

f3vanilla requests 4 populations

Hey Pontus!

I tried running popstats with the --f3vanilla option, and it seems to require four populations to be specified, even though --f3 correctly asks for three.

$ python2.7 popstats.py --f3vanilla --pops 1,2,3 --file F3_popstats.merged 
Traceback (most recent call last):
  File "/Users/lamnidis/Software/popstats/popstats.py", line 185, in <module>
    poplabel4=poplist[3].split('+')
IndexError: list index out of range


$ python2.7 popstats.py --f3vanilla --pops 1,2,3,3 --file F3_popstats.merged 
#Runs without raising an error.

while --f3 runs normally with 3 populations:

$ python2.7 popstats.py --f3 --pops 1,2,3 --file F3_popstats.merged
#Runs without raising an error

Thank you!
Thiseas

Defining pops in tfam file

Hi Pontus,

I have a tped file with all autosomes and a tfam file with the individuals (286) from 7 different populations. Each population has a specific identifier that consists of a 3-letter pop code and an individual identifier, e.g. KPA_562. How would I use the Python script to extract the pops I want to study? The example file linked in one of the issues I read is no longer available.

Thank you.
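
Another report on this page mentions putting the population name in the first column of the tfam file. A minimal sketch of that step follows, assuming the individual ID sits in the second tfam column and the population code is the part before the underscore (as in `KPA_562`). The helper `label_tfam_by_prefix` and the example rows are hypothetical and not part of popstats.

```python
def label_tfam_by_prefix(lines, sep="_"):
    """Set the first tfam column to the population code taken from the
    individual ID in the second column (e.g. 'KPA_562' -> 'KPA').
    (Hypothetical helper, not part of popstats.)
    """
    labelled = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        fields[0] = fields[1].split(sep)[0]
        labelled.append(" ".join(fields))
    return labelled


# Made-up tfam rows: family, individual, father, mother, sex, phenotype
example = [
    "FAM1 KPA_562 0 0 0 -9",
    "FAM2 KPA_563 0 0 0 -9",
    "FAM3 LWK_101 0 0 0 -9",
]
for row in label_tfam_by_prefix(example):
    print(row)
```

With labels like these in place, the populations of interest would be named through `--pops`, as in the commands shown in the other reports (for example `--pops KPA,LWK,...`).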

--pi not an option

Hi,
I recently started using your code to estimate f2, since I couldn't get AdmixTools to compile correctly. Most of the tests listed in the README work for me, but --pi doesn't seem to be recognized as an option, and I can't find it listed as an option in the Python file either. Any plans to update this?

Also, I couldn't find the vcf2tped.py script mentioned in the README, so I used the VCFtools --plink-tped option instead.

Any chance you might implement something where the --pops option could accept a file listing which individuals belong to which populations? Currently it is a bit cumbersome to list all individuals that belong to a population on the command line (e.g., ind1+ind2+ind3,ind4+ind5+ind6), and for calculating f2 the list needs to be rearranged, since only the pops in positions 1 and 2 are used. What about a pairwise calculation for anything involving only 2 pops?

thank you,
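
Until something like this exists in popstats, the argument can be assembled outside the tool. The sketch below builds the `ind1+ind2+ind3,ind4+ind5+ind6` string described above from a list of (individual, population) pairs and enumerates all population pairs for a two-population statistic. The helper `build_pops_argument`, the population names and the assignment list are hypothetical; only the `+` and `,` syntax comes from the request above.

```python
from itertools import combinations


def build_pops_argument(assignments, pop_order):
    """Join the individuals of each population with '+' and the
    populations with ',', in the order given by pop_order.
    (Hypothetical helper, not part of popstats.)
    """
    groups = {pop: [] for pop in pop_order}
    for individual, pop in assignments:
        if pop in groups:
            groups[pop].append(individual)
    return ",".join("+".join(groups[pop]) for pop in pop_order)


# Made-up assignments, e.g. parsed from a two-column text file
assignments = [
    ("ind1", "A"), ("ind2", "A"), ("ind3", "A"),
    ("ind4", "B"), ("ind5", "B"), ("ind6", "B"),
    ("ind7", "C"),
]

# One --pops string per unordered pair of populations
pairs = list(combinations(["A", "B", "C"], 2))
arguments = [build_pops_argument(assignments, pair) for pair in pairs]
for pair, argument in zip(pairs, arguments):
    print(pair, argument)
```

Each generated string could then be passed to `--pops` in a separate run, which avoids rearranging the list by hand for every pair.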

About the input files

Hi, I am not clear about the input files. For example, I have Pop1, Pop2, Pop3 and Outgroup, and I want to calculate D-statistics (Pop1, Pop2; Pop3, Outgroup). Do I need to put these four populations in one tped file, or in four tped files? Thanks.

Basic question of input files

Hi again Pontusk,

If I use an input that has not been filtered for bi-allelic SNPs, will popstats automatically ignore SNPs with more than two alleles, or do I have to provide an input that is already filtered for bi-allelic SNPs only?

Best,
Alex
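
The page does not say how popstats treats multi-allelic sites. One way to sidestep the question is to filter the tped file beforehand. The sketch below keeps rows with at most two distinct non-missing alleles, assuming the usual tped layout (four leading columns followed by allele calls) and `0` as the missing-allele code. The helpers `count_alleles` and `keep_biallelic` are hypothetical and not part of popstats.

```python
def count_alleles(tped_line, missing="0"):
    """Count distinct non-missing alleles in one tped row.

    Assumes four leading columns (chromosome, SNP id, genetic
    distance, position) followed by the allele calls.
    """
    fields = tped_line.split()
    return len(set(fields[4:]) - {missing})


def keep_biallelic(lines, missing="0"):
    """Keep rows with at most two distinct non-missing alleles."""
    return [line for line in lines if count_alleles(line, missing) <= 2]


# Made-up tped rows
example = [
    "1 rs1 0 100 A A A G G G",  # alleles A and G: kept
    "1 rs2 0 200 A C G G 0 0",  # alleles A, C and G: dropped
    "1 rs3 0 300 T T 0 0 T T",  # only T observed: kept
]
kept = keep_biallelic(example)
for row in kept:
    print(row)
```

Monomorphic rows pass this filter; changing the condition to `== 2` would drop them as well.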

DArTseq Data to popstats

Hello,
I am using DArTseq data in my analysis. The data are binary and contain missing values. I was able to load the data file into PLINK and convert it for use in popstats. However, I get errors when I try to load it into popstats. Do you think this is due to my data type and the missing values?
Thank you :)

Best regards
Buddhika

Question about 'Dp_main = sum(t_list) / sum(n_list)'

Hi Pontus
I am trying to run the D-statistics. I followed your suggestion to put the population name in the first column. I then ran popstats.py on my .tped and .tfam files, and it shows:

########################################
Traceback (most recent call last):
File "popstats.py", line 2105, in <module>
Dp_main = sum(t_list) / sum(n_list)
ZeroDivisionError: integer division or modulo by zero
########################################

The command I used was:
python2 popstats.py -p good.tped -f good.tfam --not23 -b 1000000 --pops pop1,pop2,pop3,pop4 --informative

What kind of error in my data could lead to this?

Thank you!
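
The traceback only shows that `sum(n_list)` was zero. The page does not confirm the cause, but one thing worth ruling out is that the names given to `--pops` do not match any label in the first tfam column. A small diagnostic sketch under that assumption follows; the helper `count_individuals_per_pop` and the example rows are hypothetical.

```python
from collections import Counter


def count_individuals_per_pop(tfam_lines, pops):
    """Count how many tfam rows carry each requested population label
    in their first column. (Hypothetical helper, not part of popstats.)
    """
    labels = Counter(
        line.split()[0] for line in tfam_lines if line.split()
    )
    return {pop: labels.get(pop, 0) for pop in pops}


# Made-up tfam rows with the population label in the first column
example = [
    "pop1 ind1 0 0 0 -9",
    "pop1 ind2 0 0 0 -9",
    "pop2 ind3 0 0 0 -9",
    "Pop3 ind4 0 0 0 -9",  # capitalised label will not match 'pop3'
]
counts = count_individuals_per_pop(example, ["pop1", "pop2", "pop3", "pop4"])
print(counts)
```

A count of zero for any requested population would mean no individuals were selected for it, which is one way a statistic could end up with no contributing sites.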

sample larger than population

Hi,

I get a sample size error when I try to execute:

python popstats.py --file input --pops POP_CI,POP_SI,POP_AR,POP_TH --ancestor POP_ANC --FAB

I have only one ancestral individual. Does the program need more than one outgroup individual? Do all outgroup individuals have to be from the same species?

Thanks
Anubhab

about h4

Dear Pontus,

I was trying to understand the parameters for the h4 test (LD option) and I have some doubts.

The --LD option is the classic D measure of LD, but what about the LD window? What is the unit for this option? I thought it was cM, but the default is 5000 (in the code).

Thank you for your time
