greatapes_sims's People

Contributors

mufernando

greatapes_sims's Issues

Don't remove msprime mutations in the recapitation region

If you keep the initial SLiM generation in the tree sequence (i.e., don't simplify it away),
you can always tell which mutations happened in the SLiM period and which happened in the recapitation period.
So you should only thin mutations in the SLiM period,
that is, only mutations for which the parent of the edge on which the mutation occurred lived since the start of the SLiM period.
That can be read efficiently out of the tables.
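
A minimal sketch of that table lookup, using plain Python lists as stand-ins for the table columns (in tskit these would presumably come from `ts.tables.mutations`, `ts.tables.sites`, `ts.tables.edges` and `ts.tables.nodes`). The function name `in_slim_period`, the argument layout and the toy values are hypothetical. Times are in generations ago, so "lived since the start of the SLiM period" becomes `time <= slim_start_time`.

```python
def in_slim_period(mut_node, mut_pos, edges, node_time, slim_start_time):
    """Flag mutations whose edge has a parent that lived in the SLiM period.

    mut_node[i], mut_pos[i]: node and position of mutation i
    edges: list of (left, right, parent, child) tuples
    node_time[u]: time of node u, in generations ago
    slim_start_time: age of the initial SLiM generation (hypothetical input)
    """
    # Index edges by child so each mutation only scans its own node's edges.
    by_child = {}
    for left, right, parent, child in edges:
        by_child.setdefault(child, []).append((left, right, parent))

    flags = []
    for node, pos in zip(mut_node, mut_pos):
        parent = None
        for left, right, p in by_child.get(node, []):
            if left <= pos < right:
                parent = p
                break
        # No parent edge means the mutation sits above a root: not SLiM.
        flags.append(parent is not None and node_time[parent] <= slim_start_time)
    return flags


# Toy example: node 2 is in the initial SLiM generation (time 10),
# node 3 is a recapitated ancestor (time 50).
node_time = [0.0, 0.0, 10.0, 50.0]
edges = [(0, 100, 2, 0), (0, 100, 2, 1), (0, 100, 3, 2)]
flags = in_slim_period([0, 2], [5, 20], edges, node_time, slim_start_time=10.0)
print(flags)  # [True, False]
```

Only the mutations flagged `True` would then be candidates for thinning.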

shifting coordinates in exon tsv

bedtools intersect -a {params.ex} -b ../../meta/maps/'{wildcards.rand_id}'.bed | cut -f 1-3 | awk -v OFS='\t' 'NR==1 {{a=$2}} {{print $1,$2-a, $3-a}}' > {output.ex_f}

The shift is wrong: it should be based on the start coordinate of the window, not on the first exon start in the intersected BED file.
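
To illustrate the intended shift, here is a small Python sketch. It assumes the simulated window is a single BED interval and that the exon intervals have already been clipped to it (as `bedtools intersect` does); `shift_exons` and the example coordinates are hypothetical.

```python
def shift_exons(exons, window_start):
    """Re-express exon intervals relative to the start of the window.

    exons: list of (chrom, start, end) tuples, already clipped to the window
    window_start: start coordinate of the window (column 2 of its BED line)
    """
    return [(chrom, start - window_start, end - window_start)
            for chrom, start, end in exons]


# The window starts at 1000, but the first exon only starts at 1200.
exons = [("chr1", 1200, 1500), ("chr1", 3000, 3400)]
shifted = shift_exons(exons, window_start=1000)
print(shifted)  # [('chr1', 200, 500), ('chr1', 2000, 2400)]
```

Shifting by the first exon start (1200 here) would instead place the first exon at 0, which misaligns every exon whenever the window does not begin inside an exon.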

vectorized genomic elements

f = readFile(exonfile);
i = 0;
last = 0;
for (line in f) {
    coord = asInteger(strsplit(line[0], "\t")[1:2]);
    // just in case the file extends past the stretch I want to simulate
    if (coord[0] > asInteger(L)-1) {
        break;
    }
    if (coord[1] > asInteger(L)-1) {
        coord[1] = asInteger(L)-1;
    }
    // the -1 in the end coordinate is because BED ends are exclusive, but SLiM ends are inclusive
    initializeGenomicElement(g1, coord[0], coord[1]-1);
    last = coord[1];
    i = i+1;
}
if (i == 0) {
    initializeGenomicElement(g1, 0, asInteger(L)-1);
}
// hacky way of making the last base always be a genomic element - downstream this matters for recapitation/overlay with mutations
if (last < asInteger(L)-1 & i != 0) {
    initializeGenomicElement(g1, asInteger(L)-2, asInteger(L)-1);
}

Since SLiM 3.3 you can define genomic elements with vectors of start and end positions. That should be faster than the loop.
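
As a rough illustration of what a vectorized call needs, this Python sketch builds the start and end vectors from BED lines with the same clipping as the loop above. It is not Eidos; `element_vectors` and the sample lines are hypothetical, the empty-interval guard is an addition, and the trailing last-base element from the loop is left out for brevity.

```python
def element_vectors(bed_lines, L):
    """Return (starts, ends) of genomic elements with inclusive ends.

    bed_lines: tab-separated BED lines (chrom, start, end), sorted by start
    L: length of the simulated stretch
    """
    starts, ends = [], []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[1]), int(fields[2])
        if start > L - 1:
            break  # the file runs past the simulated stretch
        end = min(end, L - 1)  # same clipping as the loop
        if end <= start:
            continue  # clipped to nothing (a guard the loop does not have)
        starts.append(start)
        ends.append(end - 1)  # BED ends are exclusive, SLiM ends are inclusive
    if not starts:
        # no exons at all: one element spanning the whole stretch
        starts, ends = [0], [L - 1]
    return starts, ends


bed = ["chr1\t10\t20", "chr1\t50\t80", "chr1\t200\t300"]
starts, ends = element_vectors(bed, L=100)
print(starts, ends)  # [10, 50] [19, 79]
```

In SLiM 3.3 or later, the two vectors would then go into a single `initializeGenomicElement(g1, starts, ends)` call instead of one call per exon.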

reconciling rec map in hapmap with actual chr sizes

The recombination maps from HapMap do not span the full chromosome, but I am using the actual chromosome sizes to set up the simulations.

Handle this discrepancy in windowed.snake: check that the last position in the recombination map is the end of the chromosome and, if it is not, add another line with rec = 0.
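
A minimal Python sketch of that check, assuming the map has been read into parallel lists of positions and rates and that "end of the chromosome" means the chromosome length itself. `pad_recmap`, this layout and the toy values are hypothetical, not the code in windowed.snake.

```python
def pad_recmap(positions, rates, chrom_length):
    """Extend a recombination map so it reaches the end of the chromosome.

    positions: sorted map positions in bp
    rates: recombination rate listed on each map line
    chrom_length: actual chromosome size used for the simulation
    """
    positions, rates = list(positions), list(rates)
    if positions and positions[-1] < chrom_length:
        # The map stops short: add one more line with rec = 0.
        positions.append(chrom_length)
        rates.append(0.0)
    return positions, rates


padded = pad_recmap([0, 5000, 9000], [1.5, 0.8, 0.3], chrom_length=12000)
print(padded)  # ([0, 5000, 9000, 12000], [1.5, 0.8, 0.3, 0.0])
```

A map that already ends at the chromosome end is returned unchanged.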

ploidy being set as 1

see cc6c9cd

Because we can have more than one subpopulation per tree sequence, and we want to calculate some stats for each subpopulation individually, I need to subset the genotype matrix.
With tskit, ts.samples(population=i) gives the indices belonging to a population for each haploid genome, but not for diploid genomes. If I treated the genotype matrix as having ploidy=2, I wouldn't be able to subset it this way.

Ploidy should not matter for the stats we are using so far (pi and dxy), but it could be an issue with other stats.
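
A small sketch of the haploid subsetting, using plain Python lists instead of tskit objects. Here `sample_pop` stands in for the per-genome population labels, and the selected column indices play a role similar to `ts.samples(population=i)`; both function names and the toy data are hypothetical.

```python
def subset_by_population(genotypes, sample_pop, pop):
    """Keep only the haploid genomes (columns) that belong to `pop`.

    genotypes: list of rows, one per site, one column per haploid genome
    sample_pop: population id of each haploid genome
    """
    cols = [j for j, p in enumerate(sample_pop) if p == pop]
    return [[row[j] for j in cols] for row in genotypes]


def pi_per_site(genotypes):
    """Mean pairwise difference per site for biallelic 0/1 haploid genotypes."""
    out = []
    for row in genotypes:
        n, k = len(row), sum(row)
        out.append(2 * k * (n - k) / (n * (n - 1)))
    return out


genotypes = [[0, 1, 1, 1], [1, 1, 0, 0]]  # 2 sites x 4 haploid genomes
sample_pop = [0, 0, 1, 1]
sub0 = subset_by_population(genotypes, sample_pop, pop=0)
print(sub0)               # [[0, 1], [1, 1]]
print(pi_per_site(sub0))  # [1.0, 0.0]
```

Because each column is one haploid genome, the subset is a plain column selection; with diploid genotypes packed per individual, the same selection would need an individual-to-population mapping instead.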

snakemake won't work for more than one param comb

As it is right now, the edge rules take lists of file names, but Snakemake does not run a rule once per element of the list; instead, it puts the whole list into the input placeholder.

work in progress...
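
One common way around this in Snakemake is to write each rule with wildcards and request one target file per parameter combination, so the rule runs once per file instead of receiving the whole list. The Python sketch below only illustrates building that per-combination target list, roughly what Snakemake's `expand` helper produces; the `targets` function, the parameter names and values, and the path pattern are all hypothetical.

```python
from itertools import product


def targets(pattern, **params):
    """Fill `pattern` once for every combination of the parameter values."""
    keys = list(params)
    return [pattern.format(**dict(zip(keys, combo)))
            for combo in product(*(params[k] for k in keys))]


paths = targets("output/sims/N{N}_mu{mu}.trees",
                N=[1000, 5000], mu=["1e-8", "2e-8"])
for path in paths:
    print(path)
# output/sims/N1000_mu1e-8.trees
# output/sims/N1000_mu2e-8.trees
# output/sims/N5000_mu1e-8.trees
# output/sims/N5000_mu2e-8.trees
```

Each path would then be matched by a rule whose output is written with `{N}` and `{mu}` wildcards, giving one job per parameter combination.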
