Comments (7)
@andrewkern I should have some time to take a look at this over the weekend. Hopefully there will be an easy fix.
from relernn.
hi there-- can you share the command line you used to run the program?
from relernn.
Oh yes, here it is. I've also run it with --forceDiploid I believe and gotten the same error.
DIR="$PWD"
MU="2.4e-9"
ReLERNN_SIMULATE --vcf Leioheterodon_madagascarensis_B_biallelic.vcf --genome Leioheterodon_madagascarensis_B_biallelic.vcf.bed --phased --projectDir ${DIR} --assumedMu ${MU} --upperRhoThetaRatio 5 --assumedGenTime 2 --nTrain 12800 --nVali 2000 --nTest 100 --nCPU 40
from relernn.
A quick guess @jradrion, given that this is from simulate: the msprime API changed so that by default the sample number counts diploid individuals rather than haploid chromosomes. I wonder if that's messing stuff up?
from relernn.
@andrewkern This isn't an issue with msprime output, as the error occurs prior to simulation. It looks like there may be an issue with differing ploidy levels across the chromosomes (scaffolds/contigs/etc.). The code creates a mask from missing genotypes for each chromosome and then concatenates the masks using np.concatenate(). The error is being caused because the shapes of the arrays being concatenated differ (some arrays are created using 2n=6, but at least one chromosome appears to be interpreted as 1n=3). The error message is calling out the array at index 6, which from the stdout I think corresponds to the following chromosome: 8739_RagTag. The genotypes for that chromosome aren't in the first 50k lines of the attached .vcf file. @dylandebaun can you provide the .vcf data for it? I'm traveling for the next three days and won't have a chance to look at this, but I'll try to get back to you when I can. You could also try excluding this chromosome to see if you get the same error. Right now I'm not sure where that differing number of haplotypes is being introduced.
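For anyone hitting the same message, here is a minimal sketch of how this kind of shape mismatch makes np.concatenate() fail. It is not ReLERNN's actual code: the mask layout (sites by haplotypes), the concatenation axis, and the helper names build_masks and find_mismatched are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical per-chromosome missing-genotype masks, shaped (sites, haplotypes).
# The layout is an assumption for illustration, not ReLERNN's internal format.
def build_masks(haplotype_counts, n_sites=4):
    return [np.zeros((n_sites, h), dtype=bool) for h in haplotype_counts]

def find_mismatched(masks):
    """Return indices of masks whose haplotype count differs from the first mask."""
    expected = masks[0].shape[1]
    return [i for i, m in enumerate(masks) if m.shape[1] != expected]

# Six chromosomes read as 2n=6 and a seventh read as 1n=3.
masks = build_masks([6, 6, 6, 6, 6, 6, 3])
bad = find_mismatched(masks)

try:
    np.concatenate(masks, axis=0)
    concat_error = None
except ValueError as err:
    # NumPy reports the index of the first array whose shape does not match.
    concat_error = str(err)

print("mismatched mask indices:", bad)
print("concatenate error:", concat_error)
```

Checking the shapes before concatenating, as find_mismatched does, points directly at the offending chromosome instead of leaving it to be inferred from the array index in the traceback.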
from relernn.
Ah, that's what the array was referring to! Yes, that was the issue: one of my vcf filtering steps removed that chromosome from the vcf file, which might explain why it wasn't caught by the original ReLERNN error telling me which of my chromosomes were hemizygous. I reran it without that chromosome in the bed file and again with the correct vcf, and both ran to completion.
Thank you so much for the help!
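In case it helps others, a quick pre-flight check along these lines might catch a chromosome that is listed in the bed file but has no records left in the vcf. This is only a sketch using the standard library, not part of ReLERNN: the function names are made up, and the demo data is invented apart from the 8739_RagTag name mentioned above.

```python
import io

def bed_chromosomes(handle):
    """Collect chromosome names from the first column of a BED file."""
    names = set()
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if line and not line.startswith(("#", "track ", "browser ")):
            names.add(line.split()[0])
    return names

def vcf_chromosomes(handle):
    """Collect chromosome names from the CHROM column of a VCF file."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names

def missing_from_vcf(bed_handle, vcf_handle):
    """Return chromosomes listed in the BED file that have no VCF records."""
    return sorted(bed_chromosomes(bed_handle) - vcf_chromosomes(vcf_handle))

# Invented demo data: scaffold_1 has a record, 8739_RagTag has none.
bed_text = "scaffold_1\t0\t1000\n8739_RagTag\t0\t500\n"
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\n"
    "scaffold_1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\n"
)
missing = missing_from_vcf(io.StringIO(bed_text), io.StringIO(vcf_text))
print(missing)
```

With real files, the two io.StringIO objects would be replaced by open() handles on the bed and vcf paths. Any name it reports is a candidate to drop from the bed file or to restore in the vcf.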
from relernn.
good detective work @jradrion. I'm going to close this issue @dylandebaun.
from relernn.