at-cg / panaligner
Long read aligner for cyclic and acyclic pangenome graphs
License: Other
Hi,
Thanks for this new addition to the graph alignment ecosystem, it is very much appreciated.
I am having issues with a simple test: I take a 3-node GFA, concatenate the node sequences along a path, and align the resulting sequence back to the GFA to check whether the path is recovered.
Here is the GFA:
S 1 tcatggtacttgtcaccttctgtcctatttgagtcatgatgcatttgtctatcatctatctccctgcaatgaacatccctggggtggggccagggtctgttttgcttggtgctgcattcccattgccaagaagacctcacaggcatatcaagcacacagcaggtgcataacacttgctgagtttgctgaatgaatCCCTCCTCTGTATCCTACACATCCAACTGGTCACCACGTCCTGTGCGGTAGATGTCCTTTTTAGCAGGTCCCCTGCACTTCACTGTGTTTGTTAAGCCTTGGCTTGCTCTGtcatttcattttgcttaggTGGCGGCAGCAGCCCCTGACAGCCCCTGGCCTCACAGCCCCTGGCCTCAACCCATATCCTCTCCACTCTGCACACAGCTGCTGGGAGACTTTTCCTTAAAACTtggtttctctcctttcccttctgaAAACTGTCAGTGGCTTCCTGTGGCTCAGAGGAATCAATCGAAACATGGTGCTGCTCACAAGGAAACACAACGTTTCTAGGGCCAGTGCTGTCTTTGTCCTGCCTTCCTCCAAATTCAGCCACGTTAAGCCTCAAGCAGATCCTCCCACACAGCTGGGCTTCTTCTGCCTTTCTGCCAGGACTTGCTGCTCCCAGGAGCCTCCTCCCCCAGTCCCTGCAGCCCCAGGTCAGGTGTCTCCCCACCCAGCCTGGCTCAGCTGTCCTTTCCCGTGCAACTACAGGTTCATACAATTTGCTATTGCTCCATCTACTCTATGTGACTCTTGGTTTCTTGAAGGCAGGTATGggactctttctttccttctttgctttcttttctctttctctttctttctttttctttctttctttctttctttc
S 2 tttctttctttctttctttctttctttctttctttctt
S 3 tctctctctctctctctcttttcttttctttctttctctctttctgtttttagagactgggtcttgctgtcacccaggttggagtgcagtggtgcaatcatggctcaccgaactcctgggctcaagtgagcctcttgcctcagcctcccaattagttgggactacaggcatgtgccactacacctggctaattgttattattattattattattattattatttttgtaaagacagggtcttgctctgtttcctaggctggtcttgaacccctggcctcaaatgatcctcctgcctcagcctcccaaagtgctgggattccaggagtaagataccatgtctggccTGAGAAATTTTCTAAAAGGCATGTTTTTGCACTGtccaatatgtttttctttttatcccagcctctagcacaagtgcctgactcgaacgaagcagggactctgaaatatcttgaatgaacaagtAAATGGCACTTGAATAGGTGGCCTCTAATGtgcagaggaaggaaaagggagctgacgctttccgagtgttcagtaccttctaggcaccatgctgggccttttatgtagttctcgatgaatcctcatggcatcccGCTTTGCAGAATGATTCTAGTTACCCAGAGAGCAAGTAGCCCTTAGGGTCAGGACATGCATCTGAGTTGGTCTAGTCCCAAAGCTCCTGGTCCTCTCCTGACATCActtaaacagagaaacagaactccCCTTGGGCTTCTAGGGGCGCTGGGTTCAGGAGGCACAGCCACTCCCTTTGTTCTTCCTGGCAGCTGCCCCACCAGCAGTGAGCCCATCCCacctctgggttttcttttt
L 1 + 2 + 0M
L 1 + 3 + 0M
L 2 + 3 + 0M
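For reproducibility, here is a minimal sketch of how the test queries can be spelled from a graph like this one. It assumes forward-strand links with 0M overlaps, as in the GFA above; the helper names `parse_gfa` and `spell_path` and the toy sequences are hypothetical and only illustrate the procedure.

```python
def parse_gfa(text):
    """Collect segment sequences and links from GFA text (whitespace-separated fields)."""
    segs, links = {}, set()
    for line in text.splitlines():
        f = line.split()
        if not f:
            continue
        if f[0] == "S":
            segs[f[1]] = f[2]
        elif f[0] == "L":
            # (from, from_orient, to, to_orient); the overlap field is ignored (assumed 0M)
            links.add((f[1], f[2], f[3], f[4]))
    return segs, links


def spell_path(segs, links, path):
    """Concatenate segment sequences along a forward-strand path, checking each link exists."""
    for a, b in zip(path, path[1:]):
        if (a, "+", b, "+") not in links:
            raise ValueError(f"no link {a}+ -> {b}+ in graph")
    return "".join(segs[n] for n in path)


# Toy graph with the same topology as the GFA above (sequences shortened for illustration).
TOY_GFA = """\
S 1 ACGT
S 2 TTT
S 3 GGCA
L 1 + 2 + 0M
L 1 + 3 + 0M
L 2 + 3 + 0M
"""

segs, links = parse_gfa(TOY_GFA)
q_123 = spell_path(segs, links, ["1", "2", "3"])
q_13 = spell_path(segs, links, ["1", "3"])
print(">1_2_3", q_123, sep="\n")
print(">1_3", q_13, sep="\n")
```

Writing the two spelled sequences to a FASTA file gives queries that, by construction, have an exact end-to-end walk through the graph.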
When I concatenate nodes 1,2,3 or 1,3 and align the resulting sequences to the GFA, I never get an end-to-end alignment, and node 2 is always missing:
1 871 1 871 + >1 871 1 871 870 870 60 tp:A:P NM:i:0 cm:i:51 s1:i:714 s2:i:0 dv:f:0.0756 cg:Z:870=
3 841 5 836 + >3 841 5 836 831 831 60 tp:A:P NM:i:0 cm:i:48 s1:i:672 s2:i:0 dv:f:0.0790 cg:Z:831=
1_2_3 1750 1 857 + >1 871 1 857 856 856 60 tp:A:P NM:i:0 cm:i:50 s1:i:714 s2:i:0 dv:f:0.0751 cg:Z:856=
1_2_3 1750 930 1745 + >3 841 21 836 815 815 60 tp:A:P NM:i:0 cm:i:47 s1:i:671 s2:i:0 dv:f:0.0795 cg:Z:815=
1_3 1712 1 857 + >1 871 1 857 856 856 60 tp:A:P NM:i:0 cm:i:50 s1:i:700 s2:i:0 dv:f:0.0751 cg:Z:856=
1_3 1712 892 1707 + >3 841 21 836 815 815 60 tp:A:P NM:i:0 cm:i:47 s1:i:672 s2:i:0 dv:f:0.0795 cg:Z:815=
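To make the missing coverage explicit, here is a small sketch that reports the unaligned query intervals from the records above. It assumes the first four GAF columns are query name, query length, query start, and query end, with 0-based half-open coordinates; the function name `uncovered` is hypothetical, and the optional tags are omitted from the embedded records.

```python
from collections import defaultdict

# The four records for the concatenated queries, without the trailing tags.
GAF = """\
1_2_3 1750 1 857 + >1 871 1 857 856 856 60
1_2_3 1750 930 1745 + >3 841 21 836 815 815 60
1_3 1712 1 857 + >1 871 1 857 856 856 60
1_3 1712 892 1707 + >3 841 21 836 815 815 60
"""


def uncovered(gaf_text):
    """Return, per query, the list of query intervals not covered by any alignment."""
    spans = defaultdict(list)
    qlens = {}
    for line in gaf_text.splitlines():
        f = line.split()
        if len(f) < 12:  # skip blank or truncated lines
            continue
        name, qlen, qstart, qend = f[0], int(f[1]), int(f[2]), int(f[3])
        qlens[name] = qlen
        spans[name].append((qstart, qend))
    gaps = {}
    for name, intervals in spans.items():
        pos, out = 0, []
        for start, end in sorted(intervals):
            if start > pos:
                out.append((pos, start))
            pos = max(pos, end)
        if pos < qlens[name]:
            out.append((pos, qlens[name]))
        gaps[name] = out
    return gaps


gaps = uncovered(GAF)
for name, intervals in gaps.items():
    print(name, intervals)
```

Under those coordinate assumptions, node 2 (38 bp) would occupy query positions [871, 909) of 1_2_3, which falls entirely inside the unaligned interval [857, 930). For 1_3, the junction between nodes 1 and 3 at position 871 sits inside the unaligned interval [857, 892).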
Here is a visualization showing that neither the graph nor the linear query is fully covered:
I have tried multiple different combinations of parameters to attempt to find seeds/minimizers in node 2:
Defaults:
PanAligner -c -x lr -o /tmp/tmpn2oq6iyx/reads_vs_graph.gaf bad_tandem_graph.gfa bad_tandem_nodes.fasta
/tmp/tmpn2oq6iyx/reads_vs_graph.gaf
Permissive:
PanAligner -c -k 14 -n 1,1 -m 1,1 -x lr -o /tmp/tmpyid87tic/reads_vs_graph.gaf bad_tandem_graph.gfa bad_tandem_nodes.fasta
/tmp/tmpyid87tic/reads_vs_graph.gaf
I've also tried increasing the -g and -r parameters to allow the extension step to find more alignments, but that does not seem to have any effect. Setting -f to its extreme values (0 or 1) also has no effect.
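As a side observation on why seeding inside node 2 is difficult regardless of parameters, the sketch below counts the distinct k-mers in node 2 at k = 14 (the value from the permissive run). This only describes the sequence itself; it is not a statement about how PanAligner selects minimizers internally, and the helper name `distinct_kmers` is hypothetical.

```python
def distinct_kmers(seq, k):
    """Return the set of distinct k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Node 2 from the GFA above: a 38 bp tandem repeat with period 4.
node2 = "tttctttctttctttctttctttctttctttctttctt"
k = 14

windows = len(node2) - k + 1
kmers = distinct_kmers(node2, k)
print(f"{windows} k-mer positions, {len(kmers)} distinct k-mers")
```

Because the repeat unit has period 4, every k-mer in node 2 recurs at each multiple of 4 bp, so any anchor placed there is ambiguous in position. That is why I would expect the base-level step, rather than seeding, to have to cover this node.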
Could you help me with this case? I am having difficulty understanding why the base-level alignment step doesn't bridge the chains here. Ideally, I would use base-level alignment for the majority of the work and rely on minimizers only for anchoring the flanks of the tandem repeat.
Thanks