
Comments (2)

brendanf commented on July 20, 2024

Ok, I think I understand. I've never actually looked into how read mappers work. In my case, I am calculating pairwise distances between metabarcoding amplicon reads using fully global alignment, so I know (because of the primer locations) that the start and end of both sequences should align. However, there are usually indels in between, so the read lengths are not the same. The ideal case for me would be a fully global version of the algorithm that also allows unequal read lengths.

But since the current implementation is semi-global, it sounds like if I follow your suggestion and pad the end of the shorter sequence with As, the padding would typically be "aligned" to an end gap on the other sequence, so the edit distance calculated by SneakySnake would not include it? In that case it sounds like it could work; I will, after all, run a full aligner on the sequence pairs that pass SneakySnake. I'll at least give it a shot to see whether it filters out enough read pairs to save total execution time on my test data.
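For reference, the filter-then-align workflow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the prefilter is a stand-in (a simple length-difference bound), not SneakySnake itself, the "full aligner" is a plain Levenshtein distance, and all function names are hypothetical.

```python
def edit_distance(a, b):
    """Global (Levenshtein) edit distance using a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def length_prefilter(a, b, max_edits):
    """Stand-in filter (NOT SneakySnake): a global alignment needs at
    least abs(len(a) - len(b)) edits, so larger gaps can be rejected."""
    return abs(len(a) - len(b)) <= max_edits


def within_threshold(a, b, max_edits, prefilter=length_prefilter):
    """Run the cheap prefilter first; only surviving pairs are aligned."""
    if not prefilter(a, b, max_edits):
        return False
    return edit_distance(a, b) <= max_edits
```

The scheme only stays exact if the prefilter never rejects a pair whose true distance is within the threshold; it may let extra pairs through, since the full alignment catches those.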

from sneakysnake.

mealser commented on July 20, 2024

Hi Brendan,
Thanks for your interest.
Supporting unequal lengths is not a limitation of the algorithm.
The current implementation of SneakySnake follows what most read mappers (including minimap2) do, which is to require the first k characters (a k-mer) of both sequences to match exactly. This makes the comparison semi-global. If your sequences satisfy this, you just need to run SneakySnake with a sequence length equal to the shorter of the two input sequences, and it should work directly. If they don't, you could pad the shorter sequence with, for example, A's and run SneakySnake as usual. It should also work well, assuming you choose an appropriate edit distance threshold.
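As a rough sketch of the two options above (truncating to the shorter length, or padding the shorter sequence), a small preprocessing helper might look like this. The function name `equalize_lengths` and its parameters are hypothetical and not part of SneakySnake; the result would still have to be passed to the filter separately.

```python
def equalize_lengths(seq_a, seq_b, mode="truncate", pad_char="A"):
    """Return the two sequences adjusted to a common length.

    mode="truncate": cut both to the length of the shorter sequence.
    mode="pad":      right-pad the shorter sequence with pad_char.
    """
    if mode == "truncate":
        n = min(len(seq_a), len(seq_b))
        return seq_a[:n], seq_b[:n]
    if mode == "pad":
        n = max(len(seq_a), len(seq_b))
        # str.ljust leaves a string unchanged if it is already n long.
        return seq_a.ljust(n, pad_char), seq_b.ljust(n, pad_char)
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `equalize_lengths("ACGTAC", "ACGT", mode="pad")` gives `("ACGTAC", "ACGTAA")`. Note that padding can add mismatches at the end of the pair, which is one reason the edit distance threshold needs to be chosen with the length difference in mind.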

I am curious to know how it goes for you, and I can update the main function accordingly if needed.
Thanks.

