Sequences from BY-LGL are all frame shifted due to `-` being `N` in S:69/70del about sars-cov-2-sequenzdaten_aus_deutschland

robert-koch-institut commented on July 20, 2024
Sequences from BY-LGL are all frame shifted due to `-` being `N` in S:69/70del

from sars-cov-2-sequenzdaten_aus_deutschland.

Comments (5)

MarieLataretu commented on July 20, 2024

Reminds me of this issue nanoporetech/medaka#351, and the BAM looks very similar

icestorm972 commented on July 20, 2024

As far as I remember, it's not only at S:del69/70 but occurred for other deletions too (69/70 is just the most prominent, common, and problematic one).

When analyzed with Nextclade, the problematic sequences show (3×n-1) dashes plus an extra N.

(@corneliusroemer asked you on Twitter in July whether that's normal.)

P.S.
If it's still the same as 2 months ago:
When you look at the unaligned/original FASTA sequence, it seems that the erroneous submissions always have an 'N' at deletions, instead of a single '-' to indicate a gap of undetermined length.
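
The (3×n-1) pattern described above can be checked mechanically. The following is only a minimal sketch, assuming you already have an aligned sequence as a plain string (for example, exported from an alignment tool); the function name `frameshift_gap_runs` and the toy sequences are hypothetical and not part of Nextclade or any other tool mentioned here.

```python
import re

def frameshift_gap_runs(aligned_seq):
    """Find runs of '-' whose length is not a multiple of 3.

    Returns a list of (start_index, dash_count, has_adjacent_n) tuples,
    where has_adjacent_n is True if an 'N' sits directly before or after
    the run of dashes.
    """
    hits = []
    for match in re.finditer(r"-+", aligned_seq):
        start, end = match.span()
        length = end - start
        if length % 3 == 0:
            continue  # in-frame gap, not suspicious
        before = aligned_seq[start - 1].upper() if start > 0 else ""
        after = aligned_seq[end].upper() if end < len(aligned_seq) else ""
        hits.append((start, length, "N" in (before, after)))
    return hits

# Toy example (made-up bases): 5 dashes followed by an 'N'
print(frameshift_gap_runs("ACGT-----NACGT"))   # [(4, 5, True)]
# An in-frame 6-dash gap is not reported
print(frameshift_gap_runs("ACGT------ACGT"))   # []
```

A run of 3×n-1 dashes with a neighbouring 'N' is what you would expect if one base of an in-frame deletion was masked rather than deleted.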

MarieLataretu commented on July 20, 2024

Back in January, it was also S:69/70

When you look at the unaligned/original FASTA sequence, it seems that the erroneous submissions always have an 'N' at deletions, instead of a single '-' to indicate a gap of undetermined length.

medaka variant calls a 5 nt deletion instead of a 6 nt one. If there is an 'N' in place of the missing deleted base, it means that the position is masked afterwards due to low coverage 🤔
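
To make the arithmetic concrete, here is a small sketch of how a 5 nt call plus a masked base shifts the frame, whereas a full 6 nt deletion does not. The reference string is a made-up 24 nt toy sequence, not the real spike sequence, and `apply_deletion` is a hypothetical helper, not a medaka or ARTIC function.

```python
def apply_deletion(ref, start, length, mask_next=False):
    """Delete `length` bases from `ref` at 0-based `start`.

    If mask_next is True, the first base after the deletion is replaced
    by 'N', mimicking a position masked for low coverage.
    """
    tail = ref[start + length:]
    if mask_next and tail:
        tail = "N" + tail[1:]
    return ref[:start] + tail

ref = "ATGGCTATACATGTCTCTGGGACC"  # toy reference, 8 codons (assumption)

correct = apply_deletion(ref, 6, 6)                    # full 6 nt deletion
miscalled = apply_deletion(ref, 6, 5, mask_next=True)  # 5 nt deletion + N

for name, seq in [("correct", correct), ("miscalled", miscalled)]:
    removed = len(ref) - len(seq)
    print(name, seq, "removed:", removed, "in frame:", removed % 3 == 0)
```

In the miscalled case, only 5 bases are removed from the consensus, so everything downstream is read in the wrong frame even though the 'N' marks where the sixth deleted base should have been.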

corneliusroemer commented on July 20, 2024

Thanks for pointing to the medaka issue @MarieLataretu

So I guess the lab is neither using the workaround suggested there ("solved with sup (super-acc) basecalling and respective medaka model") nor fixing the issue manually. Such a frame shift in S is totally unviable.

If you drop the sequences into nextclade.org, you will see the issue immediately. Weird that GISAID allowed these frame-shifted sequences through. I thought they check for frameshifts.

I see - I probably shouldn't have opened this issue here as the submission didn't go through RKI? Or am I wrong?

MarieLataretu commented on July 20, 2024

I haven't had time to look at the frame-shifted sequences (and metadata) in DESH.

There is only one sample sequenced at RKI with this frame shift (at least since the last frame shift wave at the beginning of 2022). However, we use the sup model and the frame shift still appears.

A workaround is to use the nanopolish mode instead of medaka in the ARTIC workflow.
