Package designed to check descriptions of sequence variants according to the Human Genome Variation Society (HGVS) guidelines.
Please see ReadTheDocs for the latest documentation.
Tool suite for HGVS variant descriptions
Home Page: https://mutalyzer.nl
License: MIT License
One of the examples results in an ERETR error.
The name checker crashes on the following description.
NG_012337.1(SDHD):c.274G>T
It would be nice if whenever a legacy locus selector is used, we try to find it in the reference model and present the user with a selectable list of options. E.g., in this particular example, we could say something like
Transcript "SDHD" not found, but a gene was found by that name. Please choose from:
NG_012337.1(NM_003002.2):c.274G>T (succinate dehydrogenase complex, subunit D, integral membrane protein)
Likewise, we could allow for the HGNC id in the same way.
Note: I do not suggest resolving the full legacy locus selectors (e.g., SDHD_v1). In this case I would discard everything after the _ and follow the same procedure described above.
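The fallback suggested above could be sketched roughly as follows. This is only an illustration: the function name and the `transcripts_by_gene` mapping are hypothetical and stand in for whatever lookup the reference model actually offers.

```python
def suggest_selectors(selector, transcripts_by_gene):
    """Suggest transcript IDs for a legacy locus selector.

    Everything from the first underscore onwards is discarded, so
    "SDHD_v1" is looked up as gene "SDHD". This is meant to be called
    only after the selector was not found as a transcript ID.
    `transcripts_by_gene` is a hypothetical mapping from gene name to
    the transcript IDs annotated for that gene.
    """
    gene = selector.split("_", 1)[0]
    return sorted(transcripts_by_gene.get(gene, []))


# Hypothetical annotation for the example reference.
annotation = {"SDHD": ["NM_003002.2"]}
print(suggest_selectors("SDHD_v1", annotation))
print(suggest_selectors("SDHD", annotation))
```

Both calls yield the same list of candidates, which could then be presented to the user as a selectable list.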
For the following description:
NC_000016.9:g.15815278C>T
No affected transcripts were found, while Mutalyzer 2 finds 13.
In the position_convert endpoint, there seem to be multiple ways of providing input (i.e., via a description and via a combination of other input fields). It would be cleaner to split this into two different endpoints.
This variant inserts two consecutive Cs. It is corrected, however, to a duplication that does not contain two consecutive Cs.
git checkout refactor
git clone [email protected]:mlefter/mutalyzer-visualization-vuetify.git
thanks!
When the following request is done to the API:
curl -X GET "http://v3.mutalyzer.nl/api/reference_model/NM_002001.2" -H "accept: application/json"
we get the following response:
"model": {
"id": "NM_002001.2",
"type": "record",
"location": {
"type": "range",
"start": {
"type": "point",
"position": 1191
},
"end": {
"type": "point",
"position": 0
}
},
...
The start and end positions seem to be swapped.
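A regression test for this could use a check like the one below. It is a minimal sketch that only assumes the location layout shown in the response above; the function name is made up for illustration.

```python
def range_is_ordered(location):
    """Return True if the start of a range does not lie after its end."""
    return location["start"]["position"] <= location["end"]["position"]


# The location as returned for NM_002001.2 in the response above.
location = {
    "type": "range",
    "start": {"type": "point", "position": 1191},
    "end": {"type": "point", "position": 0},
}
print(range_is_ordered(location))
```

For the response above the check fails, which matches the observation that the two positions are swapped.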
Suggestion/Improvement
When entering something like "chr1:g.169519049C>T" into the Name Checker, it would be helpful for Mutalyzer to suggest, or go ahead and convert, "chr1" to NC_000001.10.
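One way to picture this suggestion is a small alias table. The sketch below assumes the GRCh37 assembly and lists only a few chromosomes whose accessions appear elsewhere in these reports; the table and function names are hypothetical, and a real implementation would need the complete table for each supported assembly.

```python
# Partial, illustrative alias table for GRCh37 (assumption: the user
# means this assembly; other assemblies use other version numbers).
GRCH37_ACCESSIONS = {
    "chr1": "NC_000001.10",
    "chr16": "NC_000016.9",
    "chr17": "NC_000017.10",
}


def replace_chromosome_alias(description, accessions=GRCH37_ACCESSIONS):
    """Replace a leading chromosome alias by its accession, if known."""
    reference, separator, variant = description.partition(":")
    if not separator:
        raise ValueError("no ':' found in description")
    # Unknown references are returned unchanged.
    return accessions.get(reference, reference) + separator + variant


print(replace_chromosome_alias("chr1:g.169519049C>T"))
```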
Perhaps we should not send the entire reference model and reference sequence to the JavaScript client by default. This could be done on request, if it is absolutely needed.
Variant
NM_002001.4:c.55_56insTTTT
is converted to:
NC_000001.11(NM_002001.4):c.55_56insTTTT
This is not correct, because there is an intron between c.55 and c.56.
It is unclear how to map this variant; for now it would be nice to raise an error.
When the following description is offered, the website stalls indefinitely.
CCDS4702.1:c.123C>T
The following description gives an error, while none is expected.
The title of description_extract says: "Convert a position".
Add support for repeated sequences using the following format: start_endSEQ[repeat_number], where SEQ is the repeat unit, which:
occurs repeat_number_seq times between the start and end locations in the reference sequence, with repeat_number_seq >= 0 and (end - start + 1) % |SEQ| = 0;
occurs repeat_number times in the observed sequence, with repeat_number >= 0.
Currently everything is converted in the frontend to sequences. If the input is provided as variants, these should be given to the backend as is for performance reasons.
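The constraints of the start_endSEQ[repeat_number] format proposed above could be checked along these lines. This is a sketch, not the project's parser: the regular expression, the function name and the returned dictionary are assumptions, and the check does not verify that the reference sequence really consists of copies of SEQ.

```python
import re

# Hypothetical pattern for "start_endSEQ[repeat_number]".
REPEAT_RE = re.compile(r"^(\d+)_(\d+)([ACGT]+)\[(\d+)\]$")


def parse_repeat(variant):
    """Parse a repeat variant and check the length constraint."""
    match = REPEAT_RE.match(variant)
    if match is None:
        raise ValueError("not a repeat description: " + variant)
    start, end, unit, count = match.groups()
    start, end, count = int(start), int(end), int(count)
    span = end - start + 1
    # The reference range must hold a whole number (>= 0) of units.
    if span < 0 or span % len(unit) != 0:
        raise ValueError("range length is not a multiple of the unit length")
    return {
        "start": start,
        "end": end,
        "unit": unit,
        "reference_count": span // len(unit),  # repeat_number_seq
        "observed_count": count,               # repeat_number
    }


print(parse_repeat("10_15AT[4]"))
```

A range such as 10_14 with unit AT would be rejected, because five positions cannot hold a whole number of two-nucleotide units.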
When a transcript is used in the name checker, there is no need to use the transcript ID both as reference sequence ID and selector.
It would be nice to have the descriptions in the section "Equivalent descriptions" link to a new name check run.
When checking the following description:
LRG_24:g.5525C[4]
The non-informative message "Some response error occured." appears. I would expect either a message stating that the operation is not supported, or a normalised result.
Hi everyone!
We are trying to use your API to convert cDNA positions to genomic positions. Overall it's working fine, but we had a problem with one conversion:
https://v3.mutalyzer.nl/positionconverter?referenceId=NC_000003.11&fromSelectorId=NM_014850.4&fromCoordinateSystem=c&position=2392&toSelectorId=&toCoordinateSystem=g&includeOverlapping=true
The problem is the version of the NM: NM_014850.4 doesn't work, but NM_014850.3 works fine.
According to NCBI, version .4 has been the accepted one since November 2018 (https://www.ncbi.nlm.nih.gov/nuccore/NM_014850). Is this time gap normal? And if so, where can I find the list of accepted NM transcripts for a given NC?
Thanks,
Quentin Riché-Piotaix, PhD
Bioinformatic Engineer,
CHU Poitiers
The following descriptions are (rightfully) silently corrected. However, a warning about why they were corrected would be in order.
NG_012337.1:g.7125+1G>T
NG_012337.1:g.7125G>TA
For the first description I would expect a warning about using an intronic position without a proper exon boundary.
For the second description I would expect a warning about the type (operator) used.
I can see the description model of the (possibly wrong) input, but the description model after normalisation is missing. Arguably, we should only offer the normalised model, if any at all.
The following request:
curl -X GET "http://v3.mutalyzer.nl/api/get_selectors/NC_000001.11" -H "accept: application/json"
results in an ERETR error. It is unclear why.
It would be nice to have a more readable formatting of the errors.
The following description (generated by Mutalyzer) is not accepted: NG_012337.1(NM_003002.2):r.([274g>u;278u>g])
Hi team,
We are trying to use your API to convert cDNA positions to genomic positions: https://v3.mutalyzer.nl/positionconverter?referenceId=NM_000334.4&fromSelectorId&fromCoordinateSystem=c&position=9877&toSelectorId&toCoordinateSystem=g&includeOverlapping=true
The result obtained with this version of the API is not valid; in this example it should be NC_000017.10:g.62013765C>T, as we correctly obtain when using Mutalyzer v2: https://mutalyzer.nl/position-converter?assembly_name_or_alias=GRCh37&description=NM_000334.4%3Ac.9877G%3EA
Thanks!
Leslie Matalonga
--
Leslie Matalonga, PhD
Clinical Genomics Specialist
CNAG-CRG
Tel:934020828
The following description:
NC_000016.9:g.[15815278C>T;15815278del]
is normalised to:
NC_000016.9:g.15815278C>T
Part of the description is discarded, but no warning or errors are given.
Some internal server error is triggered when checking the following variant description.
NC_000001.11:g.114750024_114750025ins[(123);114750025_114750040]
The following conversion gets corrected to LRG_199:g.=, which does not look right to me.
The following variant description:
NC_000016.9:g.[15815278C>A;15815279del]
is erroneously normalised to:
NC_000016.9:g.15815277_15815279dup
When a transcript is used in the name checker, error ESELECTORMODELNOEXONS may be raised. The name checker can and should continue in this case, by assuming that the whole transcript is one big exon.
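The proposed fallback might look like the sketch below. The function name and the representation of exons as zero-based, half-open (start, end) tuples are assumptions made for illustration only.

```python
def exons_or_whole_transcript(exons, transcript_length):
    """Return the annotated exons, or one exon spanning the transcript.

    Exons are assumed to be zero-based, half-open (start, end) tuples.
    """
    if exons:
        return list(exons)
    # No exon annotation: treat the whole transcript as one big exon.
    return [(0, transcript_length)]


print(exons_or_whole_transcript([], 1200))
print(exons_or_whole_transcript([(0, 50), (60, 100)], 1200))
```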
In the following example, the description cannot be interpreted because of internal inconsistencies.
NG_012337.1:g.7125delGACinsT
According to the position, one nucleotide is deleted, but according to the (optional) sequence, three nucleotides are deleted. In case of such inconsistencies, I would suggest halting instead of silently correcting the description.
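The consistency check suggested above could be sketched as follows. The regular expression covers only simple delins variants with plain positions, and both it and the function name are hypothetical.

```python
import re

# Hypothetical pattern: position or range, optional deleted sequence,
# then the inserted sequence.
DELINS_RE = re.compile(r"^(\d+)(?:_(\d+))?del([ACGT]*)ins([ACGT]+)$")


def check_deleted_length(variant):
    """Return the deleted length, or raise if the description disagrees."""
    match = DELINS_RE.match(variant)
    if match is None:
        raise ValueError("unsupported variant: " + variant)
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    length = end - start + 1
    deleted = match.group(3)
    # Halt instead of silently correcting when the optional sequence
    # does not have the length implied by the position(s).
    if deleted and len(deleted) != length:
        raise ValueError(
            "position implies %d deleted nucleotide(s), sequence gives %d"
            % (length, len(deleted))
        )
    return length


print(check_deleted_length("7125_7127delGACinsT"))
```

With this check, 7125delGACinsT is rejected with an explicit message, while 7125_7127delGACinsT and 7125delinsT are accepted.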
The following allele descriptions are handled differently:
NG_008376.4:g.[6933del;6932_6933insC]
and:
NG_008376.4:g.[6932_6933insC;6933del]
NC_000012.11:g.78582566_78582568delinsGATAA should be normalized to NC_000012.11:g.78582569_78582571delinsGATAA.
An internal server error is triggered for the following:
https://v3.mutalyzer.nl/api/view_variants/test
The following variant description:
LRG_303:g.6883_6884insTTTCGCCCC
is correctly normalised to:
LRG_303:g.6875_6883dup
However, when another variant is added upstream, e.g.:
LRG_303:g.[11del;6883_6884insTTTCGCCCC]
it is incorrectly normalised to:
LRG_303:g.[11del;6883_6884insCGCCCCTTT]
Perhaps this is a bug in the mutator module?
When checking description NP_002993.1:p.Asp92Glu, no suggestions for back translations are given. Mutalyzer 2 used to do this.
Hi,
I've installed Mutalyzer 3.0.0a2.dev0 from source, but it cannot be run.
The errors are attached below.
File "/bioinfo/software/miniconda3/bin/mutalyzer_name_checker", line 33, in <module>
sys.exit(load_entry_point('mutalyzer==3.0.0a2.dev0', 'console_scripts', 'mutalyzer_name_checker')())
File "/bioinfo/software/miniconda3/bin/mutalyzer_name_checker", line 25, in importlib_load_entry_point
return next(matches).load()
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/importlib_metadata/__init__.py", line 167, in load
module = import_module(match.group('module'))
File "/bioinfo/software/miniconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/mutalyzer-3.0.0a2.dev0-py3.7.egg/mutalyzer/cli.py", line 4, in <module>
from mutalyzer.name_checker import name_check
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/mutalyzer-3.0.0a2.dev0-py3.7.egg/mutalyzer/name_checker.py", line 1, in <module>
from .description import Description
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/mutalyzer-3.0.0a2.dev0-py3.7.egg/mutalyzer/description.py", line 10, in <module>
from mutalyzer_mutator import mutate
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 668, in _load_unlocked
File "<frozen importlib._bootstrap>", line 638, in _load_backward_compatible
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/mutalyzer_mutator-0.2.0-py3.7.egg/mutalyzer_mutator/__init__.py", line 17, in <module>
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/mutalyzer_mutator-0.2.0-py3.7.egg/mutalyzer_mutator/__init__.py", line 7, in _get_metadata
File "/bioinfo/software/miniconda3/lib/python3.7/site-packages/pkg_resources/__init__.py", line 482, in get_distribution
raise TypeError("Expected string, Requirement, or Distribution", dist)
TypeError: ('Expected string, Requirement, or Distribution', None)
Any tips to fix this error?
Thanks,
Junfeng
The following pattern is found a number of times in this project.
if something:
    return a
elif something_else:
    return b
This, however, leads to an inconsistency in return type when neither something nor something_else is true. A default return value is preferred here.
Also see the recommendation "Either all return statements in a function should return an expression, or none of them should." (PEP 8).
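A minimal illustration of the preferred form, using a made-up function rather than code from this project:

```python
def classify(value):
    """Classify a number; every path returns a string."""
    if value > 0:
        return "positive"
    elif value < 0:
        return "negative"
    # Explicit default, so the function never falls through to an
    # implicit `return None`.
    return "zero"


print(classify(5), classify(-2), classify(0))
```

Where no meaningful default exists, an explicit `return None` at the end still makes the fall-through case visible to the reader.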
The example on the Name Checker page results in an error. It would be better to only show working examples.
The description extractor page does not show any output.
It seems that normalizing duplications on the reverse strand is not performed correctly:
NC_000001.11(NM_032833.5):c.65_66insGGCTTCCGGTTCTGGCC is wrongly normalized to NC_000001.11(NM_032833.5):c.66_82dup. On the transcript reference it seems fine: NM_032833.5:c.65_66insGGCTTCCGGTTCTGGCC is normalized to NM_032833.5:c.49_65dup.
NC_000009.11:g.21974758_21974759insC should be normalized to NC_000009.11(NM_000077.5):c.68dup and not to NC_000009.11(NM_000077.5):c.69dup. Next, when NC_000009.11(NM_000077.5):c.69dup is used as input, it is wrongly normalized to NC_000009.11(NM_000077.4):c.70dup. It seems like there is a shifting problem.
NG_012337.1(NM_012459.2):c.5_6dup is wrongly normalized to NG_012337.1(NM_012459.2):c.7_8dup.
The result of the back translation of NM_003002.4:p.(Asp92Tyr) is NM_003002.4:c.(274G>T), but should this not be NM_003002.4:r.(274g>u)?