gnames / gnparser
GNparser normalises scientific names and extracts their semantic elements.
License: MIT License
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/2
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/6
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/49
Saccharomyces cerevisiae-agavica-sylvestre Carbajal, 1901
For now we treat them as an unparseable tail.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/17
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/13
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/1
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/31
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/18
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/39
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/42
The CLI app now has many different functions, and they all need to be tested.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/5
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/3
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/28
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/19
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/15
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/8
Currently the test-data.txt file contains about 500 tests, but they show up as a single test. I think I can use the table test framework from ginkgo
to break them into separate tests.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/21
It is useful if we need to change the output format or find a bug that affects many tests.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/33
Port web server from Scala gnparser to Go gnparser
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/40
Related to GlobalNamesArchitecture/gnparser#474
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/25
created by @mjy at https://gitlab.com/gogna/gnparser/-/issues/26
As a developer, I want the rule names in grammar.peg to be named in a consistent way.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/12
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/11
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/14
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/30
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/35
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/4
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/32
We had a one-day internal hackathon and came up with a format that would be useful for TaxonWorks and probably for some other projects. It is a more flattened format that mostly consists of ranks: genus, species, var., form, etc. Now I need to make a gRPC method for it and add this format to the gnparser Ruby gem.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/23
It is very rare for a name-string to have more than one year. I am going to remove the multiple-year output.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/44
Related to GlobalNamesArchitecture/gnparser#480
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/9
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/46
Pereskia subg. Maihuenia Philippi ex F.A.C.Weber, 1898 means that subgenus Maihuenia is included in
genus Pereskia.
We need to add subg. as a rank for non-species "binomials"; the canonical form for this name
should then be Maihuenia instead of Pereskia.
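The intended result can be sketched with plain string splitting. This is only an illustration of the expected output, not the PEG grammar rule; the function name `subgenusCanonical` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// subgenusCanonical returns the subgenus epithet of a name-string of the
// form "Genus subg. Subgenus Authorship". The second result is false when
// the "subg." marker is not the second word.
// Hypothetical helper: a string-splitting illustration, not a grammar rule.
func subgenusCanonical(name string) (string, bool) {
	words := strings.Fields(name)
	if len(words) < 3 || words[1] != "subg." {
		return "", false
	}
	return words[2], true
}

func main() {
	name := "Pereskia subg. Maihuenia Philippi ex F.A.C.Weber, 1898"
	if canonical, ok := subgenusCanonical(name); ok {
		fmt.Println(canonical)
	}
}
```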
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/37
I will experiment here and try to add underscore handling to the parser itself. If all goes well, there will be a cleaning task that, for now, only works on removing HTML tags and HTML entities from names.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/7
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/27
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/34
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/29
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/41
It is often useful in a command-line environment to chain different processes together with pipes. Currently the parser supports parsing only one name, which is pretty useless. I want to be able to do something like:
gnparser -c durty_names.txt | gnparser -f pretty -j 300 > result.txt
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/38
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/22
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/16
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/10
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/48
ICN
60.6. Diacritical signs are not used in scientific names. When names (either new or old) are drawn from words in which such signs appear, the signs are to be suppressed with the necessary transcription of the letters so modified; for example ä, ö, ü become, respectively, ae, oe, ue; é, è, ê become e; ñ becomes n; ø becomes oe; å becomes ao. The diaeresis, indicating that a vowel is to be pronounced separately from the preceding vowel (as in Cephaëlis, Isoëtes), is a phonetic device that is not considered to alter the spelling; as such, its use is optional. The ligatures -æ- and -œ-, indicating that the letters are pronounced together, are to be replaced by the separate letters -ae- and -oe-.
ICZN
32.5.2.1. In the case of a diacritic or other mark, the mark concerned is deleted, except that in a name published before 1985 and based upon a German word, the umlaut sign is deleted from a vowel and the letter "e" is to be inserted after that vowel (if there is any doubt that the name is based upon a German word, it is to be so treated).
We need to decide on a 'less harmful' approach here.
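To make the difference between the two codes concrete, here is a small sketch of both rules applied to the same epithet. It covers only the lowercase letters listed in the quoted articles, assumes precomposed (NFC) characters, and does not model the ICZN exception for German-based names published before 1985. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// icnReplacer follows the transcriptions listed in ICN 60.6 for
// lowercase, precomposed characters. The diaeresis (as in ë) is left
// untouched because the article treats it as optional.
var icnReplacer = strings.NewReplacer(
	"ä", "ae", "ö", "oe", "ü", "ue",
	"é", "e", "è", "e", "ê", "e",
	"ñ", "n", "ø", "oe", "å", "ao",
	"æ", "ae", "œ", "oe",
)

// icznReplacer applies the general ICZN 32.5.2.1 rule of deleting the
// mark. Assumption: the pre-1985 German umlaut exception is ignored.
var icznReplacer = strings.NewReplacer(
	"ä", "a", "ö", "o", "ü", "u",
	"é", "e", "è", "e", "ê", "e",
	"ñ", "n", "å", "a",
)

// icnTransliterate is a hypothetical helper for the botanical rule.
func icnTransliterate(s string) string { return icnReplacer.Replace(s) }

// icznStripMarks is a hypothetical helper for the zoological rule.
func icznStripMarks(s string) string { return icznReplacer.Replace(s) }

func main() {
	epithet := "mülleri"
	fmt.Println("ICN: ", icnTransliterate(epithet))
	fmt.Println("ICZN:", icznStripMarks(epithet))
}
```

The same input yields two different spellings depending on the code, which is why a single default has to be chosen when the nomenclatural code of a name is not known.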
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/43
Making a project for the issue transfer at https://github.com/dimus/issues-gl2gh
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/20
We are skipping HTML entities for now and will address them later.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/45
Acipenser gueldenstaedti colchicus natio danubicus Movchan, 1967, a real name from WoRMS, is a quadrinomial of legacy rank natio.
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/47
Fungal names often have a sanctioning author (Fr. or Pers.) following a colon after the basionym or combination authorship. This is currently unparsed.
Example: Boletus versicolor L. : Fr.
ICNafp Article 50: http://www.iapt-taxon.org/nomen/main.php?page=r50E&emph=sanctioned
https://en.wikipedia.org/wiki/Sanctioned_name
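A minimal sketch of the desired split is shown below, assuming the sanctioning author is whatever follows the first space-delimited colon in the name-string. The helper `splitSanctioned` is hypothetical and uses plain string handling, not the parser's grammar.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSanctioned separates a name-string at the first " : " and returns
// the part before it, the sanctioning author after it, and whether a
// sanctioning author was found.
// Hypothetical helper for illustration only.
func splitSanctioned(name string) (rest, sanctioning string, ok bool) {
	rest, sanctioning, ok = strings.Cut(name, " : ")
	return strings.TrimSpace(rest), strings.TrimSpace(sanctioning), ok
}

func main() {
	rest, sanctioning, ok := splitSanctioned("Boletus versicolor L. : Fr.")
	fmt.Printf("name=%q sanctioning=%q found=%v\n", rest, sanctioning, ok)
}
```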
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/36
created by @dimus at https://gitlab.com/gogna/gnparser/-/issues/24
Implement gRPC service