adsabs / adspipelinemsg
License: GNU Affero General Public License v3.0
Steve points out that proto3 is there as well.
I've read a little and I liked what I saw: they want to be more explicit, removed defaults, got rid of 'optional' (simplified internal logic), and added support for Go. proto3 is in production now (proto2 is 'maintained').
Please consider WHY and IF we should stick with proto2 (moving now would be easy).
https://stackoverflow.com/questions/36742747/proto2-vs-proto3-in-c-sharp
nonbib data that goes to Solr
The AdsDataSqlSync repository needs to queue metrics records and non-bib data.
In [9]: r.bibcode = 'foo'
In [10]: r
Out[10]: <adsmsg.bibrecord.BibRecord at 0x7f9f5ed9acd0>
In [11]: str(r)
Out[11]: '<adsmsg.bibrecord.BibRecord object at 0x7f9f5ed9acd0>'
In [12]: r.__str__()
Out[12]: '<adsmsg.bibrecord.BibRecord object at 0x7f9f5ed9acd0>'
In [13]: r.bibcode
Out[13]: 'foo'
In [14]: r.serialize()
---------------------------------------------------------------------------
EncodeError Traceback (most recent call last)
<ipython-input-14-6d620e231470> in <module>()
----> 1 r.serialize()
/dvt/workspace2/ADSPipelineMsg/adsmsg/msg.pyc in serialize(self)
26 Returns a serialized protocol buffer message
27 """
---> 28 return self._data.SerializeToString()
29
30 def is_valid(self):
EncodeError: Message adsmsg.BibRecord is missing required fields: bibcode,JSON_fingerprint,metadata,text
so I was expecting the following to work:
r = BibRecord()
r.bibcode = 'foo'
r.serialize()
It doesn't, because Msg stores things in r._data, so the assignment never reaches the underlying protobuf message.
Also, it would be really really useful to be able to do:
r = BibRecord(bibcode='foo') # and by extension r = BibRecord(**{'bibcode': 'foo', ....})
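A minimal sketch of what the request above could look like, assuming the wrapper keeps its protobuf message in `_data` as the traceback suggests. `FakeBibRecordData` is a hypothetical stand-in for the generated message class so that the example runs without protobuf, and `missing_fields` is an illustrative helper, not part of adsmsg. The field names are taken from the EncodeError above.

```python
class FakeBibRecordData:
    """Hypothetical stand-in for the generated protobuf message class."""

    # Field names copied from the EncodeError in the report above.
    FIELDS = ("bibcode", "JSON_fingerprint", "metadata", "text")

    def __init__(self):
        for name in self.FIELDS:
            object.__setattr__(self, name, None)

    def __setattr__(self, name, value):
        # Reject names that are not declared fields.
        if name not in self.FIELDS:
            raise AttributeError(name)
        object.__setattr__(self, name, value)


class Msg:
    """Wrapper that forwards attribute reads and writes to self._data."""

    def __init__(self, data, **kwargs):
        # Store _data on the wrapper itself, bypassing our own __setattr__.
        object.__setattr__(self, "_data", data)
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __getattr__(self, name):
        # Only reached when normal lookup fails; delegate to the message.
        data = object.__getattribute__(self, "_data")
        return getattr(data, name)

    def __setattr__(self, name, value):
        # Every assignment on the wrapper lands on the wrapped message.
        setattr(self._data, name, value)

    def missing_fields(self):
        # Illustrative helper: list fields that have not been set yet.
        return [n for n in self._data.FIELDS if getattr(self._data, n) is None]


class BibRecord(Msg):
    def __init__(self, **kwargs):
        super().__init__(FakeBibRecordData(), **kwargs)


r = BibRecord(bibcode="foo")   # keyword construction
r.text = "body"                # plain attribute assignment
fields = {"metadata": "{}"}
r2 = BibRecord(**fields)       # construction from a dict
```

With this delegation, `r.bibcode = 'foo'` updates the wrapped message rather than the wrapper, so a later `serialize()` would see the value. Whether the real Msg class should delegate through `__setattr__` or expose explicit properties is a design choice left to the maintainers.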
nonbib and metrics are missing.
Augment pipeline will receive messages to compute canonical affiliation and send it on to master pipeline.
So now that I get to understand 'fullrecord' better I see it is made of several parts.
Since protobuf encourages composition (as opposed to inheritance - which is bad); and also to make it easy on devs, I'd like to ask for the following:
Needed value types include timestamp and hashtable/dict.
The 'Full' suggests that it also contains fulltext, orcid claims, citations, metrics etc.
I'd suggest calling it 'BibRecord' because it will be used by the Bibliographic pipeline and only holds metadata
Nonbib data now includes NED identifiers, and they are supported by AdsDataSqlSync. The protobuf needs to be updated to pass this field to the master pipeline.
this is the structure of metadata
https://github.com/adsabs/ADSimportpipeline/blob/master/aip/libs/solr_updater.py#L52