Issue for the NIAC to discuss (no code)
/polls Option1 'Option 2' "Option 3"
When proposing a new class (application or base), there is the contributed definitions staging area. But to add fields to a base class or further develop an existing application definition, where can that be done so that it gets visibility in the community?
Pull requests are a possibility, but very much hidden from public view.
This relates somewhat to the discussion on versions, #1.
There are coming up or likely to come up for votes:
neutron event data may not have anything that's usefully plottable without serious processing
Report from Dectris, plus other changes that are proposed for NXmx
See nexusformat/definitions#602
Proposed for discussion by @mkoennecke
An issue (yes, another one) has come up regarding strings in NeXus. HDF5 now allows writing variable length strings, and some popular software, like h5py, actually uses this feature. The problem is that the variable length string API in HDF5 is not very well thought out. The existence of both normal (fixed length) strings and variable length strings in a NeXus file therefore imposes a burden on reading code that does not use the NeXus API: it first has to check whether a string is variable length and then read the data accordingly.
I think we need to discuss this and decide whether we standardize on normal strings or allow both forms and impose the additional burden of reading string data on users.
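To make the reader-side burden concrete, here is a minimal sketch of the normalization step a reader might need. It assumes the I/O library may hand back either bytes or str depending on how the string was stored, and it also unwraps the length-1 string arrays mentioned in the NX_CHAR documentation. The function name `read_nx_char` is hypothetical, and plain Python values stand in for what an HDF5 library would return.

```python
def read_nx_char(value, encoding="utf-8"):
    """Return a Python str for the string forms a reader may encounter."""
    # Unwrap a rank-1, length-1 container (a list or tuple stands in for an array).
    if isinstance(value, (list, tuple)) and len(value) == 1:
        value = value[0]
    # Some libraries return raw bytes, possibly null-padded for fixed-length storage.
    if isinstance(value, bytes):
        return value.rstrip(b"\x00").decode(encoding)
    # Other storage forms may already arrive decoded.
    if isinstance(value, str):
        return value
    raise TypeError(f"unsupported string representation: {type(value).__name__}")


print(read_nx_char(b"NXentry\x00\x00"))
print(read_nx_char(["NXdata"]))
```

Every reader that bypasses the NeXus API would need some equivalent of this function, which is the cost being weighed against standardizing on one string form.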
A way to specify how the axes are to be displayed is missing.
Adding additional qualifiers would not break any existing code, and it would let users see the data the way they expect.
Using predefined plot types could be one way:
NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type="semilogy"
but that is not generic enough, while something more generic might not be acceptable:
NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type=["linear", "logarithmic"] (list of representations to be associated to each of the axes with the last one to be associated to the signal)
Any proposal that solves this problem is welcome.
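As an illustration only, the two variants above could be resolved by a plotting client along these lines. The `plot_type` attribute is the proposal under discussion, not part of NeXus. The table of predefined names and the function `resolve_scales` are assumptions made for this sketch, and a plain dict stands in for the NXdata group attributes.

```python
# Hypothetical predefined names mapped to (axis scale, signal scale) for 1-D data.
PREDEFINED = {
    "linear": ("linear", "linear"),
    "semilogx": ("logarithmic", "linear"),
    "semilogy": ("linear", "logarithmic"),
    "loglog": ("logarithmic", "logarithmic"),
}


def resolve_scales(attrs):
    """Map each axis name and the signal name to a display scale."""
    names = list(attrs["axes"]) + [attrs["signal"]]
    plot_type = attrs.get("plot_type", "linear")
    if isinstance(plot_type, str):
        # Predefined form: only meaningful here for one axis plus the signal.
        if len(names) != 2:
            raise ValueError("predefined plot types assume exactly one axis")
        scales = PREDEFINED[plot_type]
    else:
        # Generic form: one entry per axis, with the last entry for the signal.
        scales = list(plot_type)
        if len(scales) != len(names):
            raise ValueError("plot_type must list one scale per axis plus the signal")
    return dict(zip(names, scales))


print(resolve_scales({"signal": "I", "axes": ["q"], "plot_type": "semilogy"}))
```

A client that does not know the attribute simply ignores it, which is why the addition would not break existing code.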
In more and more places data is streamed, i.e. you get possibly multidimensional data with time stamps.
NeXus ought to have a plan for how to store this kind of data. Previous work includes neutron event storage, but we have two different versions of this. A more generalized version is needed.
Should we paint the bike shed?
/polls red green 'no color'
Proposed changes are here:
Do we want to go ahead with that?
Obviously versioning support would help.
Typical use cases:
The combination of multiple signals with the use of NXprocess would solve many provenance issues.
Proposed Implementation
In the past, everything was ready for the implementation by allowing multiple datasets with the signal attribute in an NXdata group and by playing with the attribute values "1", "2", "3". That approach is no longer possible.
- Proposal 1: Allow signal to be an array of strings with the first element of the array to be treated as it is now the case with single signals.
- Proposal 2: Define a signals attribute being an array of strings with the first element of the array to be treated as it is now the case with single signals.
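A reader could support either proposal with very little code, as this sketch suggests. The `signals` attribute exists only in Proposal 2. The function name `split_signals` is hypothetical, as is the choice to prefer `signals` over `signal` when both are present. A plain dict stands in for the NXdata attributes.

```python
def split_signals(attrs):
    """Return (primary, auxiliaries) for a single name or a list of names."""
    # Assumption: a `signals` attribute (Proposal 2) wins over `signal` if both exist.
    value = attrs.get("signals", attrs.get("signal"))
    if value is None:
        raise KeyError("no signal attribute found")
    # A plain string is the current single-signal case.
    if isinstance(value, str):
        return value, []
    names = list(value)
    if not names:
        raise ValueError("signal list is empty")
    # The first element is treated exactly like today's single signal.
    return names[0], names[1:]


print(split_signals({"signal": ["I", "I_background"]}))
```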
Proposed by @rayosborn
See nexusformat/definitions#671 and nexusformat/definitions#544 and nexusformat/definitions#791
proposed for discussion by @mkoennecke
Relates to the research project: nexusformat/definitions#382
In #42, Cleanup of issues: triage into still relevant, minor (someone just edit this), to be discussed
proposed by @mkoennecke
Changes discussed in
nexusformat/definitions#433
The code uploaded to the hdf5xmp repository provides a framework to add metadata (and thumbnails) to HDF5 files (or in sidecar .xmp files) and a set of plugins to provide included thumbnails to file browsers. Does anybody foresee any issues with the approach used? Does the NIAC support furthering this implementation?
See nexusformat/definitions#789
Issue originally raised by @vasole proposed for NIAC202 discussion by @prjemian
I have posted the message below to [email protected] but apparently the message did not get through or got stopped somewhere.
Dear colleagues,
I think there is a problem with the BasicWriter.py example found at:
http://download.nexusformat.org/sphinx/examples/h5py/index.html
Unfortunately, that example is targeting Python 2.x and that can lead to
misunderstandings.
If I read the documentation, NX_CHAR fields should be UTF-8 encoded strings.
Running the example, all the attributes are read as bytes and not as
UTF-8 encoded strings when read under Python 3.
The worst thing is that if the example is run under Python 3 (just by
adding the missing parentheses to the print statements), the generated
file is different because indeed, strings are strings and not bytes
without any encoding.
Please clarify whether NX_CHAR fields should correspond to UTF-8 encoded
strings or to byte strings.
The example should be fixed and all the attribute strings should have a
lower case U character prepended in order to:
Best regards,
Armando
Diffractometers are relatively common instruments and a very common operation is the conversion from data to Q or reciprocal space based on the position of the motors associated to the diffractometer. Currently one has to look for all the positioners in an entry to find the common mnemonics used as function of the diffractometer geometry (phi, chi, theta, twotheta, mu, delta, ...).
Define a new base class NXdiffractometer to be added to NXinstrument family.
Provide the name of the implemented geometry. We could rely on articles and/or the names used by other software like SPEC for the geometry name and/or the motor names in order not to reinvent the wheel.
Provide the reference (or the URL) where the geometry is described. I would start defining the classics:
and let the community enlarge the list with the different variants.
I think we have enough time to contact the diffractometer users at our respective facilities for their feedback prior to the NIAC meeting.
Anyone wanting to discuss issues at the 2020-2 NeXus Code Camp is asked to please add a comment here describing the topics they want to work on together with the core NeXus developers. We mostly want to be prepared for the number and types of issues being brought to us.
Proposed by @phyy-nx
nexusformat/definitions#711
I have implemented validation of general NeXus files in cnxvalidate in branch issue-13. This needs to be reviewed, possibly modified, and then merged with master.
Review items in https://github.com/nexusformat/definitions/milestone/9
proposed for discussion by @prjemian
A problem occurs when we do a vote but not enough NIAC members are in attendance or respond in an email vote. We do not have a quorum or any other ruling in the NIAC constitution.
For now we will continue with the current procedure, which determines the result from the votes received. But at the next NIAC meeting we should either write this procedure into the constitution or come up with a quorum or another solution for this issue.
Discuss reserved prefixes for class, field and attributes
See nexusformat/definitions#769 and nexusformat/definitions#770
Proposed for discussion by @mkoennecke
These will be in contributed definitions soon
Anyone wanting to discuss issues at the 2020 NIAC meeting is asked to please add a comment here describing the topics they want brought before the NIAC. We mostly want to be prepared for the number and types of issues being brought to us.
I would like to draw your attention to the fact that by writing changing or free-to-interpret specifications you are cooking up a messy format:
NX_CHAR: any string representation. All strings are to be encoded in UTF-8. Includes fixed-length strings, variable-length strings, and string arrays. Some file writers write strings as a string array of rank 1 and length 1. Clients should be prepared to handle such strings.
That clients should be ready to handle such strings does not mean that they are acceptable. I find it legitimate for readers to make life difficult for the users of such files, so that they in turn make sure the writers write proper files.
The NeXus API was writing things properly. At least I never encountered such discrepancies in old files (remember the 2010 workshop at the ESRF?). If people decide to use other things, they have to write correctly and not put the burden on the readers.
A specification can be bad, but a changing specification is even worse.
When I think about the lengthy discussion concerning auxiliary_signals that could have been avoided by allowing NXdata@signal to be a list of strings instead of a string, I get really angry. It was said that allowing a list of strings could break code, when according to the above specification the clients should already have been prepared!!!!
I really wonder how many NIAC members are actually developing software for other facilities because as a developer I cannot understand some decisions.
Hi NIAC, here is a little input from McStas for the NOBUGS 2018 satellite:
McStas has been an early adopter: NeXus has been in the McStas code tree since about 2003, using the napi.h C API. It was therefore not so nice to hear that NeXus will eventually deprecate this interface, even though our napi use is relatively basic and can likely be replaced by direct calls to HDF5.
Our code is very platform independent and is deployed by our users on Windows, macOS and Linux machines. It is therefore very important that good deployment strategies for the NeXus libraries exist for all of those platforms, ideally through the availability of installer packages.
Specifically: The latest available macOS installer on GitHub is for NeXus 4.3.0 and built for 32bit systems...! (https://github.com/nexusformat/code/releases/download/4.3.0/NeXus-4.3.0.dmg)
-> It is beyond the average capability of a McStas user to build NeXus locally, and I don't feel it should be my task to deploy NeXus with McStas. :-)
A current use of NeXus in McStas is the transfer of instrument geometry and event data to Mantid, including a 'hack' that generates an XML-based IDF. Any direct support for Mantid-oriented geometry information in NeXus will be welcomed, especially if it comes via e.g. a C API...
Best,
Peter Willendrup on behalf of the McStas / McXtrace team
in contrib from APS
@prjemian do you want that discussed?
The default attribute associated to NXentry has proven to be an excellent idea.
NXdata groups can be present at any level (particularly common is their use inside NXprocess and even NXdetector). It would be desirable to be able to check whether those groups have a default attribute indicating the default plot associated with them. That would again greatly improve the user experience.
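A sketch of how a viewer might use such an attribute at every level: it follows `default` from group to group until it reaches an NXdata group. The nested dict layout (`attrs` and `children` keys) and the function name `follow_default` are assumptions made for illustration and do not correspond to any particular library.

```python
def follow_default(group, path=""):
    """Follow `default` attributes downward and return the NXdata path, or None."""
    attrs = group["attrs"]
    if attrs.get("NX_class") == "NXdata":
        return path or "/"
    target = attrs.get("default")
    children = group.get("children", {})
    # Stop if this level has no default or it names a group that does not exist.
    if target is None or target not in children:
        return None
    return follow_default(children[target], f"{path}/{target}")


# Hypothetical file layout: root -> entry -> reduction (NXprocess) -> data (NXdata).
root = {
    "attrs": {"default": "entry"},
    "children": {
        "entry": {
            "attrs": {"NX_class": "NXentry", "default": "reduction"},
            "children": {
                "reduction": {
                    "attrs": {"NX_class": "NXprocess", "default": "data"},
                    "children": {
                        "data": {"attrs": {"NX_class": "NXdata", "signal": "I"}, "children": {}},
                    },
                },
            },
        },
    },
}
print(follow_default(root))
```

The same function can be started from any intermediate group, for example an NXprocess the user has selected in a tree view.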
In recent presentations at the Research Data Alliance meeting in Berlin, the subject of uncertainties associated with the data was mentioned, in particular in the context of application definitions.
Reading the documentation at:
http://download.nexusformat.org/doc/html/design.html?higlight=uncertainty#design-fields
and
http://download.nexusformat.org/doc/html/classes/base_classes/NXdata.html#nxdata
I see two possible ways of specifying uncertainties for a dataset, but it is not clear to me whether appending _errors to a dataset name is to be considered the official generic solution. To me it looks more like an example. If it is, please make that absolutely clear in the documentation and this issue can be closed.
Decide about the proper way to associate uncertainties to datasets:
I can see arguments for and against each of the above and I could easily defend any of them.
I only ask you to take into account in your evaluation the fact that most likely we'll be dealing with links (internal or even external).
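For reference, the naming convention mentioned above (appending _errors to the dataset name) amounts to a lookup like the following. This is a sketch of that one option only, with a plain dict standing in for a group and a hypothetical function name. It is not a statement of which solution is official.

```python
def find_uncertainties(group, name, suffix="_errors"):
    """Return the uncertainties stored next to `name` by naming convention, or None."""
    if name not in group:
        raise KeyError(name)
    errors = group.get(name + suffix)
    # Uncertainties are expected to match the data element by element.
    if errors is not None and len(errors) != len(group[name]):
        raise ValueError(f"{name + suffix} does not match the length of {name}")
    return errors


group = {"I": [10.0, 12.0, 9.0], "I_errors": [3.2, 3.5, 3.0], "q": [0.1, 0.2, 0.3]}
print(find_uncertainties(group, "I"))
print(find_uncertainties(group, "q"))
```

Note that a name-based lookup presumably only works with links if the uncertainties are linked into the same group under the matching local name, which is one reason the link case deserves attention.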
location to upload examples for the telco
agenda: http://wiki.nexusformat.org/Telco_20170117
connection: https://plus.google.com/hangouts/_/j72qwlvegiojjpt3a36pfhow5ua
This is a suggestion to use NeXus for electronic logbooks. There are some advantages to this. This is a discussion topic to figure out whether it is a good idea and to decide whether we want to go down this road at all.
in contributed from DLS
http://download.nexusformat.org/doc/html/classes/contributed_definitions/NXcanSAS.html#nxcansas
see issue nexusformat/definitions#420
I've added a new milestone to the code repository for the 4.4.4 point release. In addition I added a couple of issues to this milestone. We could discuss this during the code camp.
This is the NeXus extension proposed and developed by James Parkhurst at Diamond for recording integrated reflection intensities after data reduction. The extension is already needed by scientists doing serial crystallography to decrease the number of files in directories produced by processing thousands of individual stills at once.
Aaron Brewster brought this up. He encountered some problem storing processed MX data in this group.
There is no infrastructure to support this. We only work off the latest version. With science progressing, this will not be viable in the long term.
After discussion with jkrueger1 at FRM-2 there were the following change requests for NXdirecttof.
This is regarding applying NXdirecttof to the FRM-2 instrument TOFTOF.
Regarding #10, there is no record of a final NIAC vote on whether to accept variable length strings.
Question came up today in the context of a file written by Diamond Light Source with a link (target attribute written as variable length string) that was not recognized as a NeXus link by NeXpy.
this is work in progress
It is common for data to be stored as an unnormalized signal with a separate normalization array. This allows multiple data sets to be merged or rebinned reliably when the weights are not defined by the uncertainties. It is common in Mantid, where NXdata groups stored within MDHistoWorkspace entries contain a numevents array that should be used to normalize the data array. The X-ray software CCTW stores signals and weights of individual runs separately before merging them together.
I propose that we add an optional attribute normalization to the NXdata class attributes, to accompany a signal attribute. When it is present, plotting software should divide the signal array by the normalization array before plotting or performing other operations on the normalized data. This should be specified at the group level because the arrays are often stored in separate files, so placing the attribute on the array itself might not be possible.
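What a plotting client would do with the proposed attribute can be sketched as follows. The `normalization` attribute is the proposal itself and not existing NeXus. The function name, the use of plain lists and dicts in place of arrays and groups, and the choice to return None for bins with zero weight are all assumptions of this sketch.

```python
def normalized_signal(group, attrs):
    """Divide the signal by the array named in the proposed `normalization` attribute."""
    signal = group[attrs["signal"]]
    norm_name = attrs.get("normalization")
    if norm_name is None:
        # No normalization requested: use the signal as stored.
        return list(signal)
    weights = group[norm_name]
    if len(weights) != len(signal):
        raise ValueError("signal and normalization arrays differ in length")
    # Assumption: bins with zero weight are reported as None instead of dividing by zero.
    return [s / w if w else None for s, w in zip(signal, weights)]


group = {"data": [10.0, 9.0, 0.0], "numevents": [5, 3, 0]}
attrs = {"signal": "data", "normalization": "numevents"}
print(normalized_signal(group, attrs))
```

Because the attribute lives on the group, the lookup works the same way whether the two arrays are local or linked in from separate files.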
Final vote on recently revised NXmx for Gold Standard being used by DECTRIS for their FileWriter 2. See Bernstein HJ, Förster A, Bhowmick A, Brewster AS, Brockhauser S, Gelisio L, Hall DR, Leonarski F, Mariani V, Santoni G, Vonrhein C. Gold Standard for macromolecular crystallography diffraction data. IUCrJ. 2020 Sep 1;7(5).
proposed for discussion by @yayahjb
I propose that we add an optional string attribute mask to the NXdata class attributes, to accompany a signal attribute. When it is present, the signal array should be masked by an integer or boolean array specified by the attribute. Values of 1 in the mask array would correspond to masked data. For example, when read by a Python package, the Numpy array containing the signal data could be converted to a MaskedArray using the mask array. This should be specified at the group level because the mask array could be stored in a separate file, so placing the attribute on the array itself might not be possible.
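The masking rule described here (values of 1 mark masked data) can be sketched without any array library. This assumes the attribute is named `mask`, as the reference to a mask array suggests. Plain lists and dicts stand in for arrays and groups, and masked points are represented by None. The function name is hypothetical.

```python
def masked_signal(group, attrs):
    """Apply the integer or boolean mask named by the proposed `mask` attribute."""
    signal = group[attrs["signal"]]
    mask_name = attrs.get("mask")
    if mask_name is None:
        # No mask requested: every point is valid.
        return list(signal)
    mask = group[mask_name]
    if len(mask) != len(signal):
        raise ValueError("signal and mask arrays differ in length")
    # A value of 1 (or True) marks a masked point; 0 (or False) keeps it.
    return [None if m else s for s, m in zip(signal, mask)]


group = {"data": [4.0, 250.0, 5.0], "pixel_mask": [0, 1, 0]}
attrs = {"signal": "data", "mask": "pixel_mask"}
print(masked_signal(group, attrs))
```

With an array library the last step would instead build a masked array from the two inputs, as the proposal describes.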
This is functionally similar to #34.