nexusformat / niac

Issue for the NIAC to discuss (no code)

niac's Issues

[Test] - What do you think ?

/polls Option1 'Option 2' "Option 3"

Infrastructure to deal with changes to base classes or application definitions

When proposing a new class (application or base), there is the contributed definitions staging area. But to add fields to a base class or further develop an existing application definition, where can that be done so that it gets visibility in the community?

Pull requests are a possibility, but very much hidden from public view.

This relates somewhat to the discussion on versions, #1.

New proposed definitions (2016)

These are coming up, or are likely to come up, for votes:

NXcanSAS, see issue nexusformat/definitions#420

NXdata required for all cases?

neutron event data may not have anything that's usefully plottable without serious processing

Discuss mutations of the Eiger format

Report from Dectris, plus other changes that are proposed for NXmx

[NIAC2020] Suggested improvements to the NXdata base class definition

See nexusformat/definitions#602

Proposed for discussion by @mkoennecke

Strings in NeXus

An issue (yes, another one) has come up regarding strings in NeXus. HDF5 now allows writing variable length strings. Some popular software, like h5py, actually uses this feature. The problem is that the variable length strings API in HDF5 is not very well thought out. Thus the existence of both normal strings and variable length strings in a NeXus file imposes a burden on reading code which does not use the NeXus API. The burden is that you have to check first whether this is a variable length string or not and then proceed to read the data differently.

I think we need to discuss this and decide if we standardize on normal strings or if we allow both forms of strings and impose the additional burden for reading string data on users.
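The extra check described above might look like the following on the reading side. This is only a minimal sketch, not part of any NeXus API: the helper name `as_text` is hypothetical, and it assumes the reading library hands back fixed-length strings as `bytes` (possibly NUL-padded) and variable-length strings as `str`.

```python
def as_text(value):
    """Return a Python str for either string form (hypothetical helper)."""
    # Assumption: fixed-length strings arrive as bytes, possibly NUL-padded.
    if isinstance(value, bytes):
        return value.rstrip(b"\x00").decode("utf-8")
    # Assumption: variable-length strings arrive already decoded.
    if isinstance(value, str):
        return value
    raise TypeError("not a string value: %r" % (value,))


# Both forms end up as the same text for the caller.
fixed = as_text(b"NXentry\x00\x00")
variable = as_text("NXentry")
```

Standardizing on one form would let readers drop the branch; allowing both means every reader needs something like it.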

[NIAC 2018 Proposals] Missing NXdata indications about plot type (linear, semilogx, semilogy, loglog ...)

Missing a way to specify how the axes are to be displayed.

Adding additional qualifiers would not break any existing code, and it would let users see the data displayed the way they expect.

Using predefined plot types could be one way:

NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type="semilogy"

but that is not generic enough, while something more generic might not be acceptable:

NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type=["linear", "logarithmic"] (list of representations to be associated to each of the axes with the last one to be associated to the signal)

Any proposal that solves this problem is welcome.
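To compare the two variants above, here is a rough sketch of how a plotting client might interpret them. Nothing here is ratified: the function name `resolve_scales` and the preset table are assumptions made for illustration, and the list form follows the ordering described in the issue (one entry per axis, the last entry for the signal).

```python
# Hypothetical presets: (scale of the single axis, scale of the signal).
PRESETS = {
    "linear": ("linear", "linear"),
    "semilogx": ("logarithmic", "linear"),
    "semilogy": ("linear", "logarithmic"),
    "loglog": ("logarithmic", "logarithmic"),
}


def resolve_scales(signal, axes, plot_type):
    """Map every axis name and the signal name to a scale."""
    if isinstance(plot_type, str):
        # Predefined plot types only make sense for a single axis.
        if len(axes) != 1:
            raise ValueError("preset plot types need exactly one axis")
        scales = list(PRESETS[plot_type])
    else:
        # Generic form: one scale per axis, then one for the signal.
        scales = list(plot_type)
        if len(scales) != len(axes) + 1:
            raise ValueError("need one scale per axis plus one for the signal")
    return dict(zip(list(axes) + [signal], scales))


preset = resolve_scales("I", ["q"], "semilogy")
generic = resolve_scales("I", ["q"], ["linear", "logarithmic"])
```

In this sketch both forms resolve to the same per-name scales, so a client could support the generic list and treat the presets as shorthand.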

Streaming NeXus

At more and more places data is streamed, i.e. you get possibly multidimensional data with time stamps.
NeXus ought to have a plan for how to store this kind of data. Previous work includes neutron event storage, but we have two different versions of this. A more generalized version is needed.

test poll: the bikeshed color

Should we paint the bike shed?

/polls red green 'no color'

Discuss revision of existing classes to use 2014 attributes

Proposed changes are here:

nexusformat/definitions#443

Do we want to go ahead with that?
Obviously versioning support would help.

[NIAC 2018 Proposals] NXdata@auxiliary_signals (It was How to specify multiple signals?)

Typical use cases:

Plotting several counters at scanned positions.
Fitting raw data and wanting to generate a plot with both the raw data and the fitted data
Elemental mapping where a map is generated for each element
....

The combination of multiple signals with the use of NXprocess would solve many provenance issues.

Proposed Implementation

In the past, everything was ready for the implementation by allowing multiple datasets with the signal attribute in an NXdata group and by playing with the attribute values "1", "2", "3". That approach is no longer possible.

~~- Proposal 1: Allow signal to be an array of string with the first element of the array to be treated as it is now the case with single signals.~~
~~- Proposal 2: Define a signals attribute being an array of strings with the first element of the array to be treated as it is now the case with single signals.~~

Proposal 3: Define a new set of groups NXplot1d, NXplot2d, NXplot3d, ... allowing so.
Proposal 4: Allow a new NXdata attribute named auxiliary_signals containing the additional datasets that can be plotted alongside the signal. The proposal is flexible enough that auxiliary_axes could be considered in the future, but that is not needed for the main goal of this issue.
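As an illustration of Proposal 4, a reader could collect the names to plot roughly as follows. This is a sketch under assumptions: the NXdata attributes are modeled as a plain dict, and the helper name `plottable_names` is hypothetical.

```python
def plottable_names(nxdata_attrs):
    """Return the main signal first, followed by any auxiliary signals."""
    signal = nxdata_attrs["signal"]
    auxiliary = nxdata_attrs.get("auxiliary_signals", [])
    # Assumption: tolerate a single name given as a plain string.
    if isinstance(auxiliary, str):
        auxiliary = [auxiliary]
    # Skip the main signal if it is repeated in the auxiliary list.
    return [signal] + [name for name in auxiliary if name != signal]


names = plottable_names({"signal": "raw", "auxiliary_signals": ["fit", "residual"]})
```

Existing readers that only look at signal keep working, because the main signal is still a single string.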

[NIAC2020] Discuss future of python in NeXus API language bindings

Proposed by @rayosborn

[NIAC2020] clarify naming conventions for fields etc.

See nexusformat/definitions#671 and nexusformat/definitions#544 and ~~nexusformat/definitions#791~~

proposed for discussion by @mkoennecke

Discuss relative virtues of "interfaces" versus "features"

Relates to the research project: nexusformat/definitions#382

[NIAC2020] issue triage

In #42, Cleanup of issues: triage into still relevant, minor (someone just edit this), to be discussed

proposed by @mkoennecke

update NXsample

Changes discussed in
nexusformat/definitions#433

[NIAC 2018 Discussion] Discuss HDF5XMP metadata

The code uploaded to the hdf5xmp repository provides a framework to add metadata (and thumbnails) to HDF5 files (or in sidecar .xmp files) and a set of plugins to provide included thumbnails to file browsers. Does anybody foresee any issues with the approach used? Does the NIAC support furthering this implementation?

[NIAC2020] Missing @creator_version in NXroot

See nexusformat/definitions#789

Issue originally raised by @vasole; proposed for NIAC2020 discussion by @prjemian

NX_CHAR

I have posted the message below to [email protected] but apparently the message did not get through or got stopped somewhere.

Dear colleagues,

I think there is a problem with the BasicWriter.py example found at:

http://download.nexusformat.org/sphinx/examples/h5py/index.html

Unfortunately, that example is targeting Python 2.x and that can lead to
misunderstandings.

If I read the documentation, NX_CHAR fields should be UTF-8 encoded strings.

Running the example, all the attributes are read as bytes and not as
UTF-8 encoded strings when read under Python 3.

The worst thing is that if the example is run under Python 3 (just by
adding the missing parentheses to the print statements), the generated
file is different because indeed, strings are strings and not bytes
without any encoding.

Please clarify whether NX_CHAR fields should correspond to UTF-8 encoded
strings or to byte strings.

The example should be fixed and all the attribute strings should have a
lower case U character prepended in order to:

Comply with the documentation
Generate same output under Python2 and Python3
Generate same input under Python2 and Python3

Best regards,

Armando
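The difference the message describes can be reproduced without any HDF5 library. The following sketch only uses built-in Python 3 behavior; the function name `roundtrip` and the sample text are made up for illustration.

```python
def roundtrip(text):
    """Encode text as UTF-8 bytes and decode it again."""
    encoded = text.encode("utf-8")      # what a byte-string writer would store
    decoded = encoded.decode("utf-8")   # what a reader must do to get text back
    return encoded, decoded


# The u prefix is accepted by both Python 2.x and Python 3.3+.
encoded, decoded = roundtrip(u"Ångström")

# Under Python 3, bytes and str never compare equal, even for ASCII content.
same_without_decoding = (b"NXentry" == u"NXentry")
```

Here `encoded` is 10 bytes long while the text has 8 characters, which is why a reader needs to know whether it is looking at UTF-8 text or raw bytes.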

[NIAC 2018 Proposals] Add NXdiffractometer or NXgoniometer to NXinstrument

Rationale

Diffractometers are relatively common instruments, and a very common operation is the conversion from data to Q or reciprocal space based on the position of the motors associated with the diffractometer. Currently one has to look through all the positioners in an entry to find the common mnemonics used as a function of the diffractometer geometry (phi, chi, theta, twotheta, mu, delta, ...).

Proposal

Define a new base class NXdiffractometer to be added to NXinstrument family.
Provide the name of the implemented geometry. We could rely on articles and/or the names used by other software like SPEC for the geometry name and/or the motor names in order not to reinvent the wheel.
Provide the reference (or the URL) where the geometry is described. I would start defining the classics:
- W.R. Busing and H.A. Levy. Acta Cryst. 22 (1967) 457-464. (aka. FOURC)
- M. Lohmeier and E. Vlieg. J. Appl. Cryst. 26 (1993) 706-716 (aka. SIXC)
- H. You, J. Appl. Cryst. 32 (1999) 614-623 (aka. 4S+2D or PSIC)

and let the community enlarge the list with the different variants.

I think we have enough time to contact the diffractometer users at our respective facilities for their feedback prior to the NIAC meeting.

Code Camp 2020-2 topics

Anyone wanting to discuss issues at the 2020-2 NeXus Code Camp is asked to please add a comment here describing the topics they want to work on together with the core NeXus developers. We mostly want to be prepared for the number and types of issues being brought to us.

[NIAC 2018 Proposal] NXptycho definition

[NIAC2020] Math support in NeXus

Proposed by @phyy-nx
nexusformat/definitions#711

cnxvalidate: Review validation of NeXus files without application definitions

I have implemented validation of general NeXus files in cnxvalidate in branch issue-13. This needs to be reviewed, possibly modified, and then merged with master.

[NIAC2020] NXDL 2020.10 release

Review items in https://github.com/nexusformat/definitions/milestone/9

proposed for discussion by @prjemian

Clarify Voting Procedures

A problem occurs when we hold a vote but not enough NIAC members are in attendance or respond to an email vote. The NIAC constitution has no quorum rule or any other ruling for this case.

For now, we will continue to determine the result of a vote from the votes received. But at the next NIAC meeting we should either write this procedure into the constitution or come up with a quorum or another solution for this issue.

[NIAC2020] reserved prefixes for names

Discuss reserved prefixes for class, field, and attribute names

See nexusformat/definitions#769 and nexusformat/definitions#770

Proposed for discussion by @mkoennecke

[NIAC 2018 Proposals] Look at NXquadric classes and examples

These will be in contributed definitions soon

NIAC 2020 topics

Anyone wanting to discuss issues at the 2020 NIAC meeting is asked to please add a comment here describing the topics they want brought before the NIAC. We mostly want to be prepared for the number and types of issues being brought to us.

[NIAC 2018 Discussion] Messy specifications

I would like to draw your attention to the fact that by setting changing or free-to-interpret specifications you are cooking up a messy format:

NX_CHAR: any string representation. All strings are to be encoded in UTF-8. Includes fixed-length strings, variable-length strings, and string arrays. Some file writers write strings as a string array of rank 1 and length 1. Clients should be prepared to handle such strings.

That clients should be ready to handle such strings does not mean that they are acceptable. I find it legitimate for readers to make life difficult for the users of such files, so that they in turn make sure the writers write proper files.

The NeXus API was writing things properly. At least I never encountered such discrepancies in old files (remember the 2010 workshop at the ESRF?). If people decide to use other things, they have to write correctly and not put the burden on the readers.

A specification can be bad, but a changing specification is even worse.

When I think about the lengthy discussion concerning auxiliary_signals, which could have been avoided by allowing NXdata@signal to be a list of strings instead of a string, I get really angry. It was said that allowing a list of strings could break code, yet according to the above specification the clients should already have been prepared!!!!

I really wonder how many NIAC members are actually developing software for other facilities because as a developer I cannot understand some decisions.

[NIAC 2018 Discussion] - Input from the McStas team

Hi NIAC, here is a little input from McStas for the NOBUGS 2018 satellite:

McStas has been an early adopter, with NeXus in the McStas code tree since ~2003, using the napi.h C API. It was therefore not so nice to hear that NeXus will eventually deprecate this interface, even though our napi use is relatively basic and can likely be replaced by direct calls to HDF5.
Our code is very platform independent and is deployed to both Windows machines, Mac’s and Linux’es by our users. It is therefore very important that good deployment strategies for the NeXus libs exist for all of those platforms, ideally by availability of installer packages.
Specifically: The latest available macOS installer on GitHub is for NeXus 4.3.0 and built for 32bit systems...! (https://github.com/nexusformat/code/releases/download/4.3.0/NeXus-4.3.0.dmg)
-> It is beyond the average capability of a McStas user to build NeXus locally - and I don’t
feel it should be my task to deploy NeXus with McStas. :-)
A current use of NeXus in McStas is for the transfer of instrument geometry and event data to Mantid, including a 'hack' that generates an XML-based IDF. Any direct support for Mantid-oriented geometry information in NeXus will be welcomed, especially if this happens via the availability of e.g. a C API...

Best,
Peter Willendrup on behalf of the McStas / McXtrace team

look at NXspecdata

in contrib from APS
@prjemian do you want that discussed?

[NIAC 2018 Proposals] Extend the default attribute to any NeXus group

The default attribute associated to NXentry has proven to be an excellent idea.

NXdata groups can be present at any level (their use inside NXprocess and even NXdetector is particularly common). It would be desirable to be able to check whether those groups have a default attribute indicating the default plot associated with them. That would again greatly improve the user experience.
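A client that honors default on every group could walk the chain as sketched below. The nested-dict tree, the key names `attrs` and `groups`, and the function `follow_default` are all assumptions for illustration; a real reader would operate on HDF5 groups instead.

```python
def follow_default(group):
    """Follow 'default' attributes downward and return the resulting path."""
    path = []
    node = group
    while True:
        target = node.get("attrs", {}).get("default")
        if target is None:
            # No further default: this is the group to plot.
            return "/".join(path)
        children = node.get("groups", {})
        if target not in children:
            raise KeyError("default points to a missing group: %s" % target)
        path.append(target)
        node = children[target]


# Hypothetical file layout: root -> entry -> process -> data.
tree = {
    "attrs": {"default": "entry"},
    "groups": {
        "entry": {
            "attrs": {"default": "process"},
            "groups": {
                "process": {
                    "attrs": {"default": "data"},
                    "groups": {"data": {"attrs": {"signal": "I"}, "groups": {}}},
                }
            },
        }
    },
}
default_path = follow_default(tree)
```

With the attribute allowed on any group, the same loop works whether the NXdata sits directly in the NXentry or deeper inside an NXprocess.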

Resolve clash with offset where two uses were ratified

nexusformat/definitions#273 (comment)

[NIAC 2018 Proposals] Clarify/generalize use of uncertainties

Rationale

In recent presentations at the Research Data Alliance meeting in Berlin, the subject of uncertainties associated with the data was mentioned, in particular in the frame of application definitions.

Reading the documentation at:

http://download.nexusformat.org/doc/html/design.html?higlight=uncertainty#design-fields

and

http://download.nexusformat.org/doc/html/classes/base_classes/NXdata.html#nxdata

I see two possible ways of specifying uncertainties for a dataset, but it is not clear to me whether appending _errors to a dataset name is to be considered the official generic solution. To me it looks more like an example. If it is the official solution, please make that absolutely clear in the documentation and this issue can be closed.

Proposal

Decide about the proper way to associate uncertainties to datasets:
- One way (only adding _errors or attribute uncertainties recommended)
- Two ways (both ways recommended)
- No way (specific to application definitions and therefore each application definition will decide)
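To make the "two ways" option concrete, a reader supporting both conventions might look them up as in the sketch below. The dict layout, the function name `find_uncertainties`, and the precedence (explicit attribute first, then the _errors suffix) are assumptions, not something the documentation prescribes.

```python
def find_uncertainties(fields, name):
    """Return the name of the dataset holding uncertainties for `name`, or None."""
    candidates = []
    # Convention 1: an 'uncertainties' attribute on the dataset itself.
    explicit = fields[name].get("attrs", {}).get("uncertainties")
    if explicit:
        candidates.append(explicit)
    # Convention 2: a sibling dataset named <name>_errors.
    candidates.append(name + "_errors")
    for candidate in candidates:
        if candidate in fields:
            return candidate
    return None


by_suffix = find_uncertainties({"I": {"attrs": {}}, "I_errors": {"attrs": {}}}, "I")
by_attribute = find_uncertainties(
    {"I": {"attrs": {"uncertainties": "sigma"}}, "sigma": {"attrs": {}}}, "I"
)
```

Note that the suffix lookup relies on the name under which a dataset appears in the group, which is where links (internal or external) complicate matters.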

My View

I can see arguments for and against each of the above, and I could easily defend any of them.

I only ask you to take into account in your evaluation the fact that most likely we'll be dealing with links (internal or even external).

2017-01-17 telco

location to upload examples for the telco

agenda: http://wiki.nexusformat.org/Telco_20170117
connection: https://plus.google.com/hangouts/_/j72qwlvegiojjpt3a36pfhow5ua

NIAC 2018: NeXus as logbook format?

This is a suggestion to use NeXus for electronic logbooks. There are some advantages to this. This is a discussion topic to figure out whether it is a good idea and to decide whether we want to go down this road at all.

look at NXcontainer

in contributed from DLS

propose to ratify NXcanSAS as application definition

http://download.nexusformat.org/doc/html/classes/contributed_definitions/NXcanSAS.html#nxcansas

see issue nexusformat/definitions#420

current HTML documentation snapshot: https://github.com/nexusformat/NIAC/blob/master/2016/NXcanSAS.pdf
structure: https://github.com/canSAS-org/NXcanSAS_examples/blob/master/resources/NXcanSAS_outline.txt
example data files: https://github.com/canSAS-org/NXcanSAS_examples/tree/master/1d_standard
structure of one of these files: https://github.com/canSAS-org/NXcanSAS_examples/blob/master/1d_standard/structure/cs_collagen.h5.txt

NAPI point release 4.4.4

I've added a new milestone to the code repository for the 4.4.4 point release. In addition I added a couple of issues to this milestone. We could discuss this during the code camp.

Discuss/ratify NXreflections

This is the NeXus extension proposed and developed by James Parkhurst at Diamond for recording integrated reflection intensities after data reduction. The extension is already needed by scientists doing serial crystallography to decrease the number of files in directories produced by processing thousands of individual stills at once.

Review NXprocess

Aaron Brewster brought this up. He encountered some problems storing processed MX data in this group.

Versioning of base classes and application definitions

There is no infrastructure to support this. We only work from the latest version. With science progressing, this will not be viable in the long term.

Review NXdirecttof

After discussion with jkrueger1 at FRM-2, there were the following change requests for NXdirecttof.
These concern applying NXdirecttof to the FRM-2 instrument TOFTOF.

TOFTOF has no Fermi chopper, only disk choppers. Suggestion: allow both
NXmonitor: they do not have a time-of-flight monitor, make data, time_of_flight optional
Jens Krueger will add requests for additional fields

What is the decision on variable length strings?

Regarding #10, there is no record of a final NIAC vote on whether to accept variable length strings.

The question came up today in the context of a file written by Diamond Light Source with a link (target attribute written as variable length string) that was not recognized as a NeXus link by NeXpy.

look at proposed new NeXus home page - draft

this is work in progress

preview: https://htmlpreview.github.io/?https://github.com/nexusformat/NIAC/blob/master/2016/www_page_486/index.html

[NIAC 2018 Proposals] Attribute to define signal normalization

It is common for data to be stored as an unnormalized signal with a separate normalization array. This allows multiple data sets to be merged or rebinned reliably when the weights are not defined by the uncertainties. It is common in Mantid, where NXdata groups stored within MDHistoWorkspace entries contain a numevents array that should be used to normalize the data array. The X-ray software CCTW stores signals and weights of individual runs separately before merging them together.

I propose that we add an optional attribute normalization to the NXdata class attributes, to accompany a signal attribute. When it is present, plotting software should divide the signal array by the normalization array before plotting or performing other operations on the normalized data. This should be specified at the group level because the arrays are often stored in separate files, so placing the attribute on the array itself might not be possible.
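The proposed behavior could be sketched as follows. The group is modeled as a plain dict with `attrs` and `fields`, the function name `normalized_signal` is hypothetical, and returning NaN where the normalization is zero is an assumption the proposal does not address.

```python
def normalized_signal(group):
    """Divide the signal by the array named in the 'normalization' attribute."""
    attrs = group["attrs"]
    signal = group["fields"][attrs["signal"]]
    norm_name = attrs.get("normalization")
    if norm_name is None:
        # Attribute absent: the signal is used as stored.
        return list(signal)
    norm = group["fields"][norm_name]
    if len(norm) != len(signal):
        raise ValueError("signal and normalization differ in length")
    # Assumption: bins with zero weight become NaN rather than raising.
    return [s / n if n else float("nan") for s, n in zip(signal, norm)]


group = {
    "attrs": {"signal": "data", "normalization": "numevents"},
    "fields": {"data": [10.0, 6.0, 0.0], "numevents": [2, 3, 0]},
}
result = normalized_signal(group)
```

Because the lookup goes through the group's attributes, the signal and the normalization array can be links into different files.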

[NIAC2020] Final vote on NXmx

Final vote on recently revised NXmx for Gold Standard being used by DECTRIS
for their FileWriter 2. See Bernstein HJ, Förster A, Bhowmick A, Brewster AS, Brockhauser S, Gelisio L, Hall DR, Leonarski F, Mariani V, Santoni G, Vonrhein C. Gold Standard for macromolecular crystallography diffraction data. IUCrJ. 2020 Sep 1;7(5).

proposed for discussion by @yayahjb

[NIAC Proposals] Attribute to add masks to signals

I propose that we add an optional string attribute mask to the NXdata class attributes, to accompany a signal attribute. When it is present, the signal array should be masked by an integer or boolean array specified by the attribute. Values of 1 in the mask array would correspond to masked data. For example, when read by a Python package, the NumPy array containing the signal data could be converted to a MaskedArray using the mask array. This should be specified at the group level because the mask array could be stored in a separate file, so placing the attribute on the array itself might not be possible.
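A pure-Python sketch of the masking step is shown below. It assumes the attribute is named mask, as the issue title suggests, models the group as a plain dict, and uses None to stand in for a masked element; the function name `apply_mask` is hypothetical.

```python
def apply_mask(group):
    """Return the signal with masked elements (mask value 1 or True) set to None."""
    attrs = group["attrs"]
    signal = group["fields"][attrs["signal"]]
    mask_name = attrs.get("mask")  # assumed attribute name
    if mask_name is None:
        # Attribute absent: nothing is masked.
        return list(signal)
    mask = group["fields"][mask_name]
    if len(mask) != len(signal):
        raise ValueError("signal and mask differ in length")
    # A truthy mask value (1 or True) marks the element as masked.
    return [None if m else value for value, m in zip(signal, mask)]


group = {
    "attrs": {"signal": "data", "mask": "bad_pixels"},
    "fields": {"data": [1.0, 2.0, 3.0], "bad_pixels": [0, 1, 0]},
}
masked = apply_mask(group)
```

With NumPy available, the same mask array could instead be passed to `numpy.ma.MaskedArray(data, mask=mask)`, which follows the same convention of true values marking masked entries.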

This is functionally similar to #34.
