niac's People

Contributors

benajamin, mkoennecke, prjemian, zjttoefs

niac's Issues

Infrastructure to deal with changes to base classes or application definitions

When proposing a new class (application or base) there is the contributed definitions staging area. But where is the place to add fields to a base class or to further develop an existing application definition, so that the change gets visibility in the community?

Pull requests are a possibility, but very much hidden from public view.

This relates somewhat to the discussion on versions (#1).

Strings in NeXus

An issue (yes, another one) has come up regarding strings in NeXus. HDF5 now allows writing variable-length strings. Some popular software, like h5py, actually uses this feature. The problem is that the variable-length string API in HDF5 is not very well thought out. Thus the existence of both normal strings and variable-length strings in a NeXus file imposes a burden on reading code that does not use the NeXus API: you first have to check whether a string is variable length or not and then read the data accordingly.

I think we need to discuss this and decide if we standardize on normal strings or if we allow both forms of strings and impose the additional burden for reading string data on users.
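To make the reader-side burden concrete, here is a minimal sketch of the kind of normalization a client ends up writing if both forms are allowed. It assumes the reading library hands back either `bytes` (typical for fixed-length strings) or `str`, possibly wrapped in a length-1 sequence; the helper name `to_text` is hypothetical and not part of any NeXus API.

```python
def to_text(value):
    """Return a Python str for a string value read from a file.

    Assumption: the value arrives as bytes, as str, or as a
    length-1 list/tuple containing one of those.
    """
    # Some writers store a single string as a rank-1, length-1 array.
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError("expected one string, got %d items" % len(value))
        value = value[0]
    if isinstance(value, bytes):
        # Fixed-length strings may be NUL-padded; strip the padding
        # before decoding as UTF-8.
        return value.rstrip(b"\x00").decode("utf-8")
    if isinstance(value, str):
        return value
    raise TypeError("not a string value: %r" % (value,))


print(to_text(b"NXentry"))             # fixed-length style
print(to_text("NXentry"))              # variable-length style
print(to_text([b"NXentry\x00\x00"]))   # length-1 array, padded
```

Standardizing on one form would make most of this branching unnecessary.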

[NIAC 2018 Proposals] Missing NXdata indications about plot type (linear, semilogx, semilogy, loglog ...)

Missing a way to specify how the axes are to be displayed.

Adding additional qualifiers would not break any existing code, while it would allow the data to be displayed the way users expect.

Using predefined plot types could be one way:

NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type="semilogy"

but that is not generic enough, while something more generic might not be acceptable:

NXdata@signal="I"
NXdata@axes=["q"]
NXdata@plot_type=["linear", "logarithmic"] (a list of representations, one per axis, with the last one applying to the signal)

Any proposal allowing to solve that problem is welcome.
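As a starting point for discussion, the two variants above could be handled by a reader roughly as follows. This is only a sketch: the attributes are modelled as a plain dict, `plot_type` is the attribute proposed in this issue and not part of NeXus, and the function name and the mapping of preset names to scales are assumptions.

```python
# Preset name -> (scale of the single axis, scale of the signal).
PRESETS = {
    "linear": ("linear", "linear"),
    "semilogx": ("logarithmic", "linear"),
    "semilogy": ("linear", "logarithmic"),
    "loglog": ("logarithmic", "logarithmic"),
}


def axis_scales(attrs):
    """Map each axis name and the signal name to a scale."""
    axes = list(attrs.get("axes", []))
    names = axes + [attrs["signal"]]
    plot_type = attrs.get("plot_type")
    if plot_type is None:
        # No hint given: keep today's behaviour.
        return {name: "linear" for name in names}
    if isinstance(plot_type, str):
        # Predefined plot type: only meaningful for one axis + signal.
        if len(axes) != 1:
            raise ValueError("preset plot types cover one axis only")
        return dict(zip(names, PRESETS[plot_type]))
    # Generic form: one entry per axis, last entry for the signal.
    if len(plot_type) != len(names):
        raise ValueError("plot_type needs one entry per axis plus signal")
    return dict(zip(names, plot_type))


print(axis_scales({"signal": "I", "axes": ["q"], "plot_type": "semilogy"}))
```

Readers that do not know the attribute simply ignore it, which is why existing code would not break.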

Streaming NeXus

At more and more places data is streamed, i.e. you get possibly multidimensional data with time stamps.
NeXus ought to have a plan for how to store this kind of data. Previous work includes neutron event storage, but we have two different versions of this. A more generalized version is needed.

[NIAC 2018 Proposals] NXdata@auxiliary_signals (previously: How to specify multiple signals?)

Typical use cases:

  • Plotting several counters at scanned positions.
  • Fitting raw data and generating a plot with both the raw data and the fitted data.
  • Elemental mapping where a map is generated for each element.

The combination of multiple signals with the use of NXprocess would solve many provenance issues.

Proposed Implementation

In the past this could be implemented by giving several datasets in an NXdata group a signal attribute with the values "1", "2", "3". That approach is no longer possible.

  • Proposal 1: Allow signal to be an array of strings, with the first element of the array treated the way a single signal is treated now.
  • Proposal 2: Define a signals attribute that is an array of strings, with the first element of the array treated the way a single signal is treated now.
  • Proposal 3: Define a new set of groups NXplot1d, NXplot2d, NXplot3d, ... that allow multiple signals.
  • Proposal 4: Allow a new NXdata attribute named auxiliary_signals containing the additional datasets that can be plotted alongside the signal. The proposal is flexible enough that auxiliary_axes could be considered in the future, but that is not needed for the main goal of this issue.
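For Proposal 4, the reading side could look roughly like this. It is a sketch only: the NXdata attributes are modelled as a plain dict and the function name is hypothetical. Files without the new attribute behave exactly as they do today.

```python
def signals_to_plot(attrs):
    """Return the dataset names to plot, main signal first."""
    names = [attrs["signal"]]
    aux = attrs.get("auxiliary_signals", [])
    # Tolerate a single name written as a plain string.
    if isinstance(aux, str):
        aux = [aux]
    for name in aux:
        # Skip duplicates so the main signal is not plotted twice.
        if name not in names:
            names.append(name)
    return names


attrs = {"signal": "raw", "axes": ["energy"], "auxiliary_signals": ["fit", "residual"]}
print(signals_to_plot(attrs))
```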

[NIAC 2018 Discussion] Discuss HDF5XMP metadata

The code uploaded to the hdf5xmp repository provides a framework to add metadata (and thumbnails) to HDF5 files (or in sidecar .xmp files) and a set of plugins to provide included thumbnails to file browsers. Does anybody foresee any issues with the approach used? Does the NIAC support furthering this implementation?

NX_CHAR

I have posted the message below to [email protected] but apparently the message did not get through or got stopped somewhere.

Dear colleagues,

I think there is a problem with the BasicWriter.py example found at:

http://download.nexusformat.org/sphinx/examples/h5py/index.html

Unfortunately, that example targets Python 2.x, and that can lead to misunderstandings.

If I read the documentation correctly, NX_CHAR fields should be UTF-8 encoded strings.

When the file written by the example is read under Python 3, all the attributes come back as bytes and not as UTF-8 encoded strings.

Worse, if the example is run under Python 3 (just by adding the missing parentheses to the print statements), the generated file is different, because there strings are text strings and not bytes without any encoding.

Please clarify whether NX_CHAR fields should correspond to UTF-8 encoded strings or to byte strings.

The example should be fixed, and all the attribute strings should have a lower-case u character prepended in order to:

  • Comply with the documentation
  • Generate the same output under Python 2 and Python 3
  • Generate the same input under Python 2 and Python 3

Best regards,

Armando

[NIAC 2018 Proposals] Add NXdiffractometer or NXgoniometer to NXinstrument

Rationale

Diffractometers are relatively common instruments, and a very common operation is the conversion of data to Q or reciprocal space based on the positions of the motors associated with the diffractometer. Currently one has to search all the positioners in an entry for the common mnemonics, which depend on the diffractometer geometry (phi, chi, theta, twotheta, mu, delta, ...).

Proposal

  • Define a new base class NXdiffractometer to be added to the NXinstrument family.

  • Provide the name of the implemented geometry. We could rely on articles and/or the names used by other software like SPEC for the geometry name and/or the motor names in order not to reinvent the wheel.

  • Provide the reference (or the URL) where the geometry is described. I would start defining the classics:

    • W.R. Busing and H.A. Levy. Acta Cryst. 22 (1967) 457-464. (aka. FOURC)
    • M. Lohmeier and E. Vlieg. J. Appl. Cryst. 26 (1993) 706-716 (aka. SIXC)
    • H. You. J. Appl. Cryst. 32 (1999) 614-623 (aka. 4S+2D or PSIC)

and let the community enlarge the list with the different variants.

I think we have enough time to contact the diffractometer users at our respective facilities for their feedback prior to the NIAC meeting.

Code Camp 2020-2 topics

Anyone wanting to discuss issues at the 2020-2 NeXus Code Camp is asked to please add a comment here describing the topics they want to work on together with the core NeXus developers. We mostly want to be prepared for the number and types of issues being brought to us.

Clarify Voting Procedures

A problem occurs when we hold a vote but not enough NIAC members are in attendance or respond to an email vote. The NIAC constitution does not define a quorum or any other rule for this case.

For now we will continue to determine the result of a vote from the votes received. But at the next NIAC meeting we should either write this procedure into the constitution or agree on a quorum or another solution for this issue.

NIAC 2020 topics

Anyone wanting to discuss issues at the 2020 NIAC meeting is asked to please add a comment here describing the topics they want brought before the NIAC. We mostly want to be prepared for the number and types of issues being brought to us.

[NIAC 2018 Discussion] Messy specifications

I would like to draw your attention to the fact that by writing specifications that keep changing or are free to interpret, you are cooking up a messy format:

NX_CHAR: any string representation. All strings are to be encoded in UTF-8. Includes fixed-length strings, variable-length strings, and string arrays. Some file writers write strings as a string array of rank 1 and length 1. Clients should be prepared to handle such strings.

That clients should be ready to handle such strings does not mean that they are acceptable. I find it legitimate for readers to make life difficult for the users of such files, so that they in turn make sure the writers write proper files.

The NeXus API was writing things properly. At least I never encountered such discrepancies in old files (remember the 2010 workshop at the ESRF?). If people decide to use other tools, they have to write correctly and not put the burden on the readers.

A specification can be bad, but a changing specification is even worse.

When I think about the lengthy discussion concerning auxiliary_signals that could have been avoided by allowing NXdata@signal to be a list of strings instead of a string, I get really angry. It was said that allowing a list of strings could break code, when according to the above specification the clients should already have been prepared!!!!

I really wonder how many NIAC members are actually developing software for other facilities, because as a developer I cannot understand some decisions.

[NIAC 2018 Discussion] - Input from the McStas team

Hi NIAC, here is a little input from McStas for the NOBUGS 2018 satellite:

  • McStas has been an early adopter: NeXus has been in the McStas code tree since ~2003, using the napi.h C API. It was therefore not so nice to hear that NeXus will eventually deprecate this interface, even though our napi use is relatively basic and can likely be replaced by direct calls to HDF5.

  • Our code is very platform independent and is deployed by our users to Windows, macOS and Linux machines. It is therefore very important that good deployment strategies for the NeXus libs exist for all of those platforms, ideally through installer packages.

  • Specifically: The latest available macOS installer on GitHub is for NeXus 4.3.0 and built for 32bit systems...! (https://github.com/nexusformat/code/releases/download/4.3.0/NeXus-4.3.0.dmg)

  • It is beyond the average capability of a McStas user to build NeXus locally, and I don't feel it should be my task to deploy NeXus with McStas. :-)

  • A current use of NeXus in McStas is for the transfer of instrument geometry and event data to Mantid, including a 'hack' that generates an XML-based IDF. Any direct support for Mantid-oriented geometry information in NeXus will be welcomed, especially if this happens via the availability of e.g. a C API...

Best,
Peter Willendrup on behalf of the McStas / McXtrace team

[NIAC 2018 Proposals] Extend the default attribute to any NeXus group

The default attribute associated with NXentry has proven to be an excellent idea.

NXdata groups can be present at any level (their use inside NXprocess and even NXdetector is particularly common). It would be desirable to be able to check whether those parent groups have a default attribute indicating the default plot associated with them. That would again improve the user experience a lot.
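A reader could then follow the default attribute from group to group until it reaches something plottable. The sketch below models groups as nested dicts with "attrs" and "groups" keys; that layout, the function name and the depth limit are assumptions made for illustration, not an existing API.

```python
def resolve_default(group, max_depth=10):
    """Follow 'default' attributes downwards.

    Returns the relative path that was followed and the final group.
    A missing child raises KeyError; max_depth guards against cycles.
    """
    path = []
    for _ in range(max_depth):
        target = group["attrs"].get("default")
        if target is None:
            break  # no further hint: stop at this group
        group = group["groups"][target]
        path.append(target)
    return "/".join(path), group


root = {
    "attrs": {"default": "entry"},
    "groups": {
        "entry": {
            "attrs": {"NX_class": "NXentry", "default": "process"},
            "groups": {
                "process": {
                    "attrs": {"NX_class": "NXprocess", "default": "fit"},
                    "groups": {
                        "fit": {"attrs": {"NX_class": "NXdata", "signal": "I"}, "groups": {}},
                    },
                },
            },
        },
    },
}

path, group = resolve_default(root)
print(path, group["attrs"]["NX_class"])
```

The same function works when started at any intermediate group, for example at an NXprocess opened directly by the user.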

[NIAC 2018 Proposals] Clarify/generalize use of uncertainties

Rationale

In recent presentations at the Research Data Alliance meeting in Berlin, the subject of uncertainties associated with the data was mentioned, in particular in the context of application definitions.

Reading the documentation at:

http://download.nexusformat.org/doc/html/design.html?higlight=uncertainty#design-fields

and

http://download.nexusformat.org/doc/html/classes/base_classes/NXdata.html#nxdata

I see two possible ways of specifying uncertainties for a dataset, but it is not clear to me whether appending _errors to a dataset name is to be considered the official generic solution. To me it looks more like an example. If it is the official solution, please make that absolutely clear in the documentation and this issue can be closed.

Proposal

  • Decide about the proper way to associate uncertainties to datasets:

    • One way (only adding _errors or attribute uncertainties recommended)
    • Two ways (both ways recommended)
    • No way (specific to application definitions and therefore each application definition will decide)
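If both ways were recommended, a reader would need a lookup order. The sketch below checks an uncertainties attribute on the dataset first and falls back to the _errors naming convention. The group is modelled as a dict mapping field names to their attribute dicts; the function name and the precedence are assumptions, not something the documentation prescribes.

```python
def uncertainties_name(fields, name):
    """Return the name of the field holding uncertainties for `name`.

    fields: dict mapping field name -> dict of that field's attributes.
    Returns None if no uncertainties can be found.
    """
    # 1) Explicit attribute on the dataset itself.
    explicit = fields[name].get("uncertainties")
    if explicit is not None:
        return explicit
    # 2) Naming convention: <name>_errors in the same group.
    candidate = name + "_errors"
    if candidate in fields:
        return candidate
    return None


fields = {
    "I": {"uncertainties": "sigma_I"},
    "sigma_I": {},
    "q": {},
    "q_errors": {},
    "t": {},
}
for name in ("I", "q", "t"):
    print(name, uncertainties_name(fields, name))
```

Note that the naming convention only works within one group, while an attribute has to live on the dataset, which matters once the dataset is reached through a link.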

My View

I can see arguments for and against each of the above, and I could easily defend any of them.

I only ask you to take into account in your evaluation the fact that we will most likely be dealing with links (internal or even external).

NIAC 2018: NeXus as logbook format?

This is a suggestion to use NeXus for electronic logbooks. There are some advantages to this. This is a discussion topic to figure out whether it is a good idea and to decide whether we want to go down this road at all.

Propose to ratify NXcanSAS as application definition

NAPI point release 4.4.4

I've added a new milestone to the code repository for the 4.4.4 point release. In addition I added a couple of issues to this milestone. We could discuss this during the code camp.

Discuss/ratify NXreflections

This is the NeXus extension proposed and developed by James Parkhurst at Diamond for recording integrated reflection intensities after data reduction. The extension is already needed by scientists doing serial crystallography to decrease the number of files in directories produced by processing thousands of individual stills at once.

Review NXprocess

Aaron Brewster brought this up. He encountered some problems storing processed MX data in this group.

Review NXdirecttof

After discussion with jkrueger1 at FRM-2, the following change requests for NXdirecttof came up.
They concern applying NXdirecttof to the FRM-2 instrument TOFTOF.

  • TOFTOF has no fermi chopper but disk choppers only. Suggestion: allow both
  • NXmonitor: they do not have a time-of-flight monitor, make data, time_of_flight optional
  • Jens Krueger will add requests for additional fields

What is the decision on variable length strings?

Regarding #10, there is no record of a final NIAC vote on whether to accept variable length strings.

The question came up today in the context of a file written by Diamond Light Source with a link (target attribute written as a variable length string) that was not recognized as a NeXus link by NeXpy.

[NIAC 2018 Proposals] Attribute to define signal normalization

It is common for data to be stored as an unnormalized signal with a separate normalization array. This allows multiple data sets to be merged or rebinned reliably when the weights are not defined by the uncertainties. It is common in Mantid, where NXdata groups stored within MDHistoWorkspace entries contain a numevents array that should be used to normalize the data array. The X-ray software CCTW stores signals and weights of individual runs separately before merging them together.

I propose that we add an optional attribute normalization to the NXdata class attributes, to accompany a signal attribute. When it is present, plotting software should divide the signal array by the normalization array before plotting or performing other operations on the normalized data. This should be specified at the group level because the arrays are often stored in separate files, so placing the attribute on the array itself might not be possible.
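In code, the behaviour I have in mind for plotting software would be roughly the following. This is a sketch with the NXdata group modelled as a dict of "attrs" and "fields" and the arrays as flat lists; returning NaN where the normalization is zero is my own assumption and would need to be agreed on.

```python
def normalized_signal(group):
    """Return signal / normalization, or the plain signal if no
    normalization attribute is present."""
    attrs = group["attrs"]
    signal = group["fields"][attrs["signal"]]
    norm_name = attrs.get("normalization")
    if norm_name is None:
        return list(signal)
    norm = group["fields"][norm_name]
    if len(norm) != len(signal):
        raise ValueError("signal and normalization differ in length")
    # Assumption: bins with zero normalization hold no information.
    return [s / n if n else float("nan") for s, n in zip(signal, norm)]


group = {
    "attrs": {"signal": "data", "normalization": "numevents"},
    "fields": {"data": [10.0, 6.0, 0.0], "numevents": [5, 2, 0]},
}
print(normalized_signal(group))
```

Because signal and normalization are kept separate, two such groups can be merged by adding their data and numevents arrays element by element before dividing.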

[NIAC Proposals] Attribute to add masks to signals

I propose that we add an optional string attribute mask to the NXdata class attributes, to accompany a signal attribute. When it is present, the signal array should be masked by an integer or boolean array specified by the attribute. Values of 1 in the mask array would correspond to masked data. For example, when read by a Python package, the NumPy array containing the signal data could be converted to a MaskedArray using the mask array. This should be specified at the group level because the mask array could be stored in a separate file, so placing the attribute on the array itself might not be possible.

This is functionally similar to #34.
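A minimal sketch of the intended reader behaviour, assuming the attribute is called mask (a name chosen here for illustration) and modelling the NXdata group as a dict of "attrs" and "fields" with flat lists as arrays. Masked points are replaced by None.

```python
def masked_signal(group):
    """Return the signal with masked points replaced by None.

    A mask value of 1 (or True) marks a point as masked.
    """
    attrs = group["attrs"]
    signal = group["fields"][attrs["signal"]]
    mask_name = attrs.get("mask")
    if mask_name is None:
        return list(signal)
    mask = group["fields"][mask_name]
    if len(mask) != len(signal):
        raise ValueError("signal and mask differ in length")
    return [None if m else s for s, m in zip(signal, mask)]


group = {
    "attrs": {"signal": "data", "mask": "bad_pixels"},
    "fields": {"data": [4.0, 9.0, 2.5], "bad_pixels": [0, 1, 0]},
}
print(masked_signal(group))
```

With NumPy the same convention maps directly onto `numpy.ma.MaskedArray(signal, mask=mask)`, where true mask entries are likewise the masked ones.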
