ncompare



Compare the structure of two NetCDF files at the command line. ncompare generates a view of the matching and non-matching groups and variables between two NetCDF datasets.
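As a rough illustration of what a "matching and non-matching" view means, the sketch below aligns two lists of variable names side by side. It is not ncompare's implementation; the function name `align_names` and the sample variable names are hypothetical.

```python
from typing import Iterable, List, Tuple


def align_names(left: Iterable[str], right: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair up names from two collections, leaving a blank where one side lacks the name."""
    left_set, right_set = set(left), set(right)
    rows = []
    for name in sorted(left_set | right_set):
        rows.append((
            name if name in left_set else "",
            name if name in right_set else "",
        ))
    return rows


if __name__ == "__main__":
    # Hypothetical variable names standing in for the contents of two files.
    file_a_vars = ["lat", "lon", "time", "temperature"]
    file_b_vars = ["lat", "lon", "temperature", "quality_flag"]
    for left_name, right_name in align_names(file_a_vars, file_b_vars):
        marker = "" if left_name == right_name else "  <-- differs"
        print(f"{left_name:<14}{right_name:<14}{marker}")
```

Names present in both inputs appear on the same row, so a missing variable shows up as an empty cell on one side.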

Installing

Install the latest version of the package from the Python Package Index (PyPI):

pip install ncompare

Usage

To compare two netCDF files, pass the filepaths for each of the two netCDF files directly to ncompare, as follows:

ncompare <netcdf file #1> <netcdf file #2>

With an additional --file-text argument specified, a common use of ncompare may look like this example:

ncompare S001G01.nc S001G01_SUBSET.nc --file-text subset_comparison.txt

A more complete usage demonstration with example output is shown in this example notebook.

Options

  • -h, --help : Show this help message and exit.
  • --file-text [FILE_PATH]: Text file to write output to.
  • --file-csv [FILE_PATH]: Comma-separated values (CSV) file to write output to.
  • --file-xlsx [FILE_PATH]: Excel file to write output to.
  • --only-diffs : Only display variables and attributes that are different.
  • --no-color : Turn off all colorized output.
  • --show-attributes : Include variable attributes in the table that compares variables.
  • --show-chunks : Include chunk sizes in the table that compares variables.
  • -v (--comparison_var_name) [VAR_NAME]: Compare specific values for this variable.
  • -g (--comparison_var_group) [VAR_GROUP]: Group that contains the comparison_var_name.
  • --column-widths [WIDTH, WIDTH, WIDTH]: Width, in number of characters, of the three columns in the comparison report.
  • --version : Show the current version and then exit.
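When ncompare is called from a script, the options above can be assembled programmatically. The following is a minimal sketch that builds the command line and hands it to `subprocess`; the helper names `build_ncompare_command` and `run_ncompare` are hypothetical, and only flags listed above are used. Running the command assumes that ncompare is installed and on the `PATH`.

```python
import subprocess
from typing import List, Optional


def build_ncompare_command(
    file_a: str,
    file_b: str,
    file_text: Optional[str] = None,
    file_csv: Optional[str] = None,
    only_diffs: bool = False,
    no_color: bool = False,
    show_attributes: bool = False,
) -> List[str]:
    """Assemble an ncompare command line from a subset of the documented options."""
    cmd = ["ncompare", file_a, file_b]
    if file_text:
        cmd += ["--file-text", file_text]
    if file_csv:
        cmd += ["--file-csv", file_csv]
    if only_diffs:
        cmd.append("--only-diffs")
    if no_color:
        cmd.append("--no-color")
    if show_attributes:
        cmd.append("--show-attributes")
    return cmd


def run_ncompare(cmd: List[str]) -> int:
    """Run the assembled command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    cmd = build_ncompare_command(
        "S001G01.nc",
        "S001G01_SUBSET.nc",
        file_text="subset_comparison.txt",
        only_diffs=True,
        no_color=True,
    )
    # Print the command instead of running it, so the sketch works without the files.
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces.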

Contributing

Contributions are welcome! For more information see CONTRIBUTING.md. ncompare is licensed under the NASA Open Source Agreement, which is included [in this repository's license directory](license/LAR-20274-1_ncompare NetCDF structural comparison tool_NOSA 1.3.pdf) and on the Open Source Initiative website.

Developing

Development within this repository should occur on a feature branch. Pull Requests (PRs) are created with a target of the develop branch before being reviewed and merged.

Installing locally

For local development, one can clone the repository and then use poetry or pip from the local directory:

git clone https://github.com/nasa/ncompare.git

(Option A) using poetry:

ii) Follow the instructions for installing poetry here.

iii) Run poetry install from the repository directory.

(Option B) using pip:

ii) Run pip install . from the repository directory.

Testing locally

If installed using a poetry environment, the tests can be run with:

poetry run pytest tests

Or from another virtual environment, one can use:

pytest tests

To run as a locally installed poetry module:

poetry run ncompare <netcdf file #1> <netcdf file #2>

Why ncompare?

The cdo (climate data operators) tool does not support NetCDF4 groups. Moreover, the ncdiff function among the nco operators computes value differences but, as far as the developers of this tool are aware, nco does not have a simple function to show structural differences between NetCDF4 datasets. Note that h5diff, provided in the HDF5 software, can also be used to find differences. In comparison to h5diff, ncompare is written and runnable in Python; provides an aligned and colorized difference report for quicker assessments of groups, variable names, types, shapes, and attributes; and can generate report files formatted for other applications. However, note that h5diff provides comparison of some otherwise "hidden" HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, which are not currently assessed by ncompare.

Known limitations

  • ncompare uses xarray to access the root-level dimensions. In some cases, xarray will miss dimensions whose names do not also exist as variable names in the dataset (also known as non-coordinate dimensions).
  • Some underlying HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, are not currently assessed by ncompare.
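To make the term "non-coordinate dimensions" concrete, the sketch below computes them as the dimension names that have no variable of the same name. It only illustrates the definition using plain Python sets and does not reproduce how ncompare or xarray read a file; the function name and the sample names are hypothetical. With the netCDF4 library, the two inputs could be taken from the keys of a dataset's `dimensions` and `variables` mappings.

```python
from typing import Iterable, Set


def non_coordinate_dimensions(
    dimension_names: Iterable[str], variable_names: Iterable[str]
) -> Set[str]:
    """Return the dimension names that do not also exist as variable names."""
    return set(dimension_names) - set(variable_names)


if __name__ == "__main__":
    # Hypothetical names: "nv" is a dimension with no matching variable.
    dims = ["time", "lat", "lon", "nv"]
    variables = ["time", "lat", "lon", "temperature"]
    print(sorted(non_coordinate_dimensions(dims, variables)))
```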

Notices:

Copyright 2023 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.

Third-Party Software:

This software calls the following third-party software, which is subject to the terms and conditions of its licensor, as applicable at the time of licensing. Third-party software is not bundled with this software, but may be available from the licensor.

License hyperlinks are provided here for information purposes only:

| item | license | link |
| --- | --- | --- |
| colorama | BSD-3-Clause | https://opensource.org/licenses/BSD-3-Clause |
| netCDF4 | MIT License | https://opensource.org/licenses/MIT |
| numpy | BSD-3-Clause | https://opensource.org/licenses/BSD-3-Clause |
| openpyxl | MIT License | https://opensource.org/licenses/MIT |
| xarray | Apache License, version 2.0 | https://www.apache.org/licenses/LICENSE-2.0 |
| Python Standard Library | Python Software Foundation (PSF) License Agreement | https://docs.python.org/3/license.html#psf-license |

Disclaimers

No Warranty:

THE SUBJECT SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR FREEDOM FROM INFRINGEMENT, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL BE ERROR FREE, OR ANY WARRANTY THAT DOCUMENTATION, IF PROVIDED, WILL CONFORM TO THE SUBJECT SOFTWARE. THIS AGREEMENT DOES NOT, IN ANY MANNER, CONSTITUTE AN ENDORSEMENT BY GOVERNMENT AGENCY OR ANY PRIOR RECIPIENT OF ANY RESULTS, RESULTING DESIGNS, HARDWARE, SOFTWARE PRODUCTS OR ANY OTHER APPLICATIONS RESULTING FROM USE OF THE SUBJECT SOFTWARE. FURTHER, GOVERNMENT AGENCY DISCLAIMS ALL WARRANTIES AND LIABILITIES REGARDING THIRD-PARTY SOFTWARE, IF PRESENT IN THE ORIGINAL SOFTWARE, AND DISTRIBUTES IT "AS IS."

Waiver and Indemnity:

RECIPIENT AGREES TO WAIVE ANY AND ALL CLAIMS AGAINST THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT. IF RECIPIENT'S USE OF THE SUBJECT SOFTWARE RESULTS IN ANY LIABILITIES, DEMANDS, DAMAGES, EXPENSES OR LOSSES ARISING FROM SUCH USE, INCLUDING ANY DAMAGES FROM PRODUCTS BASED ON, OR RESULTING FROM, RECIPIENT'S USE OF THE SUBJECT SOFTWARE, RECIPIENT SHALL INDEMNIFY AND HOLD HARMLESS THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT, TO THE EXTENT PERMITTED BY LAW. RECIPIENT'S SOLE REMEDY FOR ANY SUCH MATTER SHALL BE THE IMMEDIATE, UNILATERAL TERMINATION OF THIS AGREEMENT.

ncompare's People

Contributors

danielfromearth, dependabot[bot], tkantz

ncompare's Issues

Improve test suite

The test suite is missing checks on the layout of the output, such as whether variables are being included or excluded from the output and how differences within each group appear.

Enable `ncompare` to work with greater group depths

Currently ncompare works with netCDF hierarchies containing no more than one level of groups. To expand the types of netCDF data files with which this tool will work, it should walk through group trees of depth greater than one.
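One possible way to walk group trees of arbitrary depth is a recursive generator, sketched below. It is not the project's implementation. It assumes only that each group object exposes `groups` and `variables` mappings, as `netCDF4.Dataset` and `netCDF4.Group` do; the `make_group` helper and the sample hierarchy are hypothetical stand-ins so that the example runs without a data file.

```python
from types import SimpleNamespace
from typing import Iterator, List, Tuple


def walk_groups(group, path: str = "/") -> Iterator[Tuple[str, List[str]]]:
    """Yield (group path, sorted variable names) for a group and all of its descendants."""
    yield path, sorted(group.variables)
    for name, child in sorted(group.groups.items()):
        child_path = path.rstrip("/") + "/" + name
        yield from walk_groups(child, child_path)


def make_group(variables, groups=None):
    """Build a stand-in object with the same two attributes the walker relies on."""
    return SimpleNamespace(
        variables={name: None for name in variables},
        groups=groups or {},
    )


if __name__ == "__main__":
    # Hypothetical hierarchy that is two group levels deep.
    root = make_group(
        ["time"],
        {"science": make_group(["temperature"], {"quality": make_group(["flag"])})},
    )
    for group_path, variable_names in walk_groups(root):
        print(group_path, variable_names)
```

Running the same walker over both files would give two lists of group paths that can then be compared level by level.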

Fix author attr for poetry

Great project here; thank you!

I attempted to install the package from a local clone of the repo using poetry. However, I got an error about the "author" line in 'pyproject.toml' being formatted incorrectly.

I changed line 7 of pyproject.toml from:

authors = ["[email protected]"]

to:

authors = ["Daniel Kaufman [email protected]"]

And installation via poetry worked perfectly.

Fyi, the full screen-output -- without poetry's nice coloring, sadly -- of the original poetry-fail was:

$ poetry install
Installing dependencies from lock file

Package operations: 3 installs, 16 updates, 0 removals

• Updating six (1.16.0 /home/conda/feedstock_root/build_artifacts/six_1620240208055/work -> 1.16.0)
• Updating numpy (1.23.5 /home/conda/feedstock_root/build_artifacts/numpy_1668919096861/work -> 1.23.2)
• Updating pyparsing (3.0.9 /home/conda/feedstock_root/build_artifacts/pyparsing_1652235407899/work -> 3.0.9)
• Updating python-dateutil (2.8.2 /home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work -> 2.8.2)
• Updating pytz (2022.7.1 /home/conda/feedstock_root/build_artifacts/pytz_1673864280276/work -> 2022.2.1)
• Updating attrs (22.2.0 /home/conda/feedstock_root/build_artifacts/attrs_1671632566681/work -> 22.1.0)
• Updating cftime (1.6.2 /home/conda/feedstock_root/build_artifacts/cftime_1666833854447/work -> 1.6.1)
• Installing et-xmlfile (1.1.0)
• Updating iniconfig (2.0.0 /home/conda/feedstock_root/build_artifacts/iniconfig_1673103042956/work -> 1.1.1)
• Updating packaging (23.0 /home/conda/feedstock_root/build_artifacts/packaging_1673482170163/work -> 21.3)
• Updating pandas (1.5.3 -> 1.4.4)
• Updating pluggy (1.0.0 /home/conda/feedstock_root/build_artifacts/pluggy_1667232663820/work -> 1.0.0)
• Installing py (1.11.0)
• Updating tomli (2.0.1 /home/conda/feedstock_root/build_artifacts/tomli_1644342247877/work -> 2.0.1)
• Updating colorama (0.4.6 /home/conda/feedstock_root/build_artifacts/colorama_1666700638685/work -> 0.4.5)
• Updating netcdf4 (1.6.2 /home/conda/feedstock_root/build_artifacts/netcdf4_1668678824560/work -> 1.6.0)
• Installing openpyxl (3.0.10)
• Updating pytest (7.2.1 -> 7.1.3)
• Updating xarray (2023.1.0 /home/conda/feedstock_root/build_artifacts/xarray_1674166302925/work -> 2022.6.0)

Invalid author string. Must be in the format: John Smith [email protected]

I have bolded the error statement that led me to make the correction detailed above.

Thanks again! -Scott

How do I run tests on this repo?

I would like to add a PR for performance improvements but I also need to make sure the tests aren't breaking.
Is there a command to run the tests?

improve readme in a few ways

(As part of ncompare's PyOpenSci review (issue #146))

  • clarify poetry vs non-poetry in readme
    • "The tests can indeed run with poetry run pytest tests but pytest tests from a python venv works too, it could be nice to have this clarified in the README as both installations are previously described."
  • license
    • "The license is partially available in the README and as pdf in the license directory: as this is a specific NASA license and probably less known by the users, I suggest to have its name specified in the README perhaps with its link at the OSI website, for better discoverability."
  • Add readme badges for
