
Bathymetric Attributed Grid (BAG) - Open Navigation Surface Project

This repository contains the specification of the BAG data format and the necessary library code and examples required to build and work with data in the BAG format:

  • api - This is the primary API directory and contains the source for the Bathymetric Attributed Grid format (BAG).
  • configdata - Required XML support files. You must have an environment variable called BAG_HOME mapped to this directory in order to run the API functions.
  • docs - Miscellaneous and historical documentation resides here.
  • examples - Contains programs to demonstrate some of the API functionality. In particular bag_create and bag_read are good starting points.
  • python - Contains Python unit tests and examples that make use of the SWIG interface.
  • tests - Contains C++ tests.

The BAG specification and library are produced by the Open Navigation Surface project.

Documentation

Documentation for the BAG specification and library can be found here.

Installing and using the BAG library

The BAG library, and its dependencies, can be installed in a Conda environment (for example, Anaconda or Miniconda).

If you only want the C++ library, install libbaglib.

To install the Python bindings (along with the C++ library) install bagpy.

Once installed, you can test the C++ library by building the BAG examples as a standalone project.

Note: You can use FindBAG.cmake in your own projects to locate libbaglib installed via conda.

Likewise, you can run the Python tests using the Conda-provided bagpy bindings by first installing the test dependencies into your conda environment:

pip install -r requirements.txt

Then run the tests (Linux and macOS):

BAG_SAMPLES_PATH=./examples/sample-data python -m pytest python/test_*.py

Under Windows, run:

set BAG_SAMPLES_PATH=examples\sample-data
python -m pytest python\test_*.py

Building and using the BAG library

Comprehensive build instructions can be found here.

For a Quick Start using make to build C++ applications on Linux, see QUICKSTART.MD.


Issues

BAG 2.0 library documentation

Complete the higher-level documentation for the library, detailing how the library is built and typical use cases. May also include writing 'sample' programs to illustrate typical library usage.

CI: Refactor Windows tests from Appveyor to GitHub Actions workflow

Appveyor builds are broken due to a strange Conda error (unable to find tqdm, see here). Given this, and since the GitHub Actions workflows seem to work well, it's probably time to consolidate the Windows build to GH Actions. In the process, we should also move away from using Conda for Windows builds and instead build dependencies by hand.

BAG::VRRefinements class needs custom read method

An attempt to read data from the refinements layer calls the base Layer::read method, which does a bounds check against the dimensions specified in the dataset's Descriptor.

The problem is that the refinements do not use those dimensions by design!

Layer's read method is not virtual, so it seems like it wasn't meant to be overridden.
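
A minimal sketch of one possible direction discussed here, making read virtual so the refinements layer can validate against its own dimensions; the type names, signatures, and placeholder bodies below are assumptions for illustration, not the shipped API:

#include <cstdint>
#include <vector>

class Layer
{
public:
    virtual ~Layer() = default;

    // Base behaviour: bounds check against the dataset Descriptor's dimensions.
    virtual std::vector<uint8_t> read(uint32_t rowStart, uint32_t columnStart,
                                      uint32_t rowEnd, uint32_t columnEnd) const
    { return {}; }  // placeholder body
};

class VRRefinements : public Layer
{
public:
    // Override: validate against the refinements' own dimensions instead.
    std::vector<uint8_t> read(uint32_t rowStart, uint32_t columnStart,
                              uint32_t rowEnd, uint32_t columnEnd) const override
    { return {}; }  // placeholder body
};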

Stage branch for BAG 2.0 work

Create the new branch for the BAG 2.0 work, and get 3rd party libraries updated.

  • Review the existing (old) branch for BAG.
  • Create a new branch for the new work, bring over any useful work from the old branch
  • Get all work items entered into the issue tracker in the BAG github project.
  • Get the latest copies of 3rd party libs (HDF5, ZLIB, libxml2) and get them building.

Implement Python bindings for BAG 2.0

Add Python bindings to BAG 2.0. This will include the implementation of the C API (which the Python bindings will wrap).

NOTE: Investigate the usage of SWIG to generate the C and Python APIs directly from C++.

getLayer overloads

The combination of using a plain enum instead of an enum class and comments suggesting the name parameter is optional can lead to the wrong overload of getLayer being used.

Layer* getLayer(LayerType type, const std::string& name) &;

For example, the following does not return a Layer* as one might expect (if one does not notice the lack of default for name).

auto elevationLayer = bag->getLayer(BAG::LayerType::Elevation);

I would suggest making BAG::LayerType an enum class to prevent implicit conversion to an int and adding defaults to the name parameters.
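
A sketch of the suggested change, not the current API; the class shape and the defaulted name are assumptions used only to illustrate the fix:

#include <string>

// Scoped enum: no implicit conversion to int, so an unintended overload
// cannot be selected silently.
enum class LayerType { Elevation, Uncertainty /*, ... */ };

class Layer;

class Dataset
{
public:
    // Defaulting name means getLayer(LayerType::Elevation) resolves to this
    // overload, returning a Layer* as callers expect.
    Layer* getLayer(LayerType type, const std::string& name = {}) &;
};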

Design BAG 2.0 raster API

Complete the design of the raster interfaces for the BAG 2.0 update. This is mostly complete, but some additional work is needed.

Review of the current v2.0 branch

The following comments/suggestions are based on the code available in b7f024a.

General

  • The CHANGES.txt file is outdated. The large changes in release 2.0 should be briefly described there.

  • The docs\build_instructions.html file is outdated. The new instructions should be much shorter.

  • The examples\readme.txt file is outdated. Some existing example programs are missing from it (e.g., bag_vr_create.cpp), and some listed programs are not actually present (e.g., bag_signfile).

  • How is the Doxygen documentation built?

  • The encryption with beecrypt is absent. Is that because of the decision to have a stand-alone tool for this task? If so, why do we still have encryption-related errors, e.g., BAG_CRYPTO_NO_SIGNATURE_FOUND in bag_errors.h?

  • What is the current use of the extlibs folder?

CMake

Catch2 tests

  • The Catch2 tests fail to run within Visual Studio because they don't find the BAG DLL. Manually copying the BAG DLL is annoying and error-prone.


  • More generally, how should the tests be run from Visual Studio? It does not seem that Visual Studio recognizes them as tests by default (as happens with other testing frameworks). If it does, some guidance on how to do so would help.

C++ Library

  • The concept of interleaved layers should be clearly presented/described. The selected name does not really help readers understand what they are. I would suggest adding legacy (or something similar) to the name. It would also be useful to add a reference to the ONS meeting at which the unpacking of the NODE group and the ELEVATION group was discussed and approved.

Python binding

  • The build of the Python binding currently requires running Visual Studio as administrator. Otherwise there is a CMake error: CMake Error: failed to create symbolic link: operation not permitted. I suggest copying the required files rather than creating a symbolic link.

  • The Python binding should be built out of source: 'One nice and highly recommended feature of CMake is the ability to do out of source builds. In this way you can make all your .o files, various temporary depend files, and even the binary executables without cluttering up your source tree.' (see https://www.cs.swarthmore.edu/~adanner/tips/cmake.php).

  • The CMake install command ignores the Python binding. What is the current way to install the Python binding? I was not able to find any setup.py (https://packaging.python.org/tutorials/packaging-projects/). It would also be useful to put the module under a namespace (e.g., ons) to avoid polluting the site-packages folder.

  • The Python tests do not use a testing framework. The current hard-coded solution of using assert() is difficult to maintain as well as unusual for Python libraries. Python has a built-in testing framework named unittest (https://docs.python.org/3.6/library/unittest.html), which is very simple to use.

  • The micro151.bag test file is not present.

  • There is no documentation on how to use the Python binding. The Python tests are somewhat useful, but they cannot substitute for proper documentation. The most popular way to document Python libraries is Sphinx (https://www.sphinx-doc.org/en/master/).

Continuous integration

  • Travis and AppVeyor test Python 3.6 and 3.7. The current version of Python is 3.8, so it should also be added.

  • The Travis script has developer-specific paths (LD_LIBRARY_PATH=/home/travis/build/jice73/BAG/build/api PYTHONPATH=/home/travis/build/jice73/BAG/build/api/swig_i python python/test_all.py) that should be removed. Otherwise, how is this going to work when executed from the upstream GitHub repository?

CC: @GlenRice-NOAA @jice73 @brian-r-calder

Refactor Python installer

Design Metadata API

Design the API extensions to BAG 2.0 to add the new Metadata layer capabilities defined in the NOAA proposal in #2.

BAG 2.0 additional testing

Complete higher-level testing of the library. During the BAG 2.0 development project, unit tests will be written to ensure correct functionality of the core library. This task is allotted to higher-level tests (i.e., integration tests) or to increasing the unit test coverage (e.g., corner cases, error handling, etc.).

Method to skip large areas of null data while loading.

One issue I encountered while writing the bagViewer ( https://github.com/OpenNavigationSurface/bagViewer ) was having to deal with large datasets with large areas of null data. In order not to consume more memory than necessary, I came up with a tiled approach to loading the data. By loading the data one tile at a time, I could determine if a tile contained only null data and discard it.
The solution is workable, as it limits the memory consumed to mostly usable data, but loading takes a long time because all the data has to be loaded and checked, null or not.
How can we design and implement a mechanism that would help us figure out which parts of the dataset contain actual useful data?
Maybe some sort of tile index as metadata?
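
For reference, a minimal sketch of the tile-level null check described above; the null value constant and the tile buffer type are assumptions rather than part of the BAG API:

#include <algorithm>
#include <vector>

// Hypothetical null marker for elevation nodes; the real value comes from the
// BAG library / dataset metadata.
constexpr float kNullElevation = 1000000.0f;

// True if any node in a loaded tile holds real data; all-null tiles can then
// be discarded immediately after loading.
bool tileHasData(const std::vector<float>& tile)
{
    return std::any_of(tile.begin(), tile.end(),
                       [](float v) { return v != kNullElevation; });
}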

Variable resolution BAGs do partially mitigate this problem. Would encoding single resolution BAGs as variable resolution fix it?

Issue #2 mentions a "previous pre-proposal also discussed a stand alone table for vector georeferenced data." A coarse coverage polygon might do the trick here.

BAG library update

The BAG library is a bit long in the tooth. We should consider a complete rewrite of the library to clean it up. Should this update be in C++, or should it stick with the existing C code base convention?

Build issue on Ubuntu 20.04

Attempting to configure the V2 branch of the project using CMake on Ubuntu 20.04 with dependencies installed via apt results in the following error:

HDF5::HDF5 not found.

The problem seems to be that in CMake's supplied FindHDF5.cmake module, the HDF5::HDF5 target was added in a later version of CMake than the one included with Ubuntu 20.04.

Unless someone has another idea, I'll try to add checks and workarounds in CMakeLists.txt to overcome this problem.

This does raise the question of which range of platforms and versions do we want V2 to support?

The ROS-based framework I currently work on for uncrewed marine systems is based on Ubuntu 20.04 and ROS Noetic. Conda is not used and apt-based dependencies are much preferred. I'll work under the assumption that this is within the range we want to support for now.

Unit Testing framework for C++

Evaluate an appropriate Unit Testing framework for C++ code in the BAG library.

  • Easy to use
  • Minimal overhead
  • Integration with CI

Add ability to export different layers

Hello all,

Had a request that came specifically from a user of NOAA's archived VR BAGs. They requested more information about the resolutions contained within the VR BAG. This information could be very useful for future users and I would like to see it as an additional layer along with an Uncertainty layer. I also see the utility of exporting the density layer. Adding those items to the BAG standard would allow for greater utility and functionality.

Please let me know if you would like me to expand on this further.

Tyanne

BAG Spatial Metadata Proposal

A Proposal to

The Open Navigation Surface Working Group

for updating the

Bathymetric Attributed Grid

specification with tables and layers for metadata (updated proposal)

2019 March 13

Introduction & Background

The Bathymetric Attributed Grid (BAG) was created to service the needs of the hydrographic community and target the concept of a Navigation Surface on a survey by survey basis. The design incorporates depth, uncertainty, and some metadata for the associated survey, but the ability to provide metadata with an associated geospatial context is limited. Some metadata, such as those attributes associated with S-101 (e.g., Quality of Bathymetric Data), may be better described with some geospatial context to enable different quality designations within an area. Importantly, these data may describe survey information that are not covered directly by the bathymetric layer. For example, it would be valuable to carry survey wide quality information, even when some of the survey coverage was contributed by side scan. Here we propose a key-raster and value-table pair that would enable the description of metadata on a node by node basis (Figure 1).

Figure 1 - Georeferenced metadata as key-raster and value-table pairs.

The previous pre-proposal also discussed a stand-alone table for vector georeferenced data. This approach will not be discussed further in this proposal but may be readdressed in the future.

Generically, a key-raster and value-table pair could serve the needs of many different types of metadata. The driving desire here is to enable the transport of an S-101 Quality of Bathymetric Data-like meta attribute within the BAG. In recognition of these two distinct parts of this proposal, the generic data type for storing this data within the BAG is described in Part A, while a specific implementation of this data type is discussed for the Quality of Bathymetric Data in Part B. As a matter of convenience we have added a Part C, not previously discussed, so that georeferenced data licensing can be considered as part of this proposal and as a distinct implementation of Part A.

Part A - Georeferenced Metadata

Describing metadata on a node by node basis is important when full spatial resolution of the metadata is desirable and there is only one result for each node. Assuming many of the metadata values will be repeated as a set, a raster containing a key at each node can reference a table row of metadata values.

Requirements

  1. Spatially keyed metadata included in the BAG.

  2. Minimally impact the size of a BAG.

  3. Leave the current XML metadata unchanged.

Project Description

As a generic metadata container the key-raster value-table pair may be reused to describe various metadata types. To enable reuse of this type of container without polluting the root node we propose storing all georeferenced metadata in a dedicated node under the BAG_root node named "Georef_metadata". Each implemented instance of the key-raster value-table pair would then be in an additional sub-node with the name of the represented layer as the name of the subnode. The subnode metadata header would contain a georeferenced metadata type, such as the types described in part B or part C of this proposal. These specific types should not need to be interpreted directly by a library using the BAG, but instead are meant to provide context for the specific meaning of the values stored in the value table for the downstream user. See part B of this proposal for an example.

Within each subnode the single resolution key-raster and value-table pair would be named "keys" and "values" respectively. In the case that variable resolution metadata keys are available, they would be stored as "varres_keys", reference the same "values" metadata table, and correspond to the indices described in varres_metadata in the same manner as varres_refinements.

An overview level layout of the proposed update is given in Figure 2.

Figure 2 - The BAG structure with proposed structure additions. Each of the georeferenced metadata layers has its own type (e.g., "Type A").

In Figure 2 there happen to be three georeferenced metadata layers. The Alpha layer is of metadata type A, while the Cheese and Juice layers are of types B and C respectively.
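
As an illustration of the proposed layout only (the group and dataset names come from this proposal and Figure 2, not from a released format), reading the Alpha layer's key raster and value table with the HDF5 C++ API might look like this:

#include <string>
#include "H5Cpp.h"

// Open the proposed key-raster / value-table pair for one georeferenced
// metadata layer under BAG_root/Georef_metadata.
void openGeorefMetadata(const std::string& bagFileName)
{
    H5::H5File file(bagFileName, H5F_ACC_RDONLY);

    H5::DataSet keys   = file.openDataSet("/BAG_root/Georef_metadata/Alpha/keys");
    H5::DataSet values = file.openDataSet("/BAG_root/Georef_metadata/Alpha/values");

    // The key raster has the same rows/columns as the root elevation layer;
    // the value table is a one-dimensional compound dataset. Read them with
    // keys.read()/values.read() using memory types matching the layer's type.
}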

If the assumption holds that there are a limited number of valid combinations of the metadata values in the raster, the number of entries in the table should be small and the raster should be highly compressible (raster in Figure 3, table in Table 1).

Figure 3 - Left, the elevation layer; right, the node metadata layer with table index keys. Both rasters have the same resolution, although they are depicted differently here for clarity.

Table 1 - An example of a table corresponding to Figure 3.

The table information is matched to the relevant raster, and vice versa, through corresponding keys. This enables finding metadata information based on location using the georeferenced raster keys to look up metadata in the table, or finding the locations that have particular metadata by searching the raster for keys that correspond to particular metadata in the table.

Because the key-raster layer corresponds to the other raster layers found in the root node, the georeferencing information within the BAG should also govern this layer. The same number of rows and columns should exist in the key-raster as exist in the root level arrays for elevation and uncertainty.

Keys in the raster are only unsigned integers and should correspond to the row number in the metadata table. Because the HDF structure is self-describing there is no need to predefine the byte size of the key-raster data type. A single byte will likely satisfy the needs of many BAG files, but could be expanded if a large number of rows are required in the metadata table. The byte size required can be determined at the time of BAG formulation. Because the data type for the key-raster may differ from other raster arrays in the file, the no data value is zero ("0"), and the first row in the value-table shall correspond to integer one ("1").
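
A small sketch of the key semantics in the preceding paragraph (an interpretation of this proposal, not library code): key 0 marks no data, and key k addresses table row k, i.e. zero-based row k - 1:

#include <cstdint>
#include <optional>

// Map a key-raster value to a zero-based row index in the value table.
// 0 is the no-data value; 1 refers to the first table row.
std::optional<uint32_t> keyToRowIndex(uint32_t key)
{
    if (key == 0)
        return std::nullopt;  // no metadata at this node

    return key - 1;
}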

Impact to the existing BAG library should be minimal, as reading and writing the proposed additions should not change the existing code and could be handled entirely separately. The level of effort to implement a reader and writer for the proposed format has not been formally estimated; however, we expect it to be modest.

Feedback on this proposal to consider:

Some preliminary feedback and suggestions on this proposal are included with additional thoughts.

  1. There is concern that needing to write a raster or refinements layer when the values are the same for the surveyed area is a waste of space, even with compression. If this is a cause for concern, we propose that, when there is only one row in the value-table, the key raster or varres_keys not be required, with the assumption that the metadata applies to all locations where there is data in the elevation layer.

  2. Making this layer type compatible with CF conventions (http://cfconventions.org/) would enable easy viewing without special parsers. A thorough investigation into the CF conventions has not been undertaken. The value of enabling the CF convention needs to be further discussed and defined.

  3. Making each metadata value (column in the value-table) its own layer would be another way to convey this information and also be potentially more flexible. While this is true, it could require many more raster layers which would contain largely the same values. While compressible, lots of rasters could also add significant clutter while increasing the ambiguity of the metadata available. Having specific metadata types means there can be an expectation of the information available, which is valuable when targeting specific needs, and reduces the number of datasets in the file structure.

  4. What would happen if one of the columns that we need to read is not there for some reason? This is the reason for the data type definitions in the following parts of this proposal. We think a check to ensure compliance with the data type definition can be put in place when the BAG is accessed for reading. In this case the layer would be labeled as invalid and ignored if it does not meet the defined table parameters.

  5. How would this type of data structure fit within the GDAL data structure? The raster attribute table (RAT) in GDAL is attached to a raster band and would be a natural structure to house the proposed raster and table pair. The only difficulty anticipated by the GDAL team is that the RAT is not compatible with all HDF data types, thus forcing some form of data type mapping if the GDAL library is not updated to include these additional types in the RAT format. More information can be found here: https://www.gdal.org/classGDALRasterAttributeTable.html#details

Part B - Quality of Bathymetric Data Metalayer

As noted in The Cathedral and the Bazaar (E.S. Raymond 2001), "Every good work of software starts by scratching a developer's personal itch." In this case the NOAA National Bathymetric Source (NBS) Project desires to use the BAG format to convey both bathymetry and a description of the quality of the bathymetry. While the current format contains a vertical uncertainty layer, we wish to include additional information we often provide in the sub-attributes of the S-57 metaobject M_QUAL. Thinking forward, this should be formulated as an S-101 Quality of Bathymetric Data meta type, with the assumption that the information may be backported to S-57 where needed. We think adding this information aligns well with enabling the idea of a navigation surface. We also think this additional information will increase the value of the products we produce and enable seamless internal use of our data.

The specific use case envisioned for the NBS project adds value in two phases. First, processes downstream can utilize the bathymetry quality information directly in the BAG without having to carry around a supplemental file. The NBS supersession logic uses this quality information to sort which data should be used as part of the national bathymetry, so having quality directly integrated into the BAG is valuable. Also, this is the same information that is currently provided to our Marine Chart Division as a supplemental file, so it would streamline that process as well. Second, the NBS project currently creates a BAG which contains the amalgamation of many different sources with different qualities. Having these various qualities tracked within the BAG would simplify the products from the NBS and streamline our process.

It is worth noting that NOAA commissioned surveys rarely have more than one or two designations for the quality of data within the S-57 format. We expect the structure proposed in Part A to be able to hold many surveys with a roughly proportional number of entries in the value-table.

Requirements

  1. Carry the Quality of Bathymetric Data attributes, or some reformulation thereof, forward such that the ideas of object detection capability, coverage, and uncertainty can be captured for use with the bathymetry.

  2. Enable the representation of survey coverage as distinct information from bathymetry.

Project Description

As an implementation of the georeferenced metadata layer type discussed in Part A of this proposal, this section is meant to define a georeferenced metadata type corresponding to the S-101 Quality of Bathymetric Data. This definition is expected to be more of a registration of the data type, as a library accessing this information should just pass it to the user for interpretation. A writer of this information may wish to implement some checks for consistency with the metadata type dependencies declared in S-101 and included in the discussion here.

The data type declared in the node containing the key and value data should be labeled as "quality_of_bathy_data_1_0", where the "1_0" (one_zero) is meant to convey a version number for the data type. While changes in the data type are not expected to impact reading the data, they may imply changes to how the data are used or to the fields expected in the value table columns. The version "1_0" is meant to correspond to the S-101 standard version 1.0.

The proposed fields for the metadata table are depicted in Figure 4 and summarized in Table 2.

Figure 4 - The quality_of_bathy_data_1_0 type as an implementation of Figure 2.

Table 2 - A summary of the metadata fields corresponding to S-101 quality of bathymetric data to be contained in the "values" table. Values which directly reference the S-101 definition are noted. See later discussion on the other items.

Column Name Column Type Note
temporal_variability Unsigned Integer Valid numbers are 1 to 6, corresponding to S-101 encoding
data_assessment Unsigned Integer Valid numbers are 1 to 3, corresponding to S-101 encoding
feature_least_depth Boolean See S-101 least depth of detected feature measured.
significant_features Boolean See S-101 significant features detected.
feature_size Float See S-101 feature size.
feature_size_var Float See further discussion (2)
full_coverage Boolean See S-101 full seafloor coverage achieved
bathy_coverage Boolean See further discussion (3)
horizontal_uncert_fixed Float See S-101 horizontal position uncertainty fixed
horizontal_uncert_var Float See S-101 horizontal position uncertainty variable factor
survey_date_start String See S-101 Survey date start
survey_date_end String See S-101 Survey date end

Most of the proposed fields for the value table correspond directly to the S-101 metaclass Quality of Bathymetric Data. There are, however, some exceptions.

  1. Depth range maximum and minimum in S-101 are omitted. The assumption is that if this information is required then the corresponding nodes in the elevation layer can be queried for a minimum and maximum depth for each table row.

  2. feature_size_var is meant to augment feature_size, which corresponds to S-101 size of features detected. As noted in S-101, size of features detected is intended to be described as the smallest size in cubic metres the survey was capable of detecting. Depending on the type of survey this definition might force different depth ranges to have different values. For example, a survey vessel that works at a fixed height off the seafloor could maintain a fixed feature detection size capability over a wide range of depths. A surface vessel working over that same range of depths may have a feature detection capability that varies with depth, causing the detection capability to be ambiguous and potentially misrepresented. For this reason feature_size_var is the percentage of depth that a feature of such size could be detected. When both feature_size and feature_size_var are present the greater of the two should be considered valid (a sketch of this rule follows this list). The expectation is that feature_size_var will be set to zero if the feature size does not scale with depth. As with feature_size, feature_size_var should be ignored if significant_features is False.

  3. NOAA surveys often use side scan and therefore have areas of coverage that do not contain direct depth measurements. In these cases, bathy_coverage would be set to False for nodes with survey coverage but without bathymetry. A condition with full_coverage = True and bathy_coverage = False is a useful indicator for how to work with these nodes within our workflow. If full_coverage is False, bathy_coverage must also be False.

  4. Vertical uncertainty is excluded from this table as the vertical uncertainty is reported node by node within the BAG structure.
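
To make the feature_size_var rule in item 2 concrete, here is a small sketch of one interpretation (not library code): feature_size_var is read as a percentage of the local depth, and the larger of the two detection capabilities is taken as valid:

#include <algorithm>
#include <cmath>

// Effective detectable feature size at a given depth, per item 2 above.
// featureSizeVar of zero means the detection capability does not scale with
// depth; ignore the result entirely if significant_features is False.
double detectableFeatureSize(double featureSize, double featureSizeVar, double depth)
{
    const double depthScaled = std::abs(depth) * featureSizeVar / 100.0;
    return std::max(featureSize, depthScaled);
}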

Feedback on this proposal to consider:

Some preliminary feedback and suggestions on this proposal are included with additional thoughts.

  1. Can the metadata column names follow the S-101 names exactly so as not to create a new set of names that need to be mapped? The answer is yes, but this would make the names very long. We are happy to support a clear convention suggested by the Open Navigation Surface Working Group.

Part C - Data Licensing Information

As the NOAA Office of Coast Survey has begun to ingest more data not acquired under NOAA's direction, proper tracking of data licensing is important for determining where data can be used, internally or in public. Particularly in the case of the National Bathymetric Source Project, tracking the various contributing sources within a final product can be helpful. While BAG already maintains a use qualifier within the top level metadata, a georeferenced metadata layer would allow for proper tracking of the data license by node to ensure information regarding restrictions on redistribution of data is not lost.

The specific use case envisioned for the described licensing information is within the workflow of the National Bathymetric Source (NBS) Project. The NBS project intends to filter contributing data to a product to match the intended user and the licensing terms of the source data. For example, the NBS project may create bathymetry to support NOAA hydrodynamic models. In this case the bathymetry would only be used to support internal NOAA work and therefore enable the use of data that might not be licensed for general distribution.

Requirements

  1. Carry georeferenced data license information for all contributing data.

  2. The license must be available through a URL or DOI.

  3. Carry information about the source institution and survey.

Project Description

As an implementation of the georeferenced metadata layer type discussed in Part A of this proposal, this section is meant to define a georeferenced metadata type corresponding to the license applied to source data. This definition is expected to be more of a registration of the data type as a library accessing this information should just pass it to a user for interpretation.

The license information referenced in the second requirement above is meant to allow referencing of generic licenses without requiring the full text to be carried within the BAG. This means the BAG is only able to reference generic licenses, and custom ones would need to be put online or obtain a DOI. While restrictive, this limits the size of data in the BAG. It also encourages the use of generic (rather than custom) licenses. The following might be useful for some background on data licensing: https://data.research.cornell.edu/content/intellectual-property

The data type declared within the node metadata containing the corresponding key and value data should be labeled as "data_license_1_0", where the “1_0” (one_zero) is meant to convey a version number for the data type.

Four values are required for this data type as depicted in Figure 5 and described in Table 3.

Table 3 - A summary of the metadata fields to be contained in the values table for data_license_1_0.

Column Name Column Type
Source Institution String
Source Survey ID String
License Name String
License URL or DOI String

Figure 5 - The data_license_1_0 type.

Feedback on this proposal to consider:

Some preliminary feedback and suggestions on this proposal are included with additional thoughts.

  1. What if there is more than one contributing source for a node? In this case we suggest each contributing source be added within a single table entry, but separated by a semicolon within each value in the table.

Appreciation

We wish to thank Dr. Giuseppe Masetti at the University of New Hampshire Center for Coastal and Ocean Mapping for his substantial contributions to this proposal.

Add Python wrapper

Adding a Python wrapper to the BAG library would enable further use of the BAG reference library and format.

Add continuous integration to BAG repository

Adding continuous integration to the BAG repository would enable building for specific needs (library versions, etc) and environments (operating systems), and make the reference library more functional.

Update the BAG format to comply with CF conventions

Updating the BAG format to comply with CF conventions was suggested at the Open Navigation Surface Working Group Meeting at US Hydro 2019. The outcome of this discussion was that more research was needed to understand the value and repercussions of this change. This is a handle for further discussion.

Old version 2.0 Branch Issues for reference

This is the original proposal made by Bill Lamey from CARIS on 2014-09-23

BAG 2.0 API Proposal

This document is intended to outline proposed changes to the BAG 1.x API
for the next 2.0 release of BAG.

Rationale

The BAG API has been extended over time to include APIs for accessing
new data types in the BAG file. Originally, BAG was designed to hold the
Elevation and Uncertainty layers, but defined a concept of “Extension”
layers for future data.

In past versions, new Extension layers have been added, but the API has
lost some consistency (Figure 1), and has even begun to contradict the
original Format Specification (BAG FSD v1).


Figure 1. MANDATORY v1.5 BAG ELEMENTS WITH OPTIONAL SURFACES

The goal of this proposed revision is to clean up and standardize the
way BAG accesses/models data to provide a consistent API for the user.
This will also make it much easier to access BAG through other raster
tools (namely the GDAL interface). See Figure 2 for the proposed
restructure of the optional group layers separated into ‘conventional’
Extensional layer datasets.

Figure 2. MANDATORY v2.0 BAG ELEMENTS WITH OPTIONAL LAYERS

Proposal

The following is the list of changes suggested to achieve a consistent
API.

  1. Standardize the naming convention for the BAG data structure.

    a. Dataset – refers to the BAG file in its entirety (ie Metadata +
    Layers + Correctors + Certification)

    b. Layer – refers to a single ‘layer’ of data within the BAG
    data structure (ie Depth, Uncertainty, Std_Dev, etc).

  2. Provide the minimum set of API calls to read and write a BAG file.

  3. Encapsulate the internal data structures used to perform the actual
    read/write of the BAG data.

  4. Encapsulate the usage of any dependent libraries and internal
    structures (eg hide all of the HDF5 nomenclature and types).

  5. Model the Layers as consistent homogeneous data types.

  6. Provide a separate API for the corrector information (since this can
    be at different spatial resolutions than all the other layers). The
    existing API for corrector information will be largely unchanged.

  7. Remove internal memory management. Provide alloc/free methods
    specific for BAG.

  8. Update to use standard data types in its definition (ie uint32_t
    instead of u32). For Windows builds, we’ll still need to define
    these. Other platforms can simply include <stdint.h>

  9. Add new example programs using the new API (read/write). Include
    small sample BAG files.

  10. Move code repository to an online distributed code management system
    (eg Bitbucket).

Backwards compatibility

A subset of the existing APIs will be deprecated and moved to a
‘bag_v15.h’ file. A new define BAG_USE_LEGACY_API will be
introduced to allow users to easily continue to use the legacy API.
This is provided only to ease the transition to the new API. The
legacy API will only support reading/writing BAG v1.5, and reading
all prior versions. In order to create version 2.0 BAGs, users of the
library will need to update their code to use the new 2.0 API. This
legacy API will be removed in the next major release of the BAG library,
or at a future time as agreed by the architecture review board (ARB).
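
A minimal sketch of how the opt-in might look for a consumer of the library, assuming the define and header names given in this proposal (the final mechanism may differ):

// Opt in to the deprecated v1.5 API while migrating to the 2.0 API.
#define BAG_USE_LEGACY_API
#include "bag.h"   // with the define set, the legacy declarations from bag_v15.h stay visible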

Format Changes

Ultimately, the proposed changes will cause a binary format change in the
resulting HDF5 files (see Figure 1). We’ll no longer be storing the BAG
‘opt group’ data in complex data types, but rather as individual
extension ‘layers’ where each layer contains a single value at each
node. The new API will provide backwards compatibility to read 1.5 (and
earlier) versioned BAG files by de-interleaving the complex data types
internally and returning them as regular grids (layers).

At this time no XML metadata changes are planned for this new version.

Add fuzzing to automated CI tests

Fuzzing tests are useful for uncovering correctness errors and security vulnerabilities and should be added to the automated C++ and Python tests for the BAG library.

Is "geographic" and "geo" Misleading terminology in this case?

Maybe it's just me, but I think of geographic coordinates as being Lat/Long. As for projected coordinates, I usually call them projected coordinates or map coordinates.

This led me to assume at first glance that the gridToGeo and geoToGrid methods converted to and from Lat/Long. That seemed ambitious to me and I wondered how the projections were handled; then I looked at the code and realized my mistake.

std::tuple<double, double> Dataset::gridToGeo(

Should we rename the methods to something that won't suggest Lat/Long? We should at least clarify in the comments/documentation that the returned coordinates are projected (unless the grid is in lat/long!).
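
For clarity, a hedged usage sketch (the header name, argument order, and const-qualification are assumptions): the returned pair is in the dataset's horizontal CRS, so for a projected grid it is easting/northing rather than latitude/longitude:

#include <cstdint>
#include <tuple>
#include "bag_dataset.h"   // header name assumed

// Convert a grid cell (row, column) to x/y in the dataset's horizontal CRS.
// For a projected CRS these are easting/northing in metres, not lat/long.
std::tuple<double, double> nodePosition(const BAG::Dataset& dataset,
                                        uint32_t row, uint32_t column)
{
    return dataset.gridToGeo(row, column);
}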
