Comments (9)

GlenRice-NOAA commented on August 24, 2024

By way of an informal update, the NBS project is reaching a phase where this proposal will become useful. As such, we are gaining perspective on which fields belong in which layer. At this point we think the temporal variability field does not belong in the BAG, and the survey dates should be part of the survey data rather than the quality fields.

from bag.

GlenRice-NOAA commented on August 24, 2024

This proposal should be updated to reflect:

  1. A single layer that encompasses both parts B and C of the proposal. This would correspond well to the BlueTopo Spec and to where S-102 seems to be headed, which would allow for easy translation between the formats depending on the application.
  2. We should add a version number to the S-101 definitions to make the definitions clear.
  3. We should move the version of the metadata layer from the name to an attribute of the layer.
  4. temporal_variability and perhaps data_assessment should be removed from the attributes.
  5. The version number for this should be set to 2.0.1 so as not to conflict with, or be confused with, the current GDAL implementation.

GlenRice-NOAA commented on August 24, 2024

This proposal was discussed at the Open Navigation Surface Working Group Meeting at the US Hydro 2019 conference and received general approval. Implementation details will be hashed out in a branch and approved through a pull request.

GlenRice-NOAA commented on August 24, 2024

Proposal for next steps:

  1. Fork the repository
  2. Build a reader for the generic data within a secondary library and against a test data file. This should use the BAG library to access the georeferencing information stored in the BAG XML node, but otherwise avoid touching the rest of the library. This reader should provide functions to (a) list the available layers and their georeferenced metadata types, and (b) given a layer name, return the table and the array from the HDF5 subnode. Memory allocation and generic code structure will follow modern object oriented coding practices.
  3. Build a writer which inserts a provided metadata table and array as a provided name and data type into a provided BAG. The raster array should be formed to work with the BAG metadata georeferencing information.
  4. Ask for comment on the implementation from the Open Navigation Surface Working Group.
  5. Upon further dialog and input from the ONSWG, keep the current interface to the new data type or integrate it into the original code style.
  6. Add checks for validity when reading and writing for the declared georeferenced metadata types against the metadata table and according to a list of types registered in the library.

GlenRice-NOAA commented on August 24, 2024

There has been a request to add an additional (optional) field to Part C of this proposal. This would be a string containing a DOI for the data, thereby enabling data discovery.

tdy-blamey commented on August 24, 2024

Proposal is accepted. Implementation of the proposal will be handled in #24

GlenRice-NOAA commented on August 24, 2024

A generic spatial metadata layer has been implemented in the V2 work. Further work is needed to define the specific layer types.

GlenRice-NOAA commented on August 24, 2024

A Proposal to

The Open Navigation Surface Working Group

for updating the

Bathymetric Attributed Grid

specification with tables and layers for metadata (updated proposal)

2022 October 13

Introduction & Background

The Bathymetric Attributed Grid (BAG) was created to service the needs of the hydrographic community and target the concept of a Navigation Surface on a survey by survey basis. The design incorporates depth, uncertainty, and some metadata for the associated survey, but the ability to provide metadata with an associated geospatial context is limited. Some metadata, such as those attributes associated with S-101 (e.g., Quality of Bathymetric Data), may be better described with some geospatial context to enable different quality designations within an area. Importantly, these data may describe survey information that is not covered directly by the bathymetric layer. For example, it would be valuable to carry survey-wide quality information, even when some of the survey coverage was contributed by side scan. Here we propose a key-raster and value-table pair, commonly called a Raster Attribute Table (RAT), that would enable the description of metadata on a node by node basis (Figure 1).

image_0
Figure 1 - Georeferenced metadata as key-raster value-table pairs.

A RAT could serve the needs of many different types of metadata. The driving desire here is to enable the transport of an S-101 Quality of Bathymetric Data-like meta attribute within the BAG. This proposal therefore has two distinct parts: the generic data type for storing this data within the BAG is described in Part A, while a specific implementation of this data type is discussed for a modified version of the Quality of Bathymetric Data in Part B. These specific implementations become "profiles" with definitions provided by the profile maker and added to a dedicated appendix of the BAG specification.

Part A - Georeferenced Metadata

Describing metadata on a node by node basis is important when full spatial resolution of the metadata is desirable and there is only one result for each node. Assuming many of the metadata values will be repeated as a set, a raster containing a key at each node can reference a table row of metadata values.

Requirements

  1. Spatially keyed metadata included in the BAG.

  2. Minimally impact the size of a BAG.

  3. Leave the current XML metadata unchanged.

Project Description

As a generic metadata container the RAT may be reused to describe various metadata types. To enable reuse of this type of container without polluting the root node we propose storing all georeferenced metadata in a dedicated node under the BAG_root node named "Georef_metadata". Each instance of the raster / RAT pair would then be in an additional sub-node with the name of the profile as the name of the subnode. The subnode metadata header would contain a georeferenced metadata type, such as the types described in part B of this proposal. These specific types should not need to be interpreted directly by a library using the BAG, but instead are meant to provide context for the specific meaning of the values stored in the value table for the downstream user. See part B of this proposal for an example.

Within each subnode the single resolution key-raster and value-table pair would be named "keys" and "values" respectively. Where variable resolution metadata keys are available, they would be stored as "varres_keys", reference the same "values" metadata table, and correspond to the indices described in varres_metadata in the same manner as varres_refinements.
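The naming convention above can be sketched as a small path helper. This is only an illustration under stated assumptions: the function name `georef_paths` is hypothetical and not part of the BAG library, and the node names are taken directly from the proposal text.

```python
def georef_paths(subnode, variable_resolution=False):
    """Return the proposed HDF5-style paths for one georeferenced metadata layer.

    `subnode` is the name of the subnode under Georef_metadata
    (e.g. "OCS-2022-10"). Hypothetical helper, for illustration only.
    """
    base = f"/BAG_root/Georef_metadata/{subnode}"
    paths = {
        "keys": f"{base}/keys",      # single resolution key-raster
        "values": f"{base}/values",  # value-table shared by all key rasters
    }
    if variable_resolution:
        # Variable resolution keys reference the same value-table.
        paths["varres_keys"] = f"{base}/varres_keys"
    return paths


paths = georef_paths("OCS-2022-10", variable_resolution=True)
```

Keeping every layer under a single `Georef_metadata` node means a reader can discover the available layers by listing that one node, without scanning the root.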

An overview level layout of the proposed update is given in Figure 2.

image_1

Figure 2 - The BAG structure with proposed structure additions. Each of the georeferenced metadata layers has its own profile (e.g., "Profile A").

In Figure 2 there happen to be three georeferenced metadata layers. The Alpha layer is of metadata profile A, while the Cheese and Juice layers are of profiles B and C respectively.

If the assumption holds that there are a limited number of valid combinations of the metadata values in the raster, the number of entries in the table should be small and the raster should be highly compressible (raster in Figure 3, table in Table 1).
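The compressibility claim can be checked informally with a small sketch. The assumptions here are ours: keys fit in one byte, the raster is a hypothetical 100 x 100 grid covered by two surveys, and `zlib` stands in for whatever compression filter is actually applied in the BAG file.

```python
import zlib


def compression_ratio(key_raster):
    """Return raw size divided by compressed size for a raster of one-byte keys."""
    raw = bytes(key for row in key_raster for key in row)
    return len(raw) / len(zlib.compress(raw))


# Hypothetical 100 x 100 key raster: left half uses key 1, right half key 2.
raster = [[1] * 50 + [2] * 50 for _ in range(100)]
ratio = compression_ratio(raster)
```

Because long runs of identical keys dominate such a raster, the ratio is large; a raster where every node had a distinct key would not compress nearly as well.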

image_2
Figure 3 - Left, the elevation layer; right, the node metadata layer with table index keys. The resolution of both rasters is the same despite their depiction here for clarity.

Table 1 - An example of a table corresponding to Figure 3.
image_3

The table information is matched to the relevant raster, and vice versa, through corresponding keys. This enables finding metadata information based on location using the georeferenced raster keys to look up metadata in the table, or finding the locations that have particular metadata by searching the raster for keys that correspond to particular metadata in the table.
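Both lookup directions can be sketched with plain Python lists. The table rows, raster values, and function names below are hypothetical; the sketch assumes the convention described in this proposal that key 0 means no data and key 1 refers to the first table row.

```python
NO_DATA_KEY = 0  # key value meaning "no metadata at this node"

# Hypothetical value-table; table row 1 is stored at list index 0.
values = [
    {"survey": "H99999", "coverage": True},
    {"survey": "H88888", "coverage": False},
]

# Hypothetical key-raster with the same shape as the elevation layer.
keys = [
    [1, 1, 2],
    [1, 0, 2],
]


def metadata_at(row, col):
    """Location to metadata: return the table row for a node, or None."""
    key = keys[row][col]
    if key == NO_DATA_KEY:
        return None
    return values[key - 1]


def nodes_where(predicate):
    """Metadata to locations: return (row, col) of nodes whose table row matches."""
    matching = {i + 1 for i, record in enumerate(values) if predicate(record)}
    return [
        (r, c)
        for r, line in enumerate(keys)
        for c, key in enumerate(line)
        if key in matching
    ]


full_coverage_nodes = nodes_where(lambda record: record["coverage"])
```

The second direction only evaluates the predicate once per table row, not once per node, which is the practical benefit of keeping repeated metadata in a table.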

Because the key-raster layer corresponds to the other raster layers found in the root node, the georeferencing information within the BAG should also govern this layer. The same number of rows and columns should exist in the key-raster as exist in the root level arrays for elevation and uncertainty.

Keys in the raster are unsigned integers only and should correspond to the row number in the metadata table. Because the HDF structure is self describing there is no need to predefine the byte size of the key-raster data type. A single byte will likely satisfy the needs of many BAG files, but the size could be expanded if a large number of rows are required in the metadata table. The byte size required can be determined at the time of BAG formulation. Because the data type for the key-raster may differ from other raster arrays in the file, the no data value is zero ("0"), and the first row in the value-table shall correspond to integer one ("1").
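Choosing the key width at formulation time could look like the sketch below. The function name is hypothetical, and restricting the choice to 1, 2, 4, or 8 bytes is our assumption based on the common unsigned integer widths; the proposal itself does not prescribe a set of sizes.

```python
def key_byte_size(table_rows):
    """Smallest unsigned integer width, in bytes, able to hold keys 0..table_rows.

    Key 0 is reserved for no data, so the largest key equals the row count.
    """
    if table_rows < 0:
        raise ValueError("table_rows must be non-negative")
    for size in (1, 2, 4, 8):
        if table_rows <= 2 ** (8 * size) - 1:
            return size
    raise ValueError("too many table rows for a 64-bit key")


single_survey = key_byte_size(3)      # a few rows fit in one byte
compilation = key_byte_size(5000)     # thousands of rows need two bytes
```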

At this time we expect the metadata layer concept will be added to the rewrite of the BAG library in C++ with Python wrappers.

Feedback on this proposal to consider:

Some preliminary feedback and suggestions on this proposal are included with additional thoughts.

  1. There is concern that needing to write a raster or refinements layer when the values are the same for the surveyed area is a waste of space, even despite compression. If this is a cause for concern we propose that if there is only one row in the value-table metadata that the key raster or varres_key not be required with the assumption that the metadata applies to all locations where there is data in the elevation layer.

  2. Making this layer type compatible with CF conventions (http://cfconventions.org/) would enable easy viewing without special parsers. A thorough investigation into the CF conventions has not been undertaken. The value of enabling the CF convention needs to be further discussed and defined.

  3. Making each metadata value (column in the value-table) its own layer would be another way to convey this information and also be potentially more flexible. While this is true, it could require many more raster layers which would contain largely the same values. While compressible, many rasters could also add significant clutter while increasing the ambiguity of the metadata available. Having specific metadata types means there can be an expectation of the information available, which is valuable when targeting specific needs, and reduces the number of datasets in the file structure.

  4. What would happen if one of the columns that we need to read is not there for some reason? This is the reason for the data type definitions in the following parts of this proposal. We think a check to ensure compliance with the data type definition can be put in place when the BAG is accessed for read. In this case the layer would be labeled as invalid and ignored if it doesn't meet the defined table parameters.

  5. How would this type of data structure fit within the GDAL data structure? The raster attribute table (RAT) in GDAL is attached to a raster band and would be a natural structure to house the proposed raster and table pair. The only difficulty anticipated by the GDAL team is that the RAT is not compatible with all HDF data types, thus forcing some form of data type mapping if the GDAL library is not updated to include these additional types in the RAT format. More information can be found here: https://www.gdal.org/classGDALRasterAttributeTable.html#details . At this time the RAT concept has been added to the GDAL BAG driver as originally proposed.
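The read-time compliance check mentioned in item 4 could be sketched as follows. All names here are hypothetical, the registry is a plain dictionary standing in for the list of types registered in a library, and treating an unregistered type as invalid is our assumption.

```python
def validate_layer(table_columns, required_columns):
    """Return (is_valid, missing) for a value-table checked against a definition."""
    missing = [name for name in required_columns if name not in table_columns]
    return (not missing, missing)


def readable_layers(layers, registry):
    """Keep only layers whose declared type is registered and whose table is complete.

    layers:   dict of layer name -> {"type": str, "columns": list of str}
    registry: dict of type name -> list of required column names
    """
    valid = {}
    for name, layer in layers.items():
        required = registry.get(layer["type"])
        if required is None:
            continue  # assumption: an unregistered type is treated as invalid
        is_valid, _ = validate_layer(layer["columns"], required)
        if is_valid:
            valid[name] = layer
    return valid


# Hypothetical registry with a shortened column list for illustration.
registry = {"OCS-2022-10": ["coverage", "bathy_coverage"]}
layers = {
    "good": {"type": "OCS-2022-10", "columns": ["coverage", "bathy_coverage"]},
    "bad": {"type": "OCS-2022-10", "columns": ["coverage"]},
}
usable = readable_layers(layers, registry)
```

Returning the list of missing columns alongside the verdict lets a caller report why a layer was ignored rather than dropping it silently.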

Part B - OCS-2022-10 Profile

As noted in The Cathedral and the Bazaar (E.S. Raymond 2001), "Every good work of Software starts by scratching a developer's personal itch." In this case the NOAA National Bathymetric Source (NBS) Project envisions using the BAG format to convey bathymetry, a description of the quality of the bathymetry, and information about the source. The idea is to encapsulate the information needed to answer the questions "how deep is it?", "how well do we know it?", and "where did it come from?". While the current format contains a vertical uncertainty layer to quantify "how well do we know it?", we wish to include additional information we often provide in the sub attributes of the S-57 metaobject M_QUAL. Thinking forward, this should be formulated as the S-101 Quality of Bathymetric Data meta type with the assumption that the information may be backported to S-57 where needed. We think adding this information aligns well with enabling the idea of a navigation surface. We also think this additional information will increase the value of the products we produce and enable seamless internal use of our data.

The specific use case envisioned for the NBS project adds value in two phases. First, processes downstream can utilize the bathymetry quality information directly in the BAG without having to carry around a supplemental file. The NBS supersession logic uses this quality information to sort which data should be used as part of the national bathymetry, so having quality directly integrated into the BAG is valuable. Also, this is the same information that is currently provided to our Marine Chart Division as a supplemental file, so it would streamline that process as well. Second, the NBS project currently creates a BAG which contains the amalgamation of many different sources with different qualities. Having these various qualities tracked within the BAG would simplify the products from the NBS and streamline our process.

It is worth noting that NOAA commissioned surveys rarely have more than one or two designations for the quality of data, so when a BAG represents a survey we expect only a few entries in the attribute table. However, for BAGs that support a combination of many surveys, a large number (thousands) of entries in the table may exist as demonstrated in some of the BlueTopo compilations.

Requirements

  1. Carry the established NOAA Office of Coast Survey metadata attributes for defining Quality of Bathymetric Data attributes and information about the source.

  2. Enable the representation of survey coverage (from side scan, etc) as distinct information from bathymetry.

  3. Carry georeferenced data license information for all contributing data.

  4. The license must be available through a URL or DOI.

  5. Carry identifying information about the source institution and survey.

Project Description

As an implementation of the georeferenced metadata layer type discussed in Part A of this proposal, this section is meant to define a georeferenced metadata type corresponding to the S-101 Quality of Bathymetric Data with some additional attributes. This profile is expected to be more of a registration of the data types, as a library accessing this information should just pass it to a user for interpretation. A writer of this information may wish to implement some checks for consistency in the metadata type dependencies declared in S-101 and included in the discussion here.

The data type declared in the node containing the key and value data should be labeled with the profile name, such as "OCS-2022-10". We have chosen to provide a date with this profile name to make it distinct from future Office of Coast Survey profiles.

The proposed fields for the metadata table are depicted in Figure 4 and summarized in Table 2.

image_4
Figure 4 - The OCS-2022-10 type as an implementation of Figure 2.

Table 2 - A summary of the metadata fields corresponding to S-101 quality of bathymetric data to be contained in the "values" table. Values which directly reference the S-101 definition are noted. See later discussion on the other items.

| Column Name | Column Type | Note |
| --- | --- | --- |
| significant_features | Boolean | See S-101 significant features detected. |
| feature_least_depth | Boolean | See S-101 least depth of detected feature measured. |
| feature_size | Float | See S-101 feature size. |
| feature_size_var | Float | See further discussion (4) |
| coverage | Boolean | See S-101 full seafloor coverage achieved |
| bathy_coverage | Boolean | See further discussion (5) |
| horizontal_uncert_fixed | Float | See S-101 horizontal position uncertainty fixed |
| horizontal_uncert_var | Float | See S-101 horizontal position uncertainty variable factor |
| survey_date_start | String | See S-101 Survey date start |
| survey_date_end | String | See S-101 Survey date end |
| Source Institution | String | e.g. "NOAA Office of Coast Survey" |
| Source Survey ID | String | e.g. "H99999" |
| License Name | String | e.g. "CC0 1.0" |
| License URL or DOI | String | e.g. "https://creativecommons.org/publicdomain/zero/1.0/" |

Many of the proposed fields for the value table correspond directly to the S-101 metaclass Quality of Bathymetric Data. There are, however, some exceptions.

  1. Depth range maximum and minimum in S-101 are omitted. If the depth range information is required, the assumption is that the corresponding nodes in the elevation layer can be queried for a minimum and maximum depth for each table row.

  2. Data assessment in S-101 is omitted.

  3. Temporal variability in S-101 is omitted.

  4. feature_size_var is meant to augment feature_size, which corresponds to S-101 size of features detected. As noted in S-101, size of features detected is intended to be described as the smallest size in cubic metres the survey was capable of detecting. Depending on the type of survey this definition might force different depth ranges to have different values. For example, a survey vessel that works at a fixed height off the seafloor could maintain a fixed feature detection size capability over a wide range of depths. A surface vessel working over that same range of depths may have a feature detection capability that varies with depth, causing the detection capability to be ambiguous and potentially misrepresented. For this reason feature_size_var expresses the detectable feature size as a percentage of depth. When both feature_size and feature_size_var are present the greater of the two should be considered valid. The expectation is that feature_size_var will be set to zero if the feature size does not scale with depth. As with feature_size, feature_size_var should be ignored if significant_features is False.

  5. Coast Survey often uses side scan to detect features in flat seafloor areas, thus these surveys have coverage that does not contain direct depth measurements. In these cases bathy_coverage would be set to False for the nodes with survey coverage but without bathymetry. A condition with coverage = True and bathy_coverage = False is a useful indicator for how to work with these nodes within our workflow. If coverage is False, bathy_coverage must also be False.

  6. Vertical uncertainty is excluded from this table as the vertical uncertainty is reported node by node within the BAG structure.

  7. As NOAA Office of Coast Survey has begun to ingest more data not owned by NOAA, proper tracking of data licensing is important for determining where data can be used internally or in public products. Particularly in the case of the National Bathymetric Source Project, tracking the various contributing sources within a final product can be helpful. While BAG already maintains a use qualifier within the top level metadata, a georeferenced metadata layer would allow for proper tracking of the data license by node to ensure information regarding restrictions on redistribution of data is not lost.
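Items 1, 4, and 5 above imply a few derived values and consistency rules that a writer or downstream tool might apply. The sketch below is one possible reading, not a definitive implementation: the function names are hypothetical, feature_size_var is assumed to be stored as a percent (5 meaning 5% of depth), depth is assumed to be given in metres, and missing elevation nodes are represented as None.

```python
def effective_feature_size(row, depth):
    """Detectable feature size at `depth`, or None when the fields do not apply."""
    if not row["significant_features"]:
        return None  # both feature size fields are ignored in this case
    scaled = abs(depth) * row["feature_size_var"] / 100.0
    # The greater of the fixed and depth-scaled sizes is considered valid.
    return max(row["feature_size"], scaled)


def coverage_is_consistent(row):
    """bathy_coverage may only be True where coverage is also True."""
    return row["coverage"] or not row["bathy_coverage"]


def depth_range(keys, elevation, key):
    """Min and max elevation over the nodes carrying `key`, or None if there are none."""
    found = [
        elevation[r][c]
        for r, line in enumerate(keys)
        for c, k in enumerate(line)
        if k == key and elevation[r][c] is not None
    ]
    return (min(found), max(found)) if found else None


# Hypothetical table row for a side scan area without direct depth measurements.
row = {
    "significant_features": True,
    "feature_size": 2.0,
    "feature_size_var": 5.0,
    "coverage": True,
    "bathy_coverage": False,
}
size_deep = effective_feature_size(row, 100.0)
size_shallow = effective_feature_size(row, 20.0)
```

With these example values the depth-scaled size governs in deep water and the fixed size governs in shallow water, which is the behaviour item 4 describes for a surface vessel.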

Feedback on this proposal to consider:

Some preliminary feedback and suggestions on this proposal are included with additional thoughts.

  1. Can the metadata column names follow the S-101 names exactly so as not to create a new set of names that need to be mapped? The answer is yes, but this would make the names very long. We are happy to support a clear convention suggested by the Open Navigation Surface Working Group.

Appreciation

The concepts described in this proposal are the summary of a lot of people's work over a non-negligible amount of time. We appreciate this sustained effort and look forward to success in the near future.

GlenRice-NOAA commented on August 24, 2024

This work has been implemented. Thanks to all for their effort on moving this concept forward.
