noaa-owp / ngen-cal
Tools for calibrating and configuring NextGen
Home Page: https://github.com/NOAA-OWP/ngen-cal/wiki
License: Other
ngen-cal/python/ngen_cal/setup.cfg
Line 33 in 39e2edf
ngen-cal/python/ngen_cal/ngen.py
Line 240 in fdca562
From the src, it looks like what this should expand to is the following:
class NgenExplicit(NgenBase):
    strategy: Literal["explicit"]
However, when you export the model as a json schema using pydantic, you get the following:
// ...
"strategy": {
    "title": "Strategy",
    "enum": [
        "explicit"
    ],
    "type": "string"
}
// ...
I think the following should remedy this:
class NgenExplicit(NgenBase):
    strategy: Literal[NgenStrategy.explicit.value] = NgenStrategy.explicit.value
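A quick way to verify a change like this is to export just that field's schema (a sketch; the import path for NgenExplicit is assumed and may differ):

import json
from ngen.cal.ngen import NgenExplicit  # module path assumed

# Inspect how pydantic renders the strategy field in the JSON schema.
print(json.dumps(NgenExplicit.schema()["properties"]["strategy"], indent=2))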
ngen-cal/python/ngen_conf/src/ngen/config/multi.py
Lines 52 to 58 in 05fe8ce
In configurations.py, class Forcing, a NetCDF provider needs to be added.
Support calibrating a layer/regional model combined with catchment specific model formulations.
flowdveldepth_Ngen.h5 is hard coded in ngen.py. Different versions of t-route had different file names. This would be better defined in the configuration file.
Need to review metric_functions and their usage in search.py. Some may be simplified; others may be good candidates to PR to hydrotools.metrics.
Combine objective and metric_function modules.
Official hydrofabric releases are no longer in geojson form. While the required information can be extracted from the hydrofabric geopackages and transformed to geojson format currently supported by ngen-cal, it would be best to provide direct support for reading the geopackage.
This would also reduce the amount of hydrofabric related keys in the input configuration file, reducing the chances of configuration errors.
Current behavior: ngen-cal only works with geojson inputs.
Expected behavior: ngen-cal works with geopackage inputs using the latest stable hydrofabric release.
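As a sketch of what direct geopackage support could look like (hydrofabric layer names such as "divides" and "nexus" vary across releases, so the names here are assumptions):

import geopandas as gpd

def read_hydrofabric_gpkg(path: str):
    # Read hydrofabric layers straight from the geopackage instead of
    # requiring a pre-converted geojson.
    catchments = gpd.read_file(path, layer="divides")
    nexuses = gpd.read_file(path, layer="nexus")
    return catchments, nexuses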
Some new BMI configs need to be added to the ngen config generator; these are the ones that need to be added:
CFE needs to be updated/versioned for the latest version as well.
After seeing some tests fail, I noticed that the two GitHub Actions workflows in the repo have the same name and appear to be doing the same thing; however, python-package.yml is out of date. Can we remove python-package.yml?
The workdir is no longer part of model_conf as an input, so this line should be either
relative_to = self.job.workdir
or, probably better, change the following line
self._model.model.resolve_paths(relative_to)
to
self._model.model.resolve_paths(self.job.workdir)
and simply remove the relative_to temporary variable.
Remove plotting from the ngen-cal package and put plotting scripts in a higher-level plotting/utility directory.
validation_run.py should be its own script on par with calibration.py. Could also add support in main.py to run validation.
This will have significant overlap with #102.
Partially addressed with #97 for adding support to ngen_conf for new modules.
Additional configuration mechanics need to be evaluated from the createInput package and aligned with ngen-cal configuration.py where it makes sense.
Additional customizations and configuration creation may be suitable for a user script.
ngen now supports a model param mapping to link model params to hydrofabric feature attributes.
The config structure can be seen here in this test code.
The config schema should be updated to reflect this change and properly validate/generate configs that may have this structure.
The current usage guide on the wiki shows a configuration for DDS which implies that it is the only supported optimization strategy, which isn't the case. At the very least, the config shown there should be modified to remove that disclaimer, and perhaps a section added on how to switch to using PSO.
This should be discriminator. Likewise, the discriminator should be model_type_name, not model_name. model_type_name is the alias for model_name, meaning model_type_name is the key in the configuration file.
Edit:
While this was an issue, correctly spelling discriminator forced all model_type_name fields to be type Literal. This is okay in most cases; however, it is not desirable for MultiBMI. MultiBMI's model_type_name field should not be constrained like the other BMI model pydantic model types.
However, there still needs to be a way to determine the correct pydantic model variant from the union of formulations, KnownFormulations. The name and model_type_name fields are used for this purpose. name corresponds to the bmi adapter name used by that model (e.g. bmi_fortran) and model_type_name roughly corresponds to the model name (e.g. NoahOWP). For non-multi BMI models, name and model_type_name are constant values (e.g. bmi_fortran and NoahOWP for the NoahOWP pydantic model). In the case of the MultiBMI pydantic model, only name is a constant value, bmi_multi; model_type_name can be any valid string.
However, it feels tedious that it would be required to specify these constants when programmatically creating, for example, a NoahOWP instance. For this reason, these constants are not required when creating a KnownFormulations type. Likewise, it is possible to pass a KnownFormulations instance when creating a Formulation object. However, to deserialize into a Formulation object, name and model_type_name must be present for non-multi BMI types, and potentially only name, equal to bmi_multi, for the multi BMI type.
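A minimal sketch (pydantic v1, matching the rest of the repo) of how these constants can discriminate the union; the class and field definitions here are illustrative, not the actual ngen.config models:

from typing import Literal, Union
from pydantic import BaseModel, Field

class NoahOWP(BaseModel):
    # Constant values with defaults, so they need not be passed programmatically.
    name: Literal["bmi_fortran"] = "bmi_fortran"
    model_name: Literal["NoahOWP"] = Field("NoahOWP", alias="model_type_name")

class MultiBMI(BaseModel):
    name: Literal["bmi_multi"] = "bmi_multi"
    model_name: str = Field(..., alias="model_type_name")  # any valid string

KnownFormulations = Union[NoahOWP, MultiBMI]

class Formulation(BaseModel):
    # pydantic >= 1.9 selects the union variant by the Literal name field.
    params: KnownFormulations = Field(..., discriminator="name")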
On executing ngen-cal, I received the below traceback ending with an error suggesting the pandas merge function was being applied to an object of NoneType. The error was produced in cal/search.py, line 25, in _objective_func, when the following was executed: pd.merge(simulated_hydrograph, observed_hydrograph, left_index=True, right_index=True).
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/main.py", line 87, in <module>
    main(general, conf['model'])
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/main.py", line 63, in main
    func(start_iteration, general.iterations, agent)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 190, in dds_set
    _evaluate(0, calibration_set, info=True)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 56, in _evaluate
    score = _objective_func(calibration_object.output, calibration_object.observed, calibration_object.objective, calibration_object.evaluation_range)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/ngen/cal/search.py", line 25, in _objective_func
    df = pd.merge(simulated_hydrograph, observed_hydrograph, left_index=True, right_index=True)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 74, in merge
    op = _MergeOperation(
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 593, in __init__
    _left = _validate_operand(left)
  File "/home/west/git_repositories/ngen_20231127_calib/ngen/venv/lib/python3.10/site-packages/pandas/core/reshape/merge.py", line 2066, in _validate_operand
    raise TypeError(
TypeError: Can only merge Series or DataFrame objects, a <class 'NoneType'> was passed
The code runs, producing an automated approach to calibration.
On initial run, the working directory is workdir as defined in the calibration configuration file.
In Ubuntu 22.04:
1. Use build_ngen_calib.sh in the attached zip folder to build ngen and set up ngen-cal. You may wish to edit the file to specify where ngen is built, for example.
2. Create a symlink in the attached folder to the ngen folder after it is built.
3. Run ngen-cal with python -m ngen.cal calib_config_CAMELS_CFE_Calib_Sep_2.yaml
This should run, but let me know when it doesn't.
After scouring through the code, I found that JobMeta() in cal/meta.py takes arguments for both parent_workdir and workdir:
def __init__(self, name: str, parent_workdir: Path, workdir: Path=None, log=False):
But when JobMeta was called from the Agent() class, self._job was None, triggering the following call to JobMeta with only a value for parent_workdir provided.
https://github.com/NOAA-OWP/ngen-cal/blob/master/python/ngen_cal/src/ngen/cal/agent.py#L80
self._job = JobMeta(model_conf['type'], workdir, log=log)
So, workdir was being passed to JobMeta() as parent_workdir, and workdir was being passed as None (the default value), which triggered the xxx_worker directory to be the main working directory.
By providing workdir twice to JobMeta, the calibration seems to be running as expected.
self._job = JobMeta(model_conf['type'], workdir, workdir, log=log)
Add register_bmi_pet as PET's default registration_function.
Goal: add the capability of evaluating calibration based on runoff (results in cat files). This is relevant when we want to isolate hydrological uncertainties.
Changes will include additional parameters and modification of the calibration_set.py function output.
See description here or below.
Eval_method: 'runoff' (default is streamflow; future implementations might include an array like [precipitation, runoff, soil moisture])
- runoff: evaluates the volume of runoff generated in the basin. Not necessarily routed.
- streamflow: evaluates the routed runoff.
Evaluation_variable: "QOUT"
Column number or name of the variable to be evaluated in the ngen model output. Future implementations might include an array of variables [RAINFALL, QOUT, SW_STORAGE]. This maps the Eval_method to the simulation variable that corresponds to it.
Resampling:
This is an optional configuration. If not provided, evaluation will be performed at the temporal resolution of the simulated data. This is especially important when evaluating runoff, since we need to remove uncertainties due to the lack of routing.
Resampling_time: 'D'
Temporal resampling. Follows the pandas resample standard. For example, 'D' is for day, 'H' is for hour.
Resampling_method: 'sum'
# Optional parameter. Default is sum. Follows the pandas resample standard.
unit: "meter/hour"
If Eval_method == "streamflow": existing logic.
If Eval_method == "runoff":
Aggregate runoff for all cat files upstream of the point of interest. See logic below:
import geopandas as gp
import pandas as pd

catchment_file = 'catchment_data.geojson'
Results = "./"
zones = gp.GeoDataFrame.from_file(catchment_file)
for index, row in zones.iterrows():
    catstr = row[0]
    area = zones.area_sqkm.iloc[index]
    Cat_out_file = Results + catstr + ".csv"
    Cat_out = pd.read_csv(Cat_out_file, parse_dates=True, index_col=1)
    # Coerce string indices to datetimes so the series align when summed.
    if isinstance(Cat_out.index.min(), str):
        Cat_out.index = pd.to_datetime(Cat_out.index)
    # Area-weighted running sum of per-catchment runoff.
    if index == 0:
        Total_out = Cat_out.copy() * area
    else:
        Total_out = Total_out + Cat_out * area
# Normalize by total basin area to get basin-average runoff.
Total_out = Total_out / sum(zones.area_sqkm)
Change the units*: for now we will change the units of the observed data to match the simulated. Another option is to change the unit of the simulated data. Eventually this will talk to unit test, but for now, if "unit" is mm/hour, convert the observed to mm/hour: Obs_q_mh = (Obs_Q_cms / total_area) * 3.6 / 1000.
Resample in time: resample both the observed and simulated in time.
*Note: might be a better fit for the evaluate function.
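For the resampling step, a small pandas illustration (the series here are hypothetical; 'D' and 'sum' correspond to the Resampling_time and Resampling_method keys proposed above):

import pandas as pd

# Hypothetical hourly simulated/observed series indexed by datetime.
idx = pd.date_range("2020-01-01", periods=48, freq="H")
simulated = pd.Series(range(48), index=idx)
observed = pd.Series(range(48), index=idx)

# Resample both to daily totals before computing metrics.
sim_daily = simulated.resample("D").sum()
obs_daily = observed.resample("D").sum()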
general.strategy.algorithms for gwo in strategy.py
Per feedback on PR #17:
Consider adding a complementary ModuleConfig type instead of putting, e.g., CFE in the same hierarchy as BMIC.
This is problematic because it depends on which model is run before CFE. If we are running Noah-OWP:
_variable_names_map = {
    "water_potential_evaporation_flux": "EVAPOTRANS",
    "atmosphere_water__liquid_equivalent_precipitation_rate": "QINSUR"
}
If running PET, we don't need to define anything, since the variables are named appropriately. Or, for completeness:
"variables_names_map": {
"atmosphere_water__liquid_equivalent_precipitation_rate": "atmosphere_water__liquid_equivalent_precipitation_rate",
"water_potential_evaporation_flux": "water_potential_evaporation_flux"
}
Not sure why that needs to change in the realization file. Can we just keep those variables the same as provided by the user?
Add an optional output_root: Optional[Path] key. Need to double check all the places where it is accepted; I think just ngen.config.NgenRealization? This was added to NextGen in NOAA-OWP/ngen#531.
Expand ngen-cal utilities, consider some simple event notifications which allow user/caller scripts to do what they see fit.
Do not leave user property in Evaluatable or EvaluationOptions!
When a gaged nexus is shared by two or more catchments, a multi-catchment calibration should still be reasonable to perform.
In this scenario, a runtime error is raised: RuntimeError( "Currently only a single nexus in the hydrfabric can be gaged")
Allow the calibration to proceed, since it is still only a single gaged nexus, even though it has multiple catchments upstream of it.
This is likely a logical bug that comes from tracking the eval_nexus found in the hydrofabric using a list, and not a set. It should be as simple as changing that data structure so that duplicate nexus ids don't imply multiple gaged nexuses.
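A tiny sketch of the suggested change (the crosswalk data here is hypothetical):

# Two catchments sharing one gaged nexus should count as one gaged nexus.
crosswalk_pairs = [("cat-1", "nex-10"), ("cat-2", "nex-10")]  # hypothetical
eval_nexus = {nexus_id for _, nexus_id in crosswalk_pairs}
assert len(eval_nexus) == 1  # one gaged nexus, two upstream catchments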
The calibration.py file is a second interface to the ngen-cal package, and the functionality has been migrated to the package's __main__.py module. This should probably get deprecated and removed as the calibration entry point, and users/workflows should update to calling the package directly.
Incorporate improvements to handling outputs from both models and calibration package itself.
Consider an I/O mixin/abstraction to help especially with the overlap of the semantics for things like "calibration" and "validation" output requirements.
In brief, calibration typically takes on the form of a run -> evaluate -> adjust loop. Given that ngen.cal is a library, it is desirable to expose APIs for users to hook into these lifecycle events so they can do things with them. Some examples:
Of course, you could do this by creating subclasses and overriding the current implementation; however, that requires a high level of knowledge of the code to do what is often a simple task. A less intrusive approach is to implement some kind of event broker that emits events during certain points in the lifecycle of a calibration and provide a way for client code to hook into those events. This is commonly done by registering a callback function for a given event type that the broker calls when the event is emitted. However, this is by no means the only mechanism for providing this kind of functionality.
Agnostic of the implementation, the goal of this potential addition is to provide a solution with a simple and straightforward API that has a low barrier to usage. Meaning, you don't have to know much, if anything, about the internals of ngen.cal to get something working. A sketch of the callback-registration idea follows.
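A minimal sketch of the event-broker idea (the class, event names, and payloads here are illustrative, not an actual ngen.cal API):

from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

class EventBroker:
    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[..., None]) -> None:
        # Client code registers a callback for a named lifecycle event.
        self._handlers[event].append(handler)

    def emit(self, event: str, **payload: Any) -> None:
        # The library calls this at lifecycle points; all callbacks fire.
        for handler in self._handlers[event]:
            handler(**payload)

# Usage: hook a lifecycle event without touching library internals.
broker = EventBroker()
broker.subscribe("iteration_finished", lambda iteration, score: print(iteration, score))
broker.emit("iteration_finished", iteration=1, score=0.42)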
There are two GitHub action workflow files that essentially do the same thing -- running unit tests for the package. We need to pick one to use and remove the other so we aren't running the tests twice.
Per @mattw-nws comment on #41:
Strongly believe we shouldn't change this/use the same name... can we instead add
"one_minus_kling_gupta": objectives.kge,
... and really, also change kge to be named one_minus_kge?
Alternatively, we could add code that detected a non-zero target whenever PSO was used, and automatically optimize for target - f(x) instead, issue a NOTICE, and then re-invert the value when we're outputting the objective scores to the user? This last part would be important if we do "magic" like this.
We need to take some steps to address these issues.
I receive the following exception when exporting either LSTM's or BMIPython's json schema. Likewise, this affects all classes that either inherit or compose these classes.
ValueError: Value not declarable with JSON Schema, field: name='python_type_PyObject' type=PyObject required=True
I verified this issue is coming from the use of pydantic.PyObject as a type hint in the above models. I've not looked into possible solutions for this, but I'm sure we can figure out a workaround.
from ngen.config.realization import Realization

with open("Realization.schema.json", "w") as fp:
    fp.write(Realization.schema_json(indent=2))
As it stands now, ngen.config's pathlib.Path pydantic model fields do not validate if the path exists. This is done on purpose to improve usability if you are, for example, programmatically generating a configuration file or just performing validation. In some cases, however, it is really useful to determine if files exist or not.
Add a utility function that, given a pydantic model instance, recursively validates that any pathlib.Path field exists. It should be easy to accomplish this using the magic property __fields__ on a pydantic.BaseModel instance. It is important not to fail fast and that all files that do not exist are captured in some sort of error group that is accessible by the caller.
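A rough sketch of such a utility (pydantic v1; the function name and return shape are illustrative), collecting every missing path rather than failing fast:

from pathlib import Path
from typing import List
from pydantic import BaseModel

def missing_paths(model: BaseModel) -> List[Path]:
    missing: List[Path] = []
    for name in model.__fields__:
        value = getattr(model, name)
        if isinstance(value, Path) and not value.exists():
            missing.append(value)
        elif isinstance(value, BaseModel):
            # Recurse into nested models so no Path field is skipped.
            missing.extend(missing_paths(value))
    return missing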
Using the default mapping in config/cfe.py causes PET+CFE calibration to fail, but removing it causes PR checks for build (3.7) and build (3.8) to fail.
I submitted a PR commenting out the default mapping (Update cfe.py - remove QINSUR as default variable name mapping #90), but configuration checks fail with the below error. It appears the tests expect the default mapping to be present:
=========================== short test summary info ============================
FAILED python/ngen_conf/tests/test_cfe.py::test_name_map_default - KeyError: 'atmosphere_water__liquid_equivalent_precipitation_rate'
FAILED python/ngen_conf/tests/test_multi.py::test_name_map - KeyError: 'atmosphere_water__liquid_equivalent_precipitation_rate'
======================== 2 failed, 101 passed in 0.46s =========================
Error: Process completed with exit code 1.
Calibrating with PET+CFE works.
Various configuration parameters need to be available to a user to customize the calibration being performed. These should be loaded by a configuration file.
Several calibration-specific configuration variables are currently hard coded in calibration.py and require manual editing to change them.
A configuration file should be passed to main which sets the various configuration options, with sensible defaults where available.
Create a Calibratable abstract interface that can be used by DDS or other implemented search algorithms to define the required parameter and data access to support automated calibration of a Calibratable object. This will be used to extend the hypy.Catchment class to make a CalibratableCatchment that calibration algorithms can use. A sketch of what such an interface might look like is below.
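A rough sketch (method names and signatures are illustrative, not the actual ngen.cal API):

from abc import ABC, abstractmethod
import pandas as pd

class Calibratable(ABC):
    @abstractmethod
    def df(self) -> pd.DataFrame:
        """Parameter dataframe: name, min, max, and current value per parameter."""

    @abstractmethod
    def output(self) -> pd.DataFrame:
        """Simulated output used to evaluate this object."""

    @abstractmethod
    def observed(self) -> pd.DataFrame:
        """Observed data the simulation is evaluated against."""

    @abstractmethod
    def update(self, iteration: int) -> None:
        """Apply the parameter values for the given iteration before a run."""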
I am not sure why this is commented out. If we are not explicitly running any PET module, then CFE takes PET (i.e., EVAPOTRANS) from NOAH_OM as an input, or am I missing something here?
#"water_potential_evaporation_flux": "EVAPOTRANS",
Originally posted by @ajkhattak in #49 (comment)
@hellkite500, @ajkhattak raised this question in reviewing #49. Can you have a look at this and add some of your insight?
I ran into an issue performing a calibration that resulted in a RuntimeError: Currently only a single nexus in the hydrfabric can be gaged exception. @mattw-nws shared a patch to get around this issue; however, it appears we can avoid this error and improve error handling if we are to formalize a crosswalk format.
At the present it appears the expected format is:
{
    "(cat|wb)-[0-9]+": {
        "Gage_no": int
    },
    ...
}
ngen-cal/python/ngen_cal/src/ngen/cal/ngen.py
Line 122 in 55968df
@hellkite500, are you open to introducing a pydantic model(s) to handle deserializing crosswalk files? I think something like the following will work:
from pydantic import BaseModel, Field, constr

# typing imports
from typing import Dict

class CrosswalkItem(BaseModel):
    gage_no: int = Field(..., alias="Gage_no")

class Crosswalk(BaseModel):
    __root__: Dict[constr(regex="^(cat|wb)-[0-9]+$"), CrosswalkItem]
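Usage might then look like this (assuming the models above; the crosswalk data shown is hypothetical):

# Deserialize a crosswalk document and access a gage number.
xwalk = Crosswalk.parse_obj({"cat-67": {"Gage_no": 1}})
print(xwalk.__root__["cat-67"].gage_no)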
Feature enhancement to allow for parameter sensitivity runs.
Only automated calibration is currently possible with this code.
Provide an option and the mechanics to run multiple simulations with sampled parameters outside the DDS loop and guided calibration.
Need to add layer information/support to ngen config.
This spec isn't well defined, and is undergoing a bit of refactoring even now, but we need to add it soon.
Right now only the last result is saved. It would be helpful to save the best result, and also the realization file for the best result.
ENHANCEMENT
The ngen-cal code uses HY_Features to access NWIS data to get USGS streamflow data. It uses this data to calculate performance metrics against the simulated streamflow. The data is not saved, though.
Include options in the calibration configuration file to: 1. specify if the user would like to save the observed data, and 2. if 1 is True, provide a directory in which to save the data.
For better or worse, this repository has become the landing space for many NextGen related tools. Namely, tools for calibration, realization configuration, init config representation, and configuration generation. I like that this repo is operating as a monorepo for a lot of our tooling around NextGen. It is, so far, easy to maintain, and once you've found it, you've found the majority of the tooling around NextGen. However, ngen-cal is no longer an appropriate name. For our sake and, more so, the community's sake, I think we should adopt a new repo name sooner rather than later. A different name that better encapsulates the tools in this repo will help others discover these tools and hopefully drive usage of these tools.
To start the discussion, below are a few suggestions. Please think about this carefully and add others that come to mind before reading my suggestions.
ngen-tools
ngen-utils
ngen-py
ngen-tooling
When ngen routing is required, consider having the config generator generate the appropriate t-route yaml config file with sensible defaults.
NGEN Config sets the fixed_time_step default to False.
In the NGEN BMI documentation this is implied to be True:
fixed_time_step - boolean value to indicate whether this model has a fixed time step size - implied to be true by default
Should NGEN Config's fixed_time_step default be True?
Let's open an issue to come back to LSTM configs and see if we can use a target `train_cfg_file` to determine fields required to support that particular model. Having a "default" for now is a good place to start.
Originally posted by @hellkite500 in #49 (comment)
For diagnosing how parameters drive model behavior it would be useful to have all simulated hydrographs saved as output during calibration.
Calibration script outputs hydrograph corresponding to best parameter set.
Output hydrographs for all parameter sets, either as default behavior or as an option. (The latter is likely more useful as output will get large as the study domain expands.)
ngen-cal/python/ngen_cal/strategy.py
Line 19 in fdca562
This enum should be a class Objective(str, Enum).
When Objective is used as a type in the Estimation pydantic model, the resulting json_schema does not qualify that the objective field is constrained to string types.
"definitions": {
// ...
"Objective": {
"title": "Objective",
"description": "Enumeration of supported search algorithms\n\n ",
// missing "type": "string",
"enum": [
"custom",
"kling_gupta",
"nnse",
"single_peak",
"volume"
]
},
"Estimation": {
"title": "Estimation",
"description": "Estimation strategy for defining parameter estimation",
"type": "object",
"properties": {
// ...
"objective": {
"title": "Objective",
"default": "custom",
"anyOf": [
{
"$ref": "#/definitions/Objective"
},
// ...
]
},
// ...
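A minimal sketch of the suggested fix, mixing in str so pydantic emits "type": "string" for the enum (member spellings follow the values shown above; the actual definition lives in strategy.py):

from enum import Enum

class Objective(str, Enum):
    # The str mixin makes pydantic emit "type": "string" in the JSON schema.
    custom = "custom"
    kling_gupta = "kling_gupta"
    nnse = "nnse"
    single_peak = "single_peak"
    volume = "volume"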
ngen-cal/python/ngen_conf/src/ngen/config/init_config/noahowp.py
Lines 82 to 83 in 1e8189a
#ifndef NGEN_FORCING_ACTIVE
call open_forcing_file(namelist%forcing_filename)
#endif
#ifndef NGEN_OUTPUT_ACTIVE
call initialize_output(namelist%output_filename, domain%ntime, levels%nsoil, levels%nsnow)
#endif
If NoahOWP modular is compiled for use in NextGen, forcing and output will be handled by NextGen. That leads me to think these fields should either be marked as typing.Optional, or NextGen and non-NextGen variants of the NoahOWP init config should be created. Alternatively, should these fields be removed? Thus far, with regard to init config files, the goal of ngen.config has been to capture the structure of well-known modules' (e.g. NoahOWP modular) NextGen-compliant init config files. We have not focused on capturing all possible variants of a module's config, just the variant that can work with NextGen.
Observation values coming from hydrolocation (NWIS) are in ft^3/s (cubic feet per second) and model outputs are in m^3/s (cubic meters per second).
This either needs to be adjusted here or upstream in the hypy NWIS location. Regardless of where it gets addressed, it needs to be better documented. An illustrative conversion is below.
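For reference, a small sketch of the conversion (not existing ngen-cal code; the name is illustrative):

# NWIS observations in ft^3/s converted to m^3/s to match model output.
CFS_TO_CMS = 0.0283168  # 1 cubic foot is ~0.0283168 cubic meters

def cfs_to_cms(q_cfs: float) -> float:
    return q_cfs * CFS_TO_CMS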