Provides base classes and command-line tools for implementing calibration pipeline software.
Home Page: https://stpipe.readthedocs.io
License: Other
Issue JP-1894 was created on JIRA by Edward Slavich:
Implement the "stpipe run" CLI subcommand. Here is the tentative help output for the command:
$ ./stpipe run -h
usage: stpipe run [-h] [--config <path>] [-d] [-p <name>=<value>] [-v] <class> <input-file>
run a step or pipeline
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
<input-file> input dataset or association
optional arguments:
-h, --help show this help message and exit
--config <path> config file (use 'stpipe print-config' to save and edit the default config)
-d, --debug debug logging (DEBUG level)
-p <name>=<value> override an individual step or pipeline parameter (use 'stpipe describe' to list available parameters)
-v, --verbose verbose logging (INFO level)
examples:
run a pipeline with default parameters recommended by CRDS:
stpipe run jwst.pipeline.Detector1Pipeline dataset.fits
run a pipeline with parameters specified in a local config:
stpipe run --config config.asdf jwst.pipeline.Detector1Pipeline dataset.fits
override an individual pipeline parameter:
stpipe run -p save_calibrated_ramp=true jwst.pipeline.Detector1Pipeline dataset.fits
override an individual step parameter:
stpipe run -p jump.rejection_threshold=3.0 jwst.pipeline.Detector1Pipeline dataset.fits
See #8 for more discussion.
configobj is subject to a ReDoS: GHSA-c33w-24p9-8m24
This regex used to parse the config/spec items suffers from catastrophic backtracking:
This is one useful write-up on the issue: https://www.regular-expressions.info/catastrophic.html
The upstream/bundled project appears abandoned but does have an open PR that appears to fix the offending regex:
DiffSK/configobj#236
Given the level of control necessary to exploit the issue (it seems likely a `Step.spec` could be crafted to exploit it), it seems like a low risk, since a step could just as easily include a `while True` loop. I haven't thought through what this means for command-line options, or whether a ReDoS in Python is limited to one CPU.
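As a toy illustration of the failure mode (this is not configobj's actual regex), a pattern with nested quantifiers such as `(a+)+$` backtracks exponentially on a non-matching input, so each small increase in input length multiplies the running time:

```python
import re
import time

# Toy illustration of catastrophic backtracking (NOT configobj's actual
# regex): the nested quantifiers in (a+)+$ force the engine to try
# exponentially many ways of splitting the input before the match fails.
pattern = re.compile(r"(a+)+$")

for n in (16, 18, 20, 22):
    text = "a" * n + "!"  # the trailing "!" guarantees the match fails
    start = time.perf_counter()
    assert pattern.match(text) is None
    print(f"n={n}: {time.perf_counter() - start:.3f}s")
```

Each two-character increase roughly quadruples the time, which is why an attacker-controlled spec string is the concern.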
Currently, `strun` lives on its own in `scripts` and is pulled in via `setup.py` shenanigans. It should really have its own entry in pyproject.toml:
[project.scripts]
stpipe = 'stpipe.cli.main:main'
and live with the rest of the cli scripts.
Currently coverage uploads are failing.
https://github.com/spacetelescope/stpipe/pull/15/checks?check_run_id=1857492557#step:5:79
Use the GitHub action provided by Codecov to do this in a separate action.
This issue is for discussion of potential improvements to the Python interface for creating and running steps and pipelines. First a summary of the current state of affairs:
Column key:
- *Selects subclass*: Does the method select the `Step` subclass based on arguments or config file contents?

| Method | Selects subclass | Runs step | CRDS pars | Config input | Override parameters | Override style | Notes |
|---|---|---|---|---|---|---|---|
| `__init__` | ✗ | ✗ | ✗ | ✗ | ✓ | keyword | Accepts a `config_file` argument but does not apply parameters from it. |
| `call` | ✗ | ✓ | ✓ | ✓ | ✓ | keyword | User config file passed as keyword argument. Config file's `class` field ignored. |
| `from_cmdline` | ✓ | ✓ | ✓ | ✓ | ✓ | CLI | Selects `Step` subclass based on config `class` field or class name argument. |
| `from_config_file` | ✓ | ✗ | ✗ | ✓ | ✗ | | Selects `Step` subclass based on config `class` field. |
| `from_config_section` | ✗ | ✗ | ✗ | ✓ | ✗ | | Probably not intended to be part of the public API. |
Column key:
- *Creates step*: Does the method create a `Step` instance before running it?

| Method | Creates step | Notes |
|---|---|---|
| `__call__` | ✗ | Alias for `run`. |
| `call` | ✓ | Python API only (not used by CLI code). |
| `from_cmdline` | ✓ | The `strun` script is a thin wrapper around this method. |
| `process` | ✗ | Subclass implementation method. Not intended to be called directly by general users. |
| `run` | ✗ | Eventually called by any method that needs to run the step. |
Proposed changes:

- Remove `from_cmdline` and instead call the corresponding `cmdline` module method directly.
- Rename `process` to make clear that it shouldn't be invoked by users. Maybe a name with a leading underscore, or something like `run_impl`.
- Remove `run` or `__call__` so that usage is uniform.
- Rename the `config_file` argument to `__init__` to something like `working_dir`, to make clear that the config is not loaded.
- Share step creation code between the Python and CLI interfaces (`from_cmdline` vs `call`). This will ensure consistency between the two interfaces. There is already some divergence between `call` and `from_cmdline`, e.g. the `_pars_model` attribute is not set by `call`, and `call` doesn't know how to select the `Step` subclass based on a config.
- Accept parameters in a `dict` argument instead of `**kwargs`. This provides a clear separation between the parameters and other method arguments.

Proposed interface:

- `Step.__init__(self, params=None, working_dir=None, ...)`: Parameters are passed to the initializer in a `dict`.
- `Step.call_impl(self, *args)`: `Step` subclass implementation.
- `Step.__call__(self, *args)`: Wrapper around `call_impl` that handles common setup and teardown.
- `stpipe.create_step(*, step_class=None, config_path=None, crds_params_enabled=True, dataset=None, params=None, working_dir=None, ...)`: Convenience method for creating steps. At least one of `step_class` or `config_path` is required to determine the step class. `dataset` is required if `crds_params_enabled` is `True`.
- `stpipe.cmdline.from_cmdline(args)`: Method that parses CLI arguments. Ends in a call to `stpipe.create_step`.
```python
from stpipe.cmdline import from_cmdline

# stpipe step run config.cfg dataset.asdf --foo=42
step, inputs = from_cmdline(args)
step(*inputs)
```
```python
from stpipe import create_step

step = create_step(config_path="config.cfg", dataset="dataset.asdf", params={"foo": 42})
step("dataset.asdf")
```
or
```python
from stpipe import create_step

step = create_step(config_path="config.asdf", dataset="dataset.asdf")
step.some_param = "some_value"
step()
```
```python
from stpipe import Step

class MyStep(Step):
    def call_impl(self, dataset):
        print(f"Value of foo: {self.foo}")

step = MyStep(params={"foo": 42})
step("dataset.asdf")
```
Unless `self.save_model(format=False)` is called explicitly, the default file save implementation in `Step.run` does not allow for output filename format control. A parameter could be added to the `Step` spec that, when set, passes `format=False`.
Issue connected to JP-1793
Issue JP-1889 was created on JIRA by Edward Slavich:
The stpipe.create_step method will be the new Python API for creating CRDS-configured Step instances. Tentative method signature:
```python
def create_step(input_file, step_class=None, config_file=None):
    """
    Create a Step instance with parameters configured by CRDS.

    One of step_class or config_file must be specified.

    Parameters
    ----------
    input_file : str or pathlib.PurePath or stdatamodels.DataModel or stdatamodels.ModelContainer
        Dataset whose header will be used to select a config reference
        from CRDS.  Also the file assigned to the Step's input_file
        parameter.
    step_class : type, optional
        Step class to instantiate.  Must be a subclass of stpipe.Step.
    config_file : str or pathlib.PurePath or asdf.AsdfFile, optional
        ASDF Step configuration file.
    """
```
The try/except around setting the 'SKIPPED' keyword for skipped steps doesn't catch (or appropriately set) the keyword for input that is still in a ModelContainer.
Add a job that runs the `jwst` and `romancal` unit tests to the CI workflow.
This could be a manual dispatch and/or per-PR as the current CI is. Maybe both?!
We currently allow abbreviations for long argparse arguments in the command line interface for `stpipe`. We should not.

As an example, `refpix` has a parameter `use_side_ref_pixels`. If calling from within Python, one needs to use the full parameter name to change it. If calling from the command line, any of the following will currently work:

--steps.refpix.use_side_ref_pixels=True
--steps.refpix.use_side_ref_pix=True
--steps.refpix.use_side=True
--steps.refpix.use=True
--steps.refpix.u=True

From a user interface perspective, this causes confusion, and it violates several Zen of Python principles.

This is, by the way, the default behavior for `argparse`. Something I did not know.

Anyway, for consistency, it would be good to use `allow_abbrev=False` in the `stpipe` parser. See
https://docs.python.org/3/library/argparse.html#allow-abbrev

This will make debugging user problems much easier: if a user uses `--steps.refpix.u=True` in a script, the same param won't work within Python. And if we add a new parameter `use_side_of_fries_with_that`, then it will break that user's script.
Thanks to @bhilbert4 for pointing this out.
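A small standard-library demonstration of the proposed fix, using a toy parser with the `refpix` option above (not the actual stpipe parser setup):

```python
import argparse

# With allow_abbrev=False (supported since Python 3.5), argparse rejects
# prefix abbreviations of long options instead of silently matching them.
parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument("--steps.refpix.use_side_ref_pixels")

# The full name still works:
args = parser.parse_args(["--steps.refpix.use_side_ref_pixels=True"])
full_name_value = getattr(args, "steps.refpix.use_side_ref_pixels")

# An abbreviation is now rejected (argparse errors out with SystemExit):
try:
    parser.parse_args(["--steps.refpix.use_side=True"])
    abbreviation_rejected = False
except SystemExit:
    abbreviation_rejected = True
```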
The WebbPSF changes to `romancal` now require additional data to be made available in order for the CI to run properly. This has already been done for the `romancal` CI, but it needs to be ported to `stpipe` in order for its testing to continue properly.
Issue JP-1888 was created on JIRA by Edward Slavich:
In order to support existing Step implementations, we'll need to be able to parse the ConfigObj spec string and generate corresponding traitlet attributes. We'll need a new metaclass for JwstStep that creates the traitlets when the class is defined.
The traitlets package is used by IPython/Jupyter for application configuration. It handles type checking, other validation, and mapping to CLI arguments and config files. It also comes with logger configuration support built in. We may be able to replace configobj and a bunch of custom code with traitlets, and usage would be familiar to users who have previously configured ipython, the notebook server, etc.
Here's an example of how traitlets work:
```python
#!/usr/bin/env python3
from traitlets import Integer, Enum, Bool, validate, TraitError
from traitlets.config import Application, LoggingConfigurable


class Step(LoggingConfigurable):
    save_results = Bool(
        default_value=False,
        help="""
        Force save results for an intermediate step.
        """,
        config=True,
    )


class Extract1dStep(Step):
    smoothing_length = Integer(
        default_value=None,
        allow_none=True,
        help="""
        If set, the background regions (if any) will be smoothed
        with a boxcar function of this width along the dispersion
        direction.  This must be an odd integer.
        """,
        config=True,
    )

    @validate("smoothing_length")
    def _validate_odd(self, proposal):
        value = proposal["value"]
        trait = proposal["trait"]
        if value % 2 == 0:
            raise TraitError(f"{trait.name} must be an odd integer")
        return value

    bkg_fit = Enum(
        ["poly", "mean", "median"],
        default_value="poly",
        help="""
        A string indicating the type of fitting to be applied to
        background values in each column (or row, if the dispersion is
        vertical).
        """,
        config=True,
    )

    bkg_order = Integer(
        default_value=None,
        min=0,
        allow_none=True,
        help="""
        If present, a polynomial with order 'bkg_order' will be fit to
        each column (or row, if the dispersion direction is vertical)
        of the background region or regions.  For a given column (row),
        one polynomial will be fit to all background regions.  The
        polynomial will be evaluated at each pixel of the source
        extraction region(s) along the column (row), and the fitted value
        will be subtracted from the data value at that pixel.
        If both 'smoothing_length' and 'bkg_order' are not None, the
        boxcar smoothing will be done first.
        """,
        config=True,
    )

    log_increment = Integer(
        default_value=50,
        help="""
        If log_increment is greater than 0 and the input data are
        multi-integration (which can be CubeModel or SlitModel), a message
        will be written to the log with log level INFO every log_increment
        integrations.  This is intended to provide progress information
        when invoking the step interactively.
        """,
        config=True,
    )

    subtract_background = Bool(
        default_value=None,
        allow_none=True,
        help="""
        A flag which indicates whether the background should be subtracted.
        If absent, the value in the extract_1d reference file will be used.
        If present, this parameter overrides the value in the
        extract_1d reference file.
        """,
        config=True,
    )

    use_source_posn = Bool(
        default_value=None,
        allow_none=True,
        help="""
        If True, the source and background extraction positions specified
        in the extract1d reference file (or the default position, if there
        is no reference file) will be shifted to account for the computed
        position of the source in the data.  If absent, the values in the
        reference file will be used.  Aperture offset is determined by
        computing the pixel location of the source based on its RA and Dec.
        It does not make sense to apply aperture offsets for extended
        sources, so this parameter can be overridden (set to False)
        internally by the step.
        """,
        config=True,
    )

    apply_apcorr = Bool(
        default_value=True,
        help="""
        Switch to select whether or not to apply an APERTURE correction
        during the Extract1dStep.
        """,
        config=True,
    )

    def __call__(self):
        self.log.debug("Called Extract1dStep.__call__")
        self.log.info(f"smoothing_length={self.smoothing_length}")
        self.log.info(f"save_results={self.save_results}")


class Stpipe(Application):
    name = "stpipe"
    description = "Calibration pipeline CLI"

    classes = [Extract1dStep]

    def start(self):
        self.log.debug("Called Stpipe.start")
        step = Extract1dStep(config=self.config)
        step()


def main():
    Stpipe.launch_instance()


if __name__ == "__main__":
    main()
```
Calling this script with no arguments:
$ ./stpipe
[Stpipe] smoothing_length=None
[Stpipe] save_results=False
Overriding save_results for all subclasses of `Step`:
$ ./stpipe --Step.save_results=true
[Stpipe] smoothing_length=None
[Stpipe] save_results=True
Overriding for just `Extract1dStep`:
$ ./stpipe --Extract1dStep.save_results=true
[Stpipe] smoothing_length=None
[Stpipe] save_results=True
Invalid smoothing_length caught by custom validator:
$ ./stpipe --Extract1dStep.smoothing_length=10
Traceback (most recent call last):
...
traitlets.traitlets.TraitError: smoothing_length must be an odd integer
Okay, fine, let's turn it up to 11:
$ ./stpipe --Extract1dStep.smoothing_length=11
[Stpipe] smoothing_length=11
[Stpipe] save_results=False
Built-in support for log configuration:
$ ./stpipe --Stpipe.log_format='%(asctime)s - %(levelname)s - %(message)s' --Stpipe.log_level=DEBUG
2021-01-21 22:55:22 - DEBUG - Called Stpipe.start
2021-01-21 22:55:22 - DEBUG - Called Extract1dStep.__call__
2021-01-21 22:55:22 - INFO - smoothing_length=None
2021-01-21 22:55:22 - INFO - save_results=False
Auto-generated help for Extract1dStep:
Extract1dStep(Step) options
---------------------------
--Extract1dStep.apply_apcorr=<Bool>
Switch to select whether or not to apply an APERTURE correction during the
Extract1dStep.
Default: True
--Extract1dStep.bkg_fit=<Enum>
A string indicating the type of fitting to be applied to background values
in each column (or row, if the dispersion is vertical).
Choices: any of ['poly', 'mean', 'median']
Default: 'poly'
--Extract1dStep.bkg_order=<Int>
If present, a polynomial with order 'bkg_order' will be fit to each column
(or row, if the dispersion direction is vertical) of the background region
or regions. For a given column (row), one polynomial will be fit to all
background regions. The polynomial will be evaluated at each pixel of the
source extraction region(s) along the column (row), and the fitted value
will be subtracted from the data value at that pixel. If both
'smoothing_length' and 'bkg_order' are not None, the boxcar smoothing will
be done first.
Default: None
--Extract1dStep.log_increment=<Int>
If log_increment is greater than 0 and the input data are multi-integration
(which can be CubeModel or SlitModel), a message will be written to the log
with log level INFO every log_increment integrations. This is intended to
provide progress information when invoking the step interactively.
Default: 50
--Extract1dStep.save_results=<Bool>
Force save results for an intermediate step.
Default: False
--Extract1dStep.smoothing_length=<Int>
If set, the background regions (if any) will be smoothed with a boxcar
function of this width along the dispersion direction. This must be an odd
integer.
Default: None
--Extract1dStep.subtract_background=<Bool>
A flag which indicates whether the background should be subtracted. If
absent, the value in the extract_1d reference file will be used. If present,
this parameter overrides the value in the extract_1d reference file.
Default: None
--Extract1dStep.use_source_posn=<Bool>
If True, the source and background extraction positions specified in the
extract1d reference file (or the default position, if there is no reference
file) will be shifted to account for the computed position of the source in
the data. If absent, the values in the reference file will be used.
Aperture offset is determined by computing the pixel location of the source
based on its RA and Dec. It does not make sense to apply aperture offsets
for extended sources, so this parameter can be overriden (set to False)
internally by the step.
Default: None
Config file example (we wouldn't have to use this, but it's available if we need it):
c.Extract1dStep.save_results = True
c.Extract1dStep.smoothing_length = 11
This issue came up while adding a spec to jwst ami analyze in:
spacetelescope/jwst#7862

An attempt to add an inline comment to a spec entry like the following:

src = string(default='A0V') # Source spectral type for model (Phoenix models)

failed with an odd error:

ValidationError: Config parameter 'src': missing

Removing the parentheses enclosing "Phoenix models" fixed the issue.
This appears to be related to the `_inspec` argument provided to the `ConfigObj` here:

stpipe/src/stpipe/config_parser.py
Line 202 in e82a1f0

Without `_inspec`, ConfigObj fills the `inline_comments` attribute with accurate values for the comments. I'm not sure what `preserve_comments` is supposed to be doing.

Perhaps one way to address this would be to:
- drop the `_inspec` argument
- remove `preserve_comments`
Hi!
I was wondering if `pathlib.Path` objects were supported in `stpipe` (and packages depending on it, like `jwst`), or if support was planned for the future.

I wanted to report an example issue I encountered recently: using a `Path` object as input to `Step.run()` leads to the `_input_filename` attribute not being set, because the `Path` falls into the `else` category here.

The simple fix for users is to use strings as input paths, but it took me a bit of time to figure this out because I missed the error message in the log.
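A minimal sketch of the kind of normalization that would make `Path` inputs work (`normalize_input` is a hypothetical helper for illustration, not the stpipe code):

```python
import os
import pathlib

def normalize_input(input_obj):
    # Hypothetical helper: treat anything path-like (str or pathlib.Path)
    # uniformly by converting it to a string path up front, so downstream
    # code that branches on str never falls into the wrong category.
    if isinstance(input_obj, (str, os.PathLike)):
        return os.fspath(input_obj)
    return input_obj

print(normalize_input(pathlib.Path("dataset.fits")))  # prints dataset.fits
```

`os.fspath` accepts both `str` and any `os.PathLike` object, so data models and other non-path inputs pass through untouched.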
Issue JP-1817 was created on JIRA by Jonathan Eisenhamer:
In Python, setting up logging for `Step.call` is possible through the `stpipe.log` module. Even with documentation, this is a bit obtuse. Users are expecting a keyword argument, similar to the `strun` call. Add the keyword argument such that `Step.call(verbose=True)` is available.
https://github.com/spacetelescope/stpipe/blob/main/src/stpipe/resources/schemas/step_config_with_metadata-1.0.0.yaml is unused in `stpipe`.

One use exists in the following `jwst` test:
https://github.com/spacetelescope/jwst/blob/master/jwst/stpipe/tests/test_config.py#L38

I suspect that the schema was intended to be used in:
Line 93 in d20b642
Line 129 in d20b642

@jdavies-st this might stem from spacetelescope/jwst#5695. Do you recall if this schema was intended only for test use, or was the plan to use it to validate step configs (that contain metadata)?
Issue JP-1896 was created on JIRA by Edward Slavich:
The "stpipe describe" command will show the docstring for a Step subclass, its sub-steps in the case of a Pipeline, and description of the available parameters. Tentative help output:
$ ./stpipe describe -h
usage: stpipe describe [-h] [-v] <class>
describe a step or pipeline
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
optional arguments:
-h, --help show this help message and exit
-v, --verbose verbose output
examples:
print step description and parameters:
stpipe describe jwst.step.RampFitStep
increase verbosity to include parameters for nested steps:
stpipe describe --verbose jwst.pipeline.Detector1Pipeline
Hi,
I found that the strun command does not handle the following command line correctly:

strun <other args> --steps.resample.output_shape 1000,1000

----------------------------------------------------------------------
ERROR PARSING CONFIGURATION:
Config parameter 'output_shape': the value "1000,1000" is of the
wrong type.

I did some debugging and found that this is because when the Step is created, the param_args passed to the instance are the raw args parsed from the ArgumentParser. These values are handled neither by `cmdline._override_config_from_args` nor by `config_parser.string_to_python_type`.

I put together a quick fix for now (https://github.com/Jerry-Ma/stpipe/tree/fix_list_like_cmdline_args); let me know if you want me to make a PR.
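For illustration, the kind of conversion the raw CLI values need might look like this (`parse_cli_value` is a hypothetical helper sketched here, not stpipe's actual `string_to_python_type`):

```python
def parse_cli_value(raw: str):
    # Hypothetical converter: turn a raw CLI string into a Python value,
    # splitting comma-separated values into lists (as ConfigObj specs
    # expect for list-typed parameters) and casting numeric tokens.
    if "," in raw:
        return [parse_cli_value(part.strip()) for part in raw.split(",")]
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw

print(parse_cli_value("1000,1000"))  # prints [1000, 1000]
```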
Currently, specifying a log file requires editing a .cfg file (with the drawback that if one forgets about that change, one can be confused about why one is not seeing any output when running a task).

From the command line it should be possible to specify a log file as a command-line option, and in Python, as a keyword argument to the method starting a pipeline step or pipeline.

Even better, it is useful to see the log output in a terminal and have that output also go to a log file, so some sort of "tee" option should be provided as well.
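For reference, the standard library already supports this "tee" behavior by attaching multiple handlers to one logger; a minimal sketch (stdlib only, not the proposed stpipe API):

```python
import logging

# Attaching two handlers to the same logger sends every record to both
# the terminal and a file, which is exactly the "tee" behavior described.
log = logging.getLogger("stpipe_tee_example")
log.setLevel(logging.INFO)
log.addHandler(logging.StreamHandler())                        # terminal
log.addHandler(logging.FileHandler("pipeline.log", mode="w"))  # log file

log.info("step complete")  # appears on screen and in pipeline.log
```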
Currently the code assumes meta.filename will exist, but models should be free to store it in a different location. We should add a method to `AbstractDataModel` that accesses the filename wherever it lives.
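A sketch of what such an accessor could look like (the method name `get_filename` and the example subclass are hypothetical):

```python
import abc

class AbstractDataModel(abc.ABC):
    # Hypothetical accessor: each datamodel implementation decides where
    # its filename lives, instead of stpipe reaching into meta.filename.
    @abc.abstractmethod
    def get_filename(self) -> str:
        """Return the model's filename, wherever the model stores it."""

class ExampleModel(AbstractDataModel):
    def __init__(self, filename: str):
        self._filename = filename

    def get_filename(self) -> str:
        return self._filename

print(ExampleModel("dataset.fits").get_filename())  # prints dataset.fits
```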
Issue JP-1897 was created on JIRA by Edward Slavich:
The "stpipe print-config" or "stpipe save-config" subcommand (part of this ticket will be polling opinions on which of these will work best) will allow users to save Step subclass default and CRDS parameters to an ASDF config file which the user can customize. An example help output for the "print-config" option, which prints the ASDF to stdout where it can be redirected as needed:
$ ./stpipe print-config -h
usage: stpipe print-config [-h] <class> <input-file>
print step or pipeline config to stdout
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
<input-file> input dataset or association (used to fetch parameters from CRDS)
optional arguments:
-h, --help show this help message and exit
examples:
save a pipeline config to a local file:
stpipe print-config jwst.pipeline.Detector1Pipeline dataset.fits > config.asdf
See #8 for discussion.
If I do:
$ strun calwebb_image3 --help
usage: strun [-h] [--logcfg LOGCFG] [--verbose] [--debug]
[--save-parameters SAVE_PARAMETERS] [--disable-crds-steppars]
[--pre_hooks] [--post_hooks] [--output_file] [--output_dir]
[--output_ext] [--output_use_model] [--output_use_index]
...
Image3Pipeline: Applies level 3 processing to imaging-mode data from any JWST
instrument. Included steps are: assign_mtwcs tweakreg skymatch
outlier_detection resample source_catalog
positional arguments:
cfg_file_or_class The configuration file or Python class to run
args arguments to pass to step
options:
-h, --help show this help message and exit
--logcfg LOGCFG The logging configuration file to load
--verbose, -v Turn on all logging messages
--debug When an exception occurs, invoke the Python debugger,
pdb
--save-parameters SAVE_PARAMETERS
Save step parameters to specified file.
--disable-crds-steppars
Disable retrieval of step parameter references files
from CRDS
--pre_hooks
--post_hooks
--output_file File to save output to.
--output_dir Directory path for output files
--output_ext Output file type
--output_use_model When saving use `DataModel.meta.filename`
--output_use_index Append index.
...
--steps.tweakreg.kernel_fwhm
Gaussian kernel FWHM in pixels
--steps.tweakreg.snr_threshold
SNR threshold above the bkg
--steps.tweakreg.sharplo
The lower bound on sharpness for object detection.
--steps.tweakreg.sharphi
The upper bound on sharpness for object detection.
--steps.tweakreg.roundlo
The lower bound on roundness for object detection.
--steps.tweakreg.roundhi
The upper bound on roundness for object detection.
--steps.tweakreg.brightest
Keep top ``brightest`` objects
--steps.tweakreg.peakmax
Filter out objects with pixel values >= ``peakmax``
--steps.tweakreg.bkg_boxsize
The background mesh box size in pixels.
--steps.tweakreg.enforce_user_order
Align images in user specified order?
--steps.tweakreg.expand_refcat
Expand reference catalog with new sources?
...
Note that while it lists the available args, it does not list their defaults. The defaults are in the `spec` attribute of each class, so it should be easy to list them in the `--help` output as well.

This is actually the quickest, easiest interface for finding out what a pipeline's options are, but it does not give the defaults. There is currently no reliable interface to do this nicely within Python either, as `Pipeline.spec` is just a string and does not give the step parameters.
This is an important user interface issue that should be fixed.
Side note: The logging from the run of the pipeline does list the parameters used (some of which may be defaults), but it is very difficult (impossible?) to read the jumbled dict that is dumped in the log. Formatting with one parameter per line, something like the above, would be much preferable.
Currently the plan is to have romancal use tagged ASDF files, but doing so requires a different class implementation for data models. Stpipe should be able to accommodate different datamodel implementations without requiring subclassing of the stdatamodels implementation. Currently stpipe's only dependence on the datamodel class is on these methods:
Issue JP-1898 was created on JIRA by Edward Slavich:
Add support for defining Step parameters as traitlets. The Step class will need to inherit from traitlets.HasTraits, and we'll need to include any traitlet parameters when generating config files, describing steps, etc.
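A hedged sketch of how such a metaclass might inject traitlets at class-definition time (all names here are hypothetical; a real version would parse the ConfigObj spec string rather than a toy mapping):

```python
import traitlets

# Toy mapping from spec type names to trait factories; a real version
# would cover the full ConfigObj check vocabulary.
TRAIT_FOR_KIND = {
    "float": lambda default: traitlets.Float(default, allow_none=True),
    "boolean": lambda default: traitlets.Bool(default, allow_none=True),
}

class SpecMeta(type(traitlets.HasTraits)):
    def __new__(mcls, name, bases, namespace):
        # Inject a traitlet attribute for each parsed spec entry before
        # traitlets' own metaclass processes the class body.
        for pname, (kind, default) in namespace.pop("parsed_spec", {}).items():
            namespace[pname] = TRAIT_FOR_KIND[kind](default)
        return super().__new__(mcls, name, bases, namespace)

class JwstStep(traitlets.HasTraits, metaclass=SpecMeta):
    pass

class JumpStep(JwstStep):
    # Stand-in for a parsed Step.spec.
    parsed_spec = {"rejection_threshold": ("float", 4.0)}

step = JumpStep()
print(step.rejection_threshold)  # prints 4.0
```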
Replace all instances thereof, and then in pyproject.toml:

[tool.pytest.ini_options]
addopts = [
    "-p no:legacypath",
]

This will make the code more robust for use of `pathlib.Path` instances in the runtime code.
Something I forgot to do initially...

Currently, `Step` parameter validation requires file paths to be strings, specifically for the `override_<reffile>` cases. `pathlib.Path` objects should also pass validation. Specifically, this means modifying the custom validation functions here:

stpipe/src/stpipe/config_parser.py
Lines 331 to 334 in b928337
@jemorrison reported that parsing list arguments of steps does not work. Indeed, I verified that running `strun` with

--steps.resample.output_shape=500,800

which used to work before, no longer works:

ValueError: Config parameter 'output_shape': the value "500,800" is of the wrong type.

CC: @stscieisenhamer

After some detective work, I determined that #57 broke support for list arguments.
When a step runs, it logs the parameters used, but it misses the reffile overrides. Using the same `Step.get_pars()` method that the log message uses, we see:
>>> from jwst.step import GainScaleStep
>>> gain_scale = GainScaleStep()
>>> gain_scale.get_pars()
{'pre_hooks': [],
'post_hooks': [],
'output_file': None,
'output_dir': None,
'output_ext': '.fits',
'output_use_model': False,
'output_use_index': True,
'save_results': False,
'skip': False,
'suffix': None,
'search_output_file': True,
'input_dir': ''}
Compared to the command line `--help`:
$ strun gain_scale --help
usage: strun [-h] [--logcfg LOGCFG] [--verbose] [--debug] [--save-parameters SAVE_PARAMETERS]
[--disable-crds-steppars] [--pre_hooks] [--post_hooks] [--output_file] [--output_dir] [--output_ext]
[--output_use_model] [--output_use_index] [--save_results] [--skip] [--suffix] [--search_output_file]
[--input_dir] [--override_gain]
cfg_file_or_class [args ...]
GainScaleStep: Rescales countrate data to account for use of a non-standard gain value. All integrations are
multiplied by the factor GAINFACT.
positional arguments:
cfg_file_or_class The configuration file or Python class to run
args arguments to pass to step
options:
-h, --help show this help message and exit
--logcfg LOGCFG The logging configuration file to load
--verbose, -v Turn on all logging messages
--debug When an exception occurs, invoke the Python debugger, pdb
--save-parameters SAVE_PARAMETERS
Save step parameters to specified file.
--disable-crds-steppars
Disable retrieval of step parameter references files from CRDS
--pre_hooks [default=list]
--post_hooks [default=list]
--output_file File to save output to.
--output_dir Directory path for output files
--output_ext Output file type [default='.fits']
--output_use_model When saving use `DataModel.meta.filename` [default=False]
--output_use_index Append index. [default=True]
--save_results Force save results [default=False]
--skip Skip this step [default=False]
--suffix Default suffix for output files
--search_output_file
Use outputfile define in parent step [default=True]
--input_dir Input directory
--override_gain Override the gain reference file
Note that the `override_gain` parameter is not printed out. This is true whether an override has been supplied or not. So somehow the merged config grabbed by `get_pars()` is missing the reffile overrides.
Issue JP-1884 was created on JIRA by Edward Slavich:
Currently a Step instance accepts an input file as an argument to the run method, which encourages reusing the step with subsequent files. Now that we have CRDS config references, this isn't a good idea, because the Step's parameters may only be appropriate for the original file.
Hi,
stpipe: 0.5.1
While updating the package for Guix downstream from 0.5.0 to 0.5.1, I ran into a failing test:
============================= test session starts ==============================
platform linux -- Python 3.10.7, pytest-7.1.3, pluggy-1.0.0 -- /gnu/store/l6fpy0i9hlll9b6k8vy2i2a4cshwz3cv-python-wrapper-3.10.7/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/guix-build-python-stpipe-0.5.1.drv-0/stpipe-0.5.1/.hypothesis/examples')
rootdir: /tmp/guix-build-python-stpipe-0.5.1.drv-0/stpipe-0.5.1, configfile: pyproject.toml, testpaths: tests
plugins: hypothesis-6.54.5, doctestplus-0.12.1, openfiles-0.5.0, asdf-2.15.0, astropy-header-0.2.2
collecting ... collected 44 items
tests/test_abstract_datamodel.py::test_roman_datamodel FAILED [ 2%]
tests/test_abstract_datamodel.py::test_jwst_datamodel SKIPPED (could not import 'jwst.datamodels': No module named 'jwst') [ 4%]
tests/test_abstract_datamodel.py::test_good_datamodel PASSED [ 6%]
tests/test_abstract_datamodel.py::test_bad_datamodel PASSED [ 9%]
tests/test_config_parser.py::test_merge_config_nested_mapping PASSED [ 11%]
tests/test_config_parser.py::test_preserve_comments_deprecation[True] PASSED [ 13%]
tests/test_config_parser.py::test_preserve_comments_deprecation[False] PASSED [ 15%]
tests/test_config_parser.py::test_preserve_comments_deprecation[None] PASSED [ 18%]
tests/test_config_parser.py::test_preserve_comments_deprecation[value3] PASSED [ 20%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value="{value}"-fields0-None] PASSED [ 22%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="fred" value="great"-fields1-None] PASSED [ 25%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="fred" value="great"_more-fields2-None] PASSED [ 27%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value="great"_more-fields3-None] PASSED [ 29%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value=""-fields4-None] PASSED [ 31%]
tests/test_format_template.py::test_basics[key_formats5-astring-astring-fields5-None] PASSED [ 34%]
tests/test_format_template.py::test_basics[key_formats6-astring-astring_s00001-fields6-None] PASSED [ 36%]
tests/test_format_template.py::test_basics[key_formats7-astring-astring_smysource-fields7-None] PASSED [ 38%]
tests/test_format_template.py::test_basics[key_formats8-astring-astring_error-fields8-errors8] PASSED [ 40%]
tests/test_format_template.py::test_separators PASSED [ 43%]
tests/test_format_template.py::test_allow_unknown PASSED [ 45%]
tests/test_integration.py::test_asdf_extension PASSED [ 47%]
tests/test_logger.py::test_configuration PASSED [ 50%]
tests/test_logger.py::test_record_logs PASSED [ 52%]
tests/test_scripts.py::test_scripts_in_path PASSED [ 54%]
tests/test_step.py::test_build_config_pipe_config_file PASSED [ 56%]
tests/test_step.py::test_build_config_pipe_crds PASSED [ 59%]
tests/test_step.py::test_build_config_pipe_default PASSED [ 61%]
tests/test_step.py::test_build_config_pipe_kwarg PASSED [ 63%]
tests/test_step.py::test_build_config_step_config_file PASSED [ 65%]
tests/test_step.py::test_build_config_step_crds PASSED [ 68%]
tests/test_step.py::test_build_config_step_default PASSED [ 70%]
tests/test_step.py::test_build_config_step_kwarg PASSED [ 72%]
tests/test_step.py::test_step_list_args PASSED [ 75%]
tests/test_step.py::test_logcfg_routing PASSED [ 77%]
tests/test_step.py::test_log_records PASSED [ 79%]
tests/cli/test_list.py::test_no_arguments PASSED [ 81%]
tests/cli/test_list.py::test_pipelines_only PASSED [ 84%]
tests/cli/test_list.py::test_steps_only PASSED [ 86%]
tests/cli/test_list.py::test_filter_class_names PASSED [ 88%]
tests/cli/test_list.py::test_filter_aliases PASSED [ 90%]
tests/cli/test_main.py::test_version[-v] PASSED [ 93%]
tests/cli/test_main.py::test_version[--version] PASSED [ 95%]
tests/cli/test_main.py::test_package_main PASSED [ 97%]
tests/cli/test_main.py::test_cli_main PASSED [100%]
=================================== FAILURES ===================================
_____________________________ test_roman_datamodel _____________________________
def test_roman_datamodel():
roman_datamodels = pytest.importorskip("roman_datamodels.datamodels")
> import roman_datamodels.tests.util as rutil
E ModuleNotFoundError: No module named 'roman_datamodels.tests'
tests/test_abstract_datamodel.py:12: ModuleNotFoundError
=========================== short test summary info ============================
FAILED tests/test_abstract_datamodel.py::test_roman_datamodel - ModuleNotFoun...
=================== 1 failed, 42 passed, 1 skipped in 3.89s ====================
Reproduce:
guix time-machine --commit=b566e1a98a74d84d3978cffefd05295602c9445d -- build --system=x86_64-linux --with-latest=python-stpipe python-stpipe
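The failure above comes from a bare import of roman_datamodels.tests.util after only the parent package was guarded. A hedged sketch of one possible fix, guarding the submodule import the same way (names taken from the traceback above):

```python
import pytest

def test_roman_datamodel():
    # importorskip() skips the test (instead of failing with
    # ModuleNotFoundError) when the module is absent.
    roman_datamodels = pytest.importorskip("roman_datamodels.datamodels")
    rutil = pytest.importorskip("roman_datamodels.tests.util")
    ...  # body of the original test
```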
We'll need to replace uses of _jail (there are only a few) with tmp_path.
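For illustration, a minimal sketch of the migration (hypothetical test body; _jail was a fixture that chdir'd into a fresh temporary directory, which pytest's built-in tmp_path plus monkeypatch.chdir covers):

```python
# Before (hypothetical): def test_output_file(_jail): ...
# After: use the built-in tmp_path fixture.
def test_output_file(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)  # replicate _jail's chdir behaviour
    out = tmp_path / "result.fits"
    out.write_text("placeholder")
    assert out.exists()
```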
Issue JP-1899 was created on JIRA by Edward Slavich:
Once traitlet-based Step parameters are available, we'll need to migrate the "spec" class attribute on the existing jwst steps to individual traitlet attributes.
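A hedged sketch of what a migrated parameter set might look like (class and parameter names borrowed from the jump step for illustration; the surrounding Step machinery is assumed):

```python
from traitlets import Bool, Float, HasTraits, Unicode

# Hypothetical migration target: each former "spec" entry becomes a
# declared, validated traitlet attribute with an inline help string.
class JumpStepParameters(HasTraits):
    rejection_threshold = Float(4.0, help="CR sigma rejection threshold")
    flag_4_neighbors = Bool(True, help="flag the four nearest neighbors")
    maximum_cores = Unicode("1", help="cores to use for multiprocessing")

params = JumpStepParameters()
params.rejection_threshold = 3.0  # validated on assignment
```

Assigning a value of the wrong type (e.g. a string to flag_4_neighbors) raises traitlets.TraitError, which replaces the validation the configobj spec currently performs.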
Currently, when a step runs, it logs the step parameters (or, for a pipeline, the pipeline and constituent step parameters). The output is basically unreadable, as can be seen for jwst.pipeline.Detector1Pipeline:
{'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'save_calibrated_ramp': False, 'steps': {'group_scale': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}, 'dq_init': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}, 'emicorr': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': True, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'save_intermediate_results': False, 'user_supplied_reffile': None, 'nints_to_phase': None, 'nbins': None, 'scale_reference': True}, 'saturation': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'n_pix_grow_sat': 1},
...
'jump': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'rejection_threshold': 4.0, 'three_group_rejection_threshold': 6.0, 'four_group_rejection_threshold': 5.0, 'maximum_cores': '1', 'flag_4_neighbors': True, 'max_jump_to_flag_neighbors': 1000.0, 'min_jump_to_flag_neighbors': 10.0, 'after_jump_flag_dn1': 0.0, 'after_jump_flag_time1': 0.0, 'after_jump_flag_dn2': 0.0, 'after_jump_flag_time2': 0.0, 'expand_large_events': False, 'min_sat_area': 1.0, 'min_jump_area': 5.0, 'expand_factor': 2.0, 'use_ellipses': False, 'sat_required_snowball': True, 'min_sat_radius_extend': 2.5, 'sat_expand': 2, 'edge_size': 25, 'find_showers': False, 'extend_snr_threshold': 1.2, 'extend_min_area': 90, 'extend_inner_radius': 1.0, 'extend_outer_radius': 2.6, 'extend_ellipse_expand_ratio': 1.1, 'time_masked_after_shower': 15.0, 'max_extended_radius': 200, 'minimum_groups': 3, 'minimum_sigclip_groups': 100, 'only_use_ints': True}, 'ramp_fit': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'int_name': '', 'save_opt': False, 'opt_name': '', 'suppress_one_group': True, 'maximum_cores': '1'}, 'gain_scale': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}}}
It would be much better to have formatted output. Here are three options:
yaml.dump()
pre_hooks: []
post_hooks: []
output_file: null
output_dir: null
output_ext: .fits
output_use_model: false
output_use_index: true
save_results: false
skip: false
suffix: null
search_output_file: true
input_dir: ''
save_calibrated_ramp: false
steps:
  group_scale:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
  dq_init:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
  emicorr:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: true
    suffix: null
    search_output_file: true
    input_dir: ''
    save_intermediate_results: false
    user_supplied_reffile: null
    nints_to_phase: null
    nbins: null
    scale_reference: true
  saturation:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    n_pix_grow_sat: 1
  ...
  jump:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    rejection_threshold: 4.0
    three_group_rejection_threshold: 6.0
    four_group_rejection_threshold: 5.0
    maximum_cores: '1'
    flag_4_neighbors: true
    max_jump_to_flag_neighbors: 1000.0
    min_jump_to_flag_neighbors: 10.0
    after_jump_flag_dn1: 0.0
    after_jump_flag_time1: 0.0
    after_jump_flag_dn2: 0.0
    after_jump_flag_time2: 0.0
    expand_large_events: false
    min_sat_area: 1.0
    min_jump_area: 5.0
    expand_factor: 2.0
    use_ellipses: false
    sat_required_snowball: true
    min_sat_radius_extend: 2.5
    sat_expand: 2
    edge_size: 25
    find_showers: false
    extend_snr_threshold: 1.2
    extend_min_area: 90
    extend_inner_radius: 1.0
    extend_outer_radius: 2.6
    extend_ellipse_expand_ratio: 1.1
    time_masked_after_shower: 15.0
    max_extended_radius: 200
    minimum_groups: 3
    minimum_sigclip_groups: 100
    only_use_ints: true
  ramp_fit:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    int_name: ''
    save_opt: false
    opt_name: ''
    suppress_one_group: true
    maximum_cores: '1'
  gain_scale:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
json.dumps()
{
    "pre_hooks": [],
    "post_hooks": [],
    "output_file": null,
    "output_dir": null,
    "output_ext": ".fits",
    "output_use_model": false,
    "output_use_index": true,
    "save_results": false,
    "skip": false,
    "suffix": null,
    "search_output_file": true,
    "input_dir": "",
    "save_calibrated_ramp": false,
    "steps": {
        "group_scale": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        },
        "dq_init": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        },
        "emicorr": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": true,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "save_intermediate_results": false,
            "user_supplied_reffile": null,
            "nints_to_phase": null,
            "nbins": null,
            "scale_reference": true
        },
        "saturation": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "n_pix_grow_sat": 1
        },
        ...
        "jump": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "rejection_threshold": 4.0,
            "three_group_rejection_threshold": 6.0,
            "four_group_rejection_threshold": 5.0,
            "maximum_cores": "1",
            "flag_4_neighbors": true,
            "max_jump_to_flag_neighbors": 1000.0,
            "min_jump_to_flag_neighbors": 10.0,
            "after_jump_flag_dn1": 0.0,
            "after_jump_flag_time1": 0.0,
            "after_jump_flag_dn2": 0.0,
            "after_jump_flag_time2": 0.0,
            "expand_large_events": false,
            "min_sat_area": 1.0,
            "min_jump_area": 5.0,
            "expand_factor": 2.0,
            "use_ellipses": false,
            "sat_required_snowball": true,
            "min_sat_radius_extend": 2.5,
            "sat_expand": 2,
            "edge_size": 25,
            "find_showers": false,
            "extend_snr_threshold": 1.2,
            "extend_min_area": 90,
            "extend_inner_radius": 1.0,
            "extend_outer_radius": 2.6,
            "extend_ellipse_expand_ratio": 1.1,
            "time_masked_after_shower": 15.0,
            "max_extended_radius": 200,
            "minimum_groups": 3,
            "minimum_sigclip_groups": 100,
            "only_use_ints": true
        },
        "ramp_fit": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "int_name": "",
            "save_opt": false,
            "opt_name": "",
            "suppress_one_group": true,
            "maximum_cores": "1"
        },
        "gain_scale": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        }
    }
}
pprint.pformat()
{'pre_hooks': [],
 'post_hooks': [],
 'output_file': None,
 'output_dir': None,
 'output_ext': '.fits',
 'output_use_model': False,
 'output_use_index': True,
 'save_results': False,
 'skip': False,
 'suffix': None,
 'search_output_file': True,
 'input_dir': '',
 'save_calibrated_ramp': False,
 'steps': {'group_scale': {'pre_hooks': [],
                           'post_hooks': [],
                           'output_file': None,
                           'output_dir': None,
                           'output_ext': '.fits',
                           'output_use_model': False,
                           'output_use_index': True,
                           'save_results': False,
                           'skip': False,
                           'suffix': None,
                           'search_output_file': True,
                           'input_dir': ''},
           'dq_init': {'pre_hooks': [],
                       'post_hooks': [],
                       'output_file': None,
                       'output_dir': None,
                       'output_ext': '.fits',
                       'output_use_model': False,
                       'output_use_index': True,
                       'save_results': False,
                       'skip': False,
                       'suffix': None,
                       'search_output_file': True,
                       'input_dir': ''},
           'emicorr': {'pre_hooks': [],
                       'post_hooks': [],
                       'output_file': None,
                       'output_dir': None,
                       'output_ext': '.fits',
                       'output_use_model': False,
                       'output_use_index': True,
                       'save_results': False,
                       'skip': True,
                       'suffix': None,
                       'search_output_file': True,
                       'input_dir': '',
                       'save_intermediate_results': False,
                       'user_supplied_reffile': None,
                       'nints_to_phase': None,
                       'nbins': None,
                       'scale_reference': True},
           'saturation': {'pre_hooks': [],
                          'post_hooks': [],
                          'output_file': None,
                          'output_dir': None,
                          'output_ext': '.fits',
                          'output_use_model': False,
                          'output_use_index': True,
                          'save_results': False,
                          'skip': False,
                          'suffix': None,
                          'search_output_file': True,
                          'input_dir': '',
                          'n_pix_grow_sat': 1},
           ...
           'jump': {'pre_hooks': [],
                    'post_hooks': [],
                    'output_file': None,
                    'output_dir': None,
                    'output_ext': '.fits',
                    'output_use_model': False,
                    'output_use_index': True,
                    'save_results': False,
                    'skip': False,
                    'suffix': None,
                    'search_output_file': True,
                    'input_dir': '',
                    'rejection_threshold': 4.0,
                    'three_group_rejection_threshold': 6.0,
                    'four_group_rejection_threshold': 5.0,
                    'maximum_cores': '1',
                    'flag_4_neighbors': True,
                    'max_jump_to_flag_neighbors': 1000.0,
                    'min_jump_to_flag_neighbors': 10.0,
                    'after_jump_flag_dn1': 0.0,
                    'after_jump_flag_time1': 0.0,
                    'after_jump_flag_dn2': 0.0,
                    'after_jump_flag_time2': 0.0,
                    'expand_large_events': False,
                    'min_sat_area': 1.0,
                    'min_jump_area': 5.0,
                    'expand_factor': 2.0,
                    'use_ellipses': False,
                    'sat_required_snowball': True,
                    'min_sat_radius_extend': 2.5,
                    'sat_expand': 2,
                    'edge_size': 25,
                    'find_showers': False,
                    'extend_snr_threshold': 1.2,
                    'extend_min_area': 90,
                    'extend_inner_radius': 1.0,
                    'extend_outer_radius': 2.6,
                    'extend_ellipse_expand_ratio': 1.1,
                    'time_masked_after_shower': 15.0,
                    'max_extended_radius': 200,
                    'minimum_groups': 3,
                    'minimum_sigclip_groups': 100,
                    'only_use_ints': True},
           'ramp_fit': {'pre_hooks': [],
                        'post_hooks': [],
                        'output_file': None,
                        'output_dir': None,
                        'output_ext': '.fits',
                        'output_use_model': False,
                        'output_use_index': True,
                        'save_results': False,
                        'skip': False,
                        'suffix': None,
                        'search_output_file': True,
                        'input_dir': '',
                        'int_name': '',
                        'save_opt': False,
                        'opt_name': '',
                        'suppress_one_group': True,
                        'maximum_cores': '1'},
           'gain_scale': {'pre_hooks': [],
                          'post_hooks': [],
                          'output_file': None,
                          'output_dir': None,
                          'output_ext': '.fits',
                          'output_use_model': False,
                          'output_use_index': True,
                          'save_results': False,
                          'skip': False,
                          'suffix': None,
                          'search_output_file': True,
                          'input_dir': ''}}}
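The three candidates can be compared on a small slice of the parameter tree (subset chosen for illustration; yaml requires the third-party PyYAML package, so it is guarded here):

```python
import json
import pprint

# Small subset of the Detector1Pipeline parameters shown above.
params = {
    "skip": False,
    "suffix": None,
    "steps": {"jump": {"rejection_threshold": 4.0, "maximum_cores": "1"}},
}

print(json.dumps(params, indent=4))      # JSON spelling: null/true/false
print(pprint.pformat(params, width=40))  # Python literals, aligned nesting

try:
    import yaml  # third-party: pip install pyyaml
    print(yaml.dump(params, sort_keys=False))
except ImportError:
    pass
```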
@hbushouse and @nden, 👍 or 👎? And if 👍, which of the three is preferable?
I'm currently trying to use my own logging and wrap around some of the JWST pipeline. This works OK, but the logs produce multiple outputs once I've used anything that wraps around stpipe. I import a logger at the top of my submodule like:
log = logging.getLogger(__name__)
and then log.info('Test message')
produces
[2023-06-29 14:28:39,968] pjpipe.download.download_step - INFO - Test message
as expected. However, after I open up a datamodel
im = datamodels.open(filename)
log.info('Test message')
then produces 3 messages:
[2023-06-29 14:28:41,361] pjpipe.download.download_step - INFO - Test message
2023-06-29 14:28:41,361 - stpipe - INFO - Test message
[2023-06-29 14:28:41,361] stpipe - INFO - Test message
This doesn't seem like intended behaviour, and it should be a relatively easy fix.
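The duplication is consistent with extra handlers being installed on the root and "stpipe" loggers when stpipe machinery is imported, so records from the application logger propagate up and are emitted once per handler. A minimal reproduction plus one application-side workaround (the handler wiring here is a stand-in, not stpipe's actual setup):

```python
import logging
import sys

app_log = logging.getLogger("pjpipe.download.download_step")
app_log.setLevel(logging.INFO)
app_log.addHandler(logging.StreamHandler(sys.stdout))

# Stand-in for the handler that a library attaches to the root logger:
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))

app_log.info("Test message")   # emitted twice: own handler + root handler

app_log.propagate = False      # workaround: stop records reaching root
app_log.info("Test message")   # emitted once
```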