Provides base classes and command-line tools for implementing calibration pipeline software.
Home Page: https://stpipe.readthedocs.io
License: Other
Issue JP-1894 was created on JIRA by Edward Slavich:
Implement the "stpipe run" CLI subcommand. Here is the tentative help output for the command:
$ ./stpipe run -h
usage: stpipe run [-h] [--config <path>] [-d] [-p <name>=<value>] [-v] <class> <input-file>
run a step or pipeline
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
<input-file> input dataset or association
optional arguments:
-h, --help show this help message and exit
--config <path> config file (use 'stpipe print-config' to save and edit the default config)
-d, --debug debug logging (DEBUG level)
-p <name>=<value> override an individual step or pipeline parameter (use 'stpipe describe' to list available parameters)
-v, --verbose verbose logging (INFO level)
examples:
run a pipeline with default parameters recommended by CRDS:
stpipe run jwst.pipeline.Detector1Pipeline dataset.fits
run a pipeline with parameters specified in a local config:
stpipe run --config config.asdf jwst.pipeline.Detector1Pipeline dataset.fits
override an individual pipeline parameter:
stpipe run -p save_calibrated_ramp=true jwst.pipeline.Detector1Pipeline dataset.fits
override an individual step parameter:
stpipe run -p jump.rejection_threshold=3.0 jwst.pipeline.Detector1Pipeline dataset.fits
See #8 for more discussion.
configobj is subject to a ReDoS: GHSA-c33w-24p9-8m24
This regex used to parse the config/spec items suffers from catastrophic backtracking:
This is one useful write-up on the issue: https://www.regular-expressions.info/catastrophic.html
The upstream/bundled project appears abandoned but does have an open PR that appears to fix the offending regex:
DiffSK/configobj#236
Given the level of control necessary to exploit the issue (it seems likely a `Step.spec` could be crafted to exploit it), it seems like a low risk, since a step could just as easily include a `while True` loop. I haven't thought through what this means for command-line options, or whether a ReDoS in Python is limited to one CPU.
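As a toy illustration of the failure mode (this is not configobj's actual regex), a pattern with nested quantifiers such as `(a+)+$` backtracks exponentially on a non-matching input, so each small increase in input length multiplies the running time:

```python
import re
import time

# Toy illustration of catastrophic backtracking (NOT configobj's actual
# regex): the nested quantifiers in (a+)+$ force the engine to try
# exponentially many ways of splitting the input before the match fails.
pattern = re.compile(r"(a+)+$")

for n in (16, 18, 20, 22):
    text = "a" * n + "!"  # the trailing "!" guarantees the match fails
    start = time.perf_counter()
    assert pattern.match(text) is None
    print(f"n={n}: {time.perf_counter() - start:.3f}s")
```

Each two-character increase roughly quadruples the time, which is why an attacker-controlled spec string is the concern.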
Currently, `strun` lives on its own in `scripts` and is pulled in via `setup.py` shenanigans. It should really have its own entry in pyproject.toml:
[project.scripts]
stpipe = 'stpipe.cli.main:main'
and live with the rest of the cli scripts.
Currently coverage uploads are failing.
https://github.com/spacetelescope/stpipe/pull/15/checks?check_run_id=1857492557#step:5:79
Use the GitHub action provided by Codecov to do this in a separate action.
This issue is for discussion of potential improvements to the Python interface for creating and running steps and pipelines. First a summary of the current state of affairs:
Column key:
- *Selects subclass*: Does the method select the `Step` subclass based on arguments or config file contents?

| Method | Selects subclass | Runs step | CRDS pars | Config input | Override parameters | Override style | Notes |
|---|---|---|---|---|---|---|---|
| `__init__` | ✗ | ✗ | ✗ | ✗ | ✓ | keyword | Accepts a `config_file` argument but does not apply parameters from it. |
| `call` | ✗ | ✓ | ✓ | ✓ | ✓ | keyword | User config file passed as keyword argument. Config file's `class` field ignored. |
| `from_cmdline` | ✓ | ✓ | ✓ | ✓ | ✓ | CLI | Selects `Step` subclass based on config `class` field or class name argument. |
| `from_config_file` | ✓ | ✗ | ✗ | ✓ | ✗ | | Selects `Step` subclass based on config `class` field. |
| `from_config_section` | ✗ | ✗ | ✗ | ✓ | ✗ | | Probably not intended to be part of the public API. |
Column key:
- *Creates step*: Does the method create a `Step` instance before running it?

| Method | Creates step | Notes |
|---|---|---|
| `__call__` | ✗ | Alias for `run`. |
| `call` | ✓ | Python API only (not used by CLI code). |
| `from_cmdline` | ✓ | The `strun` script is a thin wrapper around this method. |
| `process` | ✗ | Subclass implementation method. Not intended to be called directly by general users. |
| `run` | ✗ | Eventually called by any method that needs to run the step. |
Proposed changes:

- Remove `from_cmdline` and instead call the corresponding `cmdline` module method directly.
- Rename `process` to make clear that it shouldn't be invoked by users. Maybe a name with a leading underscore, or something like `run_impl`.
- Remove `run` or `__call__` so that usage is uniform.
- Rename the `config_file` argument to `__init__` to something like `working_dir`, to make clear that the config is not loaded.
- Share step creation code between the Python and CLI interfaces (`from_cmdline` vs `call`). This will ensure consistency between the two interfaces. There is already some divergence between `call` and `from_cmdline`, e.g. the `_pars_model` attribute is not set by `call`, and `call` doesn't know how to select the `Step` subclass based on a config.
- Accept parameters in a `dict` argument instead of `**kwargs`. This provides a clear separation between the parameters and other method arguments.

Proposed interface:

- `Step.__init__(self, params=None, working_dir=None, ...)`: Parameters are passed to the initializer in a `dict`.
- `Step.call_impl(self, *args)`: `Step` subclass implementation.
- `Step.__call__(self, *args)`: Wrapper around `call_impl` that handles common setup and teardown.
- `stpipe.create_step(*, step_class=None, config_path=None, crds_params_enabled=True, dataset=None, params=None, working_dir=None, ...)`: Convenience method for creating steps. At least one of `step_class` or `config_path` is required to determine the step class. `dataset` is required if `crds_params_enabled` is `True`.
- `stpipe.cmdline.from_cmdline(args)`: Method that parses CLI arguments. Ends in a call to `stpipe.create_step`.
```python
from stpipe.cmdline import from_cmdline

# stpipe step run config.cfg dataset.asdf --foo=42
step, inputs = from_cmdline(args)
step(*inputs)
```
```python
from stpipe import create_step

step = create_step(config_path="config.cfg", dataset="dataset.asdf", params={"foo": 42})
step("dataset.asdf")
```
or
```python
from stpipe import create_step

step = create_step(config_path="config.asdf", dataset="dataset.asdf")
step.some_param = "some_value"
step()
```
```python
from stpipe import Step

class MyStep(Step):
    def call_impl(self, dataset):
        print(f"Value of foo: {self.foo}")

step = MyStep(params={"foo": 42})
step("dataset.asdf")
```
Unless `self.save_model(format=False)` is called explicitly, the default file save implementation in `Step.run` does not allow for output filename format control. A parameter could be added to the `Step` spec that, when set, passes `format=False`.
Issue connected to JP-1793
Issue JP-1889 was created on JIRA by Edward Slavich:
The stpipe.create_step method will be the new Python API for creating CRDS-configured Step instances. Tentative method signature:
```python
def create_step(input_file, step_class=None, config_file=None):
    """
    Create a Step instance with parameters configured by CRDS.

    One of step_class or config_file must be specified.

    Parameters
    ----------
    input_file : str or pathlib.PurePath or stdatamodels.DataModel or stdatamodels.ModelContainer
        Dataset whose header will be used to select a config reference
        from CRDS.  Also the file assigned to the Step's input_file
        parameter.
    step_class : type, optional
        Step class to instantiate.  Must be a subclass of stpipe.Step.
    config_file : str or pathlib.PurePath or asdf.AsdfFile, optional
        ASDF Step configuration file.
    """
```
The try/except around setting the 'SKIPPED' keyword for skipped steps doesn't catch (or appropriately set) the keyword for input that is still in a ModelContainer.
Add a job that runs the `jwst` and `romancal` unit tests to the CI workflow.
This could be a manual dispatch and/or per-PR as the current CI is. Maybe both?!
We currently allow abbreviations for long argparse arguments in the command line interface for `stpipe`. We should not.

As an example, `refpix` has a parameter `use_side_ref_pixels`. If calling from within Python, one needs to use the full parameter name to change it. If calling from the command line, any of the following will currently work:

--steps.refpix.use_side_ref_pixels=True
--steps.refpix.use_side_ref_pix=True
--steps.refpix.use_side=True
--steps.refpix.use=True
--steps.refpix.u=True

From a user interface perspective, this causes confusion, and it violates several Zen of Python principles.

This is, by the way, the default behavior for `argparse`. Something I did not know.

Anyway, for consistency, it would be good to use `allow_abbrev=False` in the `stpipe` parser. See
https://docs.python.org/3/library/argparse.html#allow-abbrev

This will make debugging user problems much easier: if a user uses `--steps.refpix.u=True` in a script, the same param won't work within Python. And if we add a new parameter `use_side_of_fries_with_that`, then it will break that user's script.
Thanks to @bhilbert4 for pointing this out.
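A small standard-library demonstration of the proposed fix, using a toy parser with the `refpix` option above (not the actual stpipe parser setup):

```python
import argparse

# With allow_abbrev=False (supported since Python 3.5), argparse rejects
# prefix abbreviations of long options instead of silently matching them.
parser = argparse.ArgumentParser(allow_abbrev=False)
parser.add_argument("--steps.refpix.use_side_ref_pixels")

# The full name still works:
args = parser.parse_args(["--steps.refpix.use_side_ref_pixels=True"])
full_name_value = getattr(args, "steps.refpix.use_side_ref_pixels")

# An abbreviation is now rejected (argparse errors out with SystemExit):
try:
    parser.parse_args(["--steps.refpix.use_side=True"])
    abbreviation_rejected = False
except SystemExit:
    abbreviation_rejected = True
```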
The WebbPSF changes to `romancal` now require additional data to be made available in order for the CI to run properly. This has already been done for the `romancal` CI, but it needs to be ported to `stpipe` in order for its testing to continue properly.
Issue JP-1888 was created on JIRA by Edward Slavich:
In order to support existing Step implementations, we'll need to be able to parse the ConfigObj spec string and generate corresponding traitlet attributes. We'll need a new metaclass for JwstStep that creates the traitlets when the class is defined.
The traitlets package is used by IPython/Jupyter for application configuration. It handles type checking, other validation, and mapping to CLI arguments and config files. It also comes with logger configuration support built in. We may be able to replace configobj and a bunch of custom code with traitlets, and usage would be familiar to users who have previously configured ipython, the notebook server, etc.
Here's an example of how traitlets work:
```python
#!/usr/bin/env python3
from traitlets import Integer, Enum, Bool, validate, TraitError
from traitlets.config import Application, LoggingConfigurable


class Step(LoggingConfigurable):
    save_results = Bool(
        default_value=False,
        help="""
        Force save results for an intermediate step.
        """,
        config=True,
    )


class Extract1dStep(Step):
    smoothing_length = Integer(
        default_value=None,
        allow_none=True,
        help="""
        If set, the background regions (if any) will be smoothed
        with a boxcar function of this width along the dispersion
        direction.  This must be an odd integer.
        """,
        config=True,
    )

    @validate("smoothing_length")
    def _validate_odd(self, proposal):
        value = proposal["value"]
        trait = proposal["trait"]
        if value % 2 == 0:
            raise TraitError(f"{trait.name} must be an odd integer")
        return value

    bkg_fit = Enum(
        ["poly", "mean", "median"],
        default_value="poly",
        help="""
        A string indicating the type of fitting to be applied to
        background values in each column (or row, if the dispersion is
        vertical).
        """,
        config=True,
    )

    bkg_order = Integer(
        default_value=None,
        min=0,
        allow_none=True,
        help="""
        If present, a polynomial with order 'bkg_order' will be fit to
        each column (or row, if the dispersion direction is vertical)
        of the background region or regions.  For a given column (row),
        one polynomial will be fit to all background regions.  The
        polynomial will be evaluated at each pixel of the source
        extraction region(s) along the column (row), and the fitted value
        will be subtracted from the data value at that pixel.
        If both 'smoothing_length' and 'bkg_order' are not None, the
        boxcar smoothing will be done first.
        """,
        config=True,
    )

    log_increment = Integer(
        default_value=50,
        help="""
        If log_increment is greater than 0 and the input data are
        multi-integration (which can be CubeModel or SlitModel), a message
        will be written to the log with log level INFO every log_increment
        integrations.  This is intended to provide progress information
        when invoking the step interactively.
        """,
        config=True,
    )

    subtract_background = Bool(
        default_value=None,
        allow_none=True,
        help="""
        A flag which indicates whether the background should be subtracted.
        If absent, the value in the extract_1d reference file will be used.
        If present, this parameter overrides the value in the
        extract_1d reference file.
        """,
        config=True,
    )

    use_source_posn = Bool(
        default_value=None,
        allow_none=True,
        help="""
        If True, the source and background extraction positions specified
        in the extract1d reference file (or the default position, if there
        is no reference file) will be shifted to account for the computed
        position of the source in the data.  If absent, the values in the
        reference file will be used.  Aperture offset is determined by
        computing the pixel location of the source based on its RA and Dec.
        It does not make sense to apply aperture offsets for extended
        sources, so this parameter can be overridden (set to False)
        internally by the step.
        """,
        config=True,
    )

    apply_apcorr = Bool(
        default_value=True,
        help="""
        Switch to select whether or not to apply an APERTURE correction
        during the Extract1dStep.
        """,
        config=True,
    )

    def __call__(self):
        self.log.debug("Called Extract1dStep.__call__")
        self.log.info(f"smoothing_length={self.smoothing_length}")
        self.log.info(f"save_results={self.save_results}")


class Stpipe(Application):
    name = "stpipe"
    description = "Calibration pipeline CLI"

    classes = [Extract1dStep]

    def start(self):
        self.log.debug("Called Stpipe.start")
        step = Extract1dStep(config=self.config)
        step()


def main():
    Stpipe.launch_instance()


if __name__ == "__main__":
    main()
```
Calling this script with no arguments:
$ ./stpipe
[Stpipe] smoothing_length=None
[Stpipe] save_results=False
Overriding save_results for all subclasses of `Step`:
$ ./stpipe --Step.save_results=true
[Stpipe] smoothing_length=None
[Stpipe] save_results=True
Overriding for just `Extract1dStep`:
$ ./stpipe --Extract1dStep.save_results=true
[Stpipe] smoothing_length=None
[Stpipe] save_results=True
Invalid smoothing_length caught by custom validator:
$ ./stpipe --Extract1dStep.smoothing_length=10
Traceback (most recent call last):
...
traitlets.traitlets.TraitError: smoothing_length must be an odd integer
Okay, fine, let's turn it up to 11:
$ ./stpipe --Extract1dStep.smoothing_length=11
[Stpipe] smoothing_length=11
[Stpipe] save_results=False
Built-in support for log configuration:
$ ./stpipe --Stpipe.log_format='%(asctime)s - %(levelname)s - %(message)s' --Stpipe.log_level=DEBUG
2021-01-21 22:55:22 - DEBUG - Called Stpipe.start
2021-01-21 22:55:22 - DEBUG - Called Extract1dStep.__call__
2021-01-21 22:55:22 - INFO - smoothing_length=None
2021-01-21 22:55:22 - INFO - save_results=False
Auto-generated help for Extract1dStep:
Extract1dStep(Step) options
---------------------------
--Extract1dStep.apply_apcorr=<Bool>
Switch to select whether or not to apply an APERTURE correction during the
Extract1dStep.
Default: True
--Extract1dStep.bkg_fit=<Enum>
A string indicating the type of fitting to be applied to background values
in each column (or row, if the dispersion is vertical).
Choices: any of ['poly', 'mean', 'median']
Default: 'poly'
--Extract1dStep.bkg_order=<Int>
If present, a polynomial with order 'bkg_order' will be fit to each column
(or row, if the dispersion direction is vertical) of the background region
or regions. For a given column (row), one polynomial will be fit to all
background regions. The polynomial will be evaluated at each pixel of the
source extraction region(s) along the column (row), and the fitted value
will be subtracted from the data value at that pixel. If both
'smoothing_length' and 'bkg_order' are not None, the boxcar smoothing will
be done first.
Default: None
--Extract1dStep.log_increment=<Int>
If log_increment is greater than 0 and the input data are multi-integration
(which can be CubeModel or SlitModel), a message will be written to the log
with log level INFO every log_increment integrations. This is intended to
provide progress information when invoking the step interactively.
Default: 50
--Extract1dStep.save_results=<Bool>
Force save results for an intermediate step.
Default: False
--Extract1dStep.smoothing_length=<Int>
If set, the background regions (if any) will be smoothed with a boxcar
function of this width along the dispersion direction. This must be an odd
integer.
Default: None
--Extract1dStep.subtract_background=<Bool>
A flag which indicates whether the background should be subtracted. If
absent, the value in the extract_1d reference file will be used. If present,
this parameter overrides the value in the extract_1d reference file.
Default: None
--Extract1dStep.use_source_posn=<Bool>
If True, the source and background extraction positions specified in the
extract1d reference file (or the default position, if there is no reference
file) will be shifted to account for the computed position of the source in
the data. If absent, the values in the reference file will be used.
Aperture offset is determined by computing the pixel location of the source
based on its RA and Dec. It does not make sense to apply aperture offsets
for extended sources, so this parameter can be overriden (set to False)
internally by the step.
Default: None
Config file example (we wouldn't have to use this, but it's available if we need it):
c.Extract1dStep.save_results = True
c.Extract1dStep.smoothing_length = 11
This issue came up while adding a spec to jwst ami analyze in:
spacetelescope/jwst#7862

An attempt to add an inline comment to a spec entry like the following:

src = string(default='A0V') # Source spectral type for model (Phoenix models)

failed with an odd error:

ValidationError: Config parameter 'src': missing

Removing the parentheses enclosing "Phoenix models" fixed the issue.
This appears to be related to the `_inspec` argument provided to the `ConfigObj` here:

stpipe/src/stpipe/config_parser.py
Line 202 in e82a1f0

Without `_inspec`, ConfigObj fills the `inline_comments` attribute with accurate values for the comments. I'm not sure what `preserve_comments` is supposed to be doing.

Perhaps one way to address this would be to:
- drop the `_inspec` argument
- remove `preserve_comments`
Hi!
I was wondering if `pathlib.Path` objects were supported in `stpipe` (and packages depending on it, like `jwst`), or if support was planned for the future.

I wanted to report an example issue I encountered recently: using a `Path` object as input to `Step.run()` leads to the `_input_filename` attribute not being set, because the `Path` falls into the `else` category here.

The simple fix for users is to use strings as input paths, but it took me a bit of time to figure this out because I missed the error message in the log.
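A minimal sketch of the kind of normalization that would make `Path` inputs work (`normalize_input` is a hypothetical helper for illustration, not the stpipe code):

```python
import os
import pathlib

def normalize_input(input_obj):
    # Hypothetical helper: treat anything path-like (str or pathlib.Path)
    # uniformly by converting it to a string path up front, so downstream
    # code that branches on str never falls into the wrong category.
    if isinstance(input_obj, (str, os.PathLike)):
        return os.fspath(input_obj)
    return input_obj

print(normalize_input(pathlib.Path("dataset.fits")))  # prints dataset.fits
```

`os.fspath` accepts both `str` and any `os.PathLike` object, so data models and other non-path inputs pass through untouched.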
Issue JP-1817 was created on JIRA by Jonathan Eisenhamer:
In Python, setting up logging for `Step.call` is possible through the `stpipe.log` module. Even with documentation, this is a bit obtuse. Users are expecting a keyword argument, similar to the `strun` call. Add the keyword argument such that `Step.call(verbose=True)` is available.
https://github.com/spacetelescope/stpipe/blob/main/src/stpipe/resources/schemas/step_config_with_metadata-1.0.0.yaml is unused in `stpipe`.

One use exists in the following `jwst` test:
https://github.com/spacetelescope/jwst/blob/master/jwst/stpipe/tests/test_config.py#L38

I suspect that the schema was intended to be used in:
Line 93 in d20b642
Line 129 in d20b642

@jdavies-st this might stem from spacetelescope/jwst#5695. Do you recall if this schema was intended only for test use, or was the plan to use it to validate step configs (that contain metadata)?
Issue JP-1896 was created on JIRA by Edward Slavich:
The "stpipe describe" command will show the docstring for a Step subclass, its sub-steps in the case of a Pipeline, and description of the available parameters. Tentative help output:
$ ./stpipe describe -h
usage: stpipe describe [-h] [-v] <class>
describe a step or pipeline
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
optional arguments:
-h, --help show this help message and exit
-v, --verbose verbose output
examples:
print step description and parameters:
stpipe describe jwst.step.RampFitStep
increase verbosity to include parameters for nested steps:
stpipe describe --verbose jwst.pipeline.Detector1Pipeline
Hi,
I found that the strun command does not handle the following command line correctly:

strun <other args> --steps.resample.output_shape 1000,1000

----------------------------------------------------------------------
ERROR PARSING CONFIGURATION:
Config parameter 'output_shape': the value "1000,1000" is of the
wrong type.

I did some debugging and found that this is because when the Step is created, the param_args passed to the instance are the raw args parsed from the ArgumentParser. These values are handled neither by `cmdline._override_config_from_args` nor by `config_parser.string_to_python_type`.

I put together a quick fix for now (https://github.com/Jerry-Ma/stpipe/tree/fix_list_like_cmdline_args); let me know if you want me to make a PR.
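For illustration, the kind of conversion the raw CLI values need might look like this (`parse_cli_value` is a hypothetical helper sketched here, not stpipe's actual `string_to_python_type`):

```python
def parse_cli_value(raw: str):
    # Hypothetical converter: turn a raw CLI string into a Python value,
    # splitting comma-separated values into lists (as ConfigObj specs
    # expect for list-typed parameters) and casting numeric tokens.
    if "," in raw:
        return [parse_cli_value(part.strip()) for part in raw.split(",")]
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw

print(parse_cli_value("1000,1000"))  # prints [1000, 1000]
```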
Currently, specifying a log file requires editing a .cfg file (with the drawback that if one forgets about that change, one can be confused about why one is not seeing any output when running a task).

From the command line it should be possible to specify a log file as a command-line option, and in Python, as a keyword argument to the method starting a pipeline step or pipeline.

Even better, it is useful to see the log output in a terminal and have that output also go to a log file, so some sort of "tee" option should be provided as well.
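For reference, the standard library already supports this "tee" behavior by attaching multiple handlers to one logger; a minimal sketch (stdlib only, not the proposed stpipe API):

```python
import logging

# Attaching two handlers to the same logger sends every record to both
# the terminal and a file, which is exactly the "tee" behavior described.
log = logging.getLogger("stpipe_tee_example")
log.setLevel(logging.INFO)
log.addHandler(logging.StreamHandler())                        # terminal
log.addHandler(logging.FileHandler("pipeline.log", mode="w"))  # log file

log.info("step complete")  # appears on screen and in pipeline.log
```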
Currently the code assumes meta.filename will exist, but models should be free to store it in a different location. We should add a method to `AbstractDataModel` that accesses the filename wherever it lives.
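A sketch of what such an accessor could look like (the method name `get_filename` and the example subclass are hypothetical):

```python
import abc

class AbstractDataModel(abc.ABC):
    # Hypothetical accessor: each datamodel implementation decides where
    # its filename lives, instead of stpipe reaching into meta.filename.
    @abc.abstractmethod
    def get_filename(self) -> str:
        """Return the model's filename, wherever the model stores it."""

class ExampleModel(AbstractDataModel):
    def __init__(self, filename: str):
        self._filename = filename

    def get_filename(self) -> str:
        return self._filename

print(ExampleModel("dataset.fits").get_filename())  # prints dataset.fits
```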
Issue JP-1897 was created on JIRA by Edward Slavich:
The "stpipe print-config" or "stpipe save-config" subcommand (part of this ticket will be polling opinions on which of these will work best) will allow users to save Step subclass default and CRDS parameters to an ASDF config file which the user can customize. An example help output for the "print-config" option, which prints the ASDF to stdout where it can be redirected as needed:
$ ./stpipe print-config -h
usage: stpipe print-config [-h] <class> <input-file>
print step or pipeline config to stdout
positional arguments:
<class> step or pipeline class name (case-insensitive, module path may be omitted for unique class names)
<input-file> input dataset or association (used to fetch parameters from CRDS)
optional arguments:
-h, --help show this help message and exit
examples:
save a pipeline config to a local file:
stpipe print-config jwst.pipeline.Detector1Pipeline dataset.fits > config.asdf
See #8 for discussion.
If I do:
$ strun calwebb_image3 --help
usage: strun [-h] [--logcfg LOGCFG] [--verbose] [--debug]
[--save-parameters SAVE_PARAMETERS] [--disable-crds-steppars]
[--pre_hooks] [--post_hooks] [--output_file] [--output_dir]
[--output_ext] [--output_use_model] [--output_use_index]
...
Image3Pipeline: Applies level 3 processing to imaging-mode data from any JWST
instrument. Included steps are: assign_mtwcs tweakreg skymatch
outlier_detection resample source_catalog
positional arguments:
cfg_file_or_class The configuration file or Python class to run
args arguments to pass to step
options:
-h, --help show this help message and exit
--logcfg LOGCFG The logging configuration file to load
--verbose, -v Turn on all logging messages
--debug When an exception occurs, invoke the Python debugger,
pdb
--save-parameters SAVE_PARAMETERS
Save step parameters to specified file.
--disable-crds-steppars
Disable retrieval of step parameter references files
from CRDS
--pre_hooks
--post_hooks
--output_file File to save output to.
--output_dir Directory path for output files
--output_ext Output file type
--output_use_model When saving use `DataModel.meta.filename`
--output_use_index Append index.
...
--steps.tweakreg.kernel_fwhm
Gaussian kernel FWHM in pixels
--steps.tweakreg.snr_threshold
SNR threshold above the bkg
--steps.tweakreg.sharplo
The lower bound on sharpness for object detection.
--steps.tweakreg.sharphi
The upper bound on sharpness for object detection.
--steps.tweakreg.roundlo
The lower bound on roundness for object detection.
--steps.tweakreg.roundhi
The upper bound on roundness for object detection.
--steps.tweakreg.brightest
Keep top ``brightest`` objects
--steps.tweakreg.peakmax
Filter out objects with pixel values >= ``peakmax``
--steps.tweakreg.bkg_boxsize
The background mesh box size in pixels.
--steps.tweakreg.enforce_user_order
Align images in user specified order?
--steps.tweakreg.expand_refcat
Expand reference catalog with new sources?
...
Note that while it lists the available args, it does not list their defaults. The defaults are in the `spec` attribute of each class, so it should be easy to list them in the `--help` output as well.

This is actually the quickest, easiest interface for finding out what a pipeline's options are, but it does not give the defaults. There is currently no reliable interface to do this nicely within Python either, as `Pipeline.spec` is just a string and does not give the step parameters.
This is an important user interface issue that should be fixed.
Side note: The logging from the run of the pipeline does list the parameters used (some of which may be defaults), but it is very difficult (impossible?) to read the jumbled dict that is dumped in the log. Formatting with one parameter per line, something like the above, would be much preferable.
Currently the plan is to have romancal use tagged ASDF files, but doing so requires a different class implementation for data models. Stpipe should be able to accommodate different datamodel implementations without requiring subclassing of the stdatamodels implementation. Currently stpipe's only dependence on the datamodel class is on these methods:
Issue JP-1898 was created on JIRA by Edward Slavich:
Add support for defining Step parameters as traitlets. The Step class will need to inherit from traitlets.HasTraits, and we'll need to include any traitlet parameters when generating config files, describing steps, etc.
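A hedged sketch of how such a metaclass might inject traitlets at class-definition time (all names here are hypothetical; a real version would parse the ConfigObj spec string rather than a toy mapping):

```python
import traitlets

# Toy mapping from spec type names to trait factories; a real version
# would cover the full ConfigObj check vocabulary.
TRAIT_FOR_KIND = {
    "float": lambda default: traitlets.Float(default, allow_none=True),
    "boolean": lambda default: traitlets.Bool(default, allow_none=True),
}

class SpecMeta(type(traitlets.HasTraits)):
    def __new__(mcls, name, bases, namespace):
        # Inject a traitlet attribute for each parsed spec entry before
        # traitlets' own metaclass processes the class body.
        for pname, (kind, default) in namespace.pop("parsed_spec", {}).items():
            namespace[pname] = TRAIT_FOR_KIND[kind](default)
        return super().__new__(mcls, name, bases, namespace)

class JwstStep(traitlets.HasTraits, metaclass=SpecMeta):
    pass

class JumpStep(JwstStep):
    # Stand-in for a parsed Step.spec.
    parsed_spec = {"rejection_threshold": ("float", 4.0)}

step = JumpStep()
print(step.rejection_threshold)  # prints 4.0
```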
Replace all instances thereof, and then in pyproject.toml:

[tool.pytest.ini_options]
addopts = [
    "-p no:legacypath",
]

This will make the code more robust for use of `pathlib.Path` instances in the runtime code.
Something I forgot to do initially...

Currently, `Step` parameter validation requires file paths to be strings, specifically for the `override_<reffile>` cases. `pathlib.Path` objects should also pass validation. Specifically, this means modifying the custom validation functions here:

stpipe/src/stpipe/config_parser.py
Lines 331 to 334 in b928337
@jemorrison reported that parsing list arguments of steps does not work. Indeed, I verified that running `strun` with

--steps.resample.output_shape=500,800

which used to work before, no longer works:

ValueError: Config parameter 'output_shape': the value "500,800" is of the wrong type.

CC: @stscieisenhamer

After some detective work, I determined that #57 broke support for list arguments.
When a step runs, it logs the parameters used, but it misses the reffile overrides. Using the same `Step.get_pars()` method that the log message uses, we see:
>>> from jwst.step import GainScaleStep
>>> gain_scale = GainScaleStep()
>>> gain_scale.get_pars()
{'pre_hooks': [],
'post_hooks': [],
'output_file': None,
'output_dir': None,
'output_ext': '.fits',
'output_use_model': False,
'output_use_index': True,
'save_results': False,
'skip': False,
'suffix': None,
'search_output_file': True,
'input_dir': ''}
Compared to the command line `--help`:
$ strun gain_scale --help
usage: strun [-h] [--logcfg LOGCFG] [--verbose] [--debug] [--save-parameters SAVE_PARAMETERS]
[--disable-crds-steppars] [--pre_hooks] [--post_hooks] [--output_file] [--output_dir] [--output_ext]
[--output_use_model] [--output_use_index] [--save_results] [--skip] [--suffix] [--search_output_file]
[--input_dir] [--override_gain]
cfg_file_or_class [args ...]
GainScaleStep: Rescales countrate data to account for use of a non-standard gain value. All integrations are
multiplied by the factor GAINFACT.
positional arguments:
cfg_file_or_class The configuration file or Python class to run
args arguments to pass to step
options:
-h, --help show this help message and exit
--logcfg LOGCFG The logging configuration file to load
--verbose, -v Turn on all logging messages
--debug When an exception occurs, invoke the Python debugger, pdb
--save-parameters SAVE_PARAMETERS
Save step parameters to specified file.
--disable-crds-steppars
Disable retrieval of step parameter references files from CRDS
--pre_hooks [default=list]
--post_hooks [default=list]
--output_file File to save output to.
--output_dir Directory path for output files
--output_ext Output file type [default='.fits']
--output_use_model When saving use `DataModel.meta.filename` [default=False]
--output_use_index Append index. [default=True]
--save_results Force save results [default=False]
--skip Skip this step [default=False]
--suffix Default suffix for output files
--search_output_file
Use outputfile define in parent step [default=True]
--input_dir Input directory
--override_gain Override the gain reference file
Note that the `override_gain` parameter is not printed out. This is true whether an override has been supplied or not. So somehow the merged config grabbed by `get_pars()` is missing the reffile overrides.
Issue JP-1884 was created on JIRA by Edward Slavich:
Currently a Step instance accepts an input file as an argument to the run method, which encourages reusing the step with subsequent files. Now that we have CRDS config references, this isn't a good idea, because the Step's parameters may only be appropriate for the original file.
Hi,
stpipe: 0.5.1
While updating the package for Guix downstream from 0.5.0 to 0.5.1, I ran into a failing test:
============================= test session starts ==============================
platform linux -- Python 3.10.7, pytest-7.1.3, pluggy-1.0.0 -- /gnu/store/l6fpy0i9hlll9b6k8vy2i2a4cshwz3cv-python-wrapper-3.10.7/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/tmp/guix-build-python-stpipe-0.5.1.drv-0/stpipe-0.5.1/.hypothesis/examples')
rootdir: /tmp/guix-build-python-stpipe-0.5.1.drv-0/stpipe-0.5.1, configfile: pyproject.toml, testpaths: tests
plugins: hypothesis-6.54.5, doctestplus-0.12.1, openfiles-0.5.0, asdf-2.15.0, astropy-header-0.2.2
collecting ... collected 44 items
tests/test_abstract_datamodel.py::test_roman_datamodel FAILED [ 2%]
tests/test_abstract_datamodel.py::test_jwst_datamodel SKIPPED (could not import 'jwst.datamodels': No module named 'jwst') [ 4%]
tests/test_abstract_datamodel.py::test_good_datamodel PASSED [ 6%]
tests/test_abstract_datamodel.py::test_bad_datamodel PASSED [ 9%]
tests/test_config_parser.py::test_merge_config_nested_mapping PASSED [ 11%]
tests/test_config_parser.py::test_preserve_comments_deprecation[True] PASSED [ 13%]
tests/test_config_parser.py::test_preserve_comments_deprecation[False] PASSED [ 15%]
tests/test_config_parser.py::test_preserve_comments_deprecation[None] PASSED [ 18%]
tests/test_config_parser.py::test_preserve_comments_deprecation[value3] PASSED [ 20%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value="{value}"-fields0-None] PASSED [ 22%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="fred" value="great"-fields1-None] PASSED [ 25%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="fred" value="great"_more-fields2-None] PASSED [ 27%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value="great"_more-fields3-None] PASSED [ 29%]
tests/test_format_template.py::test_basics[None-name="{name}" value="{value}"-name="{name}" value=""-fields4-None] PASSED [ 31%]
tests/test_format_template.py::test_basics[key_formats5-astring-astring-fields5-None] PASSED [ 34%]
tests/test_format_template.py::test_basics[key_formats6-astring-astring_s00001-fields6-None] PASSED [ 36%]
tests/test_format_template.py::test_basics[key_formats7-astring-astring_smysource-fields7-None] PASSED [ 38%]
tests/test_format_template.py::test_basics[key_formats8-astring-astring_error-fields8-errors8] PASSED [ 40%]
tests/test_format_template.py::test_separators PASSED [ 43%]
tests/test_format_template.py::test_allow_unknown PASSED [ 45%]
tests/test_integration.py::test_asdf_extension PASSED [ 47%]
tests/test_logger.py::test_configuration PASSED [ 50%]
tests/test_logger.py::test_record_logs PASSED [ 52%]
tests/test_scripts.py::test_scripts_in_path PASSED [ 54%]
tests/test_step.py::test_build_config_pipe_config_file PASSED [ 56%]
tests/test_step.py::test_build_config_pipe_crds PASSED [ 59%]
tests/test_step.py::test_build_config_pipe_default PASSED [ 61%]
tests/test_step.py::test_build_config_pipe_kwarg PASSED [ 63%]
tests/test_step.py::test_build_config_step_config_file PASSED [ 65%]
tests/test_step.py::test_build_config_step_crds PASSED [ 68%]
tests/test_step.py::test_build_config_step_default PASSED [ 70%]
tests/test_step.py::test_build_config_step_kwarg PASSED [ 72%]
tests/test_step.py::test_step_list_args PASSED [ 75%]
tests/test_step.py::test_logcfg_routing PASSED [ 77%]
tests/test_step.py::test_log_records PASSED [ 79%]
tests/cli/test_list.py::test_no_arguments PASSED [ 81%]
tests/cli/test_list.py::test_pipelines_only PASSED [ 84%]
tests/cli/test_list.py::test_steps_only PASSED [ 86%]
tests/cli/test_list.py::test_filter_class_names PASSED [ 88%]
tests/cli/test_list.py::test_filter_aliases PASSED [ 90%]
tests/cli/test_main.py::test_version[-v] PASSED [ 93%]
tests/cli/test_main.py::test_version[--version] PASSED [ 95%]
tests/cli/test_main.py::test_package_main PASSED [ 97%]
tests/cli/test_main.py::test_cli_main PASSED [100%]
=================================== FAILURES ===================================
_____________________________ test_roman_datamodel _____________________________
def test_roman_datamodel():
roman_datamodels = pytest.importorskip("roman_datamodels.datamodels")
> import roman_datamodels.tests.util as rutil
E ModuleNotFoundError: No module named 'roman_datamodels.tests'
tests/test_abstract_datamodel.py:12: ModuleNotFoundError
=========================== short test summary info ============================
FAILED tests/test_abstract_datamodel.py::test_roman_datamodel - ModuleNotFoun...
=================== 1 failed, 42 passed, 1 skipped in 3.89s ====================
Reproduce:
guix time-machine --commit=b566e1a98a74d84d3978cffefd05295602c9445d -- build --system=x86_64-linux --with-latest=python-stpipe python-stpipe
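The failure above comes from a bare import of roman_datamodels.tests.util after only the parent package was guarded. A hedged sketch of one possible fix, guarding the submodule import the same way (names taken from the traceback above):

```python
import pytest

def test_roman_datamodel():
    # importorskip() skips the test (instead of failing with
    # ModuleNotFoundError) when the module is absent.
    roman_datamodels = pytest.importorskip("roman_datamodels.datamodels")
    rutil = pytest.importorskip("roman_datamodels.tests.util")
    ...  # body of the original test
```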
We'll need to replace uses of _jail (there are only a few) with tmp_path.
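For illustration, a minimal sketch of the migration (hypothetical test body; _jail was a fixture that chdir'd into a fresh temporary directory, which pytest's built-in tmp_path plus monkeypatch.chdir covers):

```python
# Before (hypothetical): def test_output_file(_jail): ...
# After: use the built-in tmp_path fixture.
def test_output_file(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)  # replicate _jail's chdir behaviour
    out = tmp_path / "result.fits"
    out.write_text("placeholder")
    assert out.exists()
```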
Issue JP-1899 was created on JIRA by Edward Slavich:
Once traitlet-based Step parameters are available, we'll need to migrate the "spec" class attribute on the existing jwst steps to individual traitlet attributes.
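A hedged sketch of what a migrated parameter set might look like (class and parameter names borrowed from the jump step for illustration; the surrounding Step machinery is assumed):

```python
from traitlets import Bool, Float, HasTraits, Unicode

# Hypothetical migration target: each former "spec" entry becomes a
# declared, validated traitlet attribute with an inline help string.
class JumpStepParameters(HasTraits):
    rejection_threshold = Float(4.0, help="CR sigma rejection threshold")
    flag_4_neighbors = Bool(True, help="flag the four nearest neighbors")
    maximum_cores = Unicode("1", help="cores to use for multiprocessing")

params = JumpStepParameters()
params.rejection_threshold = 3.0  # validated on assignment
```

Assigning a value of the wrong type (e.g. a string to flag_4_neighbors) raises traitlets.TraitError, which replaces the validation the configobj spec currently performs.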
Currently, when a step runs, it logs the step parameters (or, for a pipeline, the pipeline and constituent step parameters). The output is basically unreadable, as can be seen for jwst.pipeline.Detector1Pipeline:
{'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'save_calibrated_ramp': False, 'steps': {'group_scale': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}, 'dq_init': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}, 'emicorr': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': True, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'save_intermediate_results': False, 'user_supplied_reffile': None, 'nints_to_phase': None, 'nbins': None, 'scale_reference': True}, 'saturation': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'n_pix_grow_sat': 1},
...
'jump': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'rejection_threshold': 4.0, 'three_group_rejection_threshold': 6.0, 'four_group_rejection_threshold': 5.0, 'maximum_cores': '1', 'flag_4_neighbors': True, 'max_jump_to_flag_neighbors': 1000.0, 'min_jump_to_flag_neighbors': 10.0, 'after_jump_flag_dn1': 0.0, 'after_jump_flag_time1': 0.0, 'after_jump_flag_dn2': 0.0, 'after_jump_flag_time2': 0.0, 'expand_large_events': False, 'min_sat_area': 1.0, 'min_jump_area': 5.0, 'expand_factor': 2.0, 'use_ellipses': False, 'sat_required_snowball': True, 'min_sat_radius_extend': 2.5, 'sat_expand': 2, 'edge_size': 25, 'find_showers': False, 'extend_snr_threshold': 1.2, 'extend_min_area': 90, 'extend_inner_radius': 1.0, 'extend_outer_radius': 2.6, 'extend_ellipse_expand_ratio': 1.1, 'time_masked_after_shower': 15.0, 'max_extended_radius': 200, 'minimum_groups': 3, 'minimum_sigclip_groups': 100, 'only_use_ints': True}, 'ramp_fit': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'int_name': '', 'save_opt': False, 'opt_name': '', 'suppress_one_group': True, 'maximum_cores': '1'}, 'gain_scale': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': ''}}}
It would be much better to have formatted output. Here are three options:
yaml.dump()
pre_hooks: []
post_hooks: []
output_file: null
output_dir: null
output_ext: .fits
output_use_model: false
output_use_index: true
save_results: false
skip: false
suffix: null
search_output_file: true
input_dir: ''
save_calibrated_ramp: false
steps:
  group_scale:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
  dq_init:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
  emicorr:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: true
    suffix: null
    search_output_file: true
    input_dir: ''
    save_intermediate_results: false
    user_supplied_reffile: null
    nints_to_phase: null
    nbins: null
    scale_reference: true
  saturation:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    n_pix_grow_sat: 1
  ...
  jump:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    rejection_threshold: 4.0
    three_group_rejection_threshold: 6.0
    four_group_rejection_threshold: 5.0
    maximum_cores: '1'
    flag_4_neighbors: true
    max_jump_to_flag_neighbors: 1000.0
    min_jump_to_flag_neighbors: 10.0
    after_jump_flag_dn1: 0.0
    after_jump_flag_time1: 0.0
    after_jump_flag_dn2: 0.0
    after_jump_flag_time2: 0.0
    expand_large_events: false
    min_sat_area: 1.0
    min_jump_area: 5.0
    expand_factor: 2.0
    use_ellipses: false
    sat_required_snowball: true
    min_sat_radius_extend: 2.5
    sat_expand: 2
    edge_size: 25
    find_showers: false
    extend_snr_threshold: 1.2
    extend_min_area: 90
    extend_inner_radius: 1.0
    extend_outer_radius: 2.6
    extend_ellipse_expand_ratio: 1.1
    time_masked_after_shower: 15.0
    max_extended_radius: 200
    minimum_groups: 3
    minimum_sigclip_groups: 100
    only_use_ints: true
  ramp_fit:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
    int_name: ''
    save_opt: false
    opt_name: ''
    suppress_one_group: true
    maximum_cores: '1'
  gain_scale:
    pre_hooks: []
    post_hooks: []
    output_file: null
    output_dir: null
    output_ext: .fits
    output_use_model: false
    output_use_index: true
    save_results: false
    skip: false
    suffix: null
    search_output_file: true
    input_dir: ''
json.dumps()
{
    "pre_hooks": [],
    "post_hooks": [],
    "output_file": null,
    "output_dir": null,
    "output_ext": ".fits",
    "output_use_model": false,
    "output_use_index": true,
    "save_results": false,
    "skip": false,
    "suffix": null,
    "search_output_file": true,
    "input_dir": "",
    "save_calibrated_ramp": false,
    "steps": {
        "group_scale": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        },
        "dq_init": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        },
        "emicorr": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": true,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "save_intermediate_results": false,
            "user_supplied_reffile": null,
            "nints_to_phase": null,
            "nbins": null,
            "scale_reference": true
        },
        "saturation": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "n_pix_grow_sat": 1
        },
        ...
        "jump": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "rejection_threshold": 4.0,
            "three_group_rejection_threshold": 6.0,
            "four_group_rejection_threshold": 5.0,
            "maximum_cores": "1",
            "flag_4_neighbors": true,
            "max_jump_to_flag_neighbors": 1000.0,
            "min_jump_to_flag_neighbors": 10.0,
            "after_jump_flag_dn1": 0.0,
            "after_jump_flag_time1": 0.0,
            "after_jump_flag_dn2": 0.0,
            "after_jump_flag_time2": 0.0,
            "expand_large_events": false,
            "min_sat_area": 1.0,
            "min_jump_area": 5.0,
            "expand_factor": 2.0,
            "use_ellipses": false,
            "sat_required_snowball": true,
            "min_sat_radius_extend": 2.5,
            "sat_expand": 2,
            "edge_size": 25,
            "find_showers": false,
            "extend_snr_threshold": 1.2,
            "extend_min_area": 90,
            "extend_inner_radius": 1.0,
            "extend_outer_radius": 2.6,
            "extend_ellipse_expand_ratio": 1.1,
            "time_masked_after_shower": 15.0,
            "max_extended_radius": 200,
            "minimum_groups": 3,
            "minimum_sigclip_groups": 100,
            "only_use_ints": true
        },
        "ramp_fit": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": "",
            "int_name": "",
            "save_opt": false,
            "opt_name": "",
            "suppress_one_group": true,
            "maximum_cores": "1"
        },
        "gain_scale": {
            "pre_hooks": [],
            "post_hooks": [],
            "output_file": null,
            "output_dir": null,
            "output_ext": ".fits",
            "output_use_model": false,
            "output_use_index": true,
            "save_results": false,
            "skip": false,
            "suffix": null,
            "search_output_file": true,
            "input_dir": ""
        }
    }
}
pprint.pformat()
{'pre_hooks': [],
 'post_hooks': [],
 'output_file': None,
 'output_dir': None,
 'output_ext': '.fits',
 'output_use_model': False,
 'output_use_index': True,
 'save_results': False,
 'skip': False,
 'suffix': None,
 'search_output_file': True,
 'input_dir': '',
 'save_calibrated_ramp': False,
 'steps': {'group_scale': {'pre_hooks': [],
                           'post_hooks': [],
                           'output_file': None,
                           'output_dir': None,
                           'output_ext': '.fits',
                           'output_use_model': False,
                           'output_use_index': True,
                           'save_results': False,
                           'skip': False,
                           'suffix': None,
                           'search_output_file': True,
                           'input_dir': ''},
           'dq_init': {'pre_hooks': [],
                       'post_hooks': [],
                       'output_file': None,
                       'output_dir': None,
                       'output_ext': '.fits',
                       'output_use_model': False,
                       'output_use_index': True,
                       'save_results': False,
                       'skip': False,
                       'suffix': None,
                       'search_output_file': True,
                       'input_dir': ''},
           'emicorr': {'pre_hooks': [],
                       'post_hooks': [],
                       'output_file': None,
                       'output_dir': None,
                       'output_ext': '.fits',
                       'output_use_model': False,
                       'output_use_index': True,
                       'save_results': False,
                       'skip': True,
                       'suffix': None,
                       'search_output_file': True,
                       'input_dir': '',
                       'save_intermediate_results': False,
                       'user_supplied_reffile': None,
                       'nints_to_phase': None,
                       'nbins': None,
                       'scale_reference': True},
           'saturation': {'pre_hooks': [],
                          'post_hooks': [],
                          'output_file': None,
                          'output_dir': None,
                          'output_ext': '.fits',
                          'output_use_model': False,
                          'output_use_index': True,
                          'save_results': False,
                          'skip': False,
                          'suffix': None,
                          'search_output_file': True,
                          'input_dir': '',
                          'n_pix_grow_sat': 1},
           ...
           'jump': {'pre_hooks': [],
                    'post_hooks': [],
                    'output_file': None,
                    'output_dir': None,
                    'output_ext': '.fits',
                    'output_use_model': False,
                    'output_use_index': True,
                    'save_results': False,
                    'skip': False,
                    'suffix': None,
                    'search_output_file': True,
                    'input_dir': '',
                    'rejection_threshold': 4.0,
                    'three_group_rejection_threshold': 6.0,
                    'four_group_rejection_threshold': 5.0,
                    'maximum_cores': '1',
                    'flag_4_neighbors': True,
                    'max_jump_to_flag_neighbors': 1000.0,
                    'min_jump_to_flag_neighbors': 10.0,
                    'after_jump_flag_dn1': 0.0,
                    'after_jump_flag_time1': 0.0,
                    'after_jump_flag_dn2': 0.0,
                    'after_jump_flag_time2': 0.0,
                    'expand_large_events': False,
                    'min_sat_area': 1.0,
                    'min_jump_area': 5.0,
                    'expand_factor': 2.0,
                    'use_ellipses': False,
                    'sat_required_snowball': True,
                    'min_sat_radius_extend': 2.5,
                    'sat_expand': 2,
                    'edge_size': 25,
                    'find_showers': False,
                    'extend_snr_threshold': 1.2,
                    'extend_min_area': 90,
                    'extend_inner_radius': 1.0,
                    'extend_outer_radius': 2.6,
                    'extend_ellipse_expand_ratio': 1.1,
                    'time_masked_after_shower': 15.0,
                    'max_extended_radius': 200,
                    'minimum_groups': 3,
                    'minimum_sigclip_groups': 100,
                    'only_use_ints': True},
           'ramp_fit': {'pre_hooks': [],
                        'post_hooks': [],
                        'output_file': None,
                        'output_dir': None,
                        'output_ext': '.fits',
                        'output_use_model': False,
                        'output_use_index': True,
                        'save_results': False,
                        'skip': False,
                        'suffix': None,
                        'search_output_file': True,
                        'input_dir': '',
                        'int_name': '',
                        'save_opt': False,
                        'opt_name': '',
                        'suppress_one_group': True,
                        'maximum_cores': '1'},
           'gain_scale': {'pre_hooks': [],
                          'post_hooks': [],
                          'output_file': None,
                          'output_dir': None,
                          'output_ext': '.fits',
                          'output_use_model': False,
                          'output_use_index': True,
                          'save_results': False,
                          'skip': False,
                          'suffix': None,
                          'search_output_file': True,
                          'input_dir': ''}}}
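The three candidates can be compared on a small slice of the parameter tree (subset chosen for illustration; yaml requires the third-party PyYAML package, so it is guarded here):

```python
import json
import pprint

# Small subset of the Detector1Pipeline parameters shown above.
params = {
    "skip": False,
    "suffix": None,
    "steps": {"jump": {"rejection_threshold": 4.0, "maximum_cores": "1"}},
}

print(json.dumps(params, indent=4))      # JSON spelling: null/true/false
print(pprint.pformat(params, width=40))  # Python literals, aligned nesting

try:
    import yaml  # third-party: pip install pyyaml
    print(yaml.dump(params, sort_keys=False))
except ImportError:
    pass
```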
@hbushouse and @nden, 👍 or 👎? And if 👍, which of the three is preferable?
I'm currently trying to use my own logging and wrap around some of the JWST pipeline. This works OK, but the logs produce multiple outputs once I've used anything that wraps around stpipe. I import a logger at the top of my submodule like:
log = logging.getLogger(__name__)
and then log.info('Test message')
produces
[2023-06-29 14:28:39,968] pjpipe.download.download_step - INFO - Test message
as expected. However, after I open up a datamodel
im = datamodels.open(filename)
log.info('Test message')
then produces 3 messages:
[2023-06-29 14:28:41,361] pjpipe.download.download_step - INFO - Test message
2023-06-29 14:28:41,361 - stpipe - INFO - Test message
[2023-06-29 14:28:41,361] stpipe - INFO - Test message
This doesn't seem like intended behaviour, and it should be a relatively easy fix.
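The duplication is consistent with extra handlers being installed on the root and "stpipe" loggers when stpipe machinery is imported, so records from the application logger propagate up and are emitted once per handler. A minimal reproduction plus one application-side workaround (the handler wiring here is a stand-in, not stpipe's actual setup):

```python
import logging
import sys

app_log = logging.getLogger("pjpipe.download.download_step")
app_log.setLevel(logging.INFO)
app_log.addHandler(logging.StreamHandler(sys.stdout))

# Stand-in for the handler that a library attaches to the root logger:
logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))

app_log.info("Test message")   # emitted twice: own handler + root handler

app_log.propagate = False      # workaround: stop records reaching root
app_log.info("Test message")   # emitted once
```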