Python library for generating high-performance implementations of stencil kernels for weather and climate modeling from a domain-specific language (DSL).

Home Page: https://GridTools.github.io/gt4py

License: GNU General Public License v3.0


gt4py's Introduction


GT4Py: GridTools for Python

GT4Py is a Python library for generating high performance implementations of stencil kernels from a high-level definition using regular Python functions. GT4Py is part of the GridTools framework, a set of libraries and utilities to develop performance portable applications in the area of weather and climate modeling.

NOTE: The gt4py.next subpackage contains a new version of GT4Py which is not compatible with the current stable version defined in gt4py.cartesian. The new version is highly experimental: it only works with unstructured meshes and requires Python >= 3.10.

πŸ“ƒ Description

GT4Py is a Python library for expressing computational motifs as found in weather and climate applications. These computations are expressed in a domain specific language (GTScript) which is translated to high-performance implementations for CPUs and GPUs.

The DSL expresses computations on a 3-dimensional Cartesian grid. The horizontal axes (I, J) are always computed in parallel, while the vertical (K) can be iterated in sequential, forward or backward, order. Cartesian offsets are expressed relative to a center index.

In addition, GT4Py provides functions to allocate arrays with memory layout suited for a particular backend.

The following backends are supported:

  • numpy: Pure-Python backend
  • gt:cpu_ifirst: GridTools C++ CPU backend using I-first data ordering
  • gt:cpu_kfirst: GridTools C++ CPU backend using K-first data ordering
  • gt:gpu: GridTools backend for CUDA
  • cuda: CUDA backend minimally using utilities from GridTools
  • dace:cpu: Dace code-generated CPU backend
  • dace:gpu: Dace code-generated GPU backend
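
To make this concrete, here is a minimal, illustrative sketch of a GTScript stencil together with backend-aware storage allocation. It is not taken from the official documentation: the exact import paths and storage signature differ between GT4Py versions (this follows the older gt4py.gtscript / default_origin style also used in the issue reports further down), and the field and function names are made up.

import numpy as np

import gt4py.storage
from gt4py import gtscript
from gt4py.gtscript import PARALLEL, computation, interval

backend = "numpy"  # any backend name from the list above

@gtscript.stencil(backend=backend)
def smooth(in_field: gtscript.Field[np.float64], out_field: gtscript.Field[np.float64]):
    # I and J are computed in parallel; PARALLEL puts no ordering constraint on K.
    with computation(PARALLEL), interval(...):
        # Cartesian offsets are relative to the center index.
        out_field = 0.25 * (
            in_field[-1, 0, 0] + in_field[1, 0, 0] + in_field[0, -1, 0] + in_field[0, 1, 0]
        )

# Allocate storages with a memory layout suited to the chosen backend.
in_field = gt4py.storage.ones(backend, default_origin=(1, 1, 0), shape=(10, 10, 5), dtype=np.float64)
out_field = gt4py.storage.zeros(backend, default_origin=(1, 1, 0), shape=(10, 10, 5), dtype=np.float64)

smooth(in_field, out_field, origin=(1, 1, 0), domain=(8, 8, 5))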

🚜 Installation

GT4Py can be installed as a regular Python package using pip (or any other PEP 517 frontend). As usual, we strongly recommend creating a new virtual environment to work on this project.

The performance backends also require the Boost library (https://www.boost.org/), a dependency of GridTools C++, which needs to be installed by the user.

βš™ Configuration

If GridTools or Boost are not found in the compiler's standard include path, or if a custom version is desired, a couple of configuration environment variables allow the compiler to find them:

  • GT_INCLUDE_PATH: Path to the GridTools installation.
  • BOOST_ROOT: Path to a Boost installation.

Other commonly used environment variables are:

  • CUDA_ARCH: Set the compute capability of the NVIDIA GPU if it is not detected automatically by cupy.
  • CXX: Set the C++ compiler.
  • GT_CACHE_DIR_NAME: Name of the compiler's cache directory (defaults to .gt_cache)
  • GT_CACHE_ROOT: Path to the compiler cache (defaults to ./)

More options and details are available in config.py.
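
These are plain environment variables, so the usual approach is to export them in the shell before starting Python; they can typically also be set from Python, as long as this happens before GT4Py is imported and the first stencil is compiled. A small sketch with placeholder paths:

import os

# Hypothetical locations of custom GridTools / Boost installations.
os.environ["GT_INCLUDE_PATH"] = "/opt/gridtools/include"
os.environ["BOOST_ROOT"] = "/opt/boost"

# Optional toolchain and cache settings.
os.environ["CXX"] = "g++-12"
os.environ["GT_CACHE_ROOT"] = "/tmp/gt4py-cache"
os.environ["GT_CACHE_DIR_NAME"] = ".gt_cache"

import gt4py  # imported after setting the variables so config.py picks them up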

πŸ“– Documentation

GT4Py uses Sphinx documentation. To build the documentation, install the dependencies in requirements-dev.txt

pip install -r ./gt4py/requirements-dev.txt

and then build the docs with

cd gt4py/docs/user/cartesian
make html  # run 'make help' for a list of targets

πŸ›  Development Instructions

Follow the installation instructions below to initialize a development virtual environment containing an editable installation of the GT4Py package. Make sure you read the CONTRIBUTING.md and CODING_GUIDELINES.md documents before you start working on the project.

Recommended Installation using tox

If tox is already installed in your system (tox is available in PyPI and many other package managers), the easiest way to create a virtual environment ready for development is:

# Clone the repository
git clone https://github.com/gridtools/gt4py.git
cd gt4py

# Create the development environment in any location (usually `.venv`)
# selecting one of the following templates:
#     dev-py310       -> base environment
#     dev-py310-atlas -> base environment + atlas4py bindings
tox devenv -e dev-py310 .venv

# Finally, activate the environment
source .venv/bin/activate

Manual Installation

Alternatively, a development environment can be created from scratch by installing the frozen dependency packages:

# Clone the repository
git clone https://github.com/gridtools/gt4py.git
cd gt4py

# Create a (Python 3.10) virtual environment (usually at `.venv`)
python3.10 -m venv .venv

# Activate the virtual environment and update basic packages
source .venv/bin/activate
pip install --upgrade wheel setuptools pip

# Install the required development tools
pip install -r requirements-dev.txt
# Install GT4Py project in editable mode
pip install -e .

# Optionally, install atlas4py bindings directly from the repo
# pip install git+https://github.com/GridTools/atlas4py#egg=atlas4py

βš–οΈ License

GT4Py is licensed under the terms of the GPLv3.

gt4py's People

Contributors

abishekg7, benweber42, dropd, eddie-c-davis, edopao, egparedes, finkandreas, floriandeconinck, fthaler, halungge, havogt, jdahm, kotsaloscv, kszenes, mroethlin, nfarabullini, ninaburg, oelbert, ofuhrer, petiaccja, philip-paul-mueller, rheacangeo, samkellerhals, sf-n, stubbiali, tbennun, tehrengruber, twicki


gt4py's Issues

PR #14 errors for FV3 stencils using dawn:gtmc backend

The following error occurs when testing FV3 stencils with the dawn:gtmc backend, using the arguments pytest --exec_backend=dawn:gtmc --data_backend=gtmc:

file "test_python_modules.py", line 357, in test_serialized_savepoints()
ERROR running stencilKE_C_SW-In
	The layout of the field ke is not compatible with the backend.
Traceback (most recent call last):
  File "/fv3/test/test_python_modules.py", line 348, in test_serialized_savepoints
    process_savepoint(serializer, sp, args)
  File "/fv3/test/test_python_modules.py", line 299, in process_savepoint
    process_test_savepoint(serializer, sp, split_name, args)
  File "/fv3/test/test_python_modules.py", line 277, in process_test_savepoint
    process_input_savepoint(serializer, sp, testobj, test_name, args)
  File "/fv3/test/test_python_modules.py", line 239, in process_input_savepoint
    args["output_data"][test_name] = testobj.compute(input_data)
  File "/fv3/translate/translate_ke_c_sw.py", line 35, in compute
    ke_c, vort_c = KE_C_SW.compute(**inputs)
  File "/fv3/stencils/ke_c_sw.py", line 64, in compute
    copy_uc_values(ke_c, uc, ua, origin=origin, domain=copy_domain)
  File "/.gt_cache/py37_1013/dawngtmc/fv3/stencils/ke_c_sw/m_copy_uc_values__dawngtmc_92bfa304e2.py", line 80, in __call__
    field_args=field_args, parameter_args=parameter_args, domain=domain, origin=origin, exec_info=exec_info
  File "/usr/src/gt4py/src/gt4py/stencil_object.py", line 188, in _call_run
    f"The layout of the field {name} is not compatible with the backend."
ValueError: The layout of the field ke is not compatible with the backend.
```file "test_python_modules.py", line 357, in test_serialized_savepoints()
ERROR running stencilKE_C_SW-In
	The layout of the field ke is not compatible with the backend.
Traceback (most recent call last):
  File "/fv3/test/test_python_modules.py", line 348, in test_serialized_savepoints
    process_savepoint(serializer, sp, args)
  File "/fv3/test/test_python_modules.py", line 299, in process_savepoint
    process_test_savepoint(serializer, sp, split_name, args)
  File "/fv3/test/test_python_modules.py", line 277, in process_test_savepoint
    process_input_savepoint(serializer, sp, testobj, test_name, args)
  File "/fv3/test/test_python_modules.py", line 239, in process_input_savepoint
    args["output_data"][test_name] = testobj.compute(input_data)
  File "/fv3/translate/translate_ke_c_sw.py", line 35, in compute
    ke_c, vort_c = KE_C_SW.compute(**inputs)
  File "/fv3/stencils/ke_c_sw.py", line 64, in compute
    copy_uc_values(ke_c, uc, ua, origin=origin, domain=copy_domain)
  File "/.gt_cache/py37_1013/dawngtmc/fv3/stencils/ke_c_sw/m_copy_uc_values__dawngtmc_92bfa304e2.py", line 80, in __call__
    field_args=field_args, parameter_args=parameter_args, domain=domain, origin=origin, exec_info=exec_info
  File "/usr/src/gt4py/src/gt4py/stencil_object.py", line 188, in _call_run
    f"The layout of the field {name} is not compatible with the backend."
ValueError: The layout of the field ke is not compatible with the backend.

Please find the code from the ke_c_sw.py file below:

def compute(uc, vc, u, v, ua, va, dt2):
    grid = spec.grid
    # co = grid.compute_origin()
    origin = (grid.is_ - 1, grid.js - 1, 0)

    # Create storage objects to hold the new vorticity and kinetic energy values
    ke_c = utils.make_storage_from_shape(uc.shape, origin=origin)
    vort_c = utils.make_storage_from_shape(vc.shape, origin=origin)

    # Set vorticity and kinetic energy values (ignoring edge values)
    copy_domain = (grid.nic + 2, grid.njc + 2, grid.npz)
    copy_uc_values(ke_c, uc, ua, origin=origin, domain=copy_domain)
    copy_vc_values(vort_c, vc, va, origin=origin, domain=copy_domain)
    ...

# Kinetic energy field computations
@gtscript.stencil(backend=utils.exec_backend, rebuild=True)
def copy_uc_values(ke: sd, uc: sd, ua: sd):
    with computation(PARALLEL), interval(...):
        ke[0, 0, 0] = uc if ua > 0.0 else uc[1, 0, 0]

The make_storage methods are included below for reference:

def make_storage_data(array, full_shape, istart=0, jstart=0, kstart=0, origin=origin, backend=data_backend):
    full_np_arr = np.zeros(full_shape)
    if len(array.shape) == 2:
        return make_storage_data_from_2d(array, full_shape, istart=istart, jstart=jstart, origin=origin, backend=backend)
    elif len(array.shape) == 1:
        return make_storage_data_from_1d(array, full_shape, kstart=kstart, origin=origin, backend=backend)
    else:
        isize, jsize, ksize = array.shape
        full_np_arr[istart:istart+isize, jstart:jstart+jsize, kstart:kstart+ksize] = array
        return gt.storage.from_array(data=full_np_arr, backend=backend, default_origin=origin, shape=full_shape)

def make_storage_data_from_2d(array2d, full_shape, istart=0, jstart=0, origin=origin, backend=data_backend):
    shape2d = full_shape[0:2]
    isize, jsize = array2d.shape
    full_np_arr_2d = np.zeros(shape2d)
    full_np_arr_2d[istart:istart+isize, jstart:jstart+jsize] = array2d
    #full_np_arr_3d = np.lib.stride_tricks.as_strided(full_np_arr_2d, shape=full_shape, strides=(*full_np_arr_2d.strides, 0))
    full_np_arr_3d = np.repeat(full_np_arr_2d[:, :, np.newaxis], full_shape[2], axis=2)
    return gt.storage.from_array(data=full_np_arr_3d, backend=backend, default_origin=origin, shape=full_shape)

# TODO: surely there's a shorter, more generic way to do this.
def make_storage_data_from_1d(array1d, full_shape, kstart=0, origin=origin, backend=data_backend, axis=2):
    # r = np.zeros(full_shape)
    tilespec = list(full_shape)
    full_1d = np.zeros(full_shape[axis])
    full_1d[kstart:kstart+len(array1d)] = array1d
    tilespec[axis] = 1
    if axis == 2:
        r = np.tile(full_1d, tuple(tilespec))
        # r[:, :, kstart:kstart+len(array1d)] = np.tile(array1d, tuple(tilespec))
    elif axis == 1:
        x = np.repeat(full_1d[np.newaxis, :], full_shape[0], axis=0)
        r = np.repeat(x[:, :, np.newaxis], full_shape[2], axis=2)
    else:
        y = np.repeat(full_1d[:, np.newaxis], full_shape[1], axis=1)
        r = np.repeat(y[:, :, np.newaxis], full_shape[2], axis=2)
    return gt.storage.from_array(data=r, backend=backend, default_origin=origin, shape=full_shape)

def make_storage_from_shape(shape, origin, backend=data_backend):
    return gt.storage.from_array(data=np.zeros(shape), backend=backend, default_origin=origin, shape=shape)

Proposal for gtscript region decorator

We would like your opinion on a decorator extension to GT4Py that we have been thinking about. This would mark regions that could be fused into one gtscript.stencil, so that corners and edges could be treated inside a single dawn stencil.

Example usage:

@gtscript.region
def execute_code():
  stencil1(..., domain=..., origin=...)
  if cond1:
    stencil2(..., domain=..., origin=...)
  elif cond2:
    stencil3(..., domain=..., origin=...)
  for i in range(3):
    stencil4(..., domain=..., origin=...)

What do you think of such a feature?

calling gtscript functions from conditionals

Perhaps this should not be supported, but when a gtscript function is called inside of a conditional, an error is often raised: "Temporary field {name} implicitly defined within run-time if-else region." This happens even if there are no temporaries in the function, i.e. if it is just using and assigning fields passed in from the stencil's input arguments. I assume calling a function causes a temporary variable to be generated, and since that is not declared before the conditional, it is not allowed.
Here's an example:

@gtscript.function
def qcon_func(qcon, q0_liquid, q0_rain):
    qcon = q0_liquid  + q0_rain
    return qcon

then the stencil

with computation(BACKWARD), interval(0, -1):
    if ri < ri_ref:
        qcon = qcon_func(qcon, q0_liquid, q0_rain)

This works if instead of assigning qcon in the function

@gtscript.function
def qcon_func(qcon, q0_liquid, q0_rain):
    return q0_liquid  + q0_rain

So it is totally reasonable in this case to say: do not assign qcon inside the function. But more complex functions cannot be expressed as one-line return statements, and may need to assign to an input field in a way that would not cause problems inside the stencil itself. And the field being assigned is not a temporary; it is just not known in the context of the function?

This is a lower order issue, but has tripped me up a couple of times so thought I'd bring it up! It's nice to be able to reuse functions, but of course less essential than having all the other functionality.

Move concept explanations from "Quickstart" to a new "Concepts" section

Let's keep the Quickstart tutorial lean and fast-paced by explaining only exactly what is needed to understand the code and linking to more in-depth explanations of concepts in the "Concepts" section (needs to be created).

The "Concepts" section will also serve as a normalization point for a shared language surrounding GT4Py.

Improve parametrization of tests

As suggested by @gronerl, and according to pytest-dev/pytest#815, we could update this:

@pytest.mark.parametrize(
    ["name", "backend"], itertools.product(stencil_definitions.names, CPU_BACKENDS)
)

to this cleaner option:

@pytest.mark.parametrize("name", stencil_definitions.names)
@pytest.mark.parametrize("backend", CPU_BACKENDS)
def test_generation_cpu(name, backend):

Document possible incompatibilities with user installed GridTools

Problem

If a user has previously installed GridTools (C++ libs) in a standard prefix (like /usr/local), then it is possible that during setuptools.build_ext the compiler will find those headers, namely if:

  • the python version being used was compiled using includes from the same prefix
  • boost is installed in the same prefix

This is a problem if the user-installed GridTools is a different version from the one GT4Py uses and there are breaking changes, like a different header structure.

Solution

  • document the GridTools version requirement for GT4Py and advise users to (re-)install other versions of GridTools C++ to a separate prefix

Luxury solution

  • get the actual include paths used for compilation and scan them for existing GridTools sources at gt4py.gt_src_manager install time, then decide whether to give an error message or reuse the existing sources if they are compatible

Enhance templates for code generation backends

Improve the code generation templates for the internal backends.

  • Reduce the amount of conditionals and CPU/GPU-specific logic by using Jinja template inheritance from a common base template.
  • Evaluate switching from Jinja to Mako, since it is probably better suited for source code generation and is likely easier for developers to learn.

Runtime conditionals

For the 'debug' and 'numpy' backends, runtime conditionals are not implemented (and the error nicely tells you so). A runtime conditional on data can be achieved with vectorized expressions, e.g.

a = q[0, 2, 0] * (c[0, 0, 0] > 0.0) + q[0, 3, 0] * (c[0, 0, 0] <= 0.0)

to achieve the conditional:

if c[0, 0, 0] > 0.0:
    a = q[0, 2, 0]
else:
    a = q[0, 3, 0]

It is likely not performant, but it works. If I switch the backend to 'gtmc' with this expression I no longer get the same answer -- significant differences, not just roundoff error (changes on the order of 10%). The non-vectorized conditional form of that stencil does not crash with the gtmc backend, but it produces the same incorrect answers. It is possible there is an issue with my setup, but I would not expect a dramatic answer change with a change in backend. I know that runtime conditionals are future work and I am not insisting this be resolved asap, but I wanted to share it as an issue.

Adapt backend API for CLI

This is the first step towards the implementation of the CLI (GDP-1). This issue serves to track the implementation of the described changes. Each change will be treated in a separate issue.

Current situation

The GDP-1 proof-of-concept had to use internals of concrete backends which are not part of the public API to achieve a number of things. This would obviously not be maintainable.

Specifically the problems arise from the intention of making the CLI output easy to integrate into an external build system, while the current backend API is geared towards JIT generation/compilation for use from python only. Code generation is not cleanly separated from compilation and stencil-ids are baked into file / code object names and paths.

Changes to the API

Functionality:

  • Generate the primary language code without (or with optional) unique stencil-id, returning a hierarchy of source files ready to be independently compiled in a client-defined location (or in-memory in a way that allows programmatically writing out).
  • Generate language bindings (for Python or another secondary language), again without (or optionally) baking in the stencil-id, and return them in a format ready to be written / copied programmatically to a client-defined location (without compiling anything).
  • Generate a secondary language module / source file (for bindings) without actually compiling the bindings, in a way that if the language bindings are compiled correctly the module / source file can be used from the target language.
  • After generating bindings and secondary language source, compile bindings to make the target language source immediately usable. (not a backend refactoring anymore)

Data:

  • set of secondary languages for which bindings can be generated

Design:

  • separate caching from generating in backend API

Explanation:

Generate primary language code

For the GridTools backends this would be C++ or CUDA code. Both the JIT process and the CLI require the source; however, for the CLI to embed transparently into external build systems the unique stencil-id must be optional. Also, the user of the CLI should have control over where the source files end up being written, and the source file hierarchy (if there is more than one file) must make sense for an external build tool. Obviously, compiling for JIT usage must happen separately from this step.

Generate language bindings

For GridTools, this is the pybind11 .cpp file for python as a secondary language. If no stencil-id is given, it should be generated under the assumption that the source files it refers to are generated without stencil-id and the output must make it clear where this file expects to be located relative to the primary language source files.

If a backend does not support secondary languages (if python is the primary language), or if the secondary language can call the primary at runtime this functionality may be absent.

Generate secondary language source

For GridTools backends this is the extension module that imports the compiled bindings from an .so object file and provides the python wrapper on top of that. In general it is the entry point for the secondary language to call the optimized stencils in an idiomatic way.

This may be absent if no secondary languages are supported (if python is the primary) or if the primary and secondary languages are so compatible that they do not require a wrapper.

Compile bindings

This is for when the CLI is also to be used as the build system, when the client code calling the stencils is written in a secondary language supported by the backend. This should be the same process as JIT compilation, except with the stencil-id absent from (or optional in) the file names, and with the source files in the CLI-user specified location. The idea is that afterwards the secondary language wrapper can be used immediately.

Add example of working OpenMP settings on MacOS

For MacOS users, the installation and configuration of OpenMP can be tricky, so we could add to the documentation an example of a possible way to install this on MacOS:

# Boost and libomp should have been installed via Homebrew
# (Apple Command Line Tools are also required):
brew install boost libomp

export BOOST_ROOT=/usr/local/opt/boost
export OPENMP_CPPFLAGS="-Xpreprocessor -fopenmp"
export OPENMP_LDFLAGS="$(brew --prefix libomp)/lib/libomp.a"

Duck storages reference implementation

Work can be tracked here: #29

Have a first implementation by the end of the sprint.

We will try to make the initial implementation an "optional" module to be used instead of the current storage. Later, the current storage will be replaced by the new one.

Support for 1d and 2d fields

  • Define GTScript syntax for argument fields
  • Define syntax for temporary fields
  • Define broadcasting behavior in GTScript
  • Update analysis pipeline and code generation backends

Depends on #28

Cannot install GT sources

When I try installing the GT library sources using the command: pip3 install ./gt4py/setup.py install_gt_sources, I get the following error:

WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see pypa/pip#5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Defaulting to user installation because normal site-packages is not writeable
ERROR: Invalid requirement: './gt4py/setup.py'
Hint: It looks like a path. It does exist.

This worked previously so I don't know why it has stopped working. I am trying to install on my local laptop running Ubuntu 18.04 and I am using Python 3.6.9.

Any assistance would be appreciated.

Thanks,
Mark

Asynchronous (non-blocking) compilation of multiple stencils

A possible implementation strategy could be to return a proxy StencilObject with a concurrent.futures.Future inside. The first time any attribute is accessed (__getattr__), wait for the completion of the future and load the actual stencil object (similar to the JAX arrays implementation).

Notes: check whether the workarounds that modify setuptools objects to compile CUDA code still work when different stencils are compiled simultaneously.
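
A rough sketch of that proxy idea, assuming the blocking compilation can simply be submitted to a worker thread; the names LazyStencil and compile_async are hypothetical, and (per the note above) real concurrent compilation may need more care around the setuptools/CUDA workarounds:

import concurrent.futures

from gt4py import gtscript

_executor = concurrent.futures.ThreadPoolExecutor()


class LazyStencil:
    """Proxy wrapping a Future that resolves to the real StencilObject."""

    def __init__(self, future):
        self._future = future

    def _stencil(self):
        # Block only when the stencil is actually needed.
        return self._future.result()

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # i.e. anything the compiled StencilObject provides.
        return getattr(self._stencil(), name)

    def __call__(self, *args, **kwargs):
        return self._stencil()(*args, **kwargs)


def compile_async(definition, **stencil_options):
    # gtscript.stencil(**options) returns a decorator; apply it in a worker thread.
    future = _executor.submit(gtscript.stencil(**stencil_options), definition)
    return LazyStencil(future)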

Support for member methods as stencil definition functions

In order to support object-oriented stencil definitions, we could try an approach similar to dataclasses.
Instance methods could be allowed as stencil definitions by relying on the convention of naming the first parameter self. The gt.stencil decorator would check for self as the first parameter and add a tag attribute to the function to be compiled. The class also needs to be decorated with a class_with_stencils decorator that looks for the tag attribute in all member methods and then adds a new __init__ which calls the original one after compiling the tagged methods and assigning them to the instance as bound methods.

Example:

@class_with_stencils
class Component:
    ...

    @stencil
    def my_stencil_definition(self, field_a: Field[], ....):
         ....
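
A rough sketch of the decorator mechanics described above (tagging, replacing __init__, binding). All names are hypothetical, and the actual compilation of a self-taking definition via gt.stencil is reduced to a placeholder, since that is the part still to be designed:

import functools


def stencil(definition):
    # Tag instance methods (first parameter named `self`) for later compilation.
    definition._gt_stencil_definition = True
    return definition


def _compile_stencil_method(definition, instance):
    # Placeholder for the real step: compile `definition` with gt.stencil while
    # treating the leading `self` specially. Here it only binds the method so
    # that the sketch is runnable.
    return definition.__get__(instance, type(instance))


def class_with_stencils(cls):
    tagged = [
        name for name, attr in vars(cls).items()
        if getattr(attr, "_gt_stencil_definition", False)
    ]
    original_init = cls.__init__

    @functools.wraps(original_init)
    def __init__(self, *args, **kwargs):
        # Run the original __init__, then attach compiled stencils as bound methods.
        original_init(self, *args, **kwargs)
        for name in tagged:
            setattr(self, name, _compile_stencil_method(getattr(cls, name), self))

    cls.__init__ = __init__
    return cls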

Refactor backend subsystem

  • Convert backend class methods to regular methods
  • Pass the name of the backend to the constructor
  • Extract cache management functionality into a separate class (see the sketch below)
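
A minimal sketch of the shape these three bullet points describe; the class and method names are hypothetical:

class CacheManager:
    """Cache management extracted into its own class."""

    def __init__(self, root, dir_name=".gt_cache"):
        self.root = root
        self.dir_name = dir_name

    def cache_info_path(self, stencil_id):
        ...


class Backend:
    """Backends as instances: regular methods instead of classmethods."""

    def __init__(self, name: str, cache: CacheManager):
        self.name = name    # backend name passed to the constructor
        self.cache = cache  # caching delegated to CacheManager

    def generate(self, definition, options):
        raise NotImplementedError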

Bugs in Stefano stencils (microphysics, ...) with MC backend

All the paths I will mention in the following refer to the prognostic_saturation branch of the tasmania repo

Vertical stencils with the gtmc backend: validation fails unless I perform any stage (it really doesn't matter which) sequentially, even though that should not be needed. To reproduce this issue: in tests/, run pytest isentropic/test_isentropic_vertical_advection_debug.py. The error does not occur with the numpy and gtx86 backends. I haven't tried the gtcuda backend yet, but I would wait for the issue with the storages to be sorted out.

Using a runtime float as both a condition and inside the conditional

A stencil where a runtime float variable is used both in the condition and inside the conditional body triggers an error.
For example:

@gtscript.stencil(backend="numpy")
def cap_var(q: gtscript.Field[_dtype], b: gtscript.Field[_dtype], q_max: float):
    with computation(PARALLEL), interval(...):
        if q > q_max:
            b = q_max

This triggers an AssertionError in the _merge_extents method:

def _merge_extents(self, refs: list):
    result = {}
    params = set()

    # Merge offsets for same symbol
    for name, extent in refs:
        if extent is None:
            assert name in params or name not in result  # <- AssertionError

usr/src/gt4py/src/gt4py/analysis/passes.py:359: AssertionError

Generate computation source in CLI friendly way

backend.generate_computation(stencil_id=None) or similar should return the stencil in computation language source together with the intended file hierarchy (relative locations of the source files, file names if required).

The return format must be standardized enough for the CLI to write the source to files (or copy the temporary files) to a file system location specified by the user, so that they can be compiled or interpreted without additional changes.

The stencil_id=None is intended to allow reuse for the JIT machinery. It might be replaced or dropped by another mechanism to the same effect.

This is part of #57, which describes the context of this change.

temporaries inside of conditionals

Related to #52, we are thinking it might be reasonable to turn off the check for temporaries inside of conditionals, e.g. make the user responsible if they end up with a read before write.
if a > 0.:
    b = 1.
else:
    b = 2.

This triggers the temporaries error even though it is defined in all pathways. I am not suggesting you try to determine whether a new temporary is defined in all pathways, but rather could let the user figure it out. Perhaps the errors would be hard to understand when they did mess it up.

In many programming languages you can have a variable that is only defined under a certain condition, and is simply undefined outside that condition:

if cond:
    f = 1.

...

if cond:
    f = f + 1
There may be barriers to this I am not thinking of.
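
For reference, a sketch of the pattern that currently satisfies the check, assuming the temporary is given an unconditional value before the run-time conditional (names and backend choice are illustrative, and back-end support for run-time conditionals varies):

from gt4py import gtscript
from gt4py.gtscript import PARALLEL, computation, interval


@gtscript.stencil(backend="numpy")
def capped(a: gtscript.Field[float], out: gtscript.Field[float]):
    with computation(PARALLEL), interval(...):
        b = 2.0      # temporary defined unconditionally first ...
        if a > 0.0:
            b = 1.0  # ... so assigning it inside the conditional is allowed
        out = b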

multiple 'ands' in a conditional without parenthesis

I had a conditional with 3 terms joined by 'ands' --

elif a > 0.0 and q > 0.0 and b < q

But this gave the wrong answer (did not go into the conditional when it should have) until I added parentheses:

elif a > 0.0 and (q > 0.0 and b < q)

Perhaps there is something incorrect about chaining ands like that without marking the order?

k interval offset larger than 2?

I have a stencil where the code specification indicates the need for a with interval(3, 4), but this causes an error. E.g.:

@gtscript.stencil(backend="numpy")
def example(ub: gtscript.Field[_dtype]):
    with computation(PARALLEL), interval(3, 4):
        ub = 5.

This raises an error indicating that this interval is invalid, which looks like it stems from the default offset_limit value in the make_axis_interval function:

def make_axis_interval(bounds: tuple, *, offset_limit: int = 2):
    assert isinstance(bounds[0], (VarRef, UnaryOpExpr, BinOpExpr)) or (
        isinstance(bounds[0], int) and abs(bounds[0]) <= offset_limit
    )

AssertionError

Is there a way to override the offset_limit from the frontend?

Add support for augmented assignments

Propose adding support for augmented assignments (e.g., a += b * c). An example of this use case is the Coriolis stencil:

@gtscript.stencil(backend=backend)
def coriolis_stencil(
    u_nnow: gtscript.Field[dtype],
    v_nnow: gtscript.Field[dtype],
    fc: gtscript.Field[dtype],
    u_tens: gtscript.Field[dtype],
    v_tens: gtscript.Field[dtype],
):

    with computation(FORWARD), interval(...):
        z_fv_north = fc * (v_nnow + v_nnow[1, 0, 0])
        z_fv_south = fc[0, -1, 0] * (v_nnow[0, -1, 0] + v_nnow[1, -1, 0])
        u_tens += (0.25 * (z_fv_north + z_fv_south))
        z_fu_east = fc * (u_nnow + u_nnow[0, 1, 0])
        z_fu_west = fc[-1, 0, 0] * (u_nnow[-1, 0, 0] + u_nnow[-1, 1, 0])
        v_tens -= (0.25 * (z_fu_east + z_fu_west))

The current version returns None for these statements. One simple solution is to implement the AugAssign visitor method in the gtscript_frontend.IRMaker class, convert it to a standard Assign node, and visit that.

    def visit_AugAssign(self, node: ast.AugAssign) -> list:
        bin_op = ast.BinOp(left=node.target, op=node.op, right=node.value)
        assign = ast.Assign(targets=[node.target], value=bin_op)
        return self.visit_Assign(assign)

This approach has been implemented in the augmented_assign branch.

docs: Intro Page

  • Non-technical motivation (usecase)
  • Concise
  • motivating code sample / figure(s)
  • Fits on one screen

At a minimum, include the opening paragraph and feature highlights. Sample code / figures can be placeholders.

stencil failure with gtmc that works for numpy

Hello! I have specified a stencil that works with the numpy backend, but not gtmc. It appears to break because it's a long stencil, and can be fixed by using temporary variables.
Here is the problem stencil:

@gtscript.stencil(backend=backend, rebuild=True)
def mystencil(uc: sd, vc: sd, ut: sd, vt: sd, cosa_u: sd, cosa_v: sd):
    with computation(PARALLEL), interval(0, None):
        damp_u = 1. / (1.0 - 0.0625 * cosa_u[0, 0, 0] * cosa_v[-1, 0, 0])
        ut[0, 0, 0] = (uc[0, 0, 0]-0.25 * cosa_u[0, 0, 0] * (vt[-1, 1, 0] + vt[0, 1, 0] + vt[0, 0, 0] + vc[-1, 0, 0] - 0.25 * cosa_v[-1, 0, 0] * (ut[-1, 0, 0] + ut[-1, -1, 0] + ut[0, -1, 0]))) * damp_u

This gives:
GRIDTOOLS ERROR=> Horizontal extents of the outputs of ESFs are not all empty. All outputs must have empty (horizontal) extents

48 | GT_STATIC_ASSERT(extent_t::iminus::value == 0 && extent_t::iplus::value == 0
I don't quite know what this means.

It works if I break this long stencil into several functions.

Slices of gt4py storages cannot be copied

I have a piece of code which extracts variables (kind of like tracer variables) out of a larger array. After this, I needed to copy the array, and ran into this issue (shown as an MCVE):

>>> arr = gt4py.storage.empty("numpy", default_origin=[0, 0, 0], shape=[10, 10, 10], dtype=float)
>>> s = arr[:, :, 0]
>>> copy.deepcopy(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/usr/src/gt4py/src/gt4py/storage/storage.py", line 350, in __deepcopy__
    res = super().__deepcopy__(memo=memo)
  File "/usr/src/gt4py/src/gt4py/storage/storage.py", line 186, in __deepcopy__
    managed_memory=not isinstance(self, ExplicitlySyncedGPUStorage),
  File "/usr/src/gt4py/src/gt4py/storage/storage.py", line 38, in empty
    shape=shape, dtype=dtype, backend=backend, default_origin=default_origin, mask=mask
  File "/usr/src/gt4py/src/gt4py/storage/storage.py", line 141, in __new__
    shape = storage_utils.normalize_shape(shape, mask)
  File "/usr/src/gt4py/src/gt4py/storage/utils.py", line 49, in normalize_shape
    "len(shape) must be equal to len(mask) or the number of 'True' entries in mask."
ValueError: len(shape) must be equal to len(mask) or the number of 'True' entries in mask.

I may find a way to work around the issue, but copy.deepcopy should probably work when used on slices?
