pipeline-schemas's Introduction

pipeline-schemas

Location for the common pipeline JSON schemata and examples

Contents

This repository contains the JSON schema files pertaining to pipeline job flow authoring and execution:

The 'common-pipeline' directory

Common schemas related to pipeline definition and execution:

  • Operators: Contains the base operator, uihints, and condition schemas and examples
  • Pipeline-flow: Common pipeline-flow and pipeline-flow-ui JSON schemas
  • Dataset-metadata: JSON dataset metadata definition

The 'common-canvas' directory

Schema and example files for driving the Common Canvas and Property Editor tooling

  • Parameter-defs: Common Properties parameter editing schema and examples
  • Form: Common Properties low-level form JSON specification
  • Palette: Canvas palette JSON definition
  • Diagram: Older (e.g. pre-pipeline-flow) internal canvas diagram specification

pipeline-schemas's People

Contributors

aqtang, caritaou, cindyzh-ibm, curtis-browning, fresende, ibmkendrick, jesusguerrero, lresende, lucaslalima, matthoward366, nabilibm, nmgokhale, sargent1266, smanick0, srikant-ch5, terryobrien58, tomlyn

pipeline-schemas's Issues

Add default_value condition definition in schema

Here is a potential example:

{
  "default_value": {
    "parameter_ref": "updated_param",
    "value": "New default value",
    "evaluate": {
      "condition": {
        "parameter_ref": "some_param",
        "op": "isNotEmpty"
      }
    }
  }
}
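
A minimal sketch of how a consumer might evaluate such a definition, assuming the default is applied only when the condition holds. The function names, the operator table, and the choice to overwrite any existing value are all hypothetical; only the JSON shape and the isNotEmpty operator come from the example above.

```python
from typing import Any


def is_not_empty(value: Any) -> bool:
    """Treat None, empty strings and empty collections as empty."""
    if value is None:
        return False
    if isinstance(value, (str, list, dict)):
        return len(value) > 0
    return True


# Hypothetical operator table; only "isNotEmpty" appears in the issue.
OPERATORS = {"isNotEmpty": is_not_empty}


def apply_default_value(definition: dict, parameters: dict) -> dict:
    """Return a copy of parameters with the default applied if the condition holds."""
    rule = definition["default_value"]
    condition = rule["evaluate"]["condition"]
    check = OPERATORS[condition["op"]]
    result = dict(parameters)
    if check(parameters.get(condition["parameter_ref"])):
        # Assumption: the new default replaces whatever value was there before.
        result[rule["parameter_ref"]] = rule["value"]
    return result


definition = {
    "default_value": {
        "parameter_ref": "updated_param",
        "value": "New default value",
        "evaluate": {
            "condition": {"parameter_ref": "some_param", "op": "isNotEmpty"}
        },
    }
}
```

Calling apply_default_value(definition, {"some_param": "x"}) sets updated_param, while an empty or missing some_param leaves the parameters unchanged.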

Add https:// to enum list for pipeline-flow-v3-schema.json

Hey team, we need to change the URL reference to our pipeline schema for orchestration flows to use https to fix a scan issue. The hardcoded http is getting picked up by a vulnerability scanner.
We tried changing http to https, but that appears to fail a validator in common-canvas.js. Can we update the validator to support https as an enum value?
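
As a rough illustration of the request, the sketch below models an enum check that accepts both schemes. It assumes the schema URL is validated by exact match against an enum list; the names ALLOWED_SCHEMA_URLS and is_allowed_schema_url are hypothetical and this is not the common-canvas validator code.

```python
# Host and path as they appear in the schema "id" values quoted in this tracker.
SCHEMA_PATH = (
    "api.dataplatform.ibm.com/schemas/common-pipeline/"
    "pipeline-flow/pipeline-flow-v3-schema.json"
)

# Hypothetical enum: both schemes are listed instead of http only.
ALLOWED_SCHEMA_URLS = [f"{scheme}://{SCHEMA_PATH}" for scheme in ("http", "https")]


def is_allowed_schema_url(url: str) -> bool:
    """Enum-style check: the value must equal one of the listed URLs exactly."""
    return url in ALLOWED_SCHEMA_URLS
```

Listing both values keeps existing documents that use http valid while allowing new ones to use https.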

Enhance schema files to prevent schema validation issues

@tomlyn @matthoward366 @vlad-bunescu

This is a request to alter some of the pipeline-schema files to prevent schema validation errors from occurring incorrectly:

1. pipeline-flow-v3-schema.json needs to rewrite app_data
The app_data for the overall pipeline, pipeline_def, execution_node_def, supernode_def, binding_entry_node_def, binding_exit_node_def, model_node_def, port_def, bound_port_def, link_def, and runtime_def uses:
image

And here is the definition of app_data:
image

Each of the definitions in pipeline-flow-ui-v3-schema.json also allows additional properties.

This app_data specification makes it impossible for a validation tool to determine which schema in pipeline-flow-ui-v3-schema.json corresponds to a given entity. As a result, nothing is validated against the schemas defined in pipeline-flow-ui-v3-schema.json.

2. pipeline-flow-ui-v3-schema.json
a. "required": [] in definitions.runtime_info_def is not accepted by the Java Json Tool.
The validator requires at least one element in the array.
Since the array is empty, it does not need to be specified.

b. definitions.node_info_def.properties.messages.descriptions should be
definitions.node_info_def.properties.messages.description

3. pipeline-connection-v3-schema.json
app_data points back to the pipeline-flow-v3-schema.json app_data. However, none of the app_data definitions in pipeline-flow-v3-schema.json is for connection app_data.

4. datarecord-metadata-v3-schema.json
"required": [] in definitions.runtime_info_def is not accepted.
The validator requires at least one element in the array.
Since the array is empty, it does not need to be specified.
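
The empty "required" arrays in items 2a and 4 could be removed mechanically. The sketch below is one possible approach, assuming the schema has already been loaded into Python dicts and lists; the function name strip_empty_required is hypothetical. Draft-04 requires a "required" array to contain at least one element, which is why strict validators reject the empty form.

```python
def strip_empty_required(schema):
    """Return a copy of the schema with every '"required": []' entry removed."""
    if isinstance(schema, dict):
        return {
            key: strip_empty_required(value)
            for key, value in schema.items()
            # Drop the keyword only when it is an empty list.
            if not (key == "required" and value == [])
        }
    if isinstance(schema, list):
        return [strip_empty_required(item) for item in schema]
    return schema


schema = {
    "definitions": {
        "runtime_info_def": {"type": "object", "properties": {}, "required": []},
        "other_def": {"type": "object", "required": ["id"]},
    }
}
cleaned = strip_empty_required(schema)
```

Non-empty "required" lists are left untouched, and the input schema is not modified.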

Binding nodes should have either required outputs or inputs in pipeline-flow-v3-schema.json

The pipeline flow schema says that each node must be "oneOf":
"$ref": "#/definitions/execution_node_def"
"$ref": "#/definitions/supernode_def"
"$ref": "#/definitions/binding_entry_node_def"
"$ref": "#/definitions/binding_exit_node_def"
"$ref": "#/definitions/model_node_def"
And the definition of "oneOf" is "the given data must be valid against exactly one of the given subschemas."

So if we have a node which only has an "id" property and a "type" property set to "binding", the node will comply with both of these definitions:
"$ref": "#/definitions/binding_entry_node_def"
"$ref": "#/definitions/binding_exit_node_def"
because they both have "id" and "type" as required properties, and for both the type property value is defined as "binding".

Such a node is therefore not valid against exactly one of the given definitions, and so we get schema validation errors.

To make things explicit, for "binding_entry_node_def" we should add "outputs" to the required properties list, and for "binding_exit_node_def" we should add "inputs" to the required properties list, since this is what is validated anyway.
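
The ambiguity can be reproduced with a deliberately simplified model of the two binding definitions. This sketch only checks required keys and the allowed "type" value; it ignores everything else in the real schema, and the "type_enum" key and helper names are hypothetical.

```python
def matches(node: dict, subschema: dict) -> bool:
    """Simplified check: all required keys present and 'type' is an allowed value."""
    has_required = all(key in node for key in subschema["required"])
    return has_required and node.get("type") in subschema["type_enum"]


def one_of_matches(node: dict, subschemas: dict) -> list:
    """Names of the subschemas the node satisfies; oneOf needs exactly one."""
    return [name for name, sub in subschemas.items() if matches(node, sub)]


# Required lists as described in the issue.
CURRENT = {
    "binding_entry_node_def": {"required": ["id", "type"], "type_enum": ["binding"]},
    "binding_exit_node_def": {"required": ["id", "type"], "type_enum": ["binding"]},
}

# Required lists after the proposed change.
PROPOSED = {
    "binding_entry_node_def": {
        "required": ["id", "type", "outputs"],
        "type_enum": ["binding"],
    },
    "binding_exit_node_def": {
        "required": ["id", "type", "inputs"],
        "type_enum": ["binding"],
    },
}

bare_node = {"id": "n1", "type": "binding"}
entry_node = {"id": "n1", "type": "binding", "outputs": []}
```

With CURRENT, bare_node satisfies both definitions, so "oneOf" fails because of the double match. With PROPOSED, bare_node satisfies neither (an unambiguous error), and entry_node satisfies only binding_entry_node_def.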

Link ID needed for link object

For API methods to retrieve and update individual links with new attributes etc., the links need to be consistently referred to by the same ID. Presently, because link objects do not have IDs in the schema, internal link IDs get generated each time the pipelineFlow is loaded. The link object should have an optional id to allow any generated link IDs to be written into output pipelineFlows and also to be able to handle any IDs that a host application might want to provide.
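
A minimal sketch of the behavior an optional id would enable, assuming links are available as a list of dicts. The function name ensure_link_ids and the sample field values are illustrative only; the point is that host-provided IDs are kept and generated IDs become stable once written back out.

```python
import uuid


def ensure_link_ids(links: list) -> list:
    """Return copies of the links, adding an 'id' only where one is missing."""
    result = []
    for link in links:
        copy = dict(link)
        if not copy.get("id"):
            # Generated once; later loads see the saved id and keep it.
            copy["id"] = str(uuid.uuid4())
        result.append(copy)
    return result


links = [
    {"id": "host-link-1", "node_id_ref": "nodeA", "port_id_ref": "out"},
    {"node_id_ref": "nodeB", "port_id_ref": "out"},
]
with_ids = ensure_link_ids(links)
```

Running ensure_link_ids again on with_ids returns the same IDs, which is the consistency the API methods need.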

Update number_generator description

The number generator link has been removed from the control label, and a button has been added next to numeric controls.

Update the number_generator description to reflect these changes.

Add a new config option "return_type_label" in function_info

Add a new config option return_type_label to the schema as an optional field of type resource_def. In the code we can first check for return_type_label and fall back to return_type if it is undefined.

This can be used to translate the return column in the functions table.
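
The fallback described above might look like the sketch below. It assumes a resource_def carries a "default" string and an optional "resource_key" that is looked up in a translations table; that shape, the resources argument, and the function name are assumptions for illustration, not the actual common-properties code.

```python
from typing import Optional


def resolve_return_type_label(function_info: dict, resources: Optional[dict] = None) -> str:
    """Prefer return_type_label (translated if possible), else fall back to return_type."""
    resources = resources or {}
    label = function_info.get("return_type_label")
    if label is not None:
        key = label.get("resource_key")
        if key in resources:
            return resources[key]  # translated text wins
        if "default" in label:
            return label["default"]
    # No usable label: show the raw return_type as before.
    return function_info["return_type"]


plain = {"return_type": "integer"}
labelled = {
    "return_type": "integer",
    "return_type_label": {"default": "Integer", "resource_key": "fn.return.integer"},
}
```

Existing function_info entries without return_type_label keep working unchanged, which is what makes the new field safe to add as optional.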

Schema ID uses api.dataplatform.ibm.com which does not resolve

e.g. http://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-v3-schema.json

However it does resolve if you use http://api.dataplatform.cloud.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-v3-schema.json

{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "WDP Pipeline Flow Schema",
"type": "object",
"id": "http://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-v3-schema.json",

The pipeline-flow-ui-v3-schema.json does not even seem to be deployed to http://api.dataplatform.cloud.ibm.com and is unreachable.

{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "WDP Pipeline Flow UI Schema. Defines UI-only constructs for pipeline flow documents.",
"type": "object",
"id": "http://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-ui-v3-schema.json",

Descriptions in pipeline flow should be deprecated

There is a mixture of locations for description properties for different pipeline flow entities. Some are in the pipeline-flow schema and others in the pipeline-flow-ui schema. The most appropriate place for description properties is the pipeline-flow-ui schema, because that is where other user-defined information such as labels is stored.

Description properties in the pipeline-flow schema should be deprecated and the schema user should be referred to description properties in the pipeline-flow-ui schema instead.

Remove limitations on zoom scale (k) value

Ref: elyra-ai/canvas#1047

The zoom scale limits of min 0.2 and max 1.8, while needed for usability in the UI, are not necessary in the schema, since any zoom scale value should be allowed.

This change will help the following situation: after a zoom-to-fit action on a pipeline with a large number of nodes a scale value lower than 0.2 can be calculated. When the pipeline is saved with that small scale value it needs to be able to be read into the UI when schema validation is switched on.

Incorrect reference to app_data_def in connections schema

In the pipeline-connection-v3-schema.json the reference from the data_asset field to app_data is

"$ref": "http://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow-v3-schema.json#/definitions/app_data_def"

but should be

"$ref": "http://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-v3-schema.json#/definitions/app_data_def"

Add op property to the model node definition

Model nodes need to be able to record the node type of the modeling node that was used to create the model. For consistency with other nodes this should be stored in an op property.

additionalProperties in pipeline-connection-v3-schema blocks connection.properties having key/value where value is primitive type

This part of the connections schema specifies that properties is an object, but also that all additional properties must themselves be objects:
image

This means that any regular field provided for properties like this will be rejected by schema validation:
image

However, if all of the fields inside properties are themselves objects, then it will pass validation:
image

Everywhere else in the schemas where additionalProperties is used, it is specified as a boolean. It should be changed to a boolean in this case, which would allow any fields to be added to the "properties" field.
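
The difference between the two forms of additionalProperties can be sketched as follows. This is a simplified model that assumes the object form is {"type": "object"}, as the description above suggests; the function name rejected_fields and the sample field names are hypothetical.

```python
def rejected_fields(properties: dict, additional_properties) -> list:
    """Names of fields in 'properties' that the additionalProperties setting rejects."""
    if additional_properties is True:
        return []  # boolean true: any extra field is allowed
    if additional_properties is False:
        return sorted(properties)  # boolean false: no extra fields at all
    if additional_properties.get("type") == "object":
        # Schema form: every extra field's value must itself be an object.
        return sorted(
            name for name, value in properties.items() if not isinstance(value, dict)
        )
    return []


flat = {"host": "example.com", "port": 50000}
nested = {"host": {"value": "example.com"}}
```

With {"type": "object"}, both fields of flat are rejected while nested passes; with the boolean true, flat passes as well.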

Fix schema for parameterDefs

While working on the schema validator for common-properties in elyra-ai/canvas#446, I found a few areas that need to be fixed. A couple of issues were reported from testing with the samples in the test harness.

Fix readmes for new repo

Some of the URLs and wording in the README.md files in this repo need fixing for the new opensource repo.
