oam-dev / spec

Open Application Model (OAM).
Home Page: https://oam.dev
License: Other
The Kubernetes implementation is getting confusing because we now have two meanings for the term `service`. This is an unfortunate artifact of Kubernetes misusing the term `service` to mean proxy. But... it's a built-in Kubernetes type.
Since we have `worker`, could we call it `server`?
If we don't want to change it again, I'm okay with that. I'm just trying to forestall something that is going to be a user complaint in the future.
The current examples do not include one for a volume resource. Add an example for Resources.Volumes to the component schematics.
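A minimal sketch of what such an example might look like (the `volumes`/`disk` field names are assumptions based on the discussion in this thread, not published spec):

```yaml
apiVersion: core.hydra.io/v1alpha1
kind: ComponentSchematic
metadata:
  name: storage-example
spec:
  workloadType: core.hydra.io/v1.ReplicatedService
  containers:
    - name: web
      image:
        name: example/web:1.0.0
      resources:
        volumes:
          - name: data               # hypothetical field names throughout
            mountPath: /var/lib/data
            disk:
              required: "10G"
              ephemeral: false
```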
The spec states a runtime should "apply the traits in the defined order". However, since multiple traits can choose to apply to the same service, there is no global ordering to define or enforce.
Now we have Configuration to combine multiple components into an Application instance, but if we want to define multiple applications, there may be dependencies among them. So we may need a place (or a concept) to define Application Metadata.
Application Metadata stores application-level information such as the application logo, providers, maintainers, and dependencies. Once application metadata is defined, we would be able to launch a set of applications together.
For example, we could define this ApplicationMeta:

```yaml
apiVersion: core.hydra.io/v1alpha1
kind: ApplicationMeta
metadata:
  labels:
    version: 1.0.0
spec:
  appName: edas-console
  maintainer: hydra
  logo: /path/to/logo.svg
  components:
    - name: comp1
      metadata:
        version: 1.0.0
      componentRef: compx
    - name: comp2
      metadata:
        version: 1.2.0
      componentRef: compy
      dependency:
        - zookeeper
        - kafka
      parameters:
        - name: param1
          value: value1
```
With ApplicationMeta we will be able to define application relationships.
The spec for containers states the following:
A path-like or URI-like representation of the location of an OCI image. Where applicable, this MAY be prefixed with a registry address, SHOULD be suffixed with a tag, and MUST be suffixed with a digest in OCI format. The digest may be used to compute the integrity of the image.
This means that everybody must specify a digest; however, I'd like to suggest changing this to MAY.
While it's a good idea to use digest to make sure you are using the same image, I think there are scenarios where this would be more of a burden than a benefit.
For example, if I deploy krancour/foobar:v1.0.0@sha256:72e996751fe42b2a0c1e6355730dc2751ccda50564fec929f76804a6365ef5ef and one week later a vulnerability is discovered, I will never get the security patch, whereas deploying based on just a tag would give me that.
If this is enforced, we effectively force all customers to use tools like Renovate to scan their images, automatically create PRs with the latest digest, and force a redeployment.
Having it as a possibility which we recommend, but not enforce, feels like the sweet spot.
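For illustration, the two styles side by side (the image names are placeholders):

```yaml
containers:
  - name: pinned
    # Immutable reference: never picks up patched rebuilds of v1.0.0.
    image: krancour/foobar:v1.0.0@sha256:72e996751fe42b2a0c1e6355730dc2751ccda50564fec929f76804a6365ef5ef
  - name: floating
    # Mutable tag: a re-pull picks up patched rebuilds of v1.0.0.
    image: krancour/foobar:v1.0.0
```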
The spec doesn't specify any rules for naming of components, opconfigs, containers, etc. It also does not specify whether name comparison/matching should be performed as case-sensitive or case-insensitive. Per @technosophos, it might be necessary to restrict naming of at least certain objects to DNS naming rules in order to avoid problems with service discovery.
What are the criteria for something to be deemed a core scope vs. an extended one? For example, why is the identity scope an extended scope type?
In Component Model - Parameters, I think it would be beneficial to add an additional `secret` type (compare the `securestring` type in ARM templates: https://docs.microsoft.com/en-us/azure/azure-resource-manager/template-best-practices#security-recommendations-for-parameters).
Secret parameters can't be read after deployment and MUST be redacted from all logs.
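A sketch of how such a parameter might be declared (the `secret` type is the proposal itself, not part of the current spec):

```yaml
parameters:
  - name: db-password
    description: Password for the backing database
    type: secret    # proposed type; values MUST be redacted from logs
    required: true
```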
A network scope allows you to run multiple components in the same network as explained here.
But what if we do not specify a network scope?
It is mentioned that every scope should have a default 'root'; however, the spec does not clarify what that means for networking.
In my opinion, we should be secure by default and enforce every implementing service to isolate all components unless specified otherwise.
Currently, `Parameter` only supports the types `string`, `number`, and `boolean`. However, the properties users pass in can be complex composite structs like Volume or PortMapping.
To work around that, we have to add another layer and/or UI for a better experience, and do validation/defaulting outside Hydra. When passing values through Hydra, we would have to use `string` and json.Marshal the entire struct.
To provide a more intuitive workflow, it would be better to support custom parameter types at the Hydra layer, and to do the validation/defaulting in a declarative way.
Currently, in the specification it is a little confusing how the parameters and variables relate to each other.
The component YAMLs have a notion of parameters that can be used throughout the component manifest. Presumably, these are authored by the developer. However, in the Operational Config, when a Component is instantiated, you have to provide values for the parameters the component requires, which can come from Variables.
This feels a little confusing because the Operator needs to know the specific names of the parameters the developer exposed or some client side tooling needs to help generate an ops config with variables for those parameters. The handoff and the separation of concerns seems to blur a little here. Any thoughts or clarifications on this would be awesome :)
With the introduction of extended workload types, we may need a way to pipe the output of components into parameters of other components. Consider the following deployment:
The components are deployed together as part of the same application deployment. Component A will need a connection string to connect to the MySQL database in Component B. However, the connection string is not available until Component B is created.
I can think of a few ways to handle this:
Each option augments the previous in this list, but ultimately some combination of 2 and 3 seems ideal.
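As one purely hypothetical shape for such piping (the `fromComponentOutput` field does not exist in the spec; every name below is made up):

```yaml
components:
  - componentName: component-b     # hypothetical names
    instanceName: mysql
  - componentName: component-a
    instanceName: web
    parameterValues:
      - name: connection-string
        fromComponentOutput:       # hypothetical mechanism
          instanceName: mysql
          outputName: connectionString
```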
In Application Scopes, it is mentioned that `spec.type` is required and that it defines the type of the application scope.
However, the spec neither lists the allowed values nor points to them.
Currently, writing a Configuration yaml requires filling in all parameters one by one. It would be convenient if we can load the parameter values from data sources automatically. For example,
a ConfigMap holds application data:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: server-env
data:
  SERVER_DOMAIN: internal.my-company.com
  SERVER_PORT: "80"
```

```yaml
apiVersion: core.hydra.io/v1alpha1
kind: Configuration
spec:
  parametersFrom:
    configMapRef: server-env
```

Secondly, we may want to load parameters from different sources, such as a Kubernetes ConfigMap or etcd.
According to the spec, a Container's Image should be a string:
https://github.com/microsoft/hydra-spec/blob/6eee9fc8ad1081f08e5bba66f64f380e1d00960b/schema/component.schema.json#L106-L117
(See also https://github.com/microsoft/hydra-spec/blob/master/3.component_model.md#container)
But the example in Component Model - Examples seems to have `image` as a complex type with properties `name` and `digest`.
In the IoT space, frequently components will need access to hardware connected to the host (think sensors, accelerators, actuators etc.) to interact with the environment.
Any suggestions on how this can be accomplished?
We have the following example in our Operational Configuration page:
```yaml
apiVersion: core.hydra.io/v1alpha1
kind: ComponentSchematic
metadata:
  name: frontend
  version: "1.0.0"
  description: A simple webserver
spec:
  workloadType: core.hydra.io/v1.ReplicatedService
  parameters:
    - name: message
      description: The message to display in the web app.
      type: string
      value: "Hello from my app, too"
  containers:
    - name: web
      env:
        - name: MESSAGE
          fromParam: message
      image:
        name: technosophos/charybdis-single:latest
```

```yaml
apiVersion: core.hydra.io/v1alpha1
kind: OperationalConfiguration
metadata:
  name: custom-single-app
  description: Customized version of single-app
spec:
  parameters:
    - name: message
      value: "Well hello there"
  components:
    - componentName: frontend
      instanceName: web-front-end
      parameterValues:
        - name: message
          fromParam: message
```
Even though the frontend component is a ReplicatedService, we don't have a parameter explicitly stating how many replicas should be instantiated. I propose we treat replica count as an app-ops concern and place it in the operational config file. See the corrected example below:
```yaml
apiVersion: core.hydra.io/v1alpha1
kind: OperationalConfiguration
metadata:
  name: custom-single-app
  description: Customized version of single-app
spec:
  parameters:
    - name: message
      value: "Well hello there"
  components:
    - componentName: frontend
      replicaCount: 3
      instanceName: web-front-end
      parameterValues:
        - name: message
          fromParam: message
```
Thoughts?
In the Hydra spec, we can see that OperationalConfiguration is the entry point of the application definition. I think it would be easier to understand if it were named Application instead of OperationalConfiguration.
I know OperationalConfiguration can contain some operational aspects outside the application scope. But if we think of the concept "Application" as covering the lifecycle of an application, then the name is still reasonable.
The term "runtime" is used in two contexts in the spec:
We can qualify these terms when they're used in the spec, or come up with a different term altogether for (1). I'm open to either.
Currently, there is `parameters` defined in Component, and `workloadSettings` holds the parameters that are passed into the workload. But as a spec writer, I end up duplicating a lot of parameters by passing them as:

```yaml
workloadSettings:
  - name: app-format
    fromParam: app-format
  - name: jdk-version
    fromParam: jdk-version
  ...
```

My question is: why does the workload need separate settings? Aren't `parameters` already the key-values that we want to pass into this workload?
So far we've kept non-normative information, such as example traits and scopes, in the core spec. We've been talking about moving these out to separate repos, and leave this repo to define only the core spec that all compliant runtimes must implement.
There are several extension points that we would need to separate out:
I was thinking we use separate repos for each one: `hydra-traits`, `hydra-scopes`, `hydra-extended-workload-types`.
These repos would contain examples of each, and maybe a way to start organizing namespaces for common implementations.
Thoughts?
As of today, `HTTPHeader` only has a `name` and a `value`.
However, if you want to add an API key, it will not be a fixed value but rather a parameter.
Are there plans to add a `fromParam` that takes priority over `value`, similar to how `env` works?
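A sketch of what that could look like (the `fromParam` support on headers is the request here, not current spec):

```yaml
httpHeaders:
  - name: X-Api-Key
    fromParam: api-key   # proposed; today only a literal value is allowed
  - name: Accept
    value: application/json
```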
Source: our internal team mainly discussed this issue in the context of PaaS integration with Hydra.
With the Hydra spec, the implementation layer would create tons of sub-resources, while for now we lack a mechanism to GC all of these resources; specifically, the owner of app Components etc. is missing.
Something like Azure Resource Manager or a common `ownerReferences` mechanism would make a lot of sense here. WDYT?
When starting a ReplicableService workload, we first need to start an X Component. We have to pass a lot of properties to the X Component, and then duplicate them in the X Component definition when passing parameters in workloadSettings.
Resources allow you to define the resources that are required for the app to run, but who defines what the maximum resources are that the app should get?
This feels like an app operator responsibility but not sure where they can configure this?
Note: I haven't been through the whole spec yet, so sorry if it's in there. I just expected it to be part of the "Component Model".
I made a guess on the container path. Is it right?
If so, I am also wondering how a user can specify:
Scope parameters have a "required" attribute that declares whether or not a value must be provided for a parameter.
How do we want to handle cases where a set of parameters A & B are optional, but if the application operator does decide they want to provide a value for A they MUST provide a value for B? For example, an application operator might want to send their logs to a pre-existing Log Analytics workspace. They have the option to have Hydra provision a workspace automatically or explicitly connect their Health Scope to a preexisting workspace; but, if they choose the latter then they need to provide both a WorkspaceId and WorkspaceKey.
```yaml
schemaVersion: core.hydra.io/v1alpha1
kind: ApplicationScope
metadata:
  name: myHealthScope
  description: main health scope for my application
spec:
  type: core.hydra.io/v1.HealthScope
  parameters:
    - name: HealthThresholdPercentage
      description: the percentage of components within a scope that must be healthy to complete an update
      type: int
      required: true
      default: 80
    - name: LogAnalyticsWorkspaceId
      description: an Id for a unique environment for Azure Monitor log data
      type: string
      required: false
    - name: LogAnalyticsWorkspaceKey
      description: a key for a unique environment for Azure Monitor log data
      type: string
      required: false
```
Below is the only way to author the operational configuration file if the application operator wants to use a preexisting Log Analytics workspace:
```yaml
schemaVersion: alpha1
kind: OperationalConfiguration
metadata:
  name: custom-single-app
  description: Customized version of single-app
spec:
  components:
    - componentName: votingweb
      instanceName: voting-web-v1
      scopes:
        - name: healthScope
          type: core.hydra.io/v1alpha1.HealthScope
          parameterValues:
            - name: HealthThresholdPercentage
              value: 80
            - name: LogAnalyticsWorkspaceId
              value: d0bd5fdf-d2d8-488d-946c-97c29896e663
            - name: LogAnalyticsWorkspaceKey
              value: Kjd4S6nLFRbQfaPP8svIx42t+Vus9A1Ob9iEX/29tvkeLAI0nAq9bJUJsFbMXCnFBIJ4XpreFJFlQPae0/ezuQ==
```
However, based off of the scope schematic the following operational configuration file appears valid to the application operator even though the deployment would fail:
```yaml
schemaVersion: alpha1
kind: OperationalConfiguration
metadata:
  name: custom-single-app
  description: Customized version of single-app
spec:
  components:
    - componentName: votingweb
      instanceName: voting-web-v1
      scopes:
        - name: healthScope
          type: core.hydra.io/v1alpha1.HealthScope
          parameterValues:
            - name: HealthThresholdPercentage
              value: 80
            - name: LogAnalyticsWorkspaceId
              value: d0bd5fdf-d2d8-488d-946c-97c29896e663
```
This is a fairly simple example, but I could see this getting frustrating/confusing for the user as the number of parameters grows. Or is the goal to keep the number of parameters fairly limited so that it would be immediately obvious to users?
WorkloadSetting mentions that it contains a `type`.
It would be good to list some examples, or align it with `parameter.type`, to make it clearer what is expected.
Most of the traits we have talked about are attached to a component after it is started. Examples:
But there are a few traits that require the component to initialize after the trait is applied. For example, attaching a filesystem has to be done at container startup, and cannot be done after the component has started.
To use traits as they are currently defined, we'd have to do this:
My guess is that what we want to do is add something to a trait definition that says where in the lifecycle a trait should be attached. Perhaps something like this:

```yaml
lifecycle: pre-start
```

or

```yaml
lifecycle: post-start
```
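A fuller sketch of how this could appear in a trait definition (the `lifecycle` field and the trait shown are hypothetical):

```yaml
apiVersion: core.hydra.io/v1alpha1
kind: Trait
metadata:
  name: filesystem-mounter   # hypothetical trait
spec:
  lifecycle: pre-start       # proposed: attach before the component starts
  appliesTo:
    - core.hydra.io/v1.ReplicatedService
```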
The JSON Schema is substantially out of date.
We need to specify how a property schema (probably a JSON Schema) is attached to a trait or scope definition.
Currently we have a Component Schema CRD. But when we actually instantiate a component, we would need to expose its status and dependency status. Additionally, we would need to reference this instance as a dependency of another component/instance.
This is also discussed in oam-dev/rudr#71, but it doesn't exist in the spec repo.
In our Component Schema, we define the unit for CPU and GPU like this:
CPU count is represented as a float enclosed in a string (to avoid changes during serialization), where 1 means one CPU, 2 means 2 CPUs, and 0.5 means half of a CPU.
@technosophos / @vturecek - Do you see any issues with making this a 'number' (floating point)? I am not able to remember why this needs to be a string (I'm not sure what issues we are avoiding during serialization).
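A small Python round trip illustrates the kind of serialization drift a string avoids (this is my reading of the schema comment, not a confirmed rationale):

```python
import json

# A CPU count carried as a JSON number can change its textual form on a
# parse/serialize round trip; carrying it as a string preserves it exactly.
doc = '{"cpu": 1.10}'
roundtripped = json.dumps(json.loads(doc))
print(roundtripped)  # prints {"cpu": 1.1} -- the trailing zero is gone

# The same value as a string survives byte-for-byte:
doc_str = '{"cpu": "1.10"}'
assert json.dumps(json.loads(doc_str)) == doc_str
```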
When an architect designs a micro-services application they usually identify a group of components to be used. The operational configuration seems to be identifying itself as both a deployment and an application. I see it more of an instance of an application (group of components) with configuration. Application scopes seems to be the place for this. However, the available application scopes (Network, Health, Resource, Identity) don't seem to fit. There seems to be a need for more clarification on Identity. Would this be used for identifying an application? Maybe what I am talking about is not in the scope of the model. Kubernetes has the concept of namespaces to identify separate projects or applications.
This link https://github.com/microsoft/hydra-spec/blame/master/3.component_model.md#L184 goes nowhere, and I'm not sure whether the referenced content is still here.
(I'm using the blame view because it's the only way I can find to reference a line in .md files...)
In Traits, an example of the autoscaler is provided; it allows you to configure `maxReplicas`, `minReplicas`, and `scalingPolicy`.
`scalingPolicy` is the policy used for determining when to scale, with a default of `CPUUsage`.
That implies to me that it is more of a `scalingMetric`, or something similar, than the policy itself.
That said, the current autoscaler is lacking configuration such as:
- `cooldownPeriod`, which defines the period to wait before re-evaluating the scaling criteria
- `maxThreshold`, or something similar, which defines the metric value to hit before triggering a scale-out
- `minThreshold`, or something similar, which defines the metric value to hit before triggering a scale-in

An app usually needs some volume of storage, and the storage could be on the host (hostPath) or remote (NAS, EBS).
This is usually provided in the app configuration. I would propose adding a Volume abstraction just like Kubernetes provides, and giving containers a volumeMounts section.
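A sketch of how a Kubernetes-style volume abstraction could look in a component (all field names here are assumptions modeled on Kubernetes, not spec):

```yaml
spec:
  containers:
    - name: web
      image:
        name: example/web:1.0.0
      volumeMounts:              # hypothetical, mirroring Kubernetes
        - name: data
          mountPath: /var/lib/data
  volumes:                       # hypothetical Volume abstraction
    - name: data
      hostPath: /mnt/disks/data  # host storage; a NAS/EBS variant would differ
```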
Some components might need to work like DaemonSets in Kubernetes (one pod on each node) and provide services or resources that other components can consume. See https://github.com/microsoft/hydra-spec/issues/79
What would be a good way to achieve such behavior?
Currently, using the value of a parameter or variable is done via the `fromParameter()` built-in function. It would be better if we could use a well-known templating scheme like Go templates or Jinja.
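For illustration only (neither line reflects confirmed spec syntax; both are sketches of the two styles):

```yaml
env:
  - name: MESSAGE
    fromParam: message                  # current style: a built-in parameter reference
  - name: MESSAGE_TEMPLATED
    value: "{{ .parameters.message }}"  # hypothetical Go-template style
```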
We have already discussed changing Resources.Paths in the component spec to Resources.Volumes and adding a disk section to the Volume type containing the required size and whether it is ephemeral.
I'm creating this issue to track a PR for that.
In Component Model - Container, the spec doesn't seem to include options to authenticate with a private container registry.
Follow up on: https://github.com/microsoft/hydra-spec/pull/77#discussion_r312253397
It was suggested there that instead of using Go's `i386` and `amd64` terms, we use the more standard `x86` and `x64` terminology. I have no strong opinions. /cc @vturecek
Currently, configuration settings for a component are handled through environment variables. The developer declares the environment variables required for a container workload, and (most likely) parameterizes them through the component's parameters for an operator to fill in at deployment time.
For certain types of configuration values, a file may be a better fit. For example:
Users could mount a volume and place files on disk, but there are multiple steps involved there just to get configuration settings into a container. Populating the contents from a secure storage location would also be an exercise left up to the user.
The spec could define a simple way to do this and potentially allow a way to define the source of configuration values combined with authentication to populate with secrets and keys.
One way we could do this is to include a "config" section under the component container config:
```yaml
apiVersion: hydra/v1alpha
kind: ComponentSchematic
metadata:
  description: ""
spec:
  workloadType: ReplicableService
  containers:
    - name: my-container
      image:
        name: vturecek/mycontainer:1.0.0
        digest: sha256:6c3c624b58dbbcd3c0dd82b4c53f04194d1247c6eebdaab7c610cf7d66709b3b
      env:
        - name: BACKEND_ADDRESS
          fromParam: 'backend-address'
      config:
        - name: USERNAME
          fromParam: 'username'
        - name: PASSWORD
          fromParam: 'password'
```
The result of this would be two files mounted on a local volume, named USERNAME and PASSWORD, with the contents of each file being the corresponding value.
The runtime would be responsible for mounting the volume locally and populating its contents each time the container instance is created.
The use of Containers here:
Containers: A list of the runnable pieces (OCI images, functions, etc.) necessary for this component. A component has at least one of these
is confusing, as the Component spec's `containers` attribute is definitely optional.
Some sort of alternative like "workload runtime configuration" seems clunky, though.
Related:
The components in the operationalConfiguration are a flat array right now. In some cases, we might want to express dependencies between components: Component A might need to start before Component B, or Component B might fail.
We might solve this by blindly retrying during the components' starting phase, but ideally we could express the ordering requirement explicitly.
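One possible shape for expressing this (the `dependsOn` field is purely hypothetical):

```yaml
components:
  - componentName: database
    instanceName: db
  - componentName: frontend
    instanceName: web
    dependsOn:   # hypothetical: start only after the `db` instance is ready
      - db
```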
We've tentatively stated that the deployment and upgrade strategy works like so:
For example, a component may have a "Canary" trait. This trait describes what happens when a deployment occurs for an instance of that component. The canary trait specifically says that traffic should gradually be shifted from the existing instance to the new instance.
This issue is meant to discuss the possibility of a "staging" step prior to carrying out the strategy defined in an upgrade trait.
Using the example above, a component has a "Canary" trait. A new deployment occurs for that instance of a component. Before the upgrade proceeds as described by the Canary trait, can the new component instance first be placed in a "staged" area, where it runs side by side with the existing instance, is not included in the regular flow of traffic, but is reachable at a randomized endpoint by the operator for smoke testing before it receives real traffic as part of the Canary strategy?
Most of the objects the spec covers have a field called `type` which refers to a defined component. For example, core.hydra.io/v1.NetworkScope is a valid type for an application scope.
However, for traits this does not seem to be the case, while it feels like it should be. The autoscaler is an example where there is just a name and the runtime has to know what it is, or am I missing something?
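A sketch of what a typed trait reference might look like (the `type` field on traits is the suggestion being made, not current spec):

```yaml
traits:
  - name: autoscaler
    type: core.hydra.io/v1.Autoscaler   # hypothetical, mirroring scope types
    properties:
      minReplicas: 1
      maxReplicas: 5
```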
The spec as it is currently does not make it clear when a component is designed to have an endpoint or to run as a long-running task without an endpoint (e.g., reads from a service bus or queue). This in turn makes it unclear when a proxy needs to be applied to a component for load balancing and the necessary combination of traits required to achieve traffic routing to a component.
This proposal is to clarify developer intent further through unique workload types that cover the following developer scenarios:
For each of these, there is a scalable and a non-scalable version. The proposal is to make the scalable version the default, with a Singleton as a special case of each.
The result is the following workload types:
- Service
- SingletonService
- Job
- SingletonJob
- Task
- SingletonTask
I was a little surprised to see `redis` and `elasticsearch` initially cataloged as workloads together with `Singleton` and `Function`.
In my mind, workloads represent generic runnable entities. The user provides artifacts like a container, a WAR, a GitHub repo, or a function, with some configuration to be started. The workload instance then prepares the environment accordingly, optionally builds the binary, and starts the program.
When I see `Redis`, I think of a managed service, provided either by an operator in the Kubernetes world or by a cloud service in the cloud-provider world. It is different from a workload, since users do not provide the artifacts. It is more of a `Managed Service` kind of thing.
I understand that we need to allow customized types like `Redis` and `MySQL` to express the dependencies of applications, but I am wondering whether we should separate these two kinds of `Workload` into different types.