vmware-tanzu / cartographer
Cartographer is a Supply Chain Choreographer.
Home Page: https://cartographer.sh
License: Apache License 2.0
"I want to discover created objects through their labels."
Given a *Template
When an object is created from the template
Then the object has a label with key "carto.run/template-name" whose value is the template name
And the object has a label with key "carto.run/template-kind" whose value is the template kind
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
name: some-git-template
spec:
urlPath: .data.some-data
revisionPath: .data.some-data
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source
data:
some-data: the-Data-on-the-enterprise
will pair with a workload/supply chain to create
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source
labels:
...
carto.run/template-name: some-git-template
carto.run/template-kind: ClusterSourceTemplate
...
The current implementation adds a label for the template name, but hardcodes the key as carto.run/cluster-build-template-name.
Once we have non-cluster scoped templates we will need to add template-namespace.
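The label matching described above can be sketched in Go. This is a minimal stand-in, not the controller's actual client code; the `object` struct and `matchesTemplate` helper are hypothetical names for illustration:

```go
package main

import "fmt"

// object is a minimal stand-in for a Kubernetes object's metadata.
type object struct {
	name   string
	labels map[string]string
}

// matchesTemplate reports whether an object carries the proposed
// carto.run/template-name and carto.run/template-kind labels.
func matchesTemplate(o object, templateName, templateKind string) bool {
	return o.labels["carto.run/template-name"] == templateName &&
		o.labels["carto.run/template-kind"] == templateKind
}

func main() {
	objs := []object{
		{name: "test-configmap-source", labels: map[string]string{
			"carto.run/template-name": "some-git-template",
			"carto.run/template-kind": "ClusterSourceTemplate",
		}},
		{name: "unrelated"},
	}
	for _, o := range objs {
		if matchesTemplate(o, "some-git-template", "ClusterSourceTemplate") {
			fmt.Println(o.name) // discovered via labels, not by name
		}
	}
}
```

In a real controller this pairing of keys would become a label selector on a List call rather than an in-memory filter.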
"I want to see how the project code quality is trending"
When new code appears on main
Then new metrics evaluating the trend of our code quality should be displayed on a dashboard for all contributors to see
We evaluated SonarQube and have a config branch.
We would need to:
We also need to be sure that we meet any code quality requirements from the CNCF and use tooling that has precedent within the CNCF.
"I want to be able to auto generate names of Cartographer created objects"
Given a *Template with a value for `spec.template.metadata.generateName`
When an object is created from the template
Then an object is successfully created whose name is the specified prefix plus a random string suffix
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
generateName: kpack-template---workload-supply-chain-hardcoded-templates
spec:
imagePath: .data.average_color_of_the_universe
template:
apiVersion: v1
kind: ConfigMap
metadata:
generateName: build
data:
average_color_of_the_universe: "Cosmic latte"
will create something like
apiVersion: v1
kind: ConfigMap
metadata:
name: build123-some-random-string-xyz
data:
average_color_of_the_universe: "Cosmic latte"
Currently, when applying the template to the cluster, we search for the templated object by name/namespace. (Then if we find it, we update it; otherwise we create it). If the template uses generateName, we will need to search for the object by labels instead of by name.
By ensuring a unique object, we should be able to assert on object creation: https://kuttl.dev/docs/testing/asserts-errors.html#listing-resources-in-the-cluster
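The generateName behavior the story relies on can be sketched as follows. This imitates what the apiserver does with `metadata.generateName`; the function name and the exact alphabet are assumptions for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
)

// alphanums approximates the character set the Kubernetes apiserver uses
// for generateName suffixes (vowels and ambiguous characters removed).
const alphanums = "bcdfghjklmnpqrstvwxz2456789"

// fromGenerateName appends a random 5-character suffix to the prefix,
// imitating server-side handling of metadata.generateName.
func fromGenerateName(prefix string) string {
	suffix := make([]byte, 5)
	for i := range suffix {
		suffix[i] = alphanums[rand.Intn(len(alphanums))]
	}
	return prefix + string(suffix)
}

func main() {
	fmt.Println(fromGenerateName("build")) // e.g. build7k2xq
}
```

Because the suffix is random, the controller can no longer predict the object's name, which is why the lookup must switch to labels.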
We can also create an errors file to assert that the object has not been created with only the prefix. e.g.
# 01-errors.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: build
data:
average_color_of_the_universe: "Cosmic latte"
Quite a few error conditions, and a couple of functions, are untested
From #21
"I want to know when an object I apply does not adhere to the basic schema expectations of a CRD"
Given a yaml spec for a Cartographer CRD (Workload, ClusterSupplyChain, Cluster[Opinion|Build|Source]Template, Deliverable) that does not meet basic* schema
When I apply the yaml
Then I see a k8s error describing what is not meeting the schema.
The use of "basic" here means any schema validation that does not require knowledge of state or of other objects.
Suggested process:
(todo: better description)
As a pipeline author
I want the status of my Pipeline to be based on the most recently submitted and successful PipelineRun
So that I can trigger work only when a test passes
Given a pipeline and runTemplate that have created an object
When that object has a Succeeded condition that is true
Then the Pipeline's Status includes the output fields defined on the runTemplate
Given a pipeline and runTemplate that has created an object
When that object has a Succeeded condition that is not true
Then the Pipeline's Status is still that of a previous successful run
Success has been scoped to: stampedObject.Status.Conditions[@type='succeeded'].Status == true
This scheme has a corner case for when a pipeline has spec A (creates an object), is updated to spec B (creates another object), and then is updated to spec A (will not always create a new object because the original exists). In that case, when both A and B are successful, the most recently submitted successful object should be A (it is so recent that it is the current spec of the pipeline). The current story need not handle this corner case.
One could imagine a scenario where 1 pipeline submitted two objects in such close succession that they have the same creationTimestamp. In that scenario, it would be impossible to judge which is the most recently submitted. This story need not handle that case. There may be a follow-on story to create some mechanism to move from timestamps to some monotonically increasing series. This series would need to be appended to the objects created by the pipeline. Story authors should remember that new object creation can be triggered by changes to the pipeline, the runTemplate or (possibly) some 3rd object to which they refer.
"I don't want to overwrite objects created by other Cartographer processes"
Given an object created by a workload/supply-chain/component
When a different workload/supply-chain/component attempts to patch that object
Then the patch fails
And the workload `ComponentsSubmitted` status is False
Examples:
---
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
name: MyCanonicalSourceTemplate
spec:
urlPath: .value.my-number
revisionPath: .value.my-number
template:
apiVersion: carto.run/v1alpha1
kind: Deliverable
metadata:
name: hard-coded-name
value:
my-number: 1
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: supply-chain-1
spec:
selector:
some-key: "some-value"
components:
- name: source-provider-first
templateRef:
kind: ClusterSourceTemplate
name: MyCanonicalSourceTemplate
- name: source-provider-second
templateRef:
kind: ClusterSourceTemplate
name: MyCanonicalSourceTemplate
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: workload1
labels:
some-key: "some-value"
spec:
params:
- name: a-number
value: 1
will result in the following Workload status
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: workload1
status:
conditions:
- type: SupplyChainReady
status: "True"
reason: Ready
- type: ComponentsSubmitted
status: "False"
reason: ObjectOwnedByDifferentComponent
message: Component source-provider-second may not overwrite the object owned by {workload: {name: workload1, namespace: <some-namespace>}, supply-chain: {name: MyCanonicalSourceTemplate}, component: {name: source-provider-first}}
- type: Ready
status: "False"
reason: ComponentsSubmitted
...
Validate the Reason and Message expected
I want an updated template to fix workload issues without waiting for the exponential backoff.
Given a *Template whose inner template is malformed in a way that prevents proper object submission
And a workload that is in exponential backoff from erroring on this component submission
When the template is patched
Then the workload immediately begins the reconciliation process
kind: Workload
status:
conditions:
- type: ComponentsUnableToSubmit
status: "True"
reason: TemplateRejectedByAPIServer
- type: Ready
status: "False"
reason: ComponentsUnableToSubmit
In no more than 10 seconds after submitting the new template:
status:
conditions:
- type: ComponentsUnableToSubmit
status: "False"
reason: ComponentSubmissionComplete
- type: Ready
status: "True"
reason: Ready
Some additions should be made to the documentation to capture:
(ported from docs repo)
As a supply chain author
I want to emit the latest run's _outputs_
So that I can trigger work on test results
Do not worry about "success" yet
Cartographer's documentation needs to be updated with the required VMware OSS Hugo Templates.
"I want to run my configured containers on k8s"
Given a customer who has used Cartographer to take a workload through a supply-chain to config
When they read Cartographer documentation
Then they confidently configure Kapp to deploy the config
Cartographer currently takes workloads and turns them into k8s configuration, ready to be deployed on a cluster. Customers need a paved road to actually do this deployment. We should tell customers how they can use Carvel tooling to achieve this goal.
A Cartographer use case example of using Carvel to deploy an application
(ported from docs repo)
Customer Request:
can we express conditions/predicates that apply to the watched resource and act as gates on updating template target resources?
...Having the supply chain be "aware" of specific conditions could also be a first step to UX conveniences around "is my supply chain green or red right now"
"I want to declare criteria for templated objects to be considered in good state"
Given a workload, supply chain, template triplet
When the user specifies a positive condition on the template
Then no value from that template is passed along the supply chain until the templated object fulfills that condition
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: example-build---consume-output-of-components
spec:
template:
apiVersion: kpack.io/v1alpha1
kind: Image
metadata:
name: ...
spec:
...
imagePath: $(status.latestImage)$
waitRules:
goAhead:
- path: $(status.conditions[?(@.type=="Succeeded")].status)$
matcher: True
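The proposed goAhead rule could be evaluated roughly as below. This is a simplified sketch: the jsonpath in the example is reduced to a condition-type lookup, and the `goAheadRule` type is a hypothetical name:

```go
package main

import "fmt"

// condition mirrors the usual Kubernetes condition shape.
type condition struct {
	Type   string
	Status string
}

// goAheadRule is a simplified stand-in for a waitRules.goAhead entry: the
// jsonpath in the story is reduced here to a condition type to look up.
type goAheadRule struct {
	conditionType string
	matcher       string
}

// passed reports whether the templated object's conditions satisfy the
// rule; only then would a value be passed along the supply chain.
func (r goAheadRule) passed(conds []condition) bool {
	for _, c := range conds {
		if c.Type == r.conditionType {
			return c.Status == r.matcher
		}
	}
	return false // condition absent: not yet in a good state
}

func main() {
	rule := goAheadRule{conditionType: "Succeeded", matcher: "True"}
	fmt.Println(rule.passed([]condition{{Type: "Succeeded", Status: "Unknown"}})) // false
	fmt.Println(rule.passed([]condition{{Type: "Succeeded", Status: "True"}}))    // true
}
```

Note the deliberate default: a missing condition counts as "not yet ready", not as a pass.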
As an operator
I want an easy deployment of Cartographer
See this release for an example. https://github.com/vmware-tanzu/carvel-kapp-controller/releases/tag/v0.24.0
"I want to use simple labels."
Given a *Template
When an object is created from the template
Then the object has a label whose key is "carto.run/supply-chain-name"
And the object does not have a label whose key is "carto.run/cluster-supply-chain-name"
Examples:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source
labels:
carto.run/supply-chain-name: responsible-ops---templates-refer-to-workload
...
...
Once we have non-cluster scoped supply-chains we will need to add supply-chain-namespace and supply-chain-kind.
This is a placeholder issue for all of the work that is necessary to generate an Open Source License.
Problem:
If Cartographer has atomicity of operations, users will know that no other controller/actor in the k8s ecosystem can alter the spec of an object in a supply chain which Cartographer is reading. Update gives controllers atomicity, ensuring that if the object has changed between read and update, the update operation will fail.
Chore:
Switch all client.patch calls to client.update calls (get, then update).
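The atomicity the chore is after can be illustrated with a toy model of the apiserver's resourceVersion check. The `store` and `object` types are invented for the sketch; the point is that an update carrying a stale resourceVersion is rejected, whereas a patch has no such read-modify-write check unless a precondition is supplied:

```go
package main

import "fmt"

// object is a minimal stand-in for a stored resource.
type object struct {
	resourceVersion int
	data            string
}

// store simulates the apiserver's optimistic-concurrency check on update.
type store struct{ current object }

func (s *store) update(o object) error {
	if o.resourceVersion != s.current.resourceVersion {
		return fmt.Errorf("conflict: please apply your changes to the latest version")
	}
	o.resourceVersion++
	s.current = o
	return nil
}

func main() {
	s := &store{current: object{resourceVersion: 1, data: "spec-a"}}
	stale := s.current                                       // our read
	_ = s.update(object{resourceVersion: 1, data: "spec-b"}) // another actor writes first
	stale.data = "spec-c"
	fmt.Println(s.update(stale)) // fails: our resourceVersion is now stale
}
```

On conflict the controller would re-read and retry, which is exactly the read-check-write loop that guarantees it never overwrites a spec it has not seen.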
There is no coverage for this file whatsoever :D
https://github.com/vmware-tanzu/cartographer/blob/main/pkg/registrar/registrar.go
"I want to know which component is failing to stamp"
Given a running cartographer app
When I push a Workload with a stamped component that fails
Then the component stamping error (in status and log output) includes the kind of the resource we couldn't stamp
Examples:
Message: unable to apply object 'images.knative.io/tap-go' in namespace 'default': patch: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator
Originally it read:
Message: unable to apply object 'default/tap-go': patch: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator
This could have matched 3 different CRDs and was not meaningful
I want to specify a subpath for my git repo.
Given a Cartographer cluster
When I submit a Workload which has a subpath on the git/source field
Then the server persists the object
See this story where kpack included this: buildpacks-community/kpack#47
And the current kpack source object: https://github.com/pivotal/kpack/blob/main/docs/image.md#source-configuration
See discussion here: https://gitlab.eng.vmware.com/tanzu-delivery-pipeline/kontinue/-/merge_requests/11#note_6082153
https://vmware.slack.com/archives/C01QKL82CBB/p1620939470416700
Rob mentions wanting this feature early in his walk-through of Cartographer
"I want to be able to ingest the metrics from the controller"
Given that the controller is running
When I hit the metrics endpoint
Then I retrieve the metrics out of controller-runtime's instrumentation
controller-runtime internally keeps track of some pretty useful metrics that an operator can make use of, like how long reconciles have been taking, etc.
e.g.
// ReconcileErrors is a prometheus counter metrics which holds the total
// number of errors from the Reconciler
ReconcileErrors = prometheus.NewCounterVec(prometheus.CounterOpts{
	Name: "controller_runtime_reconcile_errors_total",
	Help: "Total number of reconciliation errors per controller",
}, []string{"controller"})

// ReconcileTime is a prometheus metric which keeps track of the duration
// of reconciliations
ReconcileTime = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name: "controller_runtime_reconcile_time_seconds",
	Help: "Length of time per reconciliation per controller",
	Buckets: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0,
		1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 60},
}, []string{"controller"})
To be able to fetch them, the controller needs to start an HTTP server, which then takes care of exporting those metrics for us. That's done via controller-runtime's manager config:
mgr, err := manager.New(cfg, manager.Options{
Scheme: scheme,
- MetricsBindAddress: "0",
+ MetricsBindAddress: "9000",
})
Having that "activated" (a non-zero port specified in the bind address), anyone can hit the /metrics endpoint and retrieve the information (something that a Prometheus instance or similar would do).
Recently, we introduced the -dev flag to cmd/controller in order to tweak the configuration of the logging format. We could make use of a similar approach for that: perhaps a -metrics-address?
Another approach is making use of controller-runtime's new mechanism for tweaking manager settings: component-config.
with it, you're able to tweak a controller's configuration via ... kubernetes objects! e.g.:
apiVersion: examples.x-k8s.io/v1alpha1
kind: CustomControllerManagerConfiguration
clusterName: example-test
cacheNamespace: default
metrics:
bindAddress: :8081
leaderElection:
leaderElect: false
I personally prefer just -metrics-address from a flag, but if most products that use kubebuilder/controller-runtime are going for component-config (something we should check out), I'd vote towards going with that.
thanks!
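The flag-based option could look roughly like this. The flag name `-metrics-address` and the `parseMetricsAddress` helper are the proposal's assumptions, not existing code; "0" mirrors the convention of keeping the metrics server disabled:

```go
package main

import (
	"flag"
	"fmt"
)

// parseMetricsAddress sketches the hypothetical -metrics-address flag,
// following the precedent of the existing -dev flag on cmd/controller.
func parseMetricsAddress(args []string) (string, error) {
	fs := flag.NewFlagSet("controller", flag.ContinueOnError)
	addr := fs.String("metrics-address", "0", `bind address for the metrics endpoint; "0" disables it`)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	// the value would then be handed to manager.Options' metrics bind address
	return *addr, nil
}

func main() {
	addr, _ := parseMetricsAddress([]string{"-metrics-address", ":9000"})
	fmt.Println(addr)
}
```

With component-config the same value would instead come from the metrics.bindAddress field of the configuration object shown above.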
I want to know about errors in my workload as soon as possible
Given a supply-chain pointing to a *Template which expects a workload value
When a workload without that value matches with the supply chain
Then the workload's ComponentsReady condition is False with reason "WorkloadProvidesInsufficientValues"
And there is a helpful message
And no templated object is created from any component
Example:
apiVersion: carto.run/v1alpha1
kind: SupplyChain
spec:
components:
- name: source-provider
templateRef:
kind: SourceTemplate
name: git-repository-battery
---
apiVersion: carto.run/v1alpha1
kind: SourceTemplate
metadata:
name: git-repository-battery
spec:
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source
data:
someVal: $(workload.resources.limits.memory)$
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: petclinic
labels:
integration-test: "workload-supply-chain-hardcoded-templates"
spec:
metadata:
waciuma-com/quality: beta
waciuma-com/java-version: 11
git:
url: https://github.com/spring-projects/spring-petclinic.git
ref:
branch: main
resources:
requests:
memory: "1Gi"
cpu: "250m"
results in
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: petclinic
status:
conditions:
- type: ComponentsReady
status: "False"
reason: "WorkloadProvidesInsufficientValues"
- type: Ready
status: "False"
reason: ComponentsReady
supplyChainRef:
name: responsible-ops
The "And no templated object is created from any component" clause means that validation should happen before attempting to template and submit a templated object to the API server.
Given I have a workload with an issue (eg not matching a supply chain)
When I submit the workload
Then I see a SubStatus explaining the issue
And I see that the top level "Ready" condition reflects the SubStatus' Reason and Message fields.
Given A supply chain with an issue (eg no matching template name)
And a workload matching the supply chain
When I submit the workload
Then in a cascade, I see:
* a SupplyChain Sub-Condition explaining the issue
* the top level Supply Chain "Ready" condition reflects the Sub-Condition's Reason and Message fields.
* the Workload "SupplyChainReady" Condition reflects the Supply Chain's Top Level Reason and Message fields
* the Workload top level condition's Reason And Message fields reflect those of the SupplyChainReady Condition
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops
status:
conditions:
- type: TemplatesReady
status: "False"
reason: TemplatesNotFound
message: Did not find the template of the component 'source-provider', 'image-provider', 'opinion-provider', 'cluster-sink'
- type: Ready
status: "False"
reason: TemplatesNotFound
message: Did not find the template of the component 'source-provider', 'image-provider', 'opinion-provider', 'cluster-sink'
---
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: petclinic
status:
conditions:
- type: SupplyChainReady
status: "False"
reason: TemplatesNotFound
message: Did not find the template of the component 'source-provider', 'image-provider', 'opinion-provider', 'cluster-sink'
- type: Ready
status: "False"
reason: TemplatesNotFound
message: Did not find the template of the component 'source-provider', 'image-provider', 'opinion-provider', 'cluster-sink'
I want to know about errors in my supply-chain as soon as possible
Given a *Template A which expects N sources/images/opinions
When a supply-chain component points to *Template A and provides fewer than N sources/images/opinions
Then the supply-chain Ready condition is False with reason "ComponentInputsInsufficientForTemplate"
And there is a helpful message
Example:
apiVersion: carto.run/v1alpha1
kind: SupplyChain
spec:
components:
...
- name: built-image-provider
templateRef:
kind: BuildTemplate
name: kpack-battery
sources:
- component: app-provider
name: app-provider
---
apiVersion: carto.run/v1alpha1
kind: BuildTemplate
metadata:
name: kpack-battery
spec:
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-build
data:
favoriteURL: $(sources.0.url)$
secondFavoriteURL: $(sources.1.url)$
results in
apiVersion: carto.run/v1alpha1
kind: SupplyChain
status:
conditions:
- type: Ready
status: "False"
reason: ComponentInputsInsufficientForTemplate
message: component 'provider' does not provide expected sources[1] # <--- does not need to be exactly this
Given a *Template A which expects source/image/opinion by name
When a supply-chain component points to *Template A and provides no source/image/opinion with said name
Then the supply-chain Ready condition is False with reason "ComponentInputsInsufficientForTemplate"
And there is a helpful message
Example:
apiVersion: carto.run/v1alpha1
kind: SupplyChain
spec:
components:
...
- name: built-image-provider
templateRef:
kind: BuildTemplate
name: kpack-battery
sources:
- component: app-provider
name: app-provider
---
apiVersion: carto.run/v1alpha1
kind: BuildTemplate
metadata:
name: kpack-battery
spec:
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-build
data:
favoriteURL: $(sources[?(@.name==solo-source-provider)].revision)$
results in
apiVersion: carto.run/v1alpha1
kind: SupplyChain
status:
conditions:
- type: Ready
status: "False"
reason: ComponentInputsInsufficientForTemplate
message: component 'provider' does not provide expected source with name solo-source-provider # <--- does not need to be exactly this
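The validation behind both scenarios reduces to a set difference between the input names the template expects and the names the component provides. A minimal sketch, with `missingInputs` as a hypothetical helper name:

```go
package main

import "fmt"

// missingInputs returns the input names a template expects that the
// supply-chain component does not provide; a non-empty result would map
// to the ComponentInputsInsufficientForTemplate condition.
func missingInputs(expected, provided []string) []string {
	have := make(map[string]bool, len(provided))
	for _, name := range provided {
		have[name] = true
	}
	var missing []string
	for _, name := range expected {
		if !have[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// the template references sources[?(@.name=="solo-source-provider")],
	// but the component only provides a source named "app-provider"
	fmt.Println(missingInputs([]string{"solo-source-provider"}, []string{"app-provider"}))
}
```

For the positional case (`sources.1.url`), "expected" would instead be a count, but the check stays a comparison of what the template reads against what the component supplies.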
I want workloads to use a default supply-chain
Given a cluster with a workload with no labels
When a supply-chain is submitted with the selector "carto.run/is-default:true"
Then the workload begins to reconcile with that supply-chain
and
Given a cluster with a workload with labels that don't match an existing supply-chain
When a supply-chain is submitted with the selector "carto.run/is-default:true"
Then the workload begins to reconcile with that supply-chain
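The fallback selection the two scenarios describe can be sketched as below. The `supplyChain` struct and `chooseSupplyChain` function are hypothetical; the assumption is that a chain carrying the `carto.run/is-default: "true"` selector acts as the fallback when no other selector matches:

```go
package main

import "fmt"

// supplyChain is a minimal stand-in: a name plus its label selector.
type supplyChain struct {
	name     string
	selector map[string]string
}

// matches reports whether every selector entry is present on the labels.
func (sc supplyChain) matches(labels map[string]string) bool {
	for k, v := range sc.selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// chooseSupplyChain returns the first chain whose selector matches the
// workload labels, falling back to the chain marked carto.run/is-default.
func chooseSupplyChain(chains []supplyChain, workloadLabels map[string]string) (string, bool) {
	var fallback string
	for _, sc := range chains {
		if sc.selector["carto.run/is-default"] == "true" {
			fallback = sc.name
			continue
		}
		if sc.matches(workloadLabels) {
			return sc.name, true
		}
	}
	if fallback != "" {
		return fallback, true
	}
	return "", false
}

func main() {
	chains := []supplyChain{
		{name: "supply-chain-1", selector: map[string]string{"some-key": "some-value"}},
		{name: "default-chain", selector: map[string]string{"carto.run/is-default": "true"}},
	}
	name, _ := chooseSupplyChain(chains, nil) // workload with no labels
	fmt.Println(name)
}
```

Both scenarios fall out of the same path: a label-less workload and a workload with unmatched labels each fail every specific selector and land on the default.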
I want to be able to associate the input to a resource with the output from a resource.
Given a supply-chain, templates and workloads which have created a set of templated objects.
When a templated object has a Ready condition whose status is not "True" or "False"
Then Cartographer does not update the object with new template inputs.
Example:
Given the below state, Cartographer will not update the kpack image, regardless of a new value for sources.0.url
apiVersion: kpack.io/v1alpha1
kind: Image
status:
conditions:
- lastTransitionTime: "2021-03-02T22:38:29Z"
status: "Unknown"
type: Ready
---
apiVersion: experimental.carto.run/v1
kind: BuildTemplate
metadata:
name: kpack-battery
spec:
template:
apiVersion: kpack.io/v1alpha1
kind: Image
spec:
source:
blob:
url: $(sources.0.url)$
Scenario: ResourceN's Ready condition status is "Unknown". ResourceN-1 releases 3 outputs in a row: A, B, then C. When ResourceN's Ready condition status is "True", what input should Cartographer pass to ResourceN? A, or C?
Should we allow users to make the choice? Concourse has the concept of "every" and "latest" for which resource version to use. We know that sometimes customers want pipelines that run on every input. The eventual consistency model of k8s means that we're unable to assure that work won't be "dropped on the factory floor" and go missing in between steps when resource N gives multiple outputs between Cartographer's reconciliation loop. But we can make a best-faith effort of fulfilling such a user choice.
If we allow users to make this choice, it would become a field on the supply-chain.
My (Waciuma) recommendation is that we start with only the default option of using the latest version of the input resources
Scenario: ResourceN's Ready condition status has been "Unknown" for a long time. How can the App Operator get things back in a good state?
Every templated object is stamped out from a combination of the Template, SupplyChain, Workload, as well as the state of earlier work done. This story specifies that changes in the earlier work will not be reflected in the templated object until the templated object is Ready. What should happen if the Template, SupplyChain, Workload change?
Consider: Workloads are owners of the templated objects. If they are patched, does the child object automatically get deleted?
Not all resources have a Ready condition. Some options:
The controller uses credentials supplied by a service account to provide to a client that reconciles that workload.
Is the service account supplied on the workload? Should the permission lie with the template? Who should define the ability to provide permission for creating kpack images, for example?
TODO:
As a supply chain author
I want to stamp immutable objects for changing inputs
So that I can run immutable resources based on constrained resources (EG SCM commits)
Assume there are no completion rules and wait for the condition Ready to be True.
Use a new PipelineController.
generateName? Reject templated naming and make a new issue.
"I want auto-garbage collection when I delete supply-chain components"
Given a Kontinue workload/supply-chain that has stamped out objects
When a supply-chain's existing component is deleted
Then the previously created object is deleted
Examples:
Initial State
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: first-template
spec:
imagePath: .data.average_color_of_the_universe
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: first-build
data:
average_color_of_the_universe: "Cosmic latte"
---
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: second-template
spec:
imagePath: .data.best_litigant_against_board
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: second-build
data:
best_litigant_against_board: "Brown"
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: first-image-provider
templateRef:
kind: ClusterBuildTemplate
name: first-template
- name: second-image-provider
templateRef:
kind: ClusterBuildTemplate
name: second-template
---
# STAMPED OUT OBJECT
apiVersion: v1
kind: ConfigMap
metadata:
name: first-build
labels:
carto.run/workload-name: petclinic
carto.run/cluster-supply-chain-name: responsible-ops
carto.run/component-name: first-image-provider
carto.run/cluster-build-template-name: first-template
ownerReferences:
- apiVersion: carto.run/v1alpha1
kind: Workload
...
data:
average_color_of_the_universe: "Cosmic latte"
---
# STAMPED OUT OBJECT
apiVersion: v1
kind: ConfigMap
metadata:
name: second-build
labels:
carto.run/workload-name: petclinic
carto.run/cluster-supply-chain-name: responsible-ops
carto.run/component-name: second-image-provider
carto.run/cluster-build-template-name: second-template
ownerReferences:
- apiVersion: carto.run/v1alpha1
kind: Workload
...
data:
best_litigant_against_board: "Brown"
when the supply-chain is updated to
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: first-image-provider
templateRef:
kind: ClusterBuildTemplate
name: first-template
then second-build
should no longer exist in the cluster.
This story also covers the case where the component's name is changed.
Recommend we update https://github.com/vmware-tanzu/cartographer/blob/main/tests/integration/supply_chain_validation_test.go#L18-337 to include:
[TBD]
Flows vmware-tanzu/cartographer-private#118
When the 'dog food' kontinue deployment's workload has an error in the conditions
Then I see a radiator status change, or get a notification
Problem:
Issue "If values cannot be found in Templated Objects, workload status should be Unknown" will default workloads to display an Unknown condition if a *Template cannot find its object's expected fields. This is because templated objects can take some time to display expected values. But sometimes templated objects will be in a bad state that will not self-fix. In those cases, Cartographer would best report a problem, with a workload ready status of False. Cartographer must rely on template authors to specify indications that an object is in such a bad state.
"I want to specify indications that a templated object is in a failed state"
Given a template with badStateConditions
When the templated object matches that condition
Then the workload ComponentsSubmitted and Ready condition are "False"
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: example-build---consume-output-of-components
spec:
template:
...
waitRules:
block:
- path: $(status.conditions[?(@.type=="Succeeded")].status)$
matcher: True
Bad state conditions should have an OR relation. Any one being bad should indicate the object is in a bad state.
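The OR relation can be made concrete with a small sketch. The `blockRule` type and the example rules are invented for illustration; the semantics shown are just "any one matching rule marks the object failed":

```go
package main

import "fmt"

// condition mirrors the usual Kubernetes condition shape.
type condition struct {
	Type   string
	Status string
}

// blockRule is a simplified stand-in for a waitRules.block entry.
type blockRule struct {
	conditionType string
	matcher       string
}

// inBadState ORs the rules: a single match marks the object failed, which
// would flip the workload's ComponentsSubmitted and Ready to "False".
func inBadState(rules []blockRule, conds []condition) bool {
	for _, r := range rules {
		for _, c := range conds {
			if c.Type == r.conditionType && c.Status == r.matcher {
				return true
			}
		}
	}
	return false
}

func main() {
	rules := []blockRule{
		{conditionType: "Succeeded", matcher: "False"},
		{conditionType: "OOMKilled", matcher: "True"},
	}
	conds := []condition{{Type: "Succeeded", Status: "Unknown"}, {Type: "OOMKilled", Status: "True"}}
	fmt.Println(inBadState(rules, conds)) // one matching rule is enough
}
```

Contrast this with the goAhead rules, where absence of a match blocks progress; here absence of a match is the healthy case.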
"I want to easily be able to debug an issue in the field."
Given Cartographer is deployed at a customer site
When Something really bad happens
Then I should be easily able to debug from the logs
Examples:
<Code snippets that illustrate the when/then blocks>
Minimal info logging
More debug logging
We need a debug level and a way for customers to turn debug on (covered by #251)
"I want to assert on namespaces in the kuttl tests"
Given an object that will be created in a kuttl test
When I write an assert file that asserts the object is in the test's namespace
Then kuttl will generate a namespace in which to run the test and assert the presence of the namespace in the object
We want to use informers to switch to an event-driven mechanism for stamped resource updates.
Outcome:
Tasks:
Note:
This has already been accomplished by the pipeline service
"I want to confidently cut a new release of Cartographer without worrying about doing it 'the wrong way'"
Given that we decided we want to ship a new version
When I go about doing it
Then I can follow a playbook that indicates the steps I should follow
(imagine if the playbook is: "git tag $version && git push origin $version" and wait for the supplychain to ship? that'd be nice!)
So far, we didn't really have to be concerned about releasing cartographer - whenever we felt like we had something to give people to try, make release and then commit (see "Creating new releases").
That certainly served us well so far, but we'll definitely have to step up a
bit to provide a nice experience for our users, and to make the process easy
for anyone that gets onboarded to the team.
I do believe that automating the whole process is a great outcome to achieve at
some point, but not necessarily a must at first.
Regardless, I think we must make sure we understand what releasing means: does it mean tagging a commit where the releases/* directory got bumped? does it mean tagging that commit and deploying a new version of the docs? do we need to craft a changelog? where do we ship those assets? when do we care about semver, if at all? do we only roll forward?
As a result of this, I believe we could come up with a RELEASING.md doc with some of those answered, proposing what the process could look like, ask folks from other teams to see if we're missing anything, and then work towards making it as easy as possible (automation should come from it, I guess).
any thoughts? do you think having a RELEASING.md is a good acceptance criteria for it?
thx!
Setup:
We should think of the supply chain as a graph.
Let us define a "valid" set of sources as a set that produces a graph where all sinks are resolvable. That is, a valid set of sources will not raise any errors in the nodes of the supply chain graph. Conversely, an "invalid" set will produce an error.
Problem:
Currently, it is possible to submit a valid set of sources, followed by an invalid set of sources, and thereby prevent a sink from producing an output. Posit this graph:
A->B->C->D
Assume C takes a significant period of time to resolve.
Valid input to A is given.
This passes to B and C.
While C is working, invalid input to A is given. This causes B to error when it attempts to stamp out a resource.
The Cartographer controller currently will not continue to check the output of C. Until A has another valid input, D will be starved.
This is bad.
"I want to ensure that components that can be deployed in the graph, are deployed, regardless of the state of other components."
Given a supply-chain that lists some components that will error before other components that could succeed
When Cartographer controller reconciles
Then the components that could succeed, do succeed.
Examples:
# Given a supply chain blueprint and templates
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
name: git-template
spec:
urlPath: .data.workload_git_url
revisionPath: .data.workload_git_url
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source
data:
workload_git_url: $(workload.source.git.url)$
---
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
name: git-template-from-params
spec:
params:
- name: a-url
default: [email protected]:kontinue/example-app.git
urlPath: .value.cat-lives
revisionPath: .value.critters
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source-from-params
data:
workload_params_url: $(params[0].value)$
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops---ordering
spec:
selector:
integration-test: "workload-1"
components:
- name: source-provider-from-params
templateRef:
kind: ClusterSourceTemplate
name: git-template-from-params
params:
- name: a-url
path: $(workload.params[0].value)$
- name: source-provider
templateRef:
kind: ClusterSourceTemplate
name: git-template
---
# then resources that can be stamped out are created
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source-from-params
data:
workload_params_url: original-url
---
# when a workload does not provide all necessary values
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: petclinic
labels:
integration-test: "workload-1"
spec:
params:
- name: url
value: original-url
# source: # Missing value
# git:
# url: some-url
---
# And when the supply chain changes the order of the components
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops---ordering
spec:
components:
- name: source-provider
templateRef:
kind: ClusterSourceTemplate
name: git-template
- name: source-provider-from-params
templateRef:
kind: ClusterSourceTemplate
name: git-template-from-params
params:
- name: a-url
path: $(workload.params[0].value)$
# - name: source-provider-from-params
# templateRef:
# kind: ClusterSourceTemplate
# name: git-template-from-params
# params:
# - name: a-url
# path: $(workload.params[0].value)$
#
# - name: source-provider
# templateRef:
# kind: ClusterSourceTemplate
# name: git-template
---
# then the resources that can be stamped out are created
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-source-from-params
data:
# workload_params_url: original-url
workload_params_url: new-url
---
# And there is an update to values
apiVersion: carto.run/v1alpha1
kind: Workload
metadata:
name: petclinic
labels:
integration-test: "workload-1"
spec:
params:
- name: url
value: new-url
The test attempts to deploy two components that both depend on the workload. One component will fail, the other will succeed. Regardless of the supply chain order, we should always see one component successfully updated.
"I want to run the controller with the minimum set of privileges possible."
Given that I deployed the controller in a Kubernetes cluster
When it reaches out to `kube-apiserver`
Then it does so using a token with the least amount of privileges for finding out cartographer-specific objects
And it uses a user provided token for maintaining objects templated out
Unlike products like kpack that have a known set of resources to interact with (e.g., kpack deals with `Image`, `Build`, `ClusterBuilder`, etc - a set known in advance), cartographer doesn't - it sure knows about `Workload`, `*Template` and `ClusterSupplyChain`, but it can't know whether a supplychain will drive the creation of a `kpack/Image` or a `cartographer/Pipeline` that creates a container image using something like tekton's kaniko task, as it's up to the user to define those.
CLUSTERSUPPLYCHAIN-1
source-provider <--src-- image-provider
. .
. .
fluxcd/gitrepository kpack/image
(1) (2)
==> a Workload reconciling against it would need to
provide to the controller a token that is authzed
to CRUD (1) and (2)
CLUSTERSUPPLYCHAIN-2
source-provider <--src------ image-provider
. .
. .
mozilla/MercurialRepository cartographer/Pipeline
(3) (4)
==> a Workload reconciling against it would need to
provide to the controller a token that is authzed
to CRUD (3) and (4)
In order for the controller to do anything (like even figuring out whether there's a Workload to reconcile), it must have access to `kube-apiserver`'s endpoints. As doing so requires credentials (acquired either by interpreting `$KUBECONFIG` or by looking up the token from the serviceaccount token secret mounted in the pod), our current approach is to provide the controller a token bound to `cluster-admin` (when running in a Pod) or to use an admin user when running locally.
I believe that we should make it so that the controller's serviceaccount is bound to a role that provides the minimum set of permissions possible:
- rw: workload
- rw: clustersupplychain statuses
- r: clustersupplychain
- r: *template
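As a rough sketch, that minimal rule set could be expressed like the following. The `rule` type here is a simplified local stand-in for `rbacv1.PolicyRule` (not the real `k8s.io/api/rbac/v1` type), and the exact verbs and resource names are assumptions, not the project's settled RBAC.

```go
package main

import "fmt"

// rule is a simplified local stand-in for an RBAC PolicyRule.
type rule struct {
	verbs     []string
	resources []string
}

// controllerRules sketches the proposed minimal permission set: read/write on
// workloads, write on clustersupplychain statuses, read-only on
// clustersupplychains and on the *Template kinds. Verb choices are guesses.
func controllerRules() []rule {
	return []rule{
		{verbs: []string{"get", "list", "watch", "update"}, resources: []string{"workloads", "workloads/status"}},
		{verbs: []string{"update"}, resources: []string{"clustersupplychains/status"}},
		{verbs: []string{"get", "list", "watch"}, resources: []string{"clustersupplychains"}},
		{verbs: []string{"get", "list", "watch"}, resources: []string{"clustersourcetemplates", "clusterbuildtemplates", "clusterconfigtemplates"}},
	}
}

func main() {
	for _, r := range controllerRules() {
		fmt.Println(r.resources, r.verbs)
	}
}
```

Anything beyond these four rules (creating kpack Images, ConfigMaps, etc.) would come from the workload-provided token, not the controller's own role.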
in which case, the credentials that would let the controller create a component's objects (like `kpack/Image`) would be provided via `workload.spec.serviceAccountName`.
for instance, assuming we have the following Workload
kind: Workload
metadata: {name: foo, labels: {this: that}}
spec:
serviceAccountName: default
that reconciles against the following `ClusterSupplyChain` and `ClusterConfigTemplate`:
kind: ClusterSupplyChain
metadata: {name: supplychain}
spec:
selector: {this: that}
components:
- name: source-provider
templateRef: {kind: ClusterConfigTemplate, name: source-provider}
---
kind: ClusterConfigTemplate
metadata: {name: source-provider}
spec:
template:
kind: ConfigMap
apiVersion: v1
metadata: {name: $(workload.name)$}
data: {foo: bar}
that `default` serviceaccount would provide to the cartographer controller the authz to CRUD `ConfigMap`s in the workload's namespace.
This is quite similar to how kapp-controller has a very small permission set for the controller itself, but then it's up to the user to provide the serviceaccount (via the `App`'s `spec.serviceAccountName`) that has the ability to deploy all the objects that such an `App` is supposed to deploy (see https://carvel.dev/kapp-controller/docs/latest/security-model/#docs).
"I want to protect users from dangerous practices"
Given a cluster with Cartographer
When a *Template that defines `spec.template.metadata.namespace` is submitted
Then the template is rejected
Examples:
This template would be rejected
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: kpack-template
spec:
imagePath: .data.average_color_of_the_universe
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-build
namespace: default # <--- the offending line
data:
average_color_of_the_universe: "Cosmic latte"
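The rejection itself is a one-field check at admission time. A minimal sketch, assuming a validating-webhook-style hook; `templateMeta` and `validateTemplate` are hypothetical local names, not Cartographer's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

// templateMeta is a hypothetical, minimal stand-in for the metadata of the
// object embedded under a *Template's spec.template.
type templateMeta struct {
	Name      string
	Namespace string
}

// validateTemplate mirrors the proposed rule: any template whose embedded
// object sets metadata.namespace is refused.
func validateTemplate(m templateMeta) error {
	if m.Namespace != "" {
		return errors.New("spec.template.metadata.namespace must not be set")
	}
	return nil
}

func main() {
	// the offending template from the example above
	err := validateTemplate(templateMeta{Name: "test-configmap-build", Namespace: "default"})
	fmt.Println(err)
}
```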
"I want auto-garbage collection when I update supply-chain components"
Given a cartographer workload/supply-chain that has stamped out objects
When a supply-chain's existing component is altered to create a different object
Then the previously created object is deleted
Examples:
Initial State
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: original-template
spec:
imagePath: .data.average_color_of_the_universe
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: original-build
data:
average_color_of_the_universe: "Cosmic latte"
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops---workload-supply-chain-hardcoded-templates
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: image-provider
templateRef:
kind: ClusterBuildTemplate
name: original-template
---
# STAMPED OUT OBJECT
apiVersion: v1
kind: ConfigMap
metadata:
name: original-build
labels:
carto.run/workload-name: petclinic
carto.run/cluster-supply-chain-name: responsible-ops---workload-supply-chain-hardcoded-templates
carto.run/component-name: image-provider
carto.run/cluster-build-template-name: original-template
ownerReferences:
- apiVersion: carto.run/v1alpha1
kind: Workload
...
data:
average_color_of_the_universe: "Cosmic latte"
when the supply-chain is updated to
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops---workload-supply-chain-hardcoded-templates
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: image-provider
templateRef:
kind: ClusterBuildTemplate
name: replacement-template
---
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: replacement-template
spec:
imagePath: .data.best_litigant_against_board
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: test-configmap-opinion
data:
best_litigant_against_board: "Brown"
then `original-build` should no longer exist in the cluster.
This story also covers the case where the component points to the same *Template, but the *Template has been changed so that it stamps out an object with a different name/kind.
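The core of the garbage collection is a set difference between the identities stamped by the previous spin and those stamped by the current one. A minimal sketch; `objRef` and `orphaned` are hypothetical local names, not Cartographer's actual bookkeeping.

```go
package main

import "fmt"

// objRef identifies a stamped-out object (hypothetical local type).
type objRef struct {
	apiVersion, kind, name string
}

// orphaned returns the previously stamped objects that the updated
// supply chain no longer produces - the deletion candidates.
func orphaned(previous, current []objRef) []objRef {
	still := make(map[objRef]bool, len(current))
	for _, o := range current {
		still[o] = true
	}
	var gone []objRef
	for _, o := range previous {
		if !still[o] {
			gone = append(gone, o)
		}
	}
	return gone
}

func main() {
	prev := []objRef{{"v1", "ConfigMap", "original-build"}}
	curr := []objRef{{"v1", "ConfigMap", "test-configmap-opinion"}}
	for _, o := range orphaned(prev, curr) {
		fmt.Println(o.kind + "/" + o.name) // ConfigMap/original-build
	}
}
```

Because the diff keys on apiVersion/kind/name, this also catches the same-*Template-different-object case mentioned above.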
"I want auto-garbage collection when I delete *Templates"
Given a Cartographer workload/supply-chain/*Template that has stamped out an object
When the *Template is deleted
Then the previously created object is deleted
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterBuildTemplate
metadata:
name: first-template
spec:
imagePath: .data.average_color_of_the_universe
template:
apiVersion: v1
kind: ConfigMap
metadata:
name: first-build
data:
average_color_of_the_universe: "Cosmic latte"
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: first-image-provider
templateRef:
kind: ClusterBuildTemplate
name: first-template
---
# STAMPED OUT OBJECT
apiVersion: v1
kind: ConfigMap
metadata:
name: first-build
labels:
carto.run/workload-name: petclinic
carto.run/cluster-supply-chain-name: responsible-ops
carto.run/component-name: first-image-provider
carto.run/cluster-build-template-name: first-template
ownerReferences:
- apiVersion: carto.run/v1alpha1
kind: Workload
...
data:
average_color_of_the_universe: "Cosmic latte"
when `first-template` is deleted, `first-build` should no longer exist in the cluster.
If the SupplyChain cannot find the template for a component, its Ready condition will be False. This is a viable candidate for where to put the logic for this garbage collection. The logic cannot go into the workload reconciler without a refactor, as the workload will not enter its reconciliation function with a supply-chain that is not Ready.
Controller-Runtime's client differentiates between a general error when trying to find an object and the object simply not being found. In the latter case, the returned error satisfies `k8s.io/apimachinery/pkg/api/errors.IsNotFound`:
https://github.com/kubernetes-sigs/controller-runtime/blob/7d83250a445f2b5371e39e4197d3d0024f95fbdc/pkg/cache/internal/cache_reader.go#L52-L71
https://pkg.go.dev/k8s.io/apimachinery/pkg/api/errors#IsNotFound
"I want auto-garbage collection when I delete supply-chains"
Given a Cartographer workload/supply-chain that has stamped out objects
When a supply-chain is deleted
Then the previously created objects are deleted
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: responsible-ops
spec:
selector:
integration-test: "workload-supply-chain-hardcoded-templates"
components:
- name: first-image-provider
templateRef:
kind: ClusterBuildTemplate
name: first-template
- name: second-image-provider
templateRef:
kind: ClusterBuildTemplate
name: second-template
---
# SOME STAMPED OUT OBJECTS
When ClusterSupplyChain `responsible-ops` is deleted, the stamped out objects are deleted.
A likely place to insert this logic is in the supply-chain reconciler here:
https://github.com/vmware-tanzu/cartographer/blob/main/pkg/controller/supplychain/reconciler.go
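Since stamped objects already carry a `carto.run/cluster-supply-chain-name` label, the cleanup can select on it. A minimal sketch, assuming a finalizer on the ClusterSupplyChain runs something like this before the object is removed; `stamped` and `toDelete` are hypothetical names.

```go
package main

import "fmt"

// stamped is a hypothetical stand-in for a stamped-out object carrying the
// labels shown in the earlier examples.
type stamped struct {
	name   string
	labels map[string]string
}

// toDelete selects the stamped objects labeled as belonging to the deleted
// supply chain.
func toDelete(all []stamped, supplyChain string) []string {
	var names []string
	for _, o := range all {
		if o.labels["carto.run/cluster-supply-chain-name"] == supplyChain {
			names = append(names, o.name)
		}
	}
	return names
}

func main() {
	objs := []stamped{
		{name: "first-build", labels: map[string]string{"carto.run/cluster-supply-chain-name": "responsible-ops"}},
		{name: "unrelated", labels: map[string]string{"carto.run/cluster-supply-chain-name": "another-chain"}},
	}
	fmt.Println(toDelete(objs, "responsible-ops")) // [first-build]
}
```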
"I want to know about errors in my supply-chain as soon as possible"
Given a *Template which expects N params
When a supply-chain component points to the template and provides extra param definitions
Then the supply-chain Ready condition is False with reason "ExcessComponentParams"
And there is a helpful message
Examples:
apiVersion: carto.run/v1alpha1
kind: ClusterSourceTemplate
metadata:
name: some-name
spec:
params:
- name: number-of-cat-lives
default: 9
...
---
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: some-name
spec:
selector:
integration-test: "workload-provided-params"
components:
- name: source-provider
templateRef:
kind: ClusterSourceTemplate
name: some-name
params:
- name: number-of-cat-lives
path: $(workload.params[2].value)$
- name: interesting-fact
path: "human head weighs 5 pounds"
results in
apiVersion: carto.run/v1alpha1
kind: ClusterSupplyChain
metadata:
name: some-name
status:
conditions:
- type: Ready
status: "False"
reason: ExcessComponentParams
message: component 'source-provider' does not expect param 'interesting-fact' # <--- does not need to be exactly this
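The validation boils down to a set difference between the params the template declares and the params the component supplies. A minimal sketch; `excessParams` is a hypothetical helper name, not the project's actual validation code.

```go
package main

import "fmt"

// excessParams returns the component-supplied param names the template never
// declared; a non-empty result would drive the ExcessComponentParams condition.
func excessParams(templateParams, componentParams []string) []string {
	declared := make(map[string]bool, len(templateParams))
	for _, p := range templateParams {
		declared[p] = true
	}
	var excess []string
	for _, p := range componentParams {
		if !declared[p] {
			excess = append(excess, p)
		}
	}
	return excess
}

func main() {
	fmt.Println(excessParams(
		[]string{"number-of-cat-lives"},
		[]string{"number-of-cat-lives", "interesting-fact"},
	)) // [interesting-fact]
}
```

Running this at supply-chain reconcile time (rather than at workload stamp time) is what surfaces the error "as soon as possible".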
Migrated from Private Gitlab. Original Author Daniel Chen (@chenbh)
I've seen a couple of places where this whole process is documented as `RFC-0000` or `RFC-0001`: python, kubernetes, rust
"I want to have a supplychain/buildtemplate/etc that is not available cluster-wide"
Given that I have a supplychain/sourcetemplate/buildtemplate/etc
When I create it
Then it is only available to those accounts with access to the namespaces where the objects are at
We previously moved all of our app-operator related resources from namespace-scoped to cluster-scoped (for instance, `SupplyChain` became `ClusterSupplyChain`, with the scope moving from namespace to cluster).
While that is indeed convenient for making, say, a `SupplyChain` widely accessible throughout the cluster (so that it's reused by any of the `Workload`s in their own namespaces),
NS-1
Workload:
metadata:
name: workload
namespace: ns-1
labels: {foo: bar}
==> reconciles against `supplychain-1`
NS-2
Workload:
metadata:
name: workload
namespace: ns-2
labels: {foo: bar}
==> reconciles against `supplychain-1`
NS-3
Workload:
metadata:
name: workload
namespace: ns-3
labels: {foo: bar}
==> reconciles against `supplychain-1`
CLUSTER (objects of cluster-scoped resources have no namespace)
ClusterSupplyChain
metadata:
name: supplychain-1
spec:
selector: {foo: bar}
such a pattern might be frowned upon, given that it leaves no choice other than having such objects never be namespaced (thus, you have to make sure names never clash, regardless of your namespace setup).
I think it'd be better if we could still provide folks the ability to create non-cluster-wide supply chains and templates since, even if re-use is something they want to achieve, they can do so by giving users access (via RBAC primitives) to the namespace where they exist, effectively achieving the goal we pursued with cluster-scoped resources in the first place.
e.g., re-using the example above
CLUSTER
clusterrole:
- readonly on namespace `NS-SHARED`
NS-1
Workload:
metadata:
name: workload
namespace: ns-1
labels: {foo: bar}
spec:
serviceAccountName: default --> bound to `readonly` on `NS-SHARED`
==> reconciles against `ns-shared/supplychain-1`
NS-2
Workload:
metadata:
name: workload
namespace: ns-2
labels: {foo: bar}
spec:
serviceAccountName: default --> bound to `readonly` on `NS-SHARED`
==> reconciles against `ns-shared/supplychain-1`
NS-3
Workload:
metadata:
name: workload
namespace: ns-3
labels: {foo: bar}
spec:
serviceAccountName: default --> bound to `readonly` on `NS-SHARED`
==> reconciles against `ns-shared/supplychain-1`
NS-SHARED
SupplyChain (namespace-scoped)
metadata:
name: supplychain-1
namespace: ns-shared
spec:
selector: {foo: bar}
note that if we do agree on using a Workload-provided `serviceAccountName`, we have a reliance on vmware-tanzu/cartographer-private#39