open-cluster-management-io / addon-framework


addon apis

License: Apache License 2.0

Makefile 1.13% Go 98.75% Mustache 0.12%

addon-framework's Issues

The agentdeploy controller cannot update the agent manifestWorks in time when the cluster is changed

In some cases the agent ManifestWorks depend on the managed cluster, for example on its annotations or ClusterClaims.
However, the agentdeploy controller only watches the addon and ManifestWork resources, not the cluster, so it cannot update the ManifestWorks in time when the cluster changes.

IndexManifestWorkByAddon ?

addon-framework/pkg/index/index.go

Line 135 in 51742bc

if len(addonName) == 0 || len(addonNamespace) > 0 || isHook {
		return []string{}, nil
	}

Should len(addonNamespace) > 0 be len(addonNamespace) == 0 ?

Add ability to stop updating agent components temporarily

Currently, any modifications to the ManifestWork will be reverted by the addon controller. So, the only modifications that can be made to addon agents on managed clusters are the ones specifically allowed by the addon.

In development or troubleshooting scenarios, it might be helpful to deploy a change to one addon on a cluster without updating the addon controller. For example, to add another argument to an addon agent container in order to enable a feature flag.

In stolostron, several controllers have "pause" annotations that can be placed on their resources, which prevent the controller from reconciling them. Something similar here could be useful.
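A pause check of this kind could look roughly like the sketch below. The annotation key and the function names are made up for illustration; the framework does not define them, and a real controller would read the annotations from the ManagedClusterAddOn or ManifestWork object.

```go
package main

import "fmt"

// pauseAnnotation is a hypothetical key used only in this sketch.
const pauseAnnotation = "addon.open-cluster-management.io/pause-reconcile"

// isPaused reports whether the annotations ask the controller to skip reconciliation.
// Reading from a nil map is safe in Go and yields the empty string.
func isPaused(annotations map[string]string) bool {
	return annotations[pauseAnnotation] == "true"
}

// reconcile stands in for a controller sync function. It returns a short
// description of what it did so the behaviour is easy to observe.
func reconcile(name string, annotations map[string]string) string {
	if isPaused(annotations) {
		// Leave the ManifestWork untouched so manual changes survive.
		return "skipped " + name
	}
	return "reconciled " + name
}

func main() {
	fmt.Println(reconcile("addon-a", map[string]string{pauseAnnotation: "true"}))
	fmt.Println(reconcile("addon-b", nil))
}
```

Checking the annotation at the top of the sync function keeps the pause decision in one place, so removing the annotation resumes normal reconciliation on the next event.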

Stick with one term, either 'AddOn' or 'Addon'

I find that we mix AddOn and Addon in the documentation and the code, which can sometimes lead to hiccups when operating the OCM infrastructure. For example, the CRD names all use AddOn, while the interface definitions slipped to Addon. I think we should stick with one convention to avoid confusion and misoperation.

cc @qiujian16

Use manifestwork status to update addon status

Some agents use a lease to maintain the status of the ManagedClusterAddOn. We should have an approach in addon-framework that lets the addon-controller update the available status of the ManagedClusterAddOn. It could be based on:

feedback status from manifestwork
other customized way.

WithAgentHealthProber(...) should pass ManagedClusterAddon to correctly detect agent namespace

Currently WithAgentHealthProber() does not have access to the agent namespace. This is required for workapi type prober to correctly identify the workloads for a given cluster.

Changing an addon to be hosted mode does not delete old ManifestWork

If I define a ManagedClusterAddOn like this:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: config-policy-controller
  namespace: cluster2
spec:
  installNamespace: cluster2-grc

Then change it to hosted mode:

apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  annotations:
    addon.open-cluster-management.io/hosting-cluster-name: cluster1
  name: config-policy-controller
  namespace: cluster2
spec:
  installNamespace: cluster2-grc

The ManifestWorks for both deployment types remain:

❯ kubectl get manifestwork -A
NAMESPACE   NAME                                                       AGE
cluster1    addon-config-policy-controller-deploy-hosting-cluster2-0   3m18s
cluster2    addon-config-policy-controller-deploy-0                    10m

Opt-out from "InstallAll" strategy to offload some addons on purpose

Currently the addon-framework lets developers automatically install addons on managed clusters as new clusters are discovered, by prescribing InstallStrategy=InstallAll. This is a great feature for installing addons fluently without manual operations. However, in some cases a hub admin may want to (1) opt out of certain addons or (2) install addons only on a selected set of clusters in the first place. For (1), I currently have no idea how to uninstall addons, even manually. For (2), I think we could have a new install strategy based on label selection or claim selection.

cc @qiujian16
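The label-selection idea in (2) can be sketched as follows. This is only an illustration using plain maps; matchesSelector and selectClusters are hypothetical names, not framework APIs, and a real strategy would work on ManagedCluster objects.

```go
package main

import (
	"fmt"
	"sort"
)

// matchesSelector reports whether every key/value pair in selector is present
// in clusterLabels. An empty selector matches every cluster, which mirrors
// the InstallAll behaviour.
func matchesSelector(clusterLabels, selector map[string]string) bool {
	for k, want := range selector {
		if got, ok := clusterLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// selectClusters returns the sorted names of clusters whose labels satisfy selector.
func selectClusters(clusters map[string]map[string]string, selector map[string]string) []string {
	var out []string
	for name, labels := range clusters {
		if matchesSelector(labels, selector) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	clusters := map[string]map[string]string{
		"cluster1": {"env": "prod"},
		"cluster2": {"env": "dev"},
	}
	fmt.Println(selectClusters(clusters, map[string]string{"env": "prod"}))
}
```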

Support for multiple resources of the same GVR in the ManagedClusterAddOn resource

Request/Feature

Add support for multiple resources of the same GVR in .spec.configs section of the ManagedClusterAddOn resource. The goal is that AddOns can then iterate over ManagedClusterAddOn.Status.ConfigReferences and have access to multiple resources with the same GVR.

Use Case

Our AddOn (rhobs/multicluster-observability-addon) can configure ClusterLogForwarding on a fleet of clusters to forward logs to a central Loki instance. For this, we need to provide the addon with a configmap containing the authentication method that should be used, a configmap with the Loki URL that should be used for that specific spoke cluster, and finally, if we are using mTLS, a configmap with the CA bundle of the Loki instance.

In total that's 3 configmaps that we have to add to the .spec.configs section of the resource ManagedClusterAddOn when we create it in the namespace of a spoke cluster.

Allow triggering addon reconciliation

I am trying to create an addon for OCM; the goal is to deploy a workload based on an external configuration.

I managed to follow the path of the "hello world" addon, deploying the "hello-world" configmap into the managed/target cluster.

However, I would like to have a more dynamic workload deployed to the managed cluster. To my understanding, whatever the addon's Manifest method returns will be deployed on the managed cluster.

I just can't find a way to re-trigger this reconcile loop, so that a change in our system causes the Manifest function to be evaluated again and the new state to be rendered.

`AddOnDeloyment` is missing a `p`

AddOnDeloyment --> AddOnDeployment

Affects:

GetAddOnDeloymentConfigValues
NewAddOnDeloymentConfigGetter

Not sure the best route for handling this, but a deprecation note and having the misspelled function point to the new function might work? It's a minor thing, but I think fixing it is important.
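The deprecation route suggested above might look like the sketch below. The signatures are deliberately simplified stand-ins (the real functions take different arguments); only the two function names come from the issue, and the Values type here is an assumption for illustration.

```go
package main

import "fmt"

// Values is a simplified stand-in for the framework's values type.
type Values map[string]interface{}

// GetAddOnDeploymentConfigValues is the correctly spelled entry point.
// Its body is a placeholder for whatever the real function computes.
func GetAddOnDeploymentConfigValues(nodeSelector map[string]string) Values {
	return Values{"NodeSelector": nodeSelector}
}

// GetAddOnDeloymentConfigValues keeps the misspelled name compiling for
// existing callers by forwarding to the corrected function.
//
// Deprecated: use GetAddOnDeploymentConfigValues instead.
func GetAddOnDeloymentConfigValues(nodeSelector map[string]string) Values {
	return GetAddOnDeploymentConfigValues(nodeSelector)
}

func main() {
	fmt.Println(GetAddOnDeloymentConfigValues(map[string]string{"kubernetes.io/os": "linux"}))
}
```

A doc comment paragraph starting with "Deprecated:" is the Go convention that linters and editors recognise, so existing callers get a warning while their code keeps building.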

Installed ManagedClusterAddon not removed after unregistering the ManagedCluster

After deleting a managed cluster from the hub, the cluster namespace still remains and the ManagedClusterAddons under that cluster namespace are also not stripped from the hub cluster.

I'm not sure if this is the expected behavior, but clarifying the deletion order of these per-cluster resources in the hub data-plane will be helpful.

Supports managing addon components in hub cluster

Currently a typical addon consists of (1) an addon manager and (2) an addon agent. The addon manager is sometimes responsible not only for delivering resources to the managed clusters upon the addon's enablement, but also for some server components in the hub cluster that are paired with the agent components. For example, in the cluster-proxy addon, we deploy ANP servers upon global addon enablement, to which the addon agent will connect.

In the long term, the addon framework should help addon developers easily manage the components in the hub.

Allow non-K8S addon hooks

To me it looks like currently there are "pre deletion" hooks, which make use of the finalizer of a managed addon resource.

However, they only allow one to spawn a pod or a job. Assuming one wants to simply reach to an external API, that would mean spawning a job, with a simple HTTP call.

It would be great if the Agent interface would allow for a PreDelete function, which would handle the same, and report back with: ok, need-more-time, failed. Which is basically the same as a job would, just without spawning additional pods.
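One possible shape for such a hook is sketched below. None of these names exist in the framework; PreDeleter, HookResult and externalAPICleanup are hypothetical and only illustrate the three outcomes (ok, need-more-time, failed) described above.

```go
package main

import (
	"errors"
	"fmt"
)

// HookResult is a hypothetical outcome type for an in-process pre-delete hook.
type HookResult int

const (
	HookDone         HookResult = iota // cleanup finished, finalizer can go
	HookNeedMoreTime                   // cleanup still running, retry later
	HookFailed                         // cleanup failed, surface the error
)

// PreDeleter is a hypothetical optional interface an agent addon could implement.
type PreDeleter interface {
	PreDelete(clusterName string) (HookResult, error)
}

// externalAPICleanup simulates de-registering a cluster from an external API.
// pending counts how many more calls are needed before the cleanup completes.
type externalAPICleanup struct{ pending int }

func (c *externalAPICleanup) PreDelete(clusterName string) (HookResult, error) {
	if clusterName == "" {
		return HookFailed, errors.New("cluster name is empty")
	}
	if c.pending > 0 {
		c.pending--
		return HookNeedMoreTime, nil
	}
	return HookDone, nil
}

// canRemoveFinalizer shows how a controller might act on the result:
// the finalizer is removed only once the hook reports HookDone.
func canRemoveFinalizer(h PreDeleter, clusterName string) bool {
	res, err := h.PreDelete(clusterName)
	return err == nil && res == HookDone
}

func main() {
	hook := &externalAPICleanup{pending: 1}
	fmt.Println(canRemoveFinalizer(hook, "cluster1")) // first call: cleanup still in progress
	fmt.Println(canRemoveFinalizer(hook, "cluster1")) // second call: cleanup finished
}
```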

Add support to configure helm-agent-addon namespace.

Problem Statement:

We are creating a helm-agent-addon using the installStrategy feature defined in the ClusterManagementAddon API. Currently we can't configure the namespace dynamically, as ocm:issues:298 describes. In our use case, we need the ability to configure the namespace for the helm-agent-addon dynamically.
Additionally, we want to leverage Helm chart built-in values such as {{ .Release.Namespace }} and {{ .Release.Name }} to set the values of the helm-agent-addon.

Proposed Solution:

We propose adding a feature/enhancement that allows for the dynamic configuration of the helm-agent-addon namespace. This would enable users to specify the namespace at build time through some configuration parameters.

Use Case Example:

For example, in our use case, we want to set the helm-agent-addon namespace using the Helm chart built-in value {{ .Release.Namespace }}, which would ensure consistency with the parent Helm release.

Mysterious values referenced in helloworld template

https://github.com/open-cluster-management-io/addon-framework/blob/main/examples/helloworld/manifests/templates/deployment.yaml references some values that are not explained in the documentation and, I suspect, may be outdated.

In that template I see the following mysterious value references.

.Tolerations
.NodeSelector
.HTTPProxy
.HTTPSProxy
.NoProxy

Looking in https://github.com/open-cluster-management-io/addon-framework/blob/main/pkg/addonfactory/addondeploymentconfig.go I found ToAddOnNodePlacementValues and ToAddOnProxyConfigValues. These support the following references from a template.

.tolerations
.global.nodeSelector
.global.proxyConfig.HTTP_PROXY
.global.proxyConfig.HTTPS_PROXY
.global.proxyConfig.NO_PROXY

They also support .global.proxyConfig.PROXY_CA_BUNDLE but the example template does not use anything like that.

I also found ToAddOnDeloymentConfigValues, which provides values named Tolerations and NodeSelector but nothing for proxy config. This function is not mentioned in https://open-cluster-management.io/developer-guides/addon/ .

RFE: UpdateStrategy in AgentAddonOptions

Today it is not possible to specify in AgentAddonOptions the UpdateStrategy.
It would be useful to specify that "create only" or "server side apply" is to be used for some manifests containing configuration data so that regular changes made by users do not get overwritten by an update of the resource pushed by the addon as described here:
https://open-cluster-management.io/concepts/manifestwork/#resource-race-and-adoption

If you find this request meaningful I am happy to try to contribute the implementation.

Updating cluster management addon conflict

version: 0.7.0

Running multiple addon managers using the addon-framework 0.7.0 in a hub cluster, the addon managers' log showed some errors:

# logs of the addon1

E0721 09:15:03.025319       1 base_controller.go:159] "management-addon-config-controller" controller failed to sync "addon2", err: clustermanagementaddons.addon.open-cluster-management.io "addon2" is forbidden: User "system:serviceaccount:multicluster-engine:addon1-sa" cannot patch resource "clustermanagementaddons/status" in API group "addon.open-cluster-management.io" at the cluster scope

# logs of the addon2

E0721 09:15:03.025319       1 base_controller.go:159] "management-addon-config-controller" controller failed to sync "addon1", err: clustermanagementaddons.addon.open-cluster-management.io "addon1" is forbidden: User "system:serviceaccount:multicluster-engine:addon2-sa" cannot patch resource "clustermanagementaddons/status" in API group "addon.open-cluster-management.io" at the cluster scope

From the logs, it seems that the management-addon-config-controller in the addon manager is trying to update other ClusterManagementAddOns, so we need to filter the ClusterManagementAddOns when running the management-addon-config-controller.
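The filtering described above could be as simple as a name-based predicate applied before a key is synced. This is a sketch under that assumption; ownedAddons is a hypothetical helper, not part of addon-framework.

```go
package main

import "fmt"

// ownedAddons returns a filter that accepts only the ClusterManagementAddOn
// names this addon manager registered.
func ownedAddons(names ...string) func(string) bool {
	set := make(map[string]struct{}, len(names))
	for _, n := range names {
		set[n] = struct{}{}
	}
	return func(name string) bool {
		_, ok := set[name]
		return ok
	}
}

func main() {
	// The manager for addon1 should ignore events for addon2.
	filter := ownedAddons("addon1")
	for _, key := range []string{"addon1", "addon2"} {
		if !filter(key) {
			fmt.Println("ignoring", key)
			continue
		}
		fmt.Println("syncing", key)
	}
}
```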

Move common helper workapplier and workbuilder libraries to API repo

For an external project that is going to be a consumer of the OCM API but not the addon framework, I want to leverage the library functions in workapplier and workbuilder in https://github.com/open-cluster-management-io/addon-framework/tree/main/pkg/common.
It would be nice if I only needed to import the API repo and not the addon-framework repo.

CC @zhiweiyin318 @qiujian16

Extensible hub addon manager by integrating controller-runtime

The addon manager can also be a normal Kubernetes operator in the hub cluster. It would be better if the addon-framework could help addon developers extend the addon manager with controller behaviors. The latest version of controller-runtime supports receiving external events via source.Channel, and the reconcile loop receives a context that we can extend to plumb the target cluster name. We could have a few integration utilities that help us easily build a controller based on the widely used controller-runtime.

50k limit on ManifestWork not enough to define larger/more complex add-ons

The current add-on framework creates a single ManifestWork that includes all the resources the add on requires to run in the managed clusters.

There is a 50k limit on the size of a ManifestWork resource, which creates a problem when attempting to deploy more complex add-ons that define many resources. A single CRD can easily exceed 50k on its own; for example, kustomizations.kustomize.toolkit.fluxcd.io is 60k.

Suggestion is to make the limit on a single ManifestWork configurable (primarily for large CRDs or ConfigMaps) AND update the addon controller to create a ManifestWork per resource (or similar) rather than bundling them all together into a single ManifestWork.
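The "ManifestWork per resource (or similar)" idea can be approximated by greedily packing manifests into groups that stay under a configurable size. The sketch below works on raw byte slices; splitBySize is a hypothetical helper and does not reflect how the framework builds ManifestWorks.

```go
package main

import "fmt"

// splitBySize greedily groups manifests so that each group's total size stays
// within limit. A manifest that is larger than limit on its own still gets a
// group to itself, so the caller can detect and report it.
func splitBySize(manifests [][]byte, limit int) [][][]byte {
	var groups [][][]byte
	var current [][]byte
	size := 0
	for _, m := range manifests {
		// Start a new group when adding m would push the current one over the limit.
		if len(current) > 0 && size+len(m) > limit {
			groups = append(groups, current)
			current, size = nil, 0
		}
		current = append(current, m)
		size += len(m)
	}
	if len(current) > 0 {
		groups = append(groups, current)
	}
	return groups
}

func main() {
	manifests := [][]byte{make([]byte, 20), make([]byte, 20), make([]byte, 20)}
	for i, g := range splitBySize(manifests, 50) {
		fmt.Printf("work %d holds %d manifest(s)\n", i, len(g))
	}
}
```

Greedy packing keeps the original manifest order, which matters if resources such as CRDs must be applied before the objects that use them.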

Support change installNamespace on agent.

Support for using Cluster Labels in AddonTemplate

As part of AddonTemplate, it would be great to be able to read values from cluster labels.

For example, if I set some hostname in a managed cluster label, the addon manager can read that label key and use it while applying the manifest.

This way, we can use the variable name and maybe also use a cluster-specific value.
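To make the request concrete, here is a sketch that renders a manifest from cluster labels with Go's text/template. The .ClusterLabels variable and the renderWithLabels helper are assumptions for illustration only; they are not the AddonTemplate syntax.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderWithLabels renders manifest with the cluster's labels exposed under
// the hypothetical .ClusterLabels variable. A reference to a label that the
// cluster does not carry is reported as an error instead of rendering empty.
func renderWithLabels(manifest string, labels map[string]string) (string, error) {
	tmpl, err := template.New("manifest").Option("missingkey=error").Parse(manifest)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := map[string]interface{}{"ClusterLabels": labels}
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderWithLabels(
		"host: {{ .ClusterLabels.hostname }}",
		map[string]string{"hostname": "api.example.com"},
	)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```

Label keys that contain dots or slashes cannot be reached with the field syntax used here, so a real design would need a lookup function or a different variable scheme.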

Allow reporting status from `Manifest` function of `AgentAddon`

When the agent reconciles the target's state in the Manifest function, it might encounter an error that, at that moment, prevents it from reconciling.

However, there seems to be no proper way to record this, other than returning an error, which will re-trigger the reconciliation.

I think it would be helpful if the agent could contribute to the conditions of the ManagedClusterAddOn resource.

helm addon: GetSpecHash unable to handle ConfigMap and Secret resources

Describe the bug

As far as I understand, the function GetSpecHash is used to generate a hash of the Spec content of a k8s resource. This hash is stored in the status of ManagedClusterAddOn and used by the framework to know whether the current state of the ManifestWorks is up to date. The problem arises when we use, as config resources, k8s resources that don't have a Spec field, such as ConfigMaps or Secrets: GetSpecHash returns an error for them.

To Reproduce

Deploy a ManagedClusterAddOn that has in its config a ConfigMap or Secret

Expected behavior

The framework should not discriminate these two resources and it should handle them properly

Additional context

Problem is with

addon-framework/pkg/utils/helpers.go

Lines 354 to 371 in 71f1b13

    
func GetSpecHash(obj *unstructured.Unstructured) (string, error) {
	if obj == nil {
		return "", fmt.Errorf("object is nil")
	}
	spec, ok := obj.Object["spec"]
	if !ok {
		return "", fmt.Errorf("object has no spec field")
	}
	specBytes, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	hash := sha256.Sum256(specBytes)
	return fmt.Sprintf("%x", hash), nil
}
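One way to handle such resources is to hash whichever content-bearing fields are present instead of requiring spec. The sketch below uses a plain map (the shape of unstructured.Unstructured.Object) so it needs only the standard library. contentHash and the contentFields list are assumptions, not the framework's fix, and the result differs from what GetSpecHash returns for spec-only objects because the field name is included in the hashed payload.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"errors"
	"fmt"
)

// contentFields lists the top-level fields treated as an object's content.
// ConfigMaps use data and binaryData; Secrets use data and stringData.
var contentFields = []string{"spec", "data", "binaryData", "stringData"}

// contentHash hashes the content-bearing fields of an object.
func contentHash(obj map[string]interface{}) (string, error) {
	if obj == nil {
		return "", errors.New("object is nil")
	}
	content := map[string]interface{}{}
	for _, f := range contentFields {
		if v, ok := obj[f]; ok {
			content[f] = v
		}
	}
	if len(content) == 0 {
		return "", errors.New("object has no spec, data, binaryData or stringData field")
	}
	// json.Marshal sorts map keys, so the hash is stable across runs.
	b, err := json.Marshal(content)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(b)), nil
}

func main() {
	configMap := map[string]interface{}{
		"kind": "ConfigMap",
		"data": map[string]interface{}{"url": "https://loki.example.com"},
	}
	hash, err := contentHash(configMap)
	if err != nil {
		fmt.Println("hash failed:", err)
		return
	}
	fmt.Println(hash)
}
```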

Support change `installNamespace` of agent addons.

Currently, the agent addon is installed in the namespace open-cluster-management-agent-addon by default.

If a user is using the WithInstallStrategy function, then the installNamespace cannot be changed.

Could we provide a way for users to change the installNamespace of an agent addon at runtime?

Suggestions on README

Hi, I'm a new learner. While following the README to get started with addons, I found some problems in "Deploy the helloworld addon".

The command make images needs https://github.com/openshift/imagebuilder to be installed and configured first.
This would be better:

$ go get github.com/openshift/imagebuilder/cmd/[email protected]
$ export PATH=$PATH:$(go env GOPATH)/bin
$ make images

The exported environment variable EXAMPLE_IMAGE_NAME is not used.
The kind load command can be modified:

kind load docker-image  $EXAMPLE_IMAGE_NAME --name <your-kind-cluster-name> # kind load docker-image  $EXAMPLE_IMAGE_NAME --name cluster1

`make images` fails

The build seems to be done within a container, so I don't think my environment is at fault here.

$ make images
imagebuilder --allow-pull -t quay.io/open-cluster-management/helloworld-addon -f ./Dockerfile .
--> Image registry.ci.openshift.org/open-cluster-management/builder:go1.16-linux was not found, pulling ...
--> Pulled 2/3 layers, 67% complete
--> Pulled 2/3 layers, 70% complete
--> Pulled 2/3 layers, 73% complete
--> Pulled 2/3 layers, 77% complete
--> Pulled 2/3 layers, 81% complete
--> Pulled 2/3 layers, 86% complete
--> Pulled 2/3 layers, 93% complete
--> Pulled 2/3 layers, 98% complete
--> Pulled 3/3 layers, 100% complete
--> Extracting
--> FROM registry.ci.openshift.org/open-cluster-management/builder:go1.16-linux as builder
--> WORKDIR /go/src/open-cluster-management.io/addon-framework
--> COPY . .
--> ENV GO_PACKAGE open-cluster-management.io/addon-framework
--> RUN make build --warn-undefined-variables
go: inconsistent vendoring in /go/src/open-cluster-management.io/addon-framework:
	github.com/containerd/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/containers/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/fsouza/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/golang/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/klauspost/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/onsi/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/openshift/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	golang.org/x/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	google.golang.org/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt
	github.com/onsi/[email protected]: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod

	To ignore the vendor directory, use -mod=readonly or -mod=mod.
	To sync the vendor directory, run:
		go mod vendor
vendor/github.com/openshift/build-machinery-go/make/targets/golang/build.mk:13: *** no packages to build: GO_BUILD_PACKAGES_EXPANDED var is empty.  Stop.
running 'make build --warn-undefined-variables' failed with exit code 2
make: *** [image-helloworld-addon] Error 1

Since just using EXAMPLE_IMAGE_NAME=quay.io/open-cluster-management/helloworld-addon:latest as suggested in the README does not work either (apparently there is no such image), I was unsuccessful in trying this framework out :-(
