kusionstack / operating
Manage k8s resources effectively with risk under control.
Home Page: https://www.kusionstack.io/operating/introduction/
License: Apache License 2.0
Update currentRevision to updatedRevision after all Pods are updated and ready.
Updating currentRevision should not depend on Pods becoming available.
Field names in RuleSet should end with Seconds, like:
Interval in https://github.com/KusionStack/kafed/blob/main/apis/apps/v1alpha1/ruleset_types.go#L175
TraceTimeout in https://github.com/KusionStack/kafed/blob/main/apis/apps/v1alpha1/ruleset_types.go#L179
The field name should contain the time unit so as not to confuse users.
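The naming rule can be illustrated with a small Go sketch. The type timingRule and the fields IntervalSeconds and TraceTimeoutSeconds are hypothetical names chosen only to show how a unit suffix removes ambiguity; they are not the fields currently defined in ruleset_types.go.

```go
package main

import (
	"fmt"
	"time"
)

// timingRule is a hypothetical type, not the real RuleSet API.
// The "Seconds" suffix tells the user which unit the integer carries.
type timingRule struct {
	IntervalSeconds     *int64 `json:"intervalSeconds,omitempty"`
	TraceTimeoutSeconds *int64 `json:"traceTimeoutSeconds,omitempty"`
}

// secondsToDuration converts an optional seconds field into a time.Duration,
// falling back to def when the field is unset.
func secondsToDuration(seconds *int64, def time.Duration) time.Duration {
	if seconds == nil {
		return def
	}
	return time.Duration(*seconds) * time.Second
}

func main() {
	interval := int64(5)
	rule := timingRule{IntervalSeconds: &interval}
	fmt.Println(secondsToDuration(rule.IntervalSeconds, time.Second))
	fmt.Println(secondsToDuration(rule.TraceTimeoutSeconds, 30*time.Second))
}
```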
A doc to introduce ResourceConsistent.
Introduce ResourceConsistent:
Change project name to Operating
Get project ready to publish
Support installing kafed with Helm
$ helm repo add kusionstack https://kusionstack.io/charts
$ helm repo update
$ helm install kafed kusionstack/kafed --version v0.1.0
$ helm uninstall kafed
An example can help users walk through the general features provided in kafed v0.1.0.
It serves as a demo of kafed features.
apiVersion: apps.kusionstack.io/v1alpha1
kind: CollaSet
metadata:
  name: server
  namespace: operating-tutorial
spec:
  replicas: 2 # scale down from 3 to 2
  selector:
    matchLabels:
      app: server
  updateStrategy:
    podUpgradePolicy: InPlaceIfPossible
    rollingUpdate:
      byPartition:
        partition: 3
  template:
    metadata:
      labels:
        app: server
    spec:
      containers:
      - image: wu8685/echo:1.3
        name: server
        command:
        - /server
        resources:
          limits:
            cpu: "0.1"
            ephemeral-storage: 100Mi
            memory: 100Mi
          requests:
            cpu: "0.1"
            ephemeral-storage: 100Mi
            memory: 100Mi
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 3
NAMESPACE NAME READY STATUS RESTARTS AGE
kusionstack-system kusionstack-controller-manager-6b6db85868-mgxc2 0/1 Error 8 (5m24s ago) 22m
panic: runtime error: slice bounds out of range [:3] with capacity 2
goroutine 328 [running]:
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.decidePodToUpdateByPartition(0xc000158a00, {0xc00094d8c0?, 0x1b78b80?, 0x2})
/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/update.go:108 +0xb2
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.decidePodToUpdate(0xc00094d850?, {0xc00094d8c0?, 0x2?, 0xc00094c790?})
/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/update.go:85 +0x4f
kusionstack.io/operating/pkg/controllers/collaset/synccontrol.(*RealSyncControl).Update(0xc0003e7e80, 0xc000158a00, {0xc00094d850, 0x2, 0x2}, {0xc00094c790, 0x2, 0x2}, 0xc000108f20, 0xc000745800, ...)
/home/runner/work/operating/operating/pkg/controllers/collaset/synccontrol/sync_control.go:358 +0x193
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).doSync(0xc000324a40, 0x58?, 0x7f21f1aeb5b8?, {0xc00094c790, 0x2, 0x2}, 0x0?)
/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:193 +0x1b5
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).DoReconcile(0xc000339260?, 0x1b88c18?, 0xc000158a00?, {0xc00094c790?, 0x8?, 0xc000860350?}, 0x0?)
/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:174 +0x2b
kusionstack.io/operating/pkg/controllers/collaset.(*CollaSetReconciler).Reconcile(0xc000324a40, {0x1b775b8, 0xc000652b10}, {{{0xc000846168?, 0x186e6a0?}, {0xc000860350?, 0x281b890?}}})
/home/runner/work/operating/operating/pkg/controllers/collaset/collaset_controller.go:164 +0x63b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc000000fa0, {0x1b775b8, 0xc000652ab0}, {{{0xc000846168?, 0x186e6a0?}, {0xc000860350?, 0xc000628380?}}})
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0x22c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000000fa0, {0x1b77510, 0xc0003165c0}, {0x174b160?, 0xc00080fb20?})
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x2f2
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000000fa0, {0x1b77510, 0xc0003165c0})
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x30c
operating version: latest
Kubernetes version: v1.27.3
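The trace suggests that a slice holding 2 Pods was cut with an upper bound of 3, which matches partition: 3 combined with replicas: 2 in the manifest above. Below is a minimal sketch of one possible guard, assuming the partition value is used directly as a slice bound. The helper names clampPartition and firstByPartition are hypothetical, and this is not the project's actual fix.

```go
package main

import "fmt"

// clampPartition bounds a user-supplied partition to [0, n] so that it is
// always a valid upper slice bound for n candidate Pods.
func clampPartition(partition, n int) int {
	if partition < 0 {
		return 0
	}
	if partition > n {
		return n
	}
	return partition
}

// firstByPartition returns at most `partition` Pod names from the front of
// the list and never slices past the end.
func firstByPartition(pods []string, partition int) []string {
	return pods[:clampPartition(partition, len(pods))]
}

func main() {
	pods := []string{"server-a", "server-b"} // replicas: 2
	fmt.Println(firstByPartition(pods, 3))   // partition: 3 no longer panics
}
```

An alternative design would reject partition values larger than replicas in a validating webhook, so that the controller never sees them.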
Resourceconsist will be moved to kusionstack.io/resourceconsist, and the alibabacloudslb controller will be added by importing kusionstack.io/resourceconsist.
Resourceconsist in operating is no longer used. All adapters need to be started by importing kusionstack.io/resourceconsist.
After traffic is turned off, if CollaSet operates on Pods immediately, requests still in flight may fail.
CollaSet provides users an option, operationDelaySeconds, to delay the operation for a while.
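The delay can be sketched as a simple check on elapsed time. This is an illustration under assumptions: the function remainingDelay and the way the traffic-off time is obtained are hypothetical, and only the name operationDelaySeconds comes from the text above.

```go
package main

import (
	"fmt"
	"time"
)

// remainingDelay reports how much longer to wait before operating on a Pod,
// given how long ago its traffic was turned off. Zero means the operation
// may proceed.
func remainingDelay(elapsed time.Duration, operationDelaySeconds int32) time.Duration {
	delay := time.Duration(operationDelaySeconds) * time.Second
	if elapsed >= delay {
		return 0
	}
	return delay - elapsed
}

func main() {
	trafficOffAt := time.Date(2023, 8, 18, 10, 0, 0, 0, time.UTC)
	now := trafficOffAt.Add(4 * time.Second)
	wait := remainingDelay(now.Sub(trafficOffAt), 10)
	if wait > 0 {
		fmt.Println("requeue after", wait)
	} else {
		fmt.Println("operate now")
	}
}
```

A controller would typically requeue the object after the returned duration rather than block while waiting.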
CollaSet feature introduction in v0.1.0
Guide users on how to use CollaSet.
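Canary traffic capability: for example, if an application has 4 nodes and a beta release is used, 1/4 of the traffic goes to the new version, which is a fairly large share. Could a parameter be added to control the traffic volume, such as 10%?
A beta release cannot control traffic precisely, so when releasing in production the impact is large if something goes wrong.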
Build an e2e test framework
to improve kafed quality by running kafed e2e tests.
When upgrading Pods under CollaSet, maxSurge should be supported.
If maxSurge is in effect, a new Pod needs to be created and reach the ServiceAvailable podOpsLifecycle status before the target Pod is upgraded by recreating it.
MaxSurge could provide a more stable operation experience when upgrading Pods under CollaSet.
CollaSet basic features:
Provide CollaSet to run applications based on PodOpsLifecycle.
CollaSets are able to exclude their owned Pods and designate them as orphaned. Conversely, they can also include these orphaned Pods to build the ownerReference relationship.
This supports features such as migrating Pods from other workloads and preserving a specific scene.
Doc to introduce kafed project.
Basic intro for kafed including goal, arch, roadmap, etc.
How about getting rid of the annotation ruleset.kusionstack.io/rulesets
and instead maintaining the ownership as a fieldIndex in the cache?
Reference: https://github.com/KusionStack/kafed/blob/main/pkg/utils/inject/inject.go#L36
This lowers the frequency of Pod updates.
PodOpsLifecycle should handle the case where users delete a Pod directly with kubectl delete pod,
to keep the user experience consistent.
CollaSet supports stable storage like StatefulSet, in a way similar to Spec.VolumeClaimTemplates.
Stable storage support is a standard capability of CollaSet.
We should use a unified format when setting time values as label values.
For example, both of the following are currently used:
time.Now().Format(time.RFC3339)
time.Now()
As a result, label values on a Pod are not in a unified format:
operating.podopslifecycle.kusionstack.io/collaset: "1692352738414032929"
operation-permission.podopslifecycle.kusionstack.io/update: "1692352738"
operation-type.podopslifecycle.kusionstack.io/collaset: update
podopslifecycle.kusionstack.io/pod-instance-id: "1"
pre-checked.podopslifecycle.kusionstack.io/collaset: "1692352738"
A doc to help users get started with kafed.
We need a document to help users install kafed components in their environment and run a simple demo.
Add compensation logic to the service-available calculation.
If the service-available condition is not met, get the resource from the expected finalizer and check whether that resource still selects the Pod.
If the Pod labels or the service selector change, the Pod will never become service-available again unless the expected finalizer is removed correctly.
We can't fully rely on other components to do the cleanup successfully when labels or selectors change.
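The compensation check described above can be sketched with plain label maps. This assumes that each expected finalizer can be mapped to the selector of the resource it stands for; the function names and the finalizer keys in the example are hypothetical, and real selectors may also use set-based expressions that this sketch does not cover.

```go
package main

import (
	"fmt"
	"sort"
)

// selects reports whether every key/value pair in selector is present in
// podLabels. An empty selector is treated as selecting nothing.
func selects(selector, podLabels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for k, v := range selector {
		if got, ok := podLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// staleFinalizers returns the expected finalizers whose resource no longer
// selects the Pod, so they can be ignored or removed when computing
// service-available. The result is sorted for stable output.
func staleFinalizers(podLabels map[string]string, finalizerSelectors map[string]map[string]string) []string {
	var stale []string
	for finalizer, selector := range finalizerSelectors {
		if !selects(selector, podLabels) {
			stale = append(stale, finalizer)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	pod := map[string]string{"app": "server"}
	expected := map[string]map[string]string{
		"example/svc-a": {"app": "server"},
		"example/svc-b": {"app": "legacy"}, // selector changed, no longer matches
	}
	fmt.Println(staleFinalizers(pod, expected))
}
```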
Add some comments to explain the PodOpsLifecycle procedure in the webhook code, like here: https://github.com/KusionStack/kafed/blob/main/pkg/webhook/server/generic/pod/opslifecycle/webhook.go#L123
The code there could also be refactored to be more readable.
Users may be confused when reading this code.
RuleSet features in v0.1.0
Provide risk-control to pod operations
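When releasing containers, we would like the new containers to be deployed first and the old containers removed afterwards.
Currently the old containers are removed first, which puts considerable pressure on the remaining nodes when there are only a few nodes.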
Release automation by github release action.
(binary build & archive & docker build/push & release)
Do not deep copy k8s resources when listing them from the cache.
Deep copying is one of the key factors that lower controller performance. We need to avoid it.
A new interface to record and query employers and employees will be added.
If the employer selector changes and GetCurrentEmployees does not return employees that were added earlier but are no longer selected, the lifecycle finalizers of those unselected employees won't be cleaned.
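The gap described above amounts to a set difference between the employees that were recorded and the employees that are currently selected. The sketch below assumes employees can be identified by name; employeesToClean is a hypothetical helper, not part of the proposed interface.

```go
package main

import (
	"fmt"
	"sort"
)

// employeesToClean returns the recorded employees that are no longer in the
// current selection, i.e. the ones whose lifecycle finalizer still needs to
// be cleaned. The result is sorted for stable output.
func employeesToClean(recorded, current []string) []string {
	selected := make(map[string]struct{}, len(current))
	for _, name := range current {
		selected[name] = struct{}{}
	}
	var stale []string
	for _, name := range recorded {
		if _, ok := selected[name]; !ok {
			stale = append(stale, name)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	recorded := []string{"pod-a", "pod-b", "pod-c"} // added under the old selector
	current := []string{"pod-a"}                    // selected after the selector changed
	fmt.Println(employeesToClean(recorded, current))
}
```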
The ResourceConsist controller will be moved to a separate repo that offers getReconciler/addToMgr.
Built-in adapters will be added to that repo so that commonly used controllers can be started by calling AddBuiltinAdaptersToMgr.
Customized adapters can import the new repo to add a customized controller to the Manager.
This makes customized adapters independent of KusionStack/Operating.
Validation policy in webhook
It is a standard way to validate CRD resources and ensure they work well.
Helm charts index.yaml not found
$ helm repo add kusionstack https://kusionstack.io/charts
Error: looks like "https://kusionstack.io/charts" is not a valid chart repository or cannot be reached: failed to fetch https://kusionstack.io/charts/index.yaml : 404 Not Found
A workload designed to manage additional Pod configurations, such as sidecars, PVC, environment variables, as well as extra labels and annotations. These configurations are independent of the CollaSet control to enable parallel updates.
In many scenarios, configurations such as sidecars, associated PVCs, and environment variables are typically handled by separate teams. This approach allows different departments to manage different aspects of Pod configuration independently.
We need an expected finalizer controller to attach the service-available condition to Pods.
Currently, Pods turn service-available directly if there is no expected finalizer controller to attach the available condition to them.
An introduction covering the background, mechanism and implementation of PodOpsLifecycle.
PodOpsLifecycle is an important feature in kafed, so we should provide a detailed introduction.
ResourceConsistant Controller to reconcile service spec and backend resource status.
It enables users to include traffic control around Pod operation conveniently.
PodOpsLifecycle features in v0.1.0
It is an important feature in kafed, which provides the base for other kafed features.
Only update .goreleaser.yaml to release the linux/arm64 image.
A doc to introduce RuleSet features in v0.1.0
Introduce users to the background of RuleSet and how it works.
Change the label "podopslifecycle.kusionstack.io/control" to "kusionstack.io/control".