rollouts's People

Contributors

asa3311, chenxi-seu, fillzpp, furykerry, jackie1457, kuromesi, likakuli, mrsumeng, specialyang, veophi, wangyikewxgm, xiao-jay, yadan-wei, yangsoon, yike21, zhangsetsail, zhengjr9, zmberg


rollouts's Issues

Kruise Rollout: Traffic Graying, Batch Release

What would you like to be added:

Kruise Rollout Features:

  • Traffic graying (Canary, Blue-Green Release, A/B Test)
  • Batch release (Deployment, CloneSet)
  • Post-release check (Metrics Analysis, hook)

CRD is defined as follows:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  strategy:
    canary:
      steps:
        ## the first 20% grayed out replicas, and 20% of the traffic
        - weight: 20
          ## Manual confirmation of continuation
          pause: {}
      trafficRouting: 
        service: echoserver  # required
        ## nginx ingress 
        type: nginx
        nginx:
          ingress: echoserver  # required
  objectRef: 
    workloadRef: 
      apiVersion: apps/v1
      kind: Deployment
      name: echoserver

How about adding paused to RolloutPause?

Regarding the existing field

// Duration the amount of time to wait before moving to the next step.

how about adding paused to the RolloutPause struct, like:

type RolloutPause struct {
	// Duration the amount of time to wait before moving to the next step.
	// +optional
	Duration *int32 `json:"duration,omitempty"`
      
	// Paused indicates whether the deploy should be paused or resumed.
	// +optional
	Paused *bool `json:"paused,omitempty"`
}

Paused indicates whether the deploy should be paused or resumed: set it to true to pause the deploy, and change it to false to resume. Duration alone cannot control this.
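A minimal sketch of how a reconciler might decide whether to advance past a step, assuming the proposed paused field is added alongside duration (this is not the actual controller code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RolloutPause:
    duration: Optional[int] = None  # seconds to wait before the next step
    paused: Optional[bool] = None   # explicit pause/resume switch (proposed)

def may_advance(pause: RolloutPause, elapsed_seconds: int) -> bool:
    """Return True if the rollout may move on to the next step."""
    if pause.paused:                  # an explicit pause always wins
        return False
    if pause.duration is not None:    # timed pause: wait it out
        return elapsed_seconds >= pause.duration
    # pause: {} with neither field set -> wait for manual confirmation,
    # which under this proposal means setting paused to false
    return pause.paused is False
```

With this reading, duration keeps its current behavior, and flipping paused from true to false is the manual "resume" action the issue asks for.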

[Feature] Rollout support A/B Testing release

Rollout support A/B Testing release:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  objectRef:
    ...
  strategy:
    canary:
      steps:
      - matches:
        - headers:
          - name: user
            value: xiaoming
        pause: {}
        replicas: 20%
      trafficRoutings:
        # echoserver service name
      - service: echoserver
        # echoserver ingress name, current only nginx ingress
        ingress:
          name: echoserver

Canary release cannot proceed

When the process reaches the "canary verification succeeded, complete the release" step, it cannot proceed. The detailed state is as follows:

Version info:
(screenshot)

The YAML under test:
https://kubevela.io/docs/tutorials/k8s-object-rollout

Related resource status:
(screenshot)

Related logs:

rollout pod:
(screenshot)

kubectl describe application canary-demo

Name:         canary-demo
Namespace:    default
Labels:       <none>
Annotations:  app.oam.dev/publishVersion: v2
              oam.dev/kubevela-version: v1.7.4
API Version:  core.oam.dev/v1beta1
Kind:         Application
Metadata:
  Creation Timestamp:  2023-04-27T04:00:28Z
  Finalizers:
    app.oam.dev/resource-tracker-finalizer
  Generation:  2
  Managed Fields:
    API Version:  core.oam.dev/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:oam.dev/kubevela-version:
        f:finalizers:
          .:
          v:"app.oam.dev/resource-tracker-finalizer":
    Manager:      kubevela
    Operation:    Update
    Time:         2023-04-27T04:00:36Z
    API Version:  core.oam.dev/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:app.oam.dev/publishVersion:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:components:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2023-04-27T04:00:56Z
    API Version:  core.oam.dev/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:appliedResources:
        f:conditions:
        f:latestRevision:
          .:
          f:name:
          f:revision:
          f:revisionHash:
        f:observedGeneration:
        f:services:
        f:status:
        f:workflow:
          .:
          f:appRevision:
          f:contextBackend:
          f:finished:
          f:mode:
          f:startTime:
          f:status:
          f:steps:
          f:suspend:
          f:terminated:
    Manager:         kubevela
    Operation:       Update
    Subresource:     status
    Time:            2023-04-27T04:01:56Z
  Resource Version:  3019240
  UID:               5161a2d1-b736-475b-a995-84c80287dfe3
Spec:
  Components:
    Name:  canary-demo
    Properties:
      Objects:
        API Version:  apps/v1
        Kind:         Deployment
        Metadata:
          Name:  canary-demo
        Spec:
          Replicas:  5
          Selector:
            Match Labels:
              App:  demo
          Template:
            Metadata:
              Labels:
                App:  demo
            Spec:
              Containers:
                Image:  boot.powerk8s.cn/test-project/rust_web_test:2.0.0
                Name:   demo
                Ports:
                  Container Port:  8080
        API Version:               v1
        Kind:                      Service
        Metadata:
          Labels:
            App:      demo
          Name:       canary-demo
          Namespace:  default
        Spec:
          Ports:
            Name:         http
            Port:         8080
            Protocol:     TCP
            Target Port:  8080
          Selector:
            App:      demo
        API Version:  networking.k8s.io/v1
        Kind:         Ingress
        Metadata:
          Labels:
            App:      demo
          Name:       canary-demo
          Namespace:  default
        Spec:
          Ingress Class Name:  nginx
          Rules:
            Host:  canary-demo.com
            Http:
              Paths:
                Backend:
                  Service:
                    Name:  canary-demo
                    Port:
                      Number:  8080
                Path:          /version
                Path Type:     ImplementationSpecific
    Traits:
      Properties:
        Canary:
          Steps:
            Weight:  20
            Weight:  90
          Traffic Routings:
            Type:  ingress
      Type:        kruise-rollout
    Type:          k8s-objects
Status:
  Applied Resources:
    API Version:  apps/v1
    Creator:      workflow
    Kind:         Deployment
    Name:         canary-demo
    Namespace:    default
    API Version:  v1
    Creator:      workflow
    Kind:         Service
    Name:         canary-demo
    Namespace:    default
    API Version:  networking.k8s.io/v1
    Creator:      workflow
    Kind:         Ingress
    Name:         canary-demo
    Namespace:    default
    API Version:  rollouts.kruise.io/v1alpha1
    Creator:      workflow
    Kind:         Rollout
    Name:         canary-demo
    Namespace:    default
  Conditions:
    Last Transition Time:  2023-04-27T04:00:27Z
    Reason:                Available
    Status:                True
    Type:                  Parsed
    Last Transition Time:  2023-04-27T04:00:27Z
    Reason:                Available
    Status:                True
    Type:                  Revision
    Last Transition Time:  2023-04-27T04:00:28Z
    Reason:                Available
    Status:                True
    Type:                  Policy
    Last Transition Time:  2023-04-27T04:00:28Z
    Reason:                Available
    Status:                True
    Type:                  Render
  Latest Revision:
    Name:               canary-demo-v2
    Revision:           2
    Revision Hash:      a08d1df877cb296f
  Observed Generation:  2
  Services:
    Healthy:    true
    Message:    Rollout is in step(2/2), and upgrade workload to new version
    Name:       canary-demo
    Namespace:  default
    Traits:
      Healthy:  false
      Message:  Rollout is in step(2/2), and upgrade workload to new version
      Type:     kruise-rollout
    Workload Definition:
      API Version:  
      Kind:         
  Status:           runningWorkflow
  Workflow:
    App Revision:  v2
    Context Backend:
      API Version:  v1
      Kind:         ConfigMap
      Name:         workflow-canary-demo-context
      Namespace:    default
      UID:          0d762543-4114-4b6f-87cf-c2aebc57316d
    Finished:       false
    Mode:           DAG-DAG
    Start Time:     2023-04-27T04:00:56Z
    Status:         executing
    Steps:
      First Execute Time:  2023-04-27T04:00:56Z
      Id:                  nz6bq8ypkd
      Last Execute Time:   2023-04-27T04:01:48Z
      Message:             wait healthy
      Name:                canary-demo
      Phase:               running
      Reason:              Wait
      Type:                apply-component
    Suspend:               false
    Terminated:            false
Events:
  Type    Reason           Age                  From         Message
  ----    ------           ----                 ----         -------
  Normal  Applied          2m2s                 Application  Workflow finished
  Normal  Deployed         2m2s (x2 over 2m2s)  Application  Deployed successfully
  Normal  PolicyGenerated  92s (x5 over 2m2s)   Application  Policy generated successfully
  Normal  Rendered         92s (x5 over 2m2s)   Application  Rendered successfully
  Normal  Parsed           91s (x6 over 2m3s)   Application  Parsed successfully
  Normal  Revisioned       91s (x6 over 2m3s)   Application  Revisioned successfully

Support releasing within a time window

Sometimes, due to business characteristics, we want rolling updates to run within a specific time range and avoid peak hours, rather than simply gating them with .pause. Relying on .pause alone is a significant burden for operators: every change would require calculating the right moment before publishing, instead of being able to submit changes at any time.

Improve the current Deployment batch release solution

What would you like to be added:

The current batch release implementation for Deployment can, in an extreme case (shown below), temporarily double the resources in use; a follow-up optimization will address this:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  strategy:
    canary:
      steps:
        ## the first 20% grayed out replicas, and 20% of the traffic
        - weight: 20
          ## Manual confirmation of continuation
          pause: {}
        - weight: 40
          pause: {duration: 60}
        - weight: 60
          pause: {duration: 60}
        - weight: 80
          pause: {duration: 60}
        - weight: 100
          pause: {duration: 60}
  objectRef:
    workloadRef:
      apiVersion: apps/v1
      kind: Deployment
      name: echoserver

So currently for Deployment Rollout, it is recommended to configure it in the following way:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  strategy:
    canary:
      steps:
        ## the first 20% grayed out replicas, and 20% of the traffic
        - weight: 20
          ## Manual confirmation of continuation
          pause: {}
      trafficRouting: 
        service: echoserver  # required
        ## nginx ingress 
        type: nginx
        nginx:
          ingress: echoserver  # required
  objectRef: 
    workloadRef: 
      apiVersion: apps/v1
      kind: Deployment
      name: echoserver

[Feature] Advanced workload selector

So far, users use "workloadRef" to specify the target workload. We need a new feature that lets users specify a customized workload selector, such as a label selector.
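A hypothetical sketch of what such a selector could look like in the Rollout spec (the workloadSelector field below does not exist today; it is only an illustration of the request):

```yaml
apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  objectRef:
    # hypothetical alternative to workloadRef
    workloadSelector:
      apiVersion: apps/v1
      kind: Deployment
      matchLabels:
        app: echoserver
```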

Is only the weight approach supported?

strategy:
  canary:
    # canary published, e.g. 20%, 40%, 60% ...
    steps:
    # routing 5% of traffic to the new version
    - weight: 5
      # Manual confirmation of the release of the remaining pods
      pause: {}
      # optional, the replicas released in the first step. If not set, the default is to use 'weight', i.e. 5% here.
      replicas: 20%

Does this configuration only support weight, or are there other options?
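My reading of the weight/replicas interaction above, as a minimal sketch (not the controller's actual code): replicas takes precedence for the number of grayed-out pods, otherwise weight% of the total is used.

```python
import math
from typing import Optional

def canary_replicas(total: int, weight: int, replicas: Optional[str] = None) -> int:
    """Pods to gray out in a step: `replicas` (a percentage string or an
    absolute count) takes precedence; otherwise fall back to `weight`%."""
    if replicas is not None:
        if replicas.endswith("%"):
            return math.ceil(total * int(replicas[:-1]) / 100)
        return int(replicas)
    return math.ceil(total * weight / 100)
```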

kruise rollout could support shifting all traffic to the gray environment and then deleting the old production environment when the rollout is done.

Currently, when a rollout finishes, the production environment first performs a rolling update while production traffic is still routed to it, then all traffic is shifted to the production environment, and finally the gray environment is deleted. This causes a problem: if two applications A and B have a call dependency and roll their pods concurrently, traffic can get confused: traffic from A's production environment may flow to B's gray environment.

Because the gray environment would become the new production environment, the gray workload name should be configurable by the user when the rollout starts.

Deploy Business Application error

env: mac, minikube
The YAML file is https://github.com/openkruise/rollouts/blob/master/docs/tutorials/basic_usage.md#1-deploy-business-application-contains-deployment-service-and-ingress
Problem: when I run kubectl apply -f echoserver.yaml and then kubectl get pods, the pods go into an error state, such as:

NAME                                                             READY   STATUS    RESTARTS      AGE
echoserver-56b6c7cc94-5qslt                                      0/1     Error     2 (17s ago)   22s
echoserver-56b6c7cc94-dwqqz                                      0/1     Error     2 (18s ago)   22s
echoserver-56b6c7cc94-pt6th                                      0/1     Error     2 (17s ago)   22s
echoserver-56b6c7cc94-tkw26                                      0/1     Error     2 (17s ago)   22s
echoserver-56b6c7cc94-xggxs                                      0/1     Error     2 (17s ago)   22s
➜  kubenetes kubectl describe pod echoserver-56b6c7cc94-5qslt
Name:         echoserver-56b6c7cc94-5qslt
Namespace:    default
Priority:     0
Node:         minikube/192.168.58.2
Start Time:   Sun, 08 May 2022 18:57:03 +0800
Labels:       app=echoserver
              pod-template-hash=56b6c7cc94
Annotations:  <none>
Status:       Running
IP:           172.17.0.22
IPs:
  IP:           172.17.0.22
Controlled By:  ReplicaSet/echoserver-56b6c7cc94
Containers:
  echoserver:
    Container ID:   docker://5773ea0bc0ee021c19ae812c9c30d2f6c8cd719921dfb6230863c7ec97b09ec2
    Image:          cilium/echoserver:1.10.2
    Image ID:       docker-pullable://cilium/echoserver@sha256:f8c125b8ad412c65be38c721463737e4e80721208c744cbacc2a8977b61441c6
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 08 May 2022 20:04:22 +0800
      Finished:     Sun, 08 May 2022 20:04:23 +0800
    Ready:          False
    Restart Count:  18
    Environment:
      PORT:  8080
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nzdg9 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-nzdg9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  BackOff  57s (x328 over 70m)  kubelet  Back-off restarting failed container
➜  kubenetes kubectl logs echoserver-56b6c7cc94-5qslt
Generating self-signed cert
Generating a 2048 bit RSA private key
...........+++
................................................................+++
writing new private key to '/certs/privateKey.key'
-----
Starting nginx
2022/05/08 12:04:23 [error] 24#24: failed to initialize Lua VM in /etc/nginx/nginx.conf:88
nginx: [error] failed to initialize Lua VM in /etc/nginx/nginx.conf:88

I don't know where to look for /etc/nginx/nginx.conf:88, or why this error occurs. Please tell me how to solve it.

By the way, in my opinion, the image may be wrong.

url: https://github.com/openkruise/rollouts/blob/master/docs/images/rollout_canary.png
Canary releases should be allocated less traffic, but the image allocates 95% of the traffic to the new version service. Don't be offended if my opinion is incorrect.
@zmberg

[Feature] Support specifying batch index for rolling

So far, in the spec of Rollout, we can only specify the steps of the rolling process but cannot set which step we want to reach. In cases where the Rollout process is managed by an upper-layer system, like App Delivery Tools / GitOps Tools / a Business Platform, we need to set all the steps before the desired one to use no duration to allow a direct roll.

For example, if we have a rolling process that contains 5 steps

steps:
  - weight: 5
  - weight: 10
  - weight: 20
  - weight: 50
  - weight: 100

If we want the rolling to reach the 4th step and stop there, we need to write

steps:
  - weight: 5
    pause:
      duration: 0
  - weight: 10
    pause:
      duration: 0
  - weight: 20
    pause:
      duration: 0
  - weight: 50
    pause: {}
  - weight: 100

This might not be a very convenient way, especially when the user needs to interact with the whole spec of the rollout object directly.

If we could support a more convenient way to do that, like directly specifying the desired step id or index, it would make the interaction easier:

stepIndex: 3
steps:
  - weight: 5
  - weight: 10
  - weight: 20
  - weight: 50
  - weight: 100

BTW, the currentStepIndex in the status field could also be an alternative place for changing step index. As in kubectl v1.26, we could use kubectl edit --subresource=status to edit the status of the Kubernetes resources.
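The status-subresource approach mentioned above would look roughly like this (a sketch only: whether the Rollout controller honors a hand-edited currentStepIndex is an open question, and the --subresource flag needs a recent kubectl):

```shell
# Edit only the status subresource of the Rollout (kubectl v1.26+)
kubectl edit rollout rollouts-demo --subresource=status

# Or patch currentStepIndex directly:
kubectl patch rollout rollouts-demo --subresource=status \
  --type=merge -p '{"status":{"canaryStatus":{"currentStepIndex":4}}}'
```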

CloneSet support rollback in batch-by-batch way

In some complex scenarios, rolling back an application may involve not only the workload but also other resources, such as ConfigMaps and Secrets. If just the workload is rolled back directly, users may face the risk of the application becoming unavailable. So we allow users to roll back in a batch-by-batch way to reduce such risks.

[feature] Remove the upper limit of 100 on spec.replicas for the Int type

Currently the data type of spec.replicas is IntOrString, but validation applies a single standard to both, so spec.replicas cannot exceed 100 when it is an Int. I understand the original design intent may have been that replica counts above 100 should be expressed as percentages, but in real usage there will inevitably be a need to specify an exact number. So I suggest validating according to the actual data type of spec.replicas: remove the 100 upper limit for the Int type and leave that decision to the user.
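A minimal sketch of the type-dependent validation being proposed (illustrative only; the real admission webhook code differs):

```python
def validate_replicas(replicas) -> None:
    """Percentage strings stay capped at 100%; plain ints may exceed 100."""
    if isinstance(replicas, int):
        if replicas < 0:
            raise ValueError("replicas must be non-negative")
        return  # proposed: no 100 upper bound for plain integers
    if isinstance(replicas, str) and replicas.endswith("%"):
        pct = int(replicas[:-1])
        if not 0 <= pct <= 100:
            raise ValueError("percentage must be within [0, 100]")
        return
    raise ValueError("replicas must be an int or a percentage string")
```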

Support ALB

I0826 18:54:53.687162 11941 request.go:1181] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"admission webhook "vrollout.kb.io" denied the request: Spec.Strategy.Canary.TrafficRouting.Type: Invalid value: "alb": TrafficRouting only support 'nginx' type","code":422}

[Bug] can not complete rollout

my rollout has 3 steps. The 3rd one raises this error:

I0107 15:40:53.810662       1 rollout_controller.go:107] Begin to reconcile Rollout ap1/zjn-test-test-zjn-01
I0107 15:40:53.810976       1 rollout_progressing.go:37] reconcile rollout(ap1/zjn-test-test-zjn-01) progressing action...
I0107 15:40:53.811124       1 rollout_progressing.go:78] rollout(ap1/zjn-test-test-zjn-01) is Progressing, and in reason(InRolling)
I0107 15:40:53.811151       1 rollout_canary.go:84] rollout(ap1/zjn-test-test-zjn-01) run canary strategy, and state(StepTrafficRouting)
W0107 15:40:53.811156       1 manager.go:85] rollout(ap1/zjn-test-test-zjn-01) stableRevision or podTemplateHash can not be empty, and wait a moment

Using kruise-rollout with CloneSet

I use kruise-rollout on a CloneSet (InPlaceIfPossible). When I run "kubectl describe rollouts" I get:
(image)

I found the rollout process strange: steps 2 and 3 upgrade at the same time. Why is there no time difference? Shouldn't each step wait for the new version's pods to be launched?

Rollout and kubevela build multi-cluster publishing capabilities together

What would you like to be added:

Rollout itself is focused on grayscale releases within a single cluster; multi-cluster capabilities need to be built together with kubevela, as follows:

apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: podinfo
spec:
  components:
    traits:
      - type: kruise-rollout
        properties:
          canary:
            steps:
              - weight: 20
                pause: {}
            trafficRouting:
              type: nginx

HTTPRoute weight updates are not working correctly

httproute

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  creationTimestamp: "2023-04-27T07:30:47Z"
  generation: 57
  labels:
    app_id: d3251dc4bd6748a4ae9ec4e1da73c7a8
  managedFields:
  - apiVersion: gateway.networking.k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:parents: {}
    manager: pilot-discovery
    operation: Update
    subresource: status
    time: "2023-04-27T07:30:47Z"
  - apiVersion: gateway.networking.k8s.io/v1beta1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:app_id: {}
      f:spec:
        .: {}
        f:hostnames: {}
        f:parentRefs: {}
    manager: rainbond-api
    operation: Update
    time: "2023-04-27T07:30:47Z"
  name: default-038a99
  namespace: zhangqh
  resourceVersion: "7457252"
  uid: cd174b49-a50c-4151-a7aa-a7cede3d0c87
spec:
  hostnames:
  - rainbond.example.com
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: bookinfo-gateway
    namespace: zhangqh
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: nginx
      namespace: zhangqh
      port: 80
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /
status:
  parents:
  - conditions:
    - lastTransitionTime: "2023-05-06T03:57:02Z"
      message: Route was valid
      observedGeneration: 57
      reason: Accepted
      status: "True"
      type: Accepted
    - lastTransitionTime: "2023-05-06T03:57:02Z"
      message: All references resolved
      observedGeneration: 57
      reason: ResolvedRefs
      status: "True"
      type: ResolvedRefs
    controllerName: istio.io/gateway-controller
    parentRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: bookinfo-gateway
      namespace: zhangqh

rollout

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  annotations:
    rollouts.kruise.io/hash: 2f5d4z94dvv5z4zb9fz4vd4z95w7297fd4825wb7cdw8464fdfcbvz49d5vbd2w2
    rollouts.kruise.io/rolling-style: partition
  creationTimestamp: "2023-05-06T05:41:24Z"
  finalizers:
  - rollouts.kruise.io/rollout
  generation: 1
  labels:
    app_id: d3251dc4bd6748a4ae9ec4e1da73c7a8
    component_id: 2dd5cc625b09f75b9a01ef632045d183
    hostname: rainbond.example.com
  managedFields:
  - apiVersion: rollouts.kruise.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:rollouts.kruise.io/hash: {}
        f:finalizers:
          .: {}
          v:"rollouts.kruise.io/rollout": {}
    manager: kruise-rollout
    operation: Update
    time: "2023-05-06T05:41:24Z"
  - apiVersion: rollouts.kruise.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:rollouts.kruise.io/rolling-style: {}
        f:labels:
          .: {}
          f:app_id: {}
          f:component_id: {}
          f:hostname: {}
      f:spec:
        .: {}
        f:objectRef:
          .: {}
          f:workloadRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
        f:strategy:
          .: {}
          f:canary:
            .: {}
            f:steps: {}
            f:trafficRoutings: {}
    manager: rainbond-worker
    operation: Update
    time: "2023-05-06T05:41:24Z"
  - apiVersion: rollouts.kruise.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:canaryStatus:
          .: {}
          f:canaryReadyReplicas: {}
          f:canaryReplicas: {}
          f:canaryRevision: {}
          f:currentStepIndex: {}
          f:currentStepState: {}
          f:lastUpdateTime: {}
          f:message: {}
          f:observedWorkloadGeneration: {}
          f:podTemplateHash: {}
          f:rolloutHash: {}
          f:stableRevision: {}
        f:conditions: {}
        f:message: {}
        f:observedGeneration: {}
        f:phase: {}
    manager: kruise-rollout
    operation: Update
    subresource: status
    time: "2023-05-06T05:41:27Z"
  name: default-nginx
  namespace: zhangqh
  resourceVersion: "7489716"
  uid: e60ce2c7-2753-4355-a9aa-ed796fda264c
spec:
  objectRef:
    workloadRef:
      apiVersion: apps/v1
      kind: Deployment
      name: default-nginx
  strategy:
    canary:
      steps:
      - matches:
        - headers:
          - name: qqq
            type: Exact
            value: qqq
        pause: {}
        weight: 50
      - matches:
        - headers:
          - name: qqq
            type: Exact
            value: qqq
        pause: {}
        weight: 70
      - matches:
        - headers:
          - name: qqq
            type: Exact
            value: qqq
        pause: {}
        weight: 100
      trafficRoutings:
      - gateway:
          httpRouteName: default-038a99
        service: nginx
status:
  canaryStatus:
    canaryReadyReplicas: 2
    canaryReplicas: 2
    canaryRevision: 66c4494759
    currentStepIndex: 1
    currentStepState: StepPaused
    lastUpdateTime: "2023-05-06T05:41:39Z"
    message: BatchRelease is at state Ready, rollout-id , step 1
    observedWorkloadGeneration: 265
    podTemplateHash: 66c4494759
    rolloutHash: 2f5d4z94dvv5z4zb9fz4vd4z95w7297fd4825wb7cdw8464fdfcbvz49d5vbd2w2
    stableRevision: 8487bf6555
  conditions:
  - lastTransitionTime: "2023-05-06T05:41:24Z"
    lastUpdateTime: "2023-05-06T05:41:24Z"
    message: Rollout is in Progressing
    reason: InRolling
    status: "True"
    type: Progressing
  message: Rollout is in step(1/3), and you need manually confirm to enter the next
    step
  observedGeneration: 1
  phase: Progressing

The A/B release matches field seems inconsistent with the documentation

Docs: https://openkruise.io/zh/rollouts/user-manuals/api-specifications/

The example in the docs uses:

matches:
          - headers:
              - type: Exact # or "RegularExpression"
                key: <matched-header-key>
                value: <matched-header-value, or reg-expression>

报错:Error from server (BadRequest): error when creating "demo_rollout.yaml": Rollout in version "v1alpha1" cannot be handled as a Rollout: strict decoding error: unknown field "spec.strategy.canary.steps[0].matches[0].headers[0].key"

源代码中:rollouts/api/v1alpha1/rollout_types.go

type HttpRouteMatch struct {
	Headers []gatewayv1alpha2.HTTPHeaderMatch `json:"headers,omitempty"`
}
type HTTPHeaderMatch struct {
	Type *HeaderMatchType `json:"type,omitempty"`
	Name HTTPHeaderName `json:"name"`
	Value string `json:"value"`
}

Should key be replaced with name for now?

Rollout canary steps supports Metrics analysis, Hook Mechanism

What would you like to be added:

The Rollout release process supports post-checking and metrics analysis, as follows:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
spec:
  strategy:
    objectRef:
      ...
    canary:
      steps:
      - weight: 5
        ...
      # metrics analysis
      analysis:
        templates:
        - templateName: success-rate
          startingStep: 2 # delay starting analysis run until setWeight: 40%
          args:
          - name: service-name
            value: guestbook-svc.default.svc.cluster.local

# metrics analysis
apiVersion: rollouts.kruise.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 5m
    # NOTE: prometheus queries return results in the form of a vector.
    # So it is common to access the index 0 of the returned array to obtain the value
    successCondition: result[0] >= 0.95
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus.example.com:9090
        query: |
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m]
          )) / 
          sum(irate(
            istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]
          ))
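One plausible reading of how the success-rate metric above would be evaluated (a sketch, not proposed controller code; the semantics mirror Argo Rollouts' AnalysisTemplate, from which this example is adapted): the Prometheus query returns a vector, index 0 is compared against successCondition at each interval, and the analysis fails once failureLimit is exceeded.

```python
def run_analysis(samples, threshold=0.95, failure_limit=3):
    """samples: success-rate values (result[0] of each query) per interval."""
    failures = 0
    for value in samples:
        if value >= threshold:      # successCondition: result[0] >= 0.95
            failures = 0
        else:
            failures += 1
            if failures > failure_limit:
                return "Failed"
    return "Successful"
```

Whether failures are counted consecutively or cumulatively is an assumption here; the consecutive reading is shown.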

Development documentation request

We would like a development document covering the overall architecture, dependencies, and how to run the project locally.

Add a CRD to record the Rollout release process

From the design doc of rollouts, I know that the Rollout resource is bound to a Deployment or CloneSet in a one-to-one mode. We should record information such as pod names, pod IPs, and the deploy strategy when a user performs a deploy action.
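A rough sketch of what such a record object might look like (the RolloutHistory kind and every field below are hypothetical, purely to illustrate the request):

```yaml
apiVersion: rollouts.kruise.io/v1alpha1
kind: RolloutHistory            # hypothetical CRD
metadata:
  name: rollouts-demo-v2
spec:
  rolloutRef: rollouts-demo
status:
  strategy: canary
  steps:
  - weight: 20
    pods:
    - name: echoserver-56b6c7cc94-5qslt
      ip: 172.17.0.22
```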

Support canary-by-header graying when kruise-rollout uses aliyun-alb as the ingress controller

Version: kruise-rollout v0.3.0
Kubernetes version: 1.25.4
ingress controller: aliyun-alb

rollout manifest:

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-with-traffic
  annotations:
    rollouts.kruise.io/rolling-style: partition
  namespace: foo
spec:
  objectRef:
    workloadRef:
      apiVersion: apps/v1
      kind: Deployment
      name: echoserver
  strategy:
    canary:
      steps:
      - replicas: 1
        weight: 5
        pause: {}
      - replicas: 30%
        pause: {}
        matches:
        - headers:
          - name: user-agent
            type: Exact
            value: pc
      - replicas: 60%
        pause: {}
        matches:
        - headers:
          - name: user-agent
            type: Exact
            value: mobile
      - replicas: 100%
        weight: 100
      trafficRoutings:
      - service: echoserver
        ingress:
          name: echoserver
          classType: aliyun-alb

During the rollout, ingress traffic is managed and the backend is grayed by HTTP header: the second step grays user-agent: pc traffic, and the third step grays user-agent: mobile traffic. In this scenario,
aliyun-alb produces ALB ingress annotations like:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/canary: "true"
    alb.ingress.kubernetes.io/canary-by-header: user-agent
    alb.ingress.kubernetes.io/canary-by-header-value: pc
    alb.ingress.kubernetes.io/order: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"annotations":{},"name":"echoserver","namespace":"foo"},"spec":{"ingressClassName":"alb","rules":[{"host":"echoserver.caocaokeji.cn","http":{"paths":[{"backend":{"service":{"name":"echoserver","port":{"number":80}}},"path":"/apis/echo","pathType":"Exact"}]}}]}}
  creationTimestamp: "2023-03-14T07:53:54Z"
  generation: 1
  name: echoserver-canary
  namespace: foo
...

When this rule takes effect at the second step, a brand-new rule is created on the ALB, placed at the very front of the ALB with order ID 1. Gray traffic is then routed correctly. But if multiple applications that need HTTP-header graying roll out at the same time, aliyun-alb fails to create the rules, because the ALB order ID must be unique among the declared ingresses.

# kube-ali-stable get ingress -n foo
NAME                      CLASS   HOSTS                            ADDRESS                                               PORTS   AGE
echoserver                alb     echoserver.example.com          alb-g2jxqzz0nxknlsapxl.cn-hangzhou.alb.aliyuncs.com   80      25h
echoserver-canary         alb     echoserver.example.com          alb-g2jxqzz0nxknlsapxl.cn-hangzhou.alb.aliyuncs.com   80      87m
echoserver-clone          alb     echoserver.example.com          alb-g2jxqzz0nxknlsapxl.cn-hangzhou.alb.aliyuncs.com   80      102m
echoserver-clone-canary   alb     echoserver.example.com                                                                80      117s

# kube-ali-stable get ingress -n foo -o yaml echoserver-clone-canary
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/canary: "true"
    alb.ingress.kubernetes.io/canary-weight: "5"
    alb.ingress.kubernetes.io/order: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"annotations":{},"name":"echoserver-clone","namespace":"foo"},"spec":{"ingressClassName":"alb","rules":[{"host":"echoserver-clone.caocaokeji.cn","http":{"paths":[{"backend":{"service":{"name":"echoserver-clone","port":{"number":80}}},"path":"/apis/echo","pathType":"Exact"}]}}]}}
  creationTimestamp: "2023-03-14T09:19:46Z"
  generation: 1
  name: echoserver-clone-canary
  namespace: foo
...

If the assignment `annotations["alb.ingress.kubernetes.io/order"] = "1"` is removed from the file /lua_configuration/trafficrouting_ingress/aliyun-alb.lua inside kruise-rollout-controller-manager, the canary Ingress is created correctly and a brand-new rule ID is generated on aliyun-alb. However, when the client runs `curl --silent -H "user-agent: pc" -H "Host: echoserver.example.com" alb-g2jxqzz0nxknlsapxl.cn-hangzhou.alb.aliyuncs.com/apis/echo`, the request is not routed to the canary. The cause found so far: the rule ID generated on aliyun-alb for this canary Ingress comes after the rule IDs of the existing Ingresses, and since aliyun-alb matches rules top-down, the request never reaches the header rule and is not forwarded to the canary.

**Request: when aliyun-alb is used as the ingress controller, kruise-rollout should support canary-by-header style graying when trafficRoutings is enabled on a Rollout object.**

We have multiple Ingresses for one workload, but currently only one is supported.

kubectl explain Rollout.spec.strategy.canary.trafficRoutings
KIND: Rollout
VERSION: rollouts.kruise.io/v1alpha1

RESOURCE: trafficRoutings <[]Object>

DESCRIPTION:
TrafficRoutings hosts all the supported service meshes supported to enable
more fine-grained traffic routing todo current only support one
TrafficRouting
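As a sketch of the requested capability, a Rollout might list several routing entries; the CRD field is already typed as a list, but only one entry is honored today. The second entry below is hypothetical (the `echoserver-admin` service and Ingress names are invented for illustration):

```yaml
# Hypothetical multi-Ingress trafficRoutings (NOT supported today):
# the CRD field is already a list, but the controller uses only one entry.
strategy:
  canary:
    trafficRoutings:
    - service: echoserver            # existing, supported single entry
      ingress:
        name: echoserver
    - service: echoserver-admin      # hypothetical second entry for the
      ingress:                       # same workload's other Ingress
        name: echoserver-admin
```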

[Help] What happened in `Check diff` under [CI/unit-tests]?

In pull/61, I changed three files:

  1. add api/v1alpha1/rollouthistory_types.go,
  2. add config/crd/bases/rollouts.kruise.io_rollouthistories.yaml,
  3. changed api/v1alpha1/zz_generated.deepcopy.go (added some deep-copy methods generated by kubebuilder).

Situation: In [CI/unit-tests] https://github.com/openkruise/rollouts/actions/runs/3288369126/jobs/5418624092, I found that

  1. there is a warning Warning: The save-state command is deprecated and will be disabled soon. Please upgrade to using Environment Files. For more information see: https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands/ in step Cache Go Dependencies
  2. In step Run Unit Tests, after the command git status, there is the following output:
    Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git restore <file>..." to discard changes in working directory) modified: api/v1alpha1/zz_generated.deepcopy.go
  3. and thus in step Check diff, the expression -z $(git status -s) evaluates to false, causing [CI/unit-tests] to fail.

Question: How can I pass the -z $(git status -s) check, or what should I do to stage the changes to api/v1alpha1/zz_generated.deepcopy.go? Thanks!
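For context, the `Check diff` gate can be reproduced locally. Below is a minimal sketch in a throwaway repository (the file name stands in for the generated file; this is not the real rollouts repo): the check fails whenever `git status -s` prints anything, and passes once the regenerated file is committed.

```shell
#!/bin/sh
# Reproduce the CI "Check diff" gate: `-z "$(git status -s)"` is true
# only when the working tree is clean. File names are placeholders.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m init

# Simulate a code generator rewriting a file that was not committed:
echo "// regenerated by controller-gen" > zz_generated.deepcopy.go

if [ -z "$(git status -s)" ]; then echo "clean"; else echo "dirty"; fi   # prints: dirty

# The fix: stage and commit the regenerated file on the PR branch.
git add zz_generated.deepcopy.go
git -c user.email=ci@example.com -c user.name=ci commit -qm "regenerate deepcopy"

if [ -z "$(git status -s)" ]; then echo "clean"; else echo "dirty"; fi   # prints: clean
```

So the usual fix is to run the repository's code generator locally (e.g. a `make generate`-style target) and commit the resulting api/v1alpha1/zz_generated.deepcopy.go on the PR branch; CI then regenerates an identical file and `git status -s` prints nothing.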

Rollout CanaryStatus support Replicas

Could CanaryStatus include the total number of replicas? Especially when the workload is autoscaled by an HPA, this would let me read the replica count directly from the Rollout without querying the underlying workload.
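A sketch of what the requested status could look like; `replicas` is the proposed new field, while the other fields mirror the existing CanaryStatus shape and are shown only for context:

```yaml
# Illustrative only: "replicas" below is the proposed addition.
status:
  canaryStatus:
    canaryRevision: 6f8cc56547
    canaryReplicas: 2          # replicas of the new (canary) revision
    canaryReadyReplicas: 2
    replicas: 10               # proposed: total workload replicas, so HPA
                               # scaling is visible from the Rollout itself
```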
