kubernetes-retired / federation Goto Github PK

View Code? Open in Web Editor NEW

209.0 209.0 82.0 53.39 MB

[EOL] Cluster Federation

License: Apache License 2.0

Python 4.94% Makefile 0.90% Shell 22.34% C 0.07% Go 42.80% HTML 28.58% Dockerfile 0.37%

federation's People

Contributors

Stargazers

Watchers

Forkers

irfanurrehman shashidharatd gabrielsvinha font jianhuiz schwankl ddrechse onyiny-ang neotracker sdminonne walteraa csbell nikox94 ggaaooppeenngg perotinus koulq g-harmon cimomo spiffxp yastij xtophs ixdy k82cn etsangsplk mastamark jeis2497052 leahnp grahamhayes tedli tenxcloud wanlinghao fid-dev chenpengdev wongds adamdang myang32 bretagne-peiqi sara4dev smirzai weiwei04 zhaohaidan zmoon111 mirake riverzhang gyliu513 zzchun ra489 ethan-2017 kentxun nikhita qiuhui mooncak sataqiu maduhu xichengliudui spadoop xiezongzhe namingwaysway alpe ming-ddtechcg viaheavn not-a-dev0 lingarajonline joelsmith quinton-hoole nashim220 strikingraghu luodidiao cleey zixuan-jiang wjd198344 jetlwx abirdcfly isgasho cqmh sfowl isabella232

federation's Issues

Federation: google cloud dns records aren't properly when create/update/delete endpoints or LB service

Issue by mozhuli
Monday Sep 18, 2017 at 09:06 GMT
Originally opened as kubernetes/kubernetes#52643

I have a federation with two clusters in a GKE cluster, the version is v1.7.5 , show below:

$ kubectl get clusters --context=kfed
NAME       STATUS    AGE
cluster1   Ready     1h
cluster2   Ready     1h

When I create a federated LoadBalancer service, the LoadBalancer Ingress are created in both clusters.

$ kubectl describe svc nginx --context=kfed
Name:			nginx
Namespace:		default
Labels:			app=nginx
Annotations:		federation.kubernetes.io/service-ingresses={}
Selector:		app=nginx
Type:			LoadBalancer
IP:			
LoadBalancer Ingress:	35.189.185.189, 35.200.87.19
Port:			http	80/TCP
Endpoints:		<none>
Session Affinity:	None
Events:			<none>

However, when only cluster1 has healthy endpoints (I create deployment use --context=cluster1), There
are not CNAME records in DNS with cluster2.(DNS should create CNAME records with cluster2)

$ kubectl describe svc nginx --context=kfed
Name:			nginx
Namespace:		default
Labels:			app=nginx
Annotations:		federation.kubernetes.io/service-ingresses={"items":[{"cluster":"cluster1","items":[{"ip":"35.189.185.189"}]}]}
Selector:		app=nginx
Type:			LoadBalancer
IP:			
LoadBalancer Ingress:	35.189.185.189, 35.200.87.19
Port:			http	80/TCP
Endpoints:		<none>
Session Affinity:	None
Events:			<none>

$ kubectl create -f deployment.yaml --context=cluster1

$ gcloud  dns record-sets list --zone=kfed
NAME                                                                 TYPE  TTL    DATA
infra.mycompany.com.                                                 NS    21600  ns-cloud-e1.googledomains.com.,ns-cloud-e2.googledomains.com.,ns-cloud-e3.googledomains.com.,ns-cloud-e4.googledomains.com.
infra.mycompany.com.                                                 SOA   21600  ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300
nginx.default.kfed.svc.asia-east1-c.asia-east1.infra.mycompany.com.  A     180    35.189.185.189
nginx.default.kfed.svc.asia-east1.infra.mycompany.com.               A     180    35.189.185.189
nginx.default.kfed.svc.infra.mycompany.com.                          A     180    35.189.185.189

when I create healthy endpoints in cluster2,the dns records(correct) show below:

$ kubectl create -f deployment.yaml --context=cluster2

$ kubectl describe svc nginx --context=kfed
Name:			nginx
Namespace:		default
Labels:			app=nginx
Annotations:		federation.kubernetes.io/service-ingresses={"items":[{"cluster":"cluster1","items":[{"ip":"35.189.185.189"}]},{"cluster":"cluster2","items":[{"ip":"35.200.87.19"}]}]}
Selector:		app=nginx
Type:			LoadBalancer
IP:			
LoadBalancer Ingress:	35.189.185.189, 35.200.87.19
Port:			http	80/TCP
Endpoints:		<none>
Session Affinity:	None
Events:			<none>

$ gcloud  dns record-sets list --zone=kfed
NAME                                                                           TYPE  TTL    DATA
infra.mycompany.com.                                                           NS    21600  ns-cloud-e1.googledomains.com.,ns-cloud-e2.googledomains.com.,ns-cloud-e3.googledomains.com.,ns-cloud-e4.googledomains.com.
infra.mycompany.com.                                                           SOA   21600  ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300
nginx.default.kfed.svc.asia-east1-c.asia-east1.infra.mycompany.com.            A     180    35.189.185.189
nginx.default.kfed.svc.asia-east1.infra.mycompany.com.                         A     180    35.189.185.189
nginx.default.kfed.svc.asia-northeast1-c.asia-northeast1.infra.mycompany.com.  A     180    35.200.87.19
nginx.default.kfed.svc.asia-northeast1.infra.mycompany.com.                    A     180    35.200.87.19
nginx.default.kfed.svc.infra.mycompany.com.                                    A     180    35.189.185.189,35.200.87.19

And when i delete deployment in cluster2, the related records not update properly.

$ kubectl delete -f deployment.yaml --context=cluster2
deployment "nginx-deployment" deleted

$ kubectl describe svc nginx --context=kfed
Name:			nginx
Namespace:		default
Labels:			app=nginx
Annotations:		federation.kubernetes.io/service-ingresses={"items":[{"cluster":"cluster1","items":[{"ip":"35.189.185.189"}]}]}
Selector:		app=nginx
Type:			LoadBalancer
IP:			
LoadBalancer Ingress:	35.189.185.189, 35.200.87.19
Port:			http	80/TCP
Endpoints:		<none>
Session Affinity:	None
Events:			<none>

$ gcloud  dns record-sets list --zone=kfed
NAME                                                                           TYPE  TTL    DATA
infra.mycompany.com.                                                           NS    21600  ns-cloud-e1.googledomains.com.,ns-cloud-e2.googledomains.com.,ns-cloud-e3.googledomains.com.,ns-cloud-e4.googledomains.com.
infra.mycompany.com.                                                           SOA   21600  ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300
nginx.default.kfed.svc.asia-east1-c.asia-east1.infra.mycompany.com.            A     180    35.189.185.189
nginx.default.kfed.svc.asia-east1.infra.mycompany.com.                         A     180    35.189.185.189
nginx.default.kfed.svc.asia-northeast1-c.asia-northeast1.infra.mycompany.com.  A     180    35.200.87.19
nginx.default.kfed.svc.asia-northeast1.infra.mycompany.com.                    A     180    35.200.87.19
nginx.default.kfed.svc.infra.mycompany.com.                                    A     180    35.189.185.189

The nginx.default.kfed.svc.infra.mycompany.com. record update 35.189.185.189,35.200.87.19 to 35.189.185.189, but nginx.default.kfed.svc.asia-northeast1-c.asia-northeast1.infra.mycompany.com. and nginx.default.kfed.svc.asia-northeast1.infra.mycompany.com. didn't update to CNAME.

Also when i delete LB service, the dns records show below:

$ gcloud  dns record-sets list --zone=kfed
NAME                                                                           TYPE   TTL    DATA
infra.mycompany.com.                                                           NS     21600  ns-cloud-e1.googledomains.com.,ns-cloud-e2.googledomains.com.,ns-cloud-e3.googledomains.com.,ns-cloud-e4.googledomains.com.
infra.mycompany.com.                                                           SOA    21600  ns-cloud-e1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300
nginx.default.kfed.svc.asia-east1-c.asia-east1.infra.mycompany.com.            CNAME  180    nginx.default.kfed.svc.asia-east1.infra.mycompany.com.
nginx.default.kfed.svc.asia-east1.infra.mycompany.com.                         CNAME  180    nginx.default.kfed.svc.infra.mycompany.com.
nginx.default.kfed.svc.asia-northeast1-c.asia-northeast1.infra.mycompany.com.  A      180    35.200.87.19
nginx.default.kfed.svc.asia-northeast1.infra.mycompany.com.                    A      180    35.200.87.19

which confuse me that record nginx.default.kfed.svc.asia-east1-c.asia-east1.infra.mycompany.com. and nginx.default.kfed.svc.asia-east1.infra.mycompany.com. update to CNAME which desire to be deleted.

The problems same in v1.6.9

/kind bug

/cc @kubernetes/sig-federation-bugs @quinton-hoole

Kubefed init should not fail if resource are already existed

Issue by liqlin2015
Wednesday Aug 16, 2017 at 07:40 GMT
Originally opened as kubernetes/kubernetes#50746

Is this a BUG REPORT or FEATURE REQUEST?:

What happened:
When I run kubefed init to deploy federation control plane in host cluster, some error happened. After I fixed the error and run kubefed init again, it throws error like "Service account already exist" or "federation-system namespace already exist".

What you expected to happen:
kubefed init should continue if related resource is already existed in host cluster. Or we expect a --force flag to force overwrite previous setting of federation.

How to reproduce it (as minimally and precisely as possible):

Create federation-system namespace in host cluster.
Run kubefed init.

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

Enable `kubefed init` accept a file to set up the control panel

Issue by gyliu513
Thursday Aug 17, 2017 at 02:28 GMT
Originally opened as kubernetes/kubernetes#50817

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
Now I need to use a complex command to for kubefed init to set up the federation control panel as following:

kubefed init federation-cluster \
     --host-cluster-context=host-context \
     --dns-provider="coredns" \
     --dns-zone-name="example.com." \
     --dns-provider-config="/root/federation/coredns-provider.conf" \
     --api-server-service-type="NodePort" \
     --api-server-advertise-address="x.x.x.x" \
     --etcd-image="library/etcd:v3.1.5" \
     --etcd-persistent-storage=false \
     --etcd-pv-capacity="10Gi" \
     --etcd-pv-storage-class="" \
     --federation-system-namespace="federation-system" \
     --image="kubernetes:v1.7.3" \
     --apiserver-arg-overrides="" \
     --controllermanager-arg-overrides=""

It would be great to enable kubefed init can accept a file just like what we do for kubectl apply -f <filename>, so that I can put all parameters in a file which is easy to maintain.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

/sig federation

/cc @kubernetes/sig-federation-feature-requests

Federation: API server does not allow privileged containers

Issue by erikgrinaker
Tuesday Sep 12, 2017 at 14:14 GMT
Originally opened as kubernetes/kubernetes#52341

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

Attempting to deploy a DaemonSet with securityContext.privileged: true in a federated cluster gives an error: Forbidden: disallowed by cluster policy

What you expected to happen:

The DaemonSet to be created.

How to reproduce it (as minimally and precisely as possible):

Create a federated cluster using kubefed, and submit the following DaemonSet:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: privileged
  labels:
    app: privileged
spec:
  template:
    metadata:
      labels:
        app: privileged
    spec:
      containers:
      - name: privileged
        image: busybox
        command: ['sleep', '10000']
        securityContext:
          privileged: true

This gives the following error:

$ kubectl apply -f privileged.yaml
The DaemonSet "privileged" is invalid: spec.template.spec.containers[0].securityContext.privileged: Forbidden: disallowed by cluster policy

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version): 1.7.3
Cloud provider or hardware configuration**: GKE
OS (e.g. from /etc/os-release): COS 59 build 9460.73.0
Kernel (e.g. uname -a): 4.4.52
Install tools:
Others: kubefed 1.7.3

federation: ingress does not work on GCP for multiple clusters in the same zone

Issue by nikhiljindal
Thursday Jun 15, 2017 at 01:36 GMT
Originally opened as kubernetes/kubernetes#47565

Problem: When we create a federated ingress with 2 clusters in the same zone, ingress controller in both clusters generate the same instance group name and hence fight with each other by removing each others nodes from that instance group.
This is a problem only with clusters in the same zone since get instance group API takes the name and zone and hence clusters in different zones get different instance groups.

Possible solution: We can solve it in the same way we did for firewall rules. Federation adds a provider Uid which is unique to all clusters. We can update ingress controller to use the provider Uid if present to generate the instance group name and hence different clusters will have different instance group names.
This will break existing federation users.

This is not high priority in my mind, since most multi cluster users run clusters in different zones.

cc @nicksardo @madhusudancs

[Federation] Requirement & Design of Federated ResourceQuota

Issue by weijinxu
Thursday May 25, 2017 at 21:21 GMT
Originally opened as kubernetes/kubernetes#46464

This issue is created to track the development of federated resource quota.
The design document was updated here.

cc @kubernetes/sig-cluster-federation
@derekwaynecarr

ubernetes: How to collect cluster resource metrics

Issue by nikhiljindal
Wednesday Mar 30, 2016 at 01:49 GMT
Originally opened as kubernetes/kubernetes#23614

Forked from kubernetes/kubernetes#23430 (comment)

The ubernetes scheduler needs to know the available resources in each cluster to be able to divide resources (such as replicasets) appropriately.

As per https://github.com/kubernetes/kubernetes/blob/master/docs/design/federation-phase-1.md#cluster:

There is also a proposal to provide cluster level metrics via apiserver: #23376, which we can start using whenever it is available.

cc @kubernetes/sig-cluster-federation

kubefed init errors if namespace already exists

Issue by derekwaynecarr
Thursday Jun 08, 2017 at 06:04 GMT
Originally opened as kubernetes/kubernetes#47159

Is this a BUG REPORT or FEATURE REQUEST?
BUG

What happened:

$ kubectl create ns federation-system
$ kubefed init my-fed --dns-provider=google-clouddns \
        --federation-system-namespace="federation-system" \
        --etcd-persistent-storage=false
Error from server (AlreadyExists): namespaces "federation-system" already exists

What you expected to happen:
I expected the command to not error if the namespace already existed.
kubefed should only create a namespace if it doesnt already exist, otherwise just use the one i gave.

How to reproduce it (as minimally and precisely as possible):
see above

Anything else we need to know:
nope

Expand petset volume zone spreading

Issue by bprashanth
Tuesday Jun 21, 2016 at 22:28 GMT
Originally opened as kubernetes/kubernetes#27809

We got petset disk zone spreading in and it's really useful. However we left a couple of todos to follow up on:

Don't embed a zone scheduler in the pv provisioner (https://github.com/kubernetes/kubernetes/pull/27553/files#diff-b3d75e3586a2c9a5140cd549861da9c0R2094)
Write a unitest that protects the zone spreding from petset implementation changes (kubernetes/kubernetes#27553 (comment))
Maybe a multi-az e2e with petset before it goes beta?

@justinsb

[e2e test]: pull-kubernetes-federation-e2e-gce test always fail

Issue by guangxuli
Monday Sep 11, 2017 at 12:12 GMT
Originally opened as kubernetes/kubernetes#52270

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
Tests about kubernetes/kubernetes#51889 always generate pull-kubernetes-federation-e2e-gce failure. Seems not relevant to the PR kubernetes/kubernetes#51889 after checking the details.

log:
I0911 10:09:03.239] process 22525 exited with code 0 after 0.5m
I0911 10:09:03.239] Call: git checkout -B test db809c0eb7d33fac8f54d8735211f2f3a8fc4214
W0911 10:09:04.570] Switched to a new branch 'test'
I0911 10:09:04.582] process 22538 exited with code 0 after 0.0m
I0911 10:09:04.583] Call: git merge --no-ff -m 'Merge +refs/pull/51889/head:refs/pr/51889' 3e3f9990ddfc3a003aad71340dc1294daa6764f6
I0911 10:09:05.164] Merge made by the 'recursive' strategy.
I0911 10:09:05.169] plugin/pkg/scheduler/algorithm/predicates/error.go | 39 +++----
I0911 10:09:05.169] .../scheduler/algorithm/predicates/predicates.go | 44 ++++----
I0911 10:09:05.169] .../algorithm/predicates/predicates_test.go | 115 +++++++++++++--------
I0911 10:09:05.169] 3 files changed, 114 insertions(+), 84 deletions(-)
I0911 10:09:05.170] process 22539 exited with code 0 after 0.0m
I0911 10:09:05.170] Checkout: /var/lib/jenkins/workspace/pull-kubernetes-federation-e2e-gce/go/src/k8s.io/release master
I0911 10:09:05.170] Call: git init k8s.io/release
... skipping 573 lines ...
I0911 10:22:30.580] md5sum(kubernetes-test.tar.gz)=0225e5099a5d437c9f404a5302ae54c7
I0911 10:22:31.161] sha1sum(kubernetes-test.tar.gz)=7528c3aada4254ce0458942a7eb928207940ffb7
I0911 10:22:31.161]
I0911 10:22:31.162] Extracting kubernetes-test.tar.gz into /workspace/kubernetes
W0911 10:22:38.006] 2017/09/11 10:22:38 util.go:131: Step './get-kube.sh' finished in 11.992213626s
W0911 10:22:38.006] 2017/09/11 10:22:38 util.go:129: Running: ./federation/cluster/federation-down.sh
W0911 10:22:38.195] error: context "e2e-f8n-agent-pr-52-0" does not exist
I0911 10:22:38.296] Cleaning Federation control plane objects
I0911 10:22:38.608] No resources found
I0911 10:22:39.115] No resources found
I0911 10:22:39.263] +++ [0911 10:22:39] Removing namespace "f8n-system-agent-pr-52-0", cluster role "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-a-federation-e2e-gce-us-central1-f" and cluster role binding "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-a-federation-e2e-gce-us-central1-f" from "federation-e2e-gce-us-central1-a"
I0911 10:22:39.263] +++ [0911 10:22:39] Removing namespace "f8n-system-agent-pr-52-0", cluster role "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-b-federation-e2e-gce-us-central1-f" and cluster role binding "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-b-federation-e2e-gce-us-central1-f" from "federation-e2e-gce-us-central1-b"
I0911 10:22:39.263] +++ [0911 10:22:39] Removing namespace "f8n-system-agent-pr-52-0", cluster role "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-f-federation-e2e-gce-us-central1-f" and cluster role binding "federation-controller-manager:e2e-f8n-agent-pr-52-0-federation-e2e-gce-us-central1-f-federation-e2e-gce-us-central1-f" from "federation-e2e-gce-us-central1-f"
... skipping 20 lines ...
I0911 10:44:42.118] Creating federation control plane service...................................................................................................................................................................................................................................................Dumping Federation and DNS pod logs to /workspace/_artifacts
W0911 10:44:42.219] I0911 10:24:42.157731 4041 init.go:305] Creating a namespace f8n-system-agent-pr-52-0 for federation system components
W0911 10:44:42.219] I0911 10:24:42.179313 4041 init.go:314] Creating federation control plane service
W0911 10:44:42.219] 2017/09/11 10:44:42 util.go:131: Step './federation/cluster/federation-up.sh' finished in 20m0.402805001s
W0911 10:44:42.219] 2017/09/11 10:44:42 e2e.go:460: Dumping Federation logs to: /workspace/_artifacts
W0911 10:44:42.219] 2017/09/11 10:44:42 util.go:129: Running: ./federation/cluster/log-dump.sh /workspace/_artifacts
W0911 10:44:42.620] Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver, verb=get, resource=nodes, subresource=proxy)
W0911 10:44:42.623] 2017/09/11 10:44:42 util.go:131: Step './federation/cluster/log-dump.sh /workspace/_artifacts' finished in 553.886159ms
W0911 10:44:42.623] 2017/09/11 10:44:42 util.go:129: Running: ./federation/cluster/federation-down.sh
W0911 10:44:42.812] error: context "e2e-f8n-agent-pr-52-0" does not exist
I0911 10:44:42.913] Cleaning Federation control plane objects
I0911 10:44:43.028] service "e2e-f8n-agent-pr-52-0-apiserver" deleted
I0911 10:44:43.555] secret "default-token-d678j" deleted

What you expected to happen:

pull-kubernetes-federation-e2e-gce test pass.

How to reproduce it (as minimally and precisely as possible):
Just trigger the test job.

Anything else we need to know?:
none

@kubernetes/test-infra-maintainers

kubefed should use an API version that both it and server uses

Issue by nikhiljindal
Friday Aug 11, 2017 at 23:57 GMT
Originally opened as kubernetes/kubernetes#50540

Forked from kubernetes/kubernetes#50534 (comment).

kubefed does API version discovery and then uses the preferred API group version that server supports. This breaks when server supports a newer version that the generated clientset that kubefed uses does not know about.
kubefed should choose the version that both server and it knows about.

kubernetes/kubernetes#50537 fixes this for RBAC. We need the same for all API resources that kubefed creates.

cc @kubernetes/sig-federation-bugs @liggitt

Solution discussion for rebuilding federation service controller cache on restart

Issue by mfanjie
Thursday May 26, 2016 at 02:51 GMT
Originally opened as kubernetes/kubernetes#26329

I am working on federation service controller, and there is a limitation on current cache, after had a call with @quinton-hoole, as it's not a common case, we will leave it as is in v1.3, and I open this issue to track the problem and to find out a more robust solution.

Federation service controller behavior description:

service controller watch federation apiserver for new services, create service map in the form of
fedServiceMap: [serviceName]cachedService: {service info, endpointMap: [clusterName]int, serviceStatusMap: [clusterName]api.LoadBalancerStatus}
endpointMap and serviceStatusMap have no element at this moment.
service controller create the service on all healthy clusters
service controller watches all healthy clusters for services and endpoints.
When new serviceStatus with LB info is returned, add it to serviceStatusMap and service info and update federation apiserver with the new LB info.
when new endpoint info returned, add it to endpoint map with flag 1 which indicates there is reachable endpoints

Problem description:
From above behavior description, we can see that serviceStatusMap is the only place to link clusterName and LB info, it is not persisted anywhere and it will go on service controller down.

So if during service controller down time, the service on any clusters is deleted, and after federation service controller being starts, the service will be rebuild on that cluster again, with different LB info, and as we lost the link between cluster and LB info, we do not know which old LB info in federation service was returned by from that cluster, as a result, we can add the new IP to federation service, but there is no way to remove the old one.

One typical example
Federation service:S1
Clusters: C1, C2
after S1 is created, it will be created in C1 and C2, with LB IP1, and LB IP2 assigned.
So after the a while S1 will changed to S1[LB:{IP1, IP2}]
serviceCache is same S1[LB:{IP1, IP2}]
serviceStatusMap [C1]S1[LB:IP1], [C2]S1[LB:IP2] , most import one, it links Ingress IP and clusters.
endpointMap [C1]1,[C2]1

For the case that LB IP in C1 changed from IP1 to IP3
If we have the cache info, we know IP1 is from C1 previously, so we can remove it and add IP3.
But If Service Controller is restarted, we will lose the old LB IP info, we do not know which IP we should remove from Federation IP, and the ingress info will be changed to S1[LB:{IP1, IP2, IP3}]

Move federation out of core repo

Issue by shashidharatd
Monday Sep 25, 2017 at 14:57 GMT
Originally opened as kubernetes/kubernetes#52992

This issue is to track all the tasks associated with moving out federation related code out of core kubernetes repo into its own repo (k8s.io/federation).

Refer to the plan and discussion on moving out federation in this google doc

/cc @kubernetes/sig-federation-pr-reviews
/assign @irfanurrehman

Zone-spread HA master for Ubernetes Lite

Issue by davidopp
Friday Feb 12, 2016 at 07:21 GMT
Originally opened as kubernetes/kubernetes#21124

The current incarnation of Ubernetes Lite makes the application tolerant to zone outages but does not make the Kubernetes control plane tolerant to zone outages since it doesn't use (zone-distributed) HA master. For a short or medium duration zone outage this is totally fine because the containers will continue running even if the master goes away, but if the zone where the master was running goes out for a long time, eventually there will be problems (since pods from failed nodes won't reschedule, users can't update the containers, etc.).

It would be good to do a zone distributed HA master for Ubernetes Lite, with at least one etcd replica per zone, at least one API server per zone, and the master-elected components like scheduler and controller-manager distributed across the zones.

cc/ @quinton-hoole @justinsb

ref/ #17059

How do admins of non-cloud nodes apply node zone labels for Kubernetes Lite?

Issue by quinton-hoole
Friday Nov 20, 2015 at 18:31 GMT
Originally opened as kubernetes/kubernetes#17575

kubernetes/kubernetes#16866 (comment) refers, copied here:

erictune commented:
We are going to want to have kubelet-computed labels, as well as labels that a user puts into a file on the kubelet and the kubelet automatically syncs to the node. We should think about how those two sets of labels interact.

Specifically, if a user is running on bare metal, and the user does not have a cloud provider to get the zone info from, the user are going to want to set the zone label via the file method.

Should the user should also use the failure-domain.kubernetes.io/zone label in this case?

Or should they use example.com/custom-zone, and then adjust the scheduler to somehow spread on this label instead?

I think this depends on whether the domain prefix means who owns the code that generated the label value, or if it means who owns the meaning of the label.

If we allow kubernetes.io/ domain labels in the file, then we need to decide which takes precedence, the file or the cloud provider. If we don't allow kubernetes.io in the file, then (maybe) we do not have to think about precedence.

Enable coredns etcd secure mode

Issue by gyliu513
Wednesday Aug 16, 2017 at 09:41 GMT
Originally opened as kubernetes/kubernetes#50761

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The kubernetes federation has a provider of coreos, we cannot enable secure mode for etcd, it is better support secure mode for etcd.

[root@cfc-555 federation]# cat coredns-provider.conf
[Global]
etcd-endpoints = http://etcd-cluster.default:2379 
zones = example.com.
coredns-endpoints = coredns-coredns:53

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

/sig federation

/cc @kubernetes/sig-federation-feature-requests

Federation: Support federated ingress outside of GCP

Issue by quinton-hoole
Friday Sep 01, 2017 at 16:38 GMT
Originally opened as kubernetes/kubernetes#51806

/kind feature

Splitting off the subthread in kubernetes/kubernetes#39989 (comment)

This looks like it might have just got easier on AWS and cross-cloud:

New – Application Load Balancing via IP Address to AWS & On-Premises Resources

I haven't gone through all the details yet but this seems like it might form a good base for cross-cloud load balancing. As usual, the devil will be in the details.

federation: non-federation DNS lookup can return federation CNAME

Issue by nikhiljindal
Thursday Jun 23, 2016 at 20:15 GMT
Originally opened as kubernetes/kubernetes#27969

With the current KubeDNS code, it can happen that a pod requests mysvc.somens and resolve.conf adds myns.svc.cluster.local as a search path so that KubeDNS gets mysvc.somns.myns.svc.cluster.local.
If there is a federation with the name myns, isFederationQuery in KubeDNS will think this is a federation query and try to resolve mysvc.somens.svc.cluster.local. If there is a local service resolving to that DNS, then user will get that as expected (all is fine). But if there isnt, then KubeDNS will return mysvc.somens.myns.svc.myzone.myregion.mydomain. If there is no federation service with that name and in that namespace, then user will still get an NXDOMAIN as expected. But if there is, then user might be pointed to a service in another cluster, which user may or may not have wanted.

We can consider this a feature, if users want it or should fix it, if we consider it a bug.

Filing this to keep track.

@kubernetes/sig-cluster-federation

Enable federation apiserver connect to external etcd

Issue by gyliu513
Wednesday Aug 16, 2017 at 06:15 GMT
Originally opened as kubernetes/kubernetes#50737

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
If I want to enable the federation control panel, then I need to use kubefed init to install etcd and control panel. For etcd, as the kubernetes already have etcd running, so I want to re-use the existing etcd in my kubernetes cluster and do not want to install a new etcd for federation only.

What you expected to happen:
I can specify external etcd when using kubefed init.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

/sig federation

/cc @kubernetes/sig-federation-feature-requests

Federated Ingress fails with hybrid (gke/on-prem) environments

Issue by clenimar
Wednesday Jun 14, 2017 at 15:07 GMT
Originally opened as kubernetes/kubernetes#47522

Is this a request for help?: yes
What keywords did you search in Kubernetes issues before filing this one?: hybrid federated ingress, on-premise ingress
Is this a BUG REPORT or FEATURE REQUEST?: bug report
Kubernetes version (use kubectl version): v1.5.2
What happened: I've got a hybrid federation running on GKE with one on-premise cluster:

c1 (host): GKE/us-east1-b/v1.6.4
c2       : GKE/asia-east1-b/v1.6.4
local    : on-premise/v1.6.3

I'm using Google Cloud DNS as my DNS provider. I'm able to create a federated deployment and a federated service, and they are correctly spread across all the clusters. The DNS entries for all services replicas are correctly created in my DNS zone.

When I create a Federated Ingress, though, the Ingress resource is created in all the clusters, but the on-premise cluster replica never gets an IP address, thus not being able to receive load from the global IP address.

This is the local replica:

clenimar@local:~$ kubectl describe ing 
Name:			nginx
Namespace:		default
Address:		
Default backend:	nginx:80 (192.168.181.52:80,192.168.181.62:80)
Rules:
  Host	Path	Backends
  ----	----	--------
  *	* 	nginx:80 (192.168.181.52:80,192.168.181.62:80)
Annotations:
  first-cluster:	c1
Events:			<none>

This is a replica running in one GKE cluster:

clenimar@gke-us:~$ k describe ing
Name:			nginx
Namespace:		default
Address:		35.186.232.168
Default backend:	nginx:80 (10.32.0.8:80,10.32.1.8:80)
Rules:
  Host	Path	Backends
  ----	----	--------
  *	* 	nginx:80 (10.32.0.8:80,10.32.1.8:80)
Annotations:
  first-cluster:	c1
  backends:		{"k8s-be-30036--699c6201827491b5":"HEALTHY"}
  forwarding-rule:	k8s-fw-default-nginx--699c6201827491b5
  target-proxy:		k8s-tp-default-nginx--699c6201827491b5
  url-map:		k8s-um-default-nginx--699c6201827491b5
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason	Message
  ---------	--------	-----	----				-------------	--------	------	-------
  7m		7m		1	{loadbalancer-controller }			Normal		ADD	default/nginx
  6m		6m		1	{loadbalancer-controller }			Normal		CREATE	ip: 35.186.232.168
  6m		4m		10	{loadbalancer-controller }			Normal		Service	default backend set to nginx:30036
  5m		4m		2	{loadbalancer-controller }			Warning		GCE	googleapi: Error 400: The resource 'projects/angular-sorter-167417/global/healthChecks/k8s-be-31996--699c6201827491b5' is not ready, resourceNotReady

There are some clear differences between the two resources. I have tried to manually created them, without success. I'm open to suggestions.

What you expected to happen: The load to be routed to the on-premise cluster too. I'm aware that this is not officially supported as of today, but I've seen people getting it to work in AWS/Azure with no problem at all. I'm filling this issue in order to chase at least a temporary workaround until the spec for a hybrid Ingress is finished.

Ho to reproduce it (as minimally and precisely as possible):

create a federation in GKE.
join an on-premise cluster. I've followed these instructions: https://github.com/henriquetruta/hybrid-k8s-federation/
create a federated deploy, a federated service and a federated ingress.
the on-premise ingress will not work properly.

Anything else we need to know:
The on-premise cluster has no /var/log/glbc.log file.
The l7-default-backend pod has nothing in its logs.
federation-controller-manager logs:

E0614 14:50:13.426549       1 ingress_controller.go:898] Failed to execute updates for default/nginx: the server has asked for the client to provide credentials (put ingresses.extensions nginx)
E0614 14:50:16.159188       1 cluster_client.go:147] Failed to list nodes while getting zone names: the server has asked for the client to provide credentials (get nodes)
W0614 14:50:16.159206       1 clustercontroller.go:197] Failed to get zones and region for cluster c1: the server has asked for the client to provide credentials (get nodes)
E0614 14:50:21.905797       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:50:41.906015       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:50:57.012691       1 cluster_client.go:147] Failed to list nodes while getting zone names: the server has asked for the client to provide credentials (get nodes)
W0614 14:50:57.012712       1 clustercontroller.go:197] Failed to get zones and region for cluster c1: the server has asked for the client to provide credentials (get nodes)
E0614 14:51:01.906283       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
I0614 14:51:16.168346       1 deploymentcontroller.go:594] Updating nginx in c2
I0614 14:51:16.168410       1 deploymentcontroller.go:594] Updating nginx in local
E0614 14:51:21.906545       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:51:37.864862       1 cluster_client.go:147] Failed to list nodes while getting zone names: the server has asked for the client to provide credentials (get nodes)
W0614 14:51:37.864909       1 clustercontroller.go:197] Failed to get zones and region for cluster c1: the server has asked for the client to provide credentials (get nodes)
E0614 14:51:41.906786       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:52:01.907232       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:52:18.733527       1 cluster_client.go:147] Failed to list nodes while getting zone names: the server has asked for the client to provide credentials (get nodes)
W0614 14:52:18.733561       1 clustercontroller.go:197] Failed to get zones and region for cluster c1: the server has asked for the client to provide credentials (get nodes)
E0614 14:52:21.907710       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
E0614 14:52:41.908053       1 ingress_controller.go:518] Internal error: Cluster "c1" queued for configmap reconciliation, but not found.  Will try again later: error = <nil>
I0614 14:53:09.787667       1 deploymentcontroller.go:594] Updating nginx in c1

events:

31m        31m         3         nginx     Ingress                  Normal    CreateInCluster       {federated-ingress-controller }      Creating ingress in cluster c1
31m        31m         1         nginx     Ingress                  Normal    CreateInCluster       {federated-ingress-controller }      Creating ingress in cluster c2
31m        31m         5         nginx     Ingress                  Normal    UpdateInCluster       {federated-ingress-controller }      Updating ingress in cluster local
30m        30m         3         nginx     Ingress                  Normal    CreateInCluster       {federated-ingress-controller }      Creating ingress in cluster c1
30m        30m         1         nginx     Ingress                  Normal    CreateInCluster       {federated-ingress-controller }      Creating ingress in cluster c2
30m        30m         1         nginx     Ingress                  Normal    CreateInCluster       {federated-ingress-controller }      Creating ingress in cluster local
4m         29m         152       nginx     Ingress                  Normal    UpdateInCluster       {federated-ingress-controller }      Updating ingress in cluster c2
4m         29m         135       nginx     Ingress                  Normal    UpdateInCluster       {federated-ingress-controller }      Updating ingress in cluster c1
19m        19m         1         nginx     Ingress                  Normal    FailedClusterUpdate   {federated-ingress-controller }      Ingress update in cluster c1 failed: the server has asked for the client to provide credentials (put ingresses.extensions nginx)

Surface kubectl clusters in kubectl help when talking to federation apiserver

Issue by nikhiljindal
Friday May 13, 2016 at 21:26 GMT
Originally opened as kubernetes/kubernetes#25592

From kubernetes/kubernetes#24016 (comment):

Will be great to have this. Not critical for 1.3

Ref kubernetes/kubernetes#23653

cc @jianhuiz @kubernetes/sig-cluster-federation

Prioritized pods (ubernetes)

Issue by juanjoperl
Monday Feb 22, 2016 at 20:35 GMT
Originally opened as kubernetes/kubernetes#21705

For a possible workaround to the problem of federation, can kubernetes have this behabour?: the ability the services have several sets of labels to select pods. The first set would be the priority, the second come into play if the priority pods do not respond, and so on.
With this capability, could have pods tagged area, and the services have sorted priority areas

pr:pull-kubernetes-federation-e2e-gce flaked 94 times in the past week

Issue by fejta-bot
Wednesday Sep 13, 2017 at 22:28 GMT
Originally opened as kubernetes/kubernetes#52455

Flaky Job: pr:pull-kubernetes-federation-e2e-gce

Flakes in the past week: 94
Consistency: 86.80%

Flakiest tests by flake count:

Test	Flake Count
Federation Up	95
DumpFederationLogs	89
FederationTest	5

Flakiest Jobs

[Federation] Implement federated pod autoscaler

From @irfanurrehman on December 19, 2016 16:22

This issue is created to track the development of a federated pod autoscaler.
The design document was updated here some days ago.

cc @kubernetes/sig-cluster-federation
@shashidharatd @kshafiee @deepak-vij

Copied from original issue: kubernetes/kubernetes#38974

Federated Ingress sends unnecessary cross-region requests due to low maxRPS setting

Issue by dgpc
Tuesday May 30, 2017 at 20:20 GMT
Originally opened as kubernetes/kubernetes#46645

Kubernetes version (use kubectl version):

$ kubectl version --context=dev
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.3", GitCommit:"0480917b552be33e2dba47386e51decb1a211df6", GitTreeState:"clean", BuildDate:"2017-05-10T15:48:59Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:33:17Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

GKE (cluster version 1.6.4)

What happened:

I create a Federated Ingress with clusters in two GCE Regions (us-central1 & europe-west1).

Both clusters are created with balancingMode=rate, maxRPS=1

When more than (1 RPS * num_nodes) is being sent from clients in Europe, traffic appears to be round-robined to both the cluster in the US and the cluster in Europe.

What you expected to happen:

I expect to be able to send most traffic from European clients to the cluster in Europe, and only spill-over to the US when Europe is overloaded. To accomplish this, maxRPS needs to be set to an appropriately large value (based on our load testing), or we need to be able to use the utilization-based mode.

Right now, our clients unnecessarily have to pay a latency penalty due traffic being sent cross-region, even when we are not close to being overloaded in any region.

Perhaps the Ingress Controller could adopt a maxRPS from an annotation on the deployment, similar to the way it adopts a Health Check from the readinessProbe?

/kind bug
/sig federation

[Federation] Follow up items on federated pod autoscaling

Issue by irfanurrehman
Wednesday Jul 26, 2017 at 16:16 GMT
Originally opened as kubernetes/kubernetes#49644

The original issuse for feature federated pod autoscaler is here, and the feature is being tracked here. This issue is to track and follow up pending items from the first phase of implementation especially items chosen to defer in the first phase of implementation here.

1 - Make the replica nums offered (both min and max) and in turn increased or reduced from an existing cluster local hpa, configurable. Right now the value is kept to a pessimistic low of 1 at a time.
2 - Optimise the algorithm such that if there are clusters which need the replicas, they can first be taken from those clusters that have capacity to offer more.
3 - Implement hard limit based min/max preferences per cluster from the user.
4 - Contemplate usage of weighted distribution of replicas based on the weights applied on per cluster hpas by the user.

Refactor kubernetes scheduler and controller manager to enable code reuse

Issue by nikhiljindal
Friday Apr 29, 2016 at 20:06 GMT
Originally opened as kubernetes/kubernetes#24996

Forked from kubernetes/kubernetes#24038 (comment)

We are adding federation scheduler and controller manager as per the plan in kubernetes/kubernetes#23653.

There is obviously some common code that is shared with kubernetes controller manager and kubernetes scheduler.
We want to pull out that code into libraries to enable code reuse rather than having to duplicate code.

We need to figure out the right name and directory structure for such libraries.

cc @davidopp @madhusudancs @kubernetes/goog-control-plane @kubernetes/sig-cluster-federation

Enable `kubefed init` support `tolerations`

Issue by gyliu513
Thursday Aug 17, 2017 at 02:15 GMT
Originally opened as kubernetes/kubernetes#50815

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The kubefed init can not specify tolerations, this will cause that I cannot deploy my federation panel on node which has taint, it is better to enable the kubefed init can specify the tolerations so that I can deploy federation panel on node which have taint that the federation panel can tolerate.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

/sig federation

/cc @kubernetes/sig-federation-feature-requests

Federated service DNS should be created for all zones

Issue by zihaoyu
Friday Aug 11, 2017 at 17:06 GMT
Originally opened as kubernetes/kubernetes#50529

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

If a federated cluster is a multi-zone cluster, then when a federated LoadBalancer type service is created, there is only one zone DNS record created. For example:

# This cluster spans three zones
➜  kubernetes git:(master) kubectl --context=hurley-beta-federation get cluster beta-us-west-2 -o json | jq '.status'
{
  "conditions": [
    {
      "lastProbeTime": "2017-08-11T17:36:25Z",
      "lastTransitionTime": "2017-08-08T15:15:40Z",
      "message": "/healthz responded with ok",
      "reason": "ClusterReady",
      "status": "True",
      "type": "Ready"
    }
  ],
  "region": "us-west-2",
  "zones": [
    "us-west-2a",
    "us-west-2b",
    "us-west-2c"
  ]
}

In Route53 console, I only see DNS records for one of the zones (us-west-2a). Same thing for us-east-1 federated cluster too.

I also see the following in kube-dns logs when I tried to resolve nginx.default.k8s-beta-federation locally:

I0810 20:31:17.072874       1 dns.go:636] Federation: skipping record since service has no endpoint: {10.5.35.160 0 10 10  false 30 0  /skydns/local/cluster/svc/default/nginx/6466336132663632}
I0810 20:31:17.147368       1 logs.go:41] skydns: incomplete CNAME chain from "nginx.default.my-federation.svc.us-west-2b.us-west-2.mycompany.com.": rcode 3 is not equal to success

What you expected to happen:

DNS records should be created for each zone of the cluster.

How to reproduce it (as minimally and precisely as possible):

Launch a multi-zone cluster.
Join this cluster to a federation.
Create a federated service with LoadBalancer type.

Anything else we need to know?:

It looks like, as of v1.7.3, this behavior is by design. I think we should address that TODO and add support for multi-zone clusters.

Environment:

Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:08:00Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider or hardware configuration**: AWS
OS (e.g. from /etc/os-release): Container Linux by CoreOS 1353.7.0 (Ladybug)
Kernel (e.g. uname -a): Linux ip-10-72-230-146.ec2.internal 4.9.24-coreos #1 SMP Wed Apr 26 21:44:23 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz GenuineIntel GNU/Linux
Install tools: Custom tool
Others:

kubefed should emit error when passing invalid dns-provider

Issue by henriquetruta
Monday Jul 10, 2017 at 21:00 GMT
Originally opened as kubernetes/kubernetes#48732

/kind bug

What happened:
I've misspelled the dns-provider (put googlecloud-dns instead of google-clouddns) when running kubefed init
It successfully ran and said everything was ok. However, federation-control-plane pod entered in CrashLoopBackOff and I could only see the error in the pod's logs:
F0710 20:15:37.946633 1 controllermanager.go:174] Cloud provider could not be initialized: unknown DNS provider "googlecloud-dns"

What you expected to happen:
If any string different than 'google-clouddns', 'aws-route53' or 'coredns' is provided in dns-provider, kubefed should return an error to the client.

Environment:

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

kubernetes-e2e-gce-ubernetes-lite should verify the cluster created is actually multi-zone

Issue by jlowdermilk
Tuesday Jun 14, 2016 at 17:04 GMT
Originally opened as kubernetes/kubernetes#27372

Filing separate follow up from #27150. The issue should have been caught by http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce-ubernetes-lite/ failing. The suite needs a specific feature test that e.g. looks at node labels to verify multiple zones are present.

cc @quinton-hoole

Secure etcd for `kubefed init`

Issue by gyliu513
Wednesday Aug 16, 2017 at 06:08 GMT
Originally opened as kubernetes/kubernetes#50734

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
The kubefed init will install etcd, but the etcd is not secured, we should enable secure mode for etcd, checkout here for how to enable tls for etcd https://coreos.com/etcd/docs/latest/op-guide/security.html

      --api-server-advertise-address string      Preferred address to advertise api server nodeport service. Valid only if 'api-server-service-type=NodePort'.
      --api-server-service-type string           The type of service to create for federation API server. Options: 'LoadBalancer' (default), 'NodePort'. (default "LoadBalancer")
      --apiserver-arg-overrides string           comma separated list of federation-apiserver arguments to override: Example "--arg1=value1,--arg2=value2..."
      --apiserver-enable-basic-auth              Enables HTTP Basic authentication for the federation-apiserver. Defaults to false.
      --apiserver-enable-token-auth              Enables token authentication for the federation-apiserver. Defaults to false.
      --controllermanager-arg-overrides string   comma separated list of federation-controller-manager arguments to override: Example "--arg1=value1,--arg2=value2..."
      --dns-provider string                      Dns provider to be used for this deployment.
      --dns-provider-config string               Config file path on local file system for configuring DNS provider.
      --dns-zone-name string                     DNS suffix for this federation. Federated Service DNS names are published with this suffix.
      --dry-run                                  dry run without sending commands to server.
      --etcd-image string                        Image to use for etcd server. (default "gcr.io/google_containers/etcd:3.0.17")
      --etcd-persistent-storage                  Use persistent volume for etcd. Defaults to 'true'. (default true)
      --etcd-pv-capacity string                  Size of persistent volume claim to be used for etcd. (default "10Gi")
      --etcd-pv-storage-class string             The storage class of the persistent volume claim used for etcd.   Must be provided if a default storage class is not enabled for the host cluster.
      --federation-system-namespace string       Namespace in the host cluster where the federation system components are installed (default "federation-system")
      --host-cluster-context string              Host cluster context
      --image string                             Image to use for federation API server and controller manager binaries. (default "gcr.io/google_containers/hyperkube-amd64:v0.0.0-master+$Format:%h$")
      --kubeconfig string                        Path to the kubeconfig file to use for CLI requests.

/sig federation
/cc @kubernetes/sig-federation-feature-requests

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

Kubernetes version (use kubectl version):
Cloud provider or hardware configuration**:
OS (e.g. from /etc/os-release):
Kernel (e.g. uname -a):
Install tools:
Others:

Federation-e2e flake - federation-apiserver connection refused

Issue by colhom
Thursday Jun 02, 2016 at 18:16 GMT
Originally opened as kubernetes/kubernetes#26725

The kubernetes-e2e-gce-federation jenkins jobs in consistently flaking with the following output.

â€¢ Failure [5.905 seconds]
[k8s.io] Federation apiserver [Feature:Federation]
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:641
  should allow creation of cluster api objects [It]
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federation-apiserver.go:63

  creating cluster: Post https://146.148.36.68:443/apis/federation/v1alpha1/clusters: dial tcp 146.148.36.68:443: getsockopt: connection refused
  Expected error:
      <*url.Error | 0xc8209cc6f0>: {
          Op: "Post",
          URL: "https://146.148.36.68:443/apis/federation/v1alpha1/clusters",
          Err: {
              Op: "dial",
              Net: "tcp",
              Source: nil,
              Addr: {
                  IP: "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\x92\x94$D",
                  Port: 443,
                  Zone: "",
              },
              Err: {
                  Syscall: "getsockopt",
                  Err: 0x6f,
              },
          },
      }
      Post https://146.148.36.68:443/apis/federation/v1alpha1/clusters: dial tcp 146.148.36.68:443: getsockopt: connection refused
  not to have occurred

  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/federation-apiserver.go:53
------------------------------

This is kubectl attempting to open a connection with the federation-apiserver via a loadbalancer service.

There are two possible underlying causes here, which are not mutually exclusive:

federation-apiserver pod is not yet ready
GCE loadbalancer has not finished initializing

I plan to remedy this by introducing some more "is it ready yet?" logic to federated-up procedure:

wait until the federation-apiserver pod stays ready for 10 seconds. right now we only wait until it's running
attempt to do kubectl version for another 30 seconds before allowing the e2e tests to start, to verify loadbalancer is initialized. right now we assume the loadbalancer is ready to serve traffic after the service ingress metadata becomes available. this is most likely a poor assumption

\cc @quinton-hoole @nikhiljindal @madhusudancs

include federation e2e in merge bot testing

Issue by nikhiljindal
Thursday Jun 02, 2016 at 17:57 GMT
Originally opened as kubernetes/kubernetes#26723

We should include federation e2e in merge bot testing, so that we can catch PRs that break the test.
We need to ensure that it is 100% green before doing this.

@colhom Can you take this up?

cc @kubernetes/sig-cluster-federation

Support for CloudFlare DNS provider for Federated clusters

Issue by thecodeassassin
Wednesday Jul 26, 2017 at 14:27 GMT
Originally opened as kubernetes/kubernetes#49639

/kind feature

Currently federation only supports Google CloudDNS, Route53 and CoreDNS. I would love if if CloudFlare could also be supported since we use them for our DNS zones. I'm sure this will be a popular addition as well. It's already supported for ExternalDNS. I'm sure most of the work can be driven from existing providers and ExternalDNS:

https://github.com/kubernetes-incubator/external-dns/blob/master/provider/cloudflare.go

Maybe it would be nice to merge them at some point. Is there anybody already working on this or will it be my summer project?

Refactor the getClusterZones implementations.

Issue by madhusudancs
Thursday Jun 23, 2016 at 19:31 GMT
Originally opened as kubernetes/kubernetes#27966

There are two implementations to get the cluster zone and regions. There is one here - https://github.com/kubernetes/kubernetes/blob/6dde087f69b2bdd5f9191edbd572985e00f8c808/federation/pkg/federation-controller/cluster/cluster_client.go#L143 and the other here - https://github.com/kubernetes/kubernetes/blob/c87b61341241bae37017db5f76902ea2642ea169/pkg/dns/dns.go#L630. The two implementations must be refactored and merged.

cc @quinton-hoole @kubernetes/sig-cluster-federation

Kubefed can't handle relative paths in kubeconfig

Issue by Thermi
Wednesday Oct 04, 2017 at 12:09 GMT
Originally opened as kubernetes/kubernetes#53431

/kind bug

What happened:
Kubefed doesn't handle relative paths in kubeconfig correctly, like kubectl does.

What you expected to happen:
That kubefed handles it like kubectl.

How to reproduce it (as minimally and precisely as possible):
Use relative paths in kubeconfig for credentials (e.g. CA certificates). Then use kubectl to see that it works from arbitrary directories. Then try to use kubefed and see it fail.

Anything else we need to know?:
No.

Environment:

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0+0b9efaeb34a2f", GitCommit:"638b0a06d928e7a12fa29bb12610158a235585d7", GitTreeState:"dirty", BuildDate:"2017-10-02T09:48:25Z", GoVersion:"go1.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7+", GitVersion:"v1.7.6-gke.1", GitCommit:"407dbfe965f3de06b332cc22d2eb1ca07fb4d3fb", GitTreeState:"clean", BuildDate:"2017-09-27T21:21:34Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider or hardware configuration**:
GCP
OS (e.g. from /etc/os-release):
Ubuntu Xenial
Kernel (e.g. uname -a): Irrelevant
Install tools: GKE

[Federation] Follow up items on federated pod autoscaling

Issue by irfanurrehman
Wednesday Jul 26, 2017 at 16:16 GMT
Originally opened as kubernetes/kubernetes#49644

[Federation] Better usability for kubefed

Issue by irfanurrehman
Sunday Feb 19, 2017 at 20:31 GMT
Originally opened as kubernetes/kubernetes#41725

kubefed tool, which is used to deploy a federation control plane:

1 - Does not do a very good job of giving status information to the end user. If the operations take long time it appears to be hung (especially kubefed init).
2 - There is a possibility that the deployed pods of the federation control plane, do not come up successfully, in such a scenario the kubefed init will wait indefinitely. Ideally, there could be a timeout set to override this behavior (Issue #43123).
3 - If some operation fails midway of overall execution, the tool should ideally clean up all of the objects that it creates into host cluster or target cluster before exiting.
4 - There should ideally be a kubefed deinit which can enable removing the federation control plane deployment from the host cluster.
5 - Both kubefed init and kubefed join can create service accounts to access either the underlying cluster or joining clusters; A workaround is needed such that DNS-1123 subdomain naming is not violated while doing this. Refer kubernetes/kubernetes#50888.

Update e2e_federation with specific sig prefix

Issue by guangxuli
Wednesday Aug 16, 2017 at 06:11 GMT
Originally opened as kubernetes/kubernetes#50735

Is this a BUG REPORT or FEATURE REQUEST?:

Ref to kubernetes/kubernetes#49161, e2e_federation still test use k8s.io prefix. Not sure if we could use sig-federation instead of k8s.io just as kubernetes/kubernetes#49161 described.

i would be happy to send a PR to update it if we indeed need.

/cc @quinton-hoole @csbell

Federation: kubectl rollout status with Federated Deployment

Issue by bspradling8
Monday Jun 05, 2017 at 22:19 GMT
Originally opened as kubernetes/kubernetes#46994

It looks like currently with Federated Deployments, the federated API server will get stuck at waiting with 0 out of n. I can only assume that this occurs because the federationa API server knows about the deployment and how many replicas there should be but isn't the one responsible for running the pods therefore stuck at 0. Is there a plan to support this functionality in Federated Deployments?

HA Kubernetes etcd configuration

Issue by justinsb
Saturday Jan 09, 2016 at 16:48 GMT
Originally opened as kubernetes/kubernetes#19443

I'm looking at setting up k8s with a HA master, and following along with the HA documentation at http://kubernetes.io/v1.1/docs/admin/high-availability.html

The k8s HA doc suggests using a discovery token. However, the etcd documentation at https://github.com/coreos/etcd/blob/master/Documentation/clustering.md says:

I was thinking of running the masters in an AWS auto-scaling group, but this means that their IP addresses would change (I have some strategies involving grabbing disks to ensure that they have some notion of persistent identity though).

An alternative idea I had was to use static config, but to have a process that queried the cloud API, and located the instances, and then dynamically updated a hosts file with the IP addresses. So for example etcd would be launched with

etcd -name node1 -initial-advertise-peer-urls http://10.0.1.10:2380 \
  -listen-peer-urls http://10.0.1.10:2380 \
  -listen-client-urls http://10.0.1.10:2379 \
  -advertise-client-urls http://10.0.1.10:2379 \
  -initial-cluster-token etcd-cluster-1 \
  -initial-cluster node1=http://node1:2380,node2=http://node2:2380,node3=http://node3:2380 \
  -initial-cluster-state new

And then the hosts file would look like

10.0.1.10 node1 
10.0.1.11 node2
# node3 not yet running

(I guess I could also use skydns rather than hosts files, but I am not sure about the bootstrapping there. Or even Route53...)

Will discovery actually work and 'self-heal' as IP addresses change? Is there a better alternative to dynamically updating DNS / host files? @xiang90 I think you would know best here.

Also cc @quinton-hoole because I think you'll find this interesting...

Don't require a trailing '.' in federation-controller DNS zone

Issue by quinton-hoole
Friday Jul 01, 2016 at 04:40 GMT
Originally opened as kubernetes/kubernetes#28335

See kubernetes/kubernetes#28324

CustomResourceDefinition support for federation

Issue by ljmatkins
Wednesday Oct 04, 2017 at 12:58 GMT
Originally opened as kubernetes/kubernetes#53432

Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature

What happened:
When I attempt to kubectl apply a CustomResourceDefinition to a federation, I receive the following error:

error: unable to recognize "resourcedefinition.yaml": no matches for apiextensions.k8s.io/, Kind=CustomResourceDefinition

What you expected to happen:

It would be nice if federations supported this resource type, so that we can run applications expecting it on top of federated clusters.

How to reproduce it (as minimally and precisely as possible):

Bring up a federated cluster
Create a resourcedefinition.yaml file (e.g. https://kubernetes.io/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/)
Run kubectl apply -f resourcedefinition.yaml against the federated cluster

Environment:

Kubernetes version (use kubectl version): 1.7.6
Cloud provider or hardware configuration: GKE 1.7.6-gke.1
OS (e.g. from /etc/os-release): debian stretch

Integrate federated service DNS record management: Address remaining code review comments.

Issue by quinton-hoole
Wednesday Jun 01, 2016 at 23:52 GMT
Originally opened as kubernetes/kubernetes#26669

See unresolved comments in #25991 for details.

In particular:

Possibly return ErrNotFound rather than empty slice from getRrsets()

Federation: Google Cloud DNS provider error 404 caused by the ProjectID

Issue by walteraa
Tuesday Sep 19, 2017 at 14:25 GMT
Originally opened as kubernetes/kubernetes#52726

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened: I've tried to use Google DNS as DNS provider in a federation with FCP in an on-premises cluster. I've configured a secret with the Google credential file with correct permissions, I mounted the secret as a volume and assigned the "volume path" to the environment variable GOOGLE_APPLICATION_CREDENTIALS(before that I was receiving an authorization error, but this configuration fixed it) by editing the federation-controller-manager deployment.

The credential file was generated from the Google Cloud console credential page and has the following structure

{
  "type": "service_account",
  "project_id": "corc-tutorial",
  "private_key_id": "{THE PRIVATE KEY ID}",
  "private_key": "-----BEGIN PRIVATE KEY-----\n
{THE PRIVATE KEY CONTENT}
\n-----END PRIVATE KEY-----\n",
  "client_email": "{ACCOUNT}",
  "client_id": "{CLIENT_ID}",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/{GSERVICEACCOUNT_URI}"
}

Then I've got the following logs in the federation-controller-manager pod

I0822 13:17:33.998181       1 clouddns.go:100] Successfully got DNS service: &{0xc420e19b00 https://www.googleapis.com/dns/v1/projects/  0xc42000e840 0xc42000e848 0xc42000e850 0xc42000e858}
E0822 13:17:35.621941       1 dns.go:97] Failed to retrieve DNS zone: error querying for DNS zones: googleapi: got HTTP response code 404 with body: Not Found
F0822 13:17:35.622009       1 controllermanager.go:144] Failed to start service dns controller: error querying for DNS zones: googleapi: got HTTP response code 404 with body: Not Found

Analyzing the issue, I figured out that the Google cloud provider isn't getting the project_id key/value, which is causing a bug in the ProjectID property, thus generating an invalid URL when trying to get information from my DNS zone.

What you expected to happen: Once generated the credential file and configure it in the federation-controller-manager deployment as previously described, the Google DNS provider should get the DNS information from the REST API and successfully manage it.

How to reproduce it (as minimally and precisely as possible):

Create an on-premises cluster
Create a DNS zone in your Google cloud console
Initialize a federation which the FCP is the on-premises cluster created before and using Google as DNS provider and the DNS zone created before
Create a credential file giving permission to manage the project's DNS
Create a secret using the credential file created before
Edit the federation-controller-manager by creating the volumes using the secret created before and an environment variable named GOOGLE_APPLICATION_CREDENTIALS by using secret's mount path as value.

Anything else we need to know?:
I did it work, as it should, doing a workaround which sets the ProjectID value hardcoded as you can see here

Environment:

Kubernetes version (use kubectl version): 1.7.4
Cloud provider or hardware configuration**: irrelevant
OS (e.g. from /etc/os-release): irrelevant
Kernel (e.g. uname -a): irrelevant
Install tools: irrelevant
Others: irrelevant

Address outstanding DNS review comments in #26694

Issue by quinton-hoole
Monday Jun 06, 2016 at 23:42 GMT
Originally opened as kubernetes/kubernetes#26921

See kubernetes/kubernetes#26694 for details. Specifically:

Don't call ensureDnsRecords() if the DNS provider has not been initialized.
Don't discard errors returned by getClusterZoneNames()

cc: @mfanjie FYI

Proposal: Service connection affinity

Issue by eliaslevy
Wednesday Oct 14, 2015 at 23:34 GMT
Originally opened as kubernetes/kubernetes#15675

As of now connections to Services are either handled in a round robin fashion (the default) or can make use of client IP based affinity, where the endpoint for the initial connection to the service by a client is selected in a round robin fashion and subsequent connections will attempt to use the previously selected endpoint.

In some environments there is a need for a topographical or administrative prioritization of the Service endpoint selection. For instance, a cluster may be comprised of multiple zones with Service endpoints spread across zones. Latencies across zones may be low, yet still higher than intra-zone latencies. There may also be financial costs associated to inter-zone traffic. Thus, it is could be desirable for a number of reasons to prioritize endpoints for a Service within the same zone as that of Pods making connections to the Service.

Note that this proposal is distinct from that of #14484. #14484 proposes priority affinity when scheduling Pods (colocation). I am proposing priority affinity of network connections to Services.

This could be implemented allowing the admin to specify a tag selector within the session affinity spec. kube-proxy could then look up the tag on the Pod that initiated a connection and on any Pods that are endpoints for the Service. It would prioritize endpoint Pods that a tag with a value that matches the value of the tag on the client Pod. Partial matches be prioritized by the number of tags that match.

If multiple endpoints match at the same priority, then the selection of one of them can be performed in a round robin fashion, or ClientIP affinity could be applied optionally.

kubefed join breaks with aws cluster

Issue by henriquetruta
Monday Jul 17, 2017 at 18:23 GMT
Originally opened as kubernetes/kubernetes#49041

Joining a cluster from AWS fails in kubefed 1.7. The AWS cluster is also k8s 1.7
The federation is running in a GKE cluster 1.7.0.

$ kubefed join aws --host-cluster-context=gke --context=fed
Error from server (Forbidden): clusterroles.rbac.authorization.k8s.io "federation-controller-manager:fed-aws-gke" is forbidden: attempt to grant extra privileges: [PolicyRule{Resources:["*"], APIGroups:["*"], Verbs:["*"]} PolicyRule{NonResourceURLs:["/healthz"], Verbs:["get"]}] user=&{admin admin [system:authenticated] map[]} ownerrules=[] ruleResolutionErrors=[]

This same command works in kubefed 1.6.6
I'm able to access the AWS cluster using its context.

/kind bug

How to reproduce it (as minimally and precisely as possible):
1 - Create a GKE cluster
2 - Create an AWS cluster (I used conjure-up)
3 - kubefed (1.7) init on GKE cluster
4 - Join the AWS cluster
5 - It should fail

If kubefed 1.6 is used, it works.
Joining a GKE or On-prem cluster also works as expected

New Federation API Resource for ClusterDaemonSets

Issue by nutchalum
Thursday Sep 28, 2017 at 03:04 GMT
Originally opened as kubernetes/kubernetes#53178

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:
I want a to have a new federation api resource ClusterDaemonSets

What you expected to happen:
This works a little like DemonSets resource.
DemonSets makes sure every nodes must have a pod running in it.
ClusterDaemonSets makes sure every clusters must have a pod running in it.

For example
Every cluster must has a kubernetes-dashboard.
I want to make sure there's only one (or more) dashboard per cluster
I can do that by using federation feature and create a Deployment and then scaling it out.
However, If one cluster is down, It'll try to make a pod in other cluster instead.
I want a resource to make sure there's only one or more per cluster
And if the cluster is gone, It won't try to pop up in other cluster

Anything else we need to know?:
Thank you.

kubernetes-retired / federation Goto Github PK

federation's People

Contributors

Stargazers

Watchers

Forkers

federation's Issues

Flaky Job: pr:pull-kubernetes-federation-e2e-gce

Flakiest tests by flake count:

Recommend Projects

Recommend Topics

Recommend Org