
liqotech / liqo

Enable dynamic and seamless Kubernetes multi-cluster topologies

Home Page: https://liqo.io

License: Apache License 2.0

Dockerfile 0.07% Shell 1.26% Go 98.15% Makefile 0.36% Smarty 0.16%
kubernetes liquid-computing resource-sharing cloud-computing multi-cluster kubernetes-clusters clusters k8s

liqo's People

Contributors

abakusw aka-somix alacuku aleoli andreagit97 callisto13 capacitorset cheina97 damianot98 davidefalcone1 dependabot[bot] fprojetto fra98 fraborg francescodanzi frisso gabrifila giandonatofarina giorio94 giuse2596 kariya-mitsuru lucafrancescato lucarocco mlavacca nappozord palexster qcfe scottboring sharathmk99 vgramer


liqo's Issues

[Feature] Installer should install dashboard

Is your feature request related to a problem? Please describe.
The Liqo Dashboard should be included among the installed packages

Describe the solution you'd like
The Liqo dashboard should be installed by default, with an opt-out flag

Describe alternatives you've considered
Having a separate installer could increase complexity.

Items

Dashboard:

  • Create a Helm chart for the Dashboard
  • Modify the installer to import the Dashboard Helm chart

[Feature] User documentation review

Is your feature request related to a problem? Please describe.

Some comments, suggestions and issues that came to mind while reading the Liqo user documentation:

Major:

  • The Peer to a foreign cluster section does not explain how to peer with a foreign cluster (to me, this is a big missing point before being able to try liqo); (@palexster)
  • In the discovery documentation, the Trust Remote Cluster section and its subsections are not clear to me; (@aleoli)

Minor:

  • In the network documentation, it is not mentioned that encrypted tunnels are not yet available; (@alacuku)
  • In the Liqo dashboard documentation, I would add an example of ingress configuration to expose it (in case an ingress controller is available); (@nappozord)
  • At the end of the Liqo Dashboard documentation, what happens if the requests/limits are not specified? Also, to be picky, requests do not represent the worst-case scenario (as limits would do); (@nappozord)
  • I would merge the Liqo Dashboard documentation located in the architecture section with the one in the User Guide section;
  • The Discovery Protocol documentation (architecture) should be improved from the language point of view, to make it easier to read; (@aleoli)
  • The links in the Resource Sharing/Networking documentation are broken; (@alacuku)
  • Limitations are hard to understand

Style:

  • I would move Liqo in brief before Getting started (at least, it seems more natural to me in that order); (@palexster)
  • The style of the code snippets is not consistent (sometimes colored, sometimes not); (@palexster)
  • I would avoid abbreviations in the code snippets, to make them more self-explanatory and easier to read also for non-experienced users; (@palexster)
  • Some images in the architecture section are not displayed correctly; (@palexster)
  • Some index pages in the architecture section are empty (@palexster)

[Feature] Improve Liqo deployments permissions

Is your feature request related to a problem? Please describe.
When executing the popeye scanner against the namespace where the Liqo resources are installed, different warnings are raised. It may be worth addressing (at least some of) them.

Additional context
Excerpt of the report

PODS (14 SCANNED)                                                             💥 1 😱 13 🔊 0 ✅ 0 0%
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · liqo/advertisement-operator-66f7c948c6-g5sq8...................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 advertisement-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/crdreplicator-operator-5d577fc976-jpsgm...................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 crdreplicator-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/discovery-6c99c89fbc-2bgcr................................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 discovery
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/liqo-dashboard-7977f68bc4-ml4sg...........................................................💥
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-300] Using "default" ServiceAccount.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 liqo-dashboard
      😱 [POP-101] Image tagged "latest" in use.
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
    🐳 proxy-cert
      💥 [POP-100] Untagged docker image in use.
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/peering-request-operator-587f86fdd4-96mzv.................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 peering-request-deployment
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
    🐳 peering-request-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      🔊 [POP-108] Unnamed port 8443.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
    🐳 secret-creation
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/podmutator-7986cd56dc-4znsd...............................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 pod-mutator-deployment
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
    🐳 podmutator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
    🐳 secret-creation
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/route-operator-66dkv......................................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 route-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/route-operator-7mdnf......................................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 route-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/route-operator-88k29......................................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 route-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/route-operator-jjwmw......................................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 route-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/route-operator-ldt9p......................................................................😱
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 route-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/schedulingnode-operator-7cf6db4b78-ppvbx..................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 schedulingnode-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/tunnel-operator-5795c49f79-z4v56..........................................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 tunnel-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.
  · liqo/tunnelendpointcreator-operator-698cf97957-cc5gj...........................................😱
    🔊 [POP-206] No PodDisruptionBudget defined.
    😱 [POP-301] Connects to API Server? ServiceAccount token is mounted.
    😱 [POP-302] Pod could be running as root user. Check SecurityContext/Image.
    🐳 tunnelendpointcreator-operator
      😱 [POP-106] No resources requests/limits defined.
      😱 [POP-102] No probes defined.
      😱 [POP-306] Container could be running as root user. Check SecurityContext/Image.

SERVICES (3 SCANNED)                                                          💥 1 😱 0 🔊 2 ✅ 0 66%
┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅
  · liqo/liqo-dashboard............................................................................💥
    💥 [POP-1106] No target ports match service port TCP:https:443.
    🔊 [POP-1104] Do you mean it? Type NodePort detected.
  · liqo/mutatepodtoleration.......................................................................🔊
    🔊 [POP-1101] Skip ports check. No explicit ports detected on pod
        liqo/podmutator-7986cd56dc-4znsd.
  · liqo/peering-request-operator..................................................................🔊
    🔊 [POP-1102] Use of target port%!(EXTRA string=8443, string=TCP::8443).
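
Most of the warnings above boil down to missing resource requests/limits, missing probes, and an unspecified security context. As an illustration only (a minimal sketch using the Kubernetes Go API types; names and values are placeholders, not the settings Liqo should necessarily adopt), a container spec addressing POP-106 and POP-306 could look like the following.

// Illustrative container spec addressing POP-106 (no requests/limits) and
// POP-306 (possibly running as root); names and values are placeholders.
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func hardenedContainer() corev1.Container {
	runAsNonRoot := true
	allowPrivilegeEscalation := false

	return corev1.Container{
		Name:  "advertisement-operator",             // placeholder
		Image: "liqo/advertisement-operator:v0.1.0", // placeholder tag, not "latest"
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("100m"),
				corev1.ResourceMemory: resource.MustParse("64Mi"),
			},
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("250m"),
				corev1.ResourceMemory: resource.MustParse("128Mi"),
			},
		},
		SecurityContext: &corev1.SecurityContext{
			RunAsNonRoot:             &runAsNonRoot,
			AllowPrivilegeEscalation: &allowPrivilegeEscalation,
		},
		// Liveness/readiness probes and a PodDisruptionBudget would address
		// POP-102 and POP-206, respectively.
	}
}

func main() {
	_ = hardenedContainer()
}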

The unaligned Virtual Kubelet node version prevents kubeadm version upgrades

Describe the bug
So far, the version of the node created by the virtual kubelet is not aligned with the rest of the cluster. This causes issues when upgrading the cluster with kubeadm, since some of its integrity checks fail.

Expected Behavior
Two possibilities:

  • The virtual kubelet version should track the API server version (a sketch of this option follows)
  • The virtual kubelet should expose the version of the virtual-kubelet library in use.
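
As an example of the first option, a minimal sketch (an assumption about a possible fix, not the current Liqo code) that reads the home API server version through client-go's discovery client and advertises it on the virtual node:

// Sketch: expose the home API server version as the virtual node's kubelet
// version, so that kubeadm pre-upgrade checks see consistent versions.
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Ask the API server for its version (e.g. "v1.19.2").
	serverVersion, err := client.Discovery().ServerVersion()
	if err != nil {
		panic(err)
	}

	// Advertise that version on the virtual node instead of the library default.
	node := &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{Name: "liqo-virtual-node"}, // illustrative name
		Status: corev1.NodeStatus{
			NodeInfo: corev1.NodeSystemInfo{KubeletVersion: serverVersion.GitVersion},
		},
	}
	if _, err := client.CoreV1().Nodes().Create(context.TODO(), node, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}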

Endpoint reflection is broken in 1.19

Describe the bug
Due to the changes introduced by EndpointSlice in Kubernetes 1.19, endpoint reflection no longer works. This is mainly because kube-proxy now uses EndpointSlices, instead of Endpoints, to configure services.
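
A possible direction (a minimal sketch, not the Liqo implementation) is to reflect an EndpointSlice alongside the Endpoints resource, so that kube-proxy on the remote cluster can still program the service. The slice name, port name and addresses below are illustrative.

// Sketch: build an EndpointSlice that mirrors a reflected pod IP for a service.
package main

import (
	corev1 "k8s.io/api/core/v1"
	discoveryv1beta1 "k8s.io/api/discovery/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func mirroredEndpointSlice(serviceName, namespace, podIP string, port int32) *discoveryv1beta1.EndpointSlice {
	proto := corev1.ProtocolTCP
	ready := true
	portName := "http" // illustrative

	return &discoveryv1beta1.EndpointSlice{
		ObjectMeta: metav1.ObjectMeta{
			Name:      serviceName + "-liqo", // illustrative naming
			Namespace: namespace,
			// kube-proxy associates the slice with its Service through this label.
			Labels: map[string]string{"kubernetes.io/service-name": serviceName},
		},
		AddressType: discoveryv1beta1.AddressTypeIPv4,
		Endpoints: []discoveryv1beta1.Endpoint{{
			Addresses:  []string{podIP},
			Conditions: discoveryv1beta1.EndpointConditions{Ready: &ready},
		}},
		Ports: []discoveryv1beta1.EndpointPort{{
			Name:     &portName,
			Protocol: &proto,
			Port:     &port,
		}},
	}
}

func main() {
	_ = mirroredEndpointSlice("my-service", "liqo-demo", "10.200.1.42", 8080)
}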

The uninstaller hangs forever while deleting the Liqo CRDs

Describe the bug

When Liqo is uninstalled using the provided script (with --deleteCrd enabled), the procedure hangs forever. In particular, the CRDs networkconfigs.net.liqo.io and tunnelendpoints.net.liqo.io fail to be deleted, since the corresponding resources carry finalizers whose controllers no longer exist (all the pods have already been deleted in the previous step).

To Reproduce
Steps to reproduce the behavior:

  1. Install Liqo
  2. Establish a peering
  3. Uninstall Liqo (with --deleteCrd)
  4. The deletion process never terminates

Expected behavior
The uninstallation should complete correctly

Broadcaster fails to update the PeeringRequest

Describe the bug
The remote watcher in the broadcaster module fails to update the PeeringRequest with the status of the Advertisement (Accepted/Refused).

To Reproduce
Steps to reproduce the behavior:

  1. Install Liqo
  2. kubectl logs -n liqo broadcaster-<clusterID>
  3. In the log, read the error message

Expected behavior
The PeeringRequest should be updated with the status of the Advertisement.

Screenshots
Screenshot from 2020-09-11 16-40-00

[Epic] Virtual Kubelet Enhancements

Is your feature request related to a problem? Please describe.

So far, the Liqo implementation of virtual-kubelet still lacks support for several features:

Virtual Kubelet Interface:

  • kubectl cp
  • kubectl port-forward
  • kubelet metrics export

Reflector Issues:

  • EndpointSlice Support
  • Default Namespace issues (ref. #157)

Liqo Exploration:

  • Linking advertisement to Virtual Node (#227)

Advertisement reference deletion in foreignCluster is broken

Describe the bug
When the advertisement of a cluster expires, its reference and the reference to the virtualKubelet identity are not correctly purged from the ForeignCluster (FC). The FC still references them, as shown below:

apiVersion: discovery.liqo.io/v1alpha1
kind: ForeignCluster
metadata:
  creationTimestamp: "2020-09-19T09:52:09Z"
  finalizers:
  - foreigncluster.discovery.liqo.io/peered
  generation: 7
  labels:
    cluster-id: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
    discovery-type: WAN
  managedFields:
  - apiVersion: discovery.liqo.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:outgoing:
          f:advertisement:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
    manager: advertisement-operator
    operation: Update
    time: "2020-09-19T09:52:15Z"
  - apiVersion: discovery.liqo.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .: {}
          v:"foreigncluster.discovery.liqo.io/peered": {}
        f:labels:
          .: {}
          f:cluster-id: {}
          f:discovery-type: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"61f7e7ce-f1b2-44ef-bdf9-32f43b08f068"}:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:spec:
        .: {}
        f:allowUntrustedCA: {}
        f:apiUrl: {}
        f:clusterID: {}
        f:discoveryType: {}
        f:join: {}
        f:namespace: {}
      f:status:
        .: {}
        f:incoming:
          .: {}
          f:joined: {}
        f:outgoing:
          .: {}
          f:advertisementStatus: {}
          f:identityRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
          f:joined: {}
          f:remote-peering-request-name: {}
    manager: discovery
    operation: Update
    time: "2020-09-21T08:46:36Z"
  name: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
  ownerReferences:
  - apiVersion: discovery.liqo.io/v1alpha1
    kind: SearchDomain
    name: x.y.z
    uid: 61f7e7ce-f1b2-44ef-bdf9-32f43b08f068
  resourceVersion: "307307896"
  selfLink: /apis/discovery.liqo.io/v1alpha1/foreignclusters/9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
  uid: 58eb5e76-d33f-42f3-853e-4cb4dd78485e
spec:
  allowUntrustedCA: false
  apiUrl: https://apiserver.crownlabs.polito.it.:443
  clusterID: 9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
  discoveryType: WAN
  join: true
  namespace: liqo
status:
  incoming:
    joined: false
  outgoing:
    advertisement:
      apiVersion: sharing.liqo.io/v1alpha1
      kind: Advertisement
      name: advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
      uid: a1f20c37-c97f-4043-acf3-fac893d85b8a
    advertisementStatus: Accepted
    identityRef:
      apiVersion: v1
      kind: Secret
      name: vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9
      namespace: liqo
    joined: true
    remote-peering-request-name: 09f709f0-96bd-48ad-9e1a-8efb22bed89e

kubectl get secret -n liqo
NAME TYPE DATA AGE
9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-token-mzgkn kubernetes.io/service-account-token 3 46h
advertisement-operator-token-kk28r kubernetes.io/service-account-token 3 47h
broadcaster-token-qj7c7 kubernetes.io/service-account-token 3 47h
ca-data Opaque 1 47h
crdreplicator-operator-service-account-token-6q4x7 kubernetes.io/service-account-token 3 47h
dashboard-cert kubernetes.io/tls 2 47h
default-token-flgmz kubernetes.io/service-account-token 3 47h
discovery-sa-token-t4vd4 kubernetes.io/service-account-token 3 47h
liqodash-admin-sa-token-zxtjn kubernetes.io/service-account-token 3 47h
peering-request-operator-token-7882f kubernetes.io/service-account-token 3 47h
peering-request-webhook-certs Opaque 2 47h
pod-mutator-secret Opaque 2 47h
podmutatoraccount-token-zrr8c kubernetes.io/service-account-token 3 47h
route-operator-service-account-token-rsk7s kubernetes.io/service-account-token 3 47h
sh.helm.release.v1.liqo.v1 helm.sh/release.v1 1 47h
sn-operator-token-262z6 kubernetes.io/service-account-token 3 47h
tunnel-operator-service-account-token-wj9fg kubernetes.io/service-account-token 3 47h
tunnelendpointcreator-operator-service-account-token-kslrp kubernetes.io/service-account-token 3 47h

kubectl get advertisements.sharing.liqo.io
No resources found in default namespace.

Impossible to install a dashboard version different from latest

Describe the bug
Whatever LIQO_VERSION is selected, the dashboard is always installed using the latest tag. This is inconsistent with the rest of the components and raises the usual concerns regarding the usage of the latest tag (e.g. a restart may cause a new version to be used).

To Reproduce

  1. Install Liqo: curl -sL https://raw.githubusercontent.com/liqotech/liqo/master/install.sh | LIQO_VERSION=... bash
  2. Check the dashboard container version: kubectl get po liqo-dashboard-7977f68bc4-ml4sg -o yaml | grep ' image:'

Expected behavior
Since the dashboard is located in a different repository, its commit SHA will never be the same as that of the other components. However, I believe that the tagged versions should be kept aligned between the two repositories to avoid confusion, while for installations from master the latest commit tag should be retrieved and used, as is done for the other components.

[Doc] Virtual Kubelet

Virtual kubelet

This issue describes the virtual kubelet lifecycle and the pod lifecycle. The last section ends with a bullet list of the known problems related to remote pod status reconciliation; the list can be updated over time and the problems should be solved in the next PRs.

Virtual kubelet lifecycle

At boot time, the virtual kubelet fetches from etcd a CR of kind namespacenattingtable (or creates it if it doesn't exist) that contains the namespace natting table for the given virtual node, i.e., the translation between local and remote namespaces. Every time a new entry is added to this natting table, a new reflection routine for that namespace is triggered; this routine implies:

  • the remote reflection of many different resources, among which:
    • service
    • endpoints
    • configmap
    • secret
  • the remote pod-watcher for the translated remote namespace

Resource reflection

The reflection of resources implies that each local resource is translated (if needed) and reflected remotely, so that a pod in the remote namespace has a complete view of the local namespace resources as if it were local.

Remote pod-watcher

The remote pod-watcher is a routine that listens for all the events related to a remotely offloaded pod in a given translated namespace; this is needed to reconcile the remote status with the local one, so that the local cluster always knows in which state each offloaded pod is. Some remote status transitions trigger the providerFailed status in the local pod instance: providerFailed means that the local status cannot be correctly updated because of an unrecognized remote status transition. We need to investigate more deeply to understand when and why this status is triggered, and to avoid it as much as possible (a sketch of the status mapping follows the list below).
The currently known reasons that trigger this status are:

  • deletion of an offloaded pod from the remote cluster
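
To illustrate where providerFailed comes from, here is a minimal sketch of a remote-to-local status mapping (a hypothetical helper, not the Liqo code): any transition that the switch does not recognize is surfaced as a provider failure.

// Hypothetical sketch of the remote-to-local pod phase mapping performed by
// the pod-watcher; unrecognized transitions are surfaced as a provider failure.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// translateRemotePhase maps the phase observed on the foreign cluster to the
// phase to be reported on the local pod.
func translateRemotePhase(remote corev1.PodPhase) (corev1.PodPhase, error) {
	switch remote {
	case corev1.PodPending, corev1.PodRunning, corev1.PodSucceeded, corev1.PodFailed:
		// Known phases are reported as-is on the local pod.
		return remote, nil
	default:
		// e.g. the remote pod has been deleted: the local status cannot be
		// reconciled, which is what currently shows up as providerFailed.
		return corev1.PodUnknown, fmt.Errorf("unrecognized remote status transition: %q", remote)
	}
}

func main() {
	phase, err := translateRemotePhase(corev1.PodRunning)
	fmt.Println(phase, err)
}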

DNS Discovery: unable to set Allow Untrusted CA to true

Describe the bug
It is impossible to set the Allow Untrusted CA parameter of a ForeignCluster to true when DNS discovery is leveraged (i.e. the ForeignCluster is generated by a SearchDomain). Whenever the value is manually set to true by editing the resource, it is reset to false by the operator a few seconds later.

To Reproduce
Steps to reproduce the behavior:

  1. Install Liqo
  2. Create a SearchDomain resource to trigger the DNS discovery process
  3. Wait for the creation of the ForeignCluster and set Allow Untrusted CA to true
  4. Wait a few seconds and check that the actual value of the property is false

Expected behavior
The user input should not be discarded.

Additional context
This is the log of the discovery operator, from the manual edit to the reset to the original value:

I0911 14:48:14.632029       1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.642819       1 foreign-cluster-controller.go:206] Get CA Data
I0911 14:48:14.695077       1 foreign-cluster-controller.go:223] CA Data successfully loaded for ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.695138       1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:14.719931       1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled
I0911 14:48:23.178066       1 search-domain-controller.go:26] Reconciling SearchDomain ***.***.it
I0911 14:48:23.209959       1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:23.210210       1 foreign.go:53] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e updated
I0911 14:48:23.219507       1 search-domain-controller.go:113] SearchDomain ***.***.it successfully reconciled
I0911 14:48:23.241776       1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled
I0911 14:48:37.716436       1 foreign-cluster-controller.go:61] Reconciling ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e
I0911 14:48:37.744811       1 foreign-cluster-controller.go:308] ForeignCluster 09f709f0-96bd-48ad-9e1a-8efb22bed89e successfully reconciled

The virtual-kubelet is stuck in "Waiting for approval of CSR"

Describe the bug
After creating a SearchDomain CR to trigger the discovery process, the peering process seems to be completed correctly. The virtual-kubelet pod is created, but it remains stuck in the Init status, waiting for the approval of a CSR. Once the CSR is manually approved, the pod starts running and the virtual node is created.

$ kubectl logs -n liqo virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj -c crt-generator
/etc/virtual-kubelet/certs
2020/09/10 09:34:15 [INFO] generate received request
2020/09/10 09:34:15 [INFO] received CSR
2020/09/10 09:34:15 [INFO] generating key: ecdsa-256
2020/09/10 09:34:15 [INFO] encoded CSR
certificatesigningrequest.certificates.k8s.io/virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj created
Wait for CSR to be signed
Waiting for approval of CSR: virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj

To Reproduce
Steps to reproduce the behavior:

  1. Install Liqo
  2. Create a new SearchDomain resource to trigger the discovery process
  3. Check the virtual-kubelet pod status

Expected behavior
No manual intervention should be required.
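
Until the approval is automated inside Liqo, the CSR can be approved manually (kubectl certificate approve <csr-name>); the following is a minimal sketch of what an in-cluster approver could do, assuming client-go ≥ v0.19 and a cluster serving the certificates.k8s.io/v1 API.

// Sketch: programmatically approve the virtual-kubelet CSR.
package main

import (
	"context"

	certificatesv1 "k8s.io/api/certificates/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func approveCSR(ctx context.Context, client kubernetes.Interface, name string) error {
	csr, err := client.CertificatesV1().CertificateSigningRequests().Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}

	// Mark the CSR as approved; the controller-manager then signs it and the
	// crt-generator init container can proceed.
	csr.Status.Conditions = append(csr.Status.Conditions, certificatesv1.CertificateSigningRequestCondition{
		Type:    certificatesv1.CertificateApproved,
		Status:  corev1.ConditionTrue,
		Reason:  "LiqoAutoApproval", // illustrative
		Message: "CSR approved for the Liqo virtual-kubelet",
	})

	_, err = client.CertificatesV1().CertificateSigningRequests().
		UpdateApproval(ctx, name, csr, metav1.UpdateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	if err := approveCSR(context.TODO(), client,
		"virtual-kubelet-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9-658dfxfcpj"); err != nil {
		panic(err)
	}
}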

Additional context

apiVersion: discovery.liqo.io/v1alpha1
kind: SearchDomain
metadata:
  name: ***.polito.it
spec:
  domain: ***.polito.it
  autojoin: true

TunnelOperator crashes waiting for the TunnelEndpoint

Describe the bug
After joining a foreign cluster, the tunnel-operator experiences a crash while the TunnelEndpoint is not yet ready.

Traces

kubectl get po -n liqo -o wide
NAME                                                              READY   STATUS            RESTARTS   AGE     IP           NODE                  NOMINATED NODE   READINESS GATES
advertisement-operator-65cc9bb44f-7qhsm                           1/1     Running           0          7m51s   10.200.1.5   liqo2-worker          <none>           <none>
crdreplicator-operator-6877454c8c-dltvt                           1/1     Running           0          7m51s   10.200.1.6   liqo2-worker          <none>           <none>
discovery-7c747664c6-7b8fl                                        1/1     Running           0          7m51s   172.18.0.5   liqo2-worker          <none>           <none>
liqo-dashboard-7c955d968f-jfpwz                                   1/1     Running           0          7m51s   10.200.1.7   liqo2-worker          <none>           <none>
peering-request-operator-5f59d778c7-74mng                         0/1     PodInitializing   0          7m50s   10.200.1.8   liqo2-worker          <none>           <none>
podmutator-64b9588fb-g98bf                                        0/1     PodInitializing   0          7m51s   10.200.1.2   liqo2-worker          <none>           <none>
route-operator-fvxft                                              1/1     Running           0          7m51s   172.18.0.4   liqo2-control-plane   <none>           <none>
route-operator-h6fzr                                              1/1     Running           0          7m51s   172.18.0.5   liqo2-worker          <none>           <none>
schedulingnode-operator-75474f96cc-cwhhj                          1/1     Running           0          7m51s   10.200.1.3   liqo2-worker          <none>           <none>
tunnel-operator-76466f5bd9-j79nd                                  0/1     Error             0          7m51s   172.18.0.5   liqo2-worker          <none>           <none>
tunnelendpointcreator-operator-68dc8f78d-nczsv                    1/1     Running           0          7m51s   10.200.1.4   liqo2-worker          <none>           <none>
virtual-kubelet-6575d0b9-6fba-4f7d-b890-a6417009cb64-6db56xhwmv   0/1     Init:0/1          0          8s      <none>       liqo2-worker          <none>           <none>
kubectl logs -n liqo tunnel-operator-76466f5bd9-j79nd
2020-09-19T09:12:31.402Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":0"}
2020-09-19T09:12:31.408Z	INFO	setup	Starting manager as Tunnel-Operator
2020-09-19T09:12:31.409Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
2020-09-19T09:12:31.409Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint", "source": "kind source: /, Kind="}
2020-09-19T09:12:31.510Z	INFO	controller	Starting Controller	{"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint"}
2020-09-19T09:12:31.510Z	INFO	controller	Starting workers	{"reconcilerGroup": "net.liqo.io", "reconcilerKind": "TunnelEndpoint", "controller": "tunnelendpoint", "worker count": 1}
2020-09-19T09:15:34.735Z	DPANIC	liqonetOperators.TunnelEndpoint	odd number of arguments passed as key-value pairs for logging	{"endpoint": "/tun-endpoint-6575d0b9-6fba-4f7d-b890-a6417009cb64", "ignored key": "is not ready"}
github.com/go-logr/zapr.handleFields
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106
github.com/go-logr/zapr.(*infoLogger).Info
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile
	/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90
E0919 09:15:34.736059       1 runtime.go:76] Observed a panic: odd number of arguments passed as key-value pairs for logging
goroutine 233 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x15d99e0, 0xc0005b4220)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x15d99e0, 0xc0005b4220)
	/usr/local/go/src/runtime/panic.go:969 +0x166
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0002c8420, 0xc000048f40, 0x1, 0x1)
	/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:230 +0x545
go.uber.org/zap.(*Logger).DPanic(0xc000652660, 0x188062f, 0x3d, 0xc000048f40, 0x1, 0x1)
	/go/pkg/mod/go.uber.org/[email protected]/logger.go:215 +0x7f
github.com/go-logr/zapr.handleFields(0xc000652660, 0xc0002cc090, 0x3, 0x3, 0x0, 0x0, 0x0, 0x30, 0x15e97e0, 0x1)
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106 +0x5ce
github.com/go-logr/zapr.(*infoLogger).Info(0xc000117668, 0x183877b, 0xc, 0xc0002cc090, 0x3, 0x3)
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70 +0xb1
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile(0xc000367040, 0x0, 0x0, 0xc00034e680, 0x31, 0xc0001175c0, 0xc0005ee2d0, 0xc0005ee248, 0xc0005ee240)
	/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59 +0x27f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000ca5a0, 0x16ae100, 0xc000117580, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x284
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000ca5a0, 0x203000)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0000ca5a0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00064e930)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00064e930, 0x1a60cc0, 0xc00052b890, 0x1, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00064e930, 0x3b9aca00, 0x0, 0x1, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00064e930, 0x3b9aca00, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:170 +0x411
panic: odd number of arguments passed as key-value pairs for logging [recovered]
	panic: odd number of arguments passed as key-value pairs for logging

goroutine 233 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x15d99e0, 0xc0005b4220)
	/usr/local/go/src/runtime/panic.go:969 +0x166
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc0002c8420, 0xc000048f40, 0x1, 0x1)
	/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:230 +0x545
go.uber.org/zap.(*Logger).DPanic(0xc000652660, 0x188062f, 0x3d, 0xc000048f40, 0x1, 0x1)
	/go/pkg/mod/go.uber.org/[email protected]/logger.go:215 +0x7f
github.com/go-logr/zapr.handleFields(0xc000652660, 0xc0002cc090, 0x3, 0x3, 0x0, 0x0, 0x0, 0x30, 0x15e97e0, 0x1)
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:106 +0x5ce
github.com/go-logr/zapr.(*infoLogger).Info(0xc000117668, 0x183877b, 0xc, 0xc0002cc090, 0x3, 0x3)
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:70 +0xb1
github.com/liqotech/liqo/internal/liqonet.(*TunnelController).Reconcile(0xc000367040, 0x0, 0x0, 0xc00034e680, 0x31, 0xc0001175c0, 0xc0005ee2d0, 0xc0005ee248, 0xc0005ee240)
	/go/src/github.com/liqotech/liqo/internal/liqonet/tunnel-operator.go:59 +0x27f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0000ca5a0, 0x16ae100, 0xc000117580, 0x0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x284
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000ca5a0, 0x203000)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:209 +0xae
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0000ca5a0)
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:188 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00064e930)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00064e930, 0x1a60cc0, 0xc00052b890, 0x1, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00064e930, 0x3b9aca00, 0x0, 0x1, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00064e930, 0x3b9aca00, 0xc0004e6c00)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:90 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:170 +0x411
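
The DPANIC above comes from the structured logger: logr's Info expects a message followed by an even number of key/value arguments, while the call at tunnel-operator.go:59 apparently passes the "is not ready" string as a dangling key. A minimal sketch of the broken pattern and its fix (illustrative variable names, not the exact Liqo call):

// Illustrative reproduction of the logging bug and its fix.
package main

import (
	"github.com/go-logr/zapr"
	"go.uber.org/zap"
)

func main() {
	zapLog, err := zap.NewDevelopment() // DPanic actually panics in development mode
	if err != nil {
		panic(err)
	}
	log := zapr.NewLogger(zapLog)

	endpointName := "tun-endpoint-6575d0b9-6fba-4f7d-b890-a6417009cb64"

	// Broken (what the trace suggests): an odd number of arguments follows the
	// message, so "is not ready" dangles as an "ignored key" and zapr emits the
	// DPANIC seen above. Uncommenting this line reproduces the crash.
	// log.Info("processing", "endpoint", endpointName, "is not ready")

	// Fixed: message first, then complete key/value pairs.
	log.Info("tunnelEndpoint is not ready", "endpoint", endpointName)
}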

When the advertisement expires, the foreign identity is not set and verified by the broadcaster

Describe the bug
In case of a network failure, when the advertisement expires, the foreign identity is not set and verified by the broadcaster. The broadcaster just creates the new advertisement, but does not ensure that the virtual-kubelet identity is present (a sketch of the missing step follows the log below).

From the log of the advertisement operator:

I0921 08:42:15.683278       1 controller.go:330] Adv advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 expired. TimeToLive was 2020-09-21 08:33:38 +0000 UTC
I0921 08:42:15.752570       1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:43:15.752861       1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:44:15.753263       1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:45:15.753614       1 controller.go:83] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 deleted
I0921 08:45:48.685724       1 controller.go:311] Advertisement advertisement-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 accepted
E0921 08:45:48.874187       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:49.884634       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:50.894234       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:51.910722       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:52.920457       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:53.937163       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
E0921 08:45:54.946737       1 controller.go:252] Cannot find secret vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9 in namespace liqo for the virtual kubelet; error: secrets "vk-kubeconfig-secret-9a596a4b-591c-4ac6-8fd6-80258b4b3bf9" not found
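
A minimal sketch of the missing step (hypothetical helper, assuming client-go): before recreating the Advertisement after an expiry, the broadcaster could check that the virtual-kubelet kubeconfig Secret still exists and recreate it when it does not.

// Hypothetical sketch: ensure the virtual-kubelet identity Secret exists
// before (re)creating the Advertisement.
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// ensureVKKubeconfigSecret checks for vk-kubeconfig-secret-<clusterID> and
// recreates it through the provided builder when it is missing.
func ensureVKKubeconfigSecret(ctx context.Context, c kubernetes.Interface, namespace, clusterID string,
	buildSecret func() *corev1.Secret) error {

	name := "vk-kubeconfig-secret-" + clusterID
	_, err := c.CoreV1().Secrets(namespace).Get(ctx, name, metav1.GetOptions{})
	if err == nil {
		return nil // identity already present
	}
	if !apierrors.IsNotFound(err) {
		return err
	}

	// The Secret was purged together with the expired advertisement: recreate
	// it so that the virtual-kubelet can authenticate again.
	_, err = c.CoreV1().Secrets(namespace).Create(ctx, buildSecret(), metav1.CreateOptions{})
	return err
}

func main() {
	fmt.Println("sketch only: wire ensureVKKubeconfigSecret into the broadcaster before creating the Advertisement")
}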

Wrong IP advertised when running in kind

Describe the bug
Sometimes, when running Liqo in kind, the discovery component advertises the wrong IP on the LAN. For example, if the correct IP is 172.18.0.x, it advertises 192.168.200.x.

To Reproduce
Run the Liqo installer in a kind cluster

[Feature] Release Pipeline

Is your feature request related to a problem? Please describe.
We need a Release Pipeline to build docker images and agent artifacts to release the first Liqo version.

[Feature] Graceful deletion of Advertisement and virtual-kubelet

Description
When we delete an Advertisement, the deployment of the linked virtual-kubelet is deleted as well because of the OwnerReference, and this deletion triggers the deletion of the virtual node.
This way, the resources that have been created on the foreign cluster are not cleaned up; therefore, the behaviour we would like to have is the opposite:

  1. delete all resources on foreign cluster
  2. delete the virtual-node
  3. delete the virtual-kubelet deployment
  4. delete the Advertisement

Proposed solution
Set a finalizer on the Advertisement, which triggers the virtual-kubelet: it deletes the entry linked to the Advertisement in the NamespaceNattingTable. This triggers the deletion of all resources created on the foreign cluster by the reflector. After that, we can proceed with the deletion of the Advertisement (and its genealogy).
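
A minimal sketch of the finalizer bookkeeping (independent of the client library; the finalizer name and the cleanup hook are illustrative, not Liqo code):

// Illustrative finalizer handling for the Advertisement resource.
package main

const advertisementFinalizer = "advertisement.sharing.liqo.io/cleanup" // illustrative name

// reconcileFinalizer shows the usual pattern: add the finalizer while the
// object is alive, and run the cleanup (purge the NamespaceNattingTable entry,
// hence the foreign resources, virtual node and virtual-kubelet deployment)
// before removing it when the object is being deleted.
func reconcileFinalizer(finalizers []string, beingDeleted bool, cleanup func() error) ([]string, error) {
	has := contains(finalizers, advertisementFinalizer)

	if !beingDeleted {
		if !has {
			finalizers = append(finalizers, advertisementFinalizer)
		}
		return finalizers, nil
	}

	if has {
		if err := cleanup(); err != nil {
			return finalizers, err // keep the finalizer and retry on the next reconcile
		}
		finalizers = removeString(finalizers, advertisementFinalizer)
	}
	return finalizers, nil
}

func contains(s []string, v string) bool {
	for _, x := range s {
		if x == v {
			return true
		}
	}
	return false
}

func removeString(s []string, v string) []string {
	out := make([]string, 0, len(s))
	for _, x := range s {
		if x != v {
			out = append(out, x)
		}
	}
	return out
}

func main() {
	fins, _ := reconcileFinalizer(nil, false, func() error { return nil })
	_, _ = reconcileFinalizer(fins, true, func() error { return nil })
}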

LAN discovery does not create new ForeignClusters

Describe the bug
When the two peering hosts have at least one overlapping IP, mDNS packets are received but ForeignClusters are not created

To Reproduce
Steps to reproduce the behavior:

  1. Add an interface with the same IP on both peering hosts (in different networks)
  2. Start LAN discovery
  3. Using your favorite network capture tool check that mDNS packets are coming
  4. Check that no new ForeignCluster has been created

Expected behavior
A new ForeignCluster should appear

[Roadmap] E2E Testing

Is your feature request related to a problem? Please describe.

In order to improve the testing of Liqo, we would like to add testing on real Liqo Deployments

Steps

  • Handle cluster lifecycle
    • Create different clusters for different PRs
    • Delete the right VMs
  • Create a feedback loop to build clusters:
    • Deploy one cluster + Liqo Install
    • Get the feedback loop*
  • Add Assertions

The virtual kubelet keeps failing when trying to reflect the kubernetes service in the default namespace

Describe the bug
When deploying a pod/service in the default namespace, the virtual kubelet keeps failing while trying to reflect the kubernetes service.

To Reproduce
Steps to reproduce the behavior:

  1. Label the default namespace liqo.io/enabled=true
  2. Create a pod and a service in default namespace

Expected behavior
Virtual Kubelet should not crash

Screenshots
(two screenshots)

Desktop (please complete the following information):

  • OS: Ubuntu 20.04 LTS (Desktop)
  • clusters: k3s version v1.18.6+k3s1 (6f56fa1d)

Additional context
Running on two peered K3s clusters.

Network Anomalies

Describe the bug
Two clusters have been peered (cluster1 and cluster2) and the network connection between them has been established. Network anomalies are observed on node cluster1-node1 when trying to communicate with services running in cluster2: ICMP packets are routed correctly, but TCP/UDP traffic is not.

To Reproduce

  1. cluster1: kubectl get pods -n liqo-demo -o wide
  2. cluster1: kubectl exec -it -n liqo-demo podRunningOnSpring -- bash
  3. from inside the pod "ping" a pod running on cluster2: ping IPCluster2Pod
  4. it should work fine
  5. as in step 3, try to use curl or wget against a service running on cluster2
  6. it should hang, with no response received
  7. ssh to the cluster1-node1 node and do the same as in steps 3 and 5: the same problem is present
  8. ssh to the cluster1-node2 node and do the same as in steps 3 and 5: everything should work fine

Expected behavior
The services running on the cluster2 cluster should be reachable from the cluster1-node1 node.

Debug

  1. cluster1: kubectl get nodes --show-labels and log-in to the node with label: liqo.io/gateway=true (fall)
  2. exec the following command sudo watch -n1 -d "iptables -vnxL -t nat | grep -v -e pkts -e Chain | sort -nk1 | tac | column -t" and keep this terminal running
  3. keep an eye on the line SNAT all -- * gretun_ 0.0.0.0/0 0.0.0.0/0 to:10.244.0.0
  4. from inside pod scheduled in the node1 node in cluster1 "ping" a pod running on cluster2: ping IPCluster2Pod
  5. the counters of the line above should increment by one, which means that the source natting is working fine
  6. as in step 4, try to use curl or wget against a service running on cluster2
  7. the counters of the line in step 3 do not change: the source natting for TCP/UDP packets coming from cluster1's node1 node is not performed.


[Feature] In-going Advertisement Policing

Is your feature request related to a problem? Please describe.
So far, incoming advertisements are automatically accepted. It would be more effective to have an advertisement acceptance configuration, letting the user accept or refuse incoming advertisements.

Policies:

  • ClusterConfig (Incoming advertisement): (Ref. #181)
    • Autoaccept
    • Manual
    • Auto-deny
  • ClusterConfig (Outgoing advertisement) (Done)
    • % cluster (Equal for everybody)
    • Auto-advertise / No-Advertise
    • Auto-join
  • Documentation (Configure Liqo)
    • How to set configuration
    • How to accept/deny

The SearchDomain autojoin=true flag is not respected if a unidirectional peering already exists

Describe the bug
In the context of DNS discovery, when a unidirectional peering has already been established, the creation of a SearchDomain for the opposite direction does not trigger the autojoin process (although autojoin=true). Instead, it is necessary to manually set the join property in the corresponding foreigncluster resource (which is otherwise set to false).

To Reproduce
Steps to reproduce the behavior:

  1. Install liqo on two clusters (A and B)
  2. Create a SearchDomain resource in A pointing to B (with autojoin=true) and observe that the virtual kubelet is correctly created
  3. Create a SearchDomain resource in B pointing to A (with autojoin=true) and observe that the virtual kubelet is NOT created
  4. Check the corresponding foreigncluster resource in B and observe that the join property is set to false

Expected behavior
Both clusters should correctly perform the autojoin

[Feature] E2E Testing for Liqo Connectivity

Is your feature request related to a problem? Please describe.
So far, Liqo connectivity does not have E2E tests. We should implement those tests to avoid regressions in new versions.

Endpoint Reflector does not consider Pod address natting

Describe the bug
This bug happens in the presence of colliding pod CIDRs between two clusters: the endpoint reflector does not take the natting table into consideration when updating endpoints. This results in wrong routing across clusters.

Screenshot from 2020-09-09 16-57-33

Screenshot from 2020-09-09 16-57-58

VirtualKubelet uses 100% of a single core

Describe the bug
When a virtual kubelet starts, it uses a core at 100%

To Reproduce

  1. Make two clusters peer
  2. Run the top command on the host running the virtual kubelet

Screenshots
Screenshot from 2020-08-28 11-30-11
Screenshot from 2020-08-28 11-31-10

Desktop (please complete the following information):

  • OS: Ubuntu 20

Additional context
I'm running both clusters on k3s

[Feature] User Documentation

Is your feature request related to a problem? Please describe.
Improve documentation for Liqo Users:

  • Tutorial
  • Tutorial (Deploy local - foreign)
  • Discovery
  • Architecture
  • Infrastructure Setup

[Feature] Move Virtual kubelet manifest from advertisement operator

The virtual kubelet deployment is created by the advertisement operator: the deployment object is hard-coded in the operator source code. This approach makes it hard to customize the virtual kubelet flags and worsens the maintainability of the virtual kubelet creation process. For these reasons, the virtual kubelet deployment declaration should be decoupled from the advertisement operator code.

[Feature] Reduce permissions of Liqo Components

So far, many Liqo components rely on a "cluster admin" ClusterRole, which is not always necessary. We should move to better-tailored ClusterRoles for each component.

Components Affected:

  • PeeringRequestOperator (@aleoli)
  • SchedulingNodeOperator (Nodes, Advertisement, SchedulingNode) (@mlavacca)
  • Discovery (@aleoli)
  • AdvertisementOperator ( Advertisement, Deployment, ForeignCluster) (@aleoli,@mlavacca)
  • MutatingWebhook (Pod, Ns) (@aleoli, @mlavacca)
  • Broadcaster (@aleoli)
    - Local: (Nodes, PeeringRequest)
    - Remote: (Specific Advertisement)
  • VirtualKubelet (@mlavacca)
    - Local: (Pod, Svc, EPslice, Secrets, Configmap, Nodes, Namespace, Advertisement, TunnelEndpoint)
    - Remote: TBD

Wrong virtual node ownerReference

Describe the bug
The virtual node ownerReference no longer points to an existing object: it references the liqo-<clusterID> deployment, which is now named virtual-kubelet-<clusterID>.

[Feature] Reflection Improvement

Some reflection improvements that could handle some corner cases or complete the event management:

  • Handle remote pod updates without incurring throttling
  • Handle the change event type to manage unexpected resource states (e.g., recreation of remote resources after a remote deletion)

[Epic] Discovery and Peering Fundamentals

Improvements and known issues

  • discovery-broadcaster integration (Ref. #18 ): when a new foreign cluster is discovered, a new broadcaster deployment for that cluster should be started.
    The broadcaster needs to be modified in how it retrieves the foreign kubeconfig (no longer from a ConfigMap)
  • WAN: the discovery process has to be extended to use the DNS protocol
  • Retrieve public kubeconfig: currently, the public kubeconfig is stored in a ConfigMap served by an nginx Deployment; we have to give public access through the API Server
  • Advertisement acceptance: at the moment, all Advertisements are automatically accepted. We need logic (integrated with the systray?) which triggers a notification every time we receive a new Advertisement and allows the user to accept/decline the CR
  • Policies: discuss the implementation of outgoing/incoming policies (customize outgoing Adv, accept incoming Adv...) (Ref. #176)
  • A ForeignCluster should be added as a reference when receiving a PeeringCluster, if not previously discovered.
  • The ForeignCluster resource should provide the status of the incoming and outgoing peerings of a specific ForeignCluster (Ref. #171)
  • So far, the cluster which sends an Advertisement has no visibility of the status of the remote cluster. We should introduce a feedback loop in the ForeignCluster status to notify the provider about advertisement statuses.
  • Advertisement complete lifecycle: delete an Advertisement if the timeToLive has expired.

Secret recreation error

Describe the bug
When the broadcaster tries to recreate the Secret for the VirtualKubelet, an error occurs because the ResourceVersion field is still set (a sketch of the fix follows the reproduction steps).

E0925 10:11:22.708028       1 broadcaster.go:344] Unable to create secret vk-kubeconfig-secret-433f86df-2734-49e4-9dd7-9f6fd364b88f on remote cluster 23ca2126-3b7c-4a35-8186-78c4c76e5f36; error: resourceVersion should not be set on objects to be created
E0925 10:11:22.708051       1 broadcaster.go:176] resourceVersion should not be set on objects to be created Error while sending Secret for virtual-kubelet to cluster 23ca2126-3b7c-4a35-8186-78c4c76e5f36

To Reproduce

  1. Peer 2 clusters
  2. Delete the Advertisement on cluster1 (or directly delete the secret vk-kubeconfig-secret-<clusterID>)
  3. On cluster2, kubectl logs -n liqo broadcaster-<clusterID>
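
A minimal sketch of the likely fix (assuming client-go and that the broadcaster reuses the Secret object read from the local cluster): clear the server-populated metadata before creating the copy on the remote cluster.

// Sketch: strip server-populated metadata before re-creating a Secret on a
// remote cluster; otherwise the API server rejects the request with
// "resourceVersion should not be set on objects to be created".
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func recreateSecret(ctx context.Context, remote kubernetes.Interface, namespace string, local *corev1.Secret) error {
	secret := local.DeepCopy()

	// Fields assigned by the source API server must not travel with the copy.
	secret.ResourceVersion = ""
	secret.UID = ""
	secret.SelfLink = ""
	secret.CreationTimestamp = metav1.Time{}
	secret.OwnerReferences = nil

	_, err := remote.CoreV1().Secrets(namespace).Create(ctx, secret, metav1.CreateOptions{})
	return err
}

func main() {
	_ = recreateSecret // wiring into the broadcaster is left to the real code
}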

[Improvement] Network configuration support for Unidirectional Peering

Network Remodeling

Current Network Model

The network configuration is exchanged between two peering clusters using the advertisement.protocol.liqo.io CRD. This was fine before the discovery protocol was implemented. In some cases, the advertisement.protocol.liqo.io CRD is not symmetrically exchanged between two peering clusters, leading to a state where one of the two clusters is missing the network configuration of the other. The following picture depicts the three different states:
JoinFlow

Known Limitations

  • Both clusters should exchange their advertisements to successfully establish a network interconnection

Enhancement Proposal

A possible solution could be to separate the network parameters from the advertisement protocol and exchange them using a different CRD, thus achieving a symmetric sharing of the network configuration between the two peering clusters.

Improvements

  • Clusters can establish their interconnection without having to exchange their advertisements
  • Early detection of network incompatibility

New Architecture

DispatcherDetails

  1. The network configuration of the local cluster is saved in a local CRD (TEP1->2), which also contains the clusterID of the remote cluster it is destined to;
  2. This CRD is replicated to the remote cluster by the dispatcher, which runs in the local cluster, knows how to interact with the remote cluster, and discriminates among multiple peering clusters by their clusterID;
  3. In the remote cluster this CRD (TEP1->2) is processed and its status is updated with the NAT information;
  4. The dispatcher (cluster 1) reflects this change into the local resource (TEP1->2).

At this point the same steps are performed by the peering cluster, so in Cluster 1 we have two CRDs: TEP1->2 and TEP2->1. The first one contains, in its spec section, the information of the local cluster sent to the peering cluster, and in its status the NATting information given by the remote cluster (Cluster2). The second one has, in its spec, the network parameters of Cluster2. Combining the status of TEP1->2 and the spec of TEP2->1, we have all the information needed to establish a connection with the remote Cluster2.

The Dispatchers are in charge of reflecting the changes of the TEP CRDs between the two clusters. Having a bidirectional communication channel with a remote cluster makes it possible to reflect only the spec section of a local CRD toward the remote cluster; the other channel is used to reflect the status of the copy of the local CRD, which lives in the remote cluster, back into the local cluster. The blue arrows indicate that the connection handles only the spec fields, while the red arrows stand for the connection handling the status fields.
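
For illustration only, a sketch of what such a CRD could look like as Go types (the field names below are assumptions, not the actual NetworkConfig CRD introduced in #154): the spec carries the parameters announced by the local cluster, while the status holds the NAT information filled in by the remote one.

// Illustrative Go types for a network-configuration CRD exchanged between two
// peering clusters; field names are assumptions for the sake of the example.
package main

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NetworkConfigSpec is what the local cluster announces to the remote one.
type NetworkConfigSpec struct {
	// ClusterID of the remote cluster this configuration is destined to.
	ClusterID string `json:"clusterID"`
	// PodCIDR of the local cluster.
	PodCIDR string `json:"podCIDR"`
	// Publicly reachable endpoint of the local tunnel.
	TunnelEndpointIP string `json:"tunnelEndpointIP"`
}

// NetworkConfigStatus is filled in by the remote cluster after processing.
type NetworkConfigStatus struct {
	// Processed is set once the remote cluster has handled the resource.
	Processed bool `json:"processed"`
	// PodCIDRNAT is the CIDR the remote cluster uses to reach the local pods
	// when the original PodCIDR collides with one of its own networks.
	PodCIDRNAT string `json:"podCIDRNAT,omitempty"`
}

// NetworkConfig ties spec and status together, as in any Kubernetes CRD.
type NetworkConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NetworkConfigSpec   `json:"spec"`
	Status NetworkConfigStatus `json:"status,omitempty"`
}

func main() {}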

PRs:

  • NetworkConfig CRD(#154)

    • Introducing a new CRD which is used to exchange the network parameters between two peering clusters.
  • Init PR of CRDReplicator(#162)

    • First version of the operator
  • CRDReplicator enhancement 1(#195)

    • Implementing the remote watchers
  • CRDReplicator enhancement 2(#223)

    • Bug fixing, removing possible race conditions
  • Update tunnelEndpointCreator(#218)

    • the operator should create a networkConfig CRD when receiving a peering request or an advertisement from a peering request
    • reconcile the networkConfig CRDs and create an instance of tunnelEndpoint CRD to model the connection with a remote peering cluster.

[Feature] Gracefully handle advertisement deletion

Is your feature request related to a problem? Please describe.
Advertisement deletion is handled by setting the field AdvertisementStatus = Deleting and does not benefit from the Kubernetes garbage collection mechanisms.

Describe the solution you'd like
The solution will rely on Finalizers, adopted in the context of the Advertisement resource.
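
As a minimal sketch, assuming a controller-runtime reconciler and the project's own Advertisement type, the finalizer handling could look as follows. The finalizer name, the AdvertisementReconciler receiver, and the cleanupVirtualNode helper are hypothetical.

package controllers

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// advFinalizer is a hypothetical finalizer name; Advertisement and
// AdvertisementReconciler are assumed to be the project's own types and are
// not redefined here.
const advFinalizer = "advertisement.liqo.io/finalizer"

func (r *AdvertisementReconciler) handleFinalizer(ctx context.Context, adv *Advertisement) error {
	if adv.ObjectMeta.DeletionTimestamp.IsZero() {
		// Not being deleted: ensure our finalizer is present so that deletion
		// is blocked until the virtual node has been cleaned up.
		if !controllerutil.ContainsFinalizer(adv, advFinalizer) {
			controllerutil.AddFinalizer(adv, advFinalizer)
			return r.Update(ctx, adv)
		}
		return nil
	}

	// Being deleted: run the cleanup, then remove the finalizer so that the
	// Kubernetes garbage collector can actually delete the resource.
	if controllerutil.ContainsFinalizer(adv, advFinalizer) {
		if err := r.cleanupVirtualNode(ctx, adv); err != nil {
			return err
		}
		controllerutil.RemoveFinalizer(adv, advFinalizer)
		return r.Update(ctx, adv)
	}
	return nil
}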

[Feature] Improve Advertisement Generation

Is your feature request related to a problem? Please describe.
The advertisement is the base resource used to establish peering between clusters and to create the virtual nodes. So far, we have a pretty simple approach to generate the Advertisement.

Describe the solution you'd like
We should make it possible to (see the sketch after this list):

  • Avoid duplicate images (Ref. #281)
  • Filter advertised images
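
A minimal sketch of the deduplication and filtering step, assuming the images are represented as corev1.ContainerImage entries (as reported in the node status) and that filtering is driven by a registry prefix whitelist; the function names and the policy are illustrative, not the actual Liqo implementation.

package advertisement

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// dedupAndFilterImages removes duplicate container images and drops those
// that do not match any of the allowed prefixes (e.g. a registry whitelist).
func dedupAndFilterImages(images []corev1.ContainerImage, allowedPrefixes []string) []corev1.ContainerImage {
	seen := make(map[string]bool)
	var result []corev1.ContainerImage

	for _, img := range images {
		for _, name := range img.Names {
			if seen[name] || !hasAllowedPrefix(name, allowedPrefixes) {
				continue
			}
			seen[name] = true
			result = append(result, corev1.ContainerImage{Names: []string{name}, SizeBytes: img.SizeBytes})
		}
	}
	return result
}

func hasAllowedPrefix(name string, prefixes []string) bool {
	if len(prefixes) == 0 {
		return true // no filter configured: keep everything
	}
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}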

Network Connectivity Improvement

  • RouteOperator

    • Dynamically join new nodes to the vxlan overlay network when they are added to the cluster (a minimal route-installation sketch is shown after this list);
  • TunnelOperator

    • A driver to install tunnels using different technologies. Only the GRE tunnel is supported now;
    • Auto-negotiation of network parameters between the clusters, such as the private tunnel IP; (Ref. #151, #173 )
  • All modules

    • Add persistence of the network configuration using a new CRD (Ref. #151)
    • Support dynamic changes to the network configuration without the need to destroy and recreate the network custom resources (Ref. #280, #266, #268)
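
The route-installation step mentioned in the RouteOperator item above could look roughly like the following sketch, which uses the vishvananda/netlink library. The overlay device name ("liqonet"), the CIDR, and the gateway address are placeholders; RouteReplace is chosen so that re-running the operation after a node reboot is harmless.

package liqonet

import (
	"net"

	"github.com/vishvananda/netlink"
)

// addRouteForNode installs a route to a remote pod CIDR via the given
// gateway (e.g. the VXLAN address of a node that just joined the overlay).
func addRouteForNode(podCIDR, gatewayIP, device string) error {
	_, dst, err := net.ParseCIDR(podCIDR)
	if err != nil {
		return err
	}
	link, err := netlink.LinkByName(device)
	if err != nil {
		return err
	}
	route := &netlink.Route{
		LinkIndex: link.Attrs().Index,
		Dst:       dst,
		Gw:        net.ParseIP(gatewayIP),
	}
	// RouteReplace is idempotent: it creates the route or updates it if it
	// already exists, which helps when nodes rejoin after a reboot.
	return netlink.RouteReplace(route)
}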

[EPIC] Move in

To do:

  • Update repository
  • Rename all the packages
  • Move CI to self-hosted runner
  • Fix all CI pipelines
  • Fix go fmt errors

Network misbehavior

After Liqo installation, Kindnet crashes, stating that it is unable to reconcile routes. This is the function that seems to crash when dealing with the Liqonet routes: https://github.com/kubernetes-sigs/kind/blob/d7f948dd8c00084d6ee30eb953471ce3ce375455/images/kindnetd/cmd/kindnetd/main.go#L137

KindNet Logs:

I1004 11:25:59.390304       1 main.go:65] hostIP = 172.18.0.6
podIP = 172.18.0.6
I1004 11:25:59.390670       1 main.go:74] setting mtu 1500 for CNI 
I1004 11:25:59.778503       1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:25:59.778531       1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24 
I1004 11:25:59.778830       1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:25:59.778843       1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:25:59.778849       1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24 
I1004 11:25:59.778936       1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:00.779140       1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:00.779363       1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24 
I1004 11:26:00.779707       1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:02.779986       1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:02.780022       1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24 
I1004 11:26:02.780227       1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
I1004 11:26:05.780436       1 main.go:168] Handling node with IP: 10.200.1.14
I1004 11:26:05.780464       1 main.go:169] Node liqo-3dd6f45d-b53a-4f67-8416-505c4d5ddba5 has CIDR 10.200.3.0/24 
I1004 11:26:05.780638       1 main.go:124] Failed to reconcile routes, retrying after error: network is unreachable
panic: Maximum retries reconciling node routes: network is unreachable

goroutine 1 [running]:
main.main()
	/go/src/cmd/kindnetd/main.go:128 +0x893

Ip Route:

default via 172.18.0.1 dev eth0 
10.75.0.0/16 via 192.168.200.7 dev liqonet 
10.141.0.0/16 via 192.168.200.7 dev liqonet 
10.200.0.2 dev vethc8f0b0ee scope host 
10.200.0.3 dev veth7ad31e05 scope host 
10.200.0.4 dev veth6292c0c6 scope host 
10.200.1.0/24 via 172.18.0.7 dev eth0 
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.6 
192.168.200.0/24 dev liqonet proto kernel scope link src 192.168.200.6 

Iptables:

root@liqo-cluster1-control-plane:/# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
LIQO-INPUT  udp  --  anywhere             anywhere             udp
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
LIQO-FORWARD  all  --  anywhere             anywhere            
KUBE-FORWARD  all  --  anywhere             anywhere             /* kubernetes forwarding rules */
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-FIREWALL  all  --  anywhere             anywhere            

Chain KUBE-EXTERNAL-SERVICES (1 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             ctstate INVALID
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-SERVICES (3 references)
target     prot opt source               destination         

Chain LIQO-FORWARD (1 references)
target     prot opt source               destination         
LIQO-FRWD-CLS-f38be2c9  all  --  anywhere             10.141.0.0/16       
LIQO-FRWD-CLS-3dd6f45d  all  --  anywhere             10.75.0.0/16        

Chain LIQO-FRWD-CLS-3dd6f45d (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             10.75.0.0/16        

Chain LIQO-FRWD-CLS-f38be2c9 (1 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             10.141.0.0/16       

Chain LIQO-INPT-CLS-3dd6f45d (1 references)
target     prot opt source               destination         
ACCEPT     all  --  10.200.0.0/16        10.75.0.0/16        

Chain LIQO-INPT-CLS-f38be2c9 (1 references)
target     prot opt source               destination         
ACCEPT     all  --  10.200.0.0/16        10.141.0.0/16       

Chain LIQO-INPUT (1 references)
target     prot opt source               destination         
ACCEPT     udp  --  anywhere             anywhere             udp dpt:4789
LIQO-INPT-CLS-f38be2c9  all  --  anywhere             10.141.0.0/16       
LIQO-INPT-CLS-3dd6f45d  all  --  anywhere             10.75.0.0/16  

virtual-kubelet is stuck in Init

Describe the bug
After creating two k3s clusters from scratch and installing Liqo in both of them, the "liqo-<...>" pod in the second cluster (in order of Liqo installation) is stuck in the "Init:0/1" status.

To Reproduce
Steps to reproduce the behavior:

  1. Install k3s and Liqo in the first cluster and wait for all the pods to be up and running;
  2. Install k3s and Liqo in the second cluster;
  3. Run kubectl describe pod -n liqo liqo-<...> in the second cluster;
  4. See the following output:
Name:           liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d
Namespace:      liqo
Priority:       0
Node:           rar-k3s-01/10.0.2.4
Start Time:     Tue, 01 Sep 2020 17:40:22 +0200
Labels:         app=virtual-kubelet
                cluster=8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
                pod-template-hash=68f74654dc
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc
Init Containers:
  crt-generator:
    Container ID:  
    Image:         liqo/init-vkubelet:latest
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/local/kubelet-setup.sh
    Args:
      /etc/virtual-kubelet/certs
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      POD_IP:     (v1:status.podIP)
      POD_NAME:  liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d (v1:metadata.name)
    Mounts:
      /etc/virtual-kubelet/certs from virtual-kubelet-crt (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq (ro)
Containers:
  virtual-kubelet:
    Container ID:  
    Image:         liqo/virtual-kubelet:latest
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/virtual-kubelet
    Args:
      --cluster-id
      8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
      --provider
      kubernetes
      --nodename
      liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
      --kubelet-namespace
      liqo
      --provider-config
      /app/kubeconfig/remote
      --home-cluster-id
      d9df783b-cd9b-4d25-ae37-231e21dc9739
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      APISERVER_CERT_LOCATION:  /etc/virtual-kubelet/certs/server.crt
      APISERVER_KEY_LOCATION:   /etc/virtual-kubelet/certs/server-key.pem
      VKUBELET_POD_IP:           (v1:status.podIP)
      VKUBELET_TAINT_KEY:       virtual-node.liqo.io/not-allowed
      VKUBELET_TAINT_VALUE:     true
      VKUBELET_TAINT_EFFECT:    NoExecute
    Mounts:
      /app/kubeconfig/remote from remote-kubeconfig (rw,path="kubeconfig")
      /etc/virtual-kubelet/certs from virtual-kubelet-crt (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq (ro)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  remote-kubeconfig:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
    Optional:    false
  virtual-kubelet-crt:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                   From                 Message
  ----     ------       ----                  ----                 -------
  Normal   Scheduled    <unknown>             default-scheduler    Successfully assigned liqo/liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-68f74654dc-gt52d to rar-k3s-01
  Warning  FailedMount  37m                   kubelet, rar-k3s-01  Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig virtual-kubelet-crt]: timed out waiting for the condition
  Warning  FailedMount  35m (x10 over 39m)    kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found
  Warning  FailedMount  35m                   kubelet, rar-k3s-01  Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig]: timed out waiting for the condition
  Warning  FailedMount  34m (x2 over 34m)     kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "remote-kubeconfig" : failed to sync secret cache: timed out waiting for the condition
  Warning  FailedMount  34m (x2 over 34m)     kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq" : failed to sync secret cache: timed out waiting for the condition
  Warning  FailedMount  34m (x4 over 34m)     kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found
  Warning  FailedMount  33m (x2 over 33m)     kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "remote-kubeconfig" : failed to sync secret cache: timed out waiting for the condition
  Warning  FailedMount  33m (x2 over 33m)     kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq" : failed to sync secret cache: timed out waiting for the condition
  Warning  FailedMount  18m (x6 over 31m)     kubelet, rar-k3s-01  Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq remote-kubeconfig]: timed out waiting for the condition
  Warning  FailedMount  13m (x3 over 25m)     kubelet, rar-k3s-01  Unable to attach or mount volumes: unmounted volumes=[remote-kubeconfig], unattached volumes=[remote-kubeconfig virtual-kubelet-crt liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq]: timed out waiting for the condition
  Warning  FailedMount  3m14s (x21 over 33m)  kubelet, rar-k3s-01  MountVolume.SetUp failed for volume "remote-kubeconfig" : secret "vk-kubeconfig-secret-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef" not found

Also, there is no liqo-<...> node in that cluster.

Expected behavior
The pod should be in "Running" status and there should be a liqo-<...> node in that cluster.

Additional context
kubectl describe foreignclusters.discovery.liqo.io

Name:         8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
Namespace:    
Labels:       cluster-id=8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
              discovery-type=LAN
Annotations:  <none>
API Version:  discovery.liqo.io/v1alpha1
Kind:         ForeignCluster
Metadata:
  Creation Timestamp:  2020-09-01T15:40:06Z
  Generation:          8
  Managed Fields:
    API Version:  discovery.liqo.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:outgoing:
          f:advertisement:
            .:
            f:apiVersion:
            f:kind:
            f:name:
            f:uid:
    Manager:      advertisement-operator
    Operation:    Update
    Time:         2020-09-01T15:40:20Z
    API Version:  discovery.liqo.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .:
          f:cluster-id:
          f:discovery-type:
      f:spec:
        .:
        f:allowUntrustedCA:
        f:apiUrl:
        f:clusterID:
        f:discoveryType:
        f:join:
        f:namespace:
      f:status:
        .:
        f:incoming:
          .:
          f:availableIdentity:
          f:identityRef:
            .:
            f:apiVersion:
            f:kind:
            f:name:
            f:namespace:
            f:uid:
          f:joined:
        f:outgoing:
          .:
          f:advertisementStatus:
          f:caDataRef:
            .:
            f:apiVersion:
            f:kind:
            f:name:
            f:namespace:
            f:uid:
          f:joined:
          f:remote-peering-request-name:
        f:ttl:
    Manager:      discovery
    Operation:    Update
    Time:         2020-09-01T15:40:49Z
    API Version:  discovery.liqo.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:incoming:
          f:peeringRequest:
            .:
            f:name:
            f:uid:
    Manager:         peering-request-operator
    Operation:       Update
    Time:            2020-09-01T15:40:49Z
  Resource Version:  1092
  Self Link:         /apis/discovery.liqo.io/v1alpha1/foreignclusters/8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
  UID:               e4ccbde0-f96a-4d80-bac8-776c87e49e02
Spec:
  Allow Untrusted CA:  true
  API URL:             https://10.0.2.5:6443
  Cluster ID:          8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
  Discovery Type:      LAN
  Join:                true
  Namespace:           liqo
Status:
  Incoming:
    Available Identity:  true
    Identity Ref:
      API Version:  v1
      Kind:         Secret
      Name:         pr-d9df783b-cd9b-4d25-ae37-231e21dc9739
      Namespace:    liqo
      UID:          9318644e-7ed4-49d9-a2bb-c1ccd50fe4c4
    Joined:         true
    Peering Request:
      Name:  8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
      UID:   dd249ebe-78cf-46b8-8c6f-8eb3810eea6f
  Outgoing:
    Advertisement:
      API Version:         sharing.liqo.io/v1alpha1
      Kind:                Advertisement
      Name:                advertisement-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef
      UID:                 7b5e6737-7b9a-4966-9052-970bdb8c996b
    Advertisement Status:  Deleting
    Ca Data Ref:
      API Version:                      v1
      Kind:                             Secret
      Name:                             8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-ca-data
      Namespace:                        liqo
      UID:                              08efcb08-9384-476c-9bb2-e5dd2dc1bdfd
    Joined:                             true
    Remote - Peering - Request - Name:  d9df783b-cd9b-4d25-ae37-231e21dc9739
  Ttl:                                  3
Events:                                 <none>

kubectl get secrets -n liqo

NAME                                                         TYPE                                  DATA   AGE
default-token-r5chc                                          kubernetes.io/service-account-token   3      52m
tunnelendpointcreator-operator-service-account-token-rszjk   kubernetes.io/service-account-token   3      52m
broadcaster-token-rkkx6                                      kubernetes.io/service-account-token   3      52m
peering-request-operator-token-qx4md                         kubernetes.io/service-account-token   3      52m
route-operator-service-account-token-t44br                   kubernetes.io/service-account-token   3      52m
sn-operator-token-4qknf                                      kubernetes.io/service-account-token   3      52m
tunnel-operator-service-account-token-ksr46                  kubernetes.io/service-account-token   3      52m
liqodash-admin-sa-token-lc7pv                                kubernetes.io/service-account-token   3      52m
discovery-sa-token-76r9c                                     kubernetes.io/service-account-token   3      52m
crdreplicator-operator-service-account-token-zbkkm           kubernetes.io/service-account-token   3      52m
podmutatoraccount-token-9l6z9                                kubernetes.io/service-account-token   3      52m
advertisement-operator-token-xk6fn                           kubernetes.io/service-account-token   3      52m
sh.helm.release.v1.liqo.v1                                   helm.sh/release.v1                    1      52m
ca-data                                                      Opaque                                1      51m
pr-d9df783b-cd9b-4d25-ae37-231e21dc9739                      Opaque                                1      51m
8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-ca-data                 Opaque                                1      51m
8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-8lbfs             kubernetes.io/service-account-token   3      51m
peering-request-webhook-certs                                Opaque                                2      51m
pod-mutator-secret                                           Opaque                                2      51m
liqo-8d40ca93-fd45-4b1d-af9c-f55cce5ed7ef-token-v4scq        kubernetes.io/service-account-token   3      51m

[Feature] Setting the name of the configmap that contains the clusterID through the clusterconfig CRD

Is your feature request related to a problem? Please describe.
The name of the configmap that contains the clusterID is hardcoded; throughout the code, every component that needs the configmap hardcodes its name as well.

Describe the solution you'd like
The possibility to set the name of the configmap in the clusterConfig CRD, so that all the components that need the clusterID can find the name of the configmap in the configuration.
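
A minimal sketch of the idea, assuming a hypothetical ClusterIDConfigMapName field in the configuration and a "cluster-id" data key; both names are illustrative, not the actual Liqo API.

package clusterconfig

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// DiscoveryConfig sketches the portion of the ClusterConfig spec that would
// carry the name of the configmap holding the clusterID.
type DiscoveryConfig struct {
	// ClusterIDConfigMapName is a hypothetical field, not the real Liqo API.
	ClusterIDConfigMapName string `json:"clusterIDConfigMapName,omitempty"`
}

// GetClusterID reads the clusterID from the configmap whose name comes from
// the configuration instead of being hardcoded in every component.
func GetClusterID(ctx context.Context, c kubernetes.Interface, namespace string, conf DiscoveryConfig) (string, error) {
	cm, err := c.CoreV1().ConfigMaps(namespace).Get(ctx, conf.ClusterIDConfigMapName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// "cluster-id" as data key is an assumption for this example.
	return cm.Data["cluster-id"], nil
}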

[Feature] Uninstall Script

Is your feature request related to a problem? Please describe.

So far, it is very complex to correctly uninstall all the liqo resources.

Describe the solution you'd like

Similarly to the install.sh script, we should have an uninstall.sh script to safely remove all Liqo objects.

Describe alternatives you've considered

Currently, uninstalling Liqo is cumbersome: it requires having helm installed and the Liqo repository downloaded.

Steps:

  • Implement Uninstall function (Ref. #185)
  • Linting Shellcheck (Ref. #229)
  • Sleep deletion + Auto-crt approval (Ref. #232)
  • Default mode: Install the latest "stable" release
  • Checkout tagged releases

Upon node reboot, liqo routes are not re-created

Describe the bug
Upon the reboot of a node in the cluster, routes for external traffic are not re-created in Liqo.

To Reproduce
Steps to reproduce the behavior:

  1. Install Liqo
  2. Check the routes
  3. Reboot a node
  4. Check the routes again

Expected behavior
Routes should be re-created after a node reboots and becomes available again.

[Doc] Liqo Doc

Introduction

Liqo enables resource sharing across Kubernetes clusters. To do so, it encapsulates (1) a logic to discover/advertise resources in a neighborhood (e.g. LAN) and (2) a protocol to negotiate resource exchange. In this document, we describe how the cluster peering logic works.

API Glossary

  • PeeringRequest
  • Advertisement
  • ForeignCluster

Liqo Functioning

Sharing resources with Liqo relies on three different phases:

  1. Discovery: The cluster looks for available clusters where it can offload new resources (e.g. neighborhood, DNS, manual insertion) and exchanges credentials with them to start communicating.
  2. Advertisement management: Clusters share updates about the resources they are willing to export and their capabilities (i.e. Advertisements).
  3. Resource Sharing: When a cluster is interested in the resources proposed by a certain advertisement, it accepts the advertisement. This triggers the establishment of the network interconnections and the spawning of a new virtual-kubelet.

Discovery

This issue describes how two clusters discover each other and start sharing resources.
The discovery service exploits the DNS Service Discovery protocol, which works both in LAN and WAN scenarios: in the first case with mDNS, in the second with standard DNS.
Resource sharing is based on periodic Advertisement exchanges, where each cluster exposes its capabilities, allowing others to use them to offload their jobs.

The discovery service allows two clusters to know each other, ask for resources, and begin exchanging Advertisements.
The protocol is described by the following steps:

  1. each cluster creates and manages a ConfigMap containing a kubeconfig file with create-only permission on FederationRequest resources
  2. each cluster registers its master IP and ConfigMap URL to an mDNS service
  3. the requesting cluster sends an mDNS query on the local network to find available servers (a minimal browsing sketch is shown after this list)
  4. when someone replies, the requesting cluster downloads its exposed kubeconfig
  5. the client cluster stores this information in a ForeignCluster CR along with the clusterID of the foreign cluster
  6. when the Federate flag in the ForeignCluster CR becomes true (either automatically or manually), an operator is triggered and uses the stored kubeconfig to create a new FederationRequest CR on the foreign cluster. The FederationRequest creation process includes the creation of a new kubeconfig with management permission on Advertisement CRs
  7. on the server cluster, an admission webhook accepts/rejects FederationRequests
  8. the FederationRequest is used to start the sharing of resources
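
The mDNS browsing of step 3 could look like the following sketch, which uses the grandcat/zeroconf library purely as an illustration of DNS-SD browsing; the service name "_liqo._tcp" and the choice of library are assumptions, not necessarily what Liqo actually uses.

package discovery

import (
	"context"
	"log"
	"time"

	"github.com/grandcat/zeroconf"
)

// browseNeighborhood issues an mDNS/DNS-SD query on the local network and
// returns the entries that replied within the timeout.
func browseNeighborhood(timeout time.Duration) ([]*zeroconf.ServiceEntry, error) {
	resolver, err := zeroconf.NewResolver(nil)
	if err != nil {
		return nil, err
	}

	entries := make(chan *zeroconf.ServiceEntry)
	var found []*zeroconf.ServiceEntry
	done := make(chan struct{})
	go func() {
		for entry := range entries {
			log.Printf("discovered cluster at %v (txt: %v)", entry.AddrIPv4, entry.Text)
			found = append(found, entry)
		}
		close(done)
	}()

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := resolver.Browse(ctx, "_liqo._tcp", "local.", entries); err != nil {
		return nil, err
	}
	<-ctx.Done() // Browse returns immediately; wait for the timeout to elapse.
	<-done       // the library closes the entries channel when browsing ends
	return found, nil
}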

Advertisement management

The Advertisement operator can be split into two main components:

  • Broadcaster: the module which creates and sends the Advertisement message
  • Controller: the module which is triggered when receiving an Advertisement and spawns a virtual node (using Virtual Kubelet, described in #3)

Broadcaster

The broadcaster is in charge of sending the Advertisement CR to other clusters, containing the resources made available for sharing and (optionally) their prices. It reads the foreign cluster kubeconfig from a ConfigMap, which allows it to manage the Advertisement on the remote cluster.
After creating the Advertisement, a remote watcher is started: a goroutine that watches the Advertisement status on the remote cluster. This way, the home cluster can know whether its CR has been accepted by the foreign cluster and whether the podCIDR has been remapped by the network module.
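
A sketch of such a remote watcher goroutine, using a dynamic client built from the foreign kubeconfig. The group/version matches the sharing.liqo.io/v1alpha1 Advertisement visible earlier in this document, while the plural resource name and the "status.advertisementStatus" field path are assumptions made for illustration.

package broadcaster

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// advertisementGVR identifies the Advertisement CRD on the foreign cluster.
var advertisementGVR = schema.GroupVersionResource{
	Group:    "sharing.liqo.io",
	Version:  "v1alpha1",
	Resource: "advertisements",
}

// watchRemoteAdvertisement watches the status of our Advertisement on the
// foreign cluster and logs status transitions (accepted/refused, podCIDR
// remapping, ...).
func watchRemoteAdvertisement(ctx context.Context, foreignClient dynamic.Interface, name string) error {
	w, err := foreignClient.Resource(advertisementGVR).Watch(ctx, metav1.ListOptions{
		FieldSelector: "metadata.name=" + name,
	})
	if err != nil {
		return err
	}
	defer w.Stop()

	for event := range w.ResultChan() {
		obj, ok := event.Object.(*unstructured.Unstructured)
		if !ok {
			continue
		}
		status, _, _ := unstructured.NestedString(obj.Object, "status", "advertisementStatus")
		log.Printf("advertisement %s: event %s, status %q", name, event.Type, status)
	}
	return nil
}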

Controller

The controller is the module that receives Advertisement CRs and creates the virtual nodes exposing the announced resources. By doing so, the remote clusters (represented by the virtual nodes) are taken into account by the scheduler, which can offload the jobs it receives onto them.
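
A sketch of how the controller could register a tainted virtual node for an accepted Advertisement. The taint key matches the one visible in the virtual-kubelet pod description earlier in this document (virtual-node.liqo.io/not-allowed); node name, labels, and resource quantities are placeholders.

package controller

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createVirtualNode registers a virtual node exposing the resources announced
// in an Advertisement, tainted so that only explicitly tolerant pods land on it.
func createVirtualNode(ctx context.Context, c kubernetes.Interface, clusterID string, cpu, memory resource.Quantity) (*corev1.Node, error) {
	node := &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "liqo-" + clusterID,
			Labels: map[string]string{"type": "virtual-node"},
		},
		Spec: corev1.NodeSpec{
			Taints: []corev1.Taint{{
				Key:    "virtual-node.liqo.io/not-allowed",
				Value:  "true",
				Effect: corev1.TaintEffectNoExecute,
			}},
		},
		Status: corev1.NodeStatus{
			Capacity:    corev1.ResourceList{corev1.ResourceCPU: cpu, corev1.ResourceMemory: memory},
			Allocatable: corev1.ResourceList{corev1.ResourceCPU: cpu, corev1.ResourceMemory: memory},
		},
	}
	// Note: the status set here is normally applied through the status
	// subresource; a real implementation would update it after creation.
	return c.CoreV1().Nodes().Create(ctx, node, metav1.CreateOptions{})
}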

Resource Sharing

Components

  • TunnelOperator
  • RouteOperator
  • VirtualKubelet
