Comments (13)

tmjd commented on August 23, 2024

If you're getting what looks like a functional cluster from make cluster-create then I think you're on the right path. After I run that command I get

> export KUBECONFIG=kubeconfig.yaml 
> kubectl get pods -A
NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
kube-system          coredns-558bd4d5db-4s6jh                     0/1     Pending   0          54s
kube-system          coredns-558bd4d5db-j26ws                     0/1     Pending   0          54s
kube-system          etcd-kind-control-plane                      1/1     Running   0          68s
kube-system          kube-apiserver-kind-control-plane            1/1     Running   0          68s
kube-system          kube-controller-manager-kind-control-plane   1/1     Running   0          68s
kube-system          kube-proxy-67d9l                             1/1     Running   0          35s
kube-system          kube-proxy-8h8v4                             1/1     Running   0          35s
kube-system          kube-proxy-rvw7f                             1/1     Running   0          54s
kube-system          kube-proxy-z8b9p                             1/1     Running   0          35s
kube-system          kube-scheduler-kind-control-plane            1/1     Running   0          68s
local-path-storage   local-path-provisioner-5545dd49d7-wvj9w      0/1     Pending   0          54s

Also if I look at the crds in the cluster I see the following. I'm including this because I'm wondering if they were not created as they should have been based on the error you've received.

> kubectl get crds | grep operator
amazoncloudintegrations.operator.tigera.io              2023-10-17T13:28:27Z
apiservers.operator.tigera.io                           2023-10-17T13:28:27Z
applicationlayers.operator.tigera.io                    2023-10-17T13:28:27Z
authentications.operator.tigera.io                      2023-10-17T13:28:27Z
compliances.operator.tigera.io                          2023-10-17T13:28:27Z
egressgateways.operator.tigera.io                       2023-10-17T13:28:27Z
imagesets.operator.tigera.io                            2023-10-17T13:28:27Z
installations.operator.tigera.io                        2023-10-17T13:28:27Z
intrusiondetections.operator.tigera.io                  2023-10-17T13:28:27Z
logcollectors.operator.tigera.io                        2023-10-17T13:28:27Z
logstorages.operator.tigera.io                          2023-10-17T13:28:27Z
managementclusterconnections.operator.tigera.io         2023-10-17T13:28:27Z
managementclusters.operator.tigera.io                   2023-10-17T13:28:27Z
managers.operator.tigera.io                             2023-10-17T13:28:27Z
monitors.operator.tigera.io                             2023-10-17T13:28:27Z
policyrecommendations.operator.tigera.io                2023-10-17T13:28:27Z
tenants.operator.tigera.io                              2023-10-17T13:28:27Z
tigerastatuses.operator.tigera.io                       2023-10-17T13:28:27Z
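
To check the CRD list against what you expect without comparing by eye, a small helper can report missing names. This is only a sketch: `missing_crds` is a hypothetical function name, and which CRDs you pass to it is up to you.

```shell
# Reads `kubectl get crds` output on stdin and prints every expected CRD name
# (given as arguments) that does not appear in the first column.
missing_crds() {
    input=$(cat)
    for crd in "$@"; do
        # Exact match on the first column, so dots are not treated as wildcards.
        if ! printf '%s\n' "$input" | awk -v n="$crd" '$1 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$crd"
        fi
    done
}
```

For example, `kubectl get crds | missing_crds installations.operator.tigera.io tigerastatuses.operator.tigera.io` prints nothing when both CRDs exist.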

from operator.

tmjd commented on August 23, 2024

Sorry there hasn't been any response here for a while.

For 1:
If you want to make the suggested change that would be good.

For 2:
I think you are suggesting switching to

sh -c "GOBIN=$(CURDIR)/$(BINDIR) go install sigs.k8s.io/kind"

I don't think that is something we would want in general, since it would no longer be containerized which is something we want to maintain. I'd be ok with a conditional based on BUILDOS, perhaps if BUILDOS != linux then instruct user to copy a functional kind binary to $(BINDIR)/kind
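
The decision behind such a conditional can be sketched in shell. How BUILDOS is actually computed in the Makefile is not shown in this thread, so the mapping from `uname -s` below is an assumption, and `normalize_os` and `kind_install_mode` are hypothetical names.

```shell
# Map `uname -s` output to the lowercase OS names Go uses (linux, darwin).
normalize_os() {
    case "$1" in
        Linux)  echo linux ;;
        Darwin) echo darwin ;;
        *)      echo unknown ;;
    esac
}

# Decide how the kind binary should be obtained for a given `uname -s` value:
# keep the containerized build on linux, fall back to the host elsewhere.
kind_install_mode() {
    if [ "$(normalize_os "$1")" = linux ]; then
        echo containerized
    else
        echo host
    fi
}
```

Calling `kind_install_mode "$(uname -s)"` on a Mac would print `host`, which is the case where the user supplies a working kind binary.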

For 3:
You could ask the projectcalico/calico project to push the node image for arm on master builds.
Another option would be to have a make target that switches the versions to ease creating a build with latest (or some other tag). Maybe something like the following

set-calico-version:
  sed -i -e "s/version: .*$$/version: $(VERSION)/" config/calico_versions.yml
  make gen-versions-calico

MuhtasimTanmoy commented on August 23, 2024

Hello @tmjd. I was able to fix the previous error and get to the exact state that you are in.

At that point, the cluster nodes were in a NotReady state and the coredns pods were Pending: with no CNI installed, they could not get an IP address from the pod network.

So, after installing the default custom resource with
kubectl create -f ./config/samples/operator_v1_installation.yaml I have the following state.

NAMESPACE            NAME                                         READY   STATUS              RESTARTS   AGE
calico-system        calico-kube-controllers-6c6d97c87b-4bcdx     0/1     ContainerCreating   0          22m
calico-system        calico-node-2dtfx                            0/1     ImagePullBackOff    0          22m
calico-system        calico-node-j24j9                            0/1     ImagePullBackOff    0          22m
calico-system        calico-node-qz6xg                            0/1     ImagePullBackOff    0          22m
calico-system        calico-node-smnjn                            0/1     ImagePullBackOff    0          23m
calico-system        calico-typha-66cdfb85cf-qgw79                1/1     Running             0          23m
calico-system        calico-typha-66cdfb85cf-qks8w                1/1     Running             0          22m
calico-system        csi-node-driver-d2zsn                        0/2     ContainerCreating   0          22m
calico-system        csi-node-driver-lzqkb                        0/2     ContainerCreating   0          22m
calico-system        csi-node-driver-tcnlm                        0/2     ContainerCreating   0          22m
calico-system        csi-node-driver-z5ml9                        0/2     ContainerCreating   0          22m
kube-system          coredns-558bd4d5db-bpt8f                     0/1     ContainerCreating   0          27m
kube-system          coredns-558bd4d5db-cqvjj                     0/1     ContainerCreating   0          27m
kube-system          etcd-kind-control-plane                      1/1     Running             0          27m
kube-system          kube-apiserver-kind-control-plane            1/1     Running             0          27m
kube-system          kube-controller-manager-kind-control-plane   1/1     Running             0          27m
kube-system          kube-proxy-952lc                             1/1     Running             0          27m
kube-system          kube-proxy-jsphk                             1/1     Running             0          27m
kube-system          kube-proxy-nldmc                             1/1     Running             0          27m
kube-system          kube-proxy-vq5vm                             1/1     Running             0          27m
kube-system          kube-scheduler-kind-control-plane            1/1     Running             0          27m
local-path-storage   local-path-provisioner-778f7d66bf-dmknx      0/1     ContainerCreating   0          27m

Here the csi-node-driver, calico-kube-controllers and local-path-provisioner pods are waiting for calico-node to be up and running. However, calico-node is getting the ImagePullBackOff error.
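
To pull the unhealthy pods out of a listing like the one above, the output can be filtered on the STATUS column. A minimal sketch, assuming the six-column layout that `kubectl get pods -A` prints here (`not_running_pods` is a hypothetical name):

```shell
# Reads `kubectl get pods -A` output on stdin and prints "namespace/name status"
# for every pod whose STATUS column is neither Running nor Completed.
not_running_pods() {
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage: `kubectl get pods -A | not_running_pods`.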

Events from the pods show

>  kubectl get events -A  | grep -i calico-node-qz6xg

calico-system        27m         Normal    Scheduled                 pod/calico-node-qz6xg                           Successfully assigned calico-system/calico-node-qz6xg to kind-worker3
calico-system        27m         Normal    Pulling                   pod/calico-node-qz6xg                           Pulling image "docker.io/calico/pod2daemon-flexvol:master"
calico-system        26m         Normal    Pulled                    pod/calico-node-qz6xg                           Successfully pulled image "docker.io/calico/pod2daemon-flexvol:master" in 17.0363043s
calico-system        26m         Normal    Created                   pod/calico-node-qz6xg                           Created container flexvol-driver
calico-system        26m         Normal    Started                   pod/calico-node-qz6xg                           Started container flexvol-driver
calico-system        26m         Normal    Pulling                   pod/calico-node-qz6xg                           Pulling image "docker.io/calico/cni:master"
calico-system        24m         Normal    Pulled                    pod/calico-node-qz6xg                           Successfully pulled image "docker.io/calico/cni:master" in 2m26.246357483s
calico-system        24m         Normal    Created                   pod/calico-node-qz6xg                           Created container install-cni
calico-system        24m         Normal    Started                   pod/calico-node-qz6xg                           Started container install-cni
calico-system        22m         Normal    Pulling                   pod/calico-node-qz6xg                           Pulling image "docker.io/calico/node:master"
calico-system        22m         Warning   Failed                    pod/calico-node-qz6xg                           Failed to pull image "docker.io/calico/node:master": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/calico/node:master": no match for platform in manifest: not found
calico-system        23m         Warning   Failed                    pod/calico-node-qz6xg                           Error: ErrImagePull
calico-system        3m37s       Normal    BackOff                   pod/calico-node-qz6xg                           Back-off pulling image "docker.io/calico/node:master"
calico-system        22m         Warning   Failed                    pod/calico-node-qz6xg                           Error: ImagePullBackOff
calico-system        27m         Normal    SuccessfulCreate          daemonset/calico-node                           Created pod: calico-node-qz6xg

Specifically this error

Failed to pull image "docker.io/calico/node:master": rpc error: code = NotFound desc = failed to pull 
and unpack image "docker.io/calico/node:master": no match for platform in manifest: not found

So, what needs to be done to fix this when it is trying to fetch docker.io/calico/node:master?

Note that docker pull docker.io/calico/node:latest works, whereas node:master does not.
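
One way to see why a tag fails with "no match for platform in manifest" is to list the architectures its manifest advertises. The sketch below parses the JSON that `docker manifest inspect` prints. It assumes the tag is published as a multi-platform manifest list; for a single-platform image the output may contain no architecture fields, in which case the function prints nothing. `manifest_archs` is a hypothetical name.

```shell
# Reads manifest-list JSON on stdin (as printed by `docker manifest inspect`)
# and prints the unique architectures it lists, one per line.
manifest_archs() {
    grep -o '"architecture"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed 's/.*"\([^"]*\)"$/\1/' |
        sort -u
}
```

Usage: `docker manifest inspect docker.io/calico/node:master | manifest_archs`, then compare with the same command for the `latest` tag.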

MuhtasimTanmoy commented on August 23, 2024

I was able to resolve the ImagePullBackOff error with a small change to package/components/calico.go: replacing the version "master" with "latest" in the following line.

Version: "master",

So currently everything is up and running:

>  kubectl get pods -A

NAMESPACE            NAME                                         READY   STATUS    RESTARTS   AGE
calico-system        calico-kube-controllers-5c6d8778f5-btvp6     1/1     Running   0          17m
calico-system        calico-node-22lc7                            1/1     Running   0          17m
calico-system        calico-node-jvn5n                            1/1     Running   0          17m
calico-system        calico-node-lvm22                            1/1     Running   0          17m
calico-system        calico-node-qffqg                            1/1     Running   0          17m
calico-system        calico-typha-745f498dff-dn26b                1/1     Running   0          17m
calico-system        calico-typha-745f498dff-flbw6                1/1     Running   0          17m
calico-system        csi-node-driver-4h6w5                        2/2     Running   0          60s
calico-system        csi-node-driver-59gb8                        2/2     Running   0          17m
calico-system        csi-node-driver-b5c29                        2/2     Running   0          17m
calico-system        csi-node-driver-gzvtg                        2/2     Running   0          17m
kube-system          coredns-558bd4d5db-pz5fc                     1/1     Running   0          21m
kube-system          coredns-558bd4d5db-wrc7c                     1/1     Running   0          21m
kube-system          etcd-kind-control-plane                      1/1     Running   0          21m
kube-system          kube-apiserver-kind-control-plane            1/1     Running   0          21m
kube-system          kube-controller-manager-kind-control-plane   1/1     Running   0          21m
kube-system          kube-proxy-mkdbs                             1/1     Running   0          21m
kube-system          kube-proxy-pd7b2                             1/1     Running   0          20m
kube-system          kube-proxy-q4hq8                             1/1     Running   0          20m
kube-system          kube-proxy-sgx4l                             1/1     Running   0          20m
kube-system          kube-scheduler-kind-control-plane            1/1     Running   0          21m
local-path-storage   local-path-provisioner-778f7d66bf-44cw5      1/1     Running   0          21m

So, in summary, to set up the cluster on Apple Silicon (M1) I needed to make the following three changes.

  1. In the makefile, change the following line to include $(BUILDOS)/$(ARCH)/kubectl to make it OS-independent.
    curl -L https://storage.googleapis.com/kubernetes-release/release/v1.25.6/bin/linux/$(ARCH)/kubectl -o $@
  2. In the makefile, change the installation of the kind binary to sh -c "GOBIN=$(CURDIR)/$(BINDIR) go install sigs.k8s.io/kind" only.
    $(CONTAINERIZED) $(CALICO_BUILD) sh -c "GOBIN=/go/src/$(PACKAGE_NAME)/$(BINDIR) go install sigs.k8s.io/kind"
  3. In the package/components/calico.go file, change the version as described above.
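
For change 1, the OS-independent download path can be sketched as follows. The URL layout is the one from the Makefile line above. The mapping from `uname -m` values is an assumption (the Makefile's own ARCH variable may already be normalized), and `normalize_arch` and `kubectl_url` are hypothetical names.

```shell
# Map `uname -m` output to the architecture names used in the download path.
normalize_arch() {
    case "$1" in
        x86_64)        echo amd64 ;;
        arm64|aarch64) echo arm64 ;;
        *)             echo "$1" ;;
    esac
}

# Build the kubectl download URL for a version, OS, and machine architecture.
kubectl_url() {
    version=$1
    os=$2
    arch=$(normalize_arch "$3")
    echo "https://storage.googleapis.com/kubernetes-release/release/${version}/bin/${os}/${arch}/kubectl"
}
```

On an M1 Mac, `kubectl_url v1.25.6 darwin "$(uname -m)"` yields a path under `bin/darwin/arm64/` instead of the hard-coded `bin/linux/`.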

Could you advise whether these changes should go into the source via a pull request to support local development on M1, or whether this issue needs a different approach because of possible side effects?

tmjd commented on August 23, 2024

I expect 1 and 2 would be fine. 3 wouldn't be ideal because I think latest is probably the 'latest' released images which would not be the same as using master images. I'm guessing the issue is that only the amd64 images are built and pushed for master builds so the arm images are not available.

MuhtasimTanmoy commented on August 23, 2024

For 3, yes, the 'latest' released image may cause issues compared to the stable and tested 'master' images. But since the docker.io/calico/node:master image is unavailable for arm, what should the workaround be? This is a blocker for creating a cluster.

Additionally, is a pull request needed with changes made in 1 and 2?

MuhtasimTanmoy commented on August 23, 2024

For 1: ok

For 2:

I don't think that is something we would want in general, since it would no longer be containerized which is something we want to maintain. I'd be ok with a conditional based on BUILDOS, perhaps if BUILDOS != linux then instruct user to copy a functional kind binary to $(BINDIR)/kind

Since the kind binary is used to create a local cluster on the host machine rather than in a container, as shown below, should the binary be containerized?

I might be missing some corner cases, though.

## Create a local kind dual stack cluster.
KIND_KUBECONFIG?=./kubeconfig.yaml
K8S_VERSION?=v1.21.14
cluster-create: $(BINDIR)/kubectl $(BINDIR)/kind
	# First make sure any previous cluster is deleted
	make cluster-destroy

	# Create a kind cluster.
	$(BINDIR)/kind create cluster \
	        --config ./deploy/kind-config.yaml \
	        --kubeconfig $(KIND_KUBECONFIG) \
	        --image kindest/node:$(K8S_VERSION)

Does this look ok in the case of conditional based on BUILDOS? (tested on darwin)

$(BINDIR)/kind:
ifeq ($(BUILDOS), darwin)
    sh -c "GOBIN=/go/src/$(PACKAGE_NAME)/$(BINDIR) go install sigs.k8s.io/kind"
else
    $(CONTAINERIZED) $(CALICO_BUILD) sh -c "GOBIN=/go/src/$(PACKAGE_NAME)/$(BINDIR) go install sigs.k8s.io/kind"
endif

For 3:
Added the following due to this issue with sed.

# https://stackoverflow.com/questions/4247068/sed-command-with-i-option-failing-on-mac-but-works-on-linux/4247319#4247319
set-calico-version:
ifeq ($(BUILDOS), darwin)
    sed -i '' -e 's/version: .*/version: $(VERSION)/' config/calico_versions.yml
else
    sed -i -e 's/version: .*/version: $(VERSION)/' config/calico_versions.yml
endif
    make gen-versions-calico

Should I go with these changes?
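
As an alternative to branching on the OS, attaching a backup suffix directly to `-i` is accepted by both GNU and BSD sed, so one command can serve both platforms. A minimal sketch (`set_version` is a hypothetical name, and the version string must not contain `/`, since that is the sed delimiter here):

```shell
# Rewrite every "version: ..." line in a file to the given version.
# `sed -i.bak` works with both GNU and BSD sed; the backup file it leaves
# behind is removed once the edit succeeds.
set_version() {
    file=$1
    version=$2
    sed -i.bak -e "s/version: .*/version: ${version}/" "$file" && rm -f "${file}.bak"
}
```

Usage: `set_version config/calico_versions.yml "$VERSION"`, followed by `make gen-versions-calico` as in the target above.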

tmjd commented on August 23, 2024

I'm good with what you're suggesting for 2, though I'll point out that I don't think you should include GOBIN in the command.

Seems reasonable for 3 also.

MuhtasimTanmoy commented on August 23, 2024

On another thought, wouldn't adopting nix solve the compatibility issues altogether?
Reference: Using Nix with Dockerfiles

MuhtasimTanmoy commented on August 23, 2024

I'll point out that I don't think you should include GOBIN in the command

$(BINDIR)/kind:
ifeq ($(BUILDOS), darwin)
    sh -c go install sigs.k8s.io/kind"
else
    $(CONTAINERIZED) $(CALICO_BUILD) sh -c go install sigs.k8s.io/kind"
endif

Like this?

I will open a PR with these fixes then.

tmjd commented on August 23, 2024

I'd guess there is probably no need for the sh -c either.
(you've got a trailing " that you'll need to get rid of too)

On another thought, wouldn't adopting nix solve the compatibility issues altogether?

I'm not sure, we still need to build kind that can work on darwin or linux, does nix help with that?

MuhtasimTanmoy commented on August 23, 2024

Does this look ok?

$(BINDIR)/kind:
ifeq ($(BUILDOS), darwin)
   go install sigs.k8s.io/kind
else
    $(CONTAINERIZED) $(CALICO_BUILD) go install sigs.k8s.io/kind
endif

I'm not sure, we still need to build kind that can work on darwin or linux, does nix help with that?

Being universal, it should. I have used it to get a consistent environment for building Docker images.

tmjd commented on August 23, 2024

That does not look ok. I didn't notice you were modifying the "non-darwin" command; it should remain what it has been.

Have you tried what you're suggesting for the "darwin" option? It doesn't look to me like it would work. The commands should produce a kind binary (one that works on the host system) at $(BINDIR)/kind. You probably do need a GOBIN, but it would be different from the one in the "non-darwin" command.

Please put up a PR that you've tested. Be sure to run make clean before testing so that no leftover binaries make it look like everything is working.
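
For the darwin case, a host-side install that lands the binary in the expected directory might look like the sketch below. It relies on one fact about the Go toolchain: `go install` requires GOBIN to be an absolute path. Everything else is an assumption: `abs_dir` and `install_kind_on_host` are hypothetical names, and the kind version is assumed to come from the repository's go.mod, as with the unversioned `go install sigs.k8s.io/kind` used in this thread.

```shell
# Turn a possibly relative directory into an absolute path, because
# `go install` rejects a relative GOBIN.
abs_dir() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$(pwd)" "$1" ;;
    esac
}

# Install kind into the given directory on the host (not in a container).
install_kind_on_host() {
    GOBIN="$(abs_dir "$1")" go install sigs.k8s.io/kind
}
```

Run from the repository root after `make clean`, `install_kind_on_host "$BINDIR"` should leave a host-native binary at `$BINDIR/kind` if the assumptions hold.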

Being universal, it should. I have used it to get a consistent environment for building Docker images.

But this is not building a Docker image, we're installing a binary that is used. So I don't understand how using nix would help us fetch a darwin binary on darwin and a linux binary on linux.
