nutanix / docker-machine
Rancher Node Driver for Nutanix AHV
Home Page: https://www.nutanix.com/products/acropolis/virtualization
License: Mozilla Public License 2.0
Hi there, roughly once a week I need to restart all the rancher-mgmt pods in order to add a new downstream cluster to Rancher via the Nutanix driver. I added two screenshots below to demonstrate this use case. After restarting the rancher-mgmt pods, I am able to provision a new cluster.
NCC Version: 4.5.0.2
LCM Version: 2.4.5.2
Prism Version: pc.2022.4.0.1
2.6.5
3.2.0
name: worker
nutanixConfig:
  cloudInit: "#cloud-config"
  cluster: "ahv2"
  disksize: ""
  endpoint: "endpoint"
  insecure: "true"
  port: "9440"
  storageContainer: ""
  username: "user"
  vmCategories: []
  vmCpuPassthrough: "true"
  vmCores: "1"
  vmCpus: "16"
  vmImageSize: "100"
  vmImage: "k8s-base-image"
  vmMem: "65536"
  vmNetwork: ["subneta", "subnetb"]
  vmSerialPort: "false"
labels:
  zone: ahv2
Dears,
If I provision a cluster using Rancher, does it appear in Nutanix Karbon as a cluster, or will we see only the VMs in Prism Central, with management done through Rancher?
Thank you
Hi there, we are a Nutanix customer and would like to use and probably also make changes to this driver. But the project does not seem to have any license, which makes even using this impossible for us. Would you mind adding a license, preferably MIT?
2.6.8
# Copy-paste your Node template here
{
"amazonec2Config": null,
"annotations": {
"ownerBindingsCreated": "true"
},
"baseType": "nodeTemplate",
"cloudCredentialId": null,
"created": "2022-09-12T20:57:33Z",
"createdTS": 1663016253000,
"creatorId": "user-zslj4",
"driver": "nutanix",
"engineEnv": { },
"engineInstallURL": "https://releases.rancher.com/install-docker/20.10.sh",
"engineLabel": { },
"engineOpt": { },
"engineRegistryMirror": [ ],
"id": "cattle-global-nt:nt-sfb56",
"labels": {
"cattle.io/creator": "norman"
},
"links": {
"nodePools": "…/v3/nodePools?nodeTemplateId=cattle-global-nt%3Ant-sfb56",
"nodes": "…/v3/nodes?nodeTemplateId=cattle-global-nt%3Ant-sfb56",
"remove": "…/v3/nodeTemplates/cattle-global-nt:nt-sfb56",
"self": "…/v3/nodeTemplates/cattle-global-nt:nt-sfb56",
"update": "…/v3/nodeTemplates/cattle-global-nt:nt-sfb56"
},
"name": "NTX-Dev-Rancher-Clusters",
"nutanixConfig": {
"cloudInit": "#cloud-config\nruncmd:\n- yum update -y\n- systemctl enable --now iscsid\npackage_upgrade: true\npackages:\n- iscsi-initiator-utils\n- nfs-utils\nusers:\n-
"cluster": "NTX-DEV",
"diskSize": "0",
"endpoint": "ntx-dev.zepower.com",
"insecure": true,
"port": "9440",
"storageContainer": "",
"username": "[email protected]",
"vmCategories": [ ],
"vmCores": "1",
"vmCpuPassthrough": true,
"vmCpus": "4",
"vmImage": "CentOS-7-x86_64-GenericCloud-1907",
"vmImageSize": "40",
"vmMem": "8192",
"vmNetwork": [
"Software Development Apps (VLAN 125)"
]
},
"principalId": "local://user-zslj4",
"state": "active",
"transitioning": "no",
"transitioningMessage": "",
"type": "nodeTemplate",
"useInternalIpAddress": true,
"uuid": "2fce90de-c911-440d-acc8-afef8fc5661b"
}
creation of a VM
Error creating machine: Error in driver during machine creation: Post "https://ntx-dev.zepower.com:9440/api/nutanix/v3/clusters/list": dial tcp 172.20.10.110:9440: connect: connection timed out:Timeout waiting for ssh key
We have been using Nutanix with Rancher in prod and pre-prod for almost a year. We used to be able to connect to this cluster. It is a dev Nutanix cluster, so we do not connect that often, but this week I needed to create a cluster and I am getting this error. This was something that we used to do without a problem a few months ago.
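Since the error above is a TCP connect timeout to the Prism endpoint on port 9440, one quick way to narrow it down is to test plain TCP reachability from the machine where Rancher runs. The following is only a minimal sketch in Python, not part of the driver; the function name `can_connect` is made up for illustration, and the host and port to pass in are the ones from the error message.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections, and DNS failures.
        return False
```

For example, `can_connect("ntx-dev.zepower.com", 9440)` returning False from the Rancher host would point at routing or firewall rules rather than at the driver itself.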
Describe the solution you'd like
As far as I understand, this integrates with Prism Central. It would be very useful to have documentation on how to create a project and service account for it following the least-privilege principle.
Rancher: 2.6.4
Cluster Name: xyz
Endpoint: https:test.com
User ID: abc
Pass: *****
Port: 9440
VMImage: ntnx-1.2
VMNetwork: v1681_162.11.180.0-24
Error: [cmdCreateInner] error setting machine configuration from flags provided: nutanix-vm-network cannot be empty:Timeout waiting for ssh key
It should create an RKE1 Nutanix cluster successfully.
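The error above says that `nutanix-vm-network` arrived empty at the driver even though a network was entered in the form. A pre-flight check on the values that are actually sent can help confirm where the value is lost. The sketch below is hypothetical Python, not the driver's own validation; the set of required flags is an assumption based on the driver argument names mentioned in this tracker.

```python
# Assumed set of required driver flags (an assumption, not taken from the driver source).
REQUIRED_FLAGS = (
    "nutanix-endpoint",
    "nutanix-username",
    "nutanix-password",
    "nutanix-vm-image",
    "nutanix-vm-network",
)


def is_blank(value) -> bool:
    """Treat None, whitespace-only strings, and lists with no non-blank items as blank."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple)):
        return all(is_blank(item) for item in value)
    return False


def missing_flags(flags: dict) -> list:
    """Return the required flag names that are absent or blank, in REQUIRED_FLAGS order."""
    return [name for name in REQUIRED_FLAGS if is_blank(flags.get(name))]
```

If `missing_flags` reports `nutanix-vm-network` for the values taken from the node template, the network list was lost before the driver was invoked.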
We are trying to add the Nutanix node and cluster driver. After that, we tried to create a node template to get the Nutanix driver argument information from Nutanix. We could see that the following parameters are not being fetched from the Nutanix cluster:
nutanix-endpoint
nutanix-username
nutanix-password
nutanix-vm-network
nutanix-vm-image
The node driver information we provided is below:
Download URL: https://github.com/nutanix/docker-machine/releases/download/v3.1.0/docker-machine-driver-nutanix_v3.1.0_linux
#Custom UI URL: https://nutanix.github.io/rancher-ui-driver/v3.1.0/component.js (If we add this link we are seeing Error: "There was an error trying to load custom driver nutanix. Please verify the custom node driver settings.undefined")
Checksum: e8f4f2e7ae7e927534884b5a3a45a38a5bd2c2872de1d65375f6e009bed75dba
Whitelist Domains: nutanix.github.io
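One thing worth ruling out with a setup like the one above is a mismatch between the downloaded binary and the configured checksum. The checksum shown is 64 hexadecimal characters long, which is consistent with SHA-256; that it is SHA-256 is an assumption here. A minimal Python sketch to compare a locally downloaded copy against it (the function names are illustrative only):

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare the file digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `matches_checksum` with the path of the downloaded driver file and the checksum string from the form shows whether the two agree.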
Please provide the version of:
2.8.2
3.6.0
rke2 cluster
Nodes get created on Nutanix AOS but are stuck on "waiting for node ref" in the Rancher RKE2 cluster setup.
Dears,
When I try to create a new cluster, I receive the error below:
Error creating machine: Error in driver during machine creation: Panic in the driver: runtime error: invalid memory address or nil pointer dereference:Timeout waiting for ssh key
I have tried different versions of the node driver and still get the same error. Please advise.
Is your feature request related to a problem? Please describe.
Kubernetes security hardening is hard to do manually.
Describe the solution you'd like
Rancher 2.6 added tech preview support for RKE2 provisioning, which makes it simpler to deploy a security-hardened Kubernetes cluster, and it would be nice to have support for it here too.
Additional context
Here is a real-world example cloud-init file which can be used to deploy an RKE2 server on top of Ubuntu cloud images (tested with Ubuntu 20.04 LTS) with the CIS hardening template enabled.
users:
  - name: rancher
    groups:
      - sudo
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-ed25519 <removed> user
    shell: /bin/bash
packages:
  - net-tools
write_files:
  - path: /etc/issue
    content: |2+
      **
      **
      **
      ** cloud-init is still running, please wait for the reboot..
      **
      **
      **
  - path: /etc/rc.local
    permissions: "0755"
    content: |
      #!/bin/bash
      sleep 30s
      cp /etc/rancher/rke2/rke2.yaml /home/rancher/.kube/config
      chown -R rancher /home/rancher/.kube/
  - path: /etc/rancher/rke2/config.yaml
    permissions: "0644"
    content: |
      profile: cis-1.6
      tls-san:
        - k8s.example.com
      node-taint:
        - "CriticalAddonsOnly=true:NoExecute"
runcmd:
  - |
    curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="server" INSTALL_RKE2_VERSION="v1.21.5+rke2r2" sh -
    systemctl enable rke2-server.service
    curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
    install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
    rm kubectl
    mkdir /home/rancher/.kube
    useradd -r -c "etcd user" -s /sbin/nologin -M etcd
    ln -s /usr/local/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
bootcmd:
  - ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
hostname: k8s-server
manage_etc_hosts: localhost
growpart:
  mode: auto
  devices:
    - /dev/sda1
apt:
  sources: {}
power_state:
  message: Rebooting once after cloud-init has finished
  mode: reboot
AOS 6.5.5.5
pc.2022.6.0.9
2.8.3
3.6
Successful node deployment without errors.
Since Rancher 2.8.3 with machine image version "v0.15.0-rancher110", cluster nodes loop on creating/deleting the server on new and existing clusters using the latest CentOS/RHEL Stream 9 image.
"command: sudo hostname testcl01-pool1-a9f476fd-slm2c && echo "testcl01-pool1-a9f476fd-slm2c" | sudo tee /etc/hostname err: inappropriate ioctl for device output: )"
Is your feature request related to a problem? Please describe.
Add GPU support.
2.7.5
3.3.0
rancher-node-with-nutanix-overlay-network-logs
Expecting Rancher Virtual Machines to be deployed to Overlay Network
Error creating machine: Error in driver during machine creation: Panic in the driver: runtime error: invalid memory address or nil pointer dereference:Timeout waiting for ssh key
Seems similar to the following issues, where cluster_reference is not available within the overlay network subnet status response (as opposed to other non-overlay networks).
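A nil pointer panic of this kind typically happens when code reads a nested field without checking that its parent exists. The sketch below illustrates the defensive pattern in Python on a dictionary shaped like the subnet response described above. It is not the driver's Go code, and the layout `status.cluster_reference.uuid` is an assumption based on this report.

```python
from typing import Optional


def cluster_uuid_of(subnet: dict) -> Optional[str]:
    """Return the subnet's cluster UUID, or None when cluster_reference is absent."""
    # "or {}" also covers keys that are present but explicitly set to null.
    status = subnet.get("status") or {}
    cluster_ref = status.get("cluster_reference") or {}
    return cluster_ref.get("uuid")
```

With this pattern, a caller can skip the cluster match for overlay subnets instead of dereferencing a missing reference.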
Could we ensure the feature list in README.md is updated with an extensive list of the features the node driver supports? Thank you!
Prism 2023.3.0.2
Nutanix AOS 6.6.2.6
Hypervisors AHV
Rancher 2.7.5
https://github.com/nutanix/docker-machine/releases/download/v3.3.0/docker-machine-driver-nutanix_v3.3.0_linux
https://github.com/nutanix/docker-machine/releases/download/v3.4.0/docker-machine-driver-nutanix
API json - not able to extract yaml in Rancher UI
{
"amazonec2Config": null,
"annotations": {
"ownerBindingsCreated": "true"
},
"baseType": "nodeTemplate",
"cloudCredentialId": null,
"created": "2023-08-30T07:47:53Z",
"createdTS": 1693381673000,
"creatorId": "user-sbz2q",
"driver": "nutanix",
"engineEnv": { },
"engineInstallURL": "https://releases.rancher.com/install-docker/23.0.sh",
"engineLabel": { },
"engineOpt": { },
"engineRegistryMirror": [ ],
"id": "cattle-global-nt:nt-rcgzn",
"labels": {
"cattle.io/creator": "norman"
},
"links": {
"nodePools": "…/v3/nodePools?nodeTemplateId=cattle-global-nt%3Ant-rcgzn",
"nodes": "…/v3/nodes?nodeTemplateId=cattle-global-nt%3Ant-rcgzn",
"self": "…/v3/nodeTemplates/cattle-global-nt:nt-rcgzn",
"update": "…/v3/nodeTemplates/cattle-global-nt:nt-rcgzn"
},
"logOpt": { },
"name": "Nutanix-PoC",
"nutanixConfig": {
"cloudInit": "#cloud-config",
"cluster": "<REDACTED>",
"diskSize": "0",
"endpoint": "<REDACTED>",
"insecure": true,
"password": "<REDACTED>",
"port": "9440",
"project": "<REDACTED>",
"storageContainer": "",
"username": "<REDACTED>",
"vmCategories": [ ],
"vmCores": "1",
"vmCpuPassthrough": false,
"vmCpus": "2",
"vmImage": "Base-Ubuntu22.04-linux---SCSI.0-1",
"vmImageSize": "0",
"vmMem": "4096",
"vmNetwork": [
"Nutanix - IPAM"
],
"vmSerialPort": false
},
"principalId": "local://user-sbz2q",
"state": "active",
"storageOpt": { },
"transitioning": "no",
"transitioningMessage": "",
"type": "nodeTemplate",
"useInternalIpAddress": true,
"uuid": "4e76ae64-a002-4ad3-b694-00c70e25842d"
}
Provisioning a Rancher RKE1 cluster with Docker Machine Driver v3.3.0 or v3.4.0 to create a small single-node cluster on the Nutanix platform.
The provisioned VM/node does not boot and hangs after the VM instance is created because it cannot reach a disk. The console of the created VM on the Nutanix cluster shows:
Booting from DVD/CD...
Boot failed: Could not read from CDROM (code 004)
Booting from Hard Disk...
Is your feature request related to a problem? Please describe.
Allow adding labels to the VM.
I used:
It got registered on the Rancher server. I am trying to build an RKE1 cluster, and I created my Node Template:
{
"annotations": {
"ownerBindingsCreated": "true"
},
"baseType": "nodeTemplate",
"cloudCredentialId": null,
"created": "2021-12-22T00:07:34Z",
"createdTS": 1640131654000,
"creatorId": "user-xtj9l",
"driver": "nutanix",
"engineEnv": { },
"engineInstallURL": "https://releases.rancher.com/install-docker/18.09.sh",
"engineLabel": { },
"engineOpt": { },
"engineRegistryMirror": [ ],
"id": "cattle-global-nt:nt-tphlk",
"labels": {
"cattle.io/creator": "norman"
},
"links": {
"nodePools": "…/v3/nodePools?nodeTemplateId=cattle-global-nt%3Ant-tphlk",
"nodes": "…/v3/nodes?nodeTemplateId=cattle-global-nt%3Ant-tphlk",
"remove": "…/v3/nodeTemplates/cattle-global-nt:nt-tphlk",
"self": "…/v3/nodeTemplates/cattle-global-nt:nt-tphlk",
"update": "…/v3/nodeTemplates/cattle-global-nt:nt-tphlk"
},
"name": "RK1-test",
"nutanixConfig": {
"cloudInit": "#cloud-config\nusers:\n- name: tony\n sudo: ['ALL=(ALL) NOPASSWD:ALL']\n ssh-authorized-keys:\n - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDDNhhR0Wf4GSz1K5cLdIYPcrKG27irKGgbkzyb3JS/x1irCysGPi9SIj5gChBGNGv99p9gZGPGFgL+CYdXdCORgyT........
"cluster": "NTX-DEV",
"diskSize": "0",
"endpoint": "ntx-dev.URL.com",
"insecure": false,
"password": "XXXX",
"port": "9440",
"storageContainer": "VM",
"username": "nutanix_support",
"vmCategories": [ ],
"vmCores": "1",
"vmCpuPassthrough": false,
"vmCpus": "2",
"vmImage": "CentOS-7-x86_64-GenericCloud-1907",
"vmImageSize": "300",
"vmMem": "4096",
"vmNetwork": [
"Software Development Apps (VLAN 125)"
]
},
"principalId": "local://user-xtj9l",
"state": "active",
"transitioning": "no",
"transitioningMessage": "",
"type": "nodeTemplate",
"useInternalIpAddress": true,
"uuid": "4cf59fe2-bb41-4ded-99d7-fb11e527e0f2"
}
and I am getting this error:
Error creating machine: Error in driver during machine creation: error: {:Timeout waiting for ssh key
Is this a problem with the driver? There is no way for me to add an SSH key when I create a template.
Version pc.2022.4.0.1
NCC Version: 4.6.0
LCM Version: 2.6.0.1
Rancher stable 2.7.1
v3.3.0
You should be able to add cloud credentials for nutanix driver via "Cluster management" > "Cloud Credentials" in the rancher menu.
Or while creating a new rke2 cluster.
We cannot add the cloud credentials for the nutanix driver from either the cloud credentials or the add menu when creating an rke2 cluster.
Nutanix option is missing in the cloud credentials menu
It's also not possible to add the credentials while creating the RKE2 cluster, as the needed keys do not show up and the key fields are disabled.
Please note: The deployment process, credential creation and connection to Nutanix prism is working for the rke1 cluster creation process via the node templates (which are not usable for rke2 clusters)
Can anybody help?
Is your feature request related to a problem? Please describe.
Allow automatically adding a serial port to the VM to enable console output on some cloud image templates.
Describe the solution you'd like
Propose a checkbox to create a virtual serial port.
After we solved the driver connection with the update on ticket #18 (comment), I was able to create clusters on the DEV env. Now I am trying to do it on different clusters (UAT and PROD), but I am getting these errors:
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] Using SSH client type: external
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] Using SSH private key: /management-state/node/nodes/cp1/machines/cp1/id_rsa (-rw-------)
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] &{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected] -o IdentitiesOnly=yes -i /management-state/node/nodes/cp1/machines/cp1/id_rsa -p 22] /usr/bin/ssh }
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] About to run SSH command:
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] exit 0
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] SSH cmd err, output: exit status 255:
2022/01/05 22:07:08 [INFO] [node-controller-rancher-machine] Error getting ssh command 'exit 0' : ssh command error: command: exit 0 err: exit status 255 output :
2022/01/05 22:07:10 [INFO] [node-controller-rancher-machine] SSH cmd err, output: exit status 255:
2022/01/05 22:07:10 [INFO] [node-controller-rancher-machine] Error getting ssh command 'exit 0' : ssh command error: command: exit 0 err: exit status 255 output :
2022/01/05 22:07:11 [INFO] [node-controller-rancher-machine] Getting to WaitForSSH function...
2022/01/05 22:07:11 [INFO] [node-controller-rancher-machine] (cp1) Calling .GetSSHHostname
2022/01/05 22:07:11 [INFO] [node-controller-rancher-machine] (cp1) Calling .GetSSHPort
2022/01/05 22:07:11 [INFO] [node-controller-rancher-machine] (cp1) Calling .GetSSHKeyPath
here is the whole file:
logs.txt
I checked with the networking team, and they have opened all the ports and allowed the IP of the Rancher server to reach this Nutanix cluster. Also, it does build and delete VMs. There is just this SSH error; I am not sure why I keep getting it, and it does not allow the cluster build to complete. After testing that it works perfectly on the dev cluster, I now want to start experimenting in UAT, possibly to take it to prod in the future. One thing to add is that on dev we are using CentOS and on UAT we are using Ubuntu. Not sure if it is an OS thing.
Thanks
Francisco Yanez
DevOps Manager from ZE Power Group (NUTANIX partner)
I am receiving the error below when I try to provision using Rancher. The VMs get created and IPs are assigned, but it gets stuck on this error:
Cluster must have at least one etcd plane host: failed to connect to the following etcd host(s) [10.50.29.184]
Is your feature request related to a problem? Please describe.
No
Describe the solution you'd like
Enable deploying RKE2 cluster from Rancher to AOS
Describe alternatives you've considered
None
Additional context
None